Perl

1st term, 2017-2018
MWF 1:30-2:20 in W2033
F 10:30-11:20 in W2033

Instructor: Fernando Pineda

Course materials

  1. Lecture and Lab notes
  2. Books, references, and Cheat Sheets

Description

Please read ALL of this document carefully.

This course uses the Perl programming language to introduce skills and concepts needed to process and interpret data from high-throughput technologies in the biological sciences. The course focuses on generally applicable computer-science concepts rather than statistical analysis concepts. Lectures with live computer demonstrations and hands-on laboratories will be used to introduce key concepts. These will be reinforced and extended with weekly readings and programming exercises. Exercises and examples will draw heavily from biological sequence analysis, proteomics, genetics and computational biology. Occasional guest lecturers will present case studies. Students will be introduced to the wealth of bioinformatics and computational software-development resources available on the World Wide Web.

Students will be introduced to necessary fundamentals in computer science, including: (1) pattern matching, parsing and translation, (2) data structures, algorithms and complexity, (3) object-oriented programming, and (4) programming style and best practices. Applied topics to be covered include: (1) biological sequence analysis, (2) Perl as middleware, (3) how to use Unix and Perl to manage and process high-throughput datasets, (4) automated interaction with local (e.g. MySQL) and remote (e.g. GenBank) biological databases, (5) high-performance computing, (6) parallel processing, and (7) simulation.

People

| Name | Role | Contact/Location | Office Hours |
|---|---|---|---|
| Fernando Pineda | Instructor | Tel: 443-287-3673; fernando.pineda@jhu.edu; office: E3626 | by appointment |
| Mark Miller | Computing Systems Manager | mmil116@jhu.edu | |

Prerequisites

Permission of the instructor AND either a previous course in computer science OR computer programming experience. If you have never done any programming, or if you have never used a command line on a Unix workstation, you will find this course very challenging. If this is the case, you may wish to register as an auditor instead of taking the course for credit.

Homework and Grading Policy

Grades are based on four programming assignments and a final project. The programming assignments count for 60% of the grade and the final project counts for 40%. Each student is expected to coordinate with the instructor to select a suitable project based on the student’s interests. The final project must be presented in class and working code must be demonstrated. Homework problems are generally awarded 2-5 points each. Programming assignments typically receive full credit if they are well documented and produce correct results when run by the instructor on the cluster. It is not sufficient that the code runs on the student’s machine. The assessment of documentation and programming style is necessarily somewhat subjective. Note that not everything needed to complete the assignments will be presented in the lectures. Assignments will require material from the readings as well as from the lectures.

Homework is accepted electronically as HTML-formatted documents AND as working programs/scripts. No homework will be accepted via email or on paper. No late homework will be accepted. Once the due date and time have passed, it will be impossible to submit homework electronically. (We will check the timestamps on the files!) For programming assignments, students may discuss ideas and approaches with others. However, programs and projects are to be completed independently and must be original work. The first block of comments in your Perl code should contain, at a minimum, the following items.

* Name of the program
* Your name and the date
* Assignment number
* Usage instructions for the program
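
As an illustration only, a header comment block covering the four items above might look like the following sketch. The program name `count_bases.pl`, the field labels, and the small base-counting routine are hypothetical and chosen just for this example; they do not describe a required format, so adapt the layout to each assignment.

```perl
#!/usr/bin/env perl
#
# Program:    count_bases.pl (hypothetical example name)
# Author:     Your Name
# Date:       YYYY-MM-DD
# Assignment: Assignment number goes here
# Usage:      perl count_bases.pl SEQUENCE
#             Prints how often each of A, C, G and T occurs in SEQUENCE.
#
use strict;
use warnings;

# Count the four DNA bases in a string, ignoring case and other characters.
sub count_bases {
    my ($seq) = @_;
    my %count = (A => 0, C => 0, G => 0, T => 0);
    for my $base (split //, uc $seq) {
        $count{$base}++ if exists $count{$base};
    }
    return %count;
}

# Print the counts only when a sequence is given on the command line;
# otherwise repeat the usage line from the header.
if (@ARGV) {
    my %count = count_bases($ARGV[0]);
    print "$_: $count{$_}\n" for sort keys %count;
}
else {
    print STDERR "Usage: perl count_bases.pl SEQUENCE\n";
}
```

Running `perl count_bases.pl ACGTT` with this sketch prints one line per base, in the order A, C, G, T. Keeping the usage line in the header identical to the message the program prints makes the documentation easier to keep in sync with the code.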

Final Project

You should meet with the instructor to decide on a final project no later than four weeks before the end of the course. A written proposal (a paragraph or two describing the project), posted on your final project web page, is due three weeks before the end of the course. I find that the best projects are those that come from active research projects, so it is best to consult with your advisor or a faculty member in your department about potential projects. A suitable project should be about as much work as two or three homework assignments. The final project is graded on how well it shows mastery of the subject matter taught in the course. For example, a project that makes effective use of modules, data structures (e.g. references), regular expressions or databases will get more points than a program that uses just rudimentary Perl. Documentation and maintainability are also important. Note: A project that solves an interesting and useful problem will also get more points than a project that is just a homework exercise (this is graduate school, after all). Here are some example projects from previous years: FinalProjects

Schedule

| Day | Month-Day | Venue | Topic | Remarks |
|---|---|---|---|---|
| Mon | Aug-28 | W2033 | Intro to Perl | Lecture & New User Registration |
| Wed | Aug-30 | W2033 | Computing environment orientation | Cluster orientation |
| Fri | Sep-01 | W2033 | Basic Perl | unix/linux & bash shell |
| Fri | Sep-01 | W2033 | Basic Perl | Perl Intro |
| Mon | Sep-04 | | Labor Day Holiday | |
| Wed | Sep-06 | W2033 | Basic Perl | character representations & strings |
| Fri | Sep-08 | W2033 | Basic Perl | Control structures |
| Fri | Sep-08 | W2033 | Basic Perl | lists, arrays and hashes |
| Mon | Sep-11 | W2033 | Basic Perl | Regular expressions |
| Wed | Sep-13 | W2033 | Intermediate Perl | Scope: The general concept, "global", "our" and "lexical" variables |
| Fri | Sep-15 | W2033 | Intermediate Perl | Functions, References and Modules |
| Fri | Sep-15 | W2033 | Intermediate Perl | Data structures, computational complexity, Concepts from Computer Science |
| Mon | Sep-18 | W2033 | cancelled | |
| Wed | Sep-20 | W2033 | Intermediate Perl | Putting it all together: Dynamic Programming example |
| Fri | Sep-22 | W2033 | Intermediate Perl | Object Oriented Programming (OOP) |
| Fri | Sep-22 | W2033 | Intermediate Perl | Object Oriented Programming (OOP) |
| Mon | Sep-25 | W2033 | Intermediate Perl | BLAST |
| Wed | Sep-27 | W2033 | High Performance Computing | |
| Fri | Sep-29 | W2033 | High Performance Computing | |
| Fri | Sep-29 | W2033 | High Performance Computing | |
| Mon | Oct-02 | W2033 | Relational Databases | |
| Wed | Oct-04 | W2033 | Relational Databases | |
| Fri | Oct-06 | W2033 | Relational Databases | |
| Fri | Oct-06 | W2033 | Relational Databases | |
| Mon | Oct-09 | W2033 | Python | |
| Wed | Oct-11 | W2033 | Python | |
| Fri | Oct-13 | W2033 | Python | |
| Fri | Oct-13 | W2033 | Python | |
| Mon | Oct-16 | W2033 | Python | |
| Wed | Oct-18 | W2033 | Python | |
| Fri | Oct-20 | W2033 | Python | |
| Fri | Oct-20 | W2033 | Student Presentations | |