Biological Sciences
BIO 402: Bioinformatics (Credits: 4)
Course Contents:
- Introduction to Bioinformatics: Basic overview, concepts, utility, scope and applications. (1-2 lectures)
- Databases: NCBI (37 internal databases including PubMed, GenBank, RefSeq, SRA, Taxonomy, viral resources, etc.); EMBL-EBI (over 50 internal databases including Ensembl, Enzyme Portal, HGNC, PDBe, Pfam, ChEMBL, Rfam, TreeFam, InterPro, UniProt/Swiss-Prot, TrEMBL, WormBase, etc.); DDBJ; Expasy ENZYME, IUBMB Enzyme Nomenclature, KEGG, MetaCyc, BRENDA; UCSC, PDB, PDBsum, SCOP, CATH, SUPERFAMILY, COGs and web resources. (5 lectures)
- Sequence formats: FASTA, GenBank, EMBL, PDB, XML, Medline, GCG, PHYLIP, Nexus, Newick, Stockholm, SAM/BAM, etc. Conversion from one format to another; tools available for format interconversion (EMBOSS seqret). (2 lectures)
- Sequence analysis: Introduction to sequence alignment, homology, similarity, identity. Local and global alignments, multiple sequence alignments, insertions, deletions, gaps, Needleman-Wunsch algorithm, dot matrix method, dynamic programming algorithm, scoring matrices (PAM and BLOSUM), BLAST packages, BLAT, ClustalW, MAFFT, BLOCKS, and other sequence-alignment software packages. Strengths and limitations. Sequence profile building and profile-based sequence searches: HMMER, jackhmmer, PSI-BLAST, PSSMs, etc. (5 lectures)
- Protein annotation, classification and structure prediction: Introduction to domains, motifs, folds, families, helices, beta-sheets, loops and coils. Primary, secondary and tertiary structure. Protein annotation using CD-Search (CDD), Pfam search, HHblits, HHpred, InterProScan, etc. Protein clustering using BLASTCLUST, CD-HIT, MMseqs, and other tools. Structure visualization tools such as Mol*, PyMOL, etc. Protein structure similarity analysis using DALI. (5 lectures)
- Phylogenetic analysis: Concepts and terminology; commonly used phylogenetic tree construction software packages: IQ-TREE, MEGA, PHYLIP, PhyML, RAxML, MrBayes, BEAST, etc. Bootstrapping: concepts and interpretation. Tree-reconstruction methods: maximum parsimony, maximum likelihood, distance matrix and Bayesian methods; advantages and disadvantages of each tree-building methodology. (5 lectures)
- Gene prediction and annotation methods: Concept of genes, challenges in gene prediction, ORFs, reading frames, codons and codon bias, genetic code; commonly used gene prediction methods: ORFfinder, Glimmer, GeneMark, MetaGene, etc. Annotation using homology-based alignment with BLAST or BLAT; COGs and Gene Ontology-based functional annotation. Genome analysis: Introduction to genomes and packages for genomic analysis such as EMBOSS; introduction to Linux and Perl. (5 lectures)
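Illustrative example (not part of the official course material): the Needleman-Wunsch algorithm listed under Sequence analysis fills a dynamic programming table with the best global alignment score for every pair of sequence prefixes, then traces back from the final cell to recover one optimal alignment. The sketch below is in Python and assumes a simple linear gap penalty and match/mismatch scores rather than a PAM or BLOSUM matrix; the function name and default scores are chosen here for illustration only.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of strings a and b.

    Returns (score, aligned_a, aligned_b). Scores are illustrative
    defaults (assumption), with a linear gap penalty.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap          # a[:i] aligned against gaps only
    for j in range(1, m + 1):
        score[0][j] = j * gap          # b[:j] aligned against gaps only

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,            # gap in b
                score[i][j - 1] + gap,            # gap in a
            )

    # Traceback from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1

    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    s, x, y = needleman_wunsch("ACGT", "AGT")
    print(s)  # 2
    print(x)  # ACGT
    print(y)  # A-GT
```

When several alignments share the optimal score, this traceback prefers the diagonal move, so it reports only one of them. Production tools typically use affine gap penalties and substitution matrices such as BLOSUM62 instead of the fixed scores assumed here.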
Suggested Reading:
- Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins (Andreas D. Baxevanis; B. F. Francis Ouellette)
- Introduction to Bioinformatics (Arthur M. Lesk)
- Bioinformatics and Molecular Evolution (Paul G. Higgs and Teresa K. Attwood)
- From Protein Structure to Function with Bioinformatics (Daniel J. Rigden, Editor) Springer; softcover reprint of the original 2nd ed. 2017 edition (25 July 2018)
- Protein Families: Relating Protein Sequence, Structure, and Function (Christine A. Orengo, Alex Bateman, et al.) Wiley; 1st edition (18 March 2014)