Protein Bioinformatics

Project Leader: Vikram Alva
Department: Protein Evolution - Lupas
Assistant: Karin Lehmann
Phone: +49 7071 601-340
Fax: +49 7071 601-349


The total number of different proteins existing today is estimated to be a trillion. Although this may seem a vast number, the actual diversity of proteins in nature is rather limited. Many proteins share detectable similarity in sequence and structure because they arose by amplification, recombination, and divergence from a basic complement of autonomously folding modules, referred to as domains. Indeed, sequence comparison of modern proteins shows that they fall into only about ten thousand domain families, which, based on structural similarity, can be grouped further into about a thousand folds. Many of these folds were already established at the time of the Last Universal Common Ancestor, a hypothetical primordial organism from which all life on Earth descended.

We are broadly interested in understanding the events that led to the emergence of these first folds, as well as the events that led to their diversification into the many functional protein families we recognize today. To track these events, we use sensitive sequence analysis tools to establish correlations between the sequence and structure similarity of today’s proteins. Many of the tools we use are integrated into the MPI Bioinformatics Toolkit, a one-stop, integrative resource for protein bioinformatic analysis, which we develop and maintain.



Selected Publications

Fuchs ACD., Alva V., Maldoner L., Albrecht R., Hartmann MD., Martin J. (2017) The Architecture of the Anbu Complex Reflects an Evolutionary Intermediate at the Origin of the Proteasome System. Structure 25(6):834-845.e5.
PMID: 28479063


Lupas AN., Alva V. (2017) Ribosomal proteins as documents of the transition from unstructured (poly)peptides to folded proteins. J Struct Biol 198(2):74-81.
PMID: 28454764


Alva V., Nam SZ., Söding J., Lupas AN. (2016) The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res 29.
pii: gkw348. [Epub ahead of print].
PMID: 27131380


Alva V., Lupas AN. (2016) The TULIP superfamily of eukaryotic lipid-binding proteins as a mediator of lipid sensing and transport. Biochim Biophys Acta doi: 10.1016/j.bbalip.2016.01.016.
PMID: 26825693


Alva V., Söding J., Lupas AN. (2015) A vocabulary of ancient peptides at the origin of folded proteins. eLife pii: e09410. doi: 10.7554/eLife.09410.
PMID: 26653858


Scharfenberg F., Serek-Heuberger J., Coles M., Hartmann MD., Habeck M., Martin J., Lupas AN., Alva V. (2015) Structure and evolution of N-domains in AAA metalloproteases. J Mol Biol 427(4):910-23.
PMID: 25576874


Alva V., Remmert M., Biegert A., Lupas AN., Söding J. (2010) A galaxy of folds. Protein Sci 19(1):124-30.
PMID: 19937658


Alva V., Koretke KK., Coles M., Lupas AN. (2008) Cradle-loop barrels and the concept of metafolds in protein classification by natural descent. Curr Opin Struct Biol (3):358-65.
PMID: 18457946