Introduction

Proteins with a single transmembrane domain (TMD) make up ~30% of all membrane proteins. Many diseases and biological processes are affected by their interaction and oligomerisation in the membrane. Due to a lack of crystal structures, their interaction interfaces are poorly understood. THOIPA (Transmembrane Homodimer Interface Prediction Algorithm) predicts interfacial residues from evolutionary sequence data.

How was THOIPA developed?

THOIPA is a machine-learning algorithm that was trained to detect interface residues of self-interacting TMDs.
The training dataset consisted of a non-redundant collection of TMDs with known self-interaction, derived from crystal structures, ToxR-like E. coli TM reporter assay (ETRA) experiments, and NMR studies.

How does THOIPA work?

1) The full-length protein sequence is used as a BLAST query to retrieve homologous sequences.

2) Based on the input TMD, the TMD region of each homologue is identified, extracted and combined into a multiple sequence alignment.

3) Parameters such as sequence conservation, hydrophobicity, and residue co-variation are extracted from the multiple sequence alignment.

4) The parameters are used as the input for a machine-learning algorithm, previously trained against interfaces derived from ToxR-like ETRA, X-ray/EM, and NMR experimental studies.
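
To make step 3 more concrete, the following is a minimal sketch of how two per-residue parameters could be derived from an alignment of homologous TMD sequences. It is not THOIPA's actual code. The function name column_features, the toy alignment, the entropy-based conservation score, and the choice of the Kyte-Doolittle hydropathy scale are all assumptions made for illustration. Residue co-variation and the machine-learning step are not shown.

```python
import math
from collections import Counter

# Kyte-Doolittle hydropathy scale. Using this scale is an assumption made
# for illustration; THOIPA may rely on a different hydrophobicity scale.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def column_features(alignment):
    """Return one (conservation, mean_hydrophobicity) tuple per column.

    Hypothetical helper: conservation is 1 minus the Shannon entropy of the
    column, normalised by log2(20), so an invariant column scores 1.0.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    features = []
    for i in range(length):
        # Ignore gaps and non-standard residue symbols in this column.
        residues = [seq[i] for seq in alignment if seq[i] in KD]
        if not residues:
            features.append((0.0, 0.0))
            continue
        total = len(residues)
        counts = Counter(residues)
        entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
        conservation = 1.0 - entropy / math.log2(20)
        hydrophobicity = sum(KD[r] for r in residues) / total
        features.append((conservation, hydrophobicity))
    return features


# Toy alignment of TMD fragments (invented sequences, not real data).
toy_alignment = ["LIGAV", "LIGSV", "LVGAV", "L-GTV"]

for position, (cons, hyd) in enumerate(column_features(toy_alignment), start=1):
    print(position, round(cons, 3), round(hyd, 2))
```

In this sketch each alignment column yields one row of parameters. A table of such rows, one per TMD residue, is the kind of input that a trained classifier in step 4 would score.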

Citation

Yao Xiao, Bo Zeng, Nicola Berner, Dmitrij Frishman, Dieter Langosch, Mark George Teese (2020) Experimental determination and data-driven prediction of homotypic transmembrane domain interfaces. Computational and Structural Biotechnology Journal
https://doi.org/10.1016/j.csbj.2020.09.035