If I may put forward a key protein in Alzheimer disease as an example
of what we are grappling with: there is full-length APP (which itself
has a number of forms as well as mutations); various peptides derived
from cleavage of APP; and then multimeric forms of the peptides,
particularly Abeta42, which is known to form soluble dimers, trimers,
tetramers, hexamers, and dodecamers, each of which may have different
functions or toxicities, as well as "misfolded" protofibrillar and
insoluble fibrillar forms, and possibly a pore-like form consisting
of I-forget-how-many Abetas. In addition, proteins form complexes
that have functions that are different from those of the non-complexed
protein. I look forward to seeing how the Protein Ontology unfolds, so
to speak!

- June

On Jul 19, 2007, at 11:23 AM, Darren Natale wrote:
We don't yet have formal definitions for many of the classes and
relations (the effort only began in earnest a few months ago).
But, basically, there is a distinction made between the full-length
(in terms of amino acid sequence) protein and the sub-length parts
of proteins (commonly called domains by protein scientists,
unfortunately). The term "whole protein" is somewhat of a
placeholder; it is used to signify the evolutionary classes
(families) of full-length proteins as opposed to the evolutionary
classes of domains. "Sequence form" is again a placeholder term used
to denote the initial translation product from an mRNA, which
itself might be based on a "normal" gene or a mutant thereof, or
which might be one of several possible alternatively spliced
transcripts from the normal or mutant gene. The "cleaved and/or
modified product" is a further breakdown of those initial
translation products, and allows one to distinguish between a
phosphorylated version of a protein and the non-phosphorylated
version (as an example). The need for the latter derives from the
fact that the two versions might have different functions.
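
As a rough illustration of the three levels described above, here is a minimal Python sketch. It is not the PRO schema: the class name ProteinClass, its fields, and the example labels are all hypothetical, and the only thing taken from the description is the nesting of "whole protein", "sequence form", and "cleaved and/or modified product".

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProteinClass:
    """One node in a hypothetical parent/child hierarchy (not the real PRO schema)."""
    name: str
    level: str  # "whole protein", "sequence form", or "cleaved/modified product"
    # parent is excluded from repr/compare to avoid infinite recursion
    parent: Optional["ProteinClass"] = field(default=None, repr=False, compare=False)
    children: List["ProteinClass"] = field(default_factory=list)

    def add_child(self, name: str, level: str) -> "ProteinClass":
        """Create a more specific class beneath this one and return it."""
        child = ProteinClass(name=name, level=level, parent=self)
        self.children.append(child)
        return child

    def lineage(self) -> List[str]:
        """Names from the most general class down to this one."""
        names = []
        node: Optional["ProteinClass"] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return list(reversed(names))


# Placeholder labels only; no real protein is being described here.
family = ProteinClass("example protein family", "whole protein")
normal = family.add_child("translation product of normal gene", "sequence form")
mutant = family.add_child("translation product of mutant gene", "sequence form")
phospho = normal.add_child("phosphorylated form", "cleaved/modified product")
unmod = normal.add_child("non-phosphorylated form", "cleaved/modified product")

print(phospho.lineage())
```

The point of keeping the phosphorylated and non-phosphorylated forms as separate sibling nodes is that each can then carry its own function annotation, which is the need stated above.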
Eric Jain wrote:
Darren Natale wrote:
We recently began a new Protein Ontology (PRO) effort geared
precisely toward the formal definition of the "smaller entities"
referred to by Alan. By "we" I mean the PRO Consortium,
comprising the PIs Cathy Wu of PIR (which is also a member
organization of the UniProt Consortium), Barry Smith of SUNY
Buffalo, and Judy Blake of Jackson Labs. PRO is being developed
within the framework of the OBO Foundry, and aims to specify
protein entities at the level mentioned by Chris (accounting for
splice variation and post-translational modification and
cleavage). Where appropriate, PRO will indeed make reference both
to other ontologies and to UniProt Knowledgebase (UniProtKB)
records. Furthermore, we are also undertaking the "wildly
ambitious" job of representing broader, more-inclusive classes of
similar proteins based on evolutionary relatedness.
A further description of PRO (with examples and a link to a paper)
can be found at http://pir.georgetown.edu/pro
This will no doubt be interesting to quite a few people here! For
the sake of this discussion, could you elaborate a bit more on how
the different concepts in PRO are defined, i.e. what is a
"protein", "whole protein", "sequence form" and "cleaved and/or
modified product"?