Summary: Continued discussion of whether we need to have identifiers for protein classes in addition to those for records. Example finding is given to support my view that we do need them, in response to Phil's suggestion I examine my scenarios.
[yah, I know I'm not being consistent about the summaries yet]

On Jul 18, 2007, at 2:22 PM, Phillip Lord wrote:

"Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:

Alan> Summary: Answering Phil's questions, and clarifying one thing he
  Alan> asserts about what I said.

What if they have a polymorphism?
  Alan> No.
Are two isoforms from an alternate splice the same protein?
  Alan> No.

In both of these you differ from uniprot.

Well, if I am restricted to using such Uniprot classes I will have trouble representing important scientific findings. If Uniprot has only one name for two molecules, one of which has a SNP that leads to a loss of function that is the initiating factor of a disease, then we have a problem, no? How do we say things about the disease-related form?


Unsatisfying, maybe. Clear definitions are important. But
interoperability, and the lack of duplication are more so.

Alan> Forgive my confusion, but how exactly will we achieve interoperability
  Alan> and lack of duplication if we don't have definitions? How would we
  Alan> know that we don't have duplication, for example?

If you create identifiers to describe proteins rather than protein records (like uniprot) then you have created a whole new set of IDs. When anyone wants
to talk about a protein, they will have to look up the ID.

As they will when they want to talk about a record. Of course, perhaps we will all add links saying that a record is about some set of classes of proteins, and that aspects of the proteins in a class can be described by pieces of the record.

But at least we'll know what we are talking about.


<snip>

And, yet, you just told me that you could buy an antibody with just a
swissprot ID. So, let me restate the question, what are you going to do with a protein ID that you are not going to do with a swissprot ID, or
"the protein formerly known as OPSD_HUMAN".

Alan> I did not say that. I've said some people have identified antibodies
  Alan> by such ids. Unfortunately this information is of limited use when
  Alan> actually ordering an antibody, where I am interested in much more
  Alan> information, such as how specific it is, how it has been validated,
  Alan> and other properties related to how it behaves in certain experimental
  Alan> settings. I *want* to be able to have identifiers (URIs) that are up to
  Alan> the job of ordering reagents.

Well, I am not sure that you are going to achieve this with an identifier. You
need significant extra amounts of metadata.

By that reasoning I don't need DOIs for publications. All I need is the URI for the journal and some metadata.

My point here is simple. Separating out the informatics and biology conforms better to our notion of reality, sure. But you are talking about modelling what makes a protein and, more, a type of protein. Work through your scenarios and see whether you need a protein ID for this. If not, you are introducing a
layer of abstraction that you don't need.

I'm trying to be able to make statements that capture, among other things, the conclusions that one finds in journal articles. In http://www.nature.com/onc/journal/v21/n46/full/1205845a.html there is a description of different isoforms of BAG-1. The different isoforms have names, e.g. "BAG-1 p29". This name indicates a class of protein instances. I expect I need a name and a definition for "BAG-1 p29" and the others, so that I don't get confused and think there is a contradiction between the statement that "BAG-1 p29 failed to protect the transfected cells from apoptosis" and "BAG-1 p50, p46 and p33 isoforms enhanced the resistance to apoptosis".

But I'm open to discussing suggestions for representing these statements using only the Uniprot record IDs, if you have any.
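
To make the BAG-1 scenario concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the identifier strings are made up (they are not real Uniprot or ontology IDs), and reducing each finding to a true/false "protects from apoptosis" value is a simplification of what the article says. It only shows that attaching the four findings to a single record-level ID yields an apparent contradiction, while attaching them to per-isoform class IDs does not.

```python
from collections import defaultdict

# Hypothetical identifiers, for illustration only. Neither the record ID
# nor the isoform class IDs are real Uniprot or ontology identifiers.
RECORD_ID = "record:BAG1"
ISOFORM_IDS = {
    "BAG-1 p50": "protein-class:BAG1-p50",
    "BAG-1 p46": "protein-class:BAG1-p46",
    "BAG-1 p33": "protein-class:BAG1-p33",
    "BAG-1 p29": "protein-class:BAG1-p29",
}

# Simplified findings: (isoform name, protects cells from apoptosis?)
FINDINGS = [
    ("BAG-1 p50", True),
    ("BAG-1 p46", True),
    ("BAG-1 p33", True),
    ("BAG-1 p29", False),
]


def index_findings(findings, subject_for):
    """Group the asserted values by the identifier chosen as the subject."""
    by_subject = defaultdict(set)
    for isoform, protects in findings:
        by_subject[subject_for(isoform)].add(protects)
    return dict(by_subject)


def conflicts(indexed):
    """Return subjects that carry more than one value for the property."""
    return sorted(s for s, values in indexed.items() if len(values) > 1)


# Every finding attached to the one record-level identifier.
record_level = index_findings(FINDINGS, lambda isoform: RECORD_ID)
# Each finding attached to the identifier of its own isoform class.
class_level = index_findings(FINDINGS, lambda isoform: ISOFORM_IDS[isoform])

print(conflicts(record_level))  # ['record:BAG1']
print(conflicts(class_level))   # []
```

A record-only scheme could of course carry the isoform name as extra metadata on each statement, but at that point the (record, isoform name) pair is acting as the class identifier anyway.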

-Alan


