Summary: Answering Phil's questions, and clarifying one thing he
asserts about what I said.
On Jul 16, 2007, at 12:22 PM, Phillip Lord wrote:
"Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:
Take these rhetorical questions:
I am interpreting these as questions of fact, that "same" means
instances of the same class, with the classes you name considered
narrowly construed. That doesn't mean that we can't define broader
classes in which instances of these two types are considered to be
members of the same class.
Is Red Opsin in human the same as Red Opsin in Cattle?
No.
Is Red Opsin in me, necessarily the same as Red Opsin in you?
No.
What if they have a polymorphism?
No.
Are two isoforms from an alternate splice the same protein?
No.
If a protein has been partly digested, is it still the same?
No.
Are haemoglobin alpha and beta the same?
No.
The point is that you can't deal with a protein computationally. You
can't resolve it, analyze it computationally. It's always second-hand
information that you want to deal with.
Yes, but we generalize and boldly make statements about what we can
directly see, and find that these are supported by further
experiments or not, and possibly revise our statements. I *think* we
want to be able to capture such statements on the semantic web, no?
Yes, exactly. A uniprot record defines a class of proteins
extensionally. This means, antibodies to the proteins described by
OPSD_HUMAN (for example).
Well, if I tell my agent to go order some OPSD_HUMAN from Invitrogen,
what will you expect to get back? Or do you deny that I will want to
use identifiers such as this for this kind of purpose?
<snip in the interest of brevity>
If we have the ability to express "the class of protein molecules
defined by the swissprot record OPSD_HUMAN"
then I think we have all we need.
That would be a good start. How will we see if we've succeeded? I
have some ideas, like picking two people who work in the field,
asking them to describe the set of proteins described by the
swissprot record OPSD_HUMAN, and then comparing what they say. How
would you know when we've succeeded at this?
I think that if we were there, then we could effectively start to
build formal statements.
If we make our own definitions, all that we have done is duplicate
what the uniprot team are already doing. And we will, almost
inevitably, do it somewhat differently. All we would do is create
confusion. The only way that we ensure that we do the same thing as
uniprot is say "yeah, what they said".
Unsatisfying, maybe. Clear definitions are important. But
interoperability, and the lack of duplication are more so.
Forgive my confusion, but how exactly will we achieve
interoperability and lack of duplication if we don't have
definitions? How would we know that we don't have duplication, for
example?
<snip>
And, yet, you just told me that you could buy an antibody with just
a swissprot ID. So, let me restate the question, what are you going
to do with a protein ID that you are not going to do with a
swissprot ID, or "the protein formally known as OPSD_HUMAN"?
I did not say that. I've said some people have identified antibodies
by such ids. Unfortunately this information is of limited use when
actually ordering an antibody, where I am interested in much more
information, such as how specific it is, how it has been validated,
and other properties related to how it behaves in certain
experimental settings. I *want* to be able to have identifiers (URIs)
that are up to the job of ordering reagents.
-Alan