Phillip Lord wrote:
Well, swissprot refers to isoforms I think. Push comes to shove, just use the
sequence.
Note that we do have stable identifiers for isoforms, for example in
http://beta.uniprot.org/uniprot/P00750.rdf you can find URIs for the
isoforms we describe, e.g. http://purl.unipro
> "Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:
Alan> Well, if I am restricted to using such Uniprot classes I will have
Alan> trouble representing important scientific findings. If Uniprot only
Alan> has one name for the two molecules, one of which has a snp that leads
Alan>
les/NOT-GM-07-108.html
Karen Skinner
NIDA/NIH
-Original Message-
From: Eric Jain [mailto:[EMAIL PROTECTED]
Sent: Friday, July 20, 2007 11:56 AM
To: Alan Ruttenberg
Cc: Phillip Lord; Matthias Samwald; public-semweb-lifesci@w3.org
Subject: Re: Ambiguous names. was: Re: URL +1, LSID -1
Alan
Alan Ruttenberg wrote:
"Remember that one of the reasons this came up was the claim that the
Uniprot URI should be used to identify a set of real things."
OK, I think that describes my current point of view.
I get confused when I read statements that sound like "x means the same
thing in in
On Jul 20, 2007, at 3:52 AM, Eric Jain wrote:
Alan Ruttenberg wrote:
Whose mission? Remember that one of the reasons this came up was
the claim that the Uniprot URI identified the protein in the real
world.
Who claimed that?
If we wanted to identify each protein in the real world we'd ha
Alan Ruttenberg wrote:
Whose mission? Remember that one of the reasons this came up was the
claim that the Uniprot URI identified the protein in the real world.
Who claimed that?
If we wanted to identify each protein in the real world we'd have to assign
zillions of URIs just for the protein
Summary: Continued discussion of whether we need to have identifiers
for protein classes in addition to those for records. Example
finding is given to support my view that we do need them, in response
to Phil's suggestion I examine my scenarios.
[yah, I know I'm not being consistent
On Jul 19, 2007, at 4:16 AM, Eric Jain wrote:
Alan Ruttenberg wrote:
In that case, I would recommend that it is unwise to use Uniprot
ids as identifiers of protein classes on the semantic web. Doing
so would encourage exactly the kind of ambiguity that we need to
avoid in order to write
On Jul 18, 2007, at 11:26 AM, Phillip Lord wrote:
I think that there are many clear reasons for keeping statements
about the informatics entities -- the database entries for example.
No question about that. I totally agree.
To do otherwise, runs the risk of enormous mission creep (always a
> Many post-translational modifications like glycosylation
> (http://www.functionalglycomics.org/static/index.shtml) in proteins
> fundamentally change the (functional) 'nature' of the protein (as also the
> molecular structure of the protein in case of glycosylation through
> addition of sugar cha
> An interesting issue, one of identity. What determines the identity of
> a molecule, a protein in this case?
I strongly believe that the identity of a molecule is only dependent on
its physical (chemical) composition.
> If you have a protein that becomes
> phosphorylated, is the phosphoryla
Quite a nice example! These are the sorts of issues that we must
contend with while creating the PRO framework. In fact, this addresses
another issue of scope; that is, whether or not (in the long or short
term) to also account for homodimers, trimers, and so on (currently, GO
handles heter
Original message
>Date: Thu, 19 Jul 2007 16:29:18 -0400
>From: Michel_Dumontier <[EMAIL PROTECTED]>
>Subject: RE: protein entities (was Re: Rules (was Re: Ambiguous names. was:
>Re: URL +1, LSID -1)
>To: Darren Natale <[EMAIL PROTECTED]>, Michel_Dumont
If I may put forward a key protein in Alzheimer disease as an example
that we are grappling with, there is full-length APP (which itself
has a number of forms as well as mutations); various peptides derived
from cleavage of APP; and then multimeric forms of the peptides,
particularly Abet
Michel_Dumontier wrote:
Darren,
Also, while we recognize
that there are different qualities that can be ascribed to a basically
identical biochemical entity in different structural conformations or
states of ligand binding, we are not attempting (at least in the
beginning) to describe these st
Darren,
> Also, while we recognize
> that there are different qualities that can be ascribed to a basically
> identical biochemical entity in different structural conformations or
> states of ligand binding, we are not attempting (at least in the
> beginning) to describe these structural conforma
Michel_Dumontier wrote:
Sequence form is again a placeholder term ...
... distinguish between a phosphorylated version of a
protein and the non-phosphorylated version (as an example). The need
for the latter derives from the fact that the two versions might have
different functions.
Inde
as Re: Rules (was Re: Ambiguous names.
was: Re: URL +1, LSID -1)
We don't yet have formal definitions for many of the classes and
relations (the effort only began in earnest a few months ago). But,
basically, there is a distinction made between the full-length (in
terms
of amino acid sequence) p
9, 2007 11:24 AM
> To: Eric Jain
> Cc: Alan Ruttenberg; Chris Mungall; Bijan Parsia; public-semweb-lifesci
> hcls
> Subject: Re: protein entities (was Re: Rules (was Re: Ambiguous names.
> was: Re: URL +1, LSID -1)
>
>
> We don't yet have formal definitions for many of t
Thank you Chris for including me on this thread. I can well see why you
did so!
We recently began a new Protein Ontology (PRO) effort geared precisely
toward the formal definition of the "smaller entities" referred to by
Alan. By "we" I mean the PRO Consortium, comprising the PIs Cathy Wu
We don't yet have formal definitions for many of the classes and
relations (the effort only began in earnest a few months ago). But,
basically, there is a distinction made between the full-length (in terms
of amino acid sequence) protein and the sub-length parts of proteins
(commonly called
Protein, in this scheme, is the amino acid polymer produced by a
translation process using an mRNA as a template. I suppose this
excludes peptides (also amino acid polymers) that are produced
non-ribosomally, but perhaps that is okay for the time being. The
precise definition will be constr
Darren Natale wrote:
We don't yet have formal definitions for many of the classes and
relations (the effort only began in earnest a few months ago). But,
basically, there is a distinction made between the full-length (in terms
of amino acid sequence) protein and the sub-length parts of protei
Darren Natale wrote:
We recently began a new Protein Ontology (PRO) effort geared precisely
toward the formal definition of the "smaller entities" referred to by
Alan. By "we" I mean the PRO Consortium, comprising the PIs Cathy Wu of
PIR (which is also a member organization of the UniProt Con
> "Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:
Alan> Summary: Answering Phil's questions, and clarifying one thing he
Alan> asserts about what I said.
>> What if they have a polymorphism?
Alan> No.
>> Are two isoforms from an alternate splice the same protein?
Alan> No.
> "MS" == Matthias Samwald <[EMAIL PROTECTED]> writes:
>> It would be more satisfying for us to know intentionally what we mean
>> by "protein". It would be good to have a clear set of definitions. But,
>> ultimately, I think it would be mistaken. If we have the ability to
>> expr
Alan Ruttenberg wrote:
In that case, I would recommend that it is unwise to use Uniprot ids as
identifiers of protein classes on the semantic web. Doing so would
encourage exactly the kind of ambiguity that we need to avoid in order
to write statements that will not confuse semantic web agent
On Jul 18, 2007, at 6:02 AM, Xiaoshu Wang wrote:
But please note, just because "http://purl.uniprot.org/core/
Protein" contains the string "Protein" does not make it the
identifier for *Protein*, unless everyone else agrees to it
I wouldn't have thought that http://purl.uniprot.org/core/Pro
I agree with Alan but feel sympathy for Eric as well. In the absence of
a universally accepted ontology for describing biological entities, Eric
has to develop something to start working on SW.
But please note, just because "http://purl.uniprot.org/core/Protein"
contains the string "Protei
On Jul 17, 2007, at 1:44 AM, Eric Jain wrote:
Chris Mungall wrote:
We have also switched from talk of defining specific proteins to
rules to automatically annotate protein records.
You're right, small digression, hope it's of interest anyway :-)
Definitely - although I don't think OWL/SW
As EricJ's recent note confirmed, and as I suspected, the problem
goes substantially deeper than the issue of simply punning the record
and the protein class.
The fundamental problem is that the record, having not been designed
as the definition of something, isn't the *unambiguous* defin
In that case, I would recommend that it is unwise to use Uniprot ids
as identifiers of protein classes on the semantic web. Doing so would
encourage exactly the kind of ambiguity that we need to avoid in
order to write statements that will not confuse semantic web agents
(including peopl
Chris Mungall wrote:
We have also switched from talk of defining specific proteins to rules
to automatically annotate protein records.
You're right, small digression, hope it's of interest anyway :-)
I read "broad classes of proteins" as being more inclusive than the
class denoted by OPSD_H
Alan Ruttenberg wrote:
To clarify, no, I didn't mean this. I meant that the definitions of
Uniprot records are already broad in the sense that sometimes multiple
splice variants are included in a single record, as are population and
disease-causing variants, according to Eric. Basically I don't
Thanks for the elaboration Chris - as usual better expressed than
when I tried :)
One minor clarification:
On Jul 16, 2007, at 11:24 PM, Chris Mungall wrote:
I read "broad classes of proteins" as being more inclusive than the
class denoted by OPSD_HUMAN in my interpretation, but also
in
On Jul 16, 2007, at 10:29 AM, Eric Jain wrote:
Bijan Parsia wrote:
Eric, I would be very much interested in some more details about
the sort of rules used and how they are used. I personally tend to
distinguish between the use of rules in modeling and the use of
rules for data munging t
Matthias Samwald wrote:
The evidence for what I point out is found everywhere: "P12345 is
expressed in some tissues"... according to Alan's points, this
would be a wrong statement.
If the Semantic Web is really to find widespread adoption, they would be saying
something like "C12345 is
Summary: Answering Phil's questions, and clarifying one thing he
asserts about what I said.
On Jul 16, 2007, at 12:22 PM, Phillip Lord wrote:
"Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:
Take these rhetorical questions:
I am interpreting these as questions of fact, that "same"
Bijan Parsia wrote:
Eric, I would be very much interested in some more details about the
sort of rules used and how they are used. I personally tend to
distinguish between the use of rules in modeling and the use of rules
for data munging tasks. Obviously, where you draw this boundary can be a
> It would be more satisfying for us to know intentionally what we
> mean by "protein". It would be good to have a clear set of
> definitions. But, ultimately, I think it would be mistaken. If we
> have the ability to express "the class of protein molecules defined
> by the swissprot record OPSD_
On Jul 16, 2007, at 5:53 PM, Eric Jain wrote:
Alan Ruttenberg wrote:
We've got a SW language for making definitions - it's called OWL.
One thing I can say here is that there is the trend that curators
create rules (and check the outcome) instead of adding data
themselves directly. Unfort
Alan Ruttenberg wrote:
We've got a SW language for making definitions - it's called OWL.
One thing I can say here is that there is the trend that curators create
rules (and check the outcome) instead of adding data themselves directly.
Unfortunately OWL is insufficient for the kind of ugly r
> "Alan" == Alan Ruttenberg <[EMAIL PROTECTED]> writes:
>> I agree. The argument is that it's very hard to describe what you mean by
>> a "protein". We almost certainly don't mean a protein molecule. We might
>> mean a type of protein. But then we don't know whether two protein
>> mol
I'm not advocating that we build definitions around protein
sequences, just that we build definitions, period.
And that we don't confuse a page of html with a definition.
The uniprot curators are great! They know what they are looking for
and they are skilled at finding it. Let's put work i
Alan Ruttenberg wrote:
I'm confused. I think we all would agree that there are instances of
proteins and we have a good idea of what they are. We also know that
there are groups of proteins that are built off the same template and
share certain properties. If we define classes using such prope
On Jul 16, 2007, at 10:19 AM, Phillip Lord wrote:
"MK" == Marijke Keet <[EMAIL PROTECTED]> writes:
MK> Lack of sufficient knowledge about a particular (biological) entity is
MK> a sideshow, not an argument, to the issue of distinguishing real proteins from
MK> their records.
I
On Mon, 16 Jul 2007 16:09:03 +0200, Eric Jain wrote:
> http://purl.uniprot.org/uniprot/P12345 does not represent any
> physical object, but it is a useful generalization of certain
> physical objects that you might find.
That sounds like defining a class of physical objects to me.
- Matthias
Eric Neumann ha scritto:
Alan,
the life science community has for years applied an implicit
transitivity to records of things, so that when many say:
"http://purl.uniprot.org/uniprot/P12345 is expressed only in species
Homo sapiens"
they usually imply that "the protein referenced by
data
> "MK" == Marijke Keet <[EMAIL PROTECTED]> writes:
MK> Lack of sufficient knowledge about a particular (biological) entity is
MK> a sideshow, not an argument, to the issue of distinguishing real proteins from
MK> their records.
I agree. The argument is that it's very hard to describe
> Having worked directly with bench scientists for many years, they
> view data and databases as "extensions" to what they are really
> interested in.
Uhm, probably this differentiates molecular biology from classic, organismal
biology (my background). I would never make such statements.
> Yo
Matthias Samwald wrote:
Well, they might talk as if database entries and physical objects were
the same, but this is not what they *think*. With the Semantic Web /
ontologies we want to capture the semantics and the actual thinking, not
the linguistic / textual surface representations.
http
Not true Matthias.
Having worked directly with bench scientists for many years, they
view data and databases as "extensions" to what they are really
interested in.
Your example of "bank" and "bank" are disjoint and non-related; in
the case of gene and gene-data-record there is a kind of c
> the life science community has for years applied an implicit
> transitivity to records of things, so that when many say:
> "http://purl.uniprot.org/uniprot/P12345 is expressed only in
> species Homo sapiens"
> they usually imply that "the protein referenced by
> datarecord:http://purl.uniprot.
Alan,
the life science community has for years applied an implicit
transitivity to records of things, so that when many say:
"http://purl.uniprot.org/uniprot/P12345 is expressed only in species
Homo sapiens"
they usually imply that "the protein referenced by datarecord:http://
purl.unipro
Waclaw Kusnierczyk wrote:
Oh, no. If there are two proteins out there, they are two, and you have
nothing to *decide* about that. You may fuse them in that or another
way, but this does not change the fact that at the previous time there
were two.
"Out there" you'll find all kinds of molec
sorry! i got confused by the response to response pattern.
vQ
Marijke Keet wrote:
p.s.: I did not write that, that was Eric.
I know I ought to have taken up that point as well, but then, my time is
limited.
regards,
marijke
Waclaw Kusnierczyk ha scritto:
Marijke Keet wrote:
The problem
ic-semweb-lifesci"
; "Mark Wilkinson" <[EMAIL PROTECTED]>;
"Benjamin Good" <[EMAIL PROTECTED]>; "Natalia Villanueva Rosales"
<[EMAIL PROTECTED]>
Sent: Monday, July 16, 2007 4:52 AM
Subject: Re: Ambiguous names. was: Re: URL +1, LSID -1
Mari
Waclaw Kusnierczyk wrote:
sure. how can you determine that *two* entities are *one* entity?
(they may become one, but that's a different story.)
You mean how we *decide* that they should be a single "entity"? I'm afraid
I can't tell, not because it's our trade-secret, but because such decisi
Marijke Keet wrote:
"...due to lack of knowledge...": and I presume it may be that
biologists disagree also because of insufficient knowledge about the
protein, and/or its (over-)simplification, that is, comparing apples and
oranges at a too coarse level of granularity. Moreover, that we don't
Marijke Keet wrote:
The problem with proteins is that I haven't seen any biologists agree
on a general way to determine whether two proteins are the same or
not,
sure. how can you determine that *two* entities are *one* entity?
(they may become one, but that's a different story.)
vQ
Eric Jain ha scritto:
Marijke Keet wrote:
just because proteins are smaller than persons does not make them
into mere abstractions--thingies of your imagination that only
materialise by means of their representations in some information
system. proteins were around for quite a while before
Marijke Keet wrote:
and by analogy, then there is no real Eric Jain, just a webpage with
that name, a blog, a URL http://eric.jain.name/, some database records
in the uniprot HR systems with a string "Eric Jain" and related data,
email trails in the hcls archive and, well, any person that
Eric Jain ha scritto:
Alan Ruttenberg wrote:
There are proteins, and there are records about proteins. Records
come in different formats. If I make a statement using this url, is
it about the record? or the protein? How should the agent come to know?
The concept of "protein" is abstract eno
Alan Ruttenberg wrote:
There are proteins, and there are records about proteins. Records come
in different formats. If I make a statement using this url, is it about
the record? or the protein? How should the agent come to know?
The concept of "protein" is abstract enough that anything you mi
64 matches