Re: adding pubmed ids to BAMS

Huajun Chen @ Zhejiang University Fri, 20 Apr 2007 08:28:49 -0700


We ran into similar problems when we tried to relate publications to neurons.


Check the research notes for CA3 pyramidal neuron:
http://senselab.med.yale.edu/senselab/NeuronDB/ndbEavSum.asp?id=259&mo=4&re=

All of the notes are about papers that provide evidence for the
presence or absence of a receptor/current/transmitter in a specific
compartment.  The notes cannot simply be attached to the cell class,
since they are actually annotations that belong on statements.  All of
the notes should be attached to the owl:Restriction defined for the
corresponding receptor/current/transmitter; see the definition of the
CA3 pyramidal neuron in old DL syntax below.
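To make the idea of attaching notes to restrictions concrete, here is a minimal Python sketch. It is not how Protégé or OWL stores annotations; the Restriction tuple, the notes dictionary, and the annotate helper are hypothetical names used only for illustration, and the note text is a placeholder.

```python
from typing import NamedTuple


class Restriction(NamedTuple):
    """An anonymous 'prop SOME filler' restriction inside one compartment."""
    compartment: str
    prop: str
    filler: str
    negated: bool = False  # True for 'NOT (prop SOME filler)'


# Notes keyed by the restriction itself, so no class name is needed.
notes = {}


def annotate(restriction, note):
    """Attach a research note to an unnamed restriction."""
    notes.setdefault(restriction, []).append(note)


# Taken from the Dad compartment of the CA3 definition below.
ampa_in_dad = Restriction("Dad", "has_Receptors", "AMPA")
annotate(ampa_in_dad, "placeholder note text")

# Lookup works by structure: an equal tuple finds the same notes.
print(notes[Restriction("Dad", "has_Receptors", "AMPA")])
```

The point of the sketch is only that a restriction can be identified by its structure, so the note does not force us to invent a name.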

If we take Alan's approach, we have to create a huge number of
additional named classes. For example, for CA3 we have to create
30 extra named classes. If we take that as the average for other cells,
we have to create nearly 30*33=990 new named classes for all cells in
NeuronDB. Previously we had fewer than 50 classes in total.

We also have to come up with a new class name for each one, which tends
to be long and somewhat odd. For example, the class name might be
AMPA_in_DAD_in_CA3_pyramidal_neuron.
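A rough Python sketch of what generating such names would look like, assuming the pattern filler_in_compartment_in_cell suggested by the example name. The named_classes helper and the data layout are hypothetical; the fillers are the four Dad restrictions from the CA3 definition below.

```python
CELL = "CA3_pyramidal_neuron"

# compartment -> fillers of its has_Receptors / has_Currents restrictions
# (only the Dad compartment is listed here, written as in the example name)
DAD_RESTRICTIONS = {"DAD": ["AMPA", "NMDA", "I_p_q", "I_K"]}


def named_classes(cell, restrictions):
    """Return one class name per (filler, compartment) restriction."""
    return [
        f"{filler}_in_{compartment}_in_{cell}"
        for compartment, fillers in restrictions.items()
        for filler in fillers
    ]


names = named_classes(CELL, DAD_RESTRICTIONS)
print(names[0])   # AMPA_in_DAD_in_CA3_pyramidal_neuron
print(30 * 33)    # 990, the estimate for all cells
```

Even for one compartment this already yields four new names, which is where the estimate of about 30 per cell comes from.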

Unnamed classes are one of the elegant and neat features of DL,
especially in cases where we do not want to, or do not know how to,
specify a name explicitly.
Besides, I don't think unnamed classes prevent evidential
inference from taking place.  For example, the inconsistencies we've just
found were inferred from those unnamed classes.

I've also found that Protégé does support annotating unnamed classes, but
I'm not sure whether the OWL specification allows us to do that.

Best all,
Huajun

Principal_Neuron AND

ro:hasPart SOME [(Dad AND
                 (has_Receptors SOME AMPA) AND
                 (has_Receptors SOME NMDA) AND
                 (has_Currents SOME I_p_q) AND
                 (has_Currents SOME I_K)),
                (Dap AND
                 (has_Receptors SOME Glutamate) AND
                 (NOT (has_Currents SOME I_Na_t)) AND
                 (has_Currents SOME I_K) AND
                 (has_Currents SOME I_p_q)),
                (Soma AND
                 (has_Receptors SOME AMPA) AND
                 (has_Receptors SOME NMDA) AND
                 (has_Receptors SOME GabaB) AND
                 (has_Receptors SOME GabaA) AND
                 (has_Receptors SOME mGluR) AND
                 (has_Receptors SOME Gaba) AND
                 (has_Currents SOME I_p_q) AND
                 (has_Currents SOME I_K_Ca) AND
                 (has_Currents SOME I_Na_t) AND
                 (has_Currents SOME I_N) AND
                 (has_Currents SOME I_A) AND
                 (has_Currents SOME I_K) AND
                 (has_Currents SOME I_IR_Q_h) AND
                 (has_Currents SOME I_T_low_threshold) AND
                 (has_Currents SOME I_L_high_threshold)),
                (Dam AND
                 (has_Receptors SOME mGluR) AND
                 (has_Receptors SOME GabaB) AND
                 (has_Receptors SOME AMPA) AND
                 (has_Receptors SOME Gaba) AND
                 (has_Receptors SOME Glutamate) AND
                 (has_Receptors SOME NMDA) AND
                 (has_Receptors SOME GabaA) AND
                 (has_Currents SOME I_L_high_threshold) AND
                 (has_Currents SOME I_p_q) AND
                 (has_Currents SOME I_T_low_threshold) AND
                 (has_Currents SOME I_K)),
                (T AND
                 (has_Receptors SOME NO) AND
                 (has_Currents SOME I_N) AND
                 (has_Transmitters SOME Glutamate)),
                (A AND
                 (has_Currents SOME I_Na_t)),
                (AH AND
                 (has_Currents SOME I_K) AND
                 (has_Currents SOME I_Na_t))]


On 4/18/07, Alan Ruttenberg <[EMAIL PROTECTED]> wrote:

Here is an idea I am exploring. Perhaps you might mock this up:

The essential idea is that evidence and other annotation is about
named classes. In those cases where one might think of annotating
some axiom, or piece of axiom, we would instead look for the class
that is the referent of the annotation and name that class.
Then, we can connect that class, using an annotation property,  to
whatever kind of annotation or evidence we think appropriate.

Suppose we have a class HumanP53Protein, which we will define as:
Those proteins whose sequence of amino acids is described by the
sequence in the sequence information field of the Uniprot P53_Human
Record, or which are derived from such a protein. (I'm open to
discussion on what this definition should be, BTW, but I think we
should have one.)

One gene ontology annotation to P53 is:
GO:0000739; Molecular function: DNA strand annealing activity
(inferred from direct assay from UniProtKB).

GO:0000739 is defined in OBO as a class, a subclass of function.

We will say that the referent of this annotation is the class

HumanP53ProteinWithFunctionDNAStrandAnnealing:  HumanP53Protein and
has_function some GO:0000739

The annotation property itself might be called "ExistsAccordingTo",
by which we mean that this class has instances.

The thing it exists according to is

Inference001
   type InferredFromDirectAssay
   describedInPaper theArticlePMID1234Describes

So our annotation is

HumanP53ProteinWithFunctionDNAStrandAnnealing ExistsAccordingTo
Inference001
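As a mock-up of the pieces so far, the following Python sketch holds the statements as plain (subject, predicate, object) tuples and looks up the evidence for the named class. It is only an illustration: the "equivalentTo" predicate is an assumed reading of the class definition above, and evidence_for is a hypothetical helper, not part of any OWL tool.

```python
NAMED = "HumanP53ProteinWithFunctionDNAStrandAnnealing"

asserted = {
    # Assumption: the named class is defined as equivalent to the expression.
    (NAMED, "equivalentTo", "HumanP53Protein and has_function some GO:0000739"),
    ("Inference001", "type", "InferredFromDirectAssay"),
    ("Inference001", "describedInPaper", "theArticlePMID1234Describes"),
    # The annotation itself.
    (NAMED, "ExistsAccordingTo", "Inference001"),
}


def evidence_for(cls, triples):
    """Map each inference annotated on cls to the papers that describe it."""
    inferences = [o for s, p, o in triples
                  if s == cls and p == "ExistsAccordingTo"]
    return {
        inf: [o for s, p, o in triples
              if s == inf and p == "describedInPaper"]
        for inf in inferences
    }


print(evidence_for(NAMED, asserted))
```

Note that nothing here reifies a statement; the evidence hangs off the class name alone.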

Up to this point we have been conservative. We haven't made any
statement about P53 in general. Here, we will overstate (our only
choice, if we want to make a statement about biology from which some
useful inference can be done, given the evidence we have)

HumanP53Protein subclassOf HumanP53ProteinWithFunctionDNAStrandAnnealing

This may be wrong. For instance, it may be the case that only that
P53 phosphorylated in some way actually has this function.
I hope that by some other statement, a contradiction is inferred that
will force us (or the curators) to be more specific.

----

What's nice about this?

1) We are making statements about biology (better than making
statements about "terms")
2) There is no RDF reification involved - the main contender for
representing this sort of thing.
3) We have been (relatively) conservative about what we say there is
evidence for
4) We are owning the fact that we are making an overstatement
5) We are enabling some inference to take place.

What's the cost?

1) One extra triple, in which we name the class
HumanP53ProteinInvolvedInDNADamageResponse
Where we previously would have used a restriction to introduce the
participation, we now use the named class.
2) When querying about what the evidence is for, we need to query the
asserted (or told) assertions only. That's because after inference
has been done, new assertions may be known about
HumanP53ProteinWithFunctionDNAStrandAnnealing and we won't be able to
tell the difference between what was asserted and what is inferred,
given that we have associated only the class name with the evidence.
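The second cost can be illustrated with a toy example in Python. The infer function below is only a stand-in for a reasoner (it computes the transitive closure of subClassOf and nothing else), and PhosphorylatedP53 is a hypothetical class added so that there is something to infer.

```python
SUB = "subClassOf"
NAMED = "HumanP53ProteinWithFunctionDNAStrandAnnealing"

# What was actually asserted (told).
told = {
    (NAMED, "ExistsAccordingTo", "Inference001"),
    ("HumanP53Protein", SUB, NAMED),
    ("PhosphorylatedP53", SUB, "HumanP53Protein"),  # hypothetical class
}


def infer(triples):
    """Toy stand-in for a reasoner: transitive closure of subClassOf."""
    closed = set(triples)
    while True:
        new = {
            (a, SUB, d)
            for a, p, b in closed if p == SUB
            for c, q, d in closed if q == SUB and c == b
        } - closed
        if not new:
            return closed
        closed |= new


def subclasses(cls, triples):
    """Classes stated (in the given triple set) to be subclasses of cls."""
    return {s for s, p, o in triples if p == SUB and o == cls}


inferred = infer(told)
print(subclasses(NAMED, told))      # what was asserted
print(subclasses(NAMED, inferred))  # asserted plus inferred
```

Querying the inferred set returns more about the named class than was ever asserted, and the evidence annotation cannot tell the two apart, which is why the query has to run against the told triples.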

---

Taking this to BAMS, it means that we associate the paper with the
cell class, for which we already have a name.
For the "molecule is found in cell" cases, we create a named class
for the "cell contains some molecule" class, use that
class in place of the restriction, and associate the paper with that
named class.

You can define

Class(article :partial)
Class(pubmedRecord :partial)
ObjectProperty(definedByPMID inversefunctional)

Represent the pubmed record as an instance of pubmedRecord named
http://purl.org/commons/pubmed/1234

The last issue is the nature of the relationship between the paper
and the class. If we can't easily distinguish whether
these annotations are evidence or simply discussion, we could use the
relation "isMentionedBy", by which we mean that the class
(or some instances of the class) are discussed in the paper.
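A small Python sketch of the BAMS side, assuming the purl.org URI pattern shown above and the "isMentionedBy" relation. The function names and the CA3_pyramidal_neuron class name are illustrative only; 1234 is the same placeholder id used above.

```python
PUBMED_BASE = "http://purl.org/commons/pubmed/"


def pubmed_record_uri(pmid):
    """Build the record URI for a PubMed id; reject non-numeric ids."""
    pmid = str(pmid).strip()
    if not pmid.isdigit():
        raise ValueError(f"not a PubMed id: {pmid!r}")
    return PUBMED_BASE + pmid


def is_mentioned_by(cls, pmid):
    """Return the triple linking a named class to the paper's record."""
    return (cls, "isMentionedBy", pubmed_record_uri(pmid))


print(pubmed_record_uri(1234))  # http://purl.org/commons/pubmed/1234
print(is_mentioned_by("CA3_pyramidal_neuron", "1234"))
```

Using the URI rather than a bare string id keeps the pubmed record an identifiable thing, which is what the inverse functional property relies on.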

---

Call me if you want to discuss this. Admittedly this may seem
involved and odd, since it is a new idea, though I will blame Chris
and Jonathan, who I bounced it off of, for not telling me straight
off it didn't make sense :)

But how about we give it a go and see what it feels like. I'm
planning to use this translation for the GO annotations and the rest
of the similar sources, unless somebody comes forth with some
arguments about what would be a better idea.

Best,
Alan

On Apr 18, 2007, at 3:49 PM, [EMAIL PROTECTED] wrote:

>
>> From what Mihai sent me, the pubmed refs are about:
>
>> the cell and
>> the fact the molecule is found in cell
>
> Pending your recommendation, I had tentatively suggested
> representing this as:
>
> pubmedID has "<id>" or
> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>
> where one or more of these is associated with a cell. I was under the
> impression that you were thinking about a general representation
> that everyone
> would use for pubmedID. So, I haven't yet added these to the BAMS
> OWL version.
>
>> OK. Can you send me this for a quick look?
>
> I'm not sure what you are asking to see. Do you want to see the
> original
> tables Mihai sent me?
>
> thanks,
>
> jb
>
>
>
> Date:  Wed, 18 Apr 2007 12:30:17 -0400
> From:  Alan Ruttenberg <[EMAIL PROTECTED]>
> To:  John Barkley <[EMAIL PROTECTED]>
> Cc:  Jonathan A Rees <[EMAIL PROTECTED]>
> Subject:  Re: adding pubmed ids to BAMS
> Quoting Alan Ruttenberg <[EMAIL PROTECTED]>:
>
>>
>> On Apr 13, 2007, at 1:51 PM, John Barkley wrote:
>>
>>> I have confirmed from Mihai that all of the pubmed references in
>>> BAMS are evidence for or elaboration about.
>>
>> OK. Can you send me this for a quick look?
>> Is it clear what they are about
>> i.e.
>>
>> the cell
>> the part
>> the fact that cell is located in part
>> the fact the molecule is found in cell
>> the fact the molecule is found in part
>> the fact the molecule is found in cell in part
>> etc.
>>
>> ?
>>
>>>
>>>
>>> ----- Original Message ----- From: "Alan Ruttenberg"
>>> <[EMAIL PROTECTED]>
>>>
>>>> Don't have time at this moment, but I think that generally you
>>>> want to state that the article is either evidence for, or
>>>> elaboration about, the scientific statement involving the cells,
>>>> molecules, etc. Then use the pubmed id in some standard URI
>>>> form (maybe neurocommons record url style) or
>>>> Jonathan's purl.org suggestion. In other words the pubmed id is
>>>> the identifier for a thing (the article, or the abstract,
>>>> depending on one's point of view).
>>>>
>>>> More details later.
>>>>
>>>> You could look and see how Gene ontology represents evidence.
>>>>
>>>> -Alan
>>>>
>>>> On Apr 11, 2007, at 3:46 PM, John Barkley wrote:
>>>>
>>>>> hi alan,
>>>>>
>>>>> I received spreadsheets from Mihai relating cells & pubmed ids,
>>>>> and cells, molecules, & pubmed ids. I wanted to consult with you
>>>>> about your preferences for how to integrate this into BAMS. I am
>>>>> thinking something like defining a datatype property pubmedID
>>>>> from owl:Thing to string. Then for cells, you would have:
>>>>>
>>>>> pubmedID has "<id>"
>>>>>
>>>>> and for cells with molecules within, you would have:
>>>>>
>>>>> cell_has_molecule_within some (<cell> and (pubmedID has "<id>"))
>>>>>
>>>>> Please let me know.
>>>>>
>>>>> thanks,
>>>>>
>>>>> jb
>>>>>
>>>>
>>>
>>>
>>
>>
>
>
