RE: URI thoughts

Eric Neumann Wed, 21 Jun 2006 06:33:39 -0700


Xiaoshu,

Hmmm... I see a possible reason why SW is often hard to understand from a data provider's point of view....

The authority is the organization that has published the annotated data, such as NCBI or SwissProt. For the foreseeable future, they will be responsible for curating and managing some subset of the RDF graph for data records. Dereferencing, in my opinion, should always defer to the authority site responsible for the base data.

If you or someone else adds a new property to their base graph, the basic RDF model will allow merging, but that will not prevent inconsistencies (or "opinions") from being added. We will eventually have to address these issues of provenance, and there are several possible ways forward here. But by dereferencing an Entrez Gene data record, I should get back only that part of the graph that NCBI Entrez is responsible for. Outside references and annotations could be discovered/aggregated by specific services, such as a DAS-like (distributed annotation server) model.
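To make the distinction concrete, here is a minimal sketch in Python. It is not a real RDF toolkit: triples are plain tuples, each source keeps its own set of triples, and the source names ("ncbi", "third_party"), the ex: predicates, and the third-party comment are all hypothetical. It only illustrates that merging accepts every assertion, while dereferencing returns just the authority's part of the graph.

```python
# Illustrative sketch only: triples are (subject, predicate, object) tuples,
# grouped by the source that asserted them. Source names and the ex:
# predicates are hypothetical placeholders, not an agreed vocabulary.
from typing import Dict, Set, Tuple

Triple = Tuple[str, str, str]

GENE = ("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?"
        "db=gene&cmd=Retrieve&list_uids=2932")

graphs: Dict[str, Set[Triple]] = {
    # What the authority curates and manages.
    "ncbi": {(GENE, "ex:symbol", "GSK3B")},
    # An outside annotation (an "opinion") about the same record.
    "third_party": {(GENE, "ex:comment", "hypothetical outside annotation")},
}


def merge(sources: Dict[str, Set[Triple]]) -> Set[Triple]:
    """RDF-style merge: a plain union, with no notion of who is 'more' correct."""
    merged: Set[Triple] = set()
    for triples in sources.values():
        merged |= triples
    return merged


def dereference(uri: str, sources: Dict[str, Set[Triple]],
                authority: str) -> Set[Triple]:
    """Return only the triples about `uri` that the authority itself asserted."""
    return {t for t in sources.get(authority, set()) if t[0] == uri}


everything = merge(graphs)                        # authority + outside triples
authoritative = dereference(GENE, graphs, "ncbi")  # authority triples only
```

An aggregation service in the DAS-like model would correspond to calling merge over whichever outside sources it chooses to trust, separately from the dereference step.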

I don't believe authority responsibility can ever be disregarded for data record URIs...

To Chimezie's point, I also do not think all URIs need to be dereferenceable, but certainly data record URIs should always be dereferenceable.

In addition, the practice of using rdfs:isDefinedBy to link data records to accepted concept bio/chem entities will work as long as rdfs:isDefinedBy does not get used in other ways that would obscure the special usage being proposed here. We would need to agree on the intended practice and meaning throughout the community so that predicates do what others would expect them to do: "data records of entities like 'genes' always refer to unique concepts of bio/chem entities, even if polymorphisms exist".
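As a rough illustration of what that agreed practice could be checked against, here is a small Python sketch that flags data records linked to more than one concept entity. The triples-as-tuples representation and the second concept URI are assumptions made for the example; only the Entrez record URI and the hu_gsk3b PURL come from the proposal quoted below.

```python
# Illustrative sketch only: checks the "one data record, one concept" convention.
# Triples are plain (subject, predicate, object) tuples, not a real RDF store.
from collections import defaultdict

IS_DEFINED_BY = "rdfs:isDefinedBy"

RECORD = ("http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?"
          "db=gene&cmd=Retrieve&list_uids=2932")
CONCEPT = "http://purl.org/hcls/bioentity/hu_gsk3b"
# Hypothetical conflicting concept, invented for this example.
OTHER_CONCEPT = "http://example.org/bioentity/some_other_concept"


def concepts_per_record(triples):
    """Map each data record URI to the set of concepts it is linked to."""
    mapping = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == IS_DEFINED_BY:
            mapping[subject].add(obj)
    return mapping


def ambiguous_records(triples):
    """Return records that point at more than one concept entity."""
    mapping = concepts_per_record(triples)
    return sorted(rec for rec, concepts in mapping.items() if len(concepts) > 1)


consistent = [(RECORD, IS_DEFINED_BY, CONCEPT)]
conflicting = consistent + [(RECORD, IS_DEFINED_BY, OTHER_CONCEPT)]
```

Calling ambiguous_records(consistent) gives an empty list, while ambiguous_records(conflicting) reports the Entrez record, which is the kind of obscured usage the convention is meant to rule out.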


Eric

--- Xiaoshu Wang <[EMAIL PROTECTED]> wrote:

>
> > 1) Dereferencing: The dereferencing of a URI to a data record
> > results in the return of all the "authority managed"
> > information about it (locally curated data) in the form of an
> > RDF graph. Outside annotations would not be included unless
> > the authority provided an open annotative service. This is
> > what you get back when you query sources such as NCBI or EBI.
>
> I am not sure what "authority" means here.  RDF itself is monotonic and
> open.  Hence, anyone can say anything about anything.  In the eyes of RDF,
> there is only the problem of model consistency, and an RDF engine cannot
> consider one assertion "more" correct than others.
>
> > 2) Versioning: A few useful pieces of metadata for changeable
> > (mutable) URI-referenced RDF graphs (dereferenced) are which
> > version is current, when it was assigned or created (date and
> > time, UTC), and a reference to the sorted list of all earlier
> > versions. This would allow precise rolling back to any
> > version for performing a re-analysis of info from an earlier time.
>
> I think Dublin Core's relation element and associated element refinements
> like dc:replaces and dc:isReplacedBy, etc., would handle this adequately.
>
> > 3) Signifiers: Life science data records of bio or chem
> > entities (genes, SNPs, proteins, chemicals, agents, diseases,
> > pathways, anatomical parts) should always reference a
> > community agreed upon conceptualized bio/chem-entity, i.e.,
> > what scientists commonly and collectively have in mind
> > when hearing "human GSK3 beta". These
> > could have ontologies layered on them when they become
> > available. These entities represent the 'signifiers or signs'
> > for the 'signified or real-world objects' such as "Hu GSK3b"
> > or "Mus MAP12"
> > (for the curious, see http://en.wikipedia.org/wiki/Sign_(semiotics),
> > btw the full RDF graph around an entity would be equivalent
> > to Peirce's 'interpretant'). They would exist as non-data
> > objects, more like scientific placeholders, but can use
> > rdfs:seeAlso to point to real data records of them. Data
> > records by themselves WOULD NOT be of this special
> > meta-class. If this sounds fuzzy to you, consider what it
> > took to align most of the gene synonym names to one agreed
> > symbol; sociologically this is no different.
>
> I couldn't agree more.  We should not mix up the data/description about a
> resource with the resource itself.  This is the reason why I have strongly
> opposed the idea of using wiki URIs to represent biological entities.
> Information and non-information resources are disjoint.  Mixing them up will
> break the foundation of the web and of course the logic of an RDF engine.
>
> > 4)  Covering Mapping: Propose an initial set of properties to
> > support the above model. As a starter, define an equivalent
> > of rdfs:isDefinedBy for life science that would specifically
> > map an instance graph of the data record to the singular
> > conceptualized bio/chem-entity, using something on the order
> > of  hcls:isDefinedAs :
> >
> > <http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=gene&cmd=Retrieve&list_uids=2932>
> > <hcls:isDefinedAs>
> > <http://purl.org/hcls/bioentity/hu_gsk3b>
> >
> > In line with what Chimezie proposed, rdfs:seeAlso could be
> > used to declare the inverse relation for a select set of data
> > records; not sure if any new relation is needed here.
>
> I think such a vocabulary is needed.  But rdfs:seeAlso etc. is declared
> to be an AnnotationProperty in OWL, so it cannot be extended anymore.  Some
> simple property like hcls:nchientry will just do in my opinion.  As a
> start, I think this kind of property should be very coarse grained, because
> the more general, the more sharable.
>
> Xiaoshu
>
>
>

Eric Neumann, PhD
co-chair, W3C Healthcare and Life Sciences,
and Senior Director Product Strategy
Teranode Corporation
83 South King Street, Suite 800
Seattle, WA 98104
+1 (781)856-9132

www.teranode.com
