RE: [BioRDF] global uniqueness requirement of LSIDs and RDF

Miller, Michael D (Rosetta) Mon, 14 Aug 2006 08:29:07 -0700

Hi Sean,

Thanks for your clarification; it is exactly what John's e-mail brought to my mind, but much better explained.

A similar use case might be a gene expression experiment that is sent into ArrayExpress. At some point someone who downloads the experiment discovers that one of the hybridizations is clustering entirely with a different set of replicates than the one it was assigned to. The original investigator takes a look and discovers that the lab technician had grabbed the sample aliquot from the wrong shelf and recorded the original sample's LSID.

So to update ArrayExpress, the Hybridization is still the same but it needs a new version and needs to be associated with the proper sample LSID. The experiment itself needs to get a new version and have the Hybridization moved to the proper set of replicates, and the data needs new versions, with the DataCubes updated with the new, recalculated replicate DataCubes.
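
A minimal sketch of that update in Python, to make the bookkeeping concrete. It is not ArrayExpress's or MAGE's actual API: the `Hybridization` and `Experiment` classes, the `correct_sample` helper, the `example.org` LSIDs and the assumption that revisions are plain integers are all hypothetical. Recalculating the DataCubes is left out.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


def bump(lsid: str) -> str:
    """Return the LSID with its trailing revision incremented.

    Assumes the LSID ends in an integer revision (an illustrative convention).
    """
    head, _, revision = lsid.rpartition(":")
    return f"{head}:{int(revision) + 1}"


@dataclass(frozen=True)
class Hybridization:
    lsid: str
    sample_lsid: str


@dataclass(frozen=True)
class Experiment:
    lsid: str
    # replicate set name -> LSIDs of the hybridizations in that set
    replicates: Dict[str, Tuple[str, ...]]


def correct_sample(exp, hyb, proper_sample_lsid, proper_set):
    """Return new revisions of the experiment and the hybridization.

    The old objects are left untouched, so the earlier revisions stay valid.
    """
    # Same hybridization, new version, now pointing at the proper sample.
    new_hyb = replace(hyb, lsid=bump(hyb.lsid), sample_lsid=proper_sample_lsid)

    # Take the old revision out of whichever replicate set it was filed under...
    groups = {
        name: tuple(h for h in members if h != hyb.lsid)
        for name, members in exp.replicates.items()
    }
    # ...and file the new revision under the proper set.
    groups[proper_set] = groups.get(proper_set, ()) + (new_hyb.lsid,)

    new_exp = Experiment(lsid=bump(exp.lsid), replicates=groups)
    return new_exp, new_hyb
```

Nothing is edited in place here: the earlier revisions remain resolvable under their old LSIDs, and only the new revisions carry the corrected sample association.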

cheers,

Michael

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Sean Martin
Sent: Monday, August 14, 2006 5:47 AM
To: public-semweb-lifesci@w3.org
Subject: Re: [BioRDF] global uniqueness requirement of LSIDs and RDF

Hello John,
>
> > How I've come to think about this is that some properties are intrinsic
> > to the type of record, for a person, perhaps their SSN if American, and
> > some are not, such as a person's age. But even this becomes context
> > dependent if one wishes to track the state of the person once a year.
>
> If I understand the uniqueness requirement of LSIDs, then a new LSID for
> "Michael Miller" must be created every year when the age property changes.

This is not quite how it is meant to work. You would only create a new LSID for Michael Miller each year if he was a data file and somehow his bytes changed :-) In the case you describe, Michael is more of an idea (sorry Michael!) with many facets: some can be concretely represented as bytes (the bytes being what is named), and some are conceptual, can be described in metadata (which further describes the concept named), and have no associated unique data bytes to name.

You could use an LSID (or any kind of URI) without any directly associated data bytes to represent Michael as a central concept. Then a metadata graph associated with this conceptual URI might tell you his date of birth, it might also contain links to LSIDs and other URIs that contain separate concrete representations of Michael - for example x-ray images, MRIs, his DNA sequence or results for other tests that have a binary representation and where it makes sense to uniquely name each as a discrete data item. These different representations may even be made available in different contexts/formats (e.g. images of differing size, resolution or binary format like png and gif) and each with its own LSID. Similarly if for some reason one of these images is changed later (say a better algorithm for sharpening), that new image instance could be made available as an LSID revision by incrementing the version area of the LSID name.
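
A small Python sketch of that last step, incrementing the revision part of an LSID name. It relies on the LSID URN layout `urn:lsid:authority:namespace:object[:revision]`. The `parse_lsid` and `next_revision` helpers, the `example.org` identifiers, and the choice of integer revisions (with a missing revision becoming `1`) are assumptions made for illustration only; the revision field is an optional string, so other schemes are possible.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None  # the revision part is optional

    def __str__(self) -> str:
        parts = ["urn", "lsid", self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            parts.append(self.revision)
        return ":".join(parts)


def parse_lsid(text: str) -> LSID:
    """Split 'urn:lsid:authority:namespace:object[:revision]' into its parts."""
    parts = text.split(":")
    if (
        len(parts) not in (5, 6)
        or parts[0].lower() != "urn"
        or parts[1].lower() != "lsid"
    ):
        raise ValueError(f"not an LSID: {text!r}")
    return LSID(*parts[2:])


def next_revision(lsid: LSID) -> LSID:
    """Name a changed set of bytes by bumping the revision.

    Assumes integer revisions; an LSID without a revision becomes revision 1.
    """
    current = int(lsid.revision) if lsid.revision is not None else 0
    return lsid._replace(revision=str(current + 1))


# Hypothetical identifier for one of the images, and its sharpened successor.
original = parse_lsid("urn:lsid:example.org:images:mri-001:1")
sharpened = next_revision(original)
```

The conceptual URI for the person would stay the same throughout; only the metadata graph's link to the image would gain the new revision.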

Kindest regards, Sean

--
Sean Martin
IBM Corp.
