Re: Immunity of SW statements to changes in location. Was: Re: URL +1, LSID -1

Balaji S. Srinivasan Mon, 16 Jul 2007 07:05:23 -0700


> statements with the subject being http://beta.uniprot.org/entry/P12345
> and another set makes statements about http://uniprot.org/entry/P12345.
> They are really talking about the same subject, but our semantic web
> agent won't know that. If we had used the PURL, then we wouldn't have a
> problem.

One solution is to have a "freshen_rdf" script that periodically goes through an RDF file or triplestore, does an HTTP GET on each unique URI, and updates the URI if it's been 301 redirected to a new location. People are probably going to end up doing this periodically anyway in order to validate each URI as pointing to a resolvable resource before doing anything nontrivial with the triplestore.
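A rough sketch of what such a script might look like in Python, using only the standard library. Everything here is an assumption for illustration: the names permanent_target and freshen are made up, triples are plain (subject, predicate, object) string tuples rather than a real RDF store, only one redirect hop is followed, and any subject or object that looks like an HTTP URI is treated as one.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Refuse to follow redirects so the 3xx status stays visible."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx response is then raised as HTTPError


def permanent_target(uri, timeout=10):
    """Return the new location if uri answers with a 301, else None."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(uri, timeout=timeout):
            return None  # resolved directly, nothing to update
    except urllib.error.HTTPError as err:
        if err.code != 301:
            return None  # temporary redirects and errors are left alone
        location = err.headers.get("Location")
        return urljoin(uri, location) if location else None
    except urllib.error.URLError:
        return None  # unreachable host; keep the old URI


def freshen(triples, resolve=permanent_target):
    """Rewrite subject and object URIs that have been 301 redirected.

    resolve is called once per unique URI; results are cached.
    """
    cache = {}

    def fresh(term):
        if not term.startswith(("http://", "https://")):
            return term  # assumed to be a literal or blank node
        if term not in cache:
            cache[term] = resolve(term) or term
        return cache[term]

    return [(fresh(s), p, fresh(o)) for s, p, o in triples]
```

Passing a different resolve function (for example a dictionary's get method) makes it possible to try the rewriting step without touching the network.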

Now, a naive GET on every URI might take some time, but it could be made more efficient by first resolving the namespace declarations at the beginning of the RDF file. For each namespace, such as beta.uniprot.org, you do one GET to see whether any 301 redirects have been set up. Perhaps the cleanest way to do this is for the EBI people to have metadata at "http://beta.uniprot.org/uniprot/redirect.rdf" (or a similar URI) which contains a set of triples with redirect information. This might be as simple as a rewriting regex. If it's just a regex, then you can apply it to quickly freshen all the URIs from this namespace without having to do HTTP GETs on each of them. Alternatively, that redirect.rdf file might contain a table of "sameAs" mappings which, again, can be used to freshen the URIs in your triplestore.
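To make the two variants concrete, here is a minimal Python sketch. It assumes the redirect information has already been fetched and parsed into either a (pattern, replacement) pair or a dictionary of old-to-new URIs; the format of redirect.rdf itself is not defined anywhere, and the function names and the example rule are hypothetical.

```python
import re


def freshen_with_regex(uris, pattern, replacement):
    """Apply one namespace-level rewriting rule to every URI."""
    rule = re.compile(pattern)
    return [rule.sub(replacement, uri, count=1) for uri in uris]


def freshen_with_table(uris, same_as):
    """Look each URI up in a table of old -> new mappings."""
    return [same_as.get(uri, uri) for uri in uris]


old = [
    "http://beta.uniprot.org/entry/P12345",
    "http://example.org/unrelated",
]

# Variant 1: a single regex covers the whole namespace, no per-URI GETs.
by_regex = freshen_with_regex(
    old, r"^http://beta\.uniprot\.org/", "http://uniprot.org/"
)

# Variant 2: an explicit table of "sameAs"-style mappings.
by_table = freshen_with_table(
    old,
    {"http://beta.uniprot.org/entry/P12345": "http://uniprot.org/entry/P12345"},
)
```

The regex form costs one fetch per namespace and scales to any number of entries; the table form is more explicit but has to list every URI that moved.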


--
Balaji S. Srinivasan, Ph.D.
Stanford University
Lecturer, Depts. of Statistics and Computer Science
318 Campus Drive, Clark Center S251
(650) 380-0695
[EMAIL PROTECTED]
http://jinome.stanford.edu


On Jul 15, 2007, at 9:34 PM, Alan Ruttenberg wrote:

On Jul 15, 2007, at 1:53 PM, Eric Jain wrote:
Alan Ruttenberg wrote:
The point of having the PURLs is to ensure that there is a mechanism for handling three cases that LSIDs were intended to address (but which can be addressed without the trouble of introducing a separate resolving mechanism)
1) To be immune from the "actual URL of the representation" changing. (e.g. beta.uniprot.org goes out of beta)
1) We'll do a 301 "permanent" redirection, promise.
Yes, but how will we handle the case where some set of people make statements with the subject being http://beta.uniprot.org/entry/P12345 and another set makes statements about http://uniprot.org/entry/P12345. They are really talking about the same subject, but our semantic web agent won't know that. If we had used the PURL, then we wouldn't have a problem.
Comments to your other points in separate email.

-Alan
