> statements with the subject being
> http://beta.uniprot.org/entry/P12345 and another set makes
> statements about http://uniprot.org/entry/P12345. They are really
> talking about the same subject, but our semantic web agent won't
> know that. If we had used the PURL, then we wouldn't have a problem.
One solution is to have a "freshen_rdf" script that periodically goes
through an RDF file or triplestore, does an HTTP GET on each unique
URI, and updates the URI if it's been 301 redirected to a new
location. People are probably going to end up doing this periodically
anyway in order to validate each URI as pointing to a resolvable
resource before doing anything nontrivial with the triplestore.
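
To make the idea concrete, here is a rough Python sketch of what such a "freshen_rdf" pass might look like. Everything in it is my own assumption rather than an existing tool: the function names (permanent_target, freshen_triples) are made up, triples are modelled as plain 3-tuples of strings, any term starting with http:// or https:// is treated as a URI, and only a single 301 hop is followed.

```python
import urllib.error
import urllib.parse
import urllib.request


class _NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib from silently following redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx response then surfaces as an HTTPError


def permanent_target(uri, timeout=10):
    """Return the new location if uri answers with a 301, else None."""
    opener = urllib.request.build_opener(_NoRedirect)
    try:
        with opener.open(uri, timeout=timeout):
            return None  # resolved directly, nothing to update
    except urllib.error.HTTPError as err:
        location = err.headers.get("Location")
        if err.code == 301 and location:
            # Location may be relative, so resolve it against the old URI.
            return urllib.parse.urljoin(uri, location)
        return None
    except urllib.error.URLError:
        return None  # unreachable host: leave the URI untouched


def freshen_triples(triples, resolve=permanent_target):
    """Rewrite every URI in triples that has been permanently moved."""
    uris = {term for triple in triples for term in triple
            if term.startswith(("http://", "https://"))}
    moved = {}
    for uri in uris:  # one lookup per unique URI
        target = resolve(uri)
        if target and target != uri:
            moved[uri] = target
    return [tuple(moved.get(term, term) for term in triple)
            for triple in triples]
```

The resolve argument is there so the lookup can be swapped out, for example with a cached table or a stub when testing offline.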
Now, a naive GET on every URI might take some time, but it could be
made more efficient by first resolving the namespace declarations at
the beginning of the RDF file. For each namespace, such as
beta.uniprot.org, you do one GET to see whether any 301 redirects
have been set up. Perhaps the cleanest way to do this is for the EBI
people to publish metadata at
"http://beta.uniprot.org/uniprot/redirect.rdf" (or a similar URI)
which contains a set of triples with
redirect information. This might be as simple as a rewriting regex.
If it's just a regex, then you can apply it to quickly freshen all
the URIs from this namespace without having to do HTTP GETs on each
of them. Alternatively, that redirect.rdf file might contain a table
of "sameAs" mappings which, again, can be used to freshen the URIs in
your triplestore.
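
Both variants are easy to sketch in Python. Note that the format of the redirect metadata is purely hypothetical here: I am assuming the regex case boils down to a (pattern, replacement) pair and the "sameAs" case to an old-URI to new-URI table, both already parsed out of whatever redirect.rdf would actually contain. The function names are made up for illustration.

```python
import re


def freshen_with_regex(uris, pattern, replacement):
    """Apply one namespace-wide rewriting rule to every URI."""
    rule = re.compile(pattern)
    # count=1 so that only the matched prefix is rewritten.
    return [rule.sub(replacement, uri, count=1) for uri in uris]


def freshen_with_table(uris, same_as):
    """Look each URI up in an old -> new mapping; unknown URIs pass through."""
    return [same_as.get(uri, uri) for uri in uris]


uris = [
    "http://beta.uniprot.org/entry/P12345",
    "http://example.org/unrelated",
]

# Regex variant: one rule covers the whole namespace.
by_regex = freshen_with_regex(
    uris, r"^http://beta\.uniprot\.org/", "http://uniprot.org/")

# Table variant: explicit per-URI mappings.
by_table = freshen_with_table(
    uris,
    {"http://beta.uniprot.org/entry/P12345": "http://uniprot.org/entry/P12345"})
```

Either way, no HTTP request is needed per URI once the rule or table for the namespace has been fetched.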
--
Balaji S. Srinivasan, Ph.D.
Stanford University
Lecturer, Depts. of Statistics and Computer Science
318 Campus Drive, Clark Center S251
(650) 380-0695
[EMAIL PROTECTED]
http://jinome.stanford.edu
On Jul 15, 2007, at 9:34 PM, Alan Ruttenberg wrote:
On Jul 15, 2007, at 1:53 PM, Eric Jain wrote:
Alan Ruttenberg wrote:
The point of having the PURLs is to ensure that there is a
mechanism for handling three cases that LSIDs were intended to
address (but which can be addressed without the trouble of
introducing a separate resolving mechanism)
1) To be immune from the "actual URL of the representation"
changing. (e.g. beta.uniprot.org goes out of beta)
1) We'll do a 301 "permanent" redirection, promise.
Yes, but how will we handle the case where some set of people make
statements with the subject being
http://beta.uniprot.org/entry/P12345 and another set makes
statements about http://uniprot.org/entry/P12345. They are really
talking about the same subject, but our semantic web agent won't
know that. If we had used the PURL, then we wouldn't have a problem.
Comments to your other points in separate email.
-Alan