On Aug 21, 2007, at 1:39 PM, Eric Jain wrote:
Hilmar Lapp wrote:
It seems to me that domain-specific resolution systems are rather
a fact and we deal with them all the time.
We try to deal with it, but it's a pain, even though the number of
different systems I need to deal with is limited compared to
someone who is developing applications that must work across the
entire life-sciences domain, or even outside of this domain as well
-- completely impractical!
Right. That was one of the problems that was faced when the I3C
consortium started (namely multiple identifier systems with
idiosyncratic translation rules to convert to a resolvable URL), and
which it tries to address by unifying the identifier and resolution
schemes.
My point was that domain-specific identifier and resolution schemes
are a matter of fact, and some evidence shows that the fact that they
are domain specific doesn't diminish their ability to succeed and
become de-facto standards.
As for being limited to a domain or not, would the LSID mechanism be
more appealing if it read urn:guid:foo.org:Foo:12345? There's nothing
in the LSID spec that makes it LS-specific, or that would make it
meaningless outside of the life sciences.
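To illustrate that point, here is a minimal sketch of splitting such an identifier into its parts. It assumes the colon-separated layout urn:lsid:authority:namespace:object with an optional trailing revision; the names Lsid and parse_lsid and the scheme parameter are made up for this example, and "guid" is only the hypothetical label from the question above, not a registered URN namespace.

```python
from typing import NamedTuple, Optional


class Lsid(NamedTuple):
    """Parts of an LSID-style URN (illustrative type, not from any library)."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(urn: str, scheme: str = "lsid") -> Lsid:
    """Split 'urn:<scheme>:<authority>:<namespace>:<object>[:<revision>]'.

    The scheme is a parameter only to show that nothing in the layout
    depends on the 'lsid' label itself.
    """
    parts = urn.split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {urn!r}")
    prefix, nid, authority, namespace, object_id = parts[:5]
    if prefix.lower() != "urn" or nid.lower() != scheme.lower():
        raise ValueError(f"not a urn:{scheme} identifier: {urn!r}")
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in: {urn!r}")
    revision = parts[5] if len(parts) == 6 else None
    return Lsid(authority, namespace, object_id, revision)
```

The same function handles urn:lsid:foo.org:Foo:12345 and, with scheme="guid", the renamed variant; only the label differs.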
For example, articles are referenced by DOI, entries in most
institutional repositories are referenced by Handles, and GenBank
sequences are referenced by a GI number. Any generic tool that
wants to deal with statements made about or referring to articles
(presumably almost all will want to) will need to know how to
dereference a DOI. Alternatively, for the time being we can prefix
the DOI with http://dx.doi.org/ and have a dereferenceable HTTP URI.
That's the single best feature of that system, in my opinion :-)
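The prefixing trick mentioned above can be sketched in a few lines. This is only an illustration: the function name doi_to_http_uri is invented here, the DOI in the usage note is a made-up placeholder, and the sanity check (a "10." prefix followed by a slash) is a loose assumption rather than a full DOI syntax validation.

```python
from urllib.parse import quote

# Resolver prefix as given in the message above.
DOI_RESOLVER = "http://dx.doi.org/"


def doi_to_http_uri(doi: str) -> str:
    """Turn a bare DOI (optionally written as 'doi:...') into an HTTP URI."""
    doi = doi.strip()
    if doi.lower().startswith("doi:"):
        doi = doi[4:]
    # Loose sanity check only; real DOIs have a more detailed syntax.
    if not doi.startswith("10.") or "/" not in doi:
        raise ValueError(f"does not look like a DOI: {doi!r}")
    # Percent-encode characters that are not safe in a URL path,
    # keeping the slash between prefix and suffix.
    return DOI_RESOLVER + quote(doi, safe="/")
```

For example, doi_to_http_uri("doi:10.1234/example.5678") gives a URI under http://dx.doi.org/ that a generic HTTP client can dereference without knowing anything else about DOIs.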
Do you mean you would prefer if each journal set up URIs based on its
self-chosen domain-name and we reference articles through that
instead of DOIs? Or did you want to say something else?
I'm not sure why we can't apply the same principle to LSIDs. The
life science field isn't necessarily a small one, and it seems
like a small price to pay for a tool creator to implement a single
resolution system to resolve any life science identifier. Is this
being naive?
From what I see, tool creators haven't shown much interest in
implementing domain-specific schemes, or even in making it easy
to plug in your own.
How many semantic web tools support LSID resolution, for example?
Are you trying to advocate future standards based on the abilities,
or lack thereof, of the current generation of semantic web tools?
Just as they will have to support DOIs to be practical, I don't see
why they would shy away from supporting LSIDs, if they are widely used.
Making them widely used is up to the data providers, though, not the
tool makers.
There seems to be a notion that all "life science databases" will
be there in perpetuity, but in reality there are plenty of
examples of databases that lost funding and went "out of
business", with PIR or BIND being some of the better known ones.
I'm not quite following why after all these years of discussion
the validity of URIs should again be subject to the vagaries of
funding, or the business acumen of commercial enterprises.
The going out of business problem is a big challenge, but in my
experience the majority of changes are nothing else but URLs
changing from something like /cgi-bin/fetch.cgi?P00001 to
/fetch.do?id=P00001 etc.
Well, yeah, but the big challenge is still a big challenge and a real
one, and advocating stable HTTP URIs as a solution surely will not
contribute to solving the big challenge?
There are also some issues with such URLs that have nothing to do
with stability, such as the fact that there are no separate URLs
for concepts and their representations, see previous discussions on
this list...
Right. Does this advocate for or against an opaque identifier system?
BTW there are standards to deal with that, such as OpenURL (however
imperfect that may be).
Domain names are quickly bought, used, and sold to someone else,
and this is not just theoretical. The proposed "ease" with which
HTTP URIs can be stably maintained first of all is clearly
contradicted by the empirical evidence that it's not happening
right now (why would a W3C recommendation change that? That we
want stable HTTP URIs can't be new to anyone), and second requires
continued ownership of the domain name. This seems like a trivial
issue but in reality it's not once funding is cut off.
For example, the journal Phyloinformatics was discontinued recently
and the domain name phyloinformatics.org is now for sale. If they
had used HTTP URIs using their domain name, the next owner of the
domain would probably choose not to maintain any of those, or
worse, reassign them to something else.
What am I missing?
The time dimension? :-)
If you reference some resource on phyloinformatics.org, you do well
to note down the time when you accessed the resource
In an RDF document?
[...] This will later allow you to retrieve the same page e.g. via
the Internet Archive (if you are lucky).
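For what it's worth, the suggestion of pairing a URI with its access time could look roughly like the sketch below. It assumes the Wayback Machine's URL layout of web.archive.org/web/<YYYYMMDDhhmmss>/<original URI>; the function name wayback_uri is invented for this example, and nothing here guarantees that a snapshot exists for the given URI or time.

```python
from datetime import datetime, timezone


def wayback_uri(uri: str, accessed: datetime) -> str:
    """Build an Internet Archive lookup URI for `uri` as of `accessed`.

    `accessed` should be timezone-aware; it is converted to UTC and
    formatted as a 14-digit YYYYMMDDhhmmss timestamp.
    """
    stamp = accessed.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"http://web.archive.org/web/{stamp}/{uri}"
```

The archive answers with the snapshot closest to the requested time, if it has one at all, so this only helps when the access time was actually recorded alongside the reference.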
And only if the semantic web tool supports falling back to the
Internet Archive when dereferencing an HTTP URI returns RDF that
doesn't quite make sense with respect to the statement through which
you got to it.
And what if the Internet Archive chose not to archive that HTTP URI?
Don't know how this is best handled in the context of the Semantic
Web...
Would you mind elaborating?
-hilmar
--
===========================================================
: Hilmar Lapp -:- Durham, NC -:- hlapp at duke dot edu :
===========================================================