Hilmar Lapp wrote:
Right. That was one of the problems that was faced when the I3C consortium started (namely multiple identifier systems with idiosyncratic translation rules to convert to a resolvable URL), and which it tries to address by unifying the identifier and resolution schemes.

Great, but from an outside point of view, didn't you just end up adding yet another idiosyncratic system?


My point was that domain-specific identifier and resolution schemes are a fact of life, and there is some evidence that being domain-specific doesn't diminish their ability to succeed and become de-facto standards.

I guess that could happen... Do you have some examples of domain-specific standards that became de-facto standards, supported by generic tools etc?


As for being limited to a domain or not, would the LSID mechanism be more appealing if it read urn:guid:foo.org:Foo:12345? There's nothing in the LSID spec that makes it life-science-specific, or that would make it meaningless outside the life sciences.

You're right, from a technical point of view, it's not domain-specific. But if no one else is using it, doesn't that make it de-facto domain-specific?


Do you mean you would prefer if each journal set up URIs based on its self-chosen domain-name and we reference articles through that instead of DOIs? Or did you want to say something else?

If instead of doi:10.1038/nrg2158 an official URI looked something like
http://dx.doi.org/10.1038/nrg2158, would this make the system less popular?

In fact, I suspect that the lack of such a transformation mechanism turned away many people from the LSID system (that, and the ugly syntax :-)
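Such a transformation mechanism is straightforward to provide; here's a minimal sketch. The dx.doi.org proxy is the real DOI resolver, but the LSID rewrite rule below is purely illustrative (there is no single canonical HTTP mapping for LSIDs, which is part of the complaint):

```python
def to_http_url(identifier):
    """Rewrite a doi: or urn:lsid: identifier as a resolvable HTTP URL."""
    if identifier.startswith("doi:"):
        # The DOI proxy at dx.doi.org resolves any registered DOI.
        return "http://dx.doi.org/" + identifier[len("doi:"):]
    if identifier.startswith("urn:lsid:"):
        # urn:lsid:authority:namespace:object[:revision]
        # Hypothetical mapping: route through the authority's own host.
        authority, _, rest = identifier[len("urn:lsid:"):].partition(":")
        return "http://%s/lsid/%s" % (authority, rest.replace(":", "/"))
    raise ValueError("unrecognized identifier scheme: %s" % identifier)

print(to_http_url("doi:10.1038/nrg2158"))
# http://dx.doi.org/10.1038/nrg2158
```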

I'd also be fine with using e.g. http://www.nature.com/nrg/journal/nrg2158; if Nature went out of business, the DOI wouldn't be any more useful, would it?

Note: While most publishers seem to have adopted the DOI system, I don't see many people using it (e.g. in queries) on our site. But if someone who works for a publisher is lurking, they might have better usage stats!


I'm not sure you are trying to advocate future standards based on the abilities or lack thereof of the current generation of semantic web tools?

Are we talking about future standards, or current best practices?

As things are, if I am asked for advice, I can't tell anyone to use approach x instead of y when y is simpler and more widely supported, merely because tool providers need to be encouraged to support x.


Just as they will have to support DOIs to be practical, I don't see why they would shy away from supporting LSIDs, if they are widely used.

Making them widely used is up to the data providers, though, not the tool makers.

Chicken-and-egg alert! :-)


Well, yeah, but the big challenge is still a big challenge and a real one, and advocating stable HTTP URIs as a solution surely will not contribute to solving the big challenge?

Forces that work against stable, resolvable HTTP URIs:

1. People reorganize their web servers, change technologies etc.
2. Data is removed or replaced.
3. Data providers disappear.

The first issue is something that might be improved with W3C guidelines -- and third-party PURLs for those who refuse to listen :-)

The second and third issues are trickier -- and I'm not sure how non-HTTP URIs help here? The problem is that even if you want to version your data and allow retrieval of obsolete data, the infrastructure for this isn't trivial. For example, we've invested some effort to support this for some of our data [e.g. try http://beta.uniprot.org/uniprot/P05067?version=42], but that's just part of our data, and we don't support all formats, either.
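To illustrate the versioning scheme used in that example: the accession stays stable and a query parameter selects a historical revision. A sketch of how such URLs are built (the base URL matches the UniProt beta example above; the helper name is my own):

```python
from urllib.parse import urlencode

def versioned_url(base, accession, version=None):
    """Build a URL for a record; an optional version selects an
    obsolete revision, in the style of beta.uniprot.org."""
    url = "%s/%s" % (base, accession)
    if version is not None:
        url += "?" + urlencode({"version": version})
    return url

print(versioned_url("http://beta.uniprot.org/uniprot", "P05067", 42))
# http://beta.uniprot.org/uniprot/P05067?version=42
```

The point is that the stable identifier and the versioned retrieval mechanism are orthogonal: nothing here depends on the scheme being HTTP or not.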

The best solution to disappearing data I can see is to have some Google-scale, Internet Archive-like projects that go out and collect all the data.


Right. Does this advocate for or against an opaque identifier system? BTW there are standards to deal with that, such as OpenURL (however imperfect that may be).

I don't see any strong reason to advocate either approach. Opaque identifiers such as http://purl.uniprot.org/uniprot/Q15848 have the advantage that they don't need to be replaced as often as identifiers such as http://en.wikipedia.org/wiki/Adiponectin, but that may not be a problem, and if you're doing a dictionary, such identifiers can make sense, too.


And what if the internet archive chose not to archive that HTTP URI?

Then you're out of luck, but I don't see how any other non-HTTP scheme would have even given us a chance to recover the "data that is no more"?


Don't know how this is best handled in the context of the Semantic Web...

Would you mind elaborating?

I would, if I had the perfect solution :-) It's probably a good idea to keep track of the source (URL!) and the time you obtained any statements, in the hope that in the future you'll be able to retrieve from some archive the exact data you were referencing at the time.
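That bookkeeping could look something like the following: each statement carries its source URL and a retrieval timestamp, so an archived copy of the exact source can be looked up later. All the class and field names here are illustrative, not any existing API:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ProvenancedStatement:
    """A statement plus where and when it was obtained."""
    subject: str
    predicate: str
    obj: str
    source_url: str    # where the statement came from
    retrieved_at: str  # ISO 8601 timestamp of retrieval

def record(subject, predicate, obj, source_url):
    # Stamp the statement with the current UTC time on retrieval.
    return ProvenancedStatement(
        subject, predicate, obj, source_url,
        datetime.now(timezone.utc).isoformat())

stmt = record(
    "http://purl.uniprot.org/uniprot/P05067",
    "rdfs:label",
    "Amyloid-beta precursor protein",
    "http://beta.uniprot.org/uniprot/P05067?version=42")
```

With the source URL and timestamp stored, a future archive lookup at least has a chance of recovering what you actually saw.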
