Re: 303 +1, WSDL -1

Balaji S. Srinivasan Mon, 16 Jul 2007 07:05:36 -0700

Hi,

> WSDL is a widely accepted W3C spec that is becoming increasingly accepted worldwide (and is, generally, automatically generated based on your interface, so requires little or no manual construction), and which solves a problem that we *know without any doubt* URLs cannot solve.

I may be mistaken, but isn't WSDL just an XML format? I don't see how it solves a problem that URLs "cannot solve"... wouldn't the location of "foo.wsdl" be best specified as a URL?

> in fact, they [WSDL] are currently MORE POPULAR than RDF itself, according to Google Trends

But the appropriate comparison is to URLs, not RDF... and the advantage of a URL is that there's tons of widely deployed, lightweight technology for requesting data from a given URL (e.g. w/ a browser as well as Perl/Python/etc. libraries) and for setting up web servers (e.g. Apache).

I don't understand why it should be necessary to develop a parallel set of technologies (e.g. the Firefox LSID plugin, or HTTP proxies) for resolving LSIDs, particularly when most (all?) of these tools seem to be built on top of tools (such as Firefox) which can already do URL resolution without downloading anything.

It would seem to me that the best way to get a reliable set of canonical URIs is to get NCBI involved. As soon as NCBI published a set of canonical URIs (e.g. for genes in Entrez Gene, compounds in PubChem, etc.) then everyone could use them with confidence. Reasons:

1) NCBI identifiers (even more so than EBI) are the de facto standard and can be mapped to anything.

2) NCBI is well funded, has serious bandwidth, etc.

3) NCBI can be trusted to stick around for a long time and to maintain/redirect old URLs, unlike a research lab or most companies.

4) In terms of registering new URIs, NCBI is already a standard location for data submissions (w/ NCBI GEO, GAIN, etc.).

5) People already use NCBI to get other kinds of data, so getting RDF data from them is not a serious paradigm shift.

Perhaps there's someone from NCBI on the list; if not, it would be worthwhile to contact them. If NCBI adopted the standard that beta.uniprot.org is using, with different suffixes for different formats (as per Eric Jain's email):

http://beta.uniprot.org/uniprot/P12345
http://beta.uniprot.org/uniprot/P12345.xml
http://beta.uniprot.org/uniprot/P12345.rdf
http://beta.uniprot.org/uniprot/P12345.fasta

...then I think people would adopt it immediately, especially if they kept it on their front page for a month (like they do with other new services). Regarding the way UniProt is doing things, I think it was a particularly good design decision to have the de-facto suffix be HTML, so that you can get a sense of what the URI represents by looking at it in a browser.
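
As a rough sketch of the suffix convention above (this is not UniProt's or NCBI's actual code; the function name `representation_urls` and the `FORMATS` tuple are made up for illustration), the format-specific URLs can be derived mechanically from the canonical URI:

```python
# Hypothetical helper: the names below are illustrative, not part of any real API.
FORMATS = ("xml", "rdf", "fasta")  # suffixes taken from the example URLs above


def representation_urls(base_uri):
    """Return a dict mapping each format to its URL under the suffix scheme."""
    # The bare URI (no suffix) is the browser-friendly HTML view.
    urls = {"html": base_uri}
    for fmt in FORMATS:
        urls[fmt] = f"{base_uri}.{fmt}"
    return urls


for fmt, url in representation_urls("http://beta.uniprot.org/uniprot/P12345").items():
    print(fmt, url)
```

The point of the design is that a client only needs to know the canonical URI and the suffix it wants; no separate lookup service is involved.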


Also, from Matthias' recent email:

> You should not try to pack ANY information about the 'resolution' of a Semantic Web resource into its URI, quite to the contrary. Make it as meaningless and generic as possible, in the best case it should just be a large random alphanumeric string, e.g. tag:uri:a938fjhsdcHSDu39. If all URIs look like this, nobody will be deterred from re-using a URI just because of how it looks.

I don't know if this is such a good idea -- when debugging, you want to have some information about what the URIs represent (e.g. the "http://beta.uniprot.org/uniprot/" prefix tells you that you're looking at a UniProt protein with the given ID number). If URIs are just alphanumeric strings, you need to constantly be doing lookups to remind yourself of what a particular object means.


--B

--
Balaji S. Srinivasan, Ph.D.
Stanford University
Lecturer, Depts. of Statistics and Computer Science
318 Campus Drive, Clark Center S251
(650) 380-0695
[EMAIL PROTECTED]
http://jinome.stanford.edu


On Jul 14, 2007, at 10:30 PM, Mark Wilkinson wrote:

Well... I apologize in advance, but I'm going to be *insultingly* blunt because I'm quite honestly losing interest in this seemingly pre-destined discussion...
"blinkers, are a piece of equipment used on a horse's face that restrict the horse's vision. They usually compose of leather or plastic cups that are places on either side of the eye, so that the horse can not see to his sides. Many racehorse trainers believe this keeps the horse focused on what is in front of him, encouraging him to pay attention to the race rather than other distractions, such as crowds" (http://en.wikipedia.org/wiki/Blinders)
WSDL is a widely accepted W3C spec that is becoming increasingly accepted worldwide (and is, generally, automatically generated based on your interface, so requires little or no manual construction), and which solves a problem that we *know without any doubt* URLs cannot solve. I really don't see an advantage in trying to ignore them, circumvent them, or otherwise relegate them to a secondary lookup, in the base spec for the Semantic Web, when we know that we are going to have to deal with them at some point (and in fact, they are currently MORE POPULAR than RDF itself, according to Google Trends: http://www.google.com/trends?q=WSDL%2C+RDF&ctab=0&geo=all&date=all&sort=0 )
I really don't see the point in trying to build the Semantic Web by specifically avoiding acknowledgement of one of the most popular trends on the Web, when we already know that the vast majority of information we need to access as bioinformaticians is available through web forms or web services!
I'm sorry for being rude and disrespectful - I'm honestly quite embarrassed to be saying these things so harshly - but I think this discussion has started to become a singularity around a pre-contrived end-point, rather than a discussion of what the Web (and the Semantic Web) really is/can be!
WSDL -1 if you wish, but that puts you in opposition to the majority of the world, where WSDL (thanks to Ajax) is finally starting to make its mark!
Again, I apologize for being disrespectful and rude... it really isn't personal and I feel truly awful about writing this so harshly! I'm just losing patience with a discussion that doesn't seem to be a discussion, but rather a shoe-horn into a pre-destined end point.
You are all free to crucify me the next time one of my grants comes to you for review ;-)
M
On Fri, 13 Jul 2007 20:19:41 -0700, Alan Ruttenberg <[EMAIL PROTECTED]> wrote:
On Jul 13, 2007, at 12:20 AM, Mark Wilkinson wrote:
What worries me about the 303 solution (other than that we are not using it for its primary purpose [1]) is that the redirection can only be to a *single* resource, specified in the Location header.
On Thu, 12 Jul 2007 03:57:34 -0700, Jonathan Rees <[EMAIL PROTECTED]> wrote:
If this is an important functionality then it can be provided in a
variety of ways - a mere matter of programming. LSID resolver happens
to be the only way that comes ready made. But the functionality
doesn't need to be tied to the use of LSIDs.
If there is an alternative solution that provides the same functionality, and that can be applied universally to all existing URIs (URLs), then I'm all for it! To be honest, this is my *primary* objection to moving to a URL solution vs an LSID solution... if you can solve that problem, then I am *almost* in the URL camp.
Here is an alternative:

Problem statement:
Enable third parties to register the fact that they have additional statements to provide about something that a URI denotes, in such a way as to make it easy for anyone to discover this fact. Do this in a way which requires minimal coordination (ideally none) between the minter of the original URI, the provider of the additional statements, and the consumer of all the statements.
Solution:
For a given URI http://a.b/c/d/e, construct a new URI http://purl.org/about/a.b/c/d/e
Configure the purl server so that http://purl.org/provide-about/a.b/c/d/e redirects to something akin to a structured wiki page or a REST service (let us assume for the moment that whoever currently provides the LSID WSDL that contains this information currently is the provider of this service).
This page may be edited (manually or programmatically) to include a description (suitable for a machine to understand) of how to access the resource and what sort of resource it is, and perhaps some additional useful information (what predicates does the resource provide). This information is rendered as RDF using a standard vocabulary and saved.
Configure the purl server so that http://purl.org/about/a.b/c/d/e retrieves the RDF that was constructed (or a 404 if there is none). Semantic web agents then interpret this RDF and go fetch what they want or need.
We all agree that 303s redirect to a human readable html document, that this document uses a REL link to an RDF document that says what the provider wishes to say and that the RDF also states that http://purl.org/about/a.b/c/d/e may have more information. (suitable shortcuts are provided to make bulk retrievals more efficient - we've already discussed such mechanisms)
This can be done now, with effort analogous to what is being done with LSIDs. Let me point out some obvious advantages: 1) No requirement to use web services (though web services *could* be described as ways of accessing further statements using this scheme) 2) Requires *less* manual intervention than is currently required to maintain the WSDL. 3) Re-uses purl, which is based on HTTP, which everyone knows how to use already 4) Makes clear that the description of these additional resources for statements are to be in RDF, and requires that one advertises what to expect if you go to the resource (will you get an RDF document, a SPARQL endpoint, a Web service set of methods?)
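
The URI construction step in the proposal above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `about_uri` is hypothetical, no such purl.org service is assumed to exist, and query strings and fragments are simply ignored because the proposal does not say how to handle them.

```python
from urllib.parse import urlsplit

PURL_BASE = "http://purl.org"  # server named in the proposal; nothing is fetched here


def about_uri(uri, prefix="about"):
    """Build the purl-style URI for `uri`, e.g. http://purl.org/about/<host><path>.

    Hypothetical helper. Only http URIs with a host are accepted; any query
    string or fragment is dropped (an assumption, not part of the proposal).
    """
    parts = urlsplit(uri)
    if parts.scheme != "http" or not parts.netloc:
        raise ValueError(f"expected an absolute http URI, got {uri!r}")
    return f"{PURL_BASE}/{prefix}/{parts.netloc}{parts.path}"


print(about_uri("http://a.b/c/d/e"))
print(about_uri("http://a.b/c/d/e", prefix="provide-about"))
```

Because the mapping is a pure string transformation, neither the minter of the original URI nor the consumer needs to coordinate with anyone to compute it.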
---
With a bit more effort expended on extending the purl server code we can get some more leverage - we enhance it so that retrieving http://purl.org/about/a.b/c/d/e actually merges the RDF result of retrieving each of
http://purl.org/about*/a.b/
http://purl.org/about*/a.b/c
http://purl.org/about*/a.b/c/d
http://purl.org/about/a.b/c/d/e
Where the about* top level domain indicates that the information about covers all URIs that start with the indicated path.
In this way different providers can note that they have additional statements about URIs located in varying amounts of namespace.
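
A minimal sketch of which URIs the enhanced server would need to consult for one request, following the list above. The function name `lookup_uris` is hypothetical, and the actual merging of the RDF results is left out; this only enumerates the about* prefixes and the exact about URI.

```python
from urllib.parse import urlsplit


def lookup_uris(uri, base="http://purl.org"):
    """List the about*/ prefix URIs and the exact about/ URI for `uri`.

    Hypothetical helper: it mirrors the enumeration in the text and does
    not fetch or merge anything.
    """
    parts = urlsplit(uri)
    segments = [s for s in parts.path.split("/") if s]
    # Statements covering everything under the host itself.
    lookups = [f"{base}/about*/{parts.netloc}/"]
    # Statements covering each proper path prefix (all but the last segment).
    for i in range(1, len(segments)):
        lookups.append(f"{base}/about*/{parts.netloc}/" + "/".join(segments[:i]))
    # Statements about exactly this URI.
    lookups.append(f"{base}/about/{parts.netloc}{parts.path}")
    return lookups


for u in lookup_uris("http://a.b/c/d/e"):
    print(u)
```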
With some coordination among us, we could even decide to dedicate a server to hosting the whole mess of this information (I don't expect that it needs too large a resource) so as to make the service more efficient in answering queries, and making it easy to provide, to whoever wishes, a snapshot that they can host themselves.
---

May I now count you among those *almost* in the URL camp? ;-)

-Alan
