On 12 Aug 2010, at 12:16, Thomas Down wrote:
On Thu, Aug 12, 2010 at 10:30 AM, Andy Jenkinson <[email protected]> wrote:
Hi Thomas,
Uh-oh, URIs... :)
For coordinate systems, I think the definitions of the component pieces
are fairly well described. It is a pity that the species name is not
given its own parameter though. The sources documentation then says:
"The uri (required) attribute is a globally unique identifier for the
coordinate system. It should be a fully resolvable URL providing more
information about the coordinate system." This could be misleading as
although the URIs _are_ resolvable, the content is not particularly
machine friendly.
I am not willing to change the syntax of the coordinate system URIs out
in the wild, but if you need the content returned to be machine readable
we could replace the HTML content with an XML+XSLT combination. That is,
"http://www.dasregistry.org/dasregistry/coordsys/CS_DS6" would look more
like one of the entries in
"http://www.dasregistry.org/das/coordinatesystem" to a machine, and the
same as it currently does to a human. From a practical perspective
though, if a client parses the XML elements from the registry's
/das/coordinatesystem output, it can identify all the coordinate systems
by both URI and text description. Changing the output wouldn't
materially change what a client needs to do given either a URI or a
comma-separated string: it is always going to need to run an HTTP GET
and do some parsing of coordinatesystem XML. But it is certainly true
that having the URI resolve to the XML is a more elegant and simpler to
explain system, and in any case the spec makes no mention of the fact
that a client can even obtain the XML for all the coordinate systems
together.
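To make the lookup described above concrete, here is a minimal Python
sketch of a client indexing coordinate systems by both URI and text
description. It is an illustration only: the sample XML is a placeholder,
the element and attribute names (DASCOORDINATESYSTEM, COORDINATES, uri,
authority, source) are assumptions about the /das/coordinatesystem output,
the descriptions are invented, and the helper names are hypothetical. A
real client would fetch the document over HTTP rather than use an inline
string.

```python
import xml.etree.ElementTree as ET

# Placeholder document. Element/attribute names are assumed and the
# descriptions are invented; the real registry output may differ.
SAMPLE_XML = """\
<DASCOORDINATESYSTEM>
  <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS6"
               authority="ExampleAuth" version="1"
               source="Chromosome">ExampleAuth_1,Chromosome,Example species</COORDINATES>
  <COORDINATES uri="http://www.dasregistry.org/dasregistry/coordsys/CS_DS40"
               authority="OtherAuth"
               source="Scaffold">OtherAuth,Scaffold,Another species</COORDINATES>
</DASCOORDINATESYSTEM>
"""


def index_coordinate_systems(xml_text):
    """Map both the URI and the text description to the same record."""
    index = {}
    for element in ET.fromstring(xml_text).iter("COORDINATES"):
        uri = element.get("uri")
        if uri is None:
            continue  # skip entries without the required uri attribute
        record = dict(element.attrib)
        record["description"] = (element.text or "").strip()
        index[uri] = record
        index[record["description"]] = record
    return index


def resolve(index, key):
    """Look up a coordinate system by URI or comma-separated string."""
    return index.get(key.strip())


index = index_coordinate_systems(SAMPLE_XML)
by_uri = resolve(index, "http://www.dasregistry.org/dasregistry/coordsys/CS_DS6")
by_text = resolve(index, "ExampleAuth_1,Chromosome,Example species")
print(by_uri is by_text)
```

Either form of identifier ends up at the same record, which is the point
made above: the client does one fetch and one parse regardless of which
form it was handed.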
CS URIs pointing to XML definitely seems more symmetrical.
Mentioning /das/coordinatesystem in the spec would help, too -- that's
currently rather opaque.
A few other questions (don't really have "preferred" answers to any of
these, just trying to test the boundaries):
1. What do you expect a server to do if it sees a CS URI that it hasn't
   seen before?
2. If my organization has sequenced a new genome and is running some
   internal DAS stuff on that while we finish annotating, etc., what URI
   do we use for the coordinate system?
If it were me I'd just add it to the known coordinate systems in the
registry. You can ask us to do it, or run a script that will add it if
it doesn't exist already (I can send you the script if you want).
Having a genome that is not widely available in the registry wouldn't
be harmful in any way, especially if it is going to be released in the
future. The only issue, I guess, would be that an institute putting
itself down as the authority advertises that it is working on this
genome, which may be an issue.
3. If my organization is running an internal mirror of the central DAS
   registry, would I mirror the CS URIs
   ("http://das.bigpharma.com/dasregistry/coordsys/CS_DS40/")? Still
   point to dasregistry.org? Something else?
If advertising the authority is an issue, as above, then mirror all the
sources plus the private ones, and change the configuration (or the
hard-coded registry location) of your servers and clients to point at
the internal registry (which can just be a sources document hosted
locally).
???
Thomas.
_______________________________________________
DAS mailing list
[email protected]
http://lists.open-bio.org/mailman/listinfo/das
Jonathan Warren
Senior Developer and DAS coordinator
blog: http://biodasman.wordpress.com/
[email protected]
Ext: 2314
Telephone: 01223 492314