On 27/01/12 13:44, Martynas Jusevicius wrote:
Hey list,

I am looking for an implementation doing what looks like a simple task
(but probably isn't): given a URI, try to extract RDF Model from it in
all possible ways.
It should use content negotiation: ask for RDF/XML as first priority,
Turtle/N-Triples as the second, and try GRDDL on HTML as the last
option.

I can see Jena's RDFReader, JenaReader, and GRDDLReader that all seem
to do a part of what is needed, but I wonder if there already is some
code that combines it all?

Martynas
http://graphity.org

Ah. This has been talked about several times, and fairly recently I went as far as looking for old notes on it for a JIRA.

What we need (IMO) is a single reader that opens streams then decides which parser to dispatch to.

FileManager+typed streams.

  Add a locator to the FileManager to do conneg.
  Streams are typed by whatever MIME info is available.
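
As a rough illustration of the conneg part, a locator could send an Accept header that encodes the priorities from the original question (RDF/XML first, then Turtle/N-Triples, HTML last). This is only a sketch: the class and method names are hypothetical, not Jena APIs, and the q-values are arbitrary assumptions chosen just to express the ordering.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of Jena: builds an HTTP Accept header
// from an ordered map of MIME type -> quality value.
final class ConnegAccept {

    // Join the entries as "type;q=value"; a q of 1.0 is the default, so it is omitted.
    static String acceptHeader(Map<String, Double> preferences) {
        StringJoiner joiner = new StringJoiner(", ");
        for (Map.Entry<String, Double> e : preferences.entrySet()) {
            double q = e.getValue();
            joiner.add(q >= 1.0 ? e.getKey() : e.getKey() + ";q=" + q);
        }
        return joiner.toString();
    }

    // The ordering asked for in the question; the q-values are assumptions.
    static String rdfFirst() {
        Map<String, Double> prefs = new LinkedHashMap<>();
        prefs.put("application/rdf+xml", 1.0); // first choice
        prefs.put("text/turtle", 0.9);         // second choice
        prefs.put("text/plain", 0.8);          // N-Triples is often served as text/plain
        prefs.put("text/html", 0.5);           // last resort, for GRDDL/RDFa
        return acceptHeader(prefs);
    }
}
```

The Content-Type of the response would then be the MIME info that types the stream.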

Then the decision on which type to believe is based on:
1/ MIME type
2/ file extension
3/ user hint

Probably in the order 3-1-2, except for text/plain, where 2 overrides 1 or we route it to Turtle regardless.

Given that, look in a registry and call the real parser.
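
The decision described above could be sketched roughly as follows. Everything here is hypothetical: `SyntaxChooser`, its two maps and the syntax names are made up for illustration and are not Jena or RIOT APIs, and the registry contents are a small assumed sample. For text/plain the sketch lets the file extension win and otherwise falls back to Turtle, which is one reading of the exception above.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch of the dispatch decision; not a Jena class.
final class SyntaxChooser {

    // Assumed sample registry: MIME type -> syntax name.
    static final Map<String, String> BY_MIME = Map.of(
        "application/rdf+xml", "RDF/XML",
        "text/turtle", "TURTLE",
        "application/n-triples", "N-TRIPLES");

    // Assumed sample registry: file extension -> syntax name.
    static final Map<String, String> BY_EXTENSION = Map.of(
        "rdf", "RDF/XML",
        "ttl", "TURTLE",
        "nt", "N-TRIPLES");

    // Order 3-1-2: user hint, then MIME type, then file extension.
    // Returns null when nothing matches.
    static String choose(String userHint, String contentType, String uri) {
        if (userHint != null) {
            return userHint;                          // 3: the user hint always wins
        }
        String mime = baseMimeType(contentType);
        String byExt = BY_EXTENSION.get(extension(uri));
        if ("text/plain".equals(mime)) {
            // Exception: the extension overrides text/plain, else assume Turtle.
            return byExt != null ? byExt : "TURTLE";
        }
        String byMime = BY_MIME.get(mime);            // 1: MIME type
        return byMime != null ? byMime : byExt;       // 2: file extension
    }

    // Drop parameters such as "; charset=utf-8"; "" when there is no type.
    static String baseMimeType(String contentType) {
        if (contentType == null) {
            return "";
        }
        return contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
    }

    // Text after the last '.' in the last path segment; "" when there is none.
    static String extension(String uri) {
        if (uri == null) {
            return "";
        }
        String path = uri.split("[?#]", 2)[0];
        int slash = path.lastIndexOf('/');
        int dot = path.lastIndexOf('.');
        return dot > slash ? path.substring(dot + 1).toLowerCase(Locale.ROOT) : "";
    }
}
```

The returned name would then be the key for looking up the real parser in the registry. No content sniffing is involved at any point.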

I'm not completely sure it will work for RDFa and GRDDL - maybe if the system is told to read one of those, the dispatching reader believes that over any conneg and just does it.

What I think we should avoid unless really, really necessary is sniffing the content.

See org.openjena.riot.web.HttpOp for some code that does HTTP GETs and dispatches to a handler. I don't think this is the way to go; it's not nice to pick the results out of the operation.

org.openjena.riot.WebContent has lots of constants.

        Andy
