On 27/01/12 20:29, Martynas Jusevicius wrote:
Ok, that clears some things up.
So is there a good class to extend, like JenaReader?
Or should I start from scratch and implement RDFReader?
I think most mainstream Linked Data publishing methods should be
supported, at least these:
http://linkeddatabook.com/editions/1.0/#htoc65
Maybe the implementation could be broken into several levels that
extend each other:
a) content negotiation only
b) heuristics (like using file extension) not involving content-sniffing
c) GRDDL
d) HTML-sniffing to find <link>s etc.
Martynas
That's a good breakdown. (a)+(b) is the area I've been wanting to
sort out for some time, if only to make adding parser types a bit more
straightforward: RDFa, microX, JSON-LD, native-to-triple generators,
... and definitely not a fixed set.
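To make (a)+(b) concrete, here is a minimal sketch of an open-ended
syntax registry: the content type from content negotiation is tried
first, then the file extension. LangRegistry, register and guess are
hypothetical names for illustration only, not existing Jena classes,
and syntaxes are identified by plain strings as an assumption.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical registry for illustration; not an existing Jena class.
final class LangRegistry {
    private final Map<String, String> byContentType = new ConcurrentHashMap<>();
    private final Map<String, String> byExtension = new ConcurrentHashMap<>();

    // New syntaxes can be added at any time, so the set is not fixed.
    void register(String lang, String contentType, String... extensions) {
        byContentType.put(contentType.toLowerCase(), lang);
        for (String ext : extensions) {
            byExtension.put(ext.toLowerCase(), lang);
        }
    }

    // (a) use the negotiated content type if known, else (b) the file extension.
    Optional<String> guess(String contentType, String uri) {
        if (contentType != null) {
            // Drop parameters such as "; charset=utf-8" before the lookup.
            String ct = contentType.split(";", 2)[0].trim().toLowerCase();
            String lang = byContentType.get(ct);
            if (lang != null) {
                return Optional.of(lang);
            }
        }
        // Simplification: query strings and fragments are not stripped here.
        int dot = uri.lastIndexOf('.');
        int slash = uri.lastIndexOf('/');
        if (dot > slash && dot < uri.length() - 1) {
            String ext = uri.substring(dot + 1).toLowerCase();
            return Optional.ofNullable(byExtension.get(ext));
        }
        return Optional.empty();
    }
}
```

Neither step looks at the content itself, so (c) and (d) would have
to be layered on separately.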
As a contribution to this discussion for (a)+(b) I've gathered together
various bits and pieces into an experimental design:
http://s.apache.org/wbZ
I don't have a sense of how to incorporate (c)+(d) and hope you have
ideas here.
The idea is that reading/parsing is orthogonal to a model. In Jena2,
there is the possibility of per-model choice of reader implementation.
I'm not sure if any use is made of this feature. RDFReaderFImpl is the
only implementation active in Jena. Are there any others?
There is a need to configure the parsing process per-read. That is
mainly for RDF/XML, as described at:
http://s.apache.org/BMB
which is all done with property settings.
We can separate reading from model. The FileManager already does this
with readModel(model, ...).
We can have a factory-style design with a function (static method):
read(Model m, String uri, String hintLang, Context context)
or rather:
read(Sink<Triple> destination, String uri, String hintLang, Context context)
where:
Sink<Triple> destination
Where to send triples generated by the parser. There is a standard
wrapper for a graph that turns Sink.send into Graph.add.
String uri
The place to read.
String hintLang
A hint to the system of the syntax.
Context context
A set of property-values to configure the parser.
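The relationship between the two signatures can be sketched as
follows. Every type here (TripleSink, SimpleTriple, SimpleGraph,
GraphSink, ReaderSketch) is a stand-in written for this example, not
the Jena class of similar name, and the context is assumed to be a
plain Map. The sink-based read emits one placeholder triple instead
of actually parsing anything.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Stand-in types for illustration only; not the Jena classes.
interface TripleSink<T> {
    void send(T item);
    void close();
}

final class SimpleTriple {
    final String s, p, o;
    SimpleTriple(String s, String p, String o) { this.s = s; this.p = p; this.o = o; }
}

// Stand-in for a graph: only the add operation matters here.
final class SimpleGraph {
    final List<SimpleTriple> triples = new ArrayList<>();
    void add(SimpleTriple t) { triples.add(t); }
}

// The wrapper that turns send(...) into add(...).
final class GraphSink implements TripleSink<SimpleTriple> {
    private final SimpleGraph graph;
    GraphSink(SimpleGraph graph) { this.graph = graph; }
    @Override public void send(SimpleTriple t) { graph.add(t); }
    @Override public void close() { /* nothing to release in this sketch */ }
}

final class ReaderSketch {
    // Sink-based form: triples can go to any destination.
    static void read(TripleSink<SimpleTriple> destination, String uri,
                     String hintLang, Map<String, Object> context) {
        // Placeholder: a real reader would open uri and run a parser.
        destination.send(new SimpleTriple(uri, "http://example.org/p", "o"));
    }

    // Graph-based form: a thin convenience over the sink form.
    static void read(SimpleGraph graph, String uri,
                     String hintLang, Map<String, Object> context) {
        read(new GraphSink(graph), uri, hintLang, context);
    }
}
```

With this shape, reading into a model or graph needs no parser-side
support: it is just one more destination.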
The process is:
open -> TypeStream
process -- choose parser, call parser
ts.close
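The three steps above might look like this in outline. TypedInput
stands in for the "TypeStream" of the outline, and StreamParser and
ReadProcess are likewise hypothetical names. Two assumptions are made
for brevity: the parser is chosen from the stream's media type first
and the hint second, and parsed output is delivered as strings rather
than triples.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;
import java.util.function.Function;

// Stand-in for the "TypeStream": bytes plus a media type (may be null).
final class TypedInput implements AutoCloseable {
    final InputStream in;
    final String mediaType;
    TypedInput(InputStream in, String mediaType) { this.in = in; this.mediaType = mediaType; }
    @Override public void close() throws IOException { in.close(); }
}

// Hypothetical parser interface; output is simplified to strings.
interface StreamParser {
    void parse(InputStream in, Consumer<String> destination) throws IOException;
}

final class ReadProcess {
    private final Function<String, TypedInput> opener;
    private final Map<String, StreamParser> byMediaType = new HashMap<>();
    private final Map<String, StreamParser> byLang = new HashMap<>();

    // The opener returns null when the label cannot be found.
    ReadProcess(Function<String, TypedInput> opener) { this.opener = opener; }

    void register(String lang, String mediaType, StreamParser parser) {
        byLang.put(lang, parser);
        byMediaType.put(mediaType, parser);
    }

    void read(Consumer<String> destination, String uri, String hintLang) {
        // open -> typed stream; try-with-resources performs ts.close
        try (TypedInput ts = opener.apply(uri)) {
            if (ts == null) {
                throw new IllegalArgumentException("Not found: " + uri);
            }
            // process: choose the parser, media type first, then the hint
            StreamParser parser = ts.mediaType != null ? byMediaType.get(ts.mediaType) : null;
            if (parser == null && hintLang != null) {
                parser = byLang.get(hintLang);
            }
            if (parser == null) {
                throw new IllegalArgumentException("No parser for: " + uri);
            }
            // process: call the parser
            parser.parse(ts.in, destination);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The stream is closed whether or not a parser is found, which keeps
the cleanup in one place.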
"open" can use a FileManager to look in a list of places for the "uri"
(actually a general label: it may be in the filing system, a Java
resource, a zip file, on the web, a servlet context, a cache, ...). A
nice feature of the file manager is that you can turn off locations.
For example, leave out the file system component and the local file
system isn't accessible, which is good for servers.
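The "list of places" idea can be illustrated with a small chain of
locators. This is a sketch written only with JDK classes;
SourceLocator and LocatorChain are hypothetical names and do not
reproduce the FileManager API. A location is turned off simply by
never adding it to the chain.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical locator: returns null when the label is not found.
interface SourceLocator {
    InputStream open(String label);
}

final class LocatorChain {
    private final List<SourceLocator> locators = new ArrayList<>();

    LocatorChain add(SourceLocator locator) {
        locators.add(locator);
        return this;
    }

    // Try each configured place in order; the first hit wins.
    InputStream open(String label) {
        for (SourceLocator locator : locators) {
            InputStream in = locator.open(label);
            if (in != null) {
                return in;
            }
        }
        return null;
    }

    // Local file system lookup.
    static SourceLocator files() {
        return label -> {
            try {
                Path p = Paths.get(label);
                return Files.isRegularFile(p) ? Files.newInputStream(p) : null;
            } catch (IOException | InvalidPathException e) {
                return null;
            }
        };
    }

    // Java resource lookup on the class path.
    static SourceLocator classpath() {
        return label -> ClassLoader.getSystemResourceAsStream(label);
    }
}
```

A server setup could then be `new LocatorChain().add(LocatorChain.classpath())`,
which cannot reach local files however the label is written.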
Andy