On 11/02/12 02:04, Stephen Allen wrote:
Is there a preferred way to represent a remote SPARQL endpoint in a manner
that is interchangeable with a local model/graph? I have code that
generates RDF using Jena objects, and I want to store it in either a
remote or a local graph store (configurable).
SPARQL? I.e. don't work in terms of API calls but in terms of SPARQL
queries and updates. A simple DatasetGraph wrapper can hide the
local/remote distinction.
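For example, something like this minimal sketch. The endpoint URL is a
placeholder, and the imports assume the org.apache.jena package names of
current releases (adjust for com.hp.hpl.jena on older ones):

    import org.apache.jena.query.*;

    public class SparqlEitherWay {
        public static void main(String[] args) {
            String queryString = "SELECT * { ?s ?p ?o } LIMIT 10";
            boolean remote = true;   // the configurable switch

            // The same SPARQL text runs against either target.
            Dataset local = DatasetFactory.create();
            try (QueryExecution qexec = remote
                    ? QueryExecutionFactory.sparqlService(
                          "http://host/dataset/query", queryString)
                    : QueryExecutionFactory.create(queryString, local)) {
                ResultSetFormatter.out(qexec.execSelect());
            }
        }
    }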
I see there is a GraphSPARQLService, but it doesn't appear to be finished
(also, I think I would want a DatasetGraph-level interface).
It's read-only and works by mapping every Graph.find call to a remote query.
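Roughly, each find(S, P, O) turns into one remote SELECT over the
corresponding triple pattern. A sketch of the idea only, not the actual
GraphSPARQLService code:

    import org.apache.jena.query.*;

    // find(ANY, rdf:type, ANY) expressed as the equivalent
    // triple-pattern query against a remote endpoint (placeholder URL).
    public class FindAsQuery {
        public static void main(String[] args) {
            String pattern = "SELECT ?s ?o WHERE { ?s a ?o }";
            try (QueryExecution qexec = QueryExecutionFactory.sparqlService(
                    "http://host/dataset/query", pattern)) {
                ResultSet rs = qexec.execSelect();
                while (rs.hasNext()) {
                    QuerySolution row = rs.next();
                    System.out.println(row.get("s") + " rdf:type " + row.get("o"));
                }
            }
        }
    }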
I also note the DatasetGraphAccessorHTTP in Fuseki, which I believe is an
implementation of the "SPARQL 1.1 Graph Store HTTP Protocol" [1]. This
looks close to what I want, but it forces a dependency on Fuseki, and it
has neither streaming support for inserts nor a natural API that accepts
Jena objects.
DatasetGraphAccessorHTTP is supposed to migrate out of Fuseki at some point.
What do you mean by "natural API that accepts Jena objects"? The graph
store protocol is put/get of whole graphs only.
If the graph is small, then get - do some operations - put might be
workable. That can be given transactional semantics (use ETags). It just
depends on the size of the graph.
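A sketch of that get/modify/put cycle. The import assumes the
org.apache.jena.web location the class migrated to in later releases;
note the accessor itself doesn't expose response headers, so the
conditional PUT (If-Match on the ETag) would have to be layered on with
plain HTTP:

    import org.apache.jena.graph.Graph;
    import org.apache.jena.graph.NodeFactory;
    import org.apache.jena.graph.Triple;
    import org.apache.jena.web.DatasetGraphAccessor;
    import org.apache.jena.web.DatasetGraphAccessorHTTP;

    public class GetModifyPut {
        public static void main(String[] args) {
            DatasetGraphAccessor accessor =
                new DatasetGraphAccessorHTTP("http://host/dataset/data");

            // GET the whole default graph, change it locally...
            Graph g = accessor.httpGet();
            g.add(Triple.create(
                NodeFactory.createURI("http://example/s"),
                NodeFactory.createURI("http://example/p"),
                NodeFactory.createLiteral("o")));

            // ...and PUT it back whole.
            accessor.httpPut(g);
        }
    }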
And if not, maybe a design where changes are accumulated locally and then
applied all at once at the end to the (potentially remote) graph (with a
local mirror?).
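For instance, a minimal version of that accumulate-then-apply shape
(names and endpoint URL invented for illustration):

    import java.io.StringWriter;
    import org.apache.jena.rdf.model.Model;
    import org.apache.jena.rdf.model.ModelFactory;
    import org.apache.jena.update.UpdateExecutionFactory;
    import org.apache.jena.update.UpdateFactory;

    public class BufferedChanges {
        public static void main(String[] args) {
            // Accumulate additions in a local model (the "local mirror").
            Model additions = ModelFactory.createDefaultModel();
            additions.createResource("http://example/s")
                     .addProperty(additions.createProperty("http://example/p"), "o");

            // At the end, apply the whole batch in one SPARQL Update.
            // Deletions would be collected the same way and sent as
            // DELETE DATA in the same request.
            StringWriter w = new StringWriter();
            additions.write(w, "N-TRIPLES");
            String update = "INSERT DATA { " + w + " }";
            UpdateExecutionFactory.createRemote(
                UpdateFactory.create(update),
                "http://host/dataset/update").execute();
        }
    }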
Basically I think I'm looking for a Connection object similar to JDBC or
Sesame's RepositoryConnection [2]. You could connect to either a local
DatasetGraph or a remote endpoint. For the remote endpoint case, I don't
think it's possible to accomplish fully with standard SPARQL because of two
issues: 1) no transaction support across multiple queries/updates, and 2)
local blank node identifiers.
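For illustration, the shape might be something like this (an entirely
hypothetical interface; none of these names exist in Jena):

    import org.apache.jena.query.ResultSet;
    import org.apache.jena.rdf.model.Model;

    // Hypothetical connection in the JDBC / RepositoryConnection
    // style, with one implementation backed by a local DatasetGraph
    // and one speaking the SPARQL protocols to a remote endpoint.
    public interface GraphStoreConnection extends AutoCloseable {
        void begin(boolean write);   // no-op under auto-commit
        void commit();
        void abort();

        ResultSet query(String sparql);   // SELECT
        void update(String sparql);       // SPARQL Update
        void add(Model data);             // streaming-friendly insert
    }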
Does anyone have any ideas? Should I start designing such a thing? The
blank node problem could be solved with skolemization [3], and we could
initially ignore the transaction issue (thus supporting auto-commit mode only).
Bnodes can be handled by enabling bNode label output, and using <_:...>
on input.
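A sketch of that round trip. The <_:...> form is the ARQ extension
mentioned above, not standard SPARQL; it assumes the server emits stable
bNode labels, and the URLs are placeholders:

    import org.apache.jena.query.*;
    import org.apache.jena.update.UpdateExecutionFactory;
    import org.apache.jena.update.UpdateFactory;

    public class BnodeRoundTrip {
        public static void main(String[] args) {
            String service = "http://host/dataset";

            // With bNode label output enabled on the server, SELECT
            // results carry stable labels such as _:b0.
            try (QueryExecution qexec = QueryExecutionFactory.sparqlService(
                    service + "/query", "SELECT ?s WHERE { ?s ?p ?o }")) {
                ResultSetFormatter.out(qexec.execSelect());
            }

            // On input, the ARQ extension <_:b0> denotes that same
            // blank node instead of minting a fresh one.
            UpdateExecutionFactory.createRemote(
                UpdateFactory.create(
                    "INSERT DATA { <_:b0> <http://example/p> 'x' }"),
                service + "/update").execute();
        }
    }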
But do think about whether bNodes are the right thing to use in the
first place.
To add transaction support, we would have to add an extension to the
SPARQL 1.1 Protocol [4]. An extra parameter indicating the type of
transaction (READ or WRITE, plus the transaction level) together with a
transaction ID seems like it might be a good approach. The transaction ID
could be a client-generated UUID, which would save a round-trip. Or maybe
a cookie would be a better approach?
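As a straw man, with the extension parameters invented here ("txn-id" and
"txn-mode" are not part of any spec):

    import java.util.UUID;
    import org.apache.jena.update.UpdateExecutionFactory;
    import org.apache.jena.update.UpdateFactory;

    public class TxnStrawMan {
        public static void main(String[] args) {
            // Client-generated transaction id: no extra round-trip to
            // allocate it on the server.
            String txnId = UUID.randomUUID().toString();

            // Hypothetical extension parameters on the standard
            // update endpoint.
            String endpoint = "http://host/dataset/update"
                + "?txn-id=" + txnId + "&txn-mode=write";

            UpdateExecutionFactory.createRemote(
                UpdateFactory.create(
                    "INSERT DATA { <http://example/s> <http://example/p> 1 }"),
                endpoint).execute();
            // ...further updates with the same txn-id, then some
            // final commit request (equally hypothetical).
        }
    }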
Yes.
Alternative: ETags give transactions for consistency. There is no abort,
though, and a client crash has no rollback, but it's the web way.
Andy
-Stephen
[1] http://www.w3.org/TR/sparql11-http-rdf-update/
[2] http://www.openrdf.org/doc/sesame2/api/org/openrdf/repository/RepositoryConnection.html
[3] http://www.w3.org/2011/rdf-wg/wiki/Skolemization
[4] http://www.w3.org/TR/sparql11-protocol/