On Sat, Feb 11, 2012 at 12:42 PM, Andy Seaborne <[email protected]> wrote:
> On 11/02/12 02:04, Stephen Allen wrote:
>>
>> Is there a preferred way to represent a remote SPARQL endpoint in a
>> manner that is interchangeable with a local model/graph?  I have code
>> that generates RDF using Jena objects, and I want to store them in
>> either a remote or local graph store (configurable).
>
>
> SPARQL?  I.e. don't work in terms of API calls but in terms of SPARQL.
> A simple DatasetGraph can wrap local/remote issues.
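
If I follow, the appeal is that the query string stays the same and
only the way it is executed changes between local and remote.  A rough
sketch of what I understand (Jena 2.x package names; the endpoint URL
is invented):

    import com.hp.hpl.jena.query.Dataset;
    import com.hp.hpl.jena.query.DatasetFactory;
    import com.hp.hpl.jena.query.QueryExecution;
    import com.hp.hpl.jena.query.QueryExecutionFactory;
    import com.hp.hpl.jena.query.ResultSet;
    import com.hp.hpl.jena.rdf.model.ModelFactory;

    public class LocalOrRemote {
        public static void main(String[] args) {
            // One SPARQL string, two ways to execute it.
            String sparql = "SELECT * WHERE { ?s ?p ?o } LIMIT 10";

            // Local: run against an in-memory dataset.
            Dataset local = DatasetFactory.create(ModelFactory.createDefaultModel());
            QueryExecution qe = QueryExecutionFactory.create(sparql, local);

            // Remote: only the factory call differs (endpoint URL is made up):
            //   QueryExecution qe = QueryExecutionFactory.sparqlService(
            //           "http://example.org/sparql", sparql);

            try {
                ResultSet rs = qe.execSelect();
                while (rs.hasNext())
                    System.out.println(rs.next());
            } finally {
                qe.close();
            }
        }
    }
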
Yeah, I guess I was mostly getting hung up on transactions spanning
multiple queries / inserts.

>> I see there is a GraphSPARQLService, but it doesn't appear finished
>> (also, I think I would want a DatasetGraph level interface).
>
>
> It's read-only and works by mapping all Graph.find to a remote query.
>
>
>> Also I note the DatasetGraphAccessorHTTP in Fuseki, which I believe
>> is an implementation of "SPARQL 1.1 Graph Store HTTP Protocol" [1].
>> This looks close to what I want, but forces you to add a dependency
>> on Fuseki, and it does not have streaming support for inserts or a
>> natural API that accepts Jena objects.
>
>
> DatasetGraphAccessorHTTP is supposed to migrate sometime.
>
> What do you mean by "natural API that accepts Jena objects"?  The
> graph store protocol is put/get of whole graphs only.
>
> If the graph is small, then get - do some operations - put might be
> workable.  This has transaction semantics (use etags).  Just depends
> on the size of the graph.
>
> And if not, maybe a design where changes are accumulated locally,
> then enacted at the end all at once on the potentially remote graph
> (with local mirror?).

Creating Jena objects and then having to serialize them to Turtle in
order to insert them via a SPARQL Update query seemed like a step that
could be made a little easier.

I thought at first a DatasetGraph wrapper around an endpoint might
work, but I don't think it's quite the correct interface.  It isn't
well suited to a streaming add operation, as each add() seems to imply
a commit.  Also, for querying, SPARQL seems like a better interface
than find().  I'll think about what I'm trying to do a little more.

Sesame's RepositoryConnection looks a lot like what I was imagining,
though.  The biggest issue is managing transactions; that is where I
thought we might need to extend the SPARQL protocol.  But even without
transaction support, the interface seems useful.

>> Basically I think I'm looking for a Connection object similar to
>> JDBC or Sesame's RepositoryConnection [2].  You could connect to
>> either a local DatasetGraph or a remote endpoint.  For the remote
>> endpoint case, I don't think it's possible to accomplish fully with
>> standard SPARQL because of two issues: 1) no transaction support
>> across multiple queries/updates and 2) local blank node identifiers.
>>
>> Does anyone have any ideas?  Should I start designing such a thing?
>> The blank node problem could be solved with skolemization [3], and
>> we could initially ignore the transaction issue (thus supporting
>> auto-commit mode only).
>
>
> Bnodes can be handled by enabling bNode label output, and using
> <_:...> on input.
>
> But do think about whether bNodes are the right thing to use in the
> first place.
>
>
>> To add transaction support, we would have to add an extension to the
>> SPARQL 1.1 Protocol [4].  An extra parameter to indicate the type of
>> transaction (READ or WRITE, plus an isolation level) and a
>> transaction ID seems like it might be a good approach.  The
>> transaction ID could be a client-generated UUID, which would save a
>> round-trip.  Or maybe a cookie would be a better approach?
>
>
> Yes.
>
> Alternative: E-Tags give transactions for consistency.  No abort,
> though, and a client crash has no rollback, but it's the web way.

I hadn't heard of E-Tags, I'll take a look!

-Stephen
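
P.S. Having now skimmed how ETags work: the pattern Andy describes
seems to be GET the graph, remember its ETag, and PUT the modified
graph back with If-Match, so the update fails with 412 if someone else
changed the graph in between.  An untested sketch against the graph
store protocol (the service URL is invented; plain java.net so the
moving parts are visible):

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.InputStreamReader;
    import java.io.OutputStreamWriter;
    import java.io.Writer;
    import java.net.HttpURLConnection;
    import java.net.URL;

    public class EtagPutSketch {
        public static void main(String[] args) throws IOException {
            // Graph store protocol endpoint (invented URL).
            URL graph = new URL("http://example.org/data?graph=http%3A%2F%2Fexample.org%2Fg1");

            // GET the graph and remember its ETag.
            HttpURLConnection get = (HttpURLConnection) graph.openConnection();
            get.setRequestProperty("Accept", "text/turtle");
            String etag = get.getHeaderField("ETag");
            String turtle = slurp(get.getInputStream());

            // ... modify the graph locally ...

            // PUT it back, conditional on nobody else having changed it.
            HttpURLConnection put = (HttpURLConnection) graph.openConnection();
            put.setRequestMethod("PUT");
            put.setDoOutput(true);
            put.setRequestProperty("Content-Type", "text/turtle");
            if (etag != null)
                put.setRequestProperty("If-Match", etag);
            Writer w = new OutputStreamWriter(put.getOutputStream(), "UTF-8");
            w.write(turtle);
            w.close();

            // 412 Precondition Failed means a concurrent writer won the race.
            System.out.println("PUT status: " + put.getResponseCode());
        }

        private static String slurp(InputStream in) throws IOException {
            BufferedReader r = new BufferedReader(new InputStreamReader(in, "UTF-8"));
            StringBuilder sb = new StringBuilder();
            for (String line; (line = r.readLine()) != null; )
                sb.append(line).append('\n');
            return sb.toString();
        }
    }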

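P.P.S. For completeness, the "serialize and splice into an update"
step I was grumbling about looks roughly like this (a sketch; class
name is mine, and I emit N-Triples rather than Turtle because prefix
directives can't appear inside INSERT DATA):

    import java.io.StringWriter;

    import com.hp.hpl.jena.rdf.model.Model;
    import com.hp.hpl.jena.rdf.model.ModelFactory;
    import com.hp.hpl.jena.update.UpdateFactory;
    import com.hp.hpl.jena.vocabulary.RDFS;

    public class InsertDataSketch {
        public static void main(String[] args) {
            // Build some statements with the Jena API.
            Model m = ModelFactory.createDefaultModel();
            m.createResource("http://example.org/thing")
             .addProperty(RDFS.label, "a thing");

            // Serialize them (N-Triples, so no prefix directives)...
            StringWriter out = new StringWriter();
            m.write(out, "N-TRIPLES");

            // ...and splice the result into a SPARQL Update request.
            String update = "INSERT DATA {\n" + out.toString() + "}\n";

            UpdateFactory.create(update);   // parses cleanly as SPARQL Update
            System.out.println(update);
        }
    }

It works, but hand-assembling strings out of freshly minted Jena
objects is exactly the step I'd like a connection-style API to absorb.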