Aklakan commented on PR #3184:
URL: https://github.com/apache/jena/pull/3184#issuecomment-3291792984
The goal of `DatasetGraphOverSparql` is to act as a bridge between ARQ and
external SPARQL-capable stores.
The design of the SPARQL dispatcher system proposed by this PR allows both
`QueryExec.dataset(dsgOverSparql)` and `RDFLink.connect(dsgOverSparql)` to
efficiently proxy queries and updates to the backend.
However, for graph-store-protocol (GSP) operations, ARQ relies on the
DatasetGraph API, which is indeed far from ideal for use with a SPARQL backend.
This is consistent, however: the class is named `DatasetGraphOverSparql`,
not `DatasetGraphOverSparqlAndGSP`, so the protocol is fixed to SPARQL.
For SPARQL, the current design already allows configuration of protocol
matters on the RDFLink level.
The snippet below is a variation of this PR's
`ExampleDBpediaViaRemoteDataset.java`:
```java
Creator<RDFLink> linkCreator = () -> RDFLinkHTTP.newBuilder()
    .destination("http://dbpedia.org/sparql")
    // Request Thrift results instead of the default application/sparql-results+json.
    .acceptHeaderSelectQuery(WebContent.contentTypeResultsThrift)
    .build();

DatasetGraph dsg = new DatasetGraphOverRDFLink(linkCreator);
QueryExec.dataset(dsg)...; // Queries will be dispatched to the link;
                           // execution won't use the DatasetGraph API.
```
The fundamental issue is that DatasetGraph is central to most parts of Jena
(up to Fuseki).
At some point in the future - in the appropriate places - it might be worth
superseding DatasetGraph with a more general `RDFLinkSource` (a factory of
RDFLinks - similar to JDBC's DataSource).
This way, Fuseki could forward GSP requests to vendor-specific driver
implementations - but I feel that these changes are outside the scope of this
PR.
A JDBC-like DataSource was briefly mentioned in
https://github.com/apache/jena/pull/1390#issuecomment-1165821801
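For illustration only - none of the names below exist in Jena, and the `Link` interface is a stand-in for the real `RDFLink` - this is a rough sketch of what such a DataSource-style factory could look like:

```java
// Sketch only: hypothetical names, not Jena API.
public class RDFLinkSourceSketch {

    // Stand-in for Jena's RDFLink: one connection to a SPARQL endpoint.
    interface Link extends AutoCloseable {
        String query(String sparql);
        @Override
        default void close() {}
    }

    // DataSource-style factory: each call hands out a fresh link,
    // analogous to javax.sql.DataSource#getConnection.
    interface RDFLinkSource {
        Link newLink();
    }

    public static void main(String[] args) {
        // A toy source that "answers" queries by echoing them.
        RDFLinkSource source = () -> sparql -> "echo: " + sparql;
        try (Link link = source.newLink()) {
            System.out.println(link.query("ASK {}"));  // prints "echo: ASK {}"
        }
    }
}
```

A GSP-aware Fuseki could then ask the source for a fresh link per request, the same way a servlet asks a DataSource for a connection.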
> Transforming update requests via
`StreamRDFToUpdateRequest.sendGraphTriplesToStream`
I think in principle the abstraction with a configurable update sink is ok,
but I agree that a custom `graph-to-update-requests` mapper should not require
subclassing!
Also, the default strategy should be to put the inserts into a single
request instead of performing some magic splitting that would sever blank
nodes shared across triples.
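To make the single-request default concrete, here is a minimal, Jena-free sketch (the helper name `toSingleInsertData` is made up) that folds a batch of already-serialized triples into one `INSERT DATA` request, so blank node labels shared between triples end up in the same request:

```java
import java.util.List;
import java.util.stream.Collectors;

public class SingleInsertData {
    // Hypothetical helper: collect ALL triples into one INSERT DATA update
    // rather than splitting them across several requests. Splitting would
    // break shared blank nodes, because each request scopes its own labels.
    static String toSingleInsertData(List<String> triples) {
        return triples.stream()
            .collect(Collectors.joining("\n  ", "INSERT DATA {\n  ", "\n}"));
    }

    public static void main(String[] args) {
        // _:b0 appears in both triples; one request preserves that sharing.
        String update = toSingleInsertData(List.of(
            "_:b0 <http://example.org/p> \"v1\" .",
            "_:b0 <http://example.org/q> \"v2\" ."));
        System.out.println(update);
    }
}
```

If the two triples were sent as two separate update requests, the backend would mint two unrelated blank nodes for `_:b0`.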
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]