Re: Include named graph data when not using a GRAPH clause

Andy Seaborne Wed, 14 Jul 2021 02:31:20 -0700



On 14/07/2021 07:02, Perry, Kevin wrote:

Thank you, that already helps a lot.
It is not _quite_ what we had in mind, though.

While `unionDefaultGraph` makes the default graph a union of all the
named graphs, we then no longer can query data from the actual default
graph.

Assuming we add three people, two of them in named graphs:
     ex:Alice a foaf:Person
     GRAPH ex:graph1 { ex:Bob a foaf:Person }
     GRAPH ex:graph2 { ex:Carl a foaf:Person }

When we run the query
     SELECT * WHERE { ?person a foaf:Person }

without `unionDefaultGraph` we only get `ex:Alice`.
With `unionDefaultGraph` we get only `ex:Bob` and `ex:Carl`.

We would like to get _all three_ people as a result, i.e. a union of
the default graph and all the named graphs.

There is no simple configuration setting that does that. The default graph is not a graph with a hidden name.

It can be accessed as <urn:x-arq:DefaultGraph> but it is not stored like that.

unionDefaultGraph is aimed at managing data in named graphs while presenting it to queries as a single graph.

In SPARQL Update, this is the difference between NAMED and ALL in, say, CLEAR ALL or CLEAR NAMED.
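
To make the three views concrete, here is a toy in-memory sketch in plain Java. It does not use the Jena API and says nothing about how TDB stores data; the class and method names (`GraphViews`, `defaultGraphOnly`, `unionOfNamed`, `defaultPlusNamed`) are made up for illustration. It only replays the Alice/Bob/Carl example from the quoted message.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

// Toy model only: NOT the Jena API. A "quad" here is just a
// (graph, person) pair; a null graph stands for the default graph.
class GraphViews {
    static final class Quad {
        final String graph;   // null means "in the default graph"
        final String person;
        Quad(String graph, String person) {
            this.graph = graph;
            this.person = person;
        }
    }

    // The example data from the quoted message.
    static final List<Quad> DATA = Arrays.asList(
        new Quad(null, "ex:Alice"),
        new Quad("ex:graph1", "ex:Bob"),
        new Quad("ex:graph2", "ex:Carl"));

    // Collect the people visible through a given view of the dataset.
    static Set<String> select(Predicate<Quad> view) {
        Set<String> people = new TreeSet<>();
        for (Quad q : DATA) {
            if (view.test(q)) {
                people.add(q.person);
            }
        }
        return people;
    }

    // Plain default graph: what a query without GRAPH sees by default.
    static Set<String> defaultGraphOnly() {
        return select(q -> q.graph == null);
    }

    // Union of the named graphs only: the view unionDefaultGraph presents
    // (the NAMED side of the CLEAR NAMED / CLEAR ALL distinction).
    static Set<String> unionOfNamed() {
        return select(q -> q.graph != null);
    }

    // Default graph plus every named graph (the ALL side).
    static Set<String> defaultPlusNamed() {
        return select(q -> true);
    }

    public static void main(String[] args) {
        System.out.println(defaultGraphOnly());  // [ex:Alice]
        System.out.println(unionOfNamed());      // [ex:Bob, ex:Carl]
        System.out.println(defaultPlusNamed());  // [ex:Alice, ex:Bob, ex:Carl]
    }
}
```

The third view is the one being asked for, and it is the one that has no single configuration switch.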


The query

SELECT *
FROM <urn:x-arq:DefaultGraph>
FROM <urn:x-arq:UnionGraph>
WHERE {
  ?s ?p ?o
}

works, and this is equivalent to adding default-graph-uri parameters to the HTTP request:

wget -O - 'http://localhost:3030/ds?default-graph-uri=urn:x-arq:DefaultGraph&default-graph-uri=urn:x-arq:UnionGraph&query=SELECT * { ?s ?p ?o }'

So the endpoint can be given as "host?default-graph-uri=...&default-graph-uri=...", which works if the calling software library accepts endpoints that already have a query string.

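import org.apache.jena.query.QueryExecution;
import org.apache.jena.query.QueryExecutionFactory;
import org.apache.jena.sparql.util.QueryExecUtils;

// Endpoint URL that already carries both default-graph-uri parameters.
String URL = "http://localhost:3030/ds?default-graph-uri=urn:x-arq:DefaultGraph&default-graph-uri=urn:x-arq:UnionGraph";

try ( QueryExecution qExec = QueryExecutionFactory.sparqlService(URL, "SELECT * { ?s ?p ?o }") ) {
      // Run the query remotely and print the results.
      QueryExecUtils.executeQuery(qExec);
}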
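
If the calling library does not accept an endpoint that already has a query string, one option is to build the whole GET request URL by hand. The following is only a sketch using the Java standard library (no Jena); the class and method names (`UnionEndpoint`, `endpointWithGraphs`, `requestUrl`) are hypothetical, and it assumes the same local service URL as above.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Sketch only: builds SPARQL protocol URLs with two default-graph-uri
// parameters. Names here are illustrative, not part of any library.
class UnionEndpoint {
    static final String DEFAULT_GRAPH = "urn:x-arq:DefaultGraph";
    static final String UNION_GRAPH = "urn:x-arq:UnionGraph";

    // Percent-encode a query-string value (form encoding, spaces become '+').
    static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    // Endpoint with both graphs named as default-graph-uri parameters.
    static String endpointWithGraphs(String service) {
        return service
            + "?default-graph-uri=" + enc(DEFAULT_GRAPH)
            + "&default-graph-uri=" + enc(UNION_GRAPH);
    }

    // Full GET request URL: the endpoint above plus the encoded query.
    static String requestUrl(String service, String query) {
        return endpointWithGraphs(service) + "&query=" + enc(query);
    }

    public static void main(String[] args) {
        System.out.println(
            requestUrl("http://localhost:3030/ds", "SELECT * { ?s ?p ?o }"));
    }
}
```

The resulting string can be handed to any HTTP client. Encoding the parameter values also avoids the raw spaces and braces that appear in the wget command line.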

A similar setup can be written in assemblers for TDB1. This will not work for TDB2 because of the way transactions work.


PREFIX fuseki: <http://jena.apache.org/fuseki#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX tdb1: <http://jena.hpl.hp.com/2008/tdb#>
PREFIX ja: <http://jena.hpl.hp.com/2005/11/Assembler#>
PREFIX sdb: <http://jena.hpl.hp.com/2007/sdb#>

[] rdf:type fuseki:Server ;
   .

<#tdb> rdf:type fuseki:Service ;
    fuseki:name "ds" ;
    fuseki:endpoint  [ fuseki:operation fuseki:query ; ] ;
    fuseki:dataset <#dataset> ;
.

<#dataset> rdf:type ja:RDFDataset ;
    ja:defaultGraph <#dftGraph>
    .

<#dftGraph> rdf:type ja:UnionModel ;
   ja:subModel <#graph1> ;
   ja:subModel <#graph2> ;
   .

<#graph1> rdf:type tdb1:GraphTDB ;
    tdb1:dataset <#base> ;
    .

<#graph2> rdf:type tdb1:GraphTDB ;
    tdb1:dataset <#base> ;
    tdb1:graphName <urn:x-arq:UnionGraph> ;
    .


<#base> rdf:type    tdb1:DatasetTDB ;
    tdb1:location "DB1" ;
    .

All of these have a loss in performance compared with tdb:unionDefaultGraph.

    Andy


Kevin

On Tue, 2021-07-13 at 09:41 +0100, Andy Seaborne wrote:

Hi Kevin,

The configuration setting you are looking for is "union default graph"

With this, the default graph of the dataset is a view of the union of
all the named graphs.

For TDB2 (TDB1 is similar), the server configuration would include
something like:

:dataset_tdb2 rdf:type  tdb2:DatasetTDB2 ;
      tdb2:location "DB2" ;
      tdb2:unionDefaultGraph true ;
      .

      Andy

https://jena.apache.org/documentation/tdb2/tdb2_fuseki.html

On 13/07/2021 06:35, Perry, Kevin wrote:

Hi!

We're currently extending an application using both Metaphactory and
Blazegraph to also support Jena, or more specifically Fuseki.

In a perfect world, we'd be using the exact same SPARQL queries for
both Fuseki and Blazegraph.
Unfortunately we hit a bit of a snag when it comes to the handling of
the default and named graphs[1].

We are used to not including a GRAPH clause in our queries and
receiving results from all graphs (i.e. both from the default and
named graphs).
With Fuseki, we only get results from either the default graph or any
named graph matching a GRAPH clause.

Is there some *configuration setting* (or maybe something equally
trivial) to have Fuseki behave similarly to Blazegraph in this
respect?
At least according to the article[1], there is supposed to be a
setting ("[...] Apache Jena offer options to change their default
behavior"), but it unfortunately doesn't go into more detail.

In the long run, we do intend to rework the queries and how we store
data (i.e. putting everything into named graphs) but unfortunately
that's not something we can afford to do right away.

Kevin

[1] https://blog.metaphacts.com/the-default-graph-demystified
