Re: RDFS subPropertyOf property path query performance

Lorenz Buehmann Mon, 13 May 2024 03:05:01 -0700

Hi,

does it mean that ?origin is always bound to a resource in the graph? Can you share the whole query, maybe?

How long are the sequences in the graph? How many paths start from a node, i.e. what's the out-degree per node in general?


Also, would it be possible to share some kind of data for investigation?

In general, the RDFS inference you're using is pretty lightweight and runs at query evaluation time. All it does when evaluating a triple pattern is to incorporate, in your case, the rdfs:subPropertyOf triples from the schema, but that work might indeed grow at each step on the path.
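
To make that concrete, here is a toy sketch in plain Java. It is not Jena's implementation and makes no claim about its actual cost; the class, method names and the tiny graph are all made up for illustration, and it only looks at direct sub-properties. It just shows the shape of the work: a pattern on :flow has to be widened to its sub-properties every time one more step of the path is evaluated, and no :flow edges are stored in the data itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: plain collections stand in for the graph and the schema.
class SubPropertyPathSketch {

    // Hypothetical data: each edge is {subject, predicate, object}.
    static final List<String[]> EDGES = List.of(
        new String[] {"a", "flowA", "b"},
        new String[] {"b", "flowB", "c"},
        new String[] {"c", "flowA", "d"},
        new String[] {"a", "other", "x"});

    // Schema: flowA and flowB are sub-properties of flow.
    static final Map<String, List<String>> SCHEMA =
        Map.of("flow", List.of("flowA", "flowB"));

    // Predicates that satisfy a pattern on `property`: the property itself
    // plus its direct sub-properties (no transitive closure in this sketch).
    static Set<String> expand(String property, Map<String, List<String>> subProps) {
        Set<String> result = new LinkedHashSet<>();
        result.add(property);
        result.addAll(subProps.getOrDefault(property, List.of()));
        return result;
    }

    // Evaluates `origin property+ ?result` with a breadth-first search.
    static Set<String> reachable(String origin, String property,
                                 List<String[]> edges,
                                 Map<String, List<String>> subProps) {
        Set<String> found = new LinkedHashSet<>();
        Deque<String> frontier = new ArrayDeque<>(List.of(origin));
        while (!frontier.isEmpty()) {
            String node = frontier.poll();
            // The schema is consulted again for every node on the path.
            Set<String> predicates = expand(property, subProps);
            for (String[] e : edges) {
                boolean matches = e[0].equals(node) && predicates.contains(e[1]);
                if (matches && found.add(e[2])) {
                    frontier.add(e[2]);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // :flow+ with the schema, :flowA+ with the schema, :flow+ without it.
        System.out.println(reachable("a", "flow", EDGES, SCHEMA));
        System.out.println(reachable("a", "flowA", EDGES, SCHEMA));
        System.out.println(reachable("a", "flow", EDGES, Map.of()));
    }
}
```

In this toy model :flow+ with the schema reaches the same nodes that (:flowA | :flowB)+ would reach without it, while :flow+ without the schema finds nothing. Whether the repeated expansion is really what costs the time in your setup is something I can only tell with the full query and some data.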



Cheers,

Lorenz

On 13.05.24 09:41, Christian Clausen wrote:

In our graph we have :flow properties and need to distinguish different
kinds of flows, :flowA and :flowB.

We modelled this in RDFS:

     :flowA rdfs:subPropertyOf :flow
     :flowB rdfs:subPropertyOf :flow

Some of our SPARQL queries use :flow+ and some use :flowA+, always from an
origin:

     ?origin :flowA+ ?result

or

     ?origin :flow+ ?result

If we start Fuseki *without* RDFS, the following queries finish in a second
or two:

     ?origin :flowA+ ?result
     ?origin (:flowA | :flowB)+ ?result

If we start Fuseki *with* RDFS, the following queries take about 85 seconds:

     ?origin :flowA+ ?result
     ?origin :flow+ ?result


What is causing this difference in performance? Are we missing something or
should we avoid RDFS for optimal performance? Any other alternatives?

Our overall process is:

1. Generate TTL files with :flowA and :flowB properties (not :flow other
than implied by rdfs:subPropertyOf)
2. Load with TDB2 loader
3. Start Fuseki (with RDFS vocabulary or not)

Here follows the code we use to start Fuseki.

Without RDFS:

         Dataset data = TDB2Factory.connectDataset(options.directory);

         FusekiServer server = FusekiServer.create()
             .port(options.port)
             .loopback(true)
             .addDataset(options.datasetName, data.asDatasetGraph())
             .addEndpoint(options.datasetName, "query", Operation.Query)
             // shortestPath
             .registerOperation(shortestPathOp, WebContent.contentTypeJSON,
                 new ShortestPathService())
             .addEndpoint(options.datasetName, "shortestPath", shortestPathOp)
             // diagnostics
             .verbose(true)
             .enablePing(true)
             .enableStats(true)
             .enableMetrics(true)
             .enableTasks(true)
             .build();

         // Start
         server.start();

With RDFS:



         Dataset data = TDB2Factory.connectDataset(options.directory);
         Graph vocabulary = RDFDataMgr.loadGraph(options.vocabularyFileName);
         DatasetGraph dsg = RDFSFactory.datasetRDFS(data.asDatasetGraph(), vocabulary);

         FusekiServer server = FusekiServer.create()
             .port(options.port)
             .loopback(true)
             .addDataset(options.datasetName, dsg)
             .addEndpoint(options.datasetName, "query", Operation.Query)
             // shortestPath
             .registerOperation(shortestPathOp, WebContent.contentTypeJSON,
                 new ShortestPathService())
             .addEndpoint(options.datasetName, "shortestPath", shortestPathOp)
             // diagnostics
             .verbose(true)
             .enablePing(true)
             .enableStats(true)
             .enableMetrics(true)
             .enableTasks(true)
             .build();

         // Start
         server.start();

--
Lorenz Bühmann
Research Associate/Scientific Developer

Email [email protected]

Institute for Applied Informatics e.V. (InfAI) | Goerdelerring 9 | 04109 
Leipzig | Germany
