Re: SPARQL performance question

Steve Vestal Sun, 23 Feb 2020 15:11:52 -0800

If I comment out the FILTER clause that prevents variable aliasing, the
query is processed almost immediately.  The number of rows goes from 192
to 576, but it's fast.  What is the proper way to write a query when you
want a particular set of variables to have distinct solution values?
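
For concreteness, here is a minimal sketch of how a pair-wise not-equals filter of the kind described in my original message can be generated. The class name, method name, and variable names are hypothetical and are not taken from my actual query. For n variables it emits n(n-1)/2 comparisons, so 9 variables give 36.

```java
import java.util.ArrayList;
import java.util.List;

public class DistinctFilter {
    /** Builds one FILTER with a != comparison for every unordered pair of variables. */
    static String pairwiseNotEquals(List<String> vars) {
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < vars.size(); i++) {
            for (int j = i + 1; j < vars.size(); j++) {
                terms.add("?" + vars.get(i) + " != ?" + vars.get(j));
            }
        }
        // With fewer than two variables there is nothing to compare.
        return terms.isEmpty() ? "" : "FILTER(" + String.join(" && ", terms) + ")";
    }

    public static void main(String[] args) {
        // Hypothetical variable names, for illustration only.
        System.out.println(pairwiseNotEquals(List.of("a", "b", "c")));
        // prints: FILTER(?a != ?b && ?a != ?c && ?b != ?c)
    }
}
```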


When I iterated over the statements in the OntModel, the count went from
a model size() of ~1500 to ~4700 iterated statements.  I speculated that
this iteration was materializing the entire inference closure (which was
fast).  Is there some other set of calls needed to do that?

Are there circumstances where it is faster to materialize the entire
closure and query a plain model than to query the inference model itself?
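
For reference, my understanding of the copy-to-plain-model approach Dave suggests in the quoted reply below is roughly the following. This is a sketch only, assuming Apache Jena 3.x package names and the OWLMicro rule reasoner. The class name, the example.org namespace, and the tiny two-statement ontology are made up for illustration.

```java
import org.apache.jena.ontology.OntModel;
import org.apache.jena.ontology.OntModelSpec;
import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.rdf.model.Resource;
import org.apache.jena.vocabulary.RDF;
import org.apache.jena.vocabulary.RDFS;

public class MaterializeClosure {
    /** Copies every statement visible through the source model, asserted or inferred, into a plain model. */
    static Model materialize(Model source) {
        Model plain = ModelFactory.createDefaultModel(); // in-memory, no reasoner attached
        plain.add(source);                               // iterates the source, which drives the inference
        return plain;
    }

    public static void main(String[] args) {
        // Assumption: OWLMicro, as suggested in the reply; other specs plug in the same way.
        OntModel inf = ModelFactory.createOntologyModel(OntModelSpec.OWL_MEM_MICRO_RULE_INF);
        String ns = "http://example.org/demo#"; // hypothetical namespace
        Resource animal = inf.createResource(ns + "Animal");
        Resource dog = inf.createResource(ns + "Dog");
        Resource rex = inf.createResource(ns + "rex");
        inf.add(dog, RDFS.subClassOf, animal);
        inf.add(rex, RDF.type, dog);

        Model plain = materialize(inf);
        // The inferred triple should now be an ordinary stored statement in the plain model.
        System.out.println("rex is an Animal: " + plain.contains(rex, RDF.type, animal));
    }
}
```

The select query would then be run with QueryExecutionFactory.create(selectQuery, plain) rather than against the OntModel, so the reasoner is not consulted while iterating the ResultSet.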

On 2/23/2020 3:33 PM, Dave Reynolds wrote:
> The issue is not performance of SPARQL but performance of the
> inference engines.
>
> If you need some OWL inference then your best bet is OWLMicro.
>
> If that's too slow to query directly then one option to try is to
> materialize the entire inference closure and then query that. You can
> do that by simply copying the inference model to a plain model.
>
> If that's too slow then you'll need a higher performance third party
> reasoner.
>
> Dave
>
> On 23/02/2020 18:57, Steve Vestal wrote:
>> I'm looking for suggestions on a SPARQL performance issue.  My test
>> model has ~800 sentences, and processing of one select query takes about
>> 25 minutes.  The query is a basic graph pattern with 9 variables and 20
>> triples, plus a filter that forces distinct variables to have distinct
>> solutions using pair-wise not-equals constraints.  No OPTIONAL clause or
>> anything else fancy.
>>
>> I am issuing the query against an inference model.  Most of the asserted
>> sentences are in imported models.  If I iterate over all the statements
>> in the OntModel, I get ~1500 almost instantly.  I experimented with
>> several of the reasoners.
>>
>> Below is the basic control flow.  The thing I found curious is that the
>> execSelect() method finishes almost instantly.  It is the iteration over
>> the ResultSet that is taking all the time, apparently in the call to
>> selectResult.hasNext(). The result has 192 rows, 9 columns.  The results
>> are provided in bursts of 8 rows each, with ~1 minute between bursts.
>>
>>          OntModel ontologyModel = getMyOntModel(); // Tried various reasoners
>>          String selectQuery = getMySelectQuery();
>>          QueryExecution selectExec =
>>              QueryExecutionFactory.create(selectQuery, ontologyModel);
>>          ResultSet selectResult = selectExec.execSelect();
>>          // Time seems to be spent in hasNext()
>>          while (selectResult.hasNext()) {
>>              QuerySolution selectSolution = selectResult.next();
>>              for (String var : getMyVariablesOfInterest()) {
>>                  RDFNode varValue = selectSolution.get(var);
>>                  // process varValue
>>              }
>>          }
>>
>> Any suggestions would be appreciated.
>>
