Hi Andy, 

Thanks for your response. Is there any detailed documentation of the Jena 
optimization (rewriting & reordering) available online? If so, could you 
please send me the reference? 

Also, if I create my own query plan (in algebraic form), is it possible to make 
Jena execute it as it is? That is, how can I turn off Jena’s optimization 
(rewriting & reordering) and force my own query plan to be executed? 

Thanks again for your help. 

Regards,

Kashif Rabbani, 
Research Assistant, 
Department of Computer Science,
Aalborg University, Denmark.

> On 3 Mar 2020, at 13.43, Andy Seaborne <a...@apache.org> wrote:
> 
> Hi Kashif,
> 
> Optimization happens in two stages:
> 
> 1. Rewrite of the algebra
> 2. Reordering of the BGPs
> 
> BGPs can be implemented in different ways - and they are an inference 
> extension point in SPARQL.
> 
> What you see is the first. BGPs are reordered during execution.
> 
> The algorithm can be stats driven for TDB and TDB2 storage:
>  https://jena.apache.org/documentation/tdb/optimizer.html
> 
> The interface is 
> org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation
> 
> and a general-purpose reordering is used for in-memory datasets and is the 
> default for TDB.
> 
> The default reorder is "grounded triples first, leave equal weights alone". 
> It cascades: it takes into account whether a term is bound by an earlier step.
> 
> >    { ?a  mbz:alias           "Amy Beach" .
> >      ?b  cmno:hasInfluenced  ?a .
> >      ?c  mo:composer         ?b ;
> >          bio:date            ?d
> >    }
> 
> That's actually the default order -
> 
> ?a  mbz:alias           "Amy Beach" .
> 
> has two bound terms so is done first.
> 
> and now ?a is bound so
> ?b  cmno:hasInfluenced  ?a .
> 
> etc.
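
As a rough illustration of the "grounded triples first, leave equal weights alone" rule walked through above, here is a minimal, self-contained sketch. It is not Jena's implementation and does not use the ReorderTransformation interface: the class name BgpReorderSketch, the string-array representation of a triple pattern, and the plain "count the bound terms" weight are all assumptions made for illustration. At each step it picks the pattern with the most bound terms (constants, or variables bound by an earlier pick) and keeps the original order on ties.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a greedy "most bound terms first" reorder.
// Triple patterns are plain String[3] arrays; this is not Jena's API.
class BgpReorderSketch {

    /** A term is a variable if it starts with '?'; anything else counts as a constant. */
    static boolean isVar(String term) {
        return term.startsWith("?");
    }

    /** Counts terms that are constants or variables already bound by an earlier pattern. */
    static int boundCount(String[] triple, Set<String> boundVars) {
        int n = 0;
        for (String term : triple) {
            if (!isVar(term) || boundVars.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Greedily orders patterns by bound-term count, re-evaluated after every pick. */
    static List<String[]> reorder(List<String[]> patterns) {
        List<String[]> remaining = new ArrayList<>(patterns);
        List<String[]> ordered = new ArrayList<>();
        Set<String> boundVars = new HashSet<>();
        while (!remaining.isEmpty()) {
            int best = 0;
            for (int i = 1; i < remaining.size(); i++) {
                // Strict '>' leaves equal weights in their original order.
                if (boundCount(remaining.get(i), boundVars)
                        > boundCount(remaining.get(best), boundVars)) {
                    best = i;
                }
            }
            String[] chosen = remaining.remove(best);
            ordered.add(chosen);
            // Variables of the chosen pattern count as bound for later picks.
            for (String term : chosen) {
                if (isVar(term)) {
                    boundVars.add(term);
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        // The BGP from the query, deliberately given in reverse order.
        List<String[]> bgp = Arrays.asList(
            new String[] {"?c", "bio:date", "?d"},
            new String[] {"?c", "mo:composer", "?b"},
            new String[] {"?b", "cmno:hasInfluenced", "?a"},
            new String[] {"?a", "mbz:alias", "\"Amy Beach\""});
        for (String[] t : reorder(bgp)) {
            System.out.println(String.join(" ", t));
        }
    }
}
```

Even with the patterns supplied in reverse, the sketch yields mbz:alias, then cmno:hasInfluenced, then mo:composer, then bio:date, which is the order in the original query. A stats-driven reorder would replace the simple bound-term count with estimated match counts.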
> 
> Given the boundedness of the pattern, and that (I guess) mbz:alias "Amy Beach" 
> is quite selective, with stats ? <property> ? would have to be less numerous 
> than ? mbz:alias "Amy Beach" to be chosen ahead of it.
> 
> There's no algebra optimization for your example, only BGP reordering.
> 
> qparse --print=opt shows stage 1 optimizations.
> 
> Executing with "explain" shows BGP execution.
> 
>    Andy
> 
> 
> 
> On 03/03/2020 11:56, Kashif Rabbani wrote:
>> Hi awesome community,
>> I have a question: I am working on optimizing SPARQL query plans and I 
>> wonder whether the order of triple patterns in the WHERE clause affects the 
>> query plan.
>> For example, given the following query:
>> PREFIX  bio:  <http://purl.org/vocab/bio/0.1/>
>> PREFIX  mo:   <http://purl.org/ontology/mo/>
>> PREFIX  mbz:  <http://dbtune.org/musicbrainz/resource/vocab/>
>> PREFIX  cmno: <http://purl.org/ontology/classicalmusicnav#>
>> SELECT  ?a ?b ?c
>> WHERE
>>   { ?a  mbz:alias           "Amy Beach" .
>>     ?b  cmno:hasInfluenced  ?a .
>>     ?c  mo:composer         ?b ;
>>         bio:date            ?d
>>   }
>> // Let’s generate its algebra
>> Op op = Algebra.compile(query); results in this:
>> (project (?a ?b ?c)
>>   (bgp
>>     (triple ?a <http://dbtune.org/musicbrainz/resource/vocab/alias> "Amy Beach")
>>     (triple ?b <http://purl.org/ontology/classicalmusicnav#hasInfluenced> ?a)
>>     (triple ?c <http://purl.org/ontology/mo/composer> ?b)
>>     (triple ?c <http://purl.org/vocab/bio/0.1/date> ?d)
>>   ))
>> The BGP in the algebra follows exactly the same order as specified in the 
>> WHERE clause of the query. More precisely, does Jena construct the query 
>> plan as it is, or will it change the order at some other level?
>> I would be happy if someone could explain how Jena's plan is actually 
>> constructed. If I use some statistics of the actual RDF graph to change 
>> the order of triple patterns in the BGP based on selectivity, would that 
>> optimize the plan somehow?
>> Many Thanks,
>> Best Regards,
>> Kashif Rabbani.
