Re: Order of triple patterns in Where Clause

Marco Neumann Sun, 08 Mar 2020 10:03:07 -0700

Sorry, my bad, that was a typo: it should be "reductions". Reductions are a
very basic concept in functional languages like Haskell, and heap size is
measured in cells.



"Reduction is the process of converting an expression to a simpler form.
Conceptually, an expression is reduced by simplifying one reducible
expression (called “redex”) at a time."
https://www.futurelearn.com/courses/functional-programming-haskell/0/steps/27197


On Sun, Mar 8, 2020 at 4:44 PM Andy Seaborne <[email protected]> wrote:

> Then I don't understand what you are looking for.
>
> What's a "deduction"? What's a "cell"?
>
> On 08/03/2020 14:22, Marco Neumann wrote:
> > thank you for the hint Andy, but not quite what I was looking for.
> >
> > I was aiming more for a type of feature I am familiar with from purely
> > functional programming languages like haskell, hugs, miranda etc to display
> > deductions and cells used during execution.
> >
> > Marco
> >
> > On Sun, Mar 8, 2020 at 10:42 AM Andy Seaborne <[email protected]> wrote:
> >
> >>
> >>
> >> On 06/03/2020 17:40, Marco Neumann wrote:
> >>> is there statistical data available for the number of deductions /
> >>> joins performed for each SPARQL query of a QueryExecution object?
> >>
> >> If you run with "explain" you can find out but there isn't a specific
> >> record kept by the code.
> >>
> >>>
> >>> On Fri, Mar 6, 2020 at 3:16 PM Andy Seaborne <[email protected]> wrote:
> >>>
> >>>>
> >>>>
> >>>> On 05/03/2020 08:32, Kashif Rabbani wrote:
> >>>>> Hi Andy,
> >>>>>
> >>>>> Thanks for your response. I was wondering if there is any detailed
> >>>>> documentation of the Jena optimization (rewriting & reordering) available
> >>>>> online? If yes, can you please send me the reference?
> >>>>
> >>>> The code mainly.
> >>>>
> >>>> The TDB stats is documented.
> >>>>
> >>>>> Also, if I create my own query plan (in algebraic form), is it possible
> >>>>> to make Jena execute it as it is? I mean, how do I turn off Jena's
> >>>>> optimization (rewriting & reordering) and force my query plan to be
> >>>>> executed?
> >>>>
> >>>> Yes - two parts - algebra rewrites and BGP reordering.
> >>>>
> >>>> The context is a mapping of settings. There is:
> >>>> a global context (ARQ.getContext()),
> >>>> one per dataset (DatasetGraph.getContext()),
> >>>> one per query execution (QueryExecution.getContext()),
> >>>>
> >>>> and it is treated hierarchically:
> >>>>
> >>>> Lookup is in the QueryExecution, then the DatasetGraph, then the global context.
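
The lookup order described above (query execution, then dataset, then global) can be pictured with a small plain-Java sketch. This is only an illustration of the fallback idea: LayeredContextSketch and the string key "optimization" are hypothetical names made up here, and the class is not Jena's own Context implementation.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a settings context that falls back to a parent layer. */
final class LayeredContextSketch {
    private final Map<String, Object> settings = new HashMap<>();
    private final LayeredContextSketch parent; // null for the global layer

    LayeredContextSketch(LayeredContextSketch parent) {
        this.parent = parent;
    }

    void set(String key, Object value) {
        settings.put(key, value);
    }

    /** Returns the setting from the nearest layer that defines it, or null if none does. */
    Object get(String key) {
        for (LayeredContextSketch c = this; c != null; c = c.parent) {
            if (c.settings.containsKey(key)) {
                return c.settings.get(key);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        LayeredContextSketch global = new LayeredContextSketch(null);
        LayeredContextSketch dataset = new LayeredContextSketch(global);
        LayeredContextSketch execution = new LayeredContextSketch(dataset);

        global.set("optimization", true);
        execution.set("optimization", false); // override for this one execution only

        System.out.println(execution.get("optimization")); // nearest layer wins
        System.out.println(dataset.get("optimization"));   // falls back to the global layer
    }
}
```

Under this model, a setting placed on one query execution does not leak into other executions, while a setting on the global layer applies wherever nothing more specific is set.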
> >>>>
> >>>> :: Algebra rewrite
> >>>>
> >>>> Some algebra rewrites have to be done - property functions, and rewriting
> >>>> some variables due to scoping. These aren't really "optimization steps"
> >>>> but happen in that phase. There is OptimizerMinimal for those.
> >>>>
> >>>> To turn off the optimizer and still do the minimum steps:
> >>>>
> >>>> context.set(ARQ.optimization, false)
> >>>>
> >>>> Alternatively, Algebra.exec(op, dsg) executes the algebra as given - that's a
> >>>> very low-level way of doing it.
> >>>>
> >>>> Turning the optimizer off is better because all the APIs work. eg
> >>>> QueryExecution.
> >>>>
> >>>> :: BGP reordering
> >>>>
> >>>> The reordering of triple patterns is separate.
> >>>> BGP steps are performed by a StageGenerator.
> >>>>
> >>>> To set up to use a custom StageGenerator:
> >>>>
> >>>> StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
> >>>>
> >>>> That's really only a call of
> >>>>       context.set(ARQ.stageGenerator, myStageGenerator) ;
> >>>>
> >>>> The default is StageGeneratorGeneric that does ReorderFixed.
> >>>> It is used if there is no other setting in the context.
> >>>>
> >>>>        Andy
> >>>>
> >>>>>
> >>>>> Thanks again for your help.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Kashif Rabbani,
> >>>>> Research Assistant,
> >>>>> Department of Computer Science,
> >>>>> Aalborg University, Denmark.
> >>>>>
> >>>>>> On 3 Mar 2020, at 13.43, Andy Seaborne <[email protected]> wrote:
> >>>>>>
> >>>>>> Hi Kashif,
> >>>>>>
> >>>>>> Optimization happens in two stages:
> >>>>>>
> >>>>>> 1. Rewrite of the algebra
> >>>>>> 2. Reordering of the BGPs
> >>>>>>
> >>>>>> BGPs can be implemented in different ways - and they are an inference
> >>>>>> extension point in SPARQL.
> >>>>>>
> >>>>>> What you see is the first. BGPs are reordered during execution.
> >>>>>>
> >>>>>> The algorithm can be stats driven for TDB and TDB2 storage:
> >>>>>>     https://jena.apache.org/documentation/tdb/optimizer.html
> >>>>>>
> >>>>>> The interface is
> >>>> org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation
> >>>>>>
> >>>>>> and a general purpose reordering is done for in-memory and is the
> >>>> default for TDB.
> >>>>>>
> >>>>>> The default reorder is "grounded triples first, leave equal weights
> >>>> alone". It cascades whether a term is bound by an earlier step.
> >>>>>>
> >>>>>>>       { ?a  mbz:alias           "Amy Beach" .
> >>>>>>>         ?b  cmno:hasInfluenced  ?a .
> >>>>>>>         ?c  mo:composer         ?b ;
> >>>>>>>             bio:date            ?d
> >>>>>>>       }
> >>>>>>
> >>>>>> That's actually the default order -
> >>>>>>
> >>>>>> ?a  mbz:alias           "Amy Beach" .
> >>>>>>
> >>>>>> has two bound terms so is done first.
> >>>>>>
> >>>>>> and now ?a is bound so
> >>>>>> ?b  cmno:hasInfluenced  ?a .
> >>>>>>
> >>>>>> etc.
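
The walk-through above can be reproduced with a small plain-Java sketch of a "most bound terms first, ties keep their original position" ordering. This is a deliberately simplified illustration and not Jena's ReorderFixed code (which is not reproduced here): BgpReorderSketch is a hypothetical name, triple patterns are modelled as plain string arrays, and every non-variable term is given equal weight, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Simplified greedy ordering of triple patterns; an illustration, not Jena's implementation. */
final class BgpReorderSketch {

    /** In this sketch a term is a variable if it starts with '?'; anything else is grounded. */
    static boolean isVariable(String term) {
        return term.startsWith("?");
    }

    /** Counts the terms that are constants or variables bound by an earlier step. */
    static int boundTerms(String[] triple, Set<String> boundVars) {
        int n = 0;
        for (String term : triple) {
            if (!isVariable(term) || boundVars.contains(term)) {
                n++;
            }
        }
        return n;
    }

    /** Repeatedly picks the remaining pattern with the most bound terms. */
    static List<String[]> reorder(List<String[]> patterns) {
        List<String[]> remaining = new ArrayList<>(patterns);
        List<String[]> ordered = new ArrayList<>();
        Set<String> boundVars = new HashSet<>();
        while (!remaining.isEmpty()) {
            int best = 0;
            for (int i = 1; i < remaining.size(); i++) {
                // Strict '>' keeps the earlier pattern on a tie ("leave equal weights alone").
                if (boundTerms(remaining.get(i), boundVars)
                        > boundTerms(remaining.get(best), boundVars)) {
                    best = i;
                }
            }
            String[] chosen = remaining.remove(best);
            ordered.add(chosen);
            // Cascade: variables of the chosen pattern count as bound for later steps.
            for (String term : chosen) {
                if (isVariable(term)) {
                    boundVars.add(term);
                }
            }
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<String[]> bgp = List.of(
            new String[] {"?a", "mbz:alias", "\"Amy Beach\""},
            new String[] {"?b", "cmno:hasInfluenced", "?a"},
            new String[] {"?c", "mo:composer", "?b"},
            new String[] {"?c", "bio:date", "?d"});
        for (String[] triple : reorder(bgp)) {
            System.out.println(String.join(" ", triple));
        }
    }
}
```

With the query from this thread as input, the sketch leaves the patterns in the order they were written, which is consistent with the observation that the written order is already the default order.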
> >>>>>>
> >>>>>> Given the boundedness of the pattern, and (a guess) that mbz:alias
> >>>>>> "Amy Beach" is quite selective, that order is reasonable. With stats,
> >>>>>> ? <property> ? would have to be less numerous than
> >>>>>> ? mbz:alias "Amy Beach" to be chosen first.
> >>>>>>
> >>>>>> There's no algebra optimization for your example, only BGP reordering.
> >>>>>>
> >>>>>> qparse --print=opt shows stage 1 optimizations.
> >>>>>>
> >>>>>> Executing with "explain" shows BGP execution.
> >>>>>>
> >>>>>>       Andy
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>> On 03/03/2020 11:56, Kashif Rabbani wrote:
> >>>>>>> Hi awesome community,
> >>>>>>> I have a question: I am working on optimizing SPARQL query plans and I
> >>>>>>> wonder whether the order of triple patterns in the where clause affects
> >>>>>>> the query plan or not.
> >>>>>>> For example, given the following query:
> >>>>>>> PREFIX  bio:  <http://purl.org/vocab/bio/0.1/>
> >>>>>>> PREFIX  mo:   <http://purl.org/ontology/mo/>
> >>>>>>> PREFIX  mbz:  <http://dbtune.org/musicbrainz/resource/vocab/>
> >>>>>>> PREFIX  cmno: <http://purl.org/ontology/classicalmusicnav#>
> >>>>>>> SELECT  ?a ?b ?c
> >>>>>>> WHERE
> >>>>>>>      { ?a  mbz:alias           "Amy Beach" .
> >>>>>>>        ?b  cmno:hasInfluenced  ?a .
> >>>>>>>        ?c  mo:composer         ?b ;
> >>>>>>>            bio:date            ?d
> >>>>>>>      }
> >>>>>>> // Let’s generate its algebra
> >>>>>>> Op op = Algebra.compile(query); results in this:
> >>>>>>> (project (?a ?b ?c)
> >>>>>>>      (bgp
> >>>>>>>        (triple ?a <http://dbtune.org/musicbrainz/resource/vocab/alias> "Amy Beach")
> >>>>>>>        (triple ?b <http://purl.org/ontology/classicalmusicnav#hasInfluenced> ?a)
> >>>>>>>        (triple ?c <http://purl.org/ontology/mo/composer> ?b)
> >>>>>>>        (triple ?c <http://purl.org/vocab/bio/0.1/date> ?d)
> >>>>>>>      ))
> >>>>>>> The bgp in the algebra follows exactly the same order as specified in
> >>>>>>> the where clause of the query. More precisely, does Jena construct the
> >>>>>>> query plan as it is, or will it change the order at some other level?
> >>>>>>> I would be happy if someone could guide me on how Jena's plan is
> >>>>>>> actually constructed. If I use some statistics of the actual RDF graph
> >>>>>>> to change the order of triple patterns in the BGP based on selectivity,
> >>>>>>> would it optimize the plan somehow?
> >>>>>>> Many Thanks,
> >>>>>>> Best Regards,
> >>>>>>> Kashif Rabbani.
> >>>>>
> >>>>
> >>>
> >>>
> >>
> >
> >
>


-- 


---
Marco Neumann
KONA
