Sorry, my bad, that was a typo: it should be "reductions". Reductions are a very basic concept in functional languages like Haskell, and heap size there is measured in cells.
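
To make the idea concrete, here is a toy sketch in Java. It is not Jena code and not how Hugs is implemented; the class names (Expr, Num, Add, Mul, ReductionDemo) are made up for illustration. Each step simplifies exactly one reducible expression, and a counter records how many steps were needed. Hugs prints this kind of count, together with the heap cells used, after each evaluation when run with ":set +s".

```java
import java.util.Optional;

// Toy expression language: integer literals, addition, multiplication.
// Illustration only; all names here are hypothetical.
abstract class Expr {}

final class Num extends Expr {
    final int value;
    Num(int value) { this.value = value; }
}

final class Add extends Expr {
    final Expr left, right;
    Add(Expr left, Expr right) { this.left = left; this.right = right; }
}

final class Mul extends Expr {
    final Expr left, right;
    Mul(Expr left, Expr right) { this.left = left; this.right = right; }
}

final class ReductionDemo {
    // Performs ONE reduction step on the leftmost-innermost redex.
    // Returns empty if the expression is already a literal (normal form).
    static Optional<Expr> step(Expr e) {
        if (e instanceof Num) {
            return Optional.empty();
        }
        boolean isAdd = e instanceof Add;
        Expr left = isAdd ? ((Add) e).left : ((Mul) e).left;
        Expr right = isAdd ? ((Add) e).right : ((Mul) e).right;

        // Reduce inside the left operand first, then inside the right one.
        Optional<Expr> l = step(left);
        if (l.isPresent()) {
            return Optional.of(isAdd ? new Add(l.get(), right) : new Mul(l.get(), right));
        }
        Optional<Expr> r = step(right);
        if (r.isPresent()) {
            return Optional.of(isAdd ? new Add(left, r.get()) : new Mul(left, r.get()));
        }
        // Both operands are literals, so this node itself is the redex.
        int a = ((Num) left).value;
        int b = ((Num) right).value;
        return Optional.of(new Num(isAdd ? a + b : a * b));
    }

    // Reduces to normal form; returns {value, number of reductions}.
    static int[] evaluate(Expr e) {
        int reductions = 0;
        Optional<Expr> next = step(e);
        while (next.isPresent()) {
            e = next.get();
            reductions++;
            next = step(e);
        }
        return new int[] { ((Num) e).value, reductions };
    }

    public static void main(String[] args) {
        // (1 + 2) * (3 + 4)
        Expr e = new Mul(new Add(new Num(1), new Num(2)),
                         new Add(new Num(3), new Num(4)));
        int[] result = evaluate(e);
        System.out.println("value = " + result[0] + ", reductions = " + result[1]);
    }
}
```

For (1 + 2) * (3 + 4) this takes three reductions: one per addition and one for the final multiplication. What I was asking about is a comparable per-query counter for a SPARQL QueryExecution.
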
"Reduction is the process of converting an expression to a simpler form. Conceptually, an expression is reduced by simplifying one reducible expression (called “redex”) at a time."

https://www.futurelearn.com/courses/functional-programming-haskell/0/steps/27197

On Sun, Mar 8, 2020 at 4:44 PM Andy Seaborne wrote:

> Then I don't understand what you are looking for.
>
> What's a "deduction"? What's a "cell"?
>
> On 08/03/2020 14:22, Marco Neumann wrote:
> > thank you for the hint Andy, but not quite what I was looking for.
> >
> > I was aiming more for a type of feature I am familiar with from purely
> > functional programming languages like haskell, hugs, miranda etc to
> > display deductions and cells used during execution.
> >
> > Marco
> >
> > On Sun, Mar 8, 2020 at 10:42 AM Andy Seaborne wrote:
> >
> >> On 06/03/2020 17:40, Marco Neumann wrote:
> >>> is there statistical data available for the number of deductions /
> >>> joins performed for each SPARQL query of a QueryExecution object?
> >>
> >> If you run with "explain" you can find out but there isn't a specific
> >> record kept by the code.
> >>
> >>> On Fri, Mar 6, 2020 at 3:16 PM Andy Seaborne wrote:
> >>>
> >>>> On 05/03/2020 08:32, Kashif Rabbani wrote:
> >>>>> Hi Andy,
> >>>>>
> >>>>> Thanks for your response. I was wondering if there is any detailed
> >>>>> documentation of the Jena optimization (rewriting & reordering)
> >>>>> available online? If yes, can you please send me the reference?
> >>>>
> >>>> The code mainly.
> >>>>
> >>>> The TDB stats is documented.
> >>>>
> >>>>> Also, if I create my own query plan (in algebraic form), is it
> >>>>> possible to make Jena execute it as it is? I mean how to turn off
> >>>>> Jena's optimization (rewriting & reordering) and force my query
> >>>>> plan for execution.
> >>>>
> >>>> Yes - two parts - algebra rewrites and BGP reordering.
> >>>>
> >>>> The context is a mapping of settings.
> >>>>   there is a global context (ARQ.getContext())
> >>>>   one per the DatasetGraph.getContext()
> >>>>   one per query execution. QueryExecution.getContext()
> >>>>
> >>>> and it is treated hierarchically:
> >>>>
> >>>> Lookup in QueryExecution then DatasetGraph then Global.
> >>>>
> >>>> :: Algebra rewrite
> >>>>
> >>>> Some algebra rewrites have to be done - property functions, and
> >>>> rewrite some variables due to scoping. These aren't really
> >>>> "optimization steps" but happen in that phase. There is
> >>>> OptimizerMinimal for those.
> >>>>
> >>>> To turn off optimizer and still do the minimum steps.
> >>>>
> >>>>     context.set(ARQ.optimization, false)
> >>>>
> >>>> Either Algebra.exec(op, dsg) executes the algebra as given - that's
> >>>> a very low level way of doing it.
> >>>>
> >>>> Turning the optimizer off is better because all the APIs work. eg
> >>>> QueryExecution.
> >>>>
> >>>> :: BGP reordering
> >>>>
> >>>> The reordering of triple patterns is separate.
> >>>> BGP steps are performed by a StageGenerator.
> >>>>
> >>>> To set up to use a custom StageGenerator:
> >>>>
> >>>>     StageBuilder.setGenerator(ARQ.getContext(), stageGenerator) ;
> >>>>
> >>>> That's really only a call of
> >>>>     context.set(ARQ.stageGenerator, myStageGenerator) ;
> >>>>
> >>>> The default is StageGeneratorGeneric that does ReorderFixed.
> >>>> It is used if there is no other setting in the context.
> >>>>
> >>>> Andy
> >>>>
> >>>>> Thanks again for your help.
> >>>>>
> >>>>> Regards,
> >>>>>
> >>>>> Kashif Rabbani,
> >>>>> Research Assistant,
> >>>>> Department of Computer Science,
> >>>>> Aalborg University, Denmark.
> >>>>>
> >>>>>> On 3 Mar 2020, at 13.43, Andy Seaborne wrote:
> >>>>>>
> >>>>>> Hi Kashif,
> >>>>>>
> >>>>>> Optimization happens in two stages:
> >>>>>>
> >>>>>> 1. Rewrite of the algebra
> >>>>>> 2. Reordering of the BGPs
> >>>>>>
> >>>>>> BGPs can be implemented different ways - and they are an
> >>>>>> inference extension point in SPARQL.
> >>>>>>
> >>>>>> What you see is the first. BGPs are reordered during execution.
> >>>>>>
> >>>>>> The algorithm can be stats driven for TDB and TDB2 storage:
> >>>>>> https://jena.apache.org/documentation/tdb/optimizer.html
> >>>>>>
> >>>>>> The interface is
> >>>>>> org.apache.jena.sparql.engine.optimizer.reorder.ReorderTransformation
> >>>>>>
> >>>>>> and a general purpose reordering is done for in-memory and is the
> >>>>>> default for TDB.
> >>>>>>
> >>>>>> The default reorder is "grounded triples first, leave equal
> >>>>>> weights alone". It cascades whether a term is bound by an earlier
> >>>>>> step.
> >>>>>>
> >>>>>>> { ?a mbz:alias "Amy Beach" .
> >>>>>>>   ?b cmno:hasInfluenced ?a .
> >>>>>>>   ?c mo:composer ?b ;
> >>>>>>>      bio:date ?d
> >>>>>>> }
> >>>>>>
> >>>>>> That's actually the default order -
> >>>>>>
> >>>>>> ?a mbz:alias "Amy Beach" .
> >>>>>>
> >>>>>> has two bound terms so is done first.
> >>>>>>
> >>>>>> and now ?a is bound so
> >>>>>> ?b cmno:hasInfluenced ?a .
> >>>>>>
> >>>>>> etc.
> >>>>>>
> >>>>>> Given the boundedness of the pattern, and (guess) mbz:alias "Amy
> >>>>>> Beach" is quite selective, With stats ? <property> ? would have to
> >>>>>> be less numerous than ? mbz:alias "Amy Beach".
> >>>>>>
> >>>>>> There's no algebra optimization for your example, only BGP
> >>>>>> reordering.
> >>>>>>
> >>>>>> qparse --print=opt shows stage 1 optimizations.
> >>>>>>
> >>>>>> Executing with "explain" shows BGP execution.
> >>>>>>
> >>>>>> Andy
> >>>>>>
> >>>>>> On 03/03/2020 11:56, Kashif Rabbani wrote:
> >>>>>>> Hi awesome community,
> >>>>>>> I have a question, I am working on optimizing SPARQL query plan
> >>>>>>> and I wonder does the order of triple patterns in the where
> >>>>>>> clause affect the query plan or not?
> >>>>>>> For example, given the following query:
> >>>>>>> PREFIX bio: <http://purl.org/vocab/bio/0.1/>
> >>>>>>> PREFIX mo: <http://purl.org/ontology/mo/>
> >>>>>>> PREFIX mbz: <http://dbtune.org/musicbrainz/resource/vocab/>
> >>>>>>> PREFIX cmno: <http://purl.org/ontology/classicalmusicnav#>
> >>>>>>> SELECT ?a ?b ?c
> >>>>>>> WHERE
> >>>>>>> { ?a mbz:alias "Amy Beach" .
> >>>>>>>   ?b cmno:hasInfluenced ?a .
> >>>>>>>   ?c mo:composer ?b ;
> >>>>>>>      bio:date ?d
> >>>>>>> }
> >>>>>>> // Let's generate its algebra
> >>>>>>> Op op = Algebra.compile(query); results into this:
> >>>>>>> (project (?a ?b ?c)
> >>>>>>>   (bgp
> >>>>>>>     (triple ?a <http://dbtune.org/musicbrainz/resource/vocab/alias> "Amy Beach")
> >>>>>>>     (triple ?b <http://purl.org/ontology/classicalmusicnav#hasInfluenced> ?a)
> >>>>>>>     (triple ?c <http://purl.org/ontology/mo/composer> ?b)
> >>>>>>>     (triple ?c <http://purl.org/vocab/bio/0.1/date> ?d)
> >>>>>>>   ))
> >>>>>>> The bgp in algebra follows the exact same order as specified in
> >>>>>>> the where clause of the query. Very precisely, does Jena
> >>>>>>> construct the query plan as it is? or will it change the order
> >>>>>>> at some other level?
> >>>>>>> I would be happy if someone can guide me about how Jena's plan
> >>>>>>> is actually constructed. If I use some statistics of the actual
> >>>>>>> RDF graph to change the order of triple patterns in the BGP
> >>>>>>> based on selectivity, would it optimize the plan somehow?
> >>>>>>> Many Thanks,
> >>>>>>> Best Regards,
> >>>>>>> Kashif Rabbani.

--
---
Marco Neumann
KONA
