Re: Join Pruning/Rewrites
Can you give an example using real-ish tables eg customers, products, orders. And show the 2 sql statements that you allege to be equivalent. I have a feeling that it’s important which join keys you use. Julian Ps also please join the list so that you get replies. > On Mar 27, 2020, at 9:57 AM, Chris Channing > wrote: > > Hey Folks, > > Is it currently possible to have Calcite optimize-away unnecessary joins? > The case I'm thinking of is where (hopefully the formatting holds up!): > > ...you have a bushy tree as follows: > > - > | | > | | > || |-| > T1 T2 T2 T3 > > ...which could be rewritten as a left-deep tree to avoid scanning T2 again: > >|---| >| | > |--|| > T1 T2 T3 > > If so, does any have an example of how to do this please? > > Many thanks, > Chris
Re: Join Pruning/Rewrites
A note I forgot to mention earlier is that I’m starting with a RelBuilder... On Fri, 27 Mar 2020 at 15:59, Chris Channing wrote: > Hey Folks, > > Is it currently possible to have Calcite optimize-away unnecessary joins? > The case I'm thinking of is where (hopefully the formatting holds up!): > > ...you have a bushy tree as follows: > > - > | | > | | > || |-| > T1 T2 T2 T3 > > ...which could be rewritten as a left-deep tree to avoid scanning T2 again: > > |---| > | | > |--|| > T1 T2 T3 > > If so, does any have an example of how to do this please? > > Many thanks, > Chris > >
Re: Join Pruning/Rewrites
Hi Chris, I think your question is equivalent to the following: Does Calcite perform query minimization [1, 2] ? I don't remember seeing any part in Calcite that does this at the moment but I may be wrong. In general the problem is NP-Hard, so no hope for treating it entirely for arbitrary queries. Technically it could be done for small queries, and for larger queries we can adopt some heuristics (e.g., not look for minimal queries but try to remove one-two-k tables and see if the queries are equivalent). Certainly a part where contributions are very welcomed! Best, Stamatis [1] http://webdam.inria.fr/Alice/pdfs/all.pdf [2] http://pages.cs.wisc.edu/~paris/cs838-s16/lecture-notes/lecture2.pdf On Fri, Mar 27, 2020 at 6:17 PM Chris Channing < christopher.chann...@gmail.com> wrote: > A note I forgot to mention earlier is that I’m starting with a > RelBuilder... > > On Fri, 27 Mar 2020 at 15:59, Chris Channing < > christopher.chann...@gmail.com> > wrote: > > > Hey Folks, > > > > Is it currently possible to have Calcite optimize-away unnecessary joins? > > The case I'm thinking of is where (hopefully the formatting holds up!): > > > > ...you have a bushy tree as follows: > > > > - > > | | > > | | > > || |-| > > T1 T2 T2 T3 > > > > ...which could be rewritten as a left-deep tree to avoid scanning T2 > again: > > > > |---| > > | | > > |--|| > > T1 T2 T3 > > > > If so, does any have an example of how to do this please? > > > > Many thanks, > > Chris > > > > >
Re: Re: Join Pruning/Rewrites
In many cases, they are not equivalent, unless the top level join is a semi-join, e.g. (T1⋈T2)⋉(T2⟖T3) on T1.a=T3.a can be optimized to (T1⋈T2)⋉T3 on T1.a=T3.a It is better to give a concrete example, so that we can discuss case by case. - Haisheng -- 发件人:Stamatis Zampetakis 日 期:2020年03月28日 01:47:30 收件人: 主 题:Re: Join Pruning/Rewrites Hi Chris, I think your question is equivalent to the following: Does Calcite perform query minimization [1, 2] ? I don't remember seeing any part in Calcite that does this at the moment but I may be wrong. In general the problem is NP-Hard, so no hope for treating it entirely for arbitrary queries. Technically it could be done for small queries, and for larger queries we can adopt some heuristics (e.g., not look for minimal queries but try to remove one-two-k tables and see if the queries are equivalent). Certainly a part where contributions are very welcomed! Best, Stamatis [1] http://webdam.inria.fr/Alice/pdfs/all.pdf [2] http://pages.cs.wisc.edu/~paris/cs838-s16/lecture-notes/lecture2.pdf On Fri, Mar 27, 2020 at 6:17 PM Chris Channing < christopher.chann...@gmail.com> wrote: > A note I forgot to mention earlier is that I’m starting with a > RelBuilder... > > On Fri, 27 Mar 2020 at 15:59, Chris Channing < > christopher.chann...@gmail.com> > wrote: > > > Hey Folks, > > > > Is it currently possible to have Calcite optimize-away unnecessary joins? > > The case I'm thinking of is where (hopefully the formatting holds up!): > > > > ...you have a bushy tree as follows: > > > > - > > | | > > | | > > || |-| > > T1 T2 T2 T3 > > > > ...which could be rewritten as a left-deep tree to avoid scanning T2 > again: > > > > |---| > > | | > > |--|| > > T1 T2 T3 > > > > If so, does any have an example of how to do this please? > > > > Many thanks, > > Chris > > > > >