Hi Julian, I have logged a jira CALCITE-4990 <https://issues.apache.org/jira/browse/CALCITE-4990> for this new rule, later I will submit a pr for this.
Julian Hyde <[email protected]> 于2022年1月21日周五 01:37写道: > Process logic sounds great. Please log a Jira case. > > There’s a lot here, so it’s probably wise to split into a simple PR that > just does the simple case (UNION ALL, no Project) and extend later. > > > On Jan 20, 2022, at 7:07 AM, Yanjing Wang <[email protected]> > wrote: > > > > Thanks Julian and Justin. > > > > What do you think the rule should be called? UnionFilterTransposeRule, > >> perhaps? > >> > > SetOpFilterMergeRule? > > > > Maybe that problem does not occur when applied to UNION than when applied > >> to UNION ALL. > > > > Yes, This is very important. > > > > There would seem to be analogous rules for INTERSECT (combine the > >> conditions using AND) and EXCEPT (combine the conditions using AND NOT). > >> > > Excellent extensions, all the three operators process logic are: > > > > For UNION: > > New Filter = left Filter *OR* right Filter. > > > > For INTERSECT: > > New Filter = left Filter *AND* right Filter. > > > > For EXCEPT: > > If left Filter, right Filter have no overlap, transform to left child > tree, > > Otherwise > > New Filter = left Filter *AND NOT *right Filter. > > > > For UNION ALL: > > Add prerequisites: > > left Filter, right Filter must have no overlap. > > > > For INTERSECT ALL: > > Add prerequisites: > > If left Filter, right Filter have no overlap, transform to empty values. > > Otherwise > > the rule can't be applied. > > > > For EXCEPT ALL: > > same for EXCEPT. > > > > > > work for N-way Union, not just 2-way Union. > >> > > Yes, I will add tests for this. > > > > And I think you should make it work whether or not a Project is present. > > > > Ok, It seems I need construct several different operand match trees for > > plan. > > > > Hi Julian, what do you think of the above process logic? > > > > Julian Hyde <[email protected] <mailto:[email protected]>> > 于2022年1月20日周四 10:18写道: > > > >> Justin, > >> > >> For planning table or index scans, I would recommend using a single > >> TableScan with a Filter that uses a Sarg, rather than using multiple > >> TableScans connected by a Union. So I think this rule will be useful. > >> > >> But I do agree that this proposed rule is not a “no brainer”. It may not > >> do what people want/expect in all cases, and therefore it probably > should > >> not be enabled it by default. > >> > >> Julian > >> > >> > >> > >> > >> > >>> On Jan 19, 2022, at 3:38 PM, Justin Swanhart <[email protected]> > >> wrote: > >>> > >>> Hi, > >>> > >>> Note that this will negate the optimization that one usually is looking > >> for > >>> when writing such queries: > >>> > >>> Select * from TAB where a = 1 > >>> UNION ALL > >>> Select * from TAB where b = 1 > >>> > >>> In a database with indexes (most databases) this will allow indexes to > be > >>> used on both the a column and the b column. > >>> Databases with bitmap indexes or without indexes would benefit from the > >>> rule. > >>> > >>> On Wed, Jan 19, 2022 at 4:32 PM Julian Hyde <[email protected] > >> <mailto:[email protected] <mailto:[email protected]>>> wrote: > >>> > >>>> Can you log a Jira case for this? > >>>> > >>>> I think you should make your rule work for N-way Union, not just 2-way > >>>> Union. And I think you should make it work whether or not a Project is > >>>> present. > >>>> > >>>>> On Jan 19, 2022, at 1:25 PM, Julian Hyde <[email protected] > <mailto:[email protected]>> > >> wrote: > >>>>> > >>>>> It sounds useful. > >>>>> > >>>>> What do you think the rule should be called? > UnionFilterTransposeRule, > >>>> perhaps? > >>>>> > >>>>> A challenge when writing the rule will be to ensure that all of the > >>>> inputs to the Union are the same. The Volcano framework is not very > >> good at > >>>> that. > >>>>> > >>>>> You should be careful of the case that the conditions overlap. For > >>>> example, the rewrite > >>>>> > >>>>> SELECT * FROM Emp WHERE deptno < 30 > >>>>> UNION ALL > >>>>> SELECT * FROM Emp WHERE deptno IN (25, 35, 45) > >>>>> > >>>>> to > >>>>> > >>>>> SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45) > >>>>> > >>>>> Is not valid, because rows with deptno = 25 will appear twice in the > >>>> first query, once in the second. Maybe that problem does not occur > when > >>>> applied to UNION than when applied to UNION ALL. > >>>>> > >>>>> There would seem to be analogous rules for INTERSECT (combine the > >>>> conditions using AND) and EXCEPT (combine the conditions using AND > NOT). > >>>> Perhaps one rule could cover all set operations (see > >>>> FilterSetOpTransposeRule). > >>>>> > >>>>> Julian > >>>>> > >>>>> > >>>>> > >>>>>> On Jan 19, 2022, at 2:38 AM, Yanjing Wang < > [email protected] <mailto:[email protected]> > >>>> <mailto:[email protected] <mailto:[email protected]> > <mailto:[email protected] <mailto:[email protected]>>>> > >> wrote: > >>>>>> > >>>>>> A simple example is converting SELECT a, b FROM t WHERE c = 1 UNION > >> ALL > >>>> SELECT a, b FROM t WHERE c = 2 to SELECT a, b FROM t WHERE c in (1, 2) > >>>>>> > >>>>>> Yanjing Wang <[email protected] <mailto: > [email protected]> <mailto: > >> [email protected] <mailto:[email protected]>> <mailto: > >>>> [email protected] <mailto:[email protected]>>> > >> 于2022年1月19日周三 18:35写道: > >>>>>> Hi, community > >>>>>> > >>>>>> Here I recommend a new rule for converting UNION ALL sub plan to a > >>>> single input with an OR filter, the following is its conversion > diagram. > >>>>>> > >>>>>> > >>>>>> The conversion prerequisites are > >>>>>> 1. left filter range has no intersection with right filter range. > >>>>>> 2. Project and Input Sub Tree must be identical. > >>>>>> > >>>>>> The rule will be used when Input Sub Tree is a computing-intensive > or > >>>> large IO operation. > >>>>>> > >>>>>> I don't know whether the community supports it or not, any > suggestions > >>>> will be appreciated. > >
