Process logic sounds great. Please log a Jira case. There’s a lot here, so it’s probably wise to split into a simple PR that just does the simple case (UNION ALL, no Project) and extend later.
> On Jan 20, 2022, at 7:07 AM, Yanjing Wang <zhuangzixiao...@gmail.com> wrote: > > Thanks Julian and Justin. > > What do you think the rule should be called? UnionFilterTransposeRule, >> perhaps? >> > SetOpFilterMergeRule? > > Maybe that problem does not occur when applied to UNION than when applied >> to UNION ALL. > > Yes, This is very important. > > There would seem to be analogous rules for INTERSECT (combine the >> conditions using AND) and EXCEPT (combine the conditions using AND NOT). >> > Excellent extensions, all the three operators process logic are: > > For UNION: > New Filter = left Filter *OR* right Filter. > > For INTERSECT: > New Filter = left Filter *AND* right Filter. > > For EXCEPT: > If left Filter, right Filter have no overlap, transform to left child tree, > Otherwise > New Filter = left Filter *AND NOT *right Filter. > > For UNION ALL: > Add prerequisites: > left Filter, right Filter must have no overlap. > > For INTERSECT ALL: > Add prerequisites: > If left Filter, right Filter have no overlap, transform to empty values. > Otherwise > the rule can't be applied. > > For EXCEPT ALL: > same for EXCEPT. > > > work for N-way Union, not just 2-way Union. >> > Yes, I will add tests for this. > > And I think you should make it work whether or not a Project is present. > > Ok, It seems I need construct several different operand match trees for > plan. > > Hi Julian, what do you think of the above process logic? > > Julian Hyde <jhyde.apa...@gmail.com <mailto:jhyde.apa...@gmail.com>> > 于2022年1月20日周四 10:18写道: > >> Justin, >> >> For planning table or index scans, I would recommend using a single >> TableScan with a Filter that uses a Sarg, rather than using multiple >> TableScans connected by a Union. So I think this rule will be useful. >> >> But I do agree that this proposed rule is not a “no brainer”. It may not >> do what people want/expect in all cases, and therefore it probably should >> not be enabled it by default. >> >> Julian >> >> >> >> >> >>> On Jan 19, 2022, at 3:38 PM, Justin Swanhart <greenl...@gmail.com> >> wrote: >>> >>> Hi, >>> >>> Note that this will negate the optimization that one usually is looking >> for >>> when writing such queries: >>> >>> Select * from TAB where a = 1 >>> UNION ALL >>> Select * from TAB where b = 1 >>> >>> In a database with indexes (most databases) this will allow indexes to be >>> used on both the a column and the b column. >>> Databases with bitmap indexes or without indexes would benefit from the >>> rule. >>> >>> On Wed, Jan 19, 2022 at 4:32 PM Julian Hyde <jhyde.apa...@gmail.com >> <mailto:jhyde.apa...@gmail.com <mailto:jhyde.apa...@gmail.com>>> wrote: >>> >>>> Can you log a Jira case for this? >>>> >>>> I think you should make your rule work for N-way Union, not just 2-way >>>> Union. And I think you should make it work whether or not a Project is >>>> present. >>>> >>>>> On Jan 19, 2022, at 1:25 PM, Julian Hyde <jhyde.apa...@gmail.com >>>>> <mailto:jhyde.apa...@gmail.com>> >> wrote: >>>>> >>>>> It sounds useful. >>>>> >>>>> What do you think the rule should be called? UnionFilterTransposeRule, >>>> perhaps? >>>>> >>>>> A challenge when writing the rule will be to ensure that all of the >>>> inputs to the Union are the same. The Volcano framework is not very >> good at >>>> that. >>>>> >>>>> You should be careful of the case that the conditions overlap. For >>>> example, the rewrite >>>>> >>>>> SELECT * FROM Emp WHERE deptno < 30 >>>>> UNION ALL >>>>> SELECT * FROM Emp WHERE deptno IN (25, 35, 45) >>>>> >>>>> to >>>>> >>>>> SELECT * FROM Emp WHERE deptno < 30 OR deptno IN (25, 35, 45) >>>>> >>>>> Is not valid, because rows with deptno = 25 will appear twice in the >>>> first query, once in the second. Maybe that problem does not occur when >>>> applied to UNION than when applied to UNION ALL. >>>>> >>>>> There would seem to be analogous rules for INTERSECT (combine the >>>> conditions using AND) and EXCEPT (combine the conditions using AND NOT). >>>> Perhaps one rule could cover all set operations (see >>>> FilterSetOpTransposeRule). >>>>> >>>>> Julian >>>>> >>>>> >>>>> >>>>>> On Jan 19, 2022, at 2:38 AM, Yanjing Wang <zhuangzixiao...@gmail.com >>>>>> <mailto:zhuangzixiao...@gmail.com> >>>> <mailto:zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com> >>>> <mailto:zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com>>>> >> wrote: >>>>>> >>>>>> A simple example is converting SELECT a, b FROM t WHERE c = 1 UNION >> ALL >>>> SELECT a, b FROM t WHERE c = 2 to SELECT a, b FROM t WHERE c in (1, 2) >>>>>> >>>>>> Yanjing Wang <zhuangzixiao...@gmail.com >>>>>> <mailto:zhuangzixiao...@gmail.com> <mailto: >> zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com>> <mailto: >>>> zhuangzixiao...@gmail.com <mailto:zhuangzixiao...@gmail.com>>> >> 于2022年1月19日周三 18:35写道: >>>>>> Hi, community >>>>>> >>>>>> Here I recommend a new rule for converting UNION ALL sub plan to a >>>> single input with an OR filter, the following is its conversion diagram. >>>>>> >>>>>> >>>>>> The conversion prerequisites are >>>>>> 1. left filter range has no intersection with right filter range. >>>>>> 2. Project and Input Sub Tree must be identical. >>>>>> >>>>>> The rule will be used when Input Sub Tree is a computing-intensive or >>>> large IO operation. >>>>>> >>>>>> I don't know whether the community supports it or not, any suggestions >>>> will be appreciated.