Hi Xiening, thank you for your feedback.
> 1. Do we really need RelGroup and RelSubGroup? I believe the memo structure
> would be largely the same even if we move towards a Cascade planner. I think
> we can reuse most of the RelSet/RelSubset today. RelSubset is a tricky one.
> There’s no corresponding concept in the Cascade framework. But we do need a
> way to track the RelNode physical traits anyway. A RelSubset is more like a
> logical grouping of the RelNodes with same trait set, which I think is still
> valuable.

No, we don't really need RelGroup and RelSubGroup. My initial approach was to
use only RelGroup, as in the original Cascades framework, but later I realized
that RelSubGroup is a convenient concept and introduced it. I'm going to
replace both with RelSet/RelSubset later, even though RelSet/RelSubset carry
many things that are not needed for Cascades (parents,
propagateCostImprovements, etc.).

> Keeping backward compatibility is a key goal. We have to think through this
> at the very beginning of the design. The community has invested a lot with
> Volcano planner over the years. We cannot ask people to rewrite their rules
> just to make them work with the new planner. We should expect current rules
> would just work, or would work with “minimal” changes.

I agree with you. My next step is to bring backward compatibility to the
Cascades planner.

> There are some building blocks that are currently missing which will prevent
> us from using the Cascade top down search strategy. For example, the way how
> rule is matched, the lack of transformation phase, the lack of the ability to
> create physical nodes by the framework, etc. If we don’t care about current
> rules, and start from scratch, we probably don’t need to fix these issues.
> But with #2 in mind, we have to work out solutions for these so we can carry
> over existing rules.

I think we can reuse existing rules and at the same time achieve a correct
top-down Cascades search strategy.
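For readers unfamiliar with the idea, here is a toy sketch of how on-demand,
top-down rule matching can be organized as a task stack. All names here are
hypothetical and this is not the actual prototype code; it only illustrates
the principle that a group's rules are matched when the search reaches the
group from its parent, rather than eagerly on registration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy sketch (hypothetical names): the planner pops "optimize group"
// tasks from a stack, so a group's rules are matched only on demand,
// when the top-down search actually reaches that group.
class TopDownSketch {
    interface Task { void perform(Deque<Task> stack); }

    static class Group {
        final String name;
        final List<Group> inputs;
        boolean explored = false;
        Group(String name, List<Group> inputs) { this.name = name; this.inputs = inputs; }
    }

    static class OptimizeGroup implements Task {
        final Group group;
        final List<String> log;
        OptimizeGroup(Group group, List<String> log) { this.group = group; this.log = log; }
        @Override public void perform(Deque<Task> stack) {
            if (group.explored) return;   // match each group's rules at most once
            group.explored = true;
            log.add("match rules for " + group.name);
            for (Group input : group.inputs) {
                stack.push(new OptimizeGroup(input, log));  // demanded by the parent
            }
        }
    }

    /** Drives the task stack; returns the order in which groups were matched. */
    static List<String> optimize(Group root) {
        List<String> log = new ArrayList<>();
        Deque<Task> stack = new ArrayDeque<>();
        stack.push(new OptimizeGroup(root, log));
        while (!stack.isEmpty()) {
            stack.pop().perform(stack);
        }
        return log;
    }
}
```

For a join over two scans, the join group is matched first and then its
inputs; a group never reached from the root is never matched at all.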
For example, I've changed the rule-matching strategy in the prototype: rules
are matched on demand in a top-down manner (see the
org.apache.calcite.plan.cascades.ApplyRule.Bindery class). And I believe all
the other missing pieces (like creating physical nodes) can be fixed in a
similar way: by changing the planner code while maintaining backward
compatibility.

--
Kind Regards
Roman Kondakov

On 28.04.2020 03:12, Xiening Dai wrote:
> Hi Roman,
>
> First, thank you for sharing your design and prototype code. I took a quick
> look at your design and have some high level feedback -
>
> 1. Do we really need RelGroup and RelSubGroup? I believe the memo structure
> would be largely the same even if we move towards a Cascade planner. I think
> we can reuse most of the RelSet/RelSubset today. RelSubset is a tricky one.
> There’s no corresponding concept in the Cascade framework. But we do need a
> way to track the RelNode physical traits anyway. A RelSubset is more like a
> logical grouping of the RelNodes with same trait set, which I think is still
> valuable.
>
> 2. Keeping backward compatibility is a key goal. We have to think through
> this at the very beginning of the design. The community has invested a lot
> with Volcano planner over the years. We cannot ask people to rewrite their
> rules just to make them work with the new planner. We should expect current
> rules would just work, or would work with “minimal” changes.
>
> 3. There are some building blocks that are currently missing which will
> prevent us from using the Cascade top down search strategy. For example, the
> way how rule is matched, the lack of transformation phase, the lack of the
> ability to create physical nodes by the framework, etc. If we don’t care
> about current rules, and start from scratch, we probably don’t need to fix
> these issues. But with #2 in mind, we have to work out solutions for these so
> we can carry over existing rules.
>
> I think your execution plan looks good overall.
> We can iterate on the design while working out a document.
>
> For the purpose of transparency, I’ve been working with Haisheng in the last
> few months regarding these topics. I agree with him on most parts of his
> roadmap, which I think is a tangible plan to evolve the current Volcano
> planner.
>
>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov <kondako...@mail.ru.INVALID>
>> wrote:
>>
>> Hi all,
>>
>> Stamatis, Haisheng, thank you very much for your feedback! I really
>> appreciate it.
>>
>>> If in the new planner we end up copy-pasting code then I guess it will be a
>>> bad idea.
>>
>> Yes, there is some code duplication between the Volcano planner and the
>> Cascades planner. I think I'll move it to some common superclass: either
>> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
>>
>>> Like that it will be easier for everybody to test and see if the changes
>>> make this better or worse. From a backward compatibility perspective it
>>> seems feasible to keep the new features configurable for a certain amount
>>> of time.
>>
>> I was thinking about backward compatibility in this way: what if we could
>> switch the planner in tests by setting some flag (a system property?)
>> somewhere and check what happens with the new planner if we replace Volcano
>> with it. And if it ever turns out that the new planner passes all of
>> Volcano's tests and at the same time works more efficiently, we can
>> safely replace the Volcano planner with the Cascades planner.
>>
>>> A design doc would definitely help, especially if it has a few end-to-end
>>> (from logical to physical plan) examples showing how the optimizer works
>>> at each step before/after the changes. This is actually what is usually
>>> missing in research papers that makes them hard to understand.
>>> I am thinking of something similar to the examples that Haisheng sent in
>>> the first email but possibly a bit more detailed.
>>
>> I agree, I'll add some examples to the design doc very soon.
>>
>>> I looked very briefly in the PR by Roman but I think I didn't see tests
>>> where the final plan contains operators from multiple conventions. Multiple
>>> conventions is among the choices that complicate certain parts of the
>>> existing planner so we should make sure that we take this into account.
>>
>> It's on my radar. I'm going to add these tests.
>>
>> I would like to ask the community about my next steps in the transition of
>> the Cascades planner from prototype status to becoming a part of the
>> project. I see these steps like this:
>>
>> 1. Create a Jira ticket.
>> 2. Update the design document with examples.
>> 3. Do some research on achieving backward compatibility with the Volcano
>> planner, so that we can replace the Volcano planner with the Cascades
>> planner in tests and understand the current problems with the planner.
>> 4. Solve the known problems:
>> - materialized views
>> - hints
>> - multiple conventions
>> - listener hooks
>> - problems from p.3.
>> 5. New PR, review and merge.
>> 6. Replace the Volcano planner with Cascades after several releases.
>>
>> What do you think about this roadmap?
>>
>>
>> --
>> Kind Regards
>> Roman Kondakov
>>
>>
>> On 27.04.2020 01:55, Stamatis Zampetakis wrote:
>>> Hi all,
>>>
>>> I am very excited about the ideas discussed so far and especially by the
>>> enthusiasm of many people that are ready to help for pulling this out.
>>> I wouldn't expect that we could have a prototype so quickly.
>>> Thanks a lot everyone!
>>>
>>> In the debate between creating a new planner or patching the existing one,
>>> I don't have a clear preference.
>>> I think the answer depends on how many things we can reuse.
>>> If in the new planner we end up copy-pasting code then I guess it will be a
>>> bad idea.
>>> On the other hand, if the new and old planner do not have many things in
>>> common then I guess the answer is obvious.
>>>
>>> From Haisheng's description, I was thinking that many of the proposed
>>> changes could go in the existing planner.
>>> Like that it will be easier for everybody to test and see if the changes
>>> make this better or worse.
>>> From a backward compatibility perspective it seems feasible to keep the new
>>> features configurable for a certain amount of time.
>>>
>>> From the wish-list, I think we should focus initially on points:
>>> 1. Top-down trait request
>>> 2. Convert traits without Abstract converters
>>> 4. Bottom-up trait derivation
>>>
>>> I know that 3, and 5, are also important but I have the feeling they can
>>> wait a bit longer.
>>>
>>> A design doc would definitely help, especially if it has a few end-to-end
>>> (from logical to physical plan) examples showing how the optimizer works at
>>> each step before/after the changes.
>>> This is actually what is usually missing in research papers that makes them
>>> hard to understand.
>>> I am thinking of something similar to the examples that Haisheng sent in
>>> the first email but possibly a bit more detailed.
>>>
>>> I looked very briefly in the PR by Roman but I think I didn't see tests
>>> where the final plan contains operators from multiple conventions.
>>> Multiple conventions is among the choices that complicate certain parts of
>>> the existing planner so we should make sure that we take this into account.
>>>
>>> Hoping to find some time to think over all this more quietly. Very
>>> interesting stuff :)
>>>
>>> Best,
>>> Stamatis
>>>
>>> On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan <hy...@apache.org> wrote:
>>>
>>>> Hi Roman,
>>>>
>>>> Excellent! This is definitely a helpful contribution to the Calcite
>>>> community.
>>>> Thank you for your endeavors.
>>>>
>>>> Haisheng
>>>>
>>>> On 2020/04/26 19:25:00, Roman Kondakov <kondako...@mail.ru.INVALID>
>>>> wrote:
>>>>> Hi everyone!
>>>>>
>>>>> Haisheng, thank you for bringing this subject up.
A new Cascades-style >>>>> optimizer should be definitely the next step for Apache Calcite. Many >>>>> projects suffer from the lack of this kind of optimizer. >>>>> >>>>> That was the reason why several weeks ago I started working on the >>>>> prototype of Cascades optimizer for Apache Calcite. I was not sure that >>>>> I would build something useful without much experience in this area. But >>>>> now I'd like to share my work with the community. You can find the >>>>> Cascades prototype in PR [1]. This prototype is based on the well-known >>>>> paper [2]. >>>>> >>>>> What prototype can do: >>>>> - Top-down trait request >>>>> - Convert traits without Abstract converters >>>>> - Top-down rule apply >>>>> - Bottom-up trait derivation >>>>> >>>>> What is not supported yet: >>>>> - Search space pruning (but I'm going to fix it in the next commits) >>>>> - Materialized views >>>>> - Planner hooks >>>>> - Hints >>>>> - Backward compatibility with Volcano planner (some research needed) >>>>> >>>>> I prepared a design doc for this planner [3], you can find many details >>>>> there. I also opened it for comments. >>>>> >>>>> I've written several basic test cases in >>>>> org.apache.calcite.plan.cascades.CascadesPlannerTest including that was >>>>> discussed in this thread previously in the context of trait requests: >>>>> >>>>>> MergeJoin hash[a] >>>>>> |---- TableScan R hash[a] (RelSubset) >>>>>> +---- TableScan S hash[a] (RelSubset) >>>>> >>>>> >>>>> Haisheng, this is very similar to what you propose. Answering your >>>> question: >>>>> >>>>>> There are 2 ways: >>>>>> a) Modify on current VolcanoPlanner. 
>>>>>> Pros: code reuse, existing numerous test cases and infrastructure, >>>> fast integration >>>>>> Cons: changing code always brings risk >>>>>> >>>>>> b) Add a new XXXXPlanner >>>>>> Pros: no risk, no tech debt, no need to worry about backward >>>> compatability >>>>>> Cons: separate test cases for new planner, one more planner to >>>> maintain >>>>> >>>>> I've chosen the second approach. Because I don't have clear >>>>> understanding how to fix Volcano planner gradually. >>>>> >>>>> This new planner is very far from perfect, but I think it can be a good >>>>> starting point for community. >>>>> >>>>> Please, share your thoughts about this planner. >>>>> >>>>> >>>>> [1] PR: https://github.com/apache/calcite/pull/1948 >>>>> [2] Paper: >>>>> >>>> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf >>>>> [3] Design doc: >>>>> >>>> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing >>>>> >>>>> >>>>> >>>>> -- >>>>> Kind Regards >>>>> Roman Kondakov >>>>> >>>>> >>>>> On 22.04.2020 09:52, Danny Chan wrote: >>>>>>> Is there any recommended approach to make that happen smoothly besides >>>>>> coding and testing work? We need to be aware that the new planner >>>> might be >>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never replace >>>>>> VolcanoPlanner. >>>>>> >>>>>> If that is true, i might say the new planner is probably with a not >>>> that >>>>>> good design, we expect to see in advance for what cases/reasons user >>>> has >>>>>> the reason to keep the old VolcanoPlanner and we *must* give a >>>> solution for >>>>>> those problems in the new design. 
>>>>>>
>>>>>> I was expecting that migrating to a new planner would take at least 1
>>>>>> year of development. If that is true, modifying directly based on the
>>>>>> current planner means that for the next 3~4 versions of Calcite, each
>>>>>> release would bring in huge plan changes/bugs, which I believe all the
>>>>>> users of Calcite don't want to see. And no one can guarantee that
>>>>>> modifying directly can keep good stability and compatibility; only the
>>>>>> test set can.
>>>>>>
>>>>>> From the experience of the Alibaba Blink planner, which has been
>>>>>> contributed to Apache Flink: yes, the old/new planner would co-exist at
>>>>>> least for 2 years, for the reason that the new and old planner have
>>>>>> different abilities in some corner cases.
>>>>>>
>>>>>> From my point of view, we should at least:
>>>>>> - Give a convincing test set for the new planner that makes us believe
>>>>>> the new planner is stable and powerful enough. I mean obviously the
>>>>>> current rule tests are far from enough to support the new planner.
>>>>>> - Give a more detailed design doc about the new planner, especially
>>>>>> about the interface changes and any change that would bring in
>>>>>> compatibility problems. Then we can make a more accurate decision about
>>>>>> how much work the new planner would bring in; until then, we can decide
>>>>>> whether switching to pure new-planner development is a good idea, or
>>>>>> whether to modify the existing one.
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 22, 2020 at 9:45 AM, Haisheng Yuan <hy...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Andrii,
>>>>>>>
>>>>>>>> Obviously, from what is written here, I could guess that this would
>>>>>>>> require me to change my physical planning rules, even if only by
>>>>>>>> implementing a marker interface.
>>>>>>> You don't need to change your physical rules, it will be treated as >>>> equal >>>>>>> as logical rules and be applied together with the real logical rules, >>>> no >>>>>>> more logical/physical rules difference. This is also how current >>>>>>> VolcanoPlanner works. >>>>>>> >>>>>>>> I don't want you to think that I somehow resent the changes you are >>>>>>> pushing. >>>>>>> Don't get me wrong. I am seriously thinking of revert these changes, >>>> since >>>>>>> most people like the idea of adding new planner, why don't we make >>>> all the >>>>>>> plan changes in the new planner, instead of forcing people changing >>>> test >>>>>>> cases for the code changes that they might not need in VolcanoPlanner >>>>>>> during upgrade. >>>>>>> >>>>>>> I didn't intend to replace VolcanoPlanner, thought just change the >>>> search >>>>>>> strategy and add trait derivation mechanism, because most of the code >>>> in >>>>>>> VolcanoPlanner can be reused. But since many agree to add new planner >>>> and >>>>>>> replace VolcanoPlanner as the final goal, I won't be against most >>>> people's >>>>>>> decision. >>>>>>> >>>>>>> Is there any recommended approach to make that happen smoothly besides >>>>>>> coding and testing work? We need to be aware that the new planner >>>> might be >>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never >>>> replace >>>>>>> VolcanoPlanner. >>>>>>> >>>>>>> More thoughts are welcome. >>>>>>> >>>>>>> Haisheng >>>>>>> >>>>>>> On 2020/04/21 19:56:25, Андрей Цвелодуб <a.tsvelo...@gmail.com> >>>> wrote: >>>>>>>> Hello Haisheng, >>>>>>>> >>>>>>>>> To keep backward compatibility, all the un-marked rules will be >>>> treated >>>>>>>> as logical rules, except rules that uses AbstractConverter as rule >>>>>>> operand, >>>>>>>> these rules still need to applied top-down, or random order. 
>>>>>>>> Obviously, from what is written here, I could guess that this would >>>>>>> require >>>>>>>> me to change my physical planning rules, even if only by >>>> implementing a >>>>>>>> marker interface. I am not saying this is a bad thing, but this is a >>>>>>> thing >>>>>>>> that should be communicated and planned ahead in case the >>>> VolcanoPlanner >>>>>>> is >>>>>>>> modified. >>>>>>>> >>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and >>>> CALCITE-3753, >>>>>>>> because they will cause another tons of plan changes. >>>>>>>> I see you are still bitter due to all the discussions on this list >>>>>>> lately, >>>>>>>> I'm sorry. I don't want you to think that I somehow resent the >>>> changes >>>>>>> you >>>>>>>> are pushing, au contraire I support them and would be happy to help >>>> if I >>>>>>>> can. I just want the process of these changes to be executed in the >>>> best >>>>>>>> possible way. >>>>>>>> As I see there are already several opinions in this thread that >>>> basically >>>>>>>> align with what I am saying, so I guess I am not the crazy guy >>>> running >>>>>>>> around and yelling "the end is nigh!". >>>>>>>> >>>>>>>> Thank you for taking these mumbled thoughts into account. >>>>>>>> >>>>>>>> Bestest Regards, >>>>>>>> Andrii Tsvielodub >>>>>>>> >>>>>>>> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <hy...@apache.org> >>>> wrote: >>>>>>>> >>>>>>>>> Hi Andrii, >>>>>>>>> >>>>>>>>>> I guess changing the planner would lead to changes in tons of rules >>>>>>> and >>>>>>>>> even more tests. >>>>>>>>> Obviously you didn't read through my email. You are not required to >>>> do >>>>>>> any >>>>>>>>> changes to your rule if you don't want to, but if you do, just need >>>> to >>>>>>> mark >>>>>>>>> the rule to tell planner whether it is a physical rule or not, >>>> simply >>>>>>> by >>>>>>>>> implementing an empty interface. 
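The "empty marker interface" idea mentioned above could look like the
following sketch. The names (PhysicalRule and the two example rules) are made
up for illustration and are not actual Calcite API; the point is only that
un-marked rules need no change, which is how backward compatibility is kept.

```java
// Hypothetical sketch of marking a rule as physical by implementing an
// empty interface. Un-marked rules keep working unchanged: the planner
// simply treats them as logical.
class MarkerSketch {
    interface Rule { String name(); }

    /** Empty marker interface: implementing it tags a rule as physical. */
    interface PhysicalRule { }

    /** An existing logical rule: no change required. */
    static class FilterMergeRule implements Rule {
        @Override public String name() { return "FilterMerge"; }
    }

    /** A physical implementation rule, opted in by adding the marker. */
    static class EnumerableJoinRule implements Rule, PhysicalRule {
        @Override public String name() { return "EnumerableJoin"; }
    }

    /** The planner's check: anything without the marker counts as logical. */
    static boolean isPhysical(Rule rule) {
        return rule instanceof PhysicalRule;
    }
}
```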
>>>>>>>>> >>>>>>>>>> many on this list already experienced problems with upgrading even >>>>>>>>> between the minor versions of Calcite. >>>>>>>>> Sorry to see the problem you have experienced when upgrading >>>> Calcite. >>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and >>>> CALCITE-3753, >>>>>>>>> because they will cause another tons of plan changes. >>>>>>>>> >>>>>>>>> But I will see if I can add a setting to use the old search >>>> strategy, >>>>>>>>> which can be left untouched. >>>>>>>>> >>>>>>>>> Haisheng >>>>>>>>> >>>>>>>>> On 2020/04/21 06:33:08, Андрей Цвелодуб <a.tsvelo...@gmail.com> >>>> wrote: >>>>>>>>>> Hello everyone, >>>>>>>>>> >>>>>>>>>> First of all, thanks for this great effort of improving the core >>>>>>> parts of >>>>>>>>>> the framework we all are using, >>>>>>>>>> I believe this is long overdue and hope this will have benefits >>>> both >>>>>>> for >>>>>>>>>> the maintainers and users of the library. >>>>>>>>>> >>>>>>>>>> I don't have anything to say about the general idea at the moment, >>>>>>>>>> but I want to make a point that maintaining the old implementation >>>> of >>>>>>>>>> VolcanoPlanner during >>>>>>>>>> the initial stages of implementing the new planner is absolutely >>>>>>>>> CRITICAL. >>>>>>>>>> As a lot of users of Calcite do various customizations to the >>>>>>> engine, to >>>>>>>>>> the rules >>>>>>>>>> and all that is there in between, I believe changing the >>>>>>> implementation >>>>>>>>> of >>>>>>>>>> the core component >>>>>>>>>> would have a huge impact on most users of the library. I think many >>>>>>> on >>>>>>>>> this >>>>>>>>>> list >>>>>>>>>> already experienced problems with upgrading even between the minor >>>>>>>>> versions >>>>>>>>>> of Calcite, >>>>>>>>>> so I guess changing the planner would lead to changes in tons of >>>>>>> rules >>>>>>>>> and >>>>>>>>>> even more tests. 
>>>>>>>>>> >>>>>>>>>> I don't have anything against replacing VolcanoPlanner as a final >>>>>>> goal of >>>>>>>>>> this effort, >>>>>>>>>> but I don't think that modifying it directly and merging it to >>>>>>> master is >>>>>>>>> a >>>>>>>>>> viable development approach. >>>>>>>>>> While I understand how burdensome it is to maintain several >>>> parallel >>>>>>> core >>>>>>>>>> components at once >>>>>>>>>> (we did this while moving the engine of our product to Calcite), we >>>>>>>>> should >>>>>>>>>> still respect those who depend >>>>>>>>>> on it and not introduce the risks related to the development of a >>>> new >>>>>>>>>> component into existing processing flows. >>>>>>>>>> >>>>>>>>>> A good model to try to follow would be the way new Garbage >>>>>>> Collectors are >>>>>>>>>> introduced in Java. >>>>>>>>>> First, add it as an experimental option, then make it generally >>>>>>>>> available, >>>>>>>>>> then after everyone agrees >>>>>>>>>> this is the best option - make it the default one. >>>>>>>>>> With this approach, everyone can then move to the new planner at >>>>>>> their >>>>>>>>> own >>>>>>>>>> pace, guaranteeing a smooth transition overall. >>>>>>>>>> Yes, this could take some time, maybe even a year, but this is the >>>>>>> price >>>>>>>>> of >>>>>>>>>> doing major changes in a popular framework. >>>>>>>>>> >>>>>>>>>> Again, thank you for initiating this discussion and leading this >>>>>>> effort. >>>>>>>>>> >>>>>>>>>> Best Regards, >>>>>>>>>> Andrii Tsvielodub >>>>>>>>>> >>>>>>>>>> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu <wjpabc...@gmail.com> >>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Hi, Xiening. >>>>>>>>>>> >>>>>>>>>>> Regarding calculating the logical cost, here are some ways I >>>>>>> though: >>>>>>>>>>> 1. Logical rel may implement their own computeSelfCost method. 
>>>> Some >>>>>>>>>>> rels can provide such information, for example the >>>>>>>>>>> LogicalProject/LogicalFilter contains nearly the same information >>>>>>> as >>>>>>>>> their >>>>>>>>>>> physical implementations. If we don't have enough confidence, just >>>>>>>>> return >>>>>>>>>>> zeroCost is also OK, as it only affects pruning. >>>>>>>>>>> 2. Logical rel tells its parents what its physical input could be >>>>>>>>> after >>>>>>>>>>> implementation. Then the problem come back to calculating lower >>>>>>> bound >>>>>>>>> of a >>>>>>>>>>> physical rel. >>>>>>>>>>> There should always be ways. The only problem is how to find a >>>>>>> pretty >>>>>>>>> one. >>>>>>>>>>> >>>>>>>>>>> Regarding the risk, new planner do have different risk. It is not >>>>>>>>> because >>>>>>>>>>> new planner could stop us doing something wrong but we can decide >>>>>>> when >>>>>>>>> to >>>>>>>>>>> use the new one. Some scenarios: >>>>>>>>>>> 1. If modifying the VolcanoPlanner directly, the only way user >>>>>>> could >>>>>>>>>>> control the risk is not to upgrade calcite version until it is >>>>>>>>> considered >>>>>>>>>>> stable. You know, it is quite different from keeping calcite >>>>>>> updated >>>>>>>>> and >>>>>>>>>>> switching to the new planner at a proper time. >>>>>>>>>>> 2. It is very importance for SLA control. For the important >>>>>>> business >>>>>>>>> and >>>>>>>>>>> jobs, we may keep using the old and stable planner. And use the >>>>>>> new one >>>>>>>>>>> only for jobs that have fault tolerance. And this helps testing >>>> new >>>>>>>>> planner >>>>>>>>>>> with actual scenarios. >>>>>>>>>>> 3. It is helpful when upgrading online services. When the new >>>>>>> planner >>>>>>>>>>> happened to have some bugs, we can switch to the old planner >>>>>>> directly >>>>>>>>>>> without rollback the whole service. >>>>>>>>>>> 4. With all these ways to prevent issues becoming disasters, we >>>>>>> are not >>>>>>>>>>> vulnerable to making mistakes. 
This not only enables faster >>>>>>> iterations >>>>>>>>> but >>>>>>>>>>> also let us have enough time to resolve big bugs, like considering >>>>>>> it >>>>>>>>> in >>>>>>>>>>> detail and applying a time-consuming refactoring for it. To work >>>>>>>>> around a >>>>>>>>>>> critical bug using tricky ways usually introduces more issues. >>>>>>>>>>> >>>>>>>>>>> Thanks, >>>>>>>>>>> Jinpeng >>>>>>>>>>> >>>>>>>>>>> On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <xndai....@gmail.com> >>>>>>>>> wrote: >>>>>>>>>>> >>>>>>>>>>>> Hi Jinpeng, >>>>>>>>>>>> >>>>>>>>>>>> Regarding this comment - I believe there are ways to calculate >>>>>>> the >>>>>>>>>>> logical >>>>>>>>>>>> cost, but I think it’s not that simple as "cardinality * >>>>>>>>>>> unit_copy_cost.”, >>>>>>>>>>>> would you provide more details of other different ways? Just the >>>>>>>>>>> algorithm >>>>>>>>>>>> description or pseudo code would help us understand. Thanks. >>>>>>>>>>>> >>>>>>>>>>>> Regarding the approach of creating new planner, I don’t think a >>>>>>> new >>>>>>>>>>>> planner would lower the risk. We don’t know what we don’t know. >>>>>>> If we >>>>>>>>>>>> introduced an issue while modifying the planner, most likely we >>>>>>>>> would do >>>>>>>>>>>> the same with new planner class. A new planner doesn’t >>>>>>> necessarily >>>>>>>>>>> prevent >>>>>>>>>>>> the issue from happening, but just delay its surfacing, which is >>>>>>>>> worse >>>>>>>>>>> IMO. >>>>>>>>>>>> >>>>>>>>>>>> There’s one obvious benefit with new planner is that we can >>>>>>> provide >>>>>>>>> some >>>>>>>>>>>> sort of isolation so the change won’t cause test baseline >>>>>>> updates, >>>>>>>>> which >>>>>>>>>>>> could be painful at times. We should see if we could use planner >>>>>>>>> config >>>>>>>>>>> to >>>>>>>>>>>> achieve the same if we decide to just modify the current planner. 
>>>>>>>>>>>> >>>>>>>>>>>> >>>>>>>>>>>>> On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <hy...@apache.org> >>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Igor, >>>>>>>>>>>>> >>>>>>>>>>>>> That's great. >>>>>>>>>>>>> >>>>>>>>>>>>> On 2020/04/20 11:17:49, Seliverstov Igor <gvvinbl...@gmail.com >>>>>>>> >>>>>>>>> wrote: >>>>>>>>>>>>>> Haisheng, Xiening, >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok, Now I see how it should work. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Thanks for your replies. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>> Igor >>>>>>>>>>>>>> >>>>>>>>>>>>>>> 20 апр. 2020 г., в 09:56, Seliverstov Igor < >>>>>>> gvvinbl...@gmail.com >>>>>>>>>> >>>>>>>>>>>> написал(а): >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Haisheng, Xiening, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Thanks for clarifying. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In this proposal, we are not trying to split logical and >>>>>>> physical >>>>>>>>>>>> planning entirely. - actually I was in doubt about an idea of >>>>>>> entire >>>>>>>>>>>> splitting logical and physical phases, if you aren't going to, I >>>>>>>>> have no >>>>>>>>>>>> objections. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> But it returns me to my first question: how we will propagate >>>>>>>>> traits >>>>>>>>>>>> in bottom-up manner using proposed approach. (See [DISCUSS] >>>>>>> Proposal >>>>>>>>> to >>>>>>>>>>> add >>>>>>>>>>>> API to force rules matching specific rels for details) >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> One of inconveniences of current VolcanoPlanner >>>>>>> implementation is >>>>>>>>>>>> amount of tricks that we need to get desired behaviour. It would >>>>>>> be >>>>>>>>> great >>>>>>>>>>>> if some of issues (or all of them) were solved in the new >>>>>>> approach. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regards, >>>>>>>>>>>>>>> Igor >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> пн, 20 апр. 
2020 г., 7:02 Xiening Dai <xndai....@gmail.com >>>>>>>>> <mailto: >>>>>>>>>>>> xndai....@gmail.com>>: >>>>>>>>>>>>>>> Hi Igor, >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Your comment - "because actual cost may be calculated >>>>>>> correctly >>>>>>>>> using >>>>>>>>>>>> physical operators only. So won't be able to implement Branch and >>>>>>>>> Bound >>>>>>>>>>>> Space Pruning.“ is actually not true. In Cascade’s lower bound / >>>>>>>>> upper >>>>>>>>>>>> bound pruning algorithm, you can get cost lower bound of input >>>>>>>>> RelNode >>>>>>>>>>>> using cardinality * unit_copy_cost. The unit_copy_cost is a >>>>>>> constant, >>>>>>>>>>> which >>>>>>>>>>>> stands for the minimal cost for processing a tuple from the >>>>>>> input. So >>>>>>>>>>> this >>>>>>>>>>>> will be the minimal cost the input RelNode can achieve, and if >>>>>>> this >>>>>>>>> is >>>>>>>>>>>> indeed larger than the current best cost, this input node can be >>>>>>>>> pruned. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> In this proposal, we are not trying to split logical and >>>>>>> physical >>>>>>>>>>>> planning entirely. But for any given equivalent set, we would >>>>>>> need to >>>>>>>>>>>> finish exploration before implementation. But for the entire memo >>>>>>>>> tree, >>>>>>>>>>>> each set could be in different planning stage, and an equivalent >>>>>>> set >>>>>>>>> can >>>>>>>>>>> be >>>>>>>>>>>> pruned even before it’s implemented. A text book example of >>>>>>>>>>> aforementioned >>>>>>>>>>>> “lower bound / upper bound pruning” would be the join ordering >>>>>>> case. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> Regarding #3, I think we can still achieve that (partially) >>>>>>>>> through >>>>>>>>>>>> this proposal. Remember every time when we start the >>>>>>> optimization, we >>>>>>>>>>> pass >>>>>>>>>>>> down an upper bound limit. Initially this upper bound for root is >>>>>>>>>>> infinite >>>>>>>>>>>> - meaning that no feasible plan is available. 
Every time when we >>>>>>>>> find a >>>>>>>>>>>> physical plan we update this upper bound, then start the search >>>>>>>>> again. We >>>>>>>>>>>> could stop the search when the cost is less than a pre-defined >>>>>>>>> threshold >>>>>>>>>>> - >>>>>>>>>>>> which gives you a “good enough” plan with early termination. >>>>>>> Still >>>>>>>>> this >>>>>>>>>>>> wouldn't avoid the logical exploration. For that, you would >>>>>>> probably >>>>>>>>>>>> archive through rule configurations, and avoid some expansive >>>>>>>>>>>> transformation to keep the cost down. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan < >>>>>>> hy...@apache.org >>>>>>>>>>> <mailto: >>>>>>>>>>>> hy...@apache.org>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Igor, >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> a) Given current Calcite's stats derivation strategy, mixing >>>>>>>>> logical >>>>>>>>>>>> and physical planning won't make it better. I hope you went >>>>>>> through >>>>>>>>> my >>>>>>>>>>>> email to the end, currently operators inside a memo group don't >>>>>>> share >>>>>>>>>>> stats >>>>>>>>>>>> info, each operator's stats may differ with the other one, and >>>>>>> the >>>>>>>>>>>> RelSubset's best node and stats may vary during optimization. So >>>>>>>>> logical >>>>>>>>>>>> and physical rule pruning are not safe at the moment, otherwise >>>>>>> it >>>>>>>>> almost >>>>>>>>>>>> implies changing downstream project's cost model silently. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On the other hand, ensuring child group exploration task >>>>>>> finish >>>>>>>>>>> first >>>>>>>>>>>> will make rule mutual exclusivity check possible, like the >>>>>>> result of >>>>>>>>>>>> ReduceExpressionRule won't need trigger the same rule again, The >>>>>>> join >>>>>>>>>>>> result of JoinCommuteRule won't need trigger JoinCommuteRule and >>>>>>>>>>>> ReduceExpressionRule again. 
>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> More importantly, most if not all the the long planning >>>>>>> queries >>>>>>>>> in >>>>>>>>>>>> Calcite are not caused by too many alternatives, but mutual >>>>>>>>> triggering, >>>>>>>>>>> or >>>>>>>>>>>> cyclic triggering, which can be avoided by customizing the rules >>>>>>>>> instead >>>>>>>>>>> of >>>>>>>>>>>> using the default one. Unless you use dynamic programming >>>>>>> (Calcite >>>>>>>>> use >>>>>>>>>>>> heuristic) on join reordering (left-deep, right-deep, bushy), >>>>>>> space >>>>>>>>>>> pruning >>>>>>>>>>>> won't help make long / infinite running query faster. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> b) No evidence shows current version of Calcite will return >>>>>>> the >>>>>>>>> most >>>>>>>>>>>> promising plan in first planning iteration. Instead of praying >>>>>>> for >>>>>>>>>>> getting >>>>>>>>>>>> good enough plan in the first iteration, why not focus on fixing >>>>>>>>> rules >>>>>>>>>>> that >>>>>>>>>>>> causes the issue? >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> c) That is not the goal. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On 2020/04/19 15:14:57, Seliverstov Igor < >>>>>>> gvvinbl...@gmail.com >>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote: >>>>>>>>>>>>>>>>> Haisheng, >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> From my point of view splitting logical and physical >>>>>>> planning >>>>>>>>> parts >>>>>>>>>>>> isn’t a good idea. >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> I think so because actual cost may be calculated correctly >>>>>>>>> using >>>>>>>>>>>> physical operators only. 
Without that:
a) we won't be able to implement Branch and Bound Space Pruning (as far as I understand, at exploration time there are no physical operators and no correct costs, only assumptions, and I don't think we should rely on them);
b) we won't be able to get a good enough plan (without Branch and Bound Space Pruning it's a non-trivial task to find the right search direction and the most promising plans in the first planning iterations);
c) we won't be able to tune the good enough plan while several similar queries are executed.

Regards,
Igor

On Apr 19, 2020, at 17:37, Haisheng Yuan <hy...@apache.org> wrote:

Hi Igor,

You can't have your cake and eat it.
But one feasible work item we can definitely do: once we hit the timeout, stop exploring, use the first available physical operator in each group and optimize it.

Because most, if not all, of the long / infinite running optimizations are caused by project-related transposes, join reordering (join commute + associative), and constant reduction for large expressions (see CALCITE-3460), all of which are logical transformation rules and many of which have corresponding JIRAs. So another alternative is: let's fix these bugs to improve Calcite, so that the timeout option is rarely needed.
Another thing worth noting is that the SQL Server optimizer's timeout mechanism is not based on clock time, but on the number of optimization tasks it has done [1].

[1] https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188

Regards,
Haisheng

On 2020/04/19 11:31:27, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Haisheng,

Ok, then such a notification isn't needed.

But in this case we don't have any control over how long planning takes.

For some systems it's necessary to get a good enough plan right now instead of the best one after a while.

For example, we've been considering a case where a query is optimized several times in short iterations, when it's impossible to get the best plan in a reasonable period of time (for example, there is an SLA for response time).

This means we need all the required physical implementations after each logical transformation is applied.
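A task-count budget of the kind SQL Server uses could be modelled as follows. This is a toy sketch with invented names, not SQL Server or Calcite internals:

```java
/** Toy task-count budget: the "timeout" is a number of tasks, not wall-clock time. */
class TaskBudget {
    private int remaining;

    TaskBudget(int maxTasks) {
        this.remaining = maxTasks;
    }

    /** Called once per optimization task; returns false once the budget is exhausted. */
    boolean tryConsume() {
        if (remaining <= 0) {
            return false;
        }
        remaining--;
        return true;
    }

    /** Runs up to the budgeted number of tasks and reports how many actually ran. */
    static int run(int pendingTasks, int maxTasks) {
        TaskBudget budget = new TaskBudget(maxTasks);
        int done = 0;
        for (int i = 0; i < pendingTasks && budget.tryConsume(); i++) {
            done++;   // a real planner would pop and execute an optimization task here
        }
        return done;
    }
}
```

The appeal of counting tasks instead of milliseconds is determinism: the same query stops at the same point regardless of machine load.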
Regards,
Igor

On Sun, Apr 19, 2020, at 13:55, Haisheng Yuan <hy...@apache.org> wrote:

Hi Igor,

There will be no rule queue anymore. Y will be fully explored (logical rules are matched and applied) before it can be implemented and optimized.

Thanks,
Haisheng

On 2020/04/19 10:11:45, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Hi Haisheng,

Great explanation, thanks.

One thing I'd like to cover in advance is the trait propagation process (I need it for the Apache Ignite SQL engine implementation).

For example:

There are two nodes: Rel X and its child node Rel Y.

Both nodes are in the Optimized state, and a logical rule for Rel Y is matched in the rule queue.

After the logical rule is applied, there is a logical rel Rel Y', and the set containing it moves into the Exploring state (since there is a logical node which has to be implemented).

After the whole process Exploring -> ... -> Optimized, we need to force the parent Rel X set to switch its state from Optimized to Optimizing to peek at the newly implemented child and possibly produce a new physical rel Rel X' on the basis of the traits previously requested from the Rel X set.

If a new Rel X' is produced, Rel X's parent should move its state from Optimized to Optimizing and repeat the operations described above.

Does that look right?

Regards,
Igor

On Sun, Apr 19, 2020, at 6:52, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:

Hi,

In the past few months, we have discussed a lot about Cascades-style top-down optimization, including on-demand trait derivation/request, rule apply, and branch and bound space pruning. Now we think it is time to move towards these targets.

We will separate the work into several small issues; each one can be integrated as a standalone, independent feature and, most importantly, keep backward compatibility.

1. Top-down trait request
In other words, pass trait requirements from parent nodes to child nodes. The trait requests happen after all the logical transformation rules and physical implementation rules are done, in a top-down manner, driven from the root set. E.g.:

SELECT a, sum(c) FROM
  (SELECT * FROM R JOIN S USING (a, b)) t
GROUP BY a;

Suppose we have the following plan tree in the MEMO, and let's only consider distribution for simplicity; each group represents a RelSet in the MEMO.

Group 1: Agg on [a]
Group 2: +-- MergeJoin on [a, b]
Group 3:     |--- TableScan R
Group 4:     +--- TableScan S

| Group No | Operator    | Derived Traits | Required Traits |
| -------- | ----------- | -------------- | --------------- |
| Group 1  | Aggregate   | Hash[a]        | N/A             |
| Group 2  | MergeJoin   | Hash[a,b]      | Hash[a]         |
| Group 3  | TableScan R | None           | Hash[a,b]       |
| Group 4  | TableScan S | None           | Hash[a,b]       |

We will add a new interface PhysicalNode (or extend RelNode) with methods:

- Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
pair.left is the current operator's new traitset; pair.right is the list of the children's required traitsets.

- RelNode passThrough(RelTraitSet required);

The default implementation will call the above requireTraits() and RelNode.copy() to create a new RelNode. It can be overridden by physical operators to customize the logic.

The planner will call the above method on the MergeJoin operator to pass the required traits (Hash[a]) to MergeJoin's child operators.

We will get a new MergeJoin:

MergeJoin hash[a]
|---- TableScan R hash[a] (RelSubset)
+---- TableScan S hash[a] (RelSubset)

Now the MEMO group looks like:

| Group No | Operator    | Derived Traits     | Required Traits    |
| -------- | ----------- | ------------------ | ------------------ |
| Group 1  | Aggregate   | Hash[a]            | N/A                |
| Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
| Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
| Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |

Calcite users may choose not to implement the interface, keeping the original behavior. Each physical operator, according to its own logic, decides whether passThrough() should pass traits down or not by returning:
- a non-null RelNode, which means it can pass down;
- null, which means it can't pass down.

2. Provide an option to disable AbstractConverter
Once the plan can request traits in a top-down way in the framework, many systems don't need AbstractConverter anymore, since it is just an intermediate operator used to generate a physical sort / exchange. For those, we can provide an option to disable AbstractConverter and generate the physical enforcer directly, by adding a method to interface Convention:

- RelNode enforce(RelNode node, RelTraitSet traits);

The default implementation may just call changeTraitsUsingConverters(), but people may choose to override it if the system has special needs, like several traits that must be implemented together, or the position of collation in RelTraitSet coming before distribution.

3. Top-down driven, on-demand rule apply
For every RelNode in a RelSet, rules are matched and applied sequentially; newly generated RelNodes are added to the end of the RelNode list in the RelSet, waiting for rule apply. RuleQueue and DeferringRuleCall are not needed anymore. This will make space pruning and rule mutual exclusivity checks possible.

There are 3 stages for each RelSet:
a) Exploration: logical transformation, yields logical nodes
b) Implementation: physical transformation, yields physical nodes
c) Optimization: trait request, enforcement

The general process looks like:

- optimize RelSet X:
    implement RelSet X
    for each physical relnode in RelSet X:
      // pass down trait requests to child RelSets
      for each child RelSet Y of current relnode:
        optimize RelSet Y

- implement RelSet X:
    if X has been implemented:
      return
    explore RelSet X
    for each relnode in RelSet X:
      apply physical rules

- explore RelSet X:
    if X has been explored:
      return
    for each relnode in RelSet X:
      // ensure each child RelSet of current relnode is explored
      for each child RelSet Y of current relnode:
        explore RelSet Y
      apply logical rules on current relnode

Basically it is a state machine with several states (Initialized, Explored, Exploring, Implemented, Implementing, Optimized, Optimizing) and several transition methods: exploreRelSet, exploreRelNode, implementRelSet, implementRelNode, optimizeRelSet, optimizeRelNode...

To achieve this, we need to mark each rule as either a logical rule or a physical rule.
To keep backward compatibility, all unmarked rules will be treated as logical rules, except rules that use AbstractConverter as a rule operand; these rules still need to be applied top-down, or in random order.

4. On-demand, bottom-up trait derivation
It is called bottom-up, but it is actually driven top-down and happens at the same time as the top-down trait request, in the optimization stage mentioned above. Many Calcite-based big data systems only propagate traits on Project and Filter, by writing rules, which is very limited.
In fact, we can extend trait propagation/derivation to all operators, without rules, by adding to interface PhysicalNode (or extending RelNode) the method:

- RelNode derive(RelTraitSet traits, int childId);

Given the following plan (only considering distribution for simplicity):

Agg [a,b]
+-- MergeJoin [a]
    |--- TableScan R
    +--- TableScan S

Hash[a] won't satisfy Hash[a,b] without special treatment, because there isn't a mechanism to coordinate traits between children.

Now we call the derive method on the Agg [a,b] node, derive(Hash[a], 0), and we get the new node:

Agg [a]
+-- MergeJoin [a] (RelSubset)

We will provide different matching types, so each operator can specify what kind of matching it requires of its children:

- MatchType getMatchType(RelTrait trait, int childId);

a) Exact: Hash[a,b] exactly matches Hash[a,b], i.e. satisfies it
b) Partial: Hash[a] partially matches Hash[a,b]
c) Permuted: Sort[a,b,c] matches Sort[c,b,a] up to permutation

In addition, an optimization order is provided for each operator to specify:
a) left to right
b) right to left
c) both
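As a toy illustration of the Exact and Partial match types above: `HashDist`, `exactMatch`, `partialMatch`, and `derive` here are invented simplifications, not Calcite's RelTrait API.

```java
import java.util.Arrays;
import java.util.List;

/** Toy hash-distribution trait illustrating exact vs. partial matching. */
class HashDist {
    final List<String> keys;

    HashDist(String... keys) {
        this.keys = Arrays.asList(keys);
    }

    /** Exact: same keys in the same order (Hash[a,b] vs. Hash[a,b]). */
    boolean exactMatch(HashDist required) {
        return keys.equals(required.keys);
    }

    /** Partial: this distribution's keys are a subset of the required keys
     *  (Hash[a] partially matches Hash[a,b]). */
    boolean partialMatch(HashDist required) {
        return !keys.isEmpty() && required.keys.containsAll(keys);
    }

    /** Toy derive step: a parent wanting parentKeys adapts to what the child
     *  delivered. On a partial match the parent adopts the child's keys,
     *  like Agg [a,b] becoming Agg [a] when the child delivers Hash[a]. */
    static HashDist derive(HashDist parentKeys, HashDist childDelivered) {
        if (childDelivered.exactMatch(parentKeys)) return parentKeys;
        if (childDelivered.partialMatch(parentKeys)) return childDelivered;
        return null;   // no match: an enforcer would be needed instead
    }
}
```

The real mechanism works on RelTraitSet and must also handle collation (including the Permuted case), which this sketch omits.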
For example, for the query SELECT * FROM R JOIN S ON R.a = S.a AND R.b = S.b AND R.c = S.c, suppose R is distributed by [a,b] and S is distributed by [c]:

MergeJoin [a,b,c]
|--- TableScan R [a,b]
+--- TableScan S [c]

a) left to right: call derive(Hash[a,b], 0); we get MergeJoin [a,b]
b) right to left: call derive(Hash[c], 1); we get MergeJoin [c], most likely a bad plan
c) both: get the above 2 plans

For systems that don't have fine-tuned stats and cost models, it may not be possible to make the right decision based purely on cost. Probably we need to provide the MergeJoin with both children's derived traitset lists:

- List<RelNode> derive(List<List<RelTraitSet>>);

Of course, all the above methods are optional to implement for those who don't need this feature.

5. Branch and Bound Space Pruning
After we implement on-demand, top-down trait enforcement and rule apply, we can pass the cost limit at the time of passing down required traits, as described in the classical Cascades paper. Right now, Calcite doesn't provide group-level logical properties, including stats info; each operator in the same group has its own logical properties and the stats may vary, so we can only do limited space pruning for trait enforcement, which is still good. But if we agree to add an option to share group-level stats between relnodes in a RelSet, we will be able to do more aggressive space pruning, which will help boost the performance of join-reordering planning.

With all that being said, how do we move forward?

There are 2 ways:

a) Modify the current VolcanoPlanner.
Pros: code reuse, numerous existing test cases and infrastructure, fast integration
Cons: changing code always brings risk

b) Add a new XXXXPlanner.
Pros: no risk, no tech debt, no need to worry about backward compatibility
Cons: separate test cases for the new planner, one more planner to maintain

We'd like to hear the community's thoughts and advice.

Thanks.
- Haisheng
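The cost-limit passing described in item 5 can be sketched as a walk over a toy plan tree, where each node hands its children the remaining budget. Class and method names are invented; a real implementation would operate on MEMO groups and physical properties:

```java
import java.util.List;

/** Toy branch-and-bound: pass a cost limit down while optimizing a plan tree. */
class CostLimitPruning {
    final double selfCost;
    final List<CostLimitPruning> inputs;
    boolean visited = false;   // whether this node was actually optimized

    CostLimitPruning(double selfCost, List<CostLimitPruning> inputs) {
        this.selfCost = selfCost;
        this.inputs = inputs;
    }

    /**
     * Optimizes this subtree under the given cost limit. Returns the subtree
     * cost, or POSITIVE_INFINITY if the limit was exceeded (pruned).
     */
    double optimize(double limit) {
        if (selfCost > limit) {
            return Double.POSITIVE_INFINITY;   // prune: can't possibly fit the budget
        }
        visited = true;
        double remaining = limit - selfCost;   // budget left for the children
        double total = selfCost;
        for (CostLimitPruning child : inputs) {
            double c = child.optimize(remaining);  // pass the tightened limit down
            if (Double.isInfinite(c)) {
                return Double.POSITIVE_INFINITY;   // a child blew the budget
            }
            remaining -= c;
            total += c;
        }
        return total;
    }
}
```

Note that an over-budget child is never visited at all: the limit, tightened by the costs already committed, prunes the subtree before any work is done there. The group-level stats sharing discussed above is what would make such bounds trustworthy in Calcite.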