Sure. I will add more cases to my PR. I did not design more cases earlier because our own product has a test framework that contains thousands of actual user queries. Calcite's code base is quite different, so I cannot simply migrate those cases to Calcite. It may take some time.
On Wed, Apr 29, 2020 at 4:27 AM Roman Kondakov <kondako...@mail.ru.invalid> wrote: > Hi Jinpeng, > > I went through your PR and it seemed very impressive to me. It is very > similar to what I did, but you've reused much of the existing logic from the > Volcano planner. We should definitely stay in sync in our experiments. I > believe the future Cascades planner will be some kind of combination of our > works. > > Is there any way to run tests that are close to the real system query > execution? Maybe with the Enumerable convention, or, better, with a > convention that supports the distribution trait? I just want to look through > your planner's optimization steps more thoroughly. I've found some tests > in the org.apache.calcite.plan.volcano package, but they use synthetic > conventions and nodes. Maybe I missed something. > > Thank you for sharing your work! > > -- > Kind Regards > Roman Kondakov > > > On 28.04.2020 15:19, Jinpeng Wu wrote: > > Hi, Roman. It's great to see your proposal. Actually my team has also > been > > working on a Cascades planner based on Calcite, and we already have some > > outcome as well. Maybe we can combine our works. > > > > I've pushed my code as https://github.com/apache/calcite/pull/1950 . > > > > Our works have many places in common. We both developed a new > > CascadePlanner and avoided modifying the old VolcanoPlanner directly. We > > both implemented the top-down search strategy according to the > > Columbia optimizer > > generator > > < > https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf > >. But > > we also have some differences. > > > > The first difference is that I try to reuse the existing VolcanoPlanner > as > > much as possible. My CascadePlanner inherits from the existing > > VolcanoPlanner. Except that it overrides the ruleQueue and the findBestPlan > method > > to rearrange rule applications, most logic is inherited from > > VolcanoPlanner. 
For example, > > - It reuses the RelSet and RelSubset classes and the register method > > - Rules are fired as soon as a RelNode is registered (in the > > Columbia optimizer generator, rules are not fired until exploring). The > > ApplyRule task controls when to invoke the onMatch method of a RuleMatch. > > This design has the benefit that we do not need to worry about missing a > > rule or firing a rule multiple times. > > - It leverages AbstractConverter to pass trait requirements down. So > > currently AC is still essential in my code. > > This makes the new planner highly compatible with the old VolcanoPlanner. > > Features like MV and Hints can apply to it directly. And I tried to > change > > VolcanoPlanner to the new CascadePlanner in tests. Most tests passed. > > Several cases did fail. I know the reason and how to fix them. But I am > > still thinking about marking them as "won't fix", as the ruleset violates > > some basic principles of top-down trait requests. > > > > The second difference is that our design has the ability for space > > pruning. Currently it contains a simple LowerBoundCost metadata to > compute > > the lower bound of a RelNode. Because logical properties like the cardinality > > of a RelSet are not stable across exploring, a group is required > to > > be fully explored (implementation rules and enforcement rules should > never > > modify the logical properties) before it can provide a valid lower bound > > cost. Because of that, logical search space pruning is not supported now. > > It can only prune out implementation rules and enforcement rules. > Testing > > with cases in our own product, the new planner saves about 10% of rule > > applications. I am still considering how to support logical space pruning, > > looking forward to more improvements. > > > > Hope my code will help. 
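The pruning condition Jinpeng describes above can be sketched roughly as follows. This is a minimal illustration only: the class and method names (`Group`, `canSkipImplementationRules`) are hypothetical stand-ins, not the actual API of the PR. The key rule it encodes is that a group's lower bound is only trustworthy once the group is fully explored, because exploration can still change logical properties such as cardinality.

```java
/**
 * Rough sketch of lower-bound pruning in a top-down planner.
 * Hypothetical names for illustration, not the PR's actual API.
 */
public class LowerBoundPruningSketch {

  /** Stand-in for a RelSet carrying a LowerBoundCost-style metadata value. */
  static class Group {
    final double lowerBoundCost;
    final boolean fullyExplored;

    Group(double lowerBoundCost, boolean fullyExplored) {
      this.lowerBoundCost = lowerBoundCost;
      this.fullyExplored = fullyExplored;
    }
  }

  /**
   * Implementation and enforcement rules may be skipped only when the group
   * is fully explored (so its logical properties can no longer change) and
   * its lower bound already meets or exceeds the cost of the best complete
   * plan found so far (the upper bound).
   */
  static boolean canSkipImplementationRules(Group group, double upperBound) {
    return group.fullyExplored && group.lowerBoundCost >= upperBound;
  }

  public static void main(String[] args) {
    // Fully explored, lower bound worse than the current best plan: prune.
    System.out.println(canSkipImplementationRules(new Group(120.0, true), 100.0));
    // Still exploring: the lower bound is not yet valid, so never prune.
    System.out.println(canSkipImplementationRules(new Group(120.0, false), 100.0));
  }
}
```

This also shows why logical-space pruning is harder: a logical group that is still being transformed never satisfies the `fullyExplored` check, so only implementation and enforcement work can be cut.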
> > > > Thanks, > > Jinpeng > > > > > > On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai <xndai....@gmail.com> > wrote: > > > >> For #1, aside from that, we need to be able to build physical nodes based > >> on a convention. For example, if we merge two EnumerableProjects, we > would > >> want to create an EnumerableProject as a result, instead of a > LogicalProject. > >> The RelBuilder change I am working on would help this case. > >> > >> For #2, I don't think it's just a bug. If the physical cost cannot be > >> reliable before transformation is finished, we should probably delay the > >> physical cost calculation, or we risk doing it over again. The other way is > >> to complete RelSet transformation before implementing it - which is a > >> common practice in industry, including Orca. > >> > >> Multi-convention is a key scenario, and I agree we should support it. My > >> thinking is more about separating the logical one (Conventions.NONE) from the > >> others. > >> > >> > >>> On Apr 27, 2020, at 4:59 PM, Julian Hyde <jh...@apache.org> wrote: > >>> > >>> Re 1. By all means have multiple instances of a rule (e.g. one instance > >> that matches LogicalFilter and another that matches FooFilter) and > enable > >> different instances during different phases. (We have been slow to > create > >> all of these variant instances, in part because of the difficulty of > >> changing the constructors of many existing rules. My proposed change > >> https://issues.apache.org/jira/browse/CALCITE-3923 < > >> https://issues.apache.org/jira/browse/CALCITE-3923> would make it > easier > >> to create new instances that are more precisely targeted.) > >>> > >>> Re 2. Sure, there are bugs. > >>> > >>> Re 3. That's a good idea. The planner could have a "single-convention > >> mode", which might be faster, but would throw if it encountered a rel in a > >> different convention. > >>> > >>> I think we're in agreement - separating rules is a best practice. 
But > we > >> shouldn't force that best practice on everyone. The multi-convention > case > >> is still crucial for planning hybrid queries (e.g. joining MySQL to > >> MongoDB). > >>> > >>> Julian > >>> > >>> > >>>> On Apr 27, 2020, at 4:28 PM, Xiening Dai <xndai....@gmail.com> wrote: > >>>> > >>>> Hi Julian, > >>>> > >>>> In my view, separating logical and physical rules has a number of > >> benefits - > >>>> > >>>> 1. With the current design, a rule can match both physical and logical > >> nodes. This behavior could cause duplication of rule firings and explosion > >> of the memo and search space. There was a long discussion regarding this > >> (CALCITE-2223). Although the indefinite rule matching problem is fixed by > >> a separate change, the duplicate rule firing is not resolved. There's a > >> patch trying to address it (https://github.com/apache/calcite/pull/1543 > < > >> https://github.com/apache/calcite/pull/1543>), but it still fell short > >> due to current design limitations. > >>>> > >>>> 2. We have a few metadata inconsistency issues today which are due to the > >> fact that we don't clearly define a transformation phase. For example, we > >> don't have a solution for CALCITE-2166 as long as transformation can > still > >> apply to a RelSet after it's been implemented, which means the group's > >> logical properties (such as row count) can still change and invalidate all > >> the previous best cost calculations, so the best cost can increase (oops!). > >>>> > >>>> 3. The other benefit is that the planner can choose to shortcut a number of > >> expensive operations (such as RelSubset registration, cost calculation > and > >> propagation, etc.) during the transformation phase, if it were clearly defined > >> and enforced by the framework. > >>>> > >>>> > >>>>> On Apr 27, 2020, at 11:59 AM, Julian Hyde <jh...@apache.org> wrote: > >>>>> > >>>>> This thread has almost gotten too long to respond to. I confess I've > >> not read much of it. I'm going to reply anyway. Sorry. 
> >>>>> > >>>>> I support making Calcite's optimizer support "Cascades". > >>>>> > >>>>> We should keep the existing VolcanoPlanner working during the > >> transition, and perhaps longer. (I acknowledge that this will not be easy. > >> But it will not be impossible, because we already have 2 planner > engines, > >> and going from 2 to 3 is much easier than going from 1 to 2.) > >>>>> > >>>>> I perhaps have a different philosophy than Haisheng + Cascades on > >> logical vs physical. I do believe that logical & physical rels and rules > >> are more similar than they are different, therefore they should have the > >> same classes, etc. Pragmatically, it makes a lot of sense to create rule > >> sets and planner phases that deal with just logical rels, or just > physical > >> rels. But that's a best practice, not a rule. > >>>>> > >>>>> This philosophy manifested in a discussion a couple of weeks ago > about > >> whether RelBuilder should be able to create physical rels. I still > believe > >> that it should. > >>>>> > >>>>> Julian > >>>>> > >>>>> > >>>>>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov > >> <kondako...@mail.ru.INVALID> wrote: > >>>>>> > >>>>>> Hi all, > >>>>>> > >>>>>> Stamatis, Haisheng, thank you very much for your feedback! I really > >>>>>> appreciate it. > >>>>>> > >>>>>>> If in the new planner we end up copy-pasting code then I guess it > >> will be a bad idea. > >>>>>> > >>>>>> Yes, there is some code duplication between the Volcano planner and the > >>>>>> Cascades planner. I think I'll move it to some common superclass: > >> either > >>>>>> to the existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner. > >>>>>> > >>>>>>> Like that it will be easier for everybody to test and see if the > >> changes make this better or worse. From a backward compatibility > >> perspective it seems feasible to keep the new features configurable for a > >> certain amount of time. 
> >>>>>> > >>>>>> I was thinking about backward compatibility in this way: what if we > >> can > >>>>>> switch planners in tests by setting some flag (system property?) > >>>>>> somewhere and check what happens with the new planner if we replace > >> Volcano > >>>>>> with it. And if it ever turns out that the new planner passes all of > >>>>>> Volcano's tests and at the same time works more efficiently, we > can > >>>>>> safely replace the Volcano planner with the Cascades planner. > >>>>>> > >>>>>>> A design doc would definitely help, especially if it has a few > >> end-to-end (from logical to physical plan) examples showing how the > >> optimizer works > >>>>>> at each step before/after the changes. This is actually what is > >> usually > >>>>>> missing in research papers and makes them hard to understand. > >>>>>>> I am thinking of something similar to the examples that Haisheng sent in > the > >> first email, but possibly a bit more detailed. > >>>>>> > >>>>>> I agree, I'll add some examples to the design doc very soon. > >>>>>> > >>>>>>> I looked very briefly at the PR by Roman but I think I didn't see > >> tests where the final plan contains operators from multiple conventions. > >> Multiple conventions is among the choices that complicate certain parts of > >> the existing planner, so we should make sure that we take this into > account. > >>>>>> > >>>>>> It's on my radar. I'm going to add these tests. > >>>>>> > >>>>>> I would like to ask the community about my next steps for the > >>>>>> transition of the Cascades planner from prototype status to > >> becoming > >>>>>> a part of the project. I see these steps like this: > >>>>>> > >>>>>> 1. Create a Jira ticket. > >>>>>> 2. Update the design document with examples. > >>>>>> 3. Do some research on backward compatibility with the Volcano > >>>>>> planner, to be able to replace the Volcano planner with the Cascades planner > in > >>>>>> tests and to understand the current problems with the planner. > >>>>>> 4. 
Solve known problems > >>>>>> - materialized views > >>>>>> - hints > >>>>>> - multiple conventions > >>>>>> - listener hooks > >>>>>> - problems from p.3. > >>>>>> 5. New PR, review and merge. > >>>>>> 6. Replace the Volcano planner with Cascades after several releases. > >>>>>> > >>>>>> What do you think about this roadmap? > >>>>>> > >>>>>> > >>>>>> -- > >>>>>> Kind Regards > >>>>>> Roman Kondakov > >>>>>> > >>>>>> > >>>>>> On 27.04.2020 01:55, Stamatis Zampetakis wrote: > >>>>>>> Hi all, > >>>>>>> > >>>>>>> I am very excited about the ideas discussed so far and especially > by > >> the > >>>>>>> enthusiasm of many people that are ready to help pull this > >> off. > >>>>>>> I wouldn't have expected that we could have a prototype so quickly. > >>>>>>> Thanks a lot everyone! > >>>>>>> > >>>>>>> In the debate between creating a new planner or patching the existing > >> one, I > >>>>>>> don't have a clear preference. > >>>>>>> I think the answer depends on how many things we can reuse. > >>>>>>> If in the new planner we end up copy-pasting code then I guess it > >> will be a > >>>>>>> bad idea. > >>>>>>> On the other hand, if the new and old planner do not have many > >> things in > >>>>>>> common then I guess the answer is obvious. > >>>>>>> > >>>>>>> From Haisheng's description, I was thinking that many of the > proposed > >>>>>>> changes could go in the existing planner. > >>>>>>> Like that it will be easier for everybody to test and see if the > >> changes > >>>>>>> make this better or worse. > >>>>>>> From a backward compatibility perspective it seems feasible to keep > >> the new > >>>>>>> features configurable for a certain amount of time. > >>>>>>> > >>>>>>> From the wish-list, I think we should focus initially on points: > >>>>>>> 1. Top-down trait request > >>>>>>> 2. Convert traits without Abstract converters > >>>>>>> 4. 
Bottom-up trait derivation > >>>>>>> > >>>>>>> I know that 3 and 5 are also important but I have the feeling they > >> can > >>>>>>> wait a bit longer. > >>>>>>> > >>>>>>> A design doc would definitely help, especially if it has a few > >> end-to-end > >>>>>>> (from logical to physical plan) examples showing how the optimizer > >> works at > >>>>>>> each step before/after the changes. > >>>>>>> This is actually what is usually missing in research papers and > >> makes them > >>>>>>> hard to understand. > >>>>>>> I am thinking of something similar to the examples that Haisheng sent in > the > >> first > >>>>>>> email, but possibly a bit more detailed. > >>>>>>> > >>>>>>> I looked very briefly at the PR by Roman but I think I didn't see > >> tests > >>>>>>> where the final plan contains operators from multiple conventions. > >>>>>>> Multiple conventions is among the choices that complicate certain > >> parts of > >>>>>>> the existing planner, so we should make sure that we take this into > >> account. > >>>>>>> > >>>>>>> Hoping to find some time to think over all this more quietly. Very > >>>>>>> interesting stuff :) > >>>>>>> > >>>>>>> Best, > >>>>>>> Stamatis > >>>>>>> > >>>>>>> On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan <hy...@apache.org> > >> wrote: > >>>>>>> > >>>>>>>> Hi Roman, > >>>>>>>> > >>>>>>>> Excellent! This is definitely a helpful contribution to the > Calcite > >>>>>>>> community. > >>>>>>>> Thank you for your endeavors. > >>>>>>>> > >>>>>>>> Haisheng > >>>>>>>> > >>>>>>>> On 2020/04/26 19:25:00, Roman Kondakov <kondako...@mail.ru.INVALID > > > >>>>>>>> wrote: > >>>>>>>>> Hi everyone! > >>>>>>>>> > >>>>>>>>> Haisheng, thank you for bringing this subject up. A new > >> Cascades-style > >>>>>>>>> optimizer should definitely be the next step for Apache Calcite. > >> Many > >>>>>>>>> projects suffer from the lack of this kind of optimizer. 
> >>>>>>>>> > >>>>>>>>> That was the reason why several weeks ago I started working on > the > >>>>>>>>> prototype of a Cascades optimizer for Apache Calcite. I was not > sure > >> that > >>>>>>>>> I could build something useful without much experience in this > >> area. But > >>>>>>>>> now I'd like to share my work with the community. You can find > the > >>>>>>>>> Cascades prototype in PR [1]. This prototype is based on the > >> well-known > >>>>>>>>> paper [2]. > >>>>>>>>> > >>>>>>>>> What the prototype can do: > >>>>>>>>> - Top-down trait request > >>>>>>>>> - Convert traits without Abstract converters > >>>>>>>>> - Top-down rule apply > >>>>>>>>> - Bottom-up trait derivation > >>>>>>>>> > >>>>>>>>> What is not supported yet: > >>>>>>>>> - Search space pruning (but I'm going to fix it in the next > >> commits) > >>>>>>>>> - Materialized views > >>>>>>>>> - Planner hooks > >>>>>>>>> - Hints > >>>>>>>>> - Backward compatibility with the Volcano planner (some research > >> needed) > >>>>>>>>> > >>>>>>>>> I prepared a design doc for this planner [3]; you can find many > >> details > >>>>>>>>> there. I also opened it for comments. > >>>>>>>>> > >>>>>>>>> I've written several basic test cases in > >>>>>>>>> org.apache.calcite.plan.cascades.CascadesPlannerTest, including one that was > >>>>>>>>> discussed in this thread previously in the context of trait > >> requests: > >>>>>>>>> > >>>>>>>>>> MergeJoin hash[a] > >>>>>>>>>> |---- TableScan R hash[a] (RelSubset) > >>>>>>>>>> +---- TableScan S hash[a] (RelSubset) > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> Haisheng, this is very similar to what you propose. Answering > your > >>>>>>>> question: > >>>>>>>>> > >>>>>>>>>> There are 2 ways: > >>>>>>>>>> a) Modify the current VolcanoPlanner. 
> >>>>>>>>>> Pros: code reuse, numerous existing test cases and > infrastructure, > >>>>>>>> fast integration > >>>>>>>>>> Cons: changing code always brings risk > >>>>>>>>>> > >>>>>>>>>> b) Add a new XXXXPlanner > >>>>>>>>>> Pros: no risk, no tech debt, no need to worry about backward > >>>>>>>> compatibility > >>>>>>>>>> Cons: separate test cases for the new planner, one more planner to > >>>>>>>> maintain > >>>>>>>>> > >>>>>>>>> I've chosen the second approach, because I don't have a clear > >>>>>>>>> understanding of how to fix the Volcano planner gradually. > >>>>>>>>> > >>>>>>>>> This new planner is very far from perfect, but I think it can be a > >> good > >>>>>>>>> starting point for the community. > >>>>>>>>> > >>>>>>>>> Please share your thoughts about this planner. > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> [1] PR: https://github.com/apache/calcite/pull/1948 > >>>>>>>>> [2] Paper: > >>>>>>>>> > >>>>>>>> > >> > https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf > >>>>>>>>> [3] Design doc: > >>>>>>>>> > >>>>>>>> > >> > https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Kind Regards > >>>>>>>>> Roman Kondakov > >>>>>>>>> > >>>>>>>>> > >>>>>>>>> On 22.04.2020 09:52, Danny Chan wrote: > >>>>>>>>>>> Is there any recommended approach to make that happen smoothly > >> besides > >>>>>>>>>> coding and testing work? We need to be aware that the new > planner > >>>>>>>> might > >>>>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never > >> replace > >>>>>>>>>> VolcanoPlanner. 
> >>>>>>>>>> > >>>>>>>>>> If that is true, I might say the new planner probably has a design that is > >> not > >>>>>>>> that > >>>>>>>>>> good; we should foresee for what cases/reasons a > >> user > >>>>>>>> would have > >>>>>>>>>> reason to keep the old VolcanoPlanner, and we *must* give a > >>>>>>>> solution for > >>>>>>>>>> those problems in the new design. > >>>>>>>>>> > >>>>>>>>>> I was expecting that migrating to a new planner would take at least 1 > >>>>>>>> year > >>>>>>>>>> of development. If that is true, modifying the > >> current > >>>>>>>>>> planner directly means that for the next 3~4 versions of Calcite, each release > >> would > >>>>>>>> bring > >>>>>>>>>> in huge plan changes/bugs, which I believe all the > >>>>>>>> users of > >>>>>>>>>> Calcite don't want to see. And no one can guarantee that > modifying it > >>>>>>>> directly > >>>>>>>>>> can keep good stability and compatibility; only the test set does. > >>>>>>>>>> > >>>>>>>>>> From the experience of the Alibaba Blink planner, which was > >> contributed to > >>>>>>>>>> Apache Flink, yes, the old/new planners would co-exist for at least > >> 2 > >>>>>>>> years, > >>>>>>>>>> for the reason that the new and old planners have different > >> abilities in > >>>>>>>> some > >>>>>>>>>> corner cases. > >>>>>>>>>> > >>>>>>>>>> From my point of view, we should at least: > >>>>>>>>>> - Give a convincing test set for the new planner that makes us > >> believe > >>>>>>>> the > >>>>>>>>>> new planner is stable and powerful enough. I mean, obviously the > >> current > >>>>>>>>>> rule tests are far from enough to support the new planner. > >>>>>>>>>> - We should give a more detailed design doc about the new > planner, > >>>>>>>>>> especially about the interface changes and any change that > would > >>>>>>>> bring in > >>>>>>>>>> compatibility problems. 
Then we can make a more accurate > >> decision on how > >>>>>>>> much > >>>>>>>>>> work the new planner would bring in; only then can we decide whether > >>>>>>>> switching to > >>>>>>>>>> pure new-planner development is a good idea or whether to modify the > >> existing > >>>>>>>> one. > >>>>>>>>>> > >>>>>>>>>> > >>>>>>>>>> Haisheng Yuan <hy...@apache.org> wrote on Wed, Apr 22, 2020 at 9:45 AM: > >>>>>>>>>> > >>>>>>>>>>> Hi Andrii, > >>>>>>>>>>> > >>>>>>>>>>>> Obviously, from what is written here, I could guess that this > >> would > >>>>>>>>>>> require me to change my physical planning rules, even if only > by > >>>>>>>>>>> implementing a marker interface. > >>>>>>>>>>> You don't need to change your physical rules; they will be > treated > >> the same > >>>>>>>> as > >>>>>>>>>>> logical rules and be applied together with the real logical > >> rules, > >>>>>>>> no > >>>>>>>>>>> more logical/physical rule difference. This is also how the > current > >>>>>>>>>>> VolcanoPlanner works. > >>>>>>>>>>> > >>>>>>>>>>>> I don't want you to think that I somehow resent the changes > you > >> are > >>>>>>>>>>> pushing. > >>>>>>>>>>> Don't get me wrong. I am seriously thinking of reverting these > >> changes, > >>>>>>>> since > >>>>>>>>>>> most people like the idea of adding a new planner; why don't we > >> make > >>>>>>>> all the > >>>>>>>>>>> plan changes in the new planner, instead of forcing people to > >> change > >>>>>>>> test > >>>>>>>>>>> cases for code changes in > >> VolcanoPlanner > >>>>>>>>>>> that they might not need during an upgrade. > >>>>>>>>>>> > >>>>>>>>>>> I didn't intend to replace VolcanoPlanner; I thought of just changing > >> the > >>>>>>>> search > >>>>>>>>>>> strategy and adding a trait derivation mechanism, because most of > the > >> code > >>>>>>>> in > >>>>>>>>>>> VolcanoPlanner can be reused. But since many agree to add a new > >> planner > >>>>>>>> and > >>>>>>>>>>> replace VolcanoPlanner as the final goal, I won't be against > most > >>>>>>>> people's > >>>>>>>>>>> decision. 
> >>>>>>>>>>> > >>>>>>>>>>> Is there any recommended approach to make that happen smoothly > >> besides > >>>>>>>>>>> coding and testing work? We need to be aware that the new > planner > >>>>>>>> might > >>>>>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never > >>>>>>>> replace > >>>>>>>>>>> VolcanoPlanner. > >>>>>>>>>>> > >>>>>>>>>>> More thoughts are welcome. > >>>>>>>>>>> > >>>>>>>>>>> Haisheng > >>>>>>>>>>> > >>>>>>>>>>> On 2020/04/21 19:56:25, Андрей Цвелодуб <a.tsvelo...@gmail.com > > > >>>>>>>> wrote: > >>>>>>>>>>>> Hello Haisheng, > >>>>>>>>>>>> > >>>>>>>>>>>>> To keep backward compatibility, all the un-marked rules will > be > >>>>>>>> treated > >>>>>>>>>>>> as logical rules, except rules that use AbstractConverter as > a rule > >>>>>>>>>>> operand; > >>>>>>>>>>>> these rules still need to be applied top-down, or in random order. > >>>>>>>>>>>> Obviously, from what is written here, I could guess that this > >> would > >>>>>>>>>>> require > >>>>>>>>>>>> me to change my physical planning rules, even if only by > >>>>>>>> implementing a > >>>>>>>>>>>> marker interface. I am not saying this is a bad thing, but > this > >> is a > >>>>>>>>>>> thing > >>>>>>>>>>>> that should be communicated and planned ahead in case the > >>>>>>>> VolcanoPlanner > >>>>>>>>>>> is > >>>>>>>>>>>> modified. > >>>>>>>>>>>> > >>>>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and > >>>>>>>> CALCITE-3753, > >>>>>>>>>>>> because they will cause another ton of plan changes. > >>>>>>>>>>>> I see you are still bitter due to all the discussions on this > >> list > >>>>>>>>>>> lately; > >>>>>>>>>>>> I'm sorry. I don't want you to think that I somehow resent the > >>>>>>>> changes > >>>>>>>>>>> you > >>>>>>>>>>>> are pushing; au contraire, I support them and would be happy to > >> help > >>>>>>>> if I > >>>>>>>>>>>> can. I just want the process of these changes to be executed > in > >> the > >>>>>>>> best > >>>>>>>>>>>> possible way. 
> >>>>>>>>>>>> As I see there are already several opinions in this thread > that > >>>>>>>> basically > >>>>>>>>>>>> align with what I am saying, so I guess I am not the crazy guy > >>>>>>>> running > >>>>>>>>>>>> around and yelling "the end is nigh!". > >>>>>>>>>>>> > >>>>>>>>>>>> Thank you for taking these mumbled thoughts into account. > >>>>>>>>>>>> > >>>>>>>>>>>> Bestest Regards, > >>>>>>>>>>>> Andrii Tsvielodub > >>>>>>>>>>>> > >>>>>>>>>>>> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <hy...@apache.org > > > >>>>>>>> wrote: > >>>>>>>>>>>> > >>>>>>>>>>>>> Hi Andrii, > >>>>>>>>>>>>> > >>>>>>>>>>>>>> I guess changing the planner would lead to changes in tons > of > >> rules > >>>>>>>>>>> and > >>>>>>>>>>>>> even more tests. > >>>>>>>>>>>>> Obviously you didn't read through my email. You are not > >> required to > >>>>>>>> make > >>>>>>>>>>> any > >>>>>>>>>>>>> changes to your rules if you don't want to; but if you do, you > just > >> need > >>>>>>>> to > >>>>>>>>>>> mark > >>>>>>>>>>>>> the rule to tell the planner whether it is a physical rule or > not, > >>>>>>>> simply > >>>>>>>>>>> by > >>>>>>>>>>>>> implementing an empty interface. > >>>>>>>>>>>>> > >>>>>>>>>>>>>> many on this list already experienced problems with > upgrading > >> even > >>>>>>>>>>>>> between the minor versions of Calcite. > >>>>>>>>>>>>> Sorry to see the problems you have experienced when upgrading > >>>>>>>> Calcite. > >>>>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and > >>>>>>>> CALCITE-3753, > >>>>>>>>>>>>> because they will cause another ton of plan changes. > >>>>>>>>>>>>> > >>>>>>>>>>>>> But I will see if I can add a setting to use the old search > >>>>>>>> strategy, > >>>>>>>>>>>>> which can be left untouched. 
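The "empty marker interface" idea Haisheng describes above can be sketched as follows. All names here (`PhysicalRule`, the rule classes, `isPhysical`) are hypothetical illustrations of the approach, not the interface that was eventually committed to Calcite; the point is that unmarked rules keep today's behavior, so no rule constructors need to change.

```java
/**
 * Sketch of marking physical rules via an empty marker interface.
 * Hypothetical names for illustration, not Calcite's actual API.
 */
public class RuleMarkerSketch {

  /** Stand-in for Calcite's RelOptRule. */
  abstract static class Rule {
    abstract String name();
  }

  /** Empty marker interface: implementing it tags a rule as physical. */
  interface PhysicalRule {}

  /** An unmarked rule: treated as logical, the backward-compatible default. */
  static class FilterMergeRule extends Rule {
    @Override String name() { return "FilterMergeRule"; }
  }

  /** A marked rule: the planner can schedule it in the implementation phase. */
  static class EnumerableFilterRule extends Rule implements PhysicalRule {
    @Override String name() { return "EnumerableFilterRule"; }
  }

  /** The planner-side check: a single instanceof test, no constructor changes. */
  static boolean isPhysical(Rule rule) {
    return rule instanceof PhysicalRule;
  }

  public static void main(String[] args) {
    // Unmarked rules stay logical; marked rules are recognized as physical.
    System.out.println(isPhysical(new FilterMergeRule()));      // false
    System.out.println(isPhysical(new EnumerableFilterRule())); // true
  }
}
```

This is why the change is opt-in: existing rule sets compile and run unmodified, and only rules that want phase-aware scheduling add the marker.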
> >>>>>>>>>>>>> > >>>>>>>>>>>>> Haisheng > >>>>>>>>>>>>> > >>>>>>>>>>>>> On 2020/04/21 06:33:08, Андрей Цвелодуб < > a.tsvelo...@gmail.com > >>> > >>>>>>>> wrote: > >>>>>>>>>>>>>> Hello everyone, > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> First of all, thanks for this great effort of improving the > >> core > >>>>>>>>>>> parts of > >>>>>>>>>>>>>> the framework we all are using, > >>>>>>>>>>>>>> I believe this is long overdue and hope this will have > >> benefits > >>>>>>>> both > >>>>>>>>>>> for > >>>>>>>>>>>>>> the maintainers and users of the library. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I don't have anything to say about the general idea at the > >> moment, > >>>>>>>>>>>>>> but I want to make a point that maintaining the old > >> implementation > >>>>>>>> of > >>>>>>>>>>>>>> VolcanoPlanner during > >>>>>>>>>>>>>> the initial stages of implementing the new planner is > >> absolutely > >>>>>>>>>>>>> CRITICAL. > >>>>>>>>>>>>>> As a lot of users of Calcite do various customizations to > the > >>>>>>>>>>> engine, to > >>>>>>>>>>>>>> the rules > >>>>>>>>>>>>>> and all that is there in between, I believe changing the > >>>>>>>>>>> implementation > >>>>>>>>>>>>> of > >>>>>>>>>>>>>> the core component > >>>>>>>>>>>>>> would have a huge impact on most users of the library. I > >> think many > >>>>>>>>>>> on > >>>>>>>>>>>>> this > >>>>>>>>>>>>>> list > >>>>>>>>>>>>>> already experienced problems with upgrading even between the > >> minor > >>>>>>>>>>>>> versions > >>>>>>>>>>>>>> of Calcite, > >>>>>>>>>>>>>> so I guess changing the planner would lead to changes in > tons > >> of > >>>>>>>>>>> rules > >>>>>>>>>>>>> and > >>>>>>>>>>>>>> even more tests. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> I don't have anything against replacing VolcanoPlanner as a > >> final > >>>>>>>>>>> goal of > >>>>>>>>>>>>>> this effort, > >>>>>>>>>>>>>> but I don't think that modifying it directly and merging it > to > >>>>>>>>>>> master is > >>>>>>>>>>>>> a > >>>>>>>>>>>>>> viable development approach. 
> >>>>>>>>>>>>>> While I understand how burdensome it is to maintain several > >>>>>>>> parallel > >>>>>>>>>>> core > >>>>>>>>>>>>>> components at once > >>>>>>>>>>>>>> (we did this while moving the engine of our product to > >> Calcite), we > >>>>>>>>>>>>> should > >>>>>>>>>>>>>> still respect those who depend > >>>>>>>>>>>>>> on it and not introduce the risks related to the development > >> of a > >>>>>>>> new > >>>>>>>>>>>>>> component into existing processing flows. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> A good model to try to follow would be the way new Garbage > >>>>>>>>>>> Collectors are > >>>>>>>>>>>>>> introduced in Java. > >>>>>>>>>>>>>> First, add it as an experimental option, then make it > >> generally > >>>>>>>>>>>>> available, > >>>>>>>>>>>>>> then after everyone agrees > >>>>>>>>>>>>>> this is the best option - make it the default one. > >>>>>>>>>>>>>> With this approach, everyone can then move to the new > planner > >> at > >>>>>>>>>>> their > >>>>>>>>>>>>> own > >>>>>>>>>>>>>> pace, guaranteeing a smooth transition overall. > >>>>>>>>>>>>>> Yes, this could take some time, maybe even a year, but this > >> is the > >>>>>>>>>>> price > >>>>>>>>>>>>> of > >>>>>>>>>>>>>> doing major changes in a popular framework. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Again, thank you for initiating this discussion and leading > >> this > >>>>>>>>>>> effort. > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> Best Regards, > >>>>>>>>>>>>>> Andrii Tsvielodub > >>>>>>>>>>>>>> > >>>>>>>>>>>>>> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu < wjpabc...@gmail.com > >>> > >>>>>>>>>>> wrote: > >>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Hi, Xiening. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Regarding calculating the logical cost, here are some ways I > >>>>>>>>>>> thought of: > >>>>>>>>>>>>>>> 1. Logical rels may implement their own computeSelfCost > >> method. 
> >>>>>>>> Some > >>>>>>>>>>>>>>> rels can provide such information; for example, > >>>>>>>>>>>>>>> LogicalProject/LogicalFilter contain nearly the same > >> information > >>>>>>>>>>> as > >>>>>>>>>>>>> their > >>>>>>>>>>>>>>> physical implementations. If we don't have enough > >> confidence, just > >>>>>>>>>>>>> returning > >>>>>>>>>>>>>>> zeroCost is also OK, as it only affects pruning. > >>>>>>>>>>>>>>> 2. A logical rel tells its parents what its physical input > >> could be > >>>>>>>>>>>>> after > >>>>>>>>>>>>>>> implementation. Then the problem comes back to calculating the > >> lower > >>>>>>>>>>> bound > >>>>>>>>>>>>> of a > >>>>>>>>>>>>>>> physical rel. > >>>>>>>>>>>>>>> There should always be ways. The only problem is how to > find > >> a > >>>>>>>>>>> pretty > >>>>>>>>>>>>> one. > >>>>>>>>>>>>>>> > >>>>>>>>>>>>>>> Regarding the risk, the new planner does have a different risk profile. It is > >> not > >>>>>>>>>>>>> that > >>>>>>>>>>>>>>> the new planner could stop us from doing something wrong, but that we can > >> decide > >>>>>>>>>>> when > >>>>>>>>>>>>> to > >>>>>>>>>>>>>>> use the new one. Some scenarios: > >>>>>>>>>>>>>>> 1. If we modify the VolcanoPlanner directly, the only way a > >> user > >>>>>>>>>>> could > >>>>>>>>>>>>>>> control the risk is not to upgrade the Calcite version until it > >> is > >>>>>>>>>>>>> considered > >>>>>>>>>>>>>>> stable. That is quite different from keeping > Calcite > >>>>>>>>>>> updated > >>>>>>>>>>>>> and > >>>>>>>>>>>>>>> switching to the new planner at the proper time. > >>>>>>>>>>>>>>> 2. It is very important for SLA control. For important > >>>>>>>>>>> business > >>>>>>>>>>>>> and > >>>>>>>>>>>>>>> jobs, we may keep using the old and stable planner, and use > >> the > >>>>>>>>>>> new one > >>>>>>>>>>>>>>> only for jobs that have fault tolerance. And this helps > >> test the > >>>>>>>> new > >>>>>>>>>>>>> planner > >>>>>>>>>>>>>>> with actual scenarios. > >>>>>>>>>>>>>>> 3. It is helpful when upgrading online services. 
When the new planner happens to have some bugs, we can switch to the old planner directly without rolling back the whole service.
4. With all these ways to prevent issues from becoming disasters, we are not vulnerable to making mistakes. This not only enables faster iterations but also gives us enough time to resolve big bugs properly, for example by considering them in detail and applying a time-consuming refactoring. Working around a critical bug in tricky ways usually introduces more issues.

Thanks,
Jinpeng

On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <xndai....@gmail.com> wrote:

Hi Jinpeng,

Regarding this comment - I believe there are ways to calculate the logical cost, but I think it's not as simple as "cardinality * unit_copy_cost" - would you provide more details of the other ways? Just an algorithm description or pseudo code would help us understand. Thanks.

Regarding the approach of creating a new planner, I don't think a new planner would lower the risk. We don't know what we don't know. If we would introduce an issue while modifying the planner, most likely we would do the same with a new planner class.
A new planner doesn't necessarily prevent the issue from happening; it just delays its surfacing, which is worse IMO.

One obvious benefit of a new planner is that we can provide some sort of isolation, so the change won't cause test baseline updates, which can be painful at times. We should see if we could use planner config to achieve the same if we decide to just modify the current planner.

On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <hy...@apache.org> wrote:

Igor,

That's great.

On 2020/04/20 11:17:49, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Haisheng, Xiening,

Ok, now I see how it should work.

Thanks for your replies.

Regards,
Igor

On 20 Apr 2020, at 09:56, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Haisheng, Xiening,

Thanks for clarifying.

"In this proposal, we are not trying to split logical and physical planning entirely."
- Actually, I was in doubt about the idea of entirely splitting the logical and physical phases; if you aren't going to, I have no objections.

But that returns me to my first question: how will we propagate traits in a bottom-up manner using the proposed approach? (See "[DISCUSS] Proposal to add API to force rules matching specific rels" for details.)

One of the inconveniences of the current VolcanoPlanner implementation is the number of tricks we need to get the desired behaviour. It would be great if some of these issues (or all of them) were solved in the new approach.

Regards,
Igor

On Mon, 20 Apr 2020 at 7:02, Xiening Dai <xndai....@gmail.com> wrote:

Hi Igor,

Your comment - "because actual cost may be calculated correctly using physical operators only. So we won't be able to implement Branch and Bound Space Pruning." - is actually not true. In Cascades' lower bound / upper bound pruning algorithm, you can get the cost lower bound of an input RelNode using cardinality * unit_copy_cost. The unit_copy_cost is a constant, which stands for the minimal cost of processing a tuple from the input.
So this will be the minimal cost the input RelNode can achieve, and if this is indeed larger than the current best cost, the input node can be pruned.

In this proposal, we are not trying to split logical and physical planning entirely. For any given equivalence set, we would need to finish exploration before implementation. But for the entire memo tree, each set could be in a different planning stage, and an equivalence set can be pruned even before it's implemented. A textbook example of the aforementioned "lower bound / upper bound pruning" is the join ordering case.

Regarding #3, I think we can still achieve that (partially) through this proposal. Remember that every time we start the optimization, we pass down an upper bound limit. Initially this upper bound for the root is infinite - meaning that no feasible plan is available yet. Every time we find a physical plan, we update this upper bound, then start the search again. We could stop the search when the cost is less than a pre-defined threshold - which gives you a "good enough" plan with early termination. Still, this wouldn't avoid the logical exploration.
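For illustration, the lower/upper bound check described above can be sketched roughly as follows. The constant name, the method names, and the plain-double cost model are made-up stand-ins for this email only, not Calcite's actual API:

```java
// Hypothetical sketch of lower-bound pruning: an input group whose
// cheapest conceivable cost already exceeds the remaining budget under
// the best complete plan found so far can be skipped entirely.
public class LowerBoundPruning {
    // Illustrative constant: minimal cost of processing one tuple.
    static final double UNIT_COPY_COST = 1.0;

    /** Cost lower bound of an unexplored input: one copy per row. */
    static double costLowerBound(double cardinality) {
        return cardinality * UNIT_COPY_COST;
    }

    /** True when the input group cannot possibly beat the upper bound. */
    static boolean canPrune(double cardinality, double upperBound) {
        return costLowerBound(cardinality) >= upperBound;
    }

    public static void main(String[] args) {
        // Best complete plan so far costs 500: an input producing
        // 1,000,000 rows can never beat it, so it is pruned.
        System.out.println(canPrune(1_000_000, 500));
        // A 100-row input is still worth exploring.
        System.out.println(canPrune(100, 500));
    }
}
```

The point of the constant is exactly what Xiening describes: it bounds the input's cost from below without ever implementing the input physically.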
For that, you would probably achieve it through rule configurations, avoiding some expensive transformations to keep the cost down.

On Apr 19, 2020, at 7:30 PM, Haisheng Yuan <hy...@apache.org> wrote:

Igor,

a) Given Calcite's current stats derivation strategy, mixing logical and physical planning won't make it better. I hope you went through my email to the end: currently, operators inside a memo group don't share stats info, each operator's stats may differ from the others', and the RelSubset's best node and stats may vary during optimization. So logical and physical rule pruning are not safe at the moment; otherwise it almost implies silently changing downstream projects' cost models.

On the other hand, ensuring that the child group's exploration task finishes first makes rule mutual exclusivity checks possible: the result of ReduceExpressionRule won't need to trigger the same rule again, and the join result of JoinCommuteRule won't need to trigger JoinCommuteRule and ReduceExpressionRule again.
More importantly, most if not all of the long-planning queries in Calcite are caused not by too many alternatives, but by mutual triggering or cyclic triggering, which can be avoided by customizing the rules instead of using the default ones. Unless you use dynamic programming (Calcite uses heuristics) for join reordering (left-deep, right-deep, bushy), space pruning won't help make a long / infinitely running query faster.

b) No evidence shows that the current version of Calcite will return the most promising plan in the first planning iteration. Instead of praying for a good enough plan in the first iteration, why not focus on fixing the rules that cause the issue?

c) That is not the goal.

On 2020/04/19 15:14:57, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Haisheng,

From my point of view, splitting the logical and physical planning parts isn't a good idea.

I think so because the actual cost may be calculated correctly using physical operators only.
So that we:
a) won't be able to implement Branch and Bound Space Pruning (as far as I understand, at exploration time there are no physical operators and no correct costs, only assumptions, and I don't think we should rely on them)
b) won't be able to get a good enough plan (without Branch and Bound Space Pruning it's a non-trivial task to get the right search direction and the most promising plans in the first planning iterations)
c) won't be able to tune the good enough plan while several similar queries are executed

Regards,
Igor

On 19 Apr 2020, at 17:37, Haisheng Yuan <hy...@apache.org> wrote:

Hi Igor,

You can't have your cake and eat it too.
But one feasible work item we can definitely do: once we hit the timeout, stop exploring, use the first available physical operator in each group, and optimize it.
Because most, if not all, of the long / infinitely running optimizations are caused by project-related transposes, join reordering (join commute + associativity), and constant reduction of large expressions (see CALCITE-3460), all of which are logical transformation rules, and many of which have corresponding JIRAs. So another alternative is: let's fix these bugs to improve Calcite and make the timeout option less needed.

Another thing worth noting is that SQL Server's optimizer timeout mechanism is based not on clock time, but on the number of optimization tasks it has done [1].

[1] https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188

Regards,
Haisheng

On 2020/04/19 11:31:27, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Haisheng,

Ok, then such a notification isn't needed.

But in this case we don't have any control over how long planning takes.

For some systems it's necessary to get a good enough plan right now instead of the best one after a while.

For example, we've been considering a case where a query is optimized several times in short iterations when it's impossible to get the best plan in a reasonable period of time (for example, when there is an SLA for response time).

This means we need all the required physical implementations after each logical transformation is applied.

Regards,
Igor

On Sun, 19 Apr 2020 at 13:55, Haisheng Yuan <hy...@apache.org> wrote:

Hi Igor,

There will be no rule queue anymore. Y will be fully explored (logical rules are matched and applied) before it can be implemented and optimized.
Thanks,
Haisheng

On 2020/04/19 10:11:45, Seliverstov Igor <gvvinbl...@gmail.com> wrote:

Hi Haisheng,

Great explanation, thanks.

One thing I'd like to cover in advance is the trait propagation process (I need it for the Apache Ignite SQL engine implementation).

For example:

There are two nodes: Rel X and its child node Rel Y.

Both nodes are in the Optimized state, and a logical rule for Rel Y is matched in the rule queue.

After the logical rule is applied, there is a logical rel Rel Y', and the set containing it moves into the Exploring state (since there is a logical node which has to be implemented).

After the whole process Exploring -> ... -> Optimized, we need to force the parent Rel X set to switch its state from Optimized to Optimizing, to pick up the newly implemented child and (possibly) produce a new physical rel Rel X' on the basis of the traits previously requested from the Rel X set.

If a new Rel X' is produced, Rel X's parent should move its state Optimized -> Optimizing and repeat the operations described above.

Does that look right?

Regards,
Igor

On Sun, 19 Apr 2020 at 6:52, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:

Hi,

In the past few months, we have discussed a lot about Cascades-style top-down optimization, including on-demand trait derivation/request, rule application, and branch-and-bound space pruning. Now we think it is time to move towards these targets.
We will separate this into several small issues, each of which can be integrated as a standalone, independent feature and, most importantly, keep backward compatibility meanwhile.

1. Top-down trait request

In other words, pass trait requirements from parent nodes to child nodes. The trait requests happen after all the logical transformation rules and physical implementation rules are done, in a top-down manner, driven from the root set. e.g.:

SELECT a, sum(c) FROM
  (SELECT * FROM R JOIN S USING (a, b)) t
GROUP BY a;

Suppose we have the following plan tree in the MEMO, and let's only consider distribution for simplicity; each group represents a RelSet in the MEMO.
Group 1: Agg on [a]
Group 2: +-- MergeJoin on [a, b]
Group 3:     |--- TableScan R
Group 4:     +--- TableScan S

| Group No | Operator    | Derived Traits | Required Traits |
| -------- | ----------- | -------------- | --------------- |
| Group 1  | Aggregate   | Hash[a]        | N/A             |
| Group 2  | MergeJoin   | Hash[a,b]      | Hash[a]         |
| Group 3  | TableScan R | None           | Hash[a,b]       |
| Group 4  | TableScan S | None           | Hash[a,b]       |

We will add a new interface PhysicalNode (or extend RelNode) with methods:

- Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);
  pair.left is the current operator's new traitset; pair.right is the list of the children's required traitsets.

- RelNode passThrough(RelTraitSet required);
  The default implementation will call the above requireTraits() and RelNode.copy() to create a new RelNode. It can be overridden by physical operators to customize the logic.
The planner will call the above method on the MergeJoin operator to pass the required traits (Hash[a]) to MergeJoin's child operators.

We will get a new MergeJoin:

MergeJoin hash[a]
|---- TableScan R hash[a] (RelSubset)
+---- TableScan S hash[a] (RelSubset)

Now the MEMO group looks like:

| Group No | Operator    | Derived Traits     | Required Traits    |
| -------- | ----------- | ------------------ | ------------------ |
| Group 1  | Aggregate   | Hash[a]            | N/A                |
| Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
| Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
| Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |

A Calcite user may choose to ignore / not implement the interface to keep the original behavior.
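To make the pass-down step concrete, here is a rough sketch with traits modeled as plain key lists; the method name and the subset check are simplified stand-ins for this discussion, not Calcite's real RelNode/RelTraitSet API:

```java
import java.util.List;

// Toy model of passThrough() for a MergeJoin keyed on joinKeys: a hash
// distribution on a subset of the join keys can be pushed to both
// children; any other request cannot be satisfied by this operator.
public class PassThroughSketch {
    /** Returns the traits to request from each child, or null if the
     *  required traits cannot be passed down. */
    static List<String> passThroughMergeJoin(List<String> joinKeys,
                                             List<String> requiredHashKeys) {
        if (!requiredHashKeys.isEmpty()
                && joinKeys.containsAll(requiredHashKeys)) {
            return requiredHashKeys; // e.g. Hash[a] pushed to both sides
        }
        return null;                 // "can't pass down"
    }

    public static void main(String[] args) {
        // MergeJoin on [a, b] asked for Hash[a]: push Hash[a] down.
        System.out.println(passThroughMergeJoin(List.of("a", "b"), List.of("a")));
        // Asked for Hash[c]: not derivable from the join keys, so null.
        System.out.println(passThroughMergeJoin(List.of("a", "b"), List.of("c")));
    }
}
```

This mirrors the MergeJoin example above: the non-null return corresponds to the new MergeJoin hash[a] with hash[a] RelSubset children, and the null return is the "can't pass down" case.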
Each physical operator, according to its own logic, decides whether passThrough() should pass traits down or not by returning:
- a non-null RelNode, which means it can pass them down
- null, which means it can't pass them down

2. Provide an option to disable AbstractConverter

Once the plan can request traits in a top-down way in the framework, many systems don't need AbstractConverter anymore, since it is just an intermediate operator used to generate a physical sort / exchange. For those, we can provide an option to disable AbstractConverter and generate the physical enforcer directly, by adding a method to the interface Convention:
- RelNode enforce(RelNode node, RelTraitSet traits);

The default implementation may just call changeTraitsUsingConverters(), but people may choose to override it if the system has special needs, such as several traits having to be implemented together, or the position of collation in the RelTraitSet coming before distribution.

3. Top-down driven, on-demand rule apply

For every RelNode in a RelSet, rules are matched and applied sequentially; newly generated RelNodes are added to the end of the RelNode list in the RelSet, waiting for rule application. RuleQueue and DeferringRuleCall are not needed anymore. This will make space pruning and rule mutual exclusivity checks possible.

There are 3 stages for each RelSet:
a) Exploration: logical transformation, yields logical nodes
b) Implementation: physical transformation, yields physical nodes
c) Optimization: trait request, enforcement

The general process looks like:

- optimize RelSet X:
    implement RelSet X
    for each physical relnode in RelSet X:
      // pass down trait requests to child RelSets
      for each child RelSet Y of current relnode:
        optimize RelSet Y

- implement RelSet X:
    if X has been implemented:
      return
    explore RelSet X
    for each relnode in RelSet X:
      apply physical rules

- explore RelSet X:
    if X has been explored:
      return
    for each relnode in RelSet X:
      // ensure each child RelSet of current relnode is explored
      for each child RelSet Y of current relnode:
        explore RelSet Y
      apply logical rules on current relnode

Basically it is a state machine with several states - Initialized, Explored, Exploring, Implemented, Implementing, Optimized, Optimizing - and several transition methods: exploreRelSet, exploreRelNode, implementRelSet, implementRelNode, optimizeRelSet, optimizeRelNode...
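The pseudocode above can be reduced to a runnable toy, with RelSets collapsed to nodes of a tree and boolean flags standing in for the proposal's states; this is only a sketch of the ordering guarantees, not an actual planner:

```java
import java.util.ArrayList;
import java.util.List;

// Toy explore -> implement -> optimize driver: a set is fully explored
// (children first) before it is implemented, and implemented before it
// is optimized; child sets are optimized after their parent.
public class TopDownDriver {
    final List<TopDownDriver> children = new ArrayList<>();
    boolean explored, implemented, optimized;
    static final List<String> trace = new ArrayList<>();
    final String name;

    TopDownDriver(String name) { this.name = name; }

    void explore() {
        if (explored) return;
        explored = true;
        for (TopDownDriver child : children) child.explore(); // children first
        trace.add("explore " + name);    // then logical rules on this set
    }

    void implement() {
        if (implemented) return;
        implemented = true;
        explore();                        // exploration must finish first
        trace.add("implement " + name);   // physical rules
    }

    void optimize() {
        if (optimized) return;
        optimized = true;
        implement();
        trace.add("optimize " + name);    // trait requests, enforcement
        for (TopDownDriver child : children) child.optimize();
    }

    public static void main(String[] args) {
        TopDownDriver agg = new TopDownDriver("Agg");
        TopDownDriver join = new TopDownDriver("Join");
        agg.children.add(join);
        agg.optimize();
        System.out.println(trace);
    }
}
```

Running it on the two-node Agg/Join tree shows the key property: the child is explored before the parent applies any rules, and each set passes through each stage exactly once thanks to the state flags.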
To achieve this, we need to mark each rule as either a logical rule or a physical rule.
To keep backward compatibility, all unmarked rules will be treated as logical rules, except rules that use AbstractConverter as a rule operand; those rules still need to be applied top-down, or in random order.

4. On-demand, bottom-up trait derivation

It is called bottom-up, but it is actually driven top-down; it happens at the same time as the top-down trait request, in the optimization stage mentioned above. Many Calcite-based big data systems only propagate traits on Project and Filter by writing rules, which is very limited.
In fact, we can extend trait propagation/derivation to all the operators, without rules, by adding to the interface PhysicalNode (or extending RelNode) the method:
- RelNode derive(RelTraitSet traits, int childId);

Given the following plan (only considering distribution for simplicity):

Agg [a,b]
+-- MergeJoin [a]
    |---- TableScan R
    +--- TableScan S

Hash[a] won't satisfy Hash[a,b] without special treatment, because there isn't a mechanism to coordinate traits between children.
Now we call the derive method on the Agg [a,b] node, derive(Hash[a], 0), and we get the new node:
Agg [a]
+-- MergeJoin [a] (RelSubset)

We will provide different matching types, so each operator can specify what kind of matching it requires of its children:
- MatchType getMatchType(RelTrait trait, int childId);

a) Exact: Hash[a,b] exactly matches Hash[a,b], i.e. satisfies it
b) Partial: Hash[a] partially matches Hash[a,b]
c) Permuted: Sort[a,b,c] permuted-matches Sort[c,b,a]

In addition, an optimization order is provided for each operator to specify:
a) left to right
b) right to left
c) both

For example, take the query SELECT * FROM R JOIN S ON R.a = S.a AND R.b = S.b AND R.c = S.c, and suppose R is distributed by a,b and S is distributed by c.
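The three match types can be sketched on plain key lists. The MatchType names and examples come from the proposal above; the subset-based helper is only an illustration (real trait matching would be per-trait-kind, e.g. hash distributions match on key subsets while collations match on prefixes):

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative-only helper for the Exact / Partial / Permuted match types;
// not Calcite code.
public class MatchTypeSketch {
    enum MatchType { EXACT, PARTIAL, PERMUTED, NONE }

    static MatchType match(List<String> delivered, List<String> required) {
        if (delivered.equals(required)) {
            return MatchType.EXACT;     // Hash[a,b] vs Hash[a,b]
        }
        if (required.size() > delivered.size()
                && required.containsAll(delivered)) {
            return MatchType.PARTIAL;   // Hash[a] vs Hash[a,b]
        }
        List<String> d = new ArrayList<>(delivered);
        List<String> r = new ArrayList<>(required);
        d.sort(null);
        r.sort(null);
        if (d.equals(r)) {
            return MatchType.PERMUTED;  // Sort[a,b,c] vs Sort[c,b,a]
        }
        return MatchType.NONE;
    }

    public static void main(String[] args) {
        System.out.println(match(List.of("a", "b"), List.of("a", "b")));       // EXACT
        System.out.println(match(List.of("a"), List.of("a", "b")));            // PARTIAL
        System.out.println(match(List.of("a", "b", "c"), List.of("c", "b", "a"))); // PERMUTED
    }
}
```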
MergeJoin [a,b,c]
|--- TableScan R [a,b]
+--- TableScan S [c]

a) left to right: call derive(Hash[a,b], 0), and we get MergeJoin [a,b]
b) right to left: call derive(Hash[c], 1), and we get MergeJoin [c], most likely a bad plan
c) both: get both of the above plans.

For systems that don't have fine-tuned stats and a cost model, it may not be possible to make the right decision based purely on cost. We probably need to provide the MergeJoin with both children's derived traitset lists:
- List<RelNode> derive(List<List<RelTraitSet>>);

Of course, all the above methods are optional to implement for those who don't need this feature.

5. Branch and Bound Space Pruning
After we implement on-demand, top-down trait enforcement and rule application, we can pass the cost limit down at the time of passing down required traits, as described in the classical Cascades paper.
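A small sketch of the "hand the operator both children's trait lists" variant, using hypothetical names (key lists standing in for RelTraitSet), can clarify why it avoids committing to one traversal order:

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only, not Calcite's API: given, for each child, the list of
// distributions it can deliver, return the join distributions worth
// exploring (any child distribution whose keys are all join keys).
public class MultiChildDeriveSketch {
    static List<List<String>> deriveJoinDists(
            List<List<List<String>>> childDists, List<String> joinKeys) {
        List<List<String>> result = new ArrayList<>();
        for (List<List<String>> dists : childDists) {
            for (List<String> dist : dists) {
                if (joinKeys.containsAll(dist) && !result.contains(dist)) {
                    result.add(dist);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // R delivers Hash[a,b], S delivers Hash[c]; join keys are a,b,c.
        List<List<List<String>>> children = List.of(
                List.of(List.of("a", "b")),
                List.of(List.of("c")));
        System.out.println(deriveJoinDists(children, List.of("a", "b", "c")));
        // [[a, b], [c]]  -> both MergeJoin [a,b] and MergeJoin [c] candidates
    }
}
```

With both lists visible at once, the join can register both candidates and let cost (or a heuristic, where stats are weak) pick later, rather than the left-to-right or right-to-left order deciding for it.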
Right now, Calcite doesn't provide group-level logical properties, including stats info; each operator in the same group has its own logical properties, and the stats may vary, so we can only do limited space pruning for trait enforcement, which is still good. But if we agree to add an option to share group-level stats between relnodes in a RelSet, we will be able to do more aggressive space pruning, which will help boost the performance of join-reorder planning.


With all that being said, how do we move forward?

There are 2 ways:
a) Modify the current VolcanoPlanner.
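The cost-limit mechanism in the branch-and-bound point can be reduced to a toy sketch (all names illustrative, not Calcite's): an alternative is pruned as soon as its cost reaches the limit handed down by the parent, and the limit tightens as better plans are found.

```java
// Toy branch-and-bound sketch: returns the best cost under the limit, or
// Double.POSITIVE_INFINITY if every alternative was pruned. In the real
// planner the "alternatives" are subplans in a group, and the limit is
// passed down together with the required traits.
public class CostLimitSketch {
    static double optimize(double[] alternativeCosts, double costLimit) {
        double best = Double.POSITIVE_INFINITY;
        for (double cost : alternativeCosts) {
            if (cost >= costLimit) {
                continue; // prune: cannot beat the limit from the parent
            }
            best = cost;
            costLimit = best; // tighten the limit for the remaining ones
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(optimize(new double[] {10.0, 7.0, 12.0}, 9.0)); // 7.0
    }
}
```

The caveat in the text applies directly: pruning is only safe if the cost (and the stats behind it) is consistent across alternatives in a group, which is why shared group-level stats would allow more aggressive pruning.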
Pros: code reuse, numerous existing test cases and infrastructure, fast integration
Cons: changing code always brings risk

b) Add a new XXXXPlanner
Pros: no risk, no tech debt, no need to worry about backward compatibility
Cons: separate test cases for the new planner, one more planner to maintain

We'd like to hear the community's thoughts and advice.

Thanks.
- Haisheng