Re: [DISCUSS] Towards Cascades Optimizer

Jinpeng Wu Tue, 28 Apr 2020 05:20:26 -0700

Hi, Roman. It's great to see your proposal. Actually my team has also been
working on a cascade planner based on calcite.  And we already have some
outcome as well.  Maybe we can combine our works.


I've pushed my code as https://github.com/apache/calcite/pull/1950 .

Our works have many places in common. We both developed a new
CascadePlanner and avoid modifying the old VolcanoPlanner directly. We
both implemented the top-down search strategy according to the
Columnbia optimizer
generator
<https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf>。But
we also have some differences.

The first difference is that I try to reuse the existing VolcanoPlanner as
much as possible. My CascadePlanner inherits from the existing
VolcanoPlanner. Except that it overwrites ruleQueue and findBestPlan method
to rearrange rule applies, most logic generally inherit from
VolcanoPlanner. For example,
  - It reuses the RelSet and RelSubset class and the register method
  - Rules are fired as soon as a RelNode is registered (In the
Columnbia optimizer generator, rules are not fired until exploring). The
ApplyRule task controls when to invoke the onMatch method of a RuleMatch.
This design have a benefit that we do not need to worry about missing a
rule or firing a rule multiple times.
  - It leverages AbstractConverter to pass traits requirements down.  So
currently AC is still essential in my code.
This makes the new planner highly compatible with the old VolcanoPlanner.
Features like MV and Hints can apply to it directly.  And I tried to change
VolcanoPlanner to the new CascadePlanner in tests. Most tests passed.
Several cases did fail. I know the reason and how to fix them. But I am
still thinking about making them as "won't fix" as the ruleset violates
some basic principles of top-down trait requests.

The second difference is that our design have the ability for space
pruning. Currently it contains a simply LowerBoundCost metadata to compute
the lower bound of a RelNdoe. Because logical properties like cardinality
of a RelSet is not stable across exploring, it is required that a group to
be fully explored (implementation rules and enforcement rules should never
modify the logical properties) before it can provide a valid lower bound
cost. Because of that, logical search space pruning is not supported now.
It can only pruned out implementation rules and enforcement rules. Testing
with cases in our own product, the new planner saves about 10% rule
applies. I am still considering how to support logical space pruning,
looking forwards to have more improvements.

Hope my code will help.

Thanks,
Jinpeng


On Tue, Apr 28, 2020 at 11:22 AM Xiening Dai <xndai....@gmail.com> wrote:

> For #1, aside from that we need to be able to build physical nodes based
> on a convention. For example, if we merge two EnumerableProject, we would
> want to create an EnumerableProject as a result, instead of LogicalProject.
> The RelBuilder change I work on would help this case.
>
> For #2, I don’t think it’s just a bug. If the physical cost cannot be
> reliable before transformation is finished, we should probably delay the
> physical cost calculation, or we risk doing it over again. The other way is
> to complete RelSet transformation before implementing it - which is a
> common practice in industry, including Orca.
>
> The multi-convention is a key scenario, and I agree we should support. My
> thinking is more about seperating logical one (Conventions.NONE) from
> others.
>
>
> > On Apr 27, 2020, at 4:59 PM, Julian Hyde <jh...@apache.org> wrote:
> >
> > Re 1. By all means have multiple instances of a rule (e.g. one instance
> that matches LogicalFilter and another that matches FooFilter) and enable
> different instances during different phases. (We have been slow to create
> all of these variant instances, in part because of the difficulty of
> changing the constructors of many existing rules. My proposed change
> https://issues.apache.org/jira/browse/CALCITE-3923 <
> https://issues.apache.org/jira/browse/CALCITE-3923> would make it easier
> to create new instances that are more precisely targeted.)
> >
> > Re 2. Sure, there are bugs.
> >
> > Re 3. That’s a good idea. The planner could have a "single-convention
> mode", which might be faster, but would throw if it encountered a rel in a
> different convention.
> >
> > I think we’re in agreement - separating rules is a best practice. But we
> shouldn’t force that best practice on everyone. The multi-convention case
> is still crucial for planning hybrid queries (e.g. joining MySQL to
> MongoDB).
> >
> > Julian
> >
> >
> >> On Apr 27, 2020, at 4:28 PM, Xiening Dai <xndai....@gmail.com> wrote:
> >>
> >> Hi Julian,
> >>
> >> In my view, separating logic and physical rules have a number of
> benefits -
> >>
> >> 1. With current design, a rule can match both physical and logical
> nodes. This behavior could cause duplication of rule firings and explosion
> of memo and search space. There was a long discussion regarding this
> (CALCITE-2223). Although the indefinitely rule matching problem is fixed by
> a separate change, the duplicate rule firing is not resolved. There's a
> patch trying to address it (https://github.com/apache/calcite/pull/1543 <
> https://github.com/apache/calcite/pull/1543>), but it still fell short
> due to current design limitation.
> >>
> >> 2. We have a few meta inconsistency issues today which are due to the
> reality that we don’t clearly define transformation phase. For example, we
> don’t have a solution for CALCITE-2166 as long as transformation can still
> apply to a RelSet after it’s been implemented, which means the group
> logical properties (such as row count) can still change and invalidate all
> the previous best cost calculation, so best cost can increase (oops!).
> >>
> >> 3. The other benefit is the planner can choose shortcut a number of
> expensive operations (such as RelSubSet registration, cost calculation and
> propagation, etc) during transformation phase, if it was clearly defined
> and enforced by framework.
> >>
> >>
> >>> On Apr 27, 2020, at 11:59 AM, Julian Hyde <jh...@apache.org> wrote:
> >>>
> >>> This thread has almost gotten too long to respond to. I confess I’ve
> not read much of it. I’m going to reply anyway. Sorry.
> >>>
> >>> I support making Calcite’s optimizer support “Cascades”.
> >>>
> >>> We should keep the existing VolcanoPlanner working during the
> transition, and perhaps longer. (I acknowledge that this will not be easy.
> But it will not be impossible, because we already have 2 planner engines,
> and going from 2 to 3 is much harder than going from 1 to 2.)
> >>>
> >>> I perhaps have a different philosophy than Haisheng + Cascades on
> logical vs physical. I do believe that logical & physical rels and rules
> are more similar than they are different, therefore they should have the
> same classes, etc. Pragmatically, it makes a lot of sense to create rule
> sets and planner phases that deal with just logical rels, or just physical
> rels. But that’s a best practice, not a rule.
> >>>
> >>> This philosophy manifested in a discussion a couple of weeks ago about
> whether RelBuilder should be able to create physical rels. I still believe
> that it should.
> >>>
> >>> Julian
> >>>
> >>>
> >>>> On Apr 27, 2020, at 11:29 AM, Roman Kondakov
> <kondako...@mail.ru.INVALID> wrote:
> >>>>
> >>>> Hi all,
> >>>>
> >>>> Stamatis, Haisheng thank you very much for your feedback! I really
> >>>> appreciate it.
> >>>>
> >>>>> If in the new planner we end up copy-pasting code then I guess it
> will be a bad idea.
> >>>>
> >>>> Yes, there are some code duplication between Volcano planner and
> >>>> Cascades planner. I think I'll move it to some common superclass:
> either
> >>>> to existing AbstractRelOptPlanner or a new AbstractCostBasedPlanner.
> >>>>
> >>>>> Like that it will be easier for everybody to test and see if the
> changes make this better or worse. From a backward compatibility
> perspective it seems feasible to keep the new eatures configurable for a
> certain amount of time.
> >>>>
> >>>> I was thinking about backward compatibility in this way: what if we
> can
> >>>> switch planner in tests by setting some flag (system property?)
> >>>> somewhere and check what happens with new planner if we replace
> Volcano
> >>>> with it. And if ever it turns out that the new planner passes all
> >>>> Volcano's tests and at the same time it works more efficiently, we can
> >>>> safely replace Volcano planner with Cascades planner.
> >>>>
> >>>>> A design doc would definitely help, especially if it has a few
> end-to-end (from logical to physical plan) examples showing how the
> optimizer works
> >>>> at each step before/after the changes. This is actually what is
> usually
> >>>> missing in research papers that makes them hard to understand.
> >>>>> I am thinking some similar to the examples that Haisheng send in the
> first  email but possibly a bit more detailed.
> >>>>
> >>>> I agree, I'll add some exmples to the design doc very soon.
> >>>>
> >>>>> I looked very briefly in the PR by Roman but I think I didn't see
> tests where the final plan contains operators from multiple conventions.
> Multiple conventions is among the choices that complicate certain parts of
> he existing planner so we should make sure that we take this into account.
> >>>>
> >>>> It's on my radar. I'm going to add these tests.
> >>>>
> >>>> I would like to ask the commutnity about my next steps on the way from
> >>>> transition of the Cascades planner from the prototype status to
> becoming
> >>>> a part of the project. I see these steps like this:
> >>>>
> >>>> 1. Create a jira ticket.
> >>>> 2. Update design document with examples.
> >>>> 3. Make some research to obtain backward compatibility with Volcano
> >>>> planner to be able to replace Volcano planner with Cascades planner in
> >>>> test and to understand the current porblems with planner.
> >>>> 4. Solve known problems
> >>>> - materialized views
> >>>> - hints
> >>>> - multiple convetions
> >>>> - listener hooks
> >>>> - problems from p.3.
> >>>> 5. new PR, review and merge.
> >>>> 6. Replacie Volcano planner wiith Cascades after several releases.
> >>>>
> >>>> What do you think about this roadmap?
> >>>>
> >>>>
> >>>> --
> >>>> Kind Regards
> >>>> Roman Kondakov
> >>>>
> >>>>
> >>>> On 27.04.2020 01:55, Stamatis Zampetakis wrote:
> >>>>> Hi all,
> >>>>>
> >>>>> I am very excited about the ideas discussed so far and especially by
> the
> >>>>> enthusiasm of many people that are ready to help for pulling this
> out.
> >>>>> I wouldn't except that we could have a prototype so quickly.
> >>>>> Thanks a lot everyone!
> >>>>>
> >>>>> In the debate between creating new planner or patching the existing
> one, I
> >>>>> don't have a clear preference.
> >>>>> I think the answer depends on how many things can we reuse.
> >>>>> If in the new planner we end up copy-pasting code then I guess it
> will be a
> >>>>> bad idea.
> >>>>> On the other hand, if the new and old planner do not have many
> things in
> >>>>> common then I guess the answer is obvious.
> >>>>>
> >>>>> From Haisheng's description, I was thinking that many of the proposed
> >>>>> changes could go in the existing planner.
> >>>>> Like that it will be easier for everybody to test and see if the
> changes
> >>>>> make this better or worse.
> >>>>> From a backward compatibility perspective it seems feasible to keep
> the new
> >>>>> features configurable for a certain amount of time.
> >>>>>
> >>>>> From the wish-list, I think we should focus initially on points:
> >>>>> 1. Top-down trait request
> >>>>> 2. Convert traits without Abstract converters
> >>>>> 4. Bottom-up trait derivation
> >>>>>
> >>>>> I know that 3, and 5, are also important but I have the feeling they
> can
> >>>>> wait a bit longer.
> >>>>>
> >>>>> A design doc would definitely help, especially if it has a few
> end-to-end
> >>>>> (from logical to physical plan) examples showing how the optimizer
> works at
> >>>>> each step before/after the changes.
> >>>>> This is actually what is usually missing in research papers that
> makes them
> >>>>> hard to understand.
> >>>>> I am thinking some similar to the examples that Haisheng send in the
> first
> >>>>> email but possibly a bit more detailed.
> >>>>>
> >>>>> I looked very briefly in the PR by Roman but I think I didn't see
> tests
> >>>>> where the final plan contains operators from multiple conventions.
> >>>>> Multiple conventions is among the choices that complicate certain
> parts of
> >>>>> the existing planner so we should make sure that we take this into
> account.
> >>>>>
> >>>>> Hoping to find some time to think over all this more quietly. Very
> >>>>> interesting stuff :)
> >>>>>
> >>>>> Best,
> >>>>> Stamatis
> >>>>>
> >>>>> On Sun, Apr 26, 2020 at 11:14 PM Haisheng Yuan <hy...@apache.org>
> wrote:
> >>>>>
> >>>>>> Hi Roman,
> >>>>>>
> >>>>>> Excellent! This is definitely a helpful contribution to the Calcite
> >>>>>> community.
> >>>>>> Thank you for your endeavors.
> >>>>>>
> >>>>>> Haisheng
> >>>>>>
> >>>>>> On 2020/04/26 19:25:00, Roman Kondakov <kondako...@mail.ru.INVALID>
> >>>>>> wrote:
> >>>>>>> Hi everyone!
> >>>>>>>
> >>>>>>> Haisheng, thank you for bringing this subject up. A new
> Cascades-style
> >>>>>>> optimizer should be definitely the next step for Apache Calcite.
> Many
> >>>>>>> projects suffer from the lack of this kind of optimizer.
> >>>>>>>
> >>>>>>> That was the reason why several weeks ago I started working on the
> >>>>>>> prototype of Cascades optimizer for Apache Calcite. I was not sure
> that
> >>>>>>> I would build something useful without much experience in this
> area. But
> >>>>>>> now I'd like to share my work with the community. You can find the
> >>>>>>> Cascades prototype in PR [1]. This prototype is based on the
> well-known
> >>>>>>> paper [2].
> >>>>>>>
> >>>>>>> What prototype can do:
> >>>>>>> - Top-down trait request
> >>>>>>> - Convert traits without Abstract converters
> >>>>>>> - Top-down rule apply
> >>>>>>> - Bottom-up trait derivation
> >>>>>>>
> >>>>>>> What is not supported yet:
> >>>>>>> - Search space pruning (but I'm going to fix it in the next
> commits)
> >>>>>>> - Materialized views
> >>>>>>> - Planner hooks
> >>>>>>> - Hints
> >>>>>>> - Backward compatibility with Volcano planner (some research
> needed)
> >>>>>>>
> >>>>>>> I prepared a design doc for this planner [3], you can find many
> details
> >>>>>>> there. I also opened it for comments.
> >>>>>>>
> >>>>>>> I've written several basic test cases in
> >>>>>>> org.apache.calcite.plan.cascades.CascadesPlannerTest including
> that was
> >>>>>>> discussed in this thread previously in the context of trait
> requests:
> >>>>>>>
> >>>>>>>> MergeJoin hash[a]
> >>>>>>>>       |---- TableScan R hash[a] (RelSubset)
> >>>>>>>>       +---- TableScan S hash[a] (RelSubset)
> >>>>>>>
> >>>>>>>
> >>>>>>> Haisheng, this is very similar to what you propose. Answering your
> >>>>>> question:
> >>>>>>>
> >>>>>>>> There are 2 ways:
> >>>>>>>> a) Modify on current VolcanoPlanner.
> >>>>>>>> Pros: code reuse, existing numerous test cases and infrastructure,
> >>>>>> fast integration
> >>>>>>>> Cons: changing code always brings risk
> >>>>>>>>
> >>>>>>>> b) Add a new XXXXPlanner
> >>>>>>>> Pros: no risk, no tech debt, no need to worry about backward
> >>>>>> compatability
> >>>>>>>> Cons: separate test cases for new planner, one more planner to
> >>>>>> maintain
> >>>>>>>
> >>>>>>> I've chosen the second approach. Because I don't have clear
> >>>>>>> understanding how to fix Volcano planner gradually.
> >>>>>>>
> >>>>>>> This new planner is very far from perfect, but I think it can be a
> good
> >>>>>>> starting point for community.
> >>>>>>>
> >>>>>>> Please, share your thoughts about this planner.
> >>>>>>>
> >>>>>>>
> >>>>>>> [1] PR: https://github.com/apache/calcite/pull/1948
> >>>>>>> [2] Paper:
> >>>>>>>
> >>>>>>
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/xu-columbia-thesis1998.pdf
> >>>>>>> [3] Design doc:
> >>>>>>>
> >>>>>>
> https://docs.google.com/document/d/1qaV3eSKTw4gfLuBR3XB_LVIc51hxIng9k1J4VD6qnRg/edit?usp=sharing
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Kind Regards
> >>>>>>> Roman Kondakov
> >>>>>>>
> >>>>>>>
> >>>>>>> On 22.04.2020 09:52, Danny Chan wrote:
> >>>>>>>>> Is there any recommended approach to make that happen smoothly
> besides
> >>>>>>>> coding and testing work? We need to be aware that the new planner
> >>>>>> might be
> >>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never
> replace
> >>>>>>>> VolcanoPlanner.
> >>>>>>>>
> >>>>>>>> If that is true, i might say the new planner is probably with a
> not
> >>>>>> that
> >>>>>>>> good design, we expect to see in advance for what cases/reasons
> user
> >>>>>> has
> >>>>>>>> the reason to keep the old VolcanoPlanner and we *must* give a
> >>>>>> solution for
> >>>>>>>> those problems in the new design.
> >>>>>>>>
> >>>>>>>> I was expecting that migrating to a new planner would at least
> take 1
> >>>>>> year
> >>>>>>>> for developing, if that is true, modifying directly based on
> current
> >>>>>>>> planner means for the near future 3~4 versions Calcite, there
> would
> >>>>>> bring
> >>>>>>>> in huge plan changes/bugs for each release which i believe all the
> >>>>>> users of
> >>>>>>>> Calcite don't want to see. And on one can guarantee that modifying
> >>>>>> directly
> >>>>>>>> can keep good stability and compatibility, only the test set do.
> >>>>>>>>
> >>>>>>>> From the experience of Alibaba Blink planner which has
> contributed to
> >>>>>>>> Apache Flink, yes, the old/new planner would co-exist at least
> for 2
> >>>>>> years.
> >>>>>>>> For the reasons that the new and old planner has different
> ability in
> >>>>>> some
> >>>>>>>> corner cases.
> >>>>>>>>
> >>>>>>>> From my point of view, we should at least:
> >>>>>>>> - Give a convincing test set for the new planner that makes us
> believe
> >>>>>> the
> >>>>>>>> new planner is stable and powerful enough. I mean obviously the
> current
> >>>>>>>> rule tests are far away from enough to support the new planner
> >>>>>>>> - We should give a more detailed design doc about the new planner,
> >>>>>>>> especially about the interfaces changes and any change that would
> >>>>>> bring in
> >>>>>>>> the compatibility problem. Then we can make more accurate
> decision how
> >>>>>> much
> >>>>>>>> work the new planner would bring in, until then, we can decide if
> >>>>>> switch to
> >>>>>>>> a pure new planner development is a good idea or modify the
> existing
> >>>>>> one.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Haisheng Yuan <hy...@apache.org> 于2020年4月22日周三 上午9:45写道：
> >>>>>>>>
> >>>>>>>>> Hi Andrii,
> >>>>>>>>>
> >>>>>>>>>> Obviously, from what is written here, I could guess that this
> would
> >>>>>>>>> require me to change my physical planning rules, even if only by
> >>>>>>>>> implementing a marker interface.
> >>>>>>>>> You don't need to change your physical rules, it will be treated
> as
> >>>>>> equal
> >>>>>>>>> as logical rules and be applied together with the real logical
> rules,
> >>>>>> no
> >>>>>>>>> more logical/physical rules difference. This is also how current
> >>>>>>>>> VolcanoPlanner works.
> >>>>>>>>>
> >>>>>>>>>> I don't want you to think that I somehow resent the changes you
> are
> >>>>>>>>> pushing.
> >>>>>>>>> Don't get me wrong. I am seriously thinking of revert these
> changes,
> >>>>>> since
> >>>>>>>>> most people like the idea of adding new planner, why don't we
> make
> >>>>>> all the
> >>>>>>>>> plan changes in the new planner, instead of forcing people
> changing
> >>>>>> test
> >>>>>>>>> cases for the code changes that they might not need in
> VolcanoPlanner
> >>>>>>>>> during upgrade.
> >>>>>>>>>
> >>>>>>>>> I didn't intend to replace VolcanoPlanner, thought just change
> the
> >>>>>> search
> >>>>>>>>> strategy and add trait derivation mechanism, because most of the
> code
> >>>>>> in
> >>>>>>>>> VolcanoPlanner can be reused. But since many agree to add new
> planner
> >>>>>> and
> >>>>>>>>> replace VolcanoPlanner as the final goal, I won't be against most
> >>>>>> people's
> >>>>>>>>> decision.
> >>>>>>>>>
> >>>>>>>>> Is there any recommended approach to make that happen smoothly
> besides
> >>>>>>>>> coding and testing work? We need to be aware that the new planner
> >>>>>> might be
> >>>>>>>>> co-exist with VolcanoPlanner for 5 or more years, or even never
> >>>>>> replace
> >>>>>>>>> VolcanoPlanner.
> >>>>>>>>>
> >>>>>>>>> More thoughts are welcome.
> >>>>>>>>>
> >>>>>>>>> Haisheng
> >>>>>>>>>
> >>>>>>>>> On 2020/04/21 19:56:25, Андрей Цвелодуб <a.tsvelo...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>> Hello Haisheng,
> >>>>>>>>>>
> >>>>>>>>>>> To keep backward compatibility, all the un-marked rules will be
> >>>>>> treated
> >>>>>>>>>> as logical rules, except rules that uses AbstractConverter as
> rule
> >>>>>>>>> operand,
> >>>>>>>>>> these rules still need to applied top-down, or random order.
> >>>>>>>>>> Obviously, from what is written here, I could guess that this
> would
> >>>>>>>>> require
> >>>>>>>>>> me to change my physical planning rules, even if only by
> >>>>>> implementing a
> >>>>>>>>>> marker interface. I am not saying this is a bad thing, but this
> is a
> >>>>>>>>> thing
> >>>>>>>>>> that should be communicated and planned ahead in case the
> >>>>>> VolcanoPlanner
> >>>>>>>>> is
> >>>>>>>>>> modified.
> >>>>>>>>>>
> >>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and
> >>>>>> CALCITE-3753,
> >>>>>>>>>> because they will cause another tons of plan changes.
> >>>>>>>>>> I see you are still bitter due to all the discussions on this
> list
> >>>>>>>>> lately,
> >>>>>>>>>> I'm sorry. I don't want you to think that I somehow resent the
> >>>>>> changes
> >>>>>>>>> you
> >>>>>>>>>> are pushing, au contraire I support them and would be happy to
> help
> >>>>>> if I
> >>>>>>>>>> can. I just want the process of these changes to be executed in
> the
> >>>>>> best
> >>>>>>>>>> possible way.
> >>>>>>>>>> As I see there are already several opinions in this thread that
> >>>>>> basically
> >>>>>>>>>> align with what I am saying, so I guess I am not the crazy guy
> >>>>>> running
> >>>>>>>>>> around and yelling "the end is nigh!".
> >>>>>>>>>>
> >>>>>>>>>> Thank you for taking these mumbled thoughts into account.
> >>>>>>>>>>
> >>>>>>>>>> Bestest Regards,
> >>>>>>>>>> Andrii Tsvielodub
> >>>>>>>>>>
> >>>>>>>>>> On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <hy...@apache.org>
> >>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Hi Andrii,
> >>>>>>>>>>>
> >>>>>>>>>>>> I guess changing the planner would lead to changes in tons of
> rules
> >>>>>>>>> and
> >>>>>>>>>>> even more tests.
> >>>>>>>>>>> Obviously you didn't read through my email. You are not
> required to
> >>>>>> do
> >>>>>>>>> any
> >>>>>>>>>>> changes to your rule if you don't want to, but if you do, just
> need
> >>>>>> to
> >>>>>>>>> mark
> >>>>>>>>>>> the rule to tell planner whether it is a physical rule or not,
> >>>>>> simply
> >>>>>>>>> by
> >>>>>>>>>>> implementing an empty interface.
> >>>>>>>>>>>
> >>>>>>>>>>>> many on this list already experienced problems with upgrading
> even
> >>>>>>>>>>> between the minor versions of Calcite.
> >>>>>>>>>>> Sorry to see the problem you have experienced when upgrading
> >>>>>> Calcite.
> >>>>>>>>>>> Looks like I have to revert changes in CALCITE-2970 and
> >>>>>> CALCITE-3753,
> >>>>>>>>>>> because they will cause another tons of plan changes.
> >>>>>>>>>>>
> >>>>>>>>>>> But I will see if I can add a setting to use the old search
> >>>>>> strategy,
> >>>>>>>>>>> which can be left untouched.
> >>>>>>>>>>>
> >>>>>>>>>>> Haisheng
> >>>>>>>>>>>
> >>>>>>>>>>> On 2020/04/21 06:33:08, Андрей Цвелодуб <a.tsvelo...@gmail.com
> >
> >>>>>> wrote:
> >>>>>>>>>>>> Hello everyone,
> >>>>>>>>>>>>
> >>>>>>>>>>>> First of all, thanks for this great effort of improving the
> core
> >>>>>>>>> parts of
> >>>>>>>>>>>> the framework we all are using,
> >>>>>>>>>>>> I believe this is long overdue and hope this will have
> benefits
> >>>>>> both
> >>>>>>>>> for
> >>>>>>>>>>>> the maintainers and users of the library.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't have anything to say about the general idea at the
> moment,
> >>>>>>>>>>>> but I want to make a point that maintaining the old
> implementation
> >>>>>> of
> >>>>>>>>>>>> VolcanoPlanner during
> >>>>>>>>>>>> the initial stages of implementing the new planner is
> absolutely
> >>>>>>>>>>> CRITICAL.
> >>>>>>>>>>>> As a lot of users of Calcite do various customizations to the
> >>>>>>>>> engine, to
> >>>>>>>>>>>> the rules
> >>>>>>>>>>>> and all that is there in between, I believe changing the
> >>>>>>>>> implementation
> >>>>>>>>>>> of
> >>>>>>>>>>>> the core component
> >>>>>>>>>>>> would have a huge impact on most users of the library. I
> think many
> >>>>>>>>> on
> >>>>>>>>>>> this
> >>>>>>>>>>>> list
> >>>>>>>>>>>> already experienced problems with upgrading even between the
> minor
> >>>>>>>>>>> versions
> >>>>>>>>>>>> of Calcite,
> >>>>>>>>>>>> so I guess changing the planner would lead to changes in tons
> of
> >>>>>>>>> rules
> >>>>>>>>>>> and
> >>>>>>>>>>>> even more tests.
> >>>>>>>>>>>>
> >>>>>>>>>>>> I don't have anything against replacing VolcanoPlanner as a
> final
> >>>>>>>>> goal of
> >>>>>>>>>>>> this effort,
> >>>>>>>>>>>> but I don't think that modifying it directly and merging it to
> >>>>>>>>> master is
> >>>>>>>>>>> a
> >>>>>>>>>>>> viable development approach.
> >>>>>>>>>>>> While I understand how burdensome it is to maintain several
> >>>>>> parallel
> >>>>>>>>> core
> >>>>>>>>>>>> components at once
> >>>>>>>>>>>> (we did this while moving the engine of our product to
> Calcite), we
> >>>>>>>>>>> should
> >>>>>>>>>>>> still respect those who depend
> >>>>>>>>>>>> on it and not introduce the risks related to the development
> of a
> >>>>>> new
> >>>>>>>>>>>> component into existing processing flows.
> >>>>>>>>>>>>
> >>>>>>>>>>>> A good model to try to follow would be the way new Garbage
> >>>>>>>>> Collectors are
> >>>>>>>>>>>> introduced in Java.
> >>>>>>>>>>>> First, add it as an experimental option, then make it
> generally
> >>>>>>>>>>> available,
> >>>>>>>>>>>> then after everyone agrees
> >>>>>>>>>>>> this is the best option - make it the default one.
> >>>>>>>>>>>> With this approach, everyone can then move to the new planner
> at
> >>>>>>>>> their
> >>>>>>>>>>> own
> >>>>>>>>>>>> pace, guaranteeing a smooth transition overall.
> >>>>>>>>>>>> Yes, this could take some time, maybe even a year, but this
> is the
> >>>>>>>>> price
> >>>>>>>>>>> of
> >>>>>>>>>>>> doing major changes in a popular framework.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Again, thank you for initiating this discussion and leading
> this
> >>>>>>>>> effort.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Best Regards,
> >>>>>>>>>>>> Andrii Tsvielodub
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu <wjpabc...@gmail.com
> >
> >>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Hi, Xiening.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding calculating the logical cost, here are some ways I
> >>>>>>>>> though:
> >>>>>>>>>>>>> 1. Logical rel may implement their own computeSelfCost
> method.
> >>>>>> Some
> >>>>>>>>>>>>> rels can provide such information, for example the
> >>>>>>>>>>>>> LogicalProject/LogicalFilter contains nearly the same
> information
> >>>>>>>>> as
> >>>>>>>>>>> their
> >>>>>>>>>>>>> physical implementations. If we don't have enough
> confidence, just
> >>>>>>>>>>> return
> >>>>>>>>>>>>> zeroCost is also OK, as it only affects pruning.
> >>>>>>>>>>>>> 2. Logical rel  tells its parents what its physical input
> could be
> >>>>>>>>>>> after
> >>>>>>>>>>>>> implementation. Then the problem come back to calculating
> lower
> >>>>>>>>> bound
> >>>>>>>>>>> of a
> >>>>>>>>>>>>> physical rel.
> >>>>>>>>>>>>> There should always be ways. The only problem is how to find
> a
> >>>>>>>>> pretty
> >>>>>>>>>>> one.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Regarding the risk, new planner do have different risk. It
> is not
> >>>>>>>>>>> because
> >>>>>>>>>>>>> new planner could stop us doing something wrong but we can
> decide
> >>>>>>>>> when
> >>>>>>>>>>> to
> >>>>>>>>>>>>> use the new one. Some scenarios:
> >>>>>>>>>>>>> 1. If modifying the VolcanoPlanner directly, the only way
> user
> >>>>>>>>> could
> >>>>>>>>>>>>> control the risk is not to upgrade calcite version until it
> is
> >>>>>>>>>>> considered
> >>>>>>>>>>>>> stable. You know, it is quite different from keeping calcite
> >>>>>>>>> updated
> >>>>>>>>>>> and
> >>>>>>>>>>>>> switching to the new planner at a proper time.
> >>>>>>>>>>>>> 2. It is very importance for SLA control. For the important
> >>>>>>>>> business
> >>>>>>>>>>> and
> >>>>>>>>>>>>> jobs, we may keep using the old and stable planner. And use
> the
> >>>>>>>>> new one
> >>>>>>>>>>>>> only for jobs that have fault tolerance. And this helps
> testing
> >>>>>> new
> >>>>>>>>>>> planner
> >>>>>>>>>>>>> with actual scenarios.
> >>>>>>>>>>>>> 3. It is helpful when upgrading online services. When the new
> >>>>>>>>> planner
> >>>>>>>>>>>>> happened to have some bugs, we can switch to the old planner
> >>>>>>>>> directly
> >>>>>>>>>>>>> without rollback the whole service.
> >>>>>>>>>>>>> 4. With all these ways to prevent issues becoming disasters,
> we
> >>>>>>>>> are not
> >>>>>>>>>>>>> vulnerable to making mistakes. This not only enables faster
> >>>>>>>>> iterations
> >>>>>>>>>>> but
> >>>>>>>>>>>>> also let us have enough time to resolve big bugs, like
> considering
> >>>>>>>>> it
> >>>>>>>>>>> in
> >>>>>>>>>>>>> detail and applying a time-consuming refactoring for it. To
> work
> >>>>>>>>>>> around a
> >>>>>>>>>>>>> critical bug using tricky ways usually introduces more
> issues.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>> Jinpeng
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <
> xndai....@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Hi Jinpeng,
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regarding this comment - I believe there are ways to
> calculate
> >>>>>>>>> the
> >>>>>>>>>>>>> logical
> >>>>>>>>>>>>>> cost, but I think it’s not that simple as  "cardinality *
> >>>>>>>>>>>>> unit_copy_cost.”,
> >>>>>>>>>>>>>> would  you provide more details of other different ways?
> Just the
> >>>>>>>>>>>>> algorithm
> >>>>>>>>>>>>>> description or pseudo code would help us understand. Thanks.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Regarding the approach of creating new planner, I don’t
> think a
> >>>>>>>>> new
> >>>>>>>>>>>>>> planner would lower the risk. We don’t know what we don’t
> know.
> >>>>>>>>> If we
> >>>>>>>>>>>>>> introduced an issue while modifying the planner, most
> likely we
> >>>>>>>>>>> would do
> >>>>>>>>>>>>>> the same with new planner class. A new planner doesn’t
> >>>>>>>>> necessarily
> >>>>>>>>>>>>> prevent
> >>>>>>>>>>>>>> the issue from happening, but just delay its surfacing,
> which is
> >>>>>>>>>>> worse
> >>>>>>>>>>>>> IMO.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> There’s one obvious benefit with new planner is that we can
> >>>>>>>>> provide
> >>>>>>>>>>> some
> >>>>>>>>>>>>>> sort of isolation so the change won’t cause test baseline
> >>>>>>>>> updates,
> >>>>>>>>>>> which
> >>>>>>>>>>>>>> could be painful at times.  We should see if we could use
> planner
> >>>>>>>>>>> config
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>> achieve the same if we decide to just modify the current
> planner.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <
> hy...@apache.org>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Igor,
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> That's great.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On 2020/04/20 11:17:49, Seliverstov Igor <
> gvvinbl...@gmail.com
> >>>>>>>>>>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>> Haisheng, Xiening,
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Ok, Now I see how it should work.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks for your replies.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> 20 апр. 2020 г., в 09:56, Seliverstov Igor <
> >>>>>>>>> gvvinbl...@gmail.com
> >>>>>>>>>>>>
> >>>>>>>>>>>>>> написал(а):
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Haisheng, Xiening,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks for clarifying.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> In this proposal, we are not trying to split logical and
> >>>>>>>>> physical
> >>>>>>>>>>>>>> planning entirely. - actually I was in doubt about an idea
> of
> >>>>>>>>> entire
> >>>>>>>>>>>>>> splitting logical and physical phases, if you aren't going
> to, I
> >>>>>>>>>>> have no
> >>>>>>>>>>>>>> objections.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> But it returns me to my first question: how we will
> propagate
> >>>>>>>>>>> traits
> >>>>>>>>>>>>>> in bottom-up manner using proposed approach. (See [DISCUSS]
> >>>>>>>>> Proposal
> >>>>>>>>>>> to
> >>>>>>>>>>>>> add
> >>>>>>>>>>>>>> API to force rules matching specific rels for details)
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> One of inconveniences of current VolcanoPlanner
> >>>>>>>>> implementation is
> >>>>>>>>>>>>>> amount of tricks that we need to get desired behaviour. It
> would
> >>>>>>>>> be
> >>>>>>>>>>> great
> >>>>>>>>>>>>>> if some of issues (or all of them) were solved in the new
> >>>>>>>>> approach.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> пн, 20 апр. 2020 г., 7:02 Xiening Dai <
> xndai....@gmail.com
> >>>>>>>>>>> <mailto:
> >>>>>>>>>>>>>> xndai....@gmail.com>>:
> >>>>>>>>>>>>>>>>> Hi Igor,
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Your comment - "because actual cost may be calculated
> >>>>>>>>> correctly
> >>>>>>>>>>> using
> >>>>>>>>>>>>>> physical operators only. So won't be able to implement
> Branch and
> >>>>>>>>>>> Bound
> >>>>>>>>>>>>>> Space Pruning.“ is actually not true. In Cascade’s lower
> bound /
> >>>>>>>>>>> upper
> >>>>>>>>>>>>>> bound pruning algorithm, you can get cost lower bound of
> input
> >>>>>>>>>>> RelNode
> >>>>>>>>>>>>>> using cardinality * unit_copy_cost. The unit_copy_cost is a
> >>>>>>>>> constant,
> >>>>>>>>>>>>> which
> >>>>>>>>>>>>>> stands for the minimal cost for processing a tuple from the
> >>>>>>>>> input. So
> >>>>>>>>>>>>> this
> >>>>>>>>>>>>>> will be the minimal cost the input RelNode can achieve, and
> if
> >>>>>>>>> this
> >>>>>>>>>>> is
> >>>>>>>>>>>>>> indeed larger than the current best cost, this input node
> can be
> >>>>>>>>>>> pruned.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> In this proposal, we are not trying to split logical and
> >>>>>>>>> physical
> >>>>>>>>>>>>>> planning entirely. But for any given equivalent set, we
> would
> >>>>>>>>> need to
> >>>>>>>>>>>>>> finish exploration before implementation. But for the
> entire memo
> >>>>>>>>>>> tree,
> >>>>>>>>>>>>>> each set could be in different planning stage, and an
> equivalent
> >>>>>>>>> set
> >>>>>>>>>>> can
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>> pruned even before it’s implemented. A text book example of
> >>>>>>>>>>>>> aforementioned
> >>>>>>>>>>>>>> “lower bound / upper bound pruning” would be the join
> ordering
> >>>>>>>>> case.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Regarding #3, I think we can still achieve that
> (partially)
> >>>>>>>>>>> through
> >>>>>>>>>>>>>> this proposal. Remember every time when we start the
> >>>>>>>>> optimization, we
> >>>>>>>>>>>>> pass
> >>>>>>>>>>>>>> down an upper bound limit. Initially this upper bound for
> root is
> >>>>>>>>>>>>> infinite
> >>>>>>>>>>>>>> - meaning that no feasible plan is available. Every time
> when we
> >>>>>>>>>>> find a
> >>>>>>>>>>>>>> physical plan we update this upper bound, then start the
> search
> >>>>>>>>>>> again. We
> >>>>>>>>>>>>>> could stop the search when the cost is less than a
> pre-defined
> >>>>>>>>>>> threshold
> >>>>>>>>>>>>> -
> >>>>>>>>>>>>>> which gives you a “good enough” plan with early termination.
> >>>>>>>>> Still
> >>>>>>>>>>> this
> >>>>>>>>>>>>>> wouldn't avoid the logical exploration. For that, you would
> >>>>>>>>> probably
> >>>>>>>>>>>>>> archive through rule configurations, and avoid some
> expansive
> >>>>>>>>>>>>>> transformation to keep the cost down.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan <
> >>>>>>>>> hy...@apache.org
> >>>>>>>>>>>>> <mailto:
> >>>>>>>>>>>>>> hy...@apache.org>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Igor,
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> a) Given current Calcite's stats derivation strategy,
> mixing
> >>>>>>>>>>> logical
> >>>>>>>>>>>>>> and physical planning won't make it better. I hope you went
> >>>>>>>>> through
> >>>>>>>>>>> my
> >>>>>>>>>>>>>> email to the end, currently operators inside a memo group
> don't
> >>>>>>>>> share
> >>>>>>>>>>>>> stats
> >>>>>>>>>>>>>> info, each operator's stats may differ with the other one,
> and
> >>>>>>>>> the
> >>>>>>>>>>>>>> RelSubset's best node and stats may vary during
> optimization. So
> >>>>>>>>>>> logical
> >>>>>>>>>>>>>> and physical rule pruning are not safe at the moment,
> otherwise
> >>>>>>>>> it
> >>>>>>>>>>> almost
> >>>>>>>>>>>>>> implies changing downstream project's cost model silently.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On the other hand, ensuring child group exploration task
> >>>>>>>>> finish
> >>>>>>>>>>>>> first
> >>>>>>>>>>>>>> will make rule mutual exclusivity check possible, like the
> >>>>>>>>> result of
> >>>>>>>>>>>>>> ReduceExpressionRule won't need trigger the same rule
> again, The
> >>>>>>>>> join
> >>>>>>>>>>>>>> result of JoinCommuteRule won't need trigger
> JoinCommuteRule and
> >>>>>>>>>>>>>> ReduceExpressionRule again.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> More importantly, most if not all the the long planning
> >>>>>>>>> queries
> >>>>>>>>>>> in
> >>>>>>>>>>>>>> Calcite are not caused by too many alternatives, but mutual
> >>>>>>>>>>> triggering,
> >>>>>>>>>>>>> or
> >>>>>>>>>>>>>> cyclic triggering, which can be avoided by customizing the
> rules
> >>>>>>>>>>> instead
> >>>>>>>>>>>>> of
> >>>>>>>>>>>>>> using the default one. Unless you use dynamic programming
> >>>>>>>>> (Calcite
> >>>>>>>>>>> use
> >>>>>>>>>>>>>> heuristic) on join reordering (left-deep, right-deep,
> bushy),
> >>>>>>>>> space
> >>>>>>>>>>>>> pruning
> >>>>>>>>>>>>>> won't help make long / infinite running query faster.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> b) No evidence shows current version of Calcite will
> return
> >>>>>>>>> the
> >>>>>>>>>>> most
> >>>>>>>>>>>>>> promising plan in first planning iteration. Instead of
> praying
> >>>>>>>>> for
> >>>>>>>>>>>>> getting
> >>>>>>>>>>>>>> good enough plan in the first iteration, why not focus on
> fixing
> >>>>>>>>>>> rules
> >>>>>>>>>>>>> that
> >>>>>>>>>>>>>> causes the issue?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> c) That is not the goal.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On 2020/04/19 15:14:57, Seliverstov Igor <
> >>>>>>>>> gvvinbl...@gmail.com
> >>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> >>>>>>>>>>>>>>>>>>> Haisheng,
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> From my point of view splitting logical and physical
> >>>>>>>>> planning
> >>>>>>>>>>> parts
> >>>>>>>>>>>>>> isn’t a good idea.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> I think so because actual cost may be calculated
> correctly
> >>>>>>>>>>> using
> >>>>>>>>>>>>>> physical operators only. So that we:
> >>>>>>>>>>>>>>>>>>> a) won't be able to implement Branch and Bound Space
> >>>>>>>>> Pruning
> >>>>>>>>>>> (as
> >>>>>>>>>>>>> far
> >>>>>>>>>>>>>> as I understand, at exploring time there are no physical
> >>>>>>>>> operators,
> >>>>>>>>>>> no
> >>>>>>>>>>>>>> correct costs, but assumptions only, I don’t think we should
> >>>>>>>>> rely on
> >>>>>>>>>>>>> them)
> >>>>>>>>>>>>>>>>>>> b) won’t be able to get good enough plan (without
> Branch
> >>>>>>>>> and
> >>>>>>>>>>> Bound
> >>>>>>>>>>>>>> Space Pruning it’s non-trivial task to get right search
> >>>>>>>>> direction and
> >>>>>>>>>>>>> most
> >>>>>>>>>>>>>> promising plans in first planning iterations)
> >>>>>>>>>>>>>>>>>>> c) won’t be able to tune the good enough plan during
> >>>>>>>>> several
> >>>>>>>>>>>>> similar
> >>>>>>>>>>>>>> queries are executed
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> 19 апр. 2020 г., в 17:37, Haisheng Yuan <
> hy...@apache.org
> >>>>>>>>>>>>> <mailto:
> >>>>>>>>>>>>>> hy...@apache.org>> написал(а):
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi Igor,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> You can't have your cake and eat it.
> >>>>>>>>>>>>>>>>>>>> But one feasible work item definitely we can do is
> that
> >>>>>>>>> once
> >>>>>>>>>>>>>> timeout, stop exploring, use the first available physical
> >>>>>>>>> operator in
> >>>>>>>>>>>>> each
> >>>>>>>>>>>>>> group and optimize it.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Because most, if not all, of the long / infinite
> running
> >>>>>>>>>>>>>> optimizations are caused by project related transpose, join
> >>>>>>>>>>> reordering
> >>>>>>>>>>>>>> (join commute + associative), constant reduction for large
> >>>>>>>>> expression
> >>>>>>>>>>>>> (see
> >>>>>>>>>>>>>> CALCITE-3460), all of which are logical transformations
> rules and
> >>>>>>>>>>> many of
> >>>>>>>>>>>>>> which have corresponding JIRAs. So another alternative is,
> let's
> >>>>>>>>> fix
> >>>>>>>>>>>>> these
> >>>>>>>>>>>>>> bugs to improve Calcite to make timeout option less usable.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Another thing worth noting is that sql server
> optimizer
> >>>>>>>>>>> timeout
> >>>>>>>>>>>>>> mechanism is not based on clock time, but the number of
> >>>>>>>>> optimization
> >>>>>>>>>>>>> tasks
> >>>>>>>>>>>>>> it has done [1].
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> >>>>>>>>>>>>>> <
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>> Haisheng
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On 2020/04/19 11:31:27, Seliverstov Igor <
> >>>>>>>>>>> gvvinbl...@gmail.com
> >>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> >>>>>>>>>>>>>>>>>>>>> Haisheng,
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Ok, then such notification isn't needed
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> But in this case we don't have any control over how
> long
> >>>>>>>>>>> planning
> >>>>>>>>>>>>>> takes
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> For some systems it's necessary to get good enough
> plan
> >>>>>>>>>>> right now
> >>>>>>>>>>>>>> instead
> >>>>>>>>>>>>>>>>>>>>> of best one after while
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> For example we've been considering a case when a
> query is
> >>>>>>>>>>>>>> optimised several
> >>>>>>>>>>>>>>>>>>>>> times in short iterations in case it's impossible to
> get
> >>>>>>>>> the
> >>>>>>>>>>> best
> >>>>>>>>>>>>>> plan in
> >>>>>>>>>>>>>>>>>>>>> reasonable period of time (for example there is an
> SLA
> >>>>>>>>> for
> >>>>>>>>>>>>>> response time)
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> This mean we need all needed physical implementations
> >>>>>>>>> after
> >>>>>>>>>>> each
> >>>>>>>>>>>>>> logical
> >>>>>>>>>>>>>>>>>>>>> transformation is applied.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> вс, 19 апр. 2020 г., 13:55 Haisheng Yuan <
> >>>>>>>>> hy...@apache.org
> >>>>>>>>>>>>>> <mailto:hy...@apache.org>>:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Hi Igor,
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> There will be no rule queue anymore. Y will be fully
> >>>>>>>>>>> explored
> >>>>>>>>>>>>>> (logical
> >>>>>>>>>>>>>>>>>>>>>> rules are matched and applied) before it can be
> >>>>>>>>> implemented
> >>>>>>>>>>> and
> >>>>>>>>>>>>>> optimized.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Thanks,
> >>>>>>>>>>>>>>>>>>>>>> Haisheng
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On 2020/04/19 10:11:45, Seliverstov Igor <
> >>>>>>>>>>> gvvinbl...@gmail.com
> >>>>>>>>>>>>>> <mailto:gvvinbl...@gmail.com>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>> Hi Haisheng,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Great explanation, thanks.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> One thing I'd like to cover in advance is trait
> >>>>>>>>> propagation
> >>>>>>>>>>>>>> process (I
> >>>>>>>>>>>>>>>>>>>>>> need
> >>>>>>>>>>>>>>>>>>>>>>> it for Apache Ignite SQL engine implementation).
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> For example:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> There are two nodes: Rel X and its child node Rel Y
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Both nodes are in Optimized state, and there is a
> >>>>>>>>> Logical
> >>>>>>>>>>> rule
> >>>>>>>>>>>>>> for Rel Y
> >>>>>>>>>>>>>>>>>>>>>> in
> >>>>>>>>>>>>>>>>>>>>>>> a rules queue matched,
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> After logical rule is applied, there is a logical
> rel
> >>>>>>>>> Rel
> >>>>>>>>>>> Y'
> >>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> containing
> >>>>>>>>>>>>>>>>>>>>>>> it set moves into Exploring state (since there is a
> >>>>>>>>> logical
> >>>>>>>>>>>>> node
> >>>>>>>>>>>>>> which
> >>>>>>>>>>>>>>>>>>>>>> has
> >>>>>>>>>>>>>>>>>>>>>>> to be implemented)
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> After whole process Exploring -> ... -> Optimized
> we
> >>>>>>>>> need
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> force parent
> >>>>>>>>>>>>>>>>>>>>>>> Rel X set to switch its state from Optimized to
> >>>>>>>>> Optimizing
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> peek the
> >>>>>>>>>>>>>>>>>>>>>>> newly implemented child and (possible) produce a
> new
> >>>>>>>>> Rel X'
> >>>>>>>>>>>>>> physical rel
> >>>>>>>>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>>>>>> the basis of previously requested from the Rel X
> set
> >>>>>>>>>>> traits.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> If a new Rel X' is produced, a Rel X' parent should
> >>>>>>>>> move
> >>>>>>>>>>> its
> >>>>>>>>>>>>>> state
> >>>>>>>>>>>>>>>>>>>>>>> Optimized -> Optimizing and repeat described above
> >>>>>>>>>>> operations.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Does it look like true?
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Regards,
> >>>>>>>>>>>>>>>>>>>>>>> Igor
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> вс, 19 апр. 2020 г., 6:52 Haisheng Yuan <
> >>>>>>>>>>>>> h.y...@alibaba-inc.com
> >>>>>>>>>>>>>> <mailto:h.y...@alibaba-inc.com>>:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hi,
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> In the past few months, we have discussed a lot
> about
> >>>>>>>>>>> Cascades
> >>>>>>>>>>>>>> style
> >>>>>>>>>>>>>>>>>>>>>>>> top-down optimization, including on-demand trait
> >>>>>>>>>>>>>> derivation/request,
> >>>>>>>>>>>>>>>>>>>>>> rule
> >>>>>>>>>>>>>>>>>>>>>>>> apply, branch and bound space pruning. Now we
> think
> >>>>>>>>> it is
> >>>>>>>>>>> time
> >>>>>>>>>>>>>> to move
> >>>>>>>>>>>>>>>>>>>>>>>> towards these targets.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> We will separate it into several small issues, and
> >>>>>>>>> each
> >>>>>>>>>>> one
> >>>>>>>>>>>>> can
> >>>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>>>> integrated as a standalone, independent feature,
> and
> >>>>>>>>> most
> >>>>>>>>>>>>>> importantly,
> >>>>>>>>>>>>>>>>>>>>>>>> meanwhile keep backward compatibility.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 1. Top-down trait request
> >>>>>>>>>>>>>>>>>>>>>>>> In other words, pass traits requirements from
> parent
> >>>>>>>>>>> nodes to
> >>>>>>>>>>>>>> child
> >>>>>>>>>>>>>>>>>>>>>> nodes.
> >>>>>>>>>>>>>>>>>>>>>>>> The trait requests happens after all the logical
> >>>>>>>>>>>>> transformation
> >>>>>>>>>>>>>> rules
> >>>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>> physical implementation rules are done, in a
> top-down
> >>>>>>>>>>> manner,
> >>>>>>>>>>>>>> driven
> >>>>>>>>>>>>>>>>>>>>>> from
> >>>>>>>>>>>>>>>>>>>>>>>> root set. e.g.:
> >>>>>>>>>>>>>>>>>>>>>>>> SELECT a, sum(c) FROM
> >>>>>>>>>>>>>>>>>>>>>>>> (SELECT * FROM R JOIN S USING (a, b)) t
> >>>>>>>>>>>>>>>>>>>>>>>> GROUP BY a;
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Suppose we have the following plan tree in the
> MEMO,
> >>>>>>>>> and
> >>>>>>>>>>> let's
> >>>>>>>>>>>>>> only
> >>>>>>>>>>>>>>>>>>>>>>>> consider distribution for simplicity, each group
> >>>>>>>>>>> represents a
> >>>>>>>>>>>>>> RelSet
> >>>>>>>>>>>>>>>>>>>>>> in the
> >>>>>>>>>>>>>>>>>>>>>>>> MEMO.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Group 1:     Agg on [a]
> >>>>>>>>>>>>>>>>>>>>>>>> Group 2:           +-- MergeJoin on [a, b]
> >>>>>>>>>>>>>>>>>>>>>>>> Group 3:                       |--- TableScan R
> >>>>>>>>>>>>>>>>>>>>>>>> Group 4:                       +--- TableScan S
> >>>>>>>>>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits |
> Required
> >>>>>>>>>>> Traits |
> >>>>>>>>>>>>>>>>>>>>>>>> | ----------- | ------------- | --------------- |
> >>>>>>>>>>>>>> --------------- |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group 1  | Aggregate    | Hash[a]         | N/A
> >>>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group 2  | MergeJoin    | Hash[a,b]      |
> Hash[a]
> >>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group 3  | TableScan R | None            |
> Hash[a,b]
> >>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group 4  | TableScan S | None            |
> Hash[a,b]
> >>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> We will add new interface PhysicalNode (or
> extending
> >>>>>>>>>>> RelNode)
> >>>>>>>>>>>>>> with
> >>>>>>>>>>>>>>>>>>>>>>>> methods:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> - Pair<RelTraitSet,List<RelTraitSet>>
> >>>>>>>>>>>>> requireTraits(RelTraitSet
> >>>>>>>>>>>>>>>>>>>>>> required);
> >>>>>>>>>>>>>>>>>>>>>>>> pair.left is the current operator's new traitset,
> >>>>>>>>>>> pair.right
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>> list
> >>>>>>>>>>>>>>>>>>>>>>>> of children's required traitset.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> - RelNode passThrough(RelTraitSet required);
> >>>>>>>>>>>>>>>>>>>>>>>> Default implementation will call above method
> >>>>>>>>>>> requireTraits()
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>> RelNode.copy() to create new RelNode. Available
> to be
> >>>>>>>>>>>>> overriden
> >>>>>>>>>>>>>> for
> >>>>>>>>>>>>>>>>>>>>>>>> physical operators to customize the logic.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> The planner will call above method on MergeJoin
> >>>>>>>>> operator
> >>>>>>>>>>> to
> >>>>>>>>>>>>>> pass the
> >>>>>>>>>>>>>>>>>>>>>>>> required traits (Hash[a]) to Mergejoin's child
> >>>>>>>>> operators.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> We will get a new MergeJoin:
> >>>>>>>>>>>>>>>>>>>>>>>> MergeJoin hash[a]
> >>>>>>>>>>>>>>>>>>>>>>>>    |---- TableScan R hash[a] (RelSubset)
> >>>>>>>>>>>>>>>>>>>>>>>>    +---- TableScan S hash[a] (RelSubset)
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Now the MEMO group looks like:
> >>>>>>>>>>>>>>>>>>>>>>>> | Group No | Operator    | Derived Traits       |
> >>>>>>>>> Required
> >>>>>>>>>>>>>> Traits
> >>>>>>>>>>>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | ---------- | -------- ----- |
> -------------------- |
> >>>>>>>>>>>>>>>>>>>>>>>> --------------------- |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group1   | Aggregate   | Hash[a]
>  |
> >>>>>>>>> N/A
> >>>>>>>>>>>>>>>>>>>>>>>>  |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group2   | MergeJoin   | Hash[a,b], Hash[a]|
> Hash[a]
> >>>>>>>>>>>>>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group3   | TableScan R | None
>   |
> >>>>>>>>>>>>> Hash[a,b],
> >>>>>>>>>>>>>>>>>>>>>> Hash[a]
> >>>>>>>>>>>>>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>> | Group4   | TableScan S | None
>   |
> >>>>>>>>>>>>> Hash[a,b],
> >>>>>>>>>>>>>>>>>>>>>> Hash[a]
> >>>>>>>>>>>>>>>>>>>>>>>> |
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Calcite user may choose to ignore / not implement
> the
> >>>>>>>>>>>>> interface
> >>>>>>>>>>>>>> to keep
> >>>>>>>>>>>>>>>>>>>>>>>> the original behavior. Each physical operator,
> >>>>>>>>> according
> >>>>>>>>>>> to
> >>>>>>>>>>>>> its
> >>>>>>>>>>>>>> own
> >>>>>>>>>>>>>>>>>>>>>> logic,
> >>>>>>>>>>>>>>>>>>>>>>>> decides whether passThrough() should pass traits
> down
> >>>>>>>>> or
> >>>>>>>>>>> not
> >>>>>>>>>>>>> by
> >>>>>>>>>>>>>>>>>>>>>> returning:
> >>>>>>>>>>>>>>>>>>>>>>>> - a non-null RelNode, which means it can pass down
> >>>>>>>>>>>>>>>>>>>>>>>> - null object, which means can't pass down
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 2. Provide option to disable AbstractConverter
> >>>>>>>>>>>>>>>>>>>>>>>> Once the plan can request traits in top-down way
> in
> >>>>>>>>> the
> >>>>>>>>>>>>>> framework, many
> >>>>>>>>>>>>>>>>>>>>>>>> system don't need AbstractConverter anymore,
> since it
> >>>>>>>>> is
> >>>>>>>>>>> just
> >>>>>>>>>>>>> a
> >>>>>>>>>>>>>>>>>>>>>>>> intermediate operator to generate physical sort /
> >>>>>>>>>>> exchange.
> >>>>>>>>>>>>> For
> >>>>>>>>>>>>>> those,
> >>>>>>>>>>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>>>> can provide option to disable AbstractConverter,
> >>>>>>>>> generate
> >>>>>>>>>>>>>> physical
> >>>>>>>>>>>>>>>>>>>>>> enforcer
> >>>>>>>>>>>>>>>>>>>>>>>> directly by adding a method to interface
> Convention:
> >>>>>>>>>>>>>>>>>>>>>>>> - RelNode enforce(RelNode node, RelTraitSet
> traits);
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> The default implementation may just calling
> >>>>>>>>>>>>>>>>>>>>>> changeTraitsUsingConverters(),
> >>>>>>>>>>>>>>>>>>>>>>>> but people may choose to override it if the
> system has
> >>>>>>>>>>> special
> >>>>>>>>>>>>>> needs,
> >>>>>>>>>>>>>>>>>>>>>> like
> >>>>>>>>>>>>>>>>>>>>>>>> several traits must implement together, or the
> >>>>>>>>> position of
> >>>>>>>>>>>>>> collation in
> >>>>>>>>>>>>>>>>>>>>>>>> RelTraitSet is before distribution.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 3. Top-down driven, on-demand rule apply
> >>>>>>>>>>>>>>>>>>>>>>>> For every RelNode in a RelSet, rule is matched and
> >>>>>>>>> applied
> >>>>>>>>>>>>>>>>>>>>>> sequentially,
> >>>>>>>>>>>>>>>>>>>>>>>> newly generated RelNodes are added to the end of
> >>>>>>>>> RelNode
> >>>>>>>>>>> list
> >>>>>>>>>>>>>> in the
> >>>>>>>>>>>>>>>>>>>>>> RelSet
> >>>>>>>>>>>>>>>>>>>>>>>> waiting for rule apply. RuleQueue and
> >>>>>>>>> DeferringRuleCall
> >>>>>>>>>>> is not
> >>>>>>>>>>>>>> needed
> >>>>>>>>>>>>>>>>>>>>>>>> anymore. This will make space pruning and rule
> mutual
> >>>>>>>>>>>>>> exclusivity check
> >>>>>>>>>>>>>>>>>>>>>>>> possible.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> There are 3 stages for each RelSet:
> >>>>>>>>>>>>>>>>>>>>>>>> a). Exploration: logical transformation, yields
> >>>>>>>>> logical
> >>>>>>>>>>> nodes
> >>>>>>>>>>>>>>>>>>>>>>>> b). Implementation: physical transformation,
> yields
> >>>>>>>>>>> physical
> >>>>>>>>>>>>>> nodes
> >>>>>>>>>>>>>>>>>>>>>>>> c). Optimization: trait request, enforcement
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> The general process looks like:
> >>>>>>>>>>>>>>>>>>>>>>>> - optimize RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>> implement RelSet X
> >>>>>>>>>>>>>>>>>>>>>>>> for each physical relnode in RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>>  // pass down trait requests to child RelSets
> >>>>>>>>>>>>>>>>>>>>>>>>  for each child RelSet Y of current relnode:
> >>>>>>>>>>>>>>>>>>>>>>>>      optimize RelSet Y
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> - implement RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>> if X has been implemented:
> >>>>>>>>>>>>>>>>>>>>>>>>  return
> >>>>>>>>>>>>>>>>>>>>>>>> explore RelSet X
> >>>>>>>>>>>>>>>>>>>>>>>> for each relnode in RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>>  apply physical rules
> >>>>>>>>>>>>>>>>>>>>>>>> - explore RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>> if X has been explored
> >>>>>>>>>>>>>>>>>>>>>>>>  return
> >>>>>>>>>>>>>>>>>>>>>>>> for each relnode in RelSet X:
> >>>>>>>>>>>>>>>>>>>>>>>>  // ensure each child RelSet of current relnode is
> >>>>>>>>>>>>> explored
> >>>>>>>>>>>>>>>>>>>>>>>> for each child RelSet Y of current relnode:
> >>>>>>>>>>>>>>>>>>>>>>>>     explore RelSet Y
> >>>>>>>>>>>>>>>>>>>>>>>>  apply logical rules on current relnode
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Basically it is a state machine with several
> states:
> >>>>>>>>>>>>>> Initialized,
> >>>>>>>>>>>>>>>>>>>>>>>> Explored, Exploring, Implemented, Implementing,
> >>>>>>>>> Optimized,
> >>>>>>>>>>>>>> Optimizing
> >>>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>> several transition methods: exploreRelSet,
> >>>>>>>>> exploreRelNode,
> >>>>>>>>>>>>>>>>>>>>>> implementRelSet,
> >>>>>>>>>>>>>>>>>>>>>>>> implementRelNode, optimizeRelSet,
> optimizeRelNode...
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> To achieve this, we need to mark the rules either
> >>>>>>>>> logical
> >>>>>>>>>>> rule
> >>>>>>>>>>>>>> or
> >>>>>>>>>>>>>>>>>>>>>> physical
> >>>>>>>>>>>>>>>>>>>>>>>> rule.
> >>>>>>>>>>>>>>>>>>>>>>>> To keep backward compatibility, all the un-marked
> >>>>>>>>> rules
> >>>>>>>>>>> will
> >>>>>>>>>>>>> be
> >>>>>>>>>>>>>>>>>>>>>> treated as
> >>>>>>>>>>>>>>>>>>>>>>>> logical rules, except rules that uses
> >>>>>>>>> AbstractConverter as
> >>>>>>>>>>>>> rule
> >>>>>>>>>>>>>>>>>>>>>> operand,
> >>>>>>>>>>>>>>>>>>>>>>>> these rules still need to applied top-down, or
> random
> >>>>>>>>>>> order.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 4. On-demand, bottom-up trait derivation
> >>>>>>>>>>>>>>>>>>>>>>>> It is called bottom-up, but actually driven by
> >>>>>>>>> top-down,
> >>>>>>>>>>>>>> happens same
> >>>>>>>>>>>>>>>>>>>>>> time
> >>>>>>>>>>>>>>>>>>>>>>>> as top-down trait request, in optimization stage
> >>>>>>>>> mentioned
> >>>>>>>>>>>>>> above. Many
> >>>>>>>>>>>>>>>>>>>>>>>> Calcite based bigdata system only propagate
> traits on
> >>>>>>>>>>> Project
> >>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> Filter by
> >>>>>>>>>>>>>>>>>>>>>>>> writing rules, which is very limited. In fact, we
> can
> >>>>>>>>>>> extend
> >>>>>>>>>>>>>> trait
> >>>>>>>>>>>>>>>>>>>>>>>> propagation/derivation to all the operators,
> without
> >>>>>>>>>>> rules, by
> >>>>>>>>>>>>>> adding
> >>>>>>>>>>>>>>>>>>>>>>>> interface PhysicalNode (or extending RelNode) with
> >>>>>>>>> method:
> >>>>>>>>>>>>>>>>>>>>>>>> - RelNode derive(RelTraitSet traits, int childId);
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Given the following plan (only consider
> distribution
> >>>>>>>>> for
> >>>>>>>>>>>>>> simplicity):
> >>>>>>>>>>>>>>>>>>>>>>>> Agg [a,b]
> >>>>>>>>>>>>>>>>>>>>>>>> +-- MergeJoin [a]
> >>>>>>>>>>>>>>>>>>>>>>>>           |---- TableScan R
> >>>>>>>>>>>>>>>>>>>>>>>>           +--- TableScan S
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Hash[a] won't satisfy Hash[a,b] without special
> >>>>>>>>> treatment,
> >>>>>>>>>>>>>> because
> >>>>>>>>>>>>>>>>>>>>>> there
> >>>>>>>>>>>>>>>>>>>>>>>> isn't a mechanism to coordinate traits between
> >>>>>>>>> children.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Now we call derive method on Agg [a,b] node,
> >>>>>>>>>>> derive(Hash[a],
> >>>>>>>>>>>>>> 0), we get
> >>>>>>>>>>>>>>>>>>>>>>>> the new node:
> >>>>>>>>>>>>>>>>>>>>>>>> Agg [a]
> >>>>>>>>>>>>>>>>>>>>>>>> +-- MergeJoin [a] (RelSubset)
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> We will provide different matching type, so each
> >>>>>>>>> operator
> >>>>>>>>>>> can
> >>>>>>>>>>>>>> specify
> >>>>>>>>>>>>>>>>>>>>>> what
> >>>>>>>>>>>>>>>>>>>>>>>> kind of matching type it requires its children:
> >>>>>>>>>>>>>>>>>>>>>>>> - MatchType getMatchType(RelTrait trait, int
> childId);
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> a) Exact: Hash[a,b] exact match Hash[a,b], aka,
> >>>>>>>>> satisfy
> >>>>>>>>>>>>>>>>>>>>>>>> b) Partial: Hash[a] partial match Hash[a,b]
> >>>>>>>>>>>>>>>>>>>>>>>> c) Permuted: Sort[a,b,c] permuted match
> Sort[c,b,a]
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> In addition, optimization order is provided for
> each
> >>>>>>>>>>> opertor
> >>>>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>>>>>> specify:
> >>>>>>>>>>>>>>>>>>>>>>>> a) left to right
> >>>>>>>>>>>>>>>>>>>>>>>> b) right to left
> >>>>>>>>>>>>>>>>>>>>>>>> c) both
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> For example, for query SELECT * FROM R join S on
> >>>>>>>>> R.a=S.a
> >>>>>>>>>>> and
> >>>>>>>>>>>>>> R.b=S.b
> >>>>>>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>>>> R.c=S.c:
> >>>>>>>>>>>>>>>>>>>>>>>> Suppose R is distributed by a,b, and S is
> distributed
> >>>>>>>>> by
> >>>>>>>>>>> c.
> >>>>>>>>>>>>>>>>>>>>>>>> MergeJoin [a,b,c]
> >>>>>>>>>>>>>>>>>>>>>>>> |--- TableScan R [a,b]
> >>>>>>>>>>>>>>>>>>>>>>>> +-- TableScan S [c]
> >>>>>>>>>>>>>>>>>>>>>>>> a) left to right, call derive(Hash[a,b], 0), we
> get
> >>>>>>>>>>> MergeJoin
> >>>>>>>>>>>>>> [a,b]
> >>>>>>>>>>>>>>>>>>>>>>>> b) right to left, call derive(Hash[c], 1), we get
> >>>>>>>>>>> MergeJoin
> >>>>>>>>>>>>>> [c], most
> >>>>>>>>>>>>>>>>>>>>>>>> likely a bad plan
> >>>>>>>>>>>>>>>>>>>>>>>> c) both, get above 2 plans.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> For system that doesn't have a fine-tuned stats
> and
> >>>>>>>>> cost
> >>>>>>>>>>>>> model,
> >>>>>>>>>>>>>> it may
> >>>>>>>>>>>>>>>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>>>>> be able to make a right decision based purely on
> cost.
> >>>>>>>>>>>>> Probably
> >>>>>>>>>>>>>> we
> >>>>>>>>>>>>>>>>>>>>>> need to
> >>>>>>>>>>>>>>>>>>>>>>>> provide the MergeJoin with both children's derived
> >>>>>>>>>>> traitset
> >>>>>>>>>>>>>> list.
> >>>>>>>>>>>>>>>>>>>>>>>> - List<RelNode> derive(List<List<RelTraitSet>>);
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Of course, all above methods are optional to
> >>>>>>>>> implement for
> >>>>>>>>>>>>>> those who
> >>>>>>>>>>>>>>>>>>>>>>>> doesn't need this feature.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> 5. Branch and Bound Space Pruning
> >>>>>>>>>>>>>>>>>>>>>>>> After we implement on-demand, top-down trait
> >>>>>>>>> enforcement
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>>> rule-apply,
> >>>>>>>>>>>>>>>>>>>>>>>> we can pass the cost limit at the time of passing
> down
> >>>>>>>>>>>>> required
> >>>>>>>>>>>>>>>>>>>>>> traits, as
> >>>>>>>>>>>>>>>>>>>>>>>> described in the classical Cascades paper. Right
> now,
> >>>>>>>>>>> Calcite
> >>>>>>>>>>>>>> doesn't
> >>>>>>>>>>>>>>>>>>>>>>>> provide group level logical properties, including
> >>>>>>>>> stats
> >>>>>>>>>>> info,
> >>>>>>>>>>>>>> each
> >>>>>>>>>>>>>>>>>>>>>> operator
> >>>>>>>>>>>>>>>>>>>>>>>> in the same group has its own logical property
> and the
> >>>>>>>>>>> stats
> >>>>>>>>>>>>>> may vary,
> >>>>>>>>>>>>>>>>>>>>>> so
> >>>>>>>>>>>>>>>>>>>>>>>> we can only do limited space pruning for trait
> >>>>>>>>>>> enforcement,
> >>>>>>>>>>>>>> still
> >>>>>>>>>>>>>>>>>>>>>> good. But
> >>>>>>>>>>>>>>>>>>>>>>>> if we agree to add option to share group level
> stats
> >>>>>>>>>>> between
> >>>>>>>>>>>>>> relnodes
> >>>>>>>>>>>>>>>>>>>>>> in a
> >>>>>>>>>>>>>>>>>>>>>>>> RelSet, we will be able to do more aggresive space
> >>>>>>>>>>> pruning,
> >>>>>>>>>>>>>> which will
> >>>>>>>>>>>>>>>>>>>>>> help
> >>>>>>>>>>>>>>>>>>>>>>>> boost the performance of join reorder planning.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> With all that being said, how do we move forward?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> There are 2 ways:
> >>>>>>>>>>>>>>>>>>>>>>>> a) Modify on current VolcanoPlanner.
> >>>>>>>>>>>>>>>>>>>>>>>> Pros: code reuse, existing numerous test cases and
> >>>>>>>>>>>>>> infrastructure,
> >>>>>>>>>>>>>>>>>>>>>> fast
> >>>>>>>>>>>>>>>>>>>>>>>> integration
> >>>>>>>>>>>>>>>>>>>>>>>> Cons: changing code always brings risk
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> b) Add a new XXXXPlanner
> >>>>>>>>>>>>>>>>>>>>>>>> Pros: no risk, no tech debt, no need to worry
> about
> >>>>>>>>>>> backward
> >>>>>>>>>>>>>>>>>>>>>>>> compatability
> >>>>>>>>>>>>>>>>>>>>>>>> Cons: separate test cases for new planner, one
> more
> >>>>>>>>>>> planner to
> >>>>>>>>>>>>>>>>>>>>>> maintain
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> We'd like to hear the community's thoughts and
> >>>>>>>>> advices.
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Thanks.
> >>>>>>>>>>>>>>>>>>>>>>>> - Haisheng
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>
> >>>
> >>
> >
>
>

Re: [DISCUSS] Towards Cascades Optimizer

Reply via email to