> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.

If that is true, I would say the new planner probably does not have a good
enough design. We should anticipate the cases and reasons that would make
users keep the old VolcanoPlanner, and we *must* provide a solution for
those problems in the new design.

I was expecting that migrating to a new planner would take at least 1 year
of development. If that is true, modifying the current planner directly
means that for the next 3~4 Calcite versions, each release would bring in
huge plan changes/bugs, which I believe no user of Calcite wants to see.
And no one can guarantee that modifying the planner directly will keep good
stability and compatibility; only the test set can.

From the experience of the Alibaba Blink planner, which has been
contributed to Apache Flink: yes, the old and new planners would co-exist
for at least 2 years, because the new and old planners have different
abilities in some corner cases.

From my point of view, we should at least:
- Provide a convincing test set for the new planner that makes us believe
the new planner is stable and powerful enough. Obviously the current rule
tests are far from enough to support the new planner.
- Provide a more detailed design doc about the new planner, especially
about the interface changes and any change that would introduce
compatibility problems. Then we can estimate more accurately how much work
the new planner would bring in, and only then can we decide whether
switching to pure new-planner development is a good idea or whether we
should modify the existing one.


Haisheng Yuan <hy...@apache.org> wrote on Wed, Apr 22, 2020 at 9:45 AM:

> Hi Andrii,
>
> > Obviously, from what is written here, I could guess that this would
> require me to change my physical planning rules, even if only by
> implementing a marker interface.
> You don't need to change your physical rules; they will be treated the
> same as logical rules and applied together with the real logical rules,
> with no more logical/physical rule distinction. This is also how the
> current VolcanoPlanner works.
>
> > I don't want you to think that I somehow resent the changes you are
> pushing.
> Don't get me wrong. I am seriously thinking of reverting these changes.
> Since most people like the idea of adding a new planner, why don't we make
> all the plan changes in the new planner, instead of forcing people to
> change test cases for code changes in VolcanoPlanner that they might not
> need during an upgrade?
>
> I didn't intend to replace VolcanoPlanner, just to change the search
> strategy and add a trait derivation mechanism, because most of the code in
> VolcanoPlanner can be reused. But since many agree to add a new planner
> with replacing VolcanoPlanner as the final goal, I won't go against most
> people's decision.
>
> Is there any recommended approach to make that happen smoothly besides
> coding and testing work? We need to be aware that the new planner might
> co-exist with VolcanoPlanner for 5 or more years, or even never replace
> VolcanoPlanner.
>
> More thoughts are welcome.
>
> Haisheng
>
> On 2020/04/21 19:56:25, Андрей Цвелодуб <a.tsvelo...@gmail.com> wrote:
> > Hello Haisheng,
> >
> > > To keep backward compatibility, all the un-marked rules will be
> > > treated as logical rules, except rules that use AbstractConverter as a
> > > rule operand; these rules still need to be applied top-down, or in
> > > random order.
> > Obviously, from what is written here, I could guess that this would
> > require me to change my physical planning rules, even if only by
> > implementing a marker interface. I am not saying this is a bad thing,
> > but this is a thing that should be communicated and planned ahead in
> > case the VolcanoPlanner is modified.
> >
> > > Looks like I have to revert the changes in CALCITE-2970 and CALCITE-3753,
> > > because they will cause another ton of plan changes.
> > I see you are still bitter due to all the discussions on this list
> > lately, I'm sorry. I don't want you to think that I somehow resent the
> > changes you are pushing; au contraire, I support them and would be happy
> > to help if I can. I just want the process of these changes to be
> > executed in the best possible way.
> > As I see there are already several opinions in this thread that basically
> > align with what I am saying, so I guess I am not the crazy guy running
> > around and yelling "the end is nigh!".
> >
> > Thank you for taking these mumbled thoughts into account.
> >
> > Bestest Regards,
> > Andrii Tsvielodub
> >
> > On Tue, 21 Apr 2020 at 21:08, Haisheng Yuan <hy...@apache.org> wrote:
> >
> > > Hi Andrii,
> > >
> > > > I guess changing the planner would lead to changes in tons of rules
> > > > and even more tests.
> > > Obviously you didn't read through my email. You are not required to
> > > make any changes to your rules if you don't want to; if you do, you
> > > just need to mark the rule to tell the planner whether it is a physical
> > > rule or not, simply by implementing an empty interface.
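
For illustration only, the marking could be as small as an empty marker
interface plus "implements" on each rule class. The names below are made up
for the sketch; the proposal does not name the actual interface:

  import org.apache.calcite.plan.RelOptRule;
  import org.apache.calcite.plan.RelOptRuleCall;
  import org.apache.calcite.rel.logical.LogicalJoin;

  /** Hypothetical marker: tells the planner this is a physical
   * implementation rule rather than a logical transformation rule. */
  interface PhysicalRule {
  }

  /** An existing rule would opt in just by adding the marker. */
  class MyJoinImplementationRule extends RelOptRule implements PhysicalRule {
    MyJoinImplementationRule() {
      super(operand(LogicalJoin.class, any()), "MyJoinImplementationRule");
    }

    @Override public void onMatch(RelOptRuleCall call) {
      // existing implementation logic stays unchanged
    }
  }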
> > >
> > > > many on this list already experienced problems with upgrading even
> > > > between the minor versions of Calcite.
> > > Sorry to see the problems you have experienced when upgrading Calcite.
> > > Looks like I have to revert the changes in CALCITE-2970 and CALCITE-3753,
> > > because they will cause another ton of plan changes.
> > >
> > > But I will see if I can add a setting to use the old search strategy,
> > > which can be left untouched.
> > >
> > > Haisheng
> > >
> > > On 2020/04/21 06:33:08, Андрей Цвелодуб <a.tsvelo...@gmail.com> wrote:
> > > > Hello everyone,
> > > >
> > > > First of all, thanks for this great effort of improving the core
> > > > parts of the framework we all are using, I believe this is long
> > > > overdue and hope this will have benefits both for the maintainers
> > > > and users of the library.
> > > >
> > > > I don't have anything to say about the general idea at the moment,
> > > > but I want to make a point that maintaining the old implementation
> > > > of VolcanoPlanner during the initial stages of implementing the new
> > > > planner is absolutely CRITICAL. As a lot of users of Calcite do
> > > > various customizations to the engine, to the rules and all that is
> > > > there in between, I believe changing the implementation of the core
> > > > component would have a huge impact on most users of the library. I
> > > > think many on this list already experienced problems with upgrading
> > > > even between the minor versions of Calcite, so I guess changing the
> > > > planner would lead to changes in tons of rules and even more tests.
> > > >
> > > > I don't have anything against replacing VolcanoPlanner as a final
> > > > goal of this effort, but I don't think that modifying it directly
> > > > and merging it to master is a viable development approach. While I
> > > > understand how burdensome it is to maintain several parallel core
> > > > components at once (we did this while moving the engine of our
> > > > product to Calcite), we should still respect those who depend on it
> > > > and not introduce the risks related to the development of a new
> > > > component into existing processing flows.
> > > >
> > > > A good model to try to follow would be the way new Garbage
> > > > Collectors are introduced in Java. First, add it as an experimental
> > > > option, then make it generally available, then after everyone agrees
> > > > this is the best option - make it the default one. With this
> > > > approach, everyone can then move to the new planner at their own
> > > > pace, guaranteeing a smooth transition overall. Yes, this could take
> > > > some time, maybe even a year, but this is the price of doing major
> > > > changes in a popular framework.
> > > >
> > > > Again, thank you for initiating this discussion and leading this
> > > > effort.
> > > >
> > > > Best Regards,
> > > > Andrii Tsvielodub
> > > >
> > > > On Tue, 21 Apr 2020 at 07:51, Jinpeng Wu <wjpabc...@gmail.com>
> wrote:
> > > >
> > > > > Hi, Xiening.
> > > > >
> > > > > Regarding calculating the logical cost, here are some ways I
> > > > > thought of:
> > > > > 1. Logical rels may implement their own computeSelfCost method.
> > > > > Some rels can provide such information; for example,
> > > > > LogicalProject/LogicalFilter contain nearly the same information as
> > > > > their physical implementations. If we don't have enough confidence,
> > > > > just returning zeroCost is also OK, as it only affects pruning.
> > > > > 2. A logical rel tells its parents what its physical input could be
> > > > > after implementation. Then the problem comes back to calculating
> > > > > the lower bound of a physical rel.
> > > > > There should always be ways. The only problem is how to find a
> > > > > pretty one.
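
For illustration, a minimal sketch of option 1 might look like the class
below; the class itself is made up, and returning zero cost stays the
conservative fallback, since the value only affects pruning:

  import org.apache.calcite.plan.*;
  import org.apache.calcite.rel.RelNode;
  import org.apache.calcite.rel.core.Filter;
  import org.apache.calcite.rel.metadata.RelMetadataQuery;
  import org.apache.calcite.rex.RexNode;

  /** Illustrative logical filter that reports its own cost. */
  class MyLogicalFilter extends Filter {
    MyLogicalFilter(RelOptCluster cluster, RelTraitSet traits,
        RelNode input, RexNode condition) {
      super(cluster, traits, input, condition);
    }

    @Override public Filter copy(RelTraitSet traitSet, RelNode input,
        RexNode condition) {
      return new MyLogicalFilter(getCluster(), traitSet, input, condition);
    }

    @Override public RelOptCost computeSelfCost(RelOptPlanner planner,
        RelMetadataQuery mq) {
      // Mirror roughly what the physical filter would report, or simply
      // return planner.getCostFactory().makeZeroCost() to be conservative.
      double rows = mq.getRowCount(this);
      return planner.getCostFactory().makeCost(rows, rows, 0);
    }
  }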
> > > > >
> > > > > Regarding the risk, the new planner does carry a different risk
> > > > > profile. It is not that the new planner stops us from doing
> > > > > something wrong, but that we can decide when to use the new one.
> > > > > Some scenarios:
> > > > > 1. If we modify VolcanoPlanner directly, the only way a user can
> > > > > control the risk is not to upgrade the Calcite version until it is
> > > > > considered stable. That is quite different from keeping Calcite
> > > > > updated and switching to the new planner at a proper time.
> > > > > 2. It is very important for SLA control. For the important business
> > > > > and jobs, we may keep using the old and stable planner, and use the
> > > > > new one only for jobs that have fault tolerance. This also helps
> > > > > test the new planner with actual scenarios.
> > > > > 3. It is helpful when upgrading online services. If the new planner
> > > > > happens to have some bugs, we can switch to the old planner directly
> > > > > without rolling back the whole service.
> > > > > 4. With all these ways to prevent issues from becoming disasters, we
> > > > > are not vulnerable to making mistakes. This not only enables faster
> > > > > iterations but also gives us enough time to resolve big bugs, like
> > > > > considering them in detail and applying a time-consuming refactoring.
> > > > > Working around a critical bug in tricky ways usually introduces more
> > > > > issues.
> > > > >
> > > > > Thanks,
> > > > > Jinpeng
> > > > >
> > > > > On Tue, Apr 21, 2020 at 2:04 AM Xiening Dai <xndai....@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hi Jinpeng,
> > > > > >
> > > > > > Regarding this comment - I believe there are ways to calculate
> > > > > > the logical cost, but I think it's not as simple as "cardinality *
> > > > > > unit_copy_cost." Would you provide more details of the other ways?
> > > > > > Just the algorithm description or pseudo code would help us
> > > > > > understand. Thanks.
> > > > > >
> > > > > > Regarding the approach of creating a new planner, I don't think a
> > > > > > new planner would lower the risk. We don't know what we don't
> > > > > > know. If we introduced an issue while modifying the planner, most
> > > > > > likely we would do the same with a new planner class. A new
> > > > > > planner doesn't necessarily prevent the issue from happening, but
> > > > > > just delays its surfacing, which is worse IMO.
> > > > > >
> > > > > > One obvious benefit of a new planner is that we can provide some
> > > > > > sort of isolation so the change won't cause test baseline updates,
> > > > > > which could be painful at times. We should see if we could use a
> > > > > > planner config to achieve the same if we decide to just modify the
> > > > > > current planner.
> > > > > >
> > > > > >
> > > > > > > On Apr 20, 2020, at 8:32 AM, Haisheng Yuan <hy...@apache.org>
> > > wrote:
> > > > > > >
> > > > > > > Igor,
> > > > > > >
> > > > > > > That's great.
> > > > > > >
> > > > > > >> On 2020/04/20 11:17:49, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > > > > > >> Haisheng, Xiening,
> > > > > > >>
> > > > > > >> Ok, Now I see how it should work.
> > > > > > >>
> > > > > > >> Thanks for your replies.
> > > > > > >>
> > > > > > >> Regards,
> > > > > > >> Igor
> > > > > > >>
> > > > > > >>> On 20 Apr 2020, at 09:56, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > > > > > >>>
> > > > > > >>> Haisheng, Xiening,
> > > > > > >>>
> > > > > > >>> Thanks for clarifying.
> > > > > > >>>
> > > > > > >>> "In this proposal, we are not trying to split logical and
> > > > > > >>> physical planning entirely." - actually I was in doubt about
> > > > > > >>> the idea of entirely splitting the logical and physical
> > > > > > >>> phases; if you aren't going to, I have no objections.
> > > > > > >>>
> > > > > > >>> But it brings me back to my first question: how will we
> > > > > > >>> propagate traits in a bottom-up manner using the proposed
> > > > > > >>> approach? (See "[DISCUSS] Proposal to add API to force rules
> > > > > > >>> matching specific rels" for details.)
> > > > > > >>>
> > > > > > >>> One of the inconveniences of the current VolcanoPlanner
> > > > > > >>> implementation is the number of tricks we need to get the
> > > > > > >>> desired behaviour. It would be great if some of these issues
> > > > > > >>> (or all of them) were solved in the new approach.
> > > > > > >>>
> > > > > > >>> Regards,
> > > > > > >>> Igor
> > > > > > >>>
> > > > > > >>> On Mon, 20 Apr 2020 at 7:02, Xiening Dai <xndai....@gmail.com> wrote:
> > > > > > >>> Hi Igor,
> > > > > > >>>
> > > > > > >>> Your comment - "because actual cost may be calculated
> > > > > > >>> correctly using physical operators only. So won't be able to
> > > > > > >>> implement Branch and Bound Space Pruning." - is actually not
> > > > > > >>> true. In Cascades' lower bound / upper bound pruning
> > > > > > >>> algorithm, you can get the cost lower bound of an input
> > > > > > >>> RelNode using cardinality * unit_copy_cost. The unit_copy_cost
> > > > > > >>> is a constant, which stands for the minimal cost for
> > > > > > >>> processing a tuple from the input. So this will be the minimal
> > > > > > >>> cost the input RelNode can achieve, and if this is indeed
> > > > > > >>> larger than the current best cost, this input node can be
> > > > > > >>> pruned.
> > > > > > >>>
> > > > > > >>> In this proposal, we are not trying to split logical and
> physical
> > > > > > planning entirely. But for any given equivalent set, we would
> need to
> > > > > > finish exploration before implementation. But for the entire memo
> > > tree,
> > > > > > each set could be in different planning stage, and an equivalent
> set
> > > can
> > > > > be
> > > > > > pruned even before it’s implemented. A text book example of
> > > > > aforementioned
> > > > > > “lower bound / upper bound pruning” would be the join ordering
> case.
> > > > > > >>>
> > > > > > >>> Regarding #3, I think we can still achieve that (partially)
> > > through
> > > > > > this proposal. Remember every time when we start the
> optimization, we
> > > > > pass
> > > > > > down an upper bound limit. Initially this upper bound for root is
> > > > > infinite
> > > > > > - meaning that no feasible plan is available. Every time when we
> > > find a
> > > > > > physical plan we update this upper bound, then start the search
> > > again. We
> > > > > > could stop the search when the cost is less than a pre-defined
> > > threshold
> > > > > -
> > > > > > which gives you a “good enough” plan with early termination.
> Still
> > > this
> > > > > > wouldn't avoid the logical exploration. For that, you would
> probably
> > > > > > archive through rule configurations, and avoid some expansive
> > > > > > transformation to keep the cost down.
> > > > > > >>>
> > > > > > >>>
> > > > > > >>>> On Apr 19, 2020, at 7:30 PM, Haisheng Yuan <hy...@apache.org> wrote:
> > > > > > >>>>
> > > > > > >>>> Igor,
> > > > > > >>>>
> > > > > > >>>> a) Given current Calcite's stats derivation strategy, mixing
> > > logical
> > > > > > and physical planning won't make it better. I hope you went
> through
> > > my
> > > > > > email to the end, currently operators inside a memo group don't
> share
> > > > > stats
> > > > > > info, each operator's stats may differ with the other one, and
> the
> > > > > > RelSubset's best node and stats may vary during optimization. So
> > > logical
> > > > > > and physical rule pruning are not safe at the moment, otherwise
> it
> > > almost
> > > > > > implies changing downstream project's cost model silently.
> > > > > > >>>>
> > > > > > >>>> On the other hand, ensuring child group exploration task
> finish
> > > > > first
> > > > > > will make rule mutual exclusivity check possible, like the
> result of
> > > > > > ReduceExpressionRule won't need trigger the same rule again, The
> join
> > > > > > result of JoinCommuteRule won't need trigger JoinCommuteRule and
> > > > > > ReduceExpressionRule again.
> > > > > > >>>>
> > > > > > >>>> More importantly, most if not all of the long planning
> > > > > > >>>> queries in Calcite are not caused by too many alternatives,
> > > > > > >>>> but by mutual triggering, or cyclic triggering, which can be
> > > > > > >>>> avoided by customizing the rules instead of using the default
> > > > > > >>>> ones. Unless you use dynamic programming (Calcite uses
> > > > > > >>>> heuristics) for join reordering (left-deep, right-deep,
> > > > > > >>>> bushy), space pruning won't help make long / infinite running
> > > > > > >>>> queries faster.
> > > > > > >>>>
> > > > > > >>>> b) No evidence shows the current version of Calcite will
> > > > > > >>>> return the most promising plan in the first planning
> > > > > > >>>> iteration. Instead of praying for a good enough plan in the
> > > > > > >>>> first iteration, why not focus on fixing the rules that cause
> > > > > > >>>> the issue?
> > > > > > >>>>
> > > > > > >>>> c) That is not the goal.
> > > > > > >>>>
> > > > > > >>>> On 2020/04/19 15:14:57, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > > > > > >>>>> Haisheng,
> > > > > > >>>>>
> > > > > > >>>>> From my point of view splitting logical and physical
> planning
> > > parts
> > > > > > isn’t a good idea.
> > > > > > >>>>>
> > > > > > >>>>> I think so because actual cost may be calculated correctly
> > > using
> > > > > > physical operators only. So that we:
> > > > > > >>>>> a) won't be able to implement Branch and Bound Space
> Pruning
> > > (as
> > > > > far
> > > > > > as I understand, at exploring time there are no physical
> operators,
> > > no
> > > > > > correct costs, but assumptions only, I don’t think we should
> rely on
> > > > > them)
> > > > > > >>>>> b) won’t be able to get good enough plan (without Branch
> and
> > > Bound
> > > > > > Space Pruning it’s non-trivial task to get right search
> direction and
> > > > > most
> > > > > > promising plans in first planning iterations)
> > > > > > >>>>> c) won’t be able to tune the good enough plan during
> several
> > > > > similar
> > > > > > queries are executed
> > > > > > >>>>>
> > > > > > >>>>> Regards,
> > > > > > >>>>> Igor
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>>>> On 19 Apr 2020, at 17:37, Haisheng Yuan <hy...@apache.org> wrote:
> > > > > > >>>>>>
> > > > > > >>>>>> Hi Igor,
> > > > > > >>>>>>
> > > > > > >>>>>> You can't have your cake and eat it.
> > > > > > >>>>>> But one feasible work item definitely we can do is that
> once
> > > > > > timeout, stop exploring, use the first available physical
> operator in
> > > > > each
> > > > > > group and optimize it.
> > > > > > >>>>>>
> > > > > > >>>>>> Because most, if not all, of the long / infinite running
> > > > > > optimizations are caused by project related transpose, join
> > > reordering
> > > > > > (join commute + associative), constant reduction for large
> expression
> > > > > (see
> > > > > > CALCITE-3460), all of which are logical transformations rules and
> > > many of
> > > > > > which have corresponding JIRAs. So another alternative is, let's
> fix
> > > > > these
> > > > > > bugs to improve Calcite to make timeout option less usable.
> > > > > > >>>>>>
> > > > > > >>>>>> Another thing worth noting is that sql server optimizer
> > > timeout
> > > > > > mechanism is not based on clock time, but the number of
> optimization
> > > > > tasks
> > > > > > it has done [1].
> > > > > > >>>>>>
> > > > > > >>>>>> [1]
> > > > > > >>>>>> https://techcommunity.microsoft.com/t5/sql-server-support/understanding-optimizer-timeout-and-how-complex-queries-can-be/ba-p/319188
> > > > > > >>>>>>
> > > > > > >>>>>> Regards,
> > > > > > >>>>>> Haisheng
> > > > > > >>>>>>
> > > > > > >>>>>> On 2020/04/19 11:31:27, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > > > > > >>>>>>> Haisheng,
> > > > > > >>>>>>>
> > > > > > >>>>>>> Ok, then such notification isn't needed
> > > > > > >>>>>>>
> > > > > > >>>>>>> But in this case we don't have any control over how long
> > > planning
> > > > > > takes
> > > > > > >>>>>>>
> > > > > > >>>>>>> For some systems it's necessary to get good enough plan
> > > right now
> > > > > > instead
> > > > > > >>>>>>> of best one after while
> > > > > > >>>>>>>
> > > > > > >>>>>>> For example we've been considering a case when a query is
> > > > > > optimised several
> > > > > > >>>>>>> times in short iterations in case it's impossible to get
> the
> > > best
> > > > > > plan in
> > > > > > >>>>>>> reasonable period of time (for example there is an SLA
> for
> > > > > > response time)
> > > > > > >>>>>>>
> > > > > > >>>>>>> This mean we need all needed physical implementations
> after
> > > each
> > > > > > logical
> > > > > > >>>>>>> transformation is applied.
> > > > > > >>>>>>>
> > > > > > >>>>>>> Regards,
> > > > > > >>>>>>> Igor
> > > > > > >>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>>> On Sun, 19 Apr 2020 at 13:55, Haisheng Yuan <hy...@apache.org> wrote:
> > > > > > >>>>>>>
> > > > > > >>>>>>>> Hi Igor,
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> There will be no rule queue anymore. Y will be fully
> > > explored
> > > > > > (logical
> > > > > > >>>>>>>> rules are matched and applied) before it can be
> implemented
> > > and
> > > > > > optimized.
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> Thanks,
> > > > > > >>>>>>>> Haisheng
> > > > > > >>>>>>>>
> > > > > > >>>>>>>> On 2020/04/19 10:11:45, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > > > > > >>>>>>>>> Hi Haisheng,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Great explanation, thanks.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> One thing I'd like to cover in advance is trait
> propagation
> > > > > > process (I
> > > > > > >>>>>>>> need
> > > > > > >>>>>>>>> it for Apache Ignite SQL engine implementation).
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> For example:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> There are two nodes: Rel X and its child node Rel Y
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Both nodes are in Optimized state, and there is a
> Logical
> > > rule
> > > > > > for Rel Y
> > > > > > >>>>>>>> in
> > > > > > >>>>>>>>> a rules queue matched,
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> After logical rule is applied, there is a logical rel
> Rel
> > > Y'
> > > > > and
> > > > > > >>>>>>>> containing
> > > > > > >>>>>>>>> it set moves into Exploring state (since there is a
> logical
> > > > > node
> > > > > > which
> > > > > > >>>>>>>> has
> > > > > > >>>>>>>>> to be implemented)
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> After whole process Exploring -> ... -> Optimized we
> need
> > > to
> > > > > > force parent
> > > > > > >>>>>>>>> Rel X set to switch its state from Optimized to
> Optimizing
> > > to
> > > > > > peek the
> > > > > > >>>>>>>>> newly implemented child and (possible) produce a new
> Rel X'
> > > > > > physical rel
> > > > > > >>>>>>>> on
> > > > > > >>>>>>>>> the basis of previously requested from the Rel X set
> > > traits.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> If a new Rel X' is produced, a Rel X' parent should
> move
> > > its
> > > > > > state
> > > > > > >>>>>>>>> Optimized -> Optimizing and repeat described above
> > > operations.
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Does it look like true?
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> Regards,
> > > > > > >>>>>>>>> Igor
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>> On Sun, 19 Apr 2020 at 6:52, Haisheng Yuan <h.y...@alibaba-inc.com> wrote:
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>>> Hi,
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> In the past few months, we have discussed a lot about
> > > Cascades
> > > > > > style
> > > > > > >>>>>>>>>> top-down optimization, including on-demand trait
> > > > > > derivation/request,
> > > > > > >>>>>>>> rule
> > > > > > >>>>>>>>>> apply, branch and bound space pruning. Now we think
> it is
> > > time
> > > > > > to move
> > > > > > >>>>>>>>>> towards these targets.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> We will separate it into several small issues, and
> each
> > > one
> > > > > can
> > > > > > be
> > > > > > >>>>>>>>>> integrated as a standalone, independent feature, and
> most
> > > > > > importantly,
> > > > > > >>>>>>>>>> meanwhile keep backward compatibility.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 1. Top-down trait request
> > > > > > >>>>>>>>>> In other words, pass trait requirements from parent
> > > > > > >>>>>>>>>> nodes to child nodes. The trait requests happen after
> > > > > > >>>>>>>>>> all the logical transformation rules and physical
> > > > > > >>>>>>>>>> implementation rules are done, in a top-down manner,
> > > > > > >>>>>>>>>> driven from the root set, e.g.:
> > > > > > >>>>>>>>>> SELECT a, sum(c) FROM
> > > > > > >>>>>>>>>>  (SELECT * FROM R JOIN S USING (a, b)) t
> > > > > > >>>>>>>>>> GROUP BY a;
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Suppose we have the following plan tree in the MEMO,
> and
> > > let's
> > > > > > only
> > > > > > >>>>>>>>>> consider distribution for simplicity, each group
> > > represents a
> > > > > > RelSet
> > > > > > >>>>>>>> in the
> > > > > > >>>>>>>>>> MEMO.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Group 1:     Agg on [a]
> > > > > > >>>>>>>>>> Group 2:           +-- MergeJoin on [a, b]
> > > > > > >>>>>>>>>> Group 3:                       |--- TableScan R
> > > > > > >>>>>>>>>> Group 4:                       +--- TableScan S
> > > > > > >>>>>>>>>> | Group No | Operator    | Derived Traits | Required Traits |
> > > > > > >>>>>>>>>> | -------- | ----------- | -------------- | --------------- |
> > > > > > >>>>>>>>>> | Group 1  | Aggregate   | Hash[a]        | N/A             |
> > > > > > >>>>>>>>>> | Group 2  | MergeJoin   | Hash[a,b]      | Hash[a]         |
> > > > > > >>>>>>>>>> | Group 3  | TableScan R | None           | Hash[a,b]       |
> > > > > > >>>>>>>>>> | Group 4  | TableScan S | None           | Hash[a,b]       |
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> We will add a new interface PhysicalNode (or extend
> > > > > > >>>>>>>>>> RelNode) with methods:
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> - Pair<RelTraitSet, List<RelTraitSet>>
> > > > > > >>>>>>>>>>   requireTraits(RelTraitSet required);
> > > > > > >>>>>>>>>> pair.left is the current operator's new traitset;
> > > > > > >>>>>>>>>> pair.right is the list of children's required traitsets.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> - RelNode passThrough(RelTraitSet required);
> > > > > > >>>>>>>>>> The default implementation will call the above method
> > > > > > >>>>>>>>>> requireTraits() and RelNode.copy() to create a new
> > > > > > >>>>>>>>>> RelNode. It can be overridden by physical operators to
> > > > > > >>>>>>>>>> customize the logic.
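
As a sketch, the request side of that interface might look roughly like
this; only the two method names come from the proposal, and the default
passThrough() body is a guess at the behavior described above:

  import java.util.ArrayList;
  import java.util.List;
  import org.apache.calcite.plan.RelOptRule;
  import org.apache.calcite.plan.RelTraitSet;
  import org.apache.calcite.rel.RelNode;
  import org.apache.calcite.util.Pair;

  /** Sketch of the proposed request-side interface. */
  public interface PhysicalNode extends RelNode {
    /** Left: this operator's new trait set; right: the trait set required
     * from each child. Null means the traits cannot be passed down. */
    Pair<RelTraitSet, List<RelTraitSet>> requireTraits(RelTraitSet required);

    default RelNode passThrough(RelTraitSet required) {
      Pair<RelTraitSet, List<RelTraitSet>> p = requireTraits(required);
      if (p == null) {
        return null;                       // signal "can't pass down"
      }
      List<RelNode> newInputs = new ArrayList<>();
      for (int i = 0; i < getInputs().size(); i++) {
        // Ask each child (a RelSubset) for the traits this operator requires.
        newInputs.add(RelOptRule.convert(getInput(i), p.right.get(i)));
      }
      return copy(p.left, newInputs);
    }
  }

A MergeJoin, for example, could override passThrough() to refuse traits
that are incompatible with its join keys and return null in that case.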
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> The planner will call above method on MergeJoin
> operator
> > > to
> > > > > > pass the
> > > > > > >>>>>>>>>> required traits (Hash[a]) to Mergejoin's child
> operators.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> We will get a new MergeJoin:
> > > > > > >>>>>>>>>> MergeJoin hash[a]
> > > > > > >>>>>>>>>>        |---- TableScan R hash[a] (RelSubset)
> > > > > > >>>>>>>>>>        +---- TableScan S hash[a] (RelSubset)
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Now the MEMO group looks like:
> > > > > > >>>>>>>>>> | Group No | Operator    | Derived Traits     | Required Traits    |
> > > > > > >>>>>>>>>> | -------- | ----------- | ------------------ | ------------------ |
> > > > > > >>>>>>>>>> | Group 1  | Aggregate   | Hash[a]            | N/A                |
> > > > > > >>>>>>>>>> | Group 2  | MergeJoin   | Hash[a,b], Hash[a] | Hash[a]            |
> > > > > > >>>>>>>>>> | Group 3  | TableScan R | None               | Hash[a,b], Hash[a] |
> > > > > > >>>>>>>>>> | Group 4  | TableScan S | None               | Hash[a,b], Hash[a] |
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Calcite user may choose to ignore / not implement the
> > > > > interface
> > > > > > to keep
> > > > > > >>>>>>>>>> the original behavior. Each physical operator,
> according
> > > to
> > > > > its
> > > > > > own
> > > > > > >>>>>>>> logic,
> > > > > > >>>>>>>>>> decides whether passThrough() should pass traits down
> or
> > > not
> > > > > by
> > > > > > >>>>>>>> returning:
> > > > > > >>>>>>>>>> - a non-null RelNode, which means it can pass down
> > > > > > >>>>>>>>>> - null object, which means can't pass down
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 2. Provide option to disable AbstractConverter
> > > > > > >>>>>>>>>> Once the planner can request traits in a top-down way
> > > > > > >>>>>>>>>> in the framework, many systems won't need
> > > > > > >>>>>>>>>> AbstractConverter anymore, since it is just an
> > > > > > >>>>>>>>>> intermediate operator to generate a physical sort /
> > > > > > >>>>>>>>>> exchange. For those, we can provide an option to
> > > > > > >>>>>>>>>> disable AbstractConverter and generate the physical
> > > > > > >>>>>>>>>> enforcer directly by adding a method to the Convention
> > > > > > >>>>>>>>>> interface:
> > > > > > >>>>>>>>>> - RelNode enforce(RelNode node, RelTraitSet traits);
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> The default implementation may just call
> > > > > > >>>>>>>>>> changeTraitsUsingConverters(), but people may choose to
> > > > > > >>>>>>>>>> override it if the system has special needs, e.g.
> > > > > > >>>>>>>>>> several traits must be implemented together, or the
> > > > > > >>>>>>>>>> position of collation in the RelTraitSet comes before
> > > > > > >>>>>>>>>> distribution.
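
For illustration, a system might override the proposed enforce() roughly as
below; MyExchange and MySort stand in for that system's own physical
operators and are not real Calcite classes:

  import org.apache.calcite.plan.Convention;
  import org.apache.calcite.plan.RelTraitSet;
  import org.apache.calcite.rel.RelCollation;
  import org.apache.calcite.rel.RelCollationTraitDef;
  import org.apache.calcite.rel.RelDistribution;
  import org.apache.calcite.rel.RelDistributionTraitDef;
  import org.apache.calcite.rel.RelNode;

  /** Illustrative convention that builds its own enforcers directly. */
  class MyConvention extends Convention.Impl {
    MyConvention() {
      super("MY_ENGINE", RelNode.class);
    }

    /** The method proposed above; add @Override once Convention has it. */
    public RelNode enforce(RelNode node, RelTraitSet required) {
      RelNode result = node;
      RelDistribution dist = required.getTrait(RelDistributionTraitDef.INSTANCE);
      if (dist != null && !dist.equals(
          result.getTraitSet().getTrait(RelDistributionTraitDef.INSTANCE))) {
        result = new MyExchange(result, dist);       // hypothetical operator
      }
      RelCollation coll = required.getTrait(RelCollationTraitDef.INSTANCE);
      if (coll != null && !coll.getFieldCollations().isEmpty()) {
        result = new MySort(result, coll);           // hypothetical operator
      }
      return result;
    }
  }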
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 3. Top-down driven, on-demand rule apply
> > > > > > >>>>>>>>>> For every RelNode in a RelSet, rules are matched and
> > > > > > >>>>>>>>>> applied sequentially; newly generated RelNodes are added
> > > > > > >>>>>>>>>> to the end of the RelNode list in the RelSet, waiting
> > > > > > >>>>>>>>>> for rule apply. RuleQueue and DeferringRuleCall are not
> > > > > > >>>>>>>>>> needed anymore. This will make space pruning and rule
> > > > > > >>>>>>>>>> mutual exclusivity checks possible.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> There are 3 stages for each RelSet:
> > > > > > >>>>>>>>>> a). Exploration: logical transformation, yields
> logical
> > > nodes
> > > > > > >>>>>>>>>> b). Implementation: physical transformation, yields
> > > physical
> > > > > > nodes
> > > > > > >>>>>>>>>> c). Optimization: trait request, enforcement
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> The general process looks like:
> > > > > > >>>>>>>>>> - optimize RelSet X:
> > > > > > >>>>>>>>>>     implement RelSet X
> > > > > > >>>>>>>>>>     for each physical relnode in RelSet X:
> > > > > > >>>>>>>>>>         // pass down trait requests to child RelSets
> > > > > > >>>>>>>>>>         for each child RelSet Y of current relnode:
> > > > > > >>>>>>>>>>             optimize RelSet Y
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> - implement RelSet X:
> > > > > > >>>>>>>>>>     if X has been implemented:
> > > > > > >>>>>>>>>>         return
> > > > > > >>>>>>>>>>     explore RelSet X
> > > > > > >>>>>>>>>>     for each relnode in RelSet X:
> > > > > > >>>>>>>>>>         apply physical rules
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> - explore RelSet X:
> > > > > > >>>>>>>>>>     if X has been explored:
> > > > > > >>>>>>>>>>         return
> > > > > > >>>>>>>>>>     for each relnode in RelSet X:
> > > > > > >>>>>>>>>>         // ensure each child RelSet of current relnode is explored
> > > > > > >>>>>>>>>>         for each child RelSet Y of current relnode:
> > > > > > >>>>>>>>>>             explore RelSet Y
> > > > > > >>>>>>>>>>         apply logical rules on current relnode
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Basically it is a state machine with several states:
> > > > > > Initialized,
> > > > > > >>>>>>>>>> Explored, Exploring, Implemented, Implementing,
> Optimized,
> > > > > > Optimizing
> > > > > > >>>>>>>> and
> > > > > > >>>>>>>>>> several transition methods: exploreRelSet,
> exploreRelNode,
> > > > > > >>>>>>>> implementRelSet,
> > > > > > >>>>>>>>>> implementRelNode, optimizeRelSet, optimizeRelNode...
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> To achieve this, we need to mark each rule as either a
> > > > > > >>>>>>>>>> logical rule or a physical rule.
> > > > > > >>>>>>>>>> To keep backward compatibility, all the un-marked rules
> > > > > > >>>>>>>>>> will be treated as logical rules, except rules that use
> > > > > > >>>>>>>>>> AbstractConverter as a rule operand; these rules still
> > > > > > >>>>>>>>>> need to be applied top-down, or in random order.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 4. On-demand, bottom-up trait derivation
> > > > > > >>>>>>>>>> It is called bottom-up, but it is actually driven
> > > > > > >>>>>>>>>> top-down and happens at the same time as the top-down
> > > > > > >>>>>>>>>> trait request, in the optimization stage mentioned
> > > > > > >>>>>>>>>> above. Many Calcite-based big data systems only
> > > > > > >>>>>>>>>> propagate traits on Project and Filter by writing
> > > > > > >>>>>>>>>> rules, which is very limited. In fact, we can extend
> > > > > > >>>>>>>>>> trait propagation/derivation to all the operators,
> > > > > > >>>>>>>>>> without rules, by adding interface PhysicalNode (or
> > > > > > >>>>>>>>>> extending RelNode) with method:
> > > > > > >>>>>>>>>> - RelNode derive(RelTraitSet traits, int childId);
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Given the following plan (only consider distribution
> for
> > > > > > simplicity):
> > > > > > >>>>>>>>>> Agg [a,b]
> > > > > > >>>>>>>>>>    +-- MergeJoin [a]
> > > > > > >>>>>>>>>>               |---- TableScan R
> > > > > > >>>>>>>>>>               +--- TableScan S
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Hash[a] won't satisfy Hash[a,b] without special
> treatment,
> > > > > > because
> > > > > > >>>>>>>> there
> > > > > > >>>>>>>>>> isn't a mechanism to coordinate traits between
> children.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Now we call derive method on Agg [a,b] node,
> > > derive(Hash[a],
> > > > > > 0), we get
> > > > > > >>>>>>>>>> the new node:
> > > > > > >>>>>>>>>> Agg [a]
> > > > > > >>>>>>>>>> +-- MergeJoin [a] (RelSubset)
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> We will provide different matching types, so each
> > > > > > >>>>>>>>>> operator can specify what kind of matching it requires
> > > > > > >>>>>>>>>> of its children:
> > > > > > >>>>>>>>>> - MatchType getMatchType(RelTrait trait, int childId);
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> a) Exact: Hash[a,b] exact match Hash[a,b], aka,
> satisfy
> > > > > > >>>>>>>>>> b) Partial: Hash[a] partial match Hash[a,b]
> > > > > > >>>>>>>>>> c) Permuted: Sort[a,b,c] permuted match Sort[c,b,a]
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> In addition, an optimization order is provided for each
> > > > > > >>>>>>>>>> operator to specify:
> > > > > > >>>>>>>>>> a) left to right
> > > > > > >>>>>>>>>> b) right to left
> > > > > > >>>>>>>>>> c) both
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> For example, for query SELECT * FROM R join S on
> R.a=S.a
> > > and
> > > > > > R.b=S.b
> > > > > > >>>>>>>> and
> > > > > > >>>>>>>>>> R.c=S.c:
> > > > > > >>>>>>>>>> Suppose R is distributed by a,b, and S is distributed
> by
> > > c.
> > > > > > >>>>>>>>>> MergeJoin [a,b,c]
> > > > > > >>>>>>>>>>     |--- TableScan R [a,b]
> > > > > > >>>>>>>>>>     +-- TableScan S [c]
> > > > > > >>>>>>>>>> a) left to right, call derive(Hash[a,b], 0), we get
> > > MergeJoin
> > > > > > [a,b]
> > > > > > >>>>>>>>>> b) right to left, call derive(Hash[c], 1), we get
> > > MergeJoin
> > > > > > [c], most
> > > > > > >>>>>>>>>> likely a bad plan
> > > > > > >>>>>>>>>> c) both, get above 2 plans.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> A system that doesn't have a fine-tuned stats and cost
> > > > > > >>>>>>>>>> model may not be able to make the right decision based
> > > > > > >>>>>>>>>> purely on cost. Probably we need to provide the
> > > > > > >>>>>>>>>> MergeJoin with both children's derived traitset lists.
> > > > > > >>>>>>>>>> - List<RelNode> derive(List<List<RelTraitSet>>);
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Of course, all the above methods are optional to
> > > > > > >>>>>>>>>> implement for those who don't need this feature.
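
To make the shapes concrete, the derivation side of the PhysicalNode sketch
might look roughly like this; MatchType and the comments are illustrative,
only the method signatures come from the proposal:

  import java.util.List;
  import org.apache.calcite.plan.RelTrait;
  import org.apache.calcite.plan.RelTraitSet;
  import org.apache.calcite.rel.RelNode;

  /** Derivation-side methods of the PhysicalNode sketch; the request-side
   * methods (requireTraits/passThrough) from section 1 are omitted here. */
  interface PhysicalNode extends RelNode {
    /** Illustrative enum; the proposal only names the three match kinds. */
    enum MatchType { EXACT, PARTIAL, PERMUTED }

    /** How strictly the trait delivered by child childId must match what
     * this operator wants, e.g. Hash[a] partially matches Hash[a,b]. */
    MatchType getMatchType(RelTrait trait, int childId);

    /** Called with the traits actually delivered by child childId; returns
     * a new node that exploits them, or null if nothing useful can be
     * derived. */
    RelNode derive(RelTraitSet childTraits, int childId);

    /** Variant for operators that want to see every child's delivered
     * trait sets before deciding. */
    List<RelNode> derive(List<List<RelTraitSet>> childTraitSets);
  }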
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> 5. Branch and Bound Space Pruning
> > > > > > >>>>>>>>>> After we implement on-demand, top-down trait
> > > > > > >>>>>>>>>> enforcement and rule apply, we can pass the cost limit
> > > > > > >>>>>>>>>> down at the time of passing down required traits, as
> > > > > > >>>>>>>>>> described in the classical Cascades paper. Right now,
> > > > > > >>>>>>>>>> Calcite doesn't provide group-level logical properties,
> > > > > > >>>>>>>>>> including stats info; each operator in the same group
> > > > > > >>>>>>>>>> has its own logical properties and the stats may vary,
> > > > > > >>>>>>>>>> so we can only do limited space pruning for trait
> > > > > > >>>>>>>>>> enforcement, which is still good. But if we agree to add
> > > > > > >>>>>>>>>> an option to share group-level stats between relnodes in
> > > > > > >>>>>>>>>> a RelSet, we will be able to do more aggressive space
> > > > > > >>>>>>>>>> pruning, which will help boost the performance of join
> > > > > > >>>>>>>>>> reorder planning.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> With all that being said, how do we move forward?
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> There are 2 ways:
> > > > > > >>>>>>>>>> a) Modify the current VolcanoPlanner.
> > > > > > >>>>>>>>>> Pros: code reuse, numerous existing test cases and
> > > > > > >>>>>>>>>> infrastructure, fast integration
> > > > > > >>>>>>>>>> Cons: changing code always brings risk
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> b) Add a new XXXXPlanner
> > > > > > >>>>>>>>>> Pros: no risk, no tech debt, no need to worry about
> > > > > > >>>>>>>>>> backward compatibility
> > > > > > >>>>>>>>>> Cons: separate test cases for the new planner, one more
> > > > > > >>>>>>>>>> planner to maintain
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> We'd like to hear the community's thoughts and advice.
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>> Thanks.
> > > > > > >>>>>>>>>> - Haisheng
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>>
> > > > > > >>>>>>>>>
> > > > > > >>>>>>>>
> > > > > > >>>>>>>
> > > > > > >>>>>
> > > > > > >>>>>
> > > > > > >>>
> > > > > > >>
> > > > > > >>
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
