Hi Igor,

Thank you for the details. Meanwhile, I solved it by separating the conversion rules from the physical optimization rules. The first pass creates physical nodes with unknown physical properties (top-down), while subsequent processing of the leaf nodes triggers rules which convert "bad" physical nodes to "good" physical nodes with known distribution and collation.
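To make the two-pass idea concrete, here is a minimal toy sketch in Java (all names here, such as PhysicalNode and Distribution, are hypothetical illustrations, not Calcite or Hazelcast classes): pass 1 creates physical nodes with UNKNOWN traits top-down, and pass 2 rewrites a "bad" node into a "good" one once its child's traits are known.

```java
import java.util.*;

// Toy model of the two-pass approach described above (hypothetical names).
class TwoPassSketch {
    enum Distribution { UNKNOWN, REPLICATED, SINGLETON }

    static class PhysicalNode {
        final String name;
        Distribution distribution;
        PhysicalNode child;
        PhysicalNode(String name, Distribution d, PhysicalNode child) {
            this.name = name; this.distribution = d; this.child = child;
        }
    }

    // Pass 1: conversion rules produce physical nodes with UNKNOWN traits,
    // top-down, without waiting for the children to be optimized.
    static PhysicalNode convertTopDown() {
        PhysicalNode scan = new PhysicalNode("Scan", Distribution.REPLICATED, null);
        return new PhysicalNode("Project", Distribution.UNKNOWN, scan);
    }

    // Pass 2: once a child's distribution is known, a second rule rewrites
    // the "bad" parent into a "good" one inheriting the child's traits.
    static PhysicalNode refineBottomUp(PhysicalNode node) {
        if (node == null) return null;
        node.child = refineBottomUp(node.child);
        if (node.distribution == Distribution.UNKNOWN && node.child != null) {
            return new PhysicalNode(node.name, node.child.distribution, node.child);
        }
        return node;
    }
}
```

This is only a sketch of the control flow; in the real planner the second pass is driven by rule invocations on the leaf nodes rather than by explicit recursion.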
Regards,
Vladimir.

Mon, 18 Nov 2019 at 13:43, Seliverstov Igor <gvvinbl...@gmail.com>:

> Vladimir,
>
> Hope it may help you.
>
> Currently we apply the following approach (just a rough description):
>
> 1) We created an API to derive possible trait permutations on the basis
> of children traits (quite similar to the one described in the «On Demand
> trait set request» topic).
>
> 2) We added a general rule that copies Logical nodes, but requests our
> convention from their children (IGNITE convention, ANY distribution, EMPTY
> collation) and sets the importance of the old Logical nodes to zero - so we
> have a Logical parent whose input satisfies any possible distribution, and
> no rules match the previous logical node any more.
>
> 3) Physical rules create physical rel nodes only if physical traits may
> be derived (there is no «barrier», described in one of the previous
> messages). The derived traits are a collection; we don't create a physical
> rel node for each possible trait set, and we may also set zero importance
> for previously created rel nodes to decrease the search space.
>
> Now that we know the actual and required distribution, we don't need
> AbstractConverters and are able to just call the TraitDef.convert() method
> inside a rule.
> A rule is still able to produce the same output several times, but
> «memoization» inside the planner solves that for us.
>
> Preliminary tests show almost zero overhead of the approach.
>
> Regards,
> Igor
>
> > On 14 Nov 2019, at 14:49, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >
> > Hi Xing,
> >
> > Thanks for your suggestion.
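Step 1 above, deriving the parent's possible trait permutations from the traits its children can provide, can be sketched with a toy example (hypothetical code, none of these names come from the actual Ignite implementation):

```java
import java.util.*;

// Toy illustration of deriving parent trait candidates from children traits
// (hypothetical; real trait sets carry convention, distribution, collation).
class TraitDerivation {
    // For a join-like node with two children, every pair of child trait sets
    // yields one candidate parent trait set: here simply the shared trait
    // when both sides agree, or a fallback "ANY" when they do not.
    static List<String> derive(List<String> leftTraits, List<String> rightTraits) {
        List<String> result = new ArrayList<>();
        for (String l : leftTraits) {
            for (String r : rightTraits) {
                result.add(l.equals(r) ? l : "ANY");
            }
        }
        return result;
    }
}
```

The point of the sketch is only that derivation is a cross product over children traits, which is why pruning (e.g. zero importance for dominated rels) matters.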
> > Yes, this may help, and if I get your idea right, I already had it in my
> > reproducer:
> > 1) Create the converted physical input:
> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> > 2) Use it in case no physical children were found:
> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
> >
> > This idea is borrowed from Apache Drill physical rules. But the problem is
> > that this approach leads to a suboptimal plan - the parent node doesn't know
> > the future distribution of a child node, and as a result, it doesn't know
> > its own distribution. So the final plan is constructed in this way:
> > 1.1) Root enforces "SINGLETON" on its child:
> > -> PhysicalRoot[SINGLETON]
> >   -> Converter[SINGLETON]
> >     -> PhysicalProject[*ANY*]
> >       -> PhysicalScan[REPLICATED]
> >
> > 1.2) But since the child (PhysicalProject) failed to infer its distribution
> > during the rule call, it falls back to the ANY distribution. In order to
> > satisfy the SINGLETON distribution of the parent, we inject an exchange into
> > the final plan:
> > -> PhysicalRoot[SINGLETON]
> > *  -> Exchange[SINGLETON]*
> >     -> PhysicalProject[*ANY*]
> >       -> PhysicalScan[REPLICATED]
> >
> > 2) But this is a suboptimal plan. The optimal plan is:
> > -> PhysicalRoot[SINGLETON]
> >   -> PhysicalProject[REPLICATED]
> >     -> PhysicalScan[REPLICATED]
> >
> > You may observe it in my tests:
> > 1) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 -
> > works as you described and produces a non-optimal plan with an exchange
> > 2) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 -
> > relies on AbstractConverter-s and produces an optimal plan with bottom-up
> > trait propagation, at the cost of significantly increased planning time
> >
> > Regards,
> > Vladimir.
> >
> > Fri, 8 Nov 2019 at 16:15, XING JIN <jinxing.co...@gmail.com>:
> >
> >> Hi Vladimir,
> >>
> >> I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may
> >> help you ~
> >> They work in a top-down fashion, but when matching a parent, they convert the
> >> children explicitly.
> >> You may try the steps below:
> >> 1. Construct a rule LogicalParentRule to match the LogicalParent without a
> >> distribution/physical requirement for the LogicalChild;
> >> 2. In this rule, call 'planner.changeTraits' on the LogicalChild to
> >> build a new child with the physical convention. Note that at this moment
> >> only an empty RelSubset is created and no PhysicalChild exists yet.
> >> 3. Then set the RelNode to be the new input of the LogicalParent;
> >>
> >> By the above steps, you can build a parent-child relationship between the
> >> LogicalParent and the PhysicalChild, and at last the PhysicalParentRule will
> >> be fired based on this relationship.
> >>
> >> I have a commit to illustrate my idea; check VolcanoPlannerTest#testDEV in
> >> the link below, hope it may help you ~
> >> https://github.com/jinxing64/calcite/tree/demo
> >>
> >> Also, I'm +1 with Seliverstov on getting all parents of a set, which goes
> >> against the current check in RelSubset#getParentRels
> >>
> >> Best,
> >> Jin
> >>
> >> Vladimir Ozerov <ppoze...@gmail.com> wrote on Tue, 5 Nov 2019 at 18:41:
> >>
> >>> Hi Xiening,
> >>>
> >>> I read the thread about on-demand trait requests. It seems pretty similar
> >>> to what I am trying to achieve, as it facilitates the bottom-up propagation
> >>> of physical traits. In fact, both your and my strategies propagate traits
> >>> bottom-up, but I do this through rules, which also fire bottom-up, while in
> >>> your case only the traits are propagated bottom-up, while rules continue
> >>> working in a top-down fashion.
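Xing's steps above - create an empty RelSubset via changeTraits first, then fire the parent rule once a physical rel lands in it - can be modeled with a small toy sketch (hypothetical classes and strings; this is not Calcite's real RelSubset/VolcanoPlanner API):

```java
import java.util.*;

// Toy model of "empty subset first, parent fires later" (hypothetical names).
class SubsetSketch {
    final Map<String, List<String>> subsets = new HashMap<>();   // subset key -> physical rels
    final Map<String, List<Runnable>> waiters = new HashMap<>(); // subset key -> parent rules to fire

    // Analogous to planner.changeTraits: ensure the subset exists, possibly empty.
    String changeTraits(String node, String convention) {
        String key = node + ":" + convention;
        subsets.computeIfAbsent(key, k -> new ArrayList<>());
        return key;
    }

    // The parent hooks itself onto the (possibly still empty) child subset.
    void onRelAdded(String subsetKey, Runnable parentRule) {
        waiters.computeIfAbsent(subsetKey, k -> new ArrayList<>()).add(parentRule);
    }

    // When a physical child is finally created, the waiting parent rules fire.
    void register(String subsetKey, String physicalRel) {
        subsets.get(subsetKey).add(physicalRel);
        waiters.getOrDefault(subsetKey, List.of()).forEach(Runnable::run);
    }
}
```

The design point the sketch captures is that the parent-child relationship exists before any physical child does, so no converter rels are needed to re-trigger the parent rule.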
> >>>
> >>> However, I am thinking of how I would potentially implement my optimizer
> >>> with your approach, and it feels like with on-demand traits the resulting
> >>> implementation of metadata queries may become so complex that it will look
> >>> like another set of rules, parallel to the already existing ruleset. For
> >>> example, consider that I have a couple of distributed tables in an OLTP
> >>> application. These tables have a number of indexes, and I would like to
> >>> join them. First, I have a number of choices on how to join the tables with
> >>> respect to distribution. Then, I have a number of choices on which access
> >>> method to use, because sometimes it is beneficial to pick index scans
> >>> instead of table scans even without index conditions, for example, to
> >>> preserve a convenient collation. So when my logical scan receives such a
> >>> metadata request, it typically cannot return all possible combinations,
> >>> because there are too many of them. Instead, some heuristic or cost-based
> >>> logic will be used to calculate a couple of the most promising ones. But it
> >>> seems that we will have to duplicate the same logic in the corresponding
> >>> rule, won't we?
> >>>
> >>> I would love to read your design because this is a really interesting
> >>> topic, and it is of great importance for the distributed engines developed
> >>> on top of Calcite, since proper use of distribution and collation is the
> >>> key success factor for efficient query optimization.
> >>>
> >>> Regards,
> >>> Vladimir.
> >>>
> >>> Fri, 1 Nov 2019 at 00:40, Xiening Dai <xndai....@gmail.com>:
> >>>
> >>>> Actually we solved this problem in our setup using a mechanism called
> >>>> “Pull-Up Traits”, which explores the possible trait sets of children's
> >>>> input to decide the parent's physical properties.
> >>>> In order to determine the child input trait, you would have to look at
> >>>> the child's children, and all the way down to the leaf nodes or a
> >>>> barrier. A barrier is a rel node which cannot derive any traits
> >>>> regardless of the input. A good example would be a user-defined function
> >>>> which would throw off any distribution or collation. Then we realized
> >>>> that just pulling up is not enough; sometimes we would need to look at
> >>>> the parent's requirements as well. So we try to solve this in a unified
> >>>> framework, which we call “On Demand Trait”, and implement it as part of
> >>>> the framework so anyone can benefit. I hope Haisheng can share a design
> >>>> doc once we have more concrete ideas.
> >>>>
> >>>>
> >>>>> On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <j...@apache.org> wrote:
> >>>>>
> >>>>> Hi Vladimir,
> >>>>>
> >>>>> The SubsetTransformer interface and the iterating over the RelNodes
> >>>>> within a RelSubset in Drill is exactly implemented to do the trait
> >>>>> propagation. We also had to rely on AbstractConverter to fire the
> >>>>> necessary rules to avoid the CanNotPlan issue. At some point, the
> >>>>> Calcite community chose to remove AbstractConverter and Drill had to
> >>>>> add it back, which is probably one of the main reasons for us to
> >>>>> continue using a Calcite fork. I still remember we constantly had to
> >>>>> deal with the dilemma between "CanNotPlan" and long planning time due
> >>>>> to the explored search space.
> >>>>>
> >>>>> Glad to see more people joining the effort to solve this long overdue
> >>>>> issue, something missing in Calcite's core optimizer framework "since
> >>>>> before Calcite was Calcite" (Jacques's words).
> >>>>>
> >>>>> Jinfeng
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Danny,
> >>>>>>
> >>>>>> Thank you very much for the links.
> >>>>>> What is described here is pretty much similar to the problem I
> >>>>>> describe, especially the discussion about trait propagation, as this
> >>>>>> is basically what I need - to explore the potential traits of children
> >>>>>> before optimizing their parents. And this is basically what Drill
> >>>>>> already does with its SubsetTransformer:
> >>>>>> 1) There is a SubsetTransformer interface, which iterates over the
> >>>>>> physical relations of the given subset [1]
> >>>>>> 2) If you want to make a physical project, you iterate over the
> >>>>>> physical relations of the input subset and create possible physical
> >>>>>> projects [2]
> >>>>>> 3) But if you cannot find any physical input, then you trigger the
> >>>>>> creation of a "bad" physical project, which is very likely to have a
> >>>>>> poor cost because it cannot take advantage of the input's distribution
> >>>>>> information [3]
> >>>>>> So, step 2 is the trait set propagation which is needed by many
> >>>>>> distributed engines. Step 3 is an attempt to work around the current
> >>>>>> VolcanoPlanner behavior, where a parent rule is fired only if the
> >>>>>> produced child node has a compatible trait set.
> >>>>>>
> >>>>>> I do not know Calcite's architecture that well, but at first glance
> >>>>>> the proposed ability to re-fire the rules of a specific rel seems good
> >>>>>> enough. It doesn't expand the search space, because no new nodes are
> >>>>>> created, and it seems to be relatively simple to implement.
> >>>>>>
> >>>>>> [1] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
> >>>>>> [2] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
> >>>>>> [3] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
> >>>>>>
> >>>>>> Thu, 31 Oct 2019 at 12:21, Danny Chan <yuzhao....@gmail.com>:
> >>>>>>
> >>>>>>> Thanks Vladimir for bringing up this discussion!
> >>>>>>>
> >>>>>>> Did you notice that there is a JIRA issue about this problem? [1]
> >>>>>>> Also a discussion about how to propagate the traits [2]
> >>>>>>>
> >>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-2970
> >>>>>>> [2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Danny Chan
> >>>>>>> On 31 Oct 2019 at 15:56 +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >>>>>>>> Hi colleagues,
> >>>>>>>>
> >>>>>>>> I would like to discuss with the community the possibility of adding
> >>>>>>>> a new public method to VolcanoPlanner which will forcefully
> >>>>>>>> re-trigger the rules for a specific rel. This is a follow-up of a
> >>>>>>>> discussion started in the other thread [1].
> >>>>>>>>
> >>>>>>>> **Problem statement**
> >>>>>>>> When converting between conventions during optimization,
> >>>>>>>> VolcanoPlanner prefers the top-down approach, so that the nodes are
> >>>>>>>> converted from the root. But in some cases, an intermediate node
> >>>>>>>> must be converted after its children.
> >>>>>>>> This is especially true for distributed SQL engines, which rely on
> >>>>>>>> distribution traits during the optimization process: it is not
> >>>>>>>> possible to efficiently choose a proper physical implementation of a
> >>>>>>>> parent node unless the physical representation of a child node is
> >>>>>>>> known.
> >>>>>>>>
> >>>>>>>> It seems that presently VolcanoPlanner cannot address such cases by
> >>>>>>>> default. Consider that we have two nodes and associated rules which
> >>>>>>>> convert them to physical counterparts:
> >>>>>>>> [LogicalParent <- LogicalChild]
> >>>>>>>> The parent should be converted after the child. When the child is
> >>>>>>>> converted, the physical node is created:
> >>>>>>>> [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> In order to finish the optimization process, we need to convert the
> >>>>>>>> parent. But the parent rules are not fired, because PhysicalChild
> >>>>>>>> has traits incompatible with LogicalParent.
> >>>>>>>>
> >>>>>>>> Presently the problem can be solved in two ways:
> >>>>>>>> 1) Always produce conversions when going top-down. In this case, I
> >>>>>>>> first create a physical parent, then a physical child. The problem
> >>>>>>>> is that the created parent is not optimal because it didn't know the
> >>>>>>>> child's distribution at the time of creation. So the full flow would
> >>>>>>>> be: create a bad parent, create a good child, create a good parent.
> >>>>>>>> 1.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
> >>>>>>>> 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- {LogicalChild, PhysicalChild}]
> >>>>>>>> What is worse, the creation of a non-optimal parent will trigger
> >>>>>>>> rules on its parent, which in turn may create a non-optimal
> >>>>>>>> parent-parent node, etc.
> >>>>>>>>
> >>>>>>>> 2) Make sure that your convention returns true for
> >>>>>>>> Convention.canConvertConvention. In this case, every time a new rel
> >>>>>>>> is added to a RelSet, its traitset will be compared to the traitsets
> >>>>>>>> of all other rels in the set. For every pair of traitsets we may ask
> >>>>>>>> the engine to create a relevant AbstractConverter. The net effect is
> >>>>>>>> that a "physical-to-logical" converter is created, which re-triggers
> >>>>>>>> the rule on the logical parent since their conventions are
> >>>>>>>> compatible:
> >>>>>>>> 2.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> >>>>>>>> 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> >>>>>>>>
> >>>>>>>> The problem with that approach is that it is too coarse-grained,
> >>>>>>>> since we operate on traitsets rather than on rels. As a result,
> >>>>>>>> additional memory and CPU pressure is introduced because usually too
> >>>>>>>> many converters are created, which triggers the rules over and over
> >>>>>>>> again.
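The coarse-grained nature of workaround 2 can be illustrated with a toy count of the converter candidates produced per pair of distinct traitsets in a set (a simplified model; this is not Calcite's actual AbstractConverter logic):

```java
import java.util.*;

// Toy illustration: with canConvertConvention enabled, each ordered pair of
// distinct traitsets in a RelSet may spawn one converter candidate, so the
// number of converters (and re-fired rules) grows quadratically.
class ConverterCount {
    static List<String> convertersFor(List<String> traitSets) {
        List<String> converters = new ArrayList<>();
        for (String from : traitSets)
            for (String to : traitSets)
                if (!from.equals(to))
                    converters.add(from + "->" + to);
        return converters;
    }
}
```

Three traitsets already yield six converter candidates, and every converter added to the set can re-trigger rule matching, which is the memory/CPU pressure described above.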
> >>>>>>>>
> >>>>>>>> **Affected products**
> >>>>>>>> At the moment two distributed engines are being developed for
> >>>>>>>> Hazelcast and Apache Ignite. Both require bottom-up optimization and
> >>>>>>>> currently rely on the second workaround.
> >>>>>>>> Another example is Apache Drill. I do not know whether its community
> >>>>>>>> is concerned with the issue, but it also uses bottom-up optimization
> >>>>>>>> for many rules and employs both of the aforementioned workarounds.
> >>>>>>>> As a result, I guess that Drill's optimizer also creates too many
> >>>>>>>> rels during optimization and suffers from a huge search space.
> >>>>>>>> Please correct me if I am wrong.
> >>>>>>>>
> >>>>>>>> **Proposal**
> >>>>>>>> The key problem is that there is no way to re-fire rules on a
> >>>>>>>> specific node. The original problem could have been solved if it
> >>>>>>>> were possible to re-fire rules on a LogicalParent without creating
> >>>>>>>> any additional rels. That would lead to a clear optimization flow:
> >>>>>>>> 2.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}]
> >>>>>>>>
> >>>>>>>> We could add the following method to the VolcanoPlanner
> >>>>>>>> (RelOptPlanner?) interface:
> >>>>>>>> void fireRules(RelNode rel)
> >>>>>>>>
> >>>>>>>> This method will fire the rules on the passed node in a deferred
> >>>>>>>> mode, as if it were a new node just added to the optimizer. This
> >>>>>>>> would require slight changes to the RuleQueue.addMatch method, and
> >>>>>>>> possibly some other places.
> >>>>>>>>
> >>>>>>>> Usage example:
> >>>>>>>> class PhysicalChildRule extends RelOptRule {
> >>>>>>>>   void onMatch(RelOptRuleCall call) {
> >>>>>>>>     LogicalChild logicalRel = call.get(0);
> >>>>>>>>     PhysicalChild physicalRel = ...;
> >>>>>>>>
> >>>>>>>>     call.transformTo(physicalRel);
> >>>>>>>>     optimizer.fireRules(logicalRel);
> >>>>>>>>   }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> What does the community think about such a method? Does it make any
> >>>>>>>> sense to you? If not, are you aware of any other ways to organize
> >>>>>>>> bottom-up optimization with VolcanoPlanner without the creation of
> >>>>>>>> additional rels?
> >>>>>>>>
> >>>>>>>> If the community is OK in general, I can try to create a PR with a
> >>>>>>>> prototype.
> >>>>>>>>
> >>>>>>>> Would appreciate your feedback.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Vladimir.
> >>>>>>>>
> >>>>>>>> [1] https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
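The proposed fireRules(rel) behavior, re-enqueueing the node's rule matches in a deferred queue without creating any rels, can be sketched with a toy model (hypothetical classes; this is not an actual VolcanoPlanner or RuleQueue patch):

```java
import java.util.*;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Toy model of the proposed deferred re-firing (hypothetical names).
class RuleQueueSketch {
    static class Rule {
        final String name;
        final Predicate<String> matches;
        final Consumer<String> onMatch;
        Rule(String name, Predicate<String> matches, Consumer<String> onMatch) {
            this.name = name; this.matches = matches; this.onMatch = onMatch;
        }
    }

    final List<Rule> rules = new ArrayList<>();
    final Deque<Runnable> queue = new ArrayDeque<>();

    // The proposed method: enqueue matches for `rel` as if it were a node
    // just added to the optimizer; no converter rels are created.
    void fireRules(String rel) {
        for (Rule r : rules)
            if (r.matches.test(rel))
                queue.add(() -> r.onMatch.accept(rel));
    }

    // The planner's main loop later drains the deferred matches.
    void drain() {
        while (!queue.isEmpty()) queue.poll().run();
    }
}
```

The key property the sketch shows is that the call is deferred: the match is only queued at fireRules time and executes later in the planner's normal rule-draining loop, which is why the change would touch RuleQueue.addMatch rather than invoke rules synchronously.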