Hi guys, it seems that the discussion has gone silent now, so do we have a conclusion that can contribute to the current code, i.e. the suggested API change or a new abstraction?
Or better, can someone write a design doc so that we can push this forward and get it implemented? Personally, I have always been looking forward to the result, because Apache Flink suffers from bad planning performance for join re-ordering and trait auto-adaptation.

Best,
Danny Chan

On Nov 20, 2019, 2:14 AM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> Hi Igor,
>
> Thank you for the details. Meanwhile, I solved it by separating the conversion rules from the physical optimization rules. So the first pass creates physical nodes with unknown physical properties (top-down), while subsequent processing of the leaf nodes triggers the rules which convert "bad" physical nodes to "good" physical nodes with known distribution and collation.
>
> Regards,
> Vladimir.
>
> On Mon, Nov 18, 2019 at 13:43, Seliverstov Igor <gvvinbl...@gmail.com> wrote:
> > Vladimir,
> >
> > Hope it may help you.
> >
> > Currently we apply the following approach (just a rough description):
> >
> > 1) We created an API to derive possible trait permutations on the basis of children's traits (quite similar to the one described in the «On Demand trait set request» topic).
> >
> > 2) We added a general rule that copies Logical nodes, but requests our convention from their children (IGNITE convention, ANY distribution, EMPTY collation) and sets the importance of the old Logical nodes to zero, so we have a Logical parent whose input satisfies any possible distribution, and no rules match the previous logical node any more.
> >
> > 3) Physical rules create physical rel nodes only if the physical traits can be derived (there is no «barrier», described in one of the previous messages). The derived traits are a collection; we don't create a physical rel node for each possible trait set, and we may also set zero importance for previously created rel nodes to decrease the search space.
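Igor's step 1 above (deriving the possible trait permutations of a parent from the traits its children can offer) can be sketched as a small self-contained toy. All class and method names here are hypothetical; this is not Calcite's or Ignite's API, just an illustration of enumerating the cartesian product of per-child trait options:

```java
import java.util.ArrayList;
import java.util.List;

// Toy model: each child offers a list of possible trait sets (here just
// strings); the parent's candidates are every combination of one option
// per child. Real implementations would cap or prune this enumeration.
class TraitPermutations {
    /** Enumerates every combination of one trait set per child. */
    static List<List<String>> permutations(List<List<String>> childTraits) {
        List<List<String>> result = new ArrayList<>();
        result.add(new ArrayList<>());
        for (List<String> options : childTraits) {
            List<List<String>> next = new ArrayList<>();
            for (List<String> prefix : result) {
                for (String option : options) {
                    List<String> combo = new ArrayList<>(prefix);
                    combo.add(option);
                    next.add(combo);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        // Child 0 may be HASH- or BROADCAST-distributed; child 1 only HASH.
        List<List<String>> perms = permutations(List.of(
            List.of("HASH", "BROADCAST"),
            List.of("HASH")));
        System.out.println(perms); // [[HASH, HASH], [BROADCAST, HASH]]
    }
}
```

Note that the number of permutations grows multiplicatively with the number of children, which is presumably why Igor mentions zero-importance marking to keep the search space down.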
> > Now that we know the actual and the required distribution, we don't need AbstractConverters and are able to just call the TraitDef.convert() method inside a rule.
> > A rule is still able to produce the same output several times, but the «memoization» inside the planner solves that for us.
> >
> > Preliminary tests show almost zero overhead of the approach.
> >
> > Regards,
> > Igor
> >
> > > On Nov 14, 2019, at 14:49, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> > >
> > > Hi Xing,
> > >
> > > Thanks for your suggestion. Yes, this may help, and if I get your idea right, I already had it in my reproducer:
> > > 1) Create the converted physical input:
> > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> > > 2) Use it in case no physical children were found:
> > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
> > >
> > > This idea is borrowed from Apache Drill's physical rules. But the problem is that this approach leads to a suboptimal plan: the parent node doesn't know the future distribution of a child node, and as a result it doesn't know its own distribution. So the final plan is constructed in this way:
> > > 1.1) Root enforces "SINGLETON" on its child:
> > > -> PhysicalRoot[SINGLETON]
> > >  -> Converter[SINGLETON]
> > >   -> PhysicalProject[ANY]
> > >    -> PhysicalScan[REPLICATED]
> > >
> > > 1.2) But since the child (PhysicalProject) failed to infer its distribution during the rule call, it falls back to the ANY distribution. In order to satisfy the SINGLETON distribution of the parent, we inject an exchange into the final plan:
> > > -> PhysicalRoot[SINGLETON]
> > >  -> Exchange[SINGLETON]
> > >   -> PhysicalProject[ANY]
> > >    -> PhysicalScan[REPLICATED]
> > >
> > > 2) But this is a suboptimal plan.
> > > The optimal plan is:
> > > -> PhysicalRoot[SINGLETON]
> > >  -> PhysicalProject[REPLICATED]
> > >   -> PhysicalScan[REPLICATED]
> > >
> > > You may observe it in my tests:
> > > 1) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 - works as you described and produces a non-optimal plan with an exchange
> > > 2) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 - relies on AbstractConverter-s and produces an optimal plan with bottom-up trait propagation, at the cost of significantly increased planning time
> > >
> > > Regards,
> > > Vladimir.
> > >
> > > On Fri, Nov 8, 2019 at 16:15, XING JIN <jinxing.co...@gmail.com> wrote:
> > > > Hi Vladimir,
> > > >
> > > > I think the way PlannerTests#GoodSingleRule and the EnumerableXXXRules work may help you ~
> > > > They work in a top-down fashion, but when matching the parent, they convert the children explicitly.
> > > > You may try the steps below:
> > > > 1. Construct a rule LogicalParentRule to match the LogicalParent without a distribution/physical requirement for the LogicalChild;
> > > > 2. In this rule, call 'planner.changeTraits' on the LogicalChild to build a new child with the physical convention. Note that at this moment only an empty RelSubset is created and no PhysicalChild exists yet.
> > > > 3. Then set the new RelNode to be the input of the LogicalParent;
> > > >
> > > > By the above steps, you can build a parent-child relationship between the LogicalParent and the PhysicalChild, and at last the PhysicalParentRule will be fired based on this relationship.
> > > > I have a commit to illustrate my idea; check VolcanoPlannerTest#testDEV in the link below, hope it may help you ~
> > > > https://github.com/jinxing64/calcite/tree/demo
> > > >
> > > > Also, I'm +1 with Seliverstov on getting all parents of a set, which goes against the current check in RelSubset#getParentRels.
> > > >
> > > > Best,
> > > > Jin
> > > >
> > > > On Tue, Nov 5, 2019 at 6:41 PM, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> > > > > Hi Xiening,
> > > > >
> > > > > I read the thread about on-demand trait requests. It seems pretty similar to what I am trying to achieve, as it facilitates the bottom-up propagation of physical traits. In fact, both your strategy and mine propagate traits bottom-up, but I do this through rules, which also fire bottom-up, while in your case only the traits are propagated bottom-up and the rules continue working in a top-down fashion.
> > > > >
> > > > > However, I am thinking of how I would potentially implement my optimizer with your approach, and it feels like with on-demand traits the resulting implementation of metadata queries may become so complex that it will look like another set of rules, parallel to the already existing rule set. For example, consider that I have a couple of distributed tables in an OLTP application. These tables have a number of indexes, and I would like to join them. First, I have a number of choices on how to join the tables with respect to distribution. Then, I have a number of choices on which access method to use, because sometimes it is beneficial to pick index scans instead of table scans even without index conditions, for example to preserve a useful collation.
> > > > > So when my logical scan receives such a metadata request, it typically cannot return all possible combinations, because there are too many of them. Instead, some heuristic or cost-based logic will be used to calculate a couple of the most promising ones. But it seems that we will have to duplicate the same logic in the corresponding rule, won't we?
> > > > >
> > > > > I would love to read your design doc, because this is a really interesting topic, and it is of great importance for the distributed engines developed on top of Calcite, since the proper use of distribution and collation is the key success factor for efficient query optimization.
> > > > >
> > > > > Regards,
> > > > > Vladimir.
> > > > >
> > > > > On Fri, Nov 1, 2019 at 00:40, Xiening Dai <xndai....@gmail.com> wrote:
> > > > > > Actually, we solved this problem in our setup using a mechanism called “Pull-Up Traits”, which explores the possible trait sets of a node's children in order to decide the parent's physical properties. In order to determine a child's input trait, you have to look at the child's children, and so on, all the way down to the leaf nodes or to a barrier. A barrier is a rel node which cannot derive any traits regardless of its input. A good example would be a user-defined function which would throw off any distribution or collation. Then we realized that just pulling up is not enough; sometimes we also need to look at the parent's requirement. So we are trying to solve this in a unified framework, which we call “On Demand Trait”, and implement it as part of the framework so that everyone can benefit.
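The "Pull-Up Traits" walk and the notion of a barrier described above can be illustrated with a self-contained toy model. Everything here (Node, pullUpDistribution, the trait strings) is a made-up sketch, not Calcite code or the framework Xiening describes:

```java
import java.util.List;

// Toy model of pulling a distribution trait up from the leaves. A barrier
// node (e.g. a user-defined table function) cannot derive any trait and
// therefore yields ANY; otherwise a node keeps its children's distribution
// when all children agree.
class PullUpTraits {
    static final String ANY = "ANY";

    static class Node {
        final String name;
        final boolean barrier;        // true: traits cannot pass through
        final String ownDistribution; // set on leaves (scans)
        final List<Node> inputs;

        Node(String name, boolean barrier, String ownDistribution, List<Node> inputs) {
            this.name = name;
            this.barrier = barrier;
            this.ownDistribution = ownDistribution;
            this.inputs = inputs;
        }

        /** Pulls the distribution trait up from the leaves; barriers yield ANY. */
        String pullUpDistribution() {
            if (barrier) {
                return ANY;
            }
            if (inputs.isEmpty()) {
                return ownDistribution;
            }
            // Keep the children's distribution only if all children agree.
            String result = inputs.get(0).pullUpDistribution();
            for (Node input : inputs) {
                if (!input.pullUpDistribution().equals(result)) {
                    return ANY;
                }
            }
            return result;
        }
    }

    public static void main(String[] args) {
        Node scan = new Node("Scan", false, "HASH[a]", List.of());
        Node project = new Node("Project", false, null, List.of(scan));
        Node udf = new Node("TableFunction", true, null, List.of(scan));

        System.out.println(project.pullUpDistribution()); // HASH[a]
        System.out.println(udf.pullUpDistribution());     // ANY
    }
}
```

The project preserves its scan's distribution, while the barrier discards it, which is exactly why the walk has to stop at barriers: nothing useful can be pulled up past them.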
> > > > > > I hope Haisheng can share a design doc once we have more concrete ideas.
> > > > > >
> > > > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <j...@apache.org> wrote:
> > > > > > >
> > > > > > > Hi Vladimir,
> > > > > > >
> > > > > > > The SubsetTransformer interface and the iteration over the RelNodes within a RelSubset in Drill are implemented exactly to do this kind of trait propagation. We also had to rely on AbstractConverter to fire the necessary rules to avoid the CanNotPlan issue. At some point, the Calcite community chose to remove AbstractConverter, and Drill had to add it back, which is probably one of the main reasons for us to continue using a Calcite fork. I still remember we constantly had to deal with the dilemma between "CanNotPlan" and long planning time due to the exploded search space.
> > > > > > >
> > > > > > > Glad to see more people joining the effort to solve this long-overdue issue, something missing in Calcite's core optimizer framework "since before Calcite was Calcite" (Jacques's words).
> > > > > > >
> > > > > > > Jinfeng
> > > > > > >
> > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov <ppoze...@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi Danny,
> > > > > > > >
> > > > > > > > Thank you very much for the links. What is described there is pretty much similar to the problem I describe, especially the discussion about trait propagation, as this is basically what I need: to explore the potential traits of children before optimizing their parents.
> > > > > > > > And this is basically what Drill already does with its SubsetTransformer:
> > > > > > > > 1) There is a SubsetTransformer interface, which iterates over the physical relations of the given subset [1]
> > > > > > > > 2) If you want to make a physical project, you iterate over the physical relations of the input subset and create the possible physical projects [2]
> > > > > > > > 3) But if you cannot find any physical input, then you trigger the creation of a "bad" physical project, which is very likely to have a poor cost because it cannot take advantage of the input's distribution information [3]
> > > > > > > > So, step 2 is the trait-set propagation which is needed by many distributed engines. Step 3 is an attempt to work around the current VolcanoPlanner behavior, where a parent rule is fired only if the produced child node has a compatible trait set.
> > > > > > > >
> > > > > > > > I do not know Calcite's architecture that well, but at first glance the proposed ability to re-fire the rules of a specific rel seems good enough. It doesn't expand the search space, because no new nodes are created, and it seems to be relatively simple to implement.
> > > > > > > > [1] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
> > > > > > > > [2] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
> > > > > > > > [3] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
> > > > > > > >
> > > > > > > > On Thu, Oct 31, 2019 at 12:21, Danny Chan <yuzhao....@gmail.com> wrote:
> > > > > > > > > Thanks Vladimir for bringing up this discussion!
> > > > > > > > >
> > > > > > > > > Did you notice that there is a JIRA issue about this problem? [1] Also a discussion about how to propagate the traits [2]
> > > > > > > > >
> > > > > > > > > [1] https://issues.apache.org/jira/browse/CALCITE-2970
> > > > > > > > > [2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Danny Chan
> > > > > > > > > On Oct 31, 2019, 3:56 PM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> > > > > > > > > > Hi colleagues,
> > > > > > > > > >
> > > > > > > > > > I would like to discuss with the community the possibility of adding a new public method to VolcanoPlanner which will forcefully re-trigger the rules for a specific rel. This is a follow-up to a discussion started in the other thread [1].
> > > > > > > > > > **Problem statement**
> > > > > > > > > > When converting between conventions during optimization, VolcanoPlanner prefers the top-down approach, so that the nodes are converted starting from the root. But in some cases an intermediate node must be converted after its children. This is especially true for distributed SQL engines, which rely on distribution traits during the optimization process: it is not possible to efficiently choose a proper physical implementation of a parent node unless the physical representation of a child node is known.
> > > > > > > > > >
> > > > > > > > > > It seems that presently VolcanoPlanner cannot address such cases by default. Consider that we have two nodes and associated rules which convert them to their physical counterparts:
> > > > > > > > > > [LogicalParent <- LogicalChild]
> > > > > > > > > > The parent should be converted after the child. When the child is converted, the physical node is created:
> > > > > > > > > > [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > In order to finish the optimization process, we need to convert the parent. But the parent's rules are not fired, because PhysicalChild has traits incompatible with LogicalParent.
> > > > > > > > > >
> > > > > > > > > > Presently the problem can be solved in two ways:
> > > > > > > > > > 1) Always produce conversions when going top-down.
> > > > > > > > > > In this case, I first create a physical parent, then a physical child. The problem is that the created parent is not optimal, because it didn't know the child's distribution at the time of creation. So the full flow would be: create a bad parent, create a good child, create a good parent.
> > > > > > > > > > 1.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
> > > > > > > > > > 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > What is worse, the creation of a non-optimal parent will trigger rules on its parent, which in turn may create a non-optimal parent-parent node, etc.
> > > > > > > > > >
> > > > > > > > > > 2) Make sure that your convention returns true for Convention.canConvertConvention. In this case, every time a new rel is added to a RelSet, its trait set will be compared to the trait sets of all the other rels in the set. For every pair of trait sets we may ask the engine to create a relevant AbstractConverter.
> > > > > > > > > > The net effect is that a "physical-to-logical" converter is created, which re-triggers the rule on the logical parent, since their conventions are compatible:
> > > > > > > > > > 2.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> > > > > > > > > > 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> > > > > > > > > >
> > > > > > > > > > The problem with that approach is that it is too coarse-grained, since we operate on trait sets rather than rels. As a result, additional memory and CPU pressure are introduced, because usually too many converters are created, which triggers the rules over and over again.
> > > > > > > > > >
> > > > > > > > > > **Affected products**
> > > > > > > > > > At the moment, two distributed engines are being developed for Hazelcast and for Apache Ignite. Both require bottom-up optimization and currently rely on the second workaround.
> > > > > > > > > > Another example is Apache Drill. I do not know whether its community is concerned with the issue, but it also uses bottom-up optimization for many rules and employs both of the aforementioned workarounds.
> > > > > > > > > > As a result, I guess that Drill's optimizer also creates too many rels during optimization and suffers from a huge search space. Please correct me if I am wrong.
> > > > > > > > > >
> > > > > > > > > > **Proposal**
> > > > > > > > > > The key problem is that there is no way to re-fire the rules on a specific node. The original problem could have been solved if it were possible to re-fire the rules on a LogicalParent without creating any additional rels. That would lead to a clear optimization flow:
> > > > > > > > > > 2.1) [LogicalParent <- LogicalChild]
> > > > > > > > > > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> > > > > > > > > > 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}]
> > > > > > > > > >
> > > > > > > > > > We can add the following method to the VolcanoPlanner (RelOptPlanner?) interface:
> > > > > > > > > > void fireRules(RelNode rel)
> > > > > > > > > >
> > > > > > > > > > This method will fire the rules on the passed node in a deferred mode, as if it were a new node just added to the optimizer. This would require slight changes to the RuleQueue.addMatch method, and possibly some other places.
> > > > > > > > > > Usage example:
> > > > > > > > > > class PhysicalChildRule extends RelOptRule {
> > > > > > > > > >   void onMatch(RelOptRuleCall call) {
> > > > > > > > > >     LogicalChild logicalRel = call.rel(0);
> > > > > > > > > >     PhysicalChild physicalRel = ...;
> > > > > > > > > >
> > > > > > > > > >     call.transformTo(physicalRel);
> > > > > > > > > >     optimizer.fireRules(logicalRel);
> > > > > > > > > >   }
> > > > > > > > > > }
> > > > > > > > > >
> > > > > > > > > > What does the community think about such a method? Does it make any sense to you? If not, are you aware of any other ways to organize bottom-up optimization with VolcanoPlanner without the creation of additional rels?
> > > > > > > > > >
> > > > > > > > > > If the community is OK with it in general, I can try to create a PR with a prototype.
> > > > > > > > > >
> > > > > > > > > > Would appreciate your feedback.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > > Vladimir.
> > > > > > > > > >
> > > > > > > > > > [1] https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
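The control flow of the fireRules proposal discussed in this thread can be sketched with a tiny self-contained model. MiniPlanner and everything below are made-up names (not VolcanoPlanner's API); the point is only to show how a deferred re-enqueue lets the parent be revisited after its child becomes physical:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;

// Toy model: a planner with a deferred rule queue. Newly registered nodes
// are matched automatically; fireRules(rel) re-enqueues an already
// registered node, as the proposal suggests.
class FireRulesDemo {
    static class MiniPlanner {
        final Deque<String> queue = new ArrayDeque<>();
        final List<String> fired = new ArrayList<>();
        Consumer<String> rule; // a single "rule" that sees every dequeued node

        void register(String rel) {
            queue.add(rel); // new nodes are matched automatically
        }

        /** The proposed API: re-enqueue matches for an existing node. */
        void fireRules(String rel) {
            queue.add(rel);
        }

        void run() {
            while (!queue.isEmpty()) {
                String rel = queue.poll();
                fired.add(rel);
                rule.accept(rel);
            }
        }
    }

    public static void main(String[] args) {
        MiniPlanner planner = new MiniPlanner();
        planner.rule = rel -> {
            if (rel.equals("LogicalChild")) {
                planner.register("PhysicalChild");
                // Without this call, LogicalParent would never be revisited,
                // because PhysicalChild's traits do not match it.
                planner.fireRules("LogicalParent");
            }
        };
        planner.register("LogicalParent"); // fires once, child not yet physical
        planner.register("LogicalChild");
        planner.run();
        System.out.println(planner.fired);
        // [LogicalParent, LogicalChild, PhysicalChild, LogicalParent]
    }
}
```

The second visit to LogicalParent stands in for the moment when the parent rule can finally see a physical child, which is exactly the step the real proposal wants to enable without creating any extra rels.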