Vladimir,

Hope this helps.

Here is a rough description of the approach we currently apply:

1) We created an API to derive possible trait permutations on the basis of 
children's traits (quite similar to the one described in the "On Demand 
trait set request" topic).

2) We added a general rule that copies Logical nodes while requesting our 
convention from their children (IGNITE convention, ANY distribution, EMPTY 
collation) and sets the importance of the old Logical nodes to zero - so we 
get a Logical parent whose input satisfies any possible distribution, and no 
rules match the previous logical node any more.

3) Physical rules create physical rel nodes only if physical traits can be 
derived (there is no "barrier", described in one of the previous messages). 
The derived traits are a collection; we don't create a physical rel node for 
every possible trait set, and we may also set zero importance for previously 
created rel nodes to decrease the search space.

Now that we know both the actual and the required distribution, we don't 
need AbstractConverters and can simply call the TraitDef.convert() method 
inside a rule.
A rule may still produce the same output several times, but the memoization 
inside the planner handles that for us.
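As a rough Java-style sketch (pseudocode: PhysicalProjectRule, PhysicalProject, and deriveTraits() are illustrative names, not the actual Ignite API), such a physical rule might look like this:

```java
// Illustrative pseudocode, not the actual Ignite implementation.
public class PhysicalProjectRule extends RelOptRule {
  @Override
  public void onMatch(RelOptRuleCall call) {
    LogicalProject project = call.rel(0);
    RelNode input = project.getInput();

    // Step 1: the hypothetical API that derives possible trait
    // permutations from the child's traits.
    for (RelTraitSet derived : deriveTraits(input)) {
      // Both the actual and the required distributions are known here,
      // so the child is converted directly (RelOptRule.convert()
      // delegates to TraitDef.convert()); no AbstractConverters needed.
      RelNode convertedInput = convert(input, derived);
      call.transformTo(createPhysicalProject(project, convertedInput));
    }
    // The planner's memoization deduplicates repeated identical outputs.
  }
}
```

Producing only a few promising trait sets from deriveTraits(), plus zeroing the importance of superseded rel nodes, is what keeps the search space small.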

Preliminary tests show almost zero overhead from this approach.

Regards,
Igor


> On 14 Nov 2019, at 14:49, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> 
> Hi Xing,
> 
> Thanks for your suggestion. Yes, this may help, and if I get your idea
> right, I already had it in my reproducer:
> 1) Create the converted physical input:
> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> 2) Use it in case no physical children were found:
> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
> 
> This idea is borrowed from Apache Drill physical rules. But the problem is
> that this approach leads to a suboptimal plan - the parent node doesn't
> know the future distribution of the child node, and as a result it doesn't
> know its own distribution. So the final plan is constructed as follows:
> 1.1) Root enforced "SINGLETON" on its child:
> -> PhysicalRoot[SINGLETON]
> -> Converter[SINGLETON]
>  -> PhysicalProject[*ANY*]
>   -> PhysicalScan[REPLICATED]
> 
> 1.2) But since the child (PhysicalProject) failed to infer distribution
> during rule call, it falls back to ANY distribution. In order to satisfy
> SINGLETON distribution of a parent, we inject an exchange in the final plan:
> -> PhysicalRoot[SINGLETON]
> * -> Exchange[SINGLETON]*
>  -> PhysicalProject[*ANY*]
>   -> PhysicalScan[REPLICATED]
> 
> 2) But this is a suboptimal plan. The optimal plan is:
> -> PhysicalRoot[SINGLETON]
> -> PhysicalProject[REPLICATED]
>  -> PhysicalScan[REPLICATED]
> 
> You may observe it in my tests:
> 1)
> https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46
> -
> works as you described and produces a suboptimal plan with an exchange
> 2)
> https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30
> -
> relies on AbstractConverters and produces an optimal plan with bottom-up
> trait propagation at the cost of significantly increased planning time
> 
> Regards,
> Vladimir.
> 
> On Fri, 8 Nov 2019 at 16:15, XING JIN <jinxing.co...@gmail.com> wrote:
> 
>> Hi Vladimir,
>> 
>> I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may
>> help you ~
>> They work in a top-down fashion, but when matching the parent, they convert
>> the children explicitly.
>> You may try below steps:
>> 1. Construct a rule LogicalParentRule to match the LogicalParent without
>> distribution/physical requirement for the LogicalChild;
>> 2. In this rule, you call 'planner.changeTraits' on the LogicalChild to
>> build a new child with physical convention. Note that at this moment only
>> an empty RelSubset is created and no PhysicalChild exists.
>> 3. Then set the RelNode to be the new input of LogicalParent;
>> 
>> By above steps, you can build a parent-child relationship between
>> LogicalParent and PhysicalChild, and at last the PhysicalParentRule will be
>> fired based on this relationship.
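>> A rough Java-style sketch of these steps (pseudocode: LogicalParentRule,
>> LogicalParent, and PHYSICAL are assumed names, not part of my commit):
>> 
>> ```java
>> // Illustrative pseudocode for steps 1-3 above.
>> public class LogicalParentRule extends RelOptRule {
>>   @Override
>>   public void onMatch(RelOptRuleCall call) {
>>     LogicalParent parent = call.rel(0);
>>     RelNode child = parent.getInput();
>> 
>>     // Step 2: request the physical convention for the child. At this
>>     // moment only an empty RelSubset with these traits is registered.
>>     RelTraitSet traits = child.getTraitSet().replace(PHYSICAL);
>>     RelNode newChild = call.getPlanner().changeTraits(child, traits);
>> 
>>     // Step 3: make that RelSubset the new input of the parent, so the
>>     // PhysicalParentRule fires once a PhysicalChild appears in it.
>>     call.transformTo(
>>         parent.copy(parent.getTraitSet(), ImmutableList.of(newChild)));
>>   }
>> }
>> ```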
>> 
>> I have a commit to illustrate my idea, check VolcanoPlannerTest#testDEV in
>> below link, hope it may help you ~
>> https://github.com/jinxing64/calcite/tree/demo
>> 
>> Also I'm +1 with Seliverstov on getting all parents of a set, which goes
>> against the current check in RelSubset#getParentRels.
>> 
>> Best,
>> Jin
>> 
>> On Tue, 5 Nov 2019 at 18:41, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>> 
>>> Hi Xiening,
>>> 
>>> I read the thread about on-demand trait requests. It seems pretty similar
>>> to what I am trying to achieve, as it facilitates the bottom-up
>> propagation
>>> of physical traits. In fact, both your and my strategy propagate traits
>>> bottom-up, but I do this through rules, which also fire bottom-up, while
>> in
>>> your case only the traits are propagated bottom-up, while rules continue
>>> working in a top-down fashion.
>>> 
>>> However, I am thinking of how I would potentially implement my optimizer
>>> with your approach, and it feels like with on-demand traits the resulting
>>> implementation of metadata queries may become so complex that it will
>>> look like another set of rules, parallel to the already existing
>>> ruleset. For example, consider that I have a couple of
>> distributed
>>> tables in an OLTP application. These tables have a number of indexes,
>> and I
>>> would like to join them. First, I have a number of choices on how to join
>>> tables with respect to distribution. Then, I have a number of choices on
>>> which access method to use. Because sometimes it is beneficial to pick
>>> index scans instead of table scans even without index conditions, for
>>> example, to preserve a comfortable collation. So when my logical scan
>>> receives such metadata request, it typically cannot return all possible
>>> combinations, because there are too many of them. Instead, some
>> heuristical
>>> or cost-based logic will be used to calculate a couple of most
>> prospective
>>> ones. But it seems that we will have to duplicate the same logic in the
>>> corresponding rule, don't we?
>>> 
>>> I would love to read your design because this is a really interesting
>>> topic, and it is of great importance for the distributed engines
>> developed
>>> on top of Calcite since proper use of distribution and collation is the
>> key
>>> success factor for efficient query optimization.
>>> 
>>> Regards,
>>> Vladimir.
>>> 
>>> On Fri, 1 Nov 2019 at 00:40, Xiening Dai <xndai....@gmail.com> wrote:
>>> 
>>>> Actually we solved this problem in our setup using a mechanism called
>>>> “Pull-Up Traits”, which explores the possible trait set of children’s
>>> input
>>>> to decide parent’s physical properties. In order to determine child
>> input
>>>> trait, you would have to look at child’s children, and all the way to
>> the
>>>> leaf nodes or a barrier. A barrier is a rel node which cannot derive
>>> any
>>>> traits regardless of the input. A good example would be a user-defined
>>> function
>>>> which would throw off any distribution or collation. Then we realize
>> just
>>>> pulling up is not enough, sometimes we would need to look at parent’s
>>>> requirement as well. So we try to solve this in a unified framework,
>>> which
>>>> we call “On Demand Trait” and implement it as part of the framework so
>>>> anyone can benefit. I hope Haisheng can share a design doc once we
>>>> have more concrete ideas.
>>>> 
>>>> 
>>>>> On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <j...@apache.org> wrote:
>>>>> 
>>>>> Hi Vladimir,
>>>>> 
>>>>> The SubsetTransformer interface and the iteration over the RelNodes
>>>>> within a RelSubset in Drill are implemented exactly to do the trait
>>>>> propagation. We also had to rely on AbstractConverter to fire
>>>>> necessary rule to avoid the CanNotPlan issue. At some point, Calcite
>>>>> community chooses to remove AbstractConverter and Drill had to add it
>>>>> back, which is probably one of the main reasons for us to continue
>>>>> using a Calcite fork.  I still remember we constantly had to deal
>> with
>>>>> the dilemma between "CanNotPlan" and long planning time due to the
>> explored
>>>>> search space.
>>>>> 
>>>>> Glad to see more people are joining the effort to solve this long
>>>>> overdue issue, something missing in Calcite's core optimizer
>> framework
>>>>> "since before Calcite was Calcite" (Jacques's words).
>>>>> 
>>>>> Jinfeng
>>>>> 
>>>>> 
>>>>> On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov <ppoze...@gmail.com>
>>>> wrote:
>>>>>> 
>>>>>> Hi Danny,
>>>>>> 
>>>>>> Thank you very much for the links. What is described here is pretty
>>> much
>>>>>> similar to the problem I describe. Especially the discussion about
>>> trait
>>>>>> propagation, as this is basically what I need - to explore potential
>>>> traits
>>>>>> of children before optimizing parents. And this is basically what
>>> Drill
>>>>>> already does with its SubsetTransformer:
>>>>>> 1) There is a SubsetTransformer interface, which iterates over
>>> physical
>>>>>> relations of the given subset [1]
>>>>>> 2) If you want to make a physical project, you iterate over physical
>>>>>> relations of the input subset and create possible physical projects
>>> [2]
>>>>>> 3) But if you cannot find any physical input, then you trigger
>>> creation
>>>> of
>>>>>> a "bad" physical project, which is very likely to have poor cost
>>>> because it
>>>>>> cannot take advantage of input's distribution information [3]
>>>>>> So, step 2 - is a trait set propagation which is needed by many
>>>>>> distributed engines. Step 3 - an attempt to workaround current
>>>>>> VolcanoPlanner behavior, when a parent rule is fired only if
>> produced
>>>> child
>>>>>> node has compatible trait set.
>>>>>> 
>>>>>> I do not know Calcite's architecture that good but at first glance,
>>> the
>>>>>> proposed ability to re-fire rules of a specific Rel seems good
>> enough.
>>>> It
>>>>>> doesn't expand search space, because no new nodes are created, and
>> it
>>>> seems
>>>>>> to be relatively simple to implement.
>>>>>> 
>>>>>> [1]
>>>>>> 
>>>> 
>>> 
>> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
>>>>>> [2]
>>>>>> 
>>>> 
>>> 
>> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
>>>>>> [3]
>>>>>> 
>>>> 
>>> 
>> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
>>>>>> 
>>>>>> On Thu, 31 Oct 2019 at 12:21, Danny Chan <yuzhao....@gmail.com> wrote:
>>>>>> 
>>>>>>> Thanks Vladimir for bringing up this discussion !
>>>>>>> 
>>>>>>> Did you notice that there is a JIRA issue about this problem ? [1]
>>>> Also a
>>>>>>> discussion about how to propagate the traits [2]
>>>>>>> 
>>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-2970
>>>>>>> [2]
>>>>>>> 
>>>> 
>>> 
>> https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
>>>>>>> 
>>>>>>> Best,
>>>>>>> Danny Chan
>>>>>>> On 31 Oct 2019 at 3:56 PM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
>>>>>>>> Hi colleagues,
>>>>>>>> 
>>>>>>>> I would like to discuss with the community the possibility of
>>> adding a
>>>>>>> new
>>>>>>>> public method to VolcanoPlanner which will forcefully re-trigger
>> the
>>>>>>> rules
>>>>>>>> for the specific rel. This is a follow up of a discussion started
>> in
>>>> the
>>>>>>>> other thread [1].
>>>>>>>> 
>>>>>>>> **Problem statement**
>>>>>>>> When converting between conventions during optimization
>>> VolcanoPlanner
>>>>>>>> prefers the top-bottom approach, so that the nodes are converted
>>> from
>>>> the
>>>>>>>> root. But in some cases, the intermediate node must be converted
>>> after
>>>>>>> its
>>>>>>>> children. This is especially true for distributed SQL engines,
>> which
>>>> rely
>>>>>>>> on distribution traits during the optimization process: it is not
>>>>>>> possible
>>>>>>>> to efficiently choose a proper physical implementation of a parent
>>>> node
>>>>>>>> unless the physical representation of a child node is known.
>>>>>>>> 
>>>>>>>> It seems that presently VolcanoPlanner cannot address such cases
>> by
>>>>>>>> default. Consider that we have two nodes and associated rules
>> which
>>>>>>> convert
>>>>>>>> them to physical counterparts:
>>>>>>>> [LogicalParent <- LogicalChild]
>>>>>>>> The parent should be converted after the child. When the child is
>>>>>>>> converted, the physical node is created:
>>>>>>>> [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>> In order to finish the optimization process, we need to convert
>> the
>>>>>>> parent.
>>>>>>>> But parent rules are not fired, because PhysicalChild has traits
>>>>>>>> incompatible with LogicalParent.
>>>>>>>> 
>>>>>>>> Presently the problem could be solved in two ways:
>>>>>>>> 1) Always produce conversions when going top-down. In this case, I
>>>> first
>>>>>>>> create a physical parent, then a physical child. The problem is
>> that
>>>>>>>> created parent is not optimal because it didn't know child
>>>> distribution
>>>>>>> at
>>>>>>>> the time of creation. So the full flow would be: create a bad
>>> parent,
>>>>>>>> create a good child, create a good parent.
>>>>>>>> 1.1) [LogicalParent <- LogicalChild]
>>>>>>>> 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
>>>>>>>> 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild,
>>>>>>> PhysicalChild}]
>>>>>>>> 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <-
>>>>>>>> {LogicalChild, PhysicalChild}]
>>>>>>>> What is worse, the creation of a non-optimal parent will trigger
>>>> rules on
>>>>>>>> its parent, which in turn may create a non-optimal parent-parent
>>> node,
>>>>>>> etc.
>>>>>>>> 
>>>>>>>> 2) Make sure that your convention returns true for
>>>>>>>> Convention.canConvertConvention. In this case, every time a new
>> rel
>>> is
>>>>>>>> added to a RelSet, its traitset will be compared to traitsets of
>> all
>>>>>>> other
>>>>>>>> rels in the set. For every pair of traitset we may ask the engine
>> to
>>>>>>> create
>>>>>>>> a relevant AbstractConverter. The net effect is that
>>>>>>> "physical-to-logical"
>>>>>>>> converter is created, which re-triggers the rule on the logical
>>> parent
>>>>>>>> since their conventions are compatible:
>>>>>>>> 2.1) [LogicalParent <- LogicalChild]
>>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>> 2.3) [LogicalParent <- {LogicalChild, PhysicalChild,
>>>>>>>> PhysicalToLogicalConverter}]
>>>>>>>> 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild,
>>> PhysicalChild,
>>>>>>>> PhysicalToLogicalConverter}]
>>>>>>>> 
>>>>>>>> The problem with that approach is that it is too coarse-grained
>>> since
>>>> we
>>>>>>>> operate on traitsets rather than rels. As a result, additional
>>> memory
>>>> and
>>>>>>>> CPU pressure are introduced because usually too many converters
>> are
>>>>>>>> created, which triggers the rules over and over again.
>>>>>>>> 
>>>>>>>> **Affected products**
>>>>>>>> At the moment two distributed engines are being developed for
>>>> Hazelcast
>>>>>>> and
>>>>>>>> Apache Ignite. Both require bottom-up optimization and currently
>>> rely
>>>> on
>>>>>>>> the second workaround.
>>>>>>>> Another example is Apache Drill. I do not know whether its
>> community
>>>> is
>>>>>>>> concerned with the issue, but it also uses bottom-up optimization
>>> for
>>>>>>> many
>>>>>>>> rules and employs both the aforementioned workarounds. As a
>> result,
>>> I
>>>>>>> guess
>>>>>>>> that Drill's optimizer also creates too many rels during
>>> optimization
>>>> and
>>>>>>>> suffers from a huge search space. Please correct me if I am wrong.
>>>>>>>> 
>>>>>>>> **Proposal**
>>>>>>>> The key problem is that there is no way to re-fire rules on a
>>> specific
>>>>>>>> node. The original problem could be solved if it were
>>>>>>> possible
>>>>>>>> to re-fire rules on a LogicalParent without creating any
>> additional
>>>> rels.
>>>>>>>> That would lead to a clear optimization flow:
>>>>>>>> 2.1) [LogicalParent <- LogicalChild]
>>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
>>>>>>>> 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild,
>>>> PhysicalChild}]
>>>>>>>> 
>>>>>>>> We can add the following method to VolcanoPlanner (RelOptPlanner?)
>>>>>>>> interface:
>>>>>>>> void fireRules(RelNode rel)
>>>>>>>> 
>>>>>>>> This method will fire the rules on a passed node in a deferred
>> mode
>>>> as if
>>>>>>>> it were a new node just added to the optimizer. This would
>> require
>>>>>>> slight
>>>>>>>> changes to RuleQueue.addMatch method, and possibly some other
>>> places.
>>>>>>>> 
>>>>>>>> Usage example:
>>>>>>>> class PhysicalChildRule extends RelOptRule {
>>>>>>>>   void onMatch(RelOptRuleCall call) {
>>>>>>>>     LogicalChild logicalRel = call.rel(0);
>>>>>>>>     PhysicalChild physicalRel = ...;
>>>>>>>> 
>>>>>>>>     call.transformTo(physicalRel);
>>>>>>>>     optimizer.fireRules(logicalRel);
>>>>>>>>   }
>>>>>>>> }
>>>>>>>> 
>>>>>>>> What does the community think about such a method? Does it make
>> any
>>>> sense
>>>>>>>> to you? If not, are you aware of any other ways to organize
>>>>>>> bottom-up
>>>>>>>> optimization with VolcanoPlanner without the creation of
>> additional
>>>> rels?
>>>>>>>> 
>>>>>>>> If the community is OK in general, I can try to create a PR
>>>> with a
>>>>>>>> prototype.
>>>>>>>> 
>>>>>>>> Would appreciate your feedback.
>>>>>>>> 
>>>>>>>> Regards,
>>>>>>>> Vladimir.
>>>>>>>> 
>>>>>>>> [1]
>>>>>>>> 
>>>>>>> 
>>>> 
>>> 
>> https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
>>>>>>> 
>>>> 
>>>> 
>>> 
>> 
