Hi Igor,

Thank you for the details. Meanwhile, I solved it by separating the conversion rules from the physical optimization rules. The first pass creates physical nodes with unknown physical properties (top-down), while subsequent processing of the leaf nodes triggers rules which convert "bad" physical nodes to "good" physical nodes with known distribution and collation.
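To make the two-pass idea concrete, here is a minimal toy sketch in Java (all names here, such as PhysicalNode and Distribution, are hypothetical illustrations, not Calcite or Hazelcast classes): pass 1 creates physical nodes with UNKNOWN traits top-down, and pass 2 rewrites a "bad" node into a "good" one once its child's traits are known.

```java
import java.util.*;

// Toy model of the two-pass approach described above (hypothetical names).
class TwoPassSketch {
    enum Distribution { UNKNOWN, REPLICATED, SINGLETON }

    static class PhysicalNode {
        final String name;
        Distribution distribution;
        PhysicalNode child;
        PhysicalNode(String name, Distribution d, PhysicalNode child) {
            this.name = name; this.distribution = d; this.child = child;
        }
    }

    // Pass 1: conversion rules produce physical nodes with UNKNOWN traits,
    // top-down, without waiting for the children to be optimized.
    static PhysicalNode convertTopDown() {
        PhysicalNode scan = new PhysicalNode("Scan", Distribution.REPLICATED, null);
        return new PhysicalNode("Project", Distribution.UNKNOWN, scan);
    }

    // Pass 2: once a child's distribution is known, a second rule rewrites
    // the "bad" parent into a "good" one inheriting the child's traits.
    static PhysicalNode refineBottomUp(PhysicalNode node) {
        if (node == null) return null;
        node.child = refineBottomUp(node.child);
        if (node.distribution == Distribution.UNKNOWN && node.child != null) {
            return new PhysicalNode(node.name, node.child.distribution, node.child);
        }
        return node;
    }
}
```

This is only a sketch of the control flow; in the real planner the second pass is driven by rule invocations on the leaf nodes rather than by explicit recursion.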
Regards,
Vladimir.

Mon, 18 Nov 2019 at 13:43, Seliverstov Igor <gvvinbl...@gmail.com>:

> Vladimir,
>
> Hope it may help you.
>
> Currently we apply the following approach (just a rough description):
>
> 1) We created an API to derive possible trait permutations on the basis
> of children traits (quite similar to the one described in the «On Demand
> trait set request» topic).
>
> 2) We added a general rule that copies Logical nodes, but requests our
> convention from their children (IGNITE convention, ANY distribution, EMPTY
> collation) and sets the importance of the old Logical nodes to zero - so we
> have a Logical parent whose input satisfies any possible distribution, and
> no rules match the previous logical node any more.
>
> 3) Physical rules create physical rel nodes only if physical traits may
> be derived (there is no «barrier», described in one of the previous
> messages). The derived traits are a collection; we don't create a physical
> rel node for each possible trait set, and we may also set zero importance
> for previously created rel nodes to decrease the search space.
>
> Now that we know the actual and required distribution, we don't need
> AbstractConverters and are able to just call the TraitDef.convert() method
> inside a rule.
> A rule is still able to produce the same output several times, but
> «memoization» inside the planner solves that for us.
>
> Preliminary tests show almost zero overhead of the approach.
>
> Regards,
> Igor
>
> > On 14 Nov 2019, at 14:49, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >
> > Hi Xing,
> >
> > Thanks for your suggestion.
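Step 1 above, deriving the parent's possible trait permutations from the traits its children can provide, can be sketched with a toy example (hypothetical code, none of these names come from the actual Ignite implementation):

```java
import java.util.*;

// Toy illustration of deriving parent trait candidates from children traits
// (hypothetical; real trait sets carry convention, distribution, collation).
class TraitDerivation {
    // For a join-like node with two children, every pair of child trait sets
    // yields one candidate parent trait set: here simply the shared trait
    // when both sides agree, or a fallback "ANY" when they do not.
    static List<String> derive(List<String> leftTraits, List<String> rightTraits) {
        List<String> result = new ArrayList<>();
        for (String l : leftTraits) {
            for (String r : rightTraits) {
                result.add(l.equals(r) ? l : "ANY");
            }
        }
        return result;
    }
}
```

The point of the sketch is only that derivation is a cross product over children traits, which is why pruning (e.g. zero importance for dominated rels) matters.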
> > Yes, this may help, and if I get your idea right, I already had it in my
> > reproducer:
> > 1) Create the converted physical input:
> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49
> > 2) Use it in case no physical children were found:
> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79
> >
> > This idea is borrowed from Apache Drill physical rules. But the problem is
> > that this approach leads to a suboptimal plan - the parent node doesn't know
> > the future distribution of a child node, and as a result, it doesn't know
> > its own distribution. So the final plan is constructed in this way:
> > 1.1) Root enforces "SINGLETON" on its child:
> > -> PhysicalRoot[SINGLETON]
> >   -> Converter[SINGLETON]
> >     -> PhysicalProject[*ANY*]
> >       -> PhysicalScan[REPLICATED]
> >
> > 1.2) But since the child (PhysicalProject) failed to infer its distribution
> > during the rule call, it falls back to the ANY distribution. In order to
> > satisfy the SINGLETON distribution of the parent, we inject an exchange into
> > the final plan:
> > -> PhysicalRoot[SINGLETON]
> > *  -> Exchange[SINGLETON]*
> >     -> PhysicalProject[*ANY*]
> >       -> PhysicalScan[REPLICATED]
> >
> > 2) But this is a suboptimal plan. The optimal plan is:
> > -> PhysicalRoot[SINGLETON]
> >   -> PhysicalProject[REPLICATED]
> >     -> PhysicalScan[REPLICATED]
> >
> > You may observe it in my tests:
> > 1) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 -
> > works as you described and produces a non-optimal plan with an exchange
> > 2) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 -
> > relies on AbstractConverter-s and produces an optimal plan with bottom-up
> > trait propagation, at the cost of significantly increased planning time
> >
> > Regards,
> > Vladimir.
> >
> > Fri, 8 Nov 2019 at 16:15, XING JIN <jinxing.co...@gmail.com>:
> >
> >> Hi Vladimir,
> >>
> >> I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may
> >> help you ~
> >> They work in a top-down fashion, but when matching a parent, they convert the
> >> children explicitly.
> >> You may try the steps below:
> >> 1. Construct a rule LogicalParentRule to match the LogicalParent without a
> >> distribution/physical requirement for the LogicalChild;
> >> 2. In this rule, call 'planner.changeTraits' on the LogicalChild to
> >> build a new child with the physical convention. Note that at this moment
> >> only an empty RelSubset is created and no PhysicalChild exists yet.
> >> 3. Then set the RelNode to be the new input of the LogicalParent;
> >>
> >> By the above steps, you can build a parent-child relationship between the
> >> LogicalParent and the PhysicalChild, and at last the PhysicalParentRule will
> >> be fired based on this relationship.
> >>
> >> I have a commit to illustrate my idea; check VolcanoPlannerTest#testDEV in
> >> the link below, hope it may help you ~
> >> https://github.com/jinxing64/calcite/tree/demo
> >>
> >> Also, I'm +1 with Seliverstov on getting all parents of a set, which goes
> >> against the current check in RelSubset#getParentRels
> >>
> >> Best,
> >> Jin
> >>
> >> Vladimir Ozerov <ppoze...@gmail.com> wrote on Tue, 5 Nov 2019 at 18:41:
> >>
> >>> Hi Xiening,
> >>>
> >>> I read the thread about on-demand trait requests. It seems pretty similar
> >>> to what I am trying to achieve, as it facilitates the bottom-up propagation
> >>> of physical traits. In fact, both your and my strategies propagate traits
> >>> bottom-up, but I do this through rules, which also fire bottom-up, while in
> >>> your case only the traits are propagated bottom-up, while rules continue
> >>> working in a top-down fashion.
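Xing's steps above - create an empty RelSubset via changeTraits first, then fire the parent rule once a physical rel lands in it - can be modeled with a small toy sketch (hypothetical classes and strings; this is not Calcite's real RelSubset/VolcanoPlanner API):

```java
import java.util.*;

// Toy model of "empty subset first, parent fires later" (hypothetical names).
class SubsetSketch {
    final Map<String, List<String>> subsets = new HashMap<>();   // subset key -> physical rels
    final Map<String, List<Runnable>> waiters = new HashMap<>(); // subset key -> parent rules to fire

    // Analogous to planner.changeTraits: ensure the subset exists, possibly empty.
    String changeTraits(String node, String convention) {
        String key = node + ":" + convention;
        subsets.computeIfAbsent(key, k -> new ArrayList<>());
        return key;
    }

    // The parent hooks itself onto the (possibly still empty) child subset.
    void onRelAdded(String subsetKey, Runnable parentRule) {
        waiters.computeIfAbsent(subsetKey, k -> new ArrayList<>()).add(parentRule);
    }

    // When a physical child is finally created, the waiting parent rules fire.
    void register(String subsetKey, String physicalRel) {
        subsets.get(subsetKey).add(physicalRel);
        waiters.getOrDefault(subsetKey, List.of()).forEach(Runnable::run);
    }
}
```

The design point the sketch captures is that the parent-child relationship exists before any physical child does, so no converter rels are needed to re-trigger the parent rule.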
> >>>
> >>> However, I am thinking of how I would potentially implement my optimizer
> >>> with your approach, and it feels like with on-demand traits the resulting
> >>> implementation of metadata queries may become so complex that it will look
> >>> like another set of rules, parallel to the already existing ruleset. For
> >>> example, consider that I have a couple of distributed tables in an OLTP
> >>> application. These tables have a number of indexes, and I would like to
> >>> join them. First, I have a number of choices on how to join the tables with
> >>> respect to distribution. Then, I have a number of choices on which access
> >>> method to use, because sometimes it is beneficial to pick index scans
> >>> instead of table scans even without index conditions, for example, to
> >>> preserve a convenient collation. So when my logical scan receives such a
> >>> metadata request, it typically cannot return all possible combinations,
> >>> because there are too many of them. Instead, some heuristic or cost-based
> >>> logic will be used to calculate a couple of the most promising ones. But it
> >>> seems that we will have to duplicate the same logic in the corresponding
> >>> rule, won't we?
> >>>
> >>> I would love to read your design because this is a really interesting
> >>> topic, and it is of great importance for the distributed engines developed
> >>> on top of Calcite, since proper use of distribution and collation is the
> >>> key success factor for efficient query optimization.
> >>>
> >>> Regards,
> >>> Vladimir.
> >>>
> >>> Fri, 1 Nov 2019 at 00:40, Xiening Dai <xndai....@gmail.com>:
> >>>
> >>>> Actually we solved this problem in our setup using a mechanism called
> >>>> “Pull-Up Traits”, which explores the possible trait sets of children's
> >>>> input to decide the parent's physical properties.
> >>>> In order to determine the child input trait, you would have to look at
> >>>> the child's children, and all the way down to the leaf nodes or a
> >>>> barrier. A barrier is a rel node which cannot derive any traits
> >>>> regardless of the input. A good example would be a user-defined function
> >>>> which would throw off any distribution or collation. Then we realized
> >>>> that just pulling up is not enough; sometimes we would need to look at
> >>>> the parent's requirements as well. So we try to solve this in a unified
> >>>> framework, which we call “On Demand Trait”, and implement it as part of
> >>>> the framework so anyone can benefit. I hope Haisheng can share a design
> >>>> doc once we have more concrete ideas.
> >>>>
> >>>>
> >>>>> On Oct 31, 2019, at 11:37 AM, Jinfeng Ni <j...@apache.org> wrote:
> >>>>>
> >>>>> Hi Vladimir,
> >>>>>
> >>>>> The SubsetTransformer interface and the iterating over the RelNodes
> >>>>> within a RelSubset in Drill is exactly implemented to do the trait
> >>>>> propagation. We also had to rely on AbstractConverter to fire the
> >>>>> necessary rules to avoid the CanNotPlan issue. At some point, the
> >>>>> Calcite community chose to remove AbstractConverter and Drill had to
> >>>>> add it back, which is probably one of the main reasons for us to
> >>>>> continue using a Calcite fork. I still remember we constantly had to
> >>>>> deal with the dilemma between "CanNotPlan" and long planning time due
> >>>>> to the explored search space.
> >>>>>
> >>>>> Glad to see more people joining the effort to solve this long overdue
> >>>>> issue, something missing in Calcite's core optimizer framework "since
> >>>>> before Calcite was Calcite" (Jacques's words).
> >>>>>
> >>>>> Jinfeng
> >>>>>
> >>>>>
> >>>>> On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >>>>>>
> >>>>>> Hi Danny,
> >>>>>>
> >>>>>> Thank you very much for the links.
> >>>>>> What is described here is pretty much similar to the problem I
> >>>>>> describe, especially the discussion about trait propagation, as this
> >>>>>> is basically what I need - to explore the potential traits of children
> >>>>>> before optimizing their parents. And this is basically what Drill
> >>>>>> already does with its SubsetTransformer:
> >>>>>> 1) There is a SubsetTransformer interface, which iterates over the
> >>>>>> physical relations of the given subset [1]
> >>>>>> 2) If you want to make a physical project, you iterate over the
> >>>>>> physical relations of the input subset and create possible physical
> >>>>>> projects [2]
> >>>>>> 3) But if you cannot find any physical input, then you trigger the
> >>>>>> creation of a "bad" physical project, which is very likely to have a
> >>>>>> poor cost because it cannot take advantage of the input's distribution
> >>>>>> information [3]
> >>>>>> So, step 2 is the trait set propagation which is needed by many
> >>>>>> distributed engines. Step 3 is an attempt to work around the current
> >>>>>> VolcanoPlanner behavior, where a parent rule is fired only if the
> >>>>>> produced child node has a compatible trait set.
> >>>>>>
> >>>>>> I do not know Calcite's architecture that well, but at first glance
> >>>>>> the proposed ability to re-fire the rules of a specific rel seems good
> >>>>>> enough. It doesn't expand the search space, because no new nodes are
> >>>>>> created, and it seems to be relatively simple to implement.
> >>>>>>
> >>>>>> [1] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java
> >>>>>> [2] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66
> >>>>>> [3] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69
> >>>>>>
> >>>>>> Thu, 31 Oct 2019 at 12:21, Danny Chan <yuzhao....@gmail.com>:
> >>>>>>
> >>>>>>> Thanks Vladimir for bringing up this discussion!
> >>>>>>>
> >>>>>>> Did you notice that there is a JIRA issue about this problem? [1]
> >>>>>>> Also a discussion about how to propagate the traits [2]
> >>>>>>>
> >>>>>>> [1] https://issues.apache.org/jira/browse/CALCITE-2970
> >>>>>>> [2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Danny Chan
> >>>>>>> On 31 Oct 2019 at 15:56 +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:
> >>>>>>>> Hi colleagues,
> >>>>>>>>
> >>>>>>>> I would like to discuss with the community the possibility of adding
> >>>>>>>> a new public method to VolcanoPlanner which will forcefully
> >>>>>>>> re-trigger the rules for a specific rel. This is a follow-up of a
> >>>>>>>> discussion started in the other thread [1].
> >>>>>>>>
> >>>>>>>> **Problem statement**
> >>>>>>>> When converting between conventions during optimization,
> >>>>>>>> VolcanoPlanner prefers the top-down approach, so that the nodes are
> >>>>>>>> converted from the root. But in some cases, an intermediate node
> >>>>>>>> must be converted after its children.
> >>>>>>>> This is especially true for distributed SQL engines, which rely on
> >>>>>>>> distribution traits during the optimization process: it is not
> >>>>>>>> possible to efficiently choose a proper physical implementation of a
> >>>>>>>> parent node unless the physical representation of a child node is
> >>>>>>>> known.
> >>>>>>>>
> >>>>>>>> It seems that presently VolcanoPlanner cannot address such cases by
> >>>>>>>> default. Consider that we have two nodes and associated rules which
> >>>>>>>> convert them to physical counterparts:
> >>>>>>>> [LogicalParent <- LogicalChild]
> >>>>>>>> The parent should be converted after the child. When the child is
> >>>>>>>> converted, the physical node is created:
> >>>>>>>> [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> In order to finish the optimization process, we need to convert the
> >>>>>>>> parent. But the parent rules are not fired, because PhysicalChild
> >>>>>>>> has traits incompatible with LogicalParent.
> >>>>>>>>
> >>>>>>>> Presently the problem can be solved in two ways:
> >>>>>>>> 1) Always produce conversions when going top-down. In this case, I
> >>>>>>>> first create a physical parent, then a physical child. The problem
> >>>>>>>> is that the created parent is not optimal because it didn't know the
> >>>>>>>> child's distribution at the time of creation. So the full flow would
> >>>>>>>> be: create a bad parent, create a good child, create a good parent.
> >>>>>>>> 1.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
> >>>>>>>> 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- {LogicalChild, PhysicalChild}]
> >>>>>>>> What is worse, the creation of a non-optimal parent will trigger
> >>>>>>>> rules on its parent, which in turn may create a non-optimal
> >>>>>>>> parent-parent node, etc.
> >>>>>>>>
> >>>>>>>> 2) Make sure that your convention returns true for
> >>>>>>>> Convention.canConvertConvention. In this case, every time a new rel
> >>>>>>>> is added to a RelSet, its traitset will be compared to the traitsets
> >>>>>>>> of all other rels in the set. For every pair of traitsets we may ask
> >>>>>>>> the engine to create a relevant AbstractConverter. The net effect is
> >>>>>>>> that a "physical-to-logical" converter is created, which re-triggers
> >>>>>>>> the rule on the logical parent since their conventions are
> >>>>>>>> compatible:
> >>>>>>>> 2.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> >>>>>>>> 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}]
> >>>>>>>>
> >>>>>>>> The problem with that approach is that it is too coarse-grained,
> >>>>>>>> since we operate on traitsets rather than on rels. As a result,
> >>>>>>>> additional memory and CPU pressure is introduced because usually too
> >>>>>>>> many converters are created, which triggers the rules over and over
> >>>>>>>> again.
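The coarse-grained nature of workaround 2 can be illustrated with a toy count of the converter candidates produced per pair of distinct traitsets in a set (a simplified model; this is not Calcite's actual AbstractConverter logic):

```java
import java.util.*;

// Toy illustration: with canConvertConvention enabled, each ordered pair of
// distinct traitsets in a RelSet may spawn one converter candidate, so the
// number of converters (and re-fired rules) grows quadratically.
class ConverterCount {
    static List<String> convertersFor(List<String> traitSets) {
        List<String> converters = new ArrayList<>();
        for (String from : traitSets)
            for (String to : traitSets)
                if (!from.equals(to))
                    converters.add(from + "->" + to);
        return converters;
    }
}
```

Three traitsets already yield six converter candidates, and every converter added to the set can re-trigger rule matching, which is the memory/CPU pressure described above.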
> >>>>>>>>
> >>>>>>>> **Affected products**
> >>>>>>>> At the moment two distributed engines are being developed for
> >>>>>>>> Hazelcast and Apache Ignite. Both require bottom-up optimization and
> >>>>>>>> currently rely on the second workaround.
> >>>>>>>> Another example is Apache Drill. I do not know whether its community
> >>>>>>>> is concerned with the issue, but it also uses bottom-up optimization
> >>>>>>>> for many rules and employs both of the aforementioned workarounds.
> >>>>>>>> As a result, I guess that Drill's optimizer also creates too many
> >>>>>>>> rels during optimization and suffers from a huge search space.
> >>>>>>>> Please correct me if I am wrong.
> >>>>>>>>
> >>>>>>>> **Proposal**
> >>>>>>>> The key problem is that there is no way to re-fire rules on a
> >>>>>>>> specific node. The original problem could have been solved if it
> >>>>>>>> were possible to re-fire rules on a LogicalParent without creating
> >>>>>>>> any additional rels. That would lead to a clear optimization flow:
> >>>>>>>> 2.1) [LogicalParent <- LogicalChild]
> >>>>>>>> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> >>>>>>>> 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}]
> >>>>>>>>
> >>>>>>>> We could add the following method to the VolcanoPlanner
> >>>>>>>> (RelOptPlanner?) interface:
> >>>>>>>> void fireRules(RelNode rel)
> >>>>>>>>
> >>>>>>>> This method will fire the rules on the passed node in a deferred
> >>>>>>>> mode, as if it were a new node just added to the optimizer. This
> >>>>>>>> would require slight changes to the RuleQueue.addMatch method, and
> >>>>>>>> possibly some other places.
> >>>>>>>>
> >>>>>>>> Usage example:
> >>>>>>>> class PhysicalChildRule extends RelOptRule {
> >>>>>>>>   void onMatch(RelOptRuleCall call) {
> >>>>>>>>     LogicalChild logicalRel = call.get(0);
> >>>>>>>>     PhysicalChild physicalRel = ...;
> >>>>>>>>
> >>>>>>>>     call.transformTo(physicalRel);
> >>>>>>>>     optimizer.fireRules(logicalRel);
> >>>>>>>>   }
> >>>>>>>> }
> >>>>>>>>
> >>>>>>>> What does the community think about such a method? Does it make any
> >>>>>>>> sense to you? If not, are you aware of any other ways to organize
> >>>>>>>> bottom-up optimization with VolcanoPlanner without the creation of
> >>>>>>>> additional rels?
> >>>>>>>>
> >>>>>>>> If the community is OK in general, I can try to create a PR with a
> >>>>>>>> prototype.
> >>>>>>>>
> >>>>>>>> Would appreciate your feedback.
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Vladimir.
> >>>>>>>>
> >>>>>>>> [1] https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
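The proposed fireRules(rel) behavior, re-enqueueing the node's rule matches in a deferred queue without creating any rels, can be sketched with a toy model (hypothetical classes; this is not an actual VolcanoPlanner or RuleQueue patch):

```java
import java.util.*;
import java.util.function.Consumer;
import java.util.function.Predicate;

// Toy model of the proposed deferred re-firing (hypothetical names).
class RuleQueueSketch {
    static class Rule {
        final String name;
        final Predicate<String> matches;
        final Consumer<String> onMatch;
        Rule(String name, Predicate<String> matches, Consumer<String> onMatch) {
            this.name = name; this.matches = matches; this.onMatch = onMatch;
        }
    }

    final List<Rule> rules = new ArrayList<>();
    final Deque<Runnable> queue = new ArrayDeque<>();

    // The proposed method: enqueue matches for `rel` as if it were a node
    // just added to the optimizer; no converter rels are created.
    void fireRules(String rel) {
        for (Rule r : rules)
            if (r.matches.test(rel))
                queue.add(() -> r.onMatch.accept(rel));
    }

    // The planner's main loop later drains the deferred matches.
    void drain() {
        while (!queue.isEmpty()) queue.poll().run();
    }
}
```

The key property the sketch shows is that the call is deferred: the match is only queued at fireRules time and executes later in the planner's normal rule-draining loop, which is why the change would touch RuleQueue.addMatch rather than invoke rules synchronously.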