Thanks Vladimir for bringing up this discussion! Did you notice that there is a JIRA issue about this problem? [1] There is also a discussion about how to propagate the traits [2].
[1] https://issues.apache.org/jira/browse/CALCITE-2970
[2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E

Best,
Danny Chan

On Oct 31, 2019, 3:56 PM +0800, Vladimir Ozerov <ppoze...@gmail.com> wrote:

> Hi colleagues,
>
> I would like to discuss with the community the possibility of adding a new
> public method to VolcanoPlanner which would forcefully re-trigger the rules
> for a specific rel. This is a follow-up of a discussion started in the
> other thread [1].
>
> **Problem statement**
> When converting between conventions during optimization, VolcanoPlanner
> prefers a top-down approach, so that nodes are converted starting from the
> root. But in some cases an intermediate node must be converted after its
> children. This is especially true for distributed SQL engines, which rely
> on distribution traits during the optimization process: it is not possible
> to efficiently choose a proper physical implementation of a parent node
> unless the physical representation of a child node is known.
>
> It seems that presently VolcanoPlanner cannot address such cases by
> default. Consider two nodes and associated rules which convert them to
> physical counterparts:
> [LogicalParent <- LogicalChild]
> The parent should be converted after the child. When the child is
> converted, the physical node is created:
> [LogicalParent <- {LogicalChild, PhysicalChild}]
> In order to finish the optimization process, we need to convert the parent.
> But the parent rules are not fired, because PhysicalChild has traits
> incompatible with LogicalParent.
>
> Presently the problem can be solved in two ways:
> 1) Always produce conversions when going top-down. In this case, I first
> create a physical parent, then a physical child. The problem is that the
> created parent is not optimal, because the child's distribution was not
> known at the time of its creation. So the full flow is: create a bad
> parent, create a good child, create a good parent.
> 1.1) [LogicalParent <- LogicalChild]
> 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild]
> 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}]
> 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <-
> {LogicalChild, PhysicalChild}]
> What is worse, the creation of a suboptimal parent will trigger rules on
> its own parent, which in turn may create a suboptimal grandparent node,
> and so on.
>
> 2) Make sure that your convention returns true from
> Convention.canConvertConvention. In this case, every time a new rel is
> added to a RelSet, its traitset is compared to the traitsets of all other
> rels in the set. For every pair of traitsets the engine may be asked to
> create a relevant AbstractConverter. The net effect is that a
> "physical-to-logical" converter is created, which re-triggers the rule on
> the logical parent, since their conventions are compatible:
> 2.1) [LogicalParent <- LogicalChild]
> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> 2.3) [LogicalParent <- {LogicalChild, PhysicalChild,
> PhysicalToLogicalConverter}]
> 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild,
> PhysicalToLogicalConverter}]
>
> The problem with that approach is that it is too coarse-grained, since we
> operate on traitsets rather than rels. As a result, additional memory and
> CPU pressure is introduced, because usually far too many converters are
> created, which triggers the rules over and over again.
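A minimal sketch of what the second workaround looks like in code, assuming a custom distribution-aware convention; the names below (PhysicalConvention, PhysicalRel, "PHYSICAL") are illustrative and not from any existing engine. Convention.canConvertConvention is the hook named above; useAbstractConvertersForConversion is the companion hook the planner consults per pair of traitsets, and the sketch assumes both need to return true:

import org.apache.calcite.plan.Convention;
import org.apache.calcite.plan.RelTraitSet;
import org.apache.calcite.rel.RelNode;

/** Illustrative marker interface for the engine's physical rels. */
interface PhysicalRel extends RelNode {
}

/** Sketch of a convention that opts in to AbstractConverter creation. */
class PhysicalConvention extends Convention.Impl {
  static final PhysicalConvention INSTANCE = new PhysicalConvention();

  private PhysicalConvention() {
    super("PHYSICAL", PhysicalRel.class);
  }

  // Allow the planner to consider conversions between this convention and
  // others. When a new rel is added to a RelSet, AbstractConverters get
  // registered between the traitsets in that set, including the
  // "physical-to-logical" direction that re-fires the logical parent rules.
  @Override public boolean canConvertConvention(Convention toConvention) {
    return true;
  }

  // Companion hook consulted for each pair of concrete traitsets.
  @Override public boolean useAbstractConvertersForConversion(
      RelTraitSet fromTraits, RelTraitSet toTraits) {
    return true;
  }
}

As the mail notes, the price of this approach is the extra converter rels and the repeated rule firings they cause.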
>
> **Affected products**
> At the moment, two distributed engines are being developed for Hazelcast
> and Apache Ignite. Both require bottom-up optimization and currently rely
> on the second workaround.
> Another example is Apache Drill. I do not know whether its community is
> concerned with the issue, but it also uses bottom-up optimization for many
> rules and employs both of the aforementioned workarounds. As a result, I
> guess that Drill's optimizer also creates too many rels during optimization
> and suffers from a huge search space. Please correct me if I am wrong.
>
> **Proposal**
> The key problem is that there is no way to re-fire rules on a specific
> node. The original problem could be solved if it were possible to re-fire
> rules on the LogicalParent without creating any additional rels. That would
> lead to a clean optimization flow:
> 2.1) [LogicalParent <- LogicalChild]
> 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}]
> 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}]
>
> We can add the following method to the VolcanoPlanner (RelOptPlanner?)
> interface:
>
> void fireRules(RelNode rel)
>
> This method would fire the rules on the passed node in a deferred mode, as
> if it were a new node just added to the optimizer. This would require
> slight changes to the RuleQueue.addMatch method, and possibly some other
> places.
>
> Usage example:
>
> class PhysicalChildRule extends RelOptRule {
>   void onMatch(RelOptRuleCall call) {
>     LogicalChild logicalRel = call.rel(0);
>     PhysicalChild physicalRel = ...;
>
>     call.transformTo(physicalRel);
>     optimizer.fireRules(logicalRel);
>   }
> }
>
> What does the community think about such a method? Does it make sense to
> you? If not, are you aware of any other way to organize bottom-up
> optimization with VolcanoPlanner without creating additional rels?
>
> If the community is OK with this in general, I can try to create a PR with
> a prototype.
>
> I would appreciate your feedback.
>
> Regards,
> Vladimir.
>
> [1]
> https://ponymail-vm.apache.org/_GUI_/thread.html/13e7ab2040bfa4902db6647992ec4203ceb0262cfcb28d38ef7e9e32@%3Cdev.calcite.apache.org%3E
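A fuller sketch of the usage example from the proposal above, assuming the proposed fireRules(RelNode) method were added to RelOptPlanner (it does not exist in Calcite today). LogicalChild, PhysicalChild, and PhysicalChild.create are stand-ins for engine-specific classes; only the transformTo/fireRules pattern reflects the proposal itself:

import org.apache.calcite.plan.RelOptRule;
import org.apache.calcite.plan.RelOptRuleCall;

/** Sketch of a bottom-up conversion rule using the proposed re-fire hook. */
class PhysicalChildRule extends RelOptRule {
  PhysicalChildRule() {
    // Match every LogicalChild, regardless of its inputs.
    super(operand(LogicalChild.class, any()), "PhysicalChildRule");
  }

  @Override public void onMatch(RelOptRuleCall call) {
    LogicalChild logicalRel = call.rel(0);

    // Build the physical counterpart; at this point the child's
    // distribution is known, so a proper physical node can be chosen.
    PhysicalChild physicalRel = PhysicalChild.create(logicalRel);
    call.transformTo(physicalRel);

    // Proposed API: ask the planner to re-run rule matching on the logical
    // node as if it had just been registered, so that the parent's rules
    // now see the physical child in the same RelSet.
    call.getPlanner().fireRules(logicalRel);
  }
}

Compared with the second workaround, no converter rels are created here; the planner would simply re-queue rule matches for the node passed in.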