I forgot to mention that this approach was inspired by Stamatis's idea [1].

[1] https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E
--
Kind Regards
Roman Kondakov

On 07.01.2020 21:14, Roman Kondakov wrote:
> Hello!
>
> As Vladimir Ozerov mentioned, the general problem here is that
> VolcanoPlanner applies both logical and physical rules in a top-down
> manner. This is closer to the original Volcano paper's approach [1]:
>
>> In the Volcano search strategy, a first phase applied all
>> transformation rules to create all possible logical expressions for a
>> query and all its subtrees. The second phase, which performed the
>> actual optimization, navigated within that network of equivalence
>> classes and expressions, applied implementation rules to obtain
>> plans, and determined the best plan.
>
> The Cascades paper [2] says:
>
>> In the Cascades optimizer, this separation into two phases is
>> abolished, because it is not useful to derive all logically
>> equivalent forms of all expressions, e.g., of a predicate. A group is
>> explored using transformation rules only on demand.
>
> Unfortunately, the paper does not make clear what "on demand" actually
> means. But a great explanation of the Cascades optimization process
> can be found in [3]:
>
>> It begins with the input query and, ..., proceeds top-down, using
>> recursion on the inputs of the current multiexpression MExpr.
>> However, plan costing actually proceeds bottom-up, based on the order
>> of the returns from the top-down recursive calls.
>
> And this is what VolcanoPlanner lacks: when a logical rule is applied
> in a top-down way and a new LogicalRelNode is created, we need to
> understand whether this move improved the overall plan cost or not. To
> understand that, we need to know the actual cost of the new plan. As
> [3] says:
>
>> A plan is an expression made up entirely of physical operators.
>
> Hence, to evaluate the plan cost we need a fully implemented physical
> tree of operators for the new LogicalRelNode that was created by
> applying the logical rule above.
>
> This process can be described with the pseudo-code:
>
>   for (LogicalRuleMatch rule : LogicalRuleMatchesSortedTopDown) {
>     LogicalNode newNode = rule.apply();
>     implementRecursivelyBottomUp(newNode);
>   }
>
> We first apply logical rules in a top-down fashion, and after each
> logical rule invocation we implement the newly created nodes with the
> physical rules recursively in a bottom-up way.
>
> The outcome is that when a physical converter rule is applied to a
> RelNode, all children of this node are already implemented, so their
> traits can easily be derived and their precise cost can be calculated.
>
> I tried to simulate this behavior with minor changes in
> RuleQueue.RuleMatchImportanceComparator (see PR [4]) and it worked
> well for me. Tests started to perform better. I counted planner ticks
> (see VolcanoPlanner#findBestExp) to evaluate the performance
> improvement. For example, the number of ticks in
> JdbcTest#testJoinManyWay before the comparator change:
>
>   ticks=11
>   ticks=1041
>   ticks=31362
>
> and after:
>
>   ticks=11
>   ticks=678
>   ticks=26231
>
> In some Apache Ignite test scenarios the number of ticks was reduced
> by half.
>
> The essence of this change is a minor adjustment of
> RuleQueue.RuleMatchImportanceComparator:
>
> - Converter rules (physical rules) always have priority over the
>   logical ones: if there are physical rule matches in the queue, they
>   are applied first.
> - Physical rules are ordered in a bottom-up fashion.
> - Logical rules are ordered in a top-down fashion.
>
> For example, if we have the initial rel:
>
>   LogicalProject
>     LogicalFilter
>
> the rule queue for this RelNode would be ordered like this:
>
>   1. PhysicalFilterRule         <- children's physical rules go first
>   2. PhysicalProjectRule        <- then parent physical rules
>   3. FilterProjectTransposeRule <- logical rules go last
>
> This minor improvement might help in some cases:
>
> - slightly reduced planning time;
> - simplification of the bottom-up trait propagation, because by the
>   time a parent node is physically implemented, all its children are
>   already physically implemented.
>
> But in general this approach is not an actual Cascades implementation,
> because it is still missing one of the most useful Cascades features:
> group pruning. Details can be found in [3]. Long story short: we can
> skip optimization of some RelNodes if it is known that their children
> have a higher cost than the best RelNode in the same Subset. More
> fundamental changes are required to support that.
>
> @Haisheng, are you doing something like that?
>
> [1] https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf
> [2] https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Cascades-graefe.pdf
> [3] https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/shapiro-ideas2001.pdf
> [4] https://github.com/apache/calcite/pull/1732
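To make the ordering described above concrete, here is a minimal, self-contained sketch of such an importance comparator. Note that `RuleMatch`, its `physical` flag, and the `depth` field are hypothetical simplifications for illustration, not the actual internals of Calcite's `RuleQueue.RuleMatchImportanceComparator`:

```java
import java.util.Comparator;

// Hypothetical stand-in for a planner rule match: `depth` is the depth of
// the matched node in the plan tree (root = 0), and `physical` marks
// converter/implementation rule matches.
final class RuleMatch {
    final String rule;
    final boolean physical;
    final int depth;

    RuleMatch(String rule, boolean physical, int depth) {
        this.rule = rule;
        this.physical = physical;
        this.depth = depth;
    }
}

final class ImportanceComparator implements Comparator<RuleMatch> {
    @Override public int compare(RuleMatch a, RuleMatch b) {
        // 1. Physical (converter) rules always beat logical ones.
        if (a.physical != b.physical) {
            return a.physical ? -1 : 1;
        }
        if (a.physical) {
            // 2. Physical rules fire bottom-up: deeper nodes first.
            return Integer.compare(b.depth, a.depth);
        }
        // 3. Logical rules fire top-down: shallower nodes first.
        return Integer.compare(a.depth, b.depth);
    }
}
```

Sorting the LogicalProject/LogicalFilter example with this comparator reproduces the queue order from the email: PhysicalFilterRule, then PhysicalProjectRule, then FilterProjectTransposeRule.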
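For reference, the group-pruning idea from [3] boils down to a branch-and-bound check: once a lower bound on a candidate expression's cost already exceeds the best complete plan known for its group, the candidate can be skipped without further optimization. The following is only a sketch of that check; `Group`, `canPrune`, and `register` are hypothetical names, not part of any Calcite API:

```java
// Hypothetical sketch of group pruning (branch-and-bound): a candidate
// expression is skipped once the lower bound of its total cost already
// meets or exceeds the cost of the best complete plan found so far for
// the same group/subset.
final class Group {
    double bestCost = Double.POSITIVE_INFINITY; // best complete plan so far

    // True if the candidate can be pruned without being fully optimized.
    boolean canPrune(double operatorCost, double[] inputLowerBounds) {
        double lowerBound = operatorCost;
        for (double b : inputLowerBounds) {
            lowerBound += b; // inputs can never cost less than their bounds
        }
        return lowerBound >= bestCost;
    }

    // Record the cost of a newly found complete plan for this group.
    void register(double completePlanCost) {
        if (completePlanCost < bestCost) {
            bestCost = completePlanCost;
        }
    }
}
```

For example, with a best known plan of cost 100, a candidate whose operator costs 50 and whose single input has a lower bound of 60 is pruned (110 >= 100), while one with operator cost 10 and the same input bound is still explored (70 < 100).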