> @Haisheng, are you doing something like that?
Kind of, but not exactly. It is about on-demand trait propagation.

@Roman seems to be keen on space pruning for Calcite. But IMHO, for now, the 
main reason for Calcite's poor performance is not the lack of branch & bound 
space pruning, but 
- applying rules on physical nodes, 
- randomness of rule matching,
- the poor design of column pruning, 
- lack of on-demand trait propagation, 
- lack of group properties, etc.

We tried a change similar to Roman's in our product. We completely removed 
rule match importance and its comparison, and split optimization into three 
phases (exploration, implementation, enforcement) with a specific 
top-down/bottom-up order; this achieved almost a 100% speedup (the phase 
split is sketched below).
Even @vlsi's RexNode normalization can improve it to some degree.
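
For illustration, the phase split looked roughly like this (a minimal 
sketch; all names here are illustrative, not our actual code or Calcite's 
API):

    // Hypothetical sketch of a three-phase driver replacing rule match
    // importance.
    final class PhasedOptimizer {
      void optimize(Group root) {
        explore(root);    // phase 1, top-down: transformation rules on logical nodes only
        implement(root);  // phase 2, bottom-up: implementation rules produce physical nodes
        enforce(root);    // phase 3: add enforcers for missing required traits
      }
      void explore(Group g)   { /* fire logical rules on g, then recurse into its inputs */ }
      void implement(Group g) { /* recurse into inputs first, then fire physical rules on g */ }
      void enforce(Group g)   { /* compare delivered vs. required traits, insert converters */ }
      static final class Group { /* equivalence class of expressions */ }
    }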

Calcite currently generates only 1 join-order alternative for the 6-way join 
in testJoinManyWay, not even the top 10, 100, or N! ordering alternatives, 
yet it still can't finish within a reasonable amount of time when abstract 
converters are allowed. With only 1 join-order alternative, the query 
optimizer should finish optimization quickly, even for clique or chain 
queries with 20-way joins and without space pruning. But this is not the 
case for Calcite.

Simply put, space pruning is important for optimization, especially for 
join reordering, but it is not an urgent issue for Calcite.

- Haisheng

------------------------------------------------------------------
From: Roman Kondakov <kondako...@mail.ru.INVALID>
Date: 2020-01-08 02:39:19
To: <dev@calcite.apache.org>
Subject: Re: [DISCUSS] Proposal to add API to force rules matching specific rels

I forgot to mention that this approach was inspired by Stamatis's idea [1]

[1]
https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E


-- 
Kind Regards
Roman Kondakov


On 07.01.2020 21:14, Roman Kondakov wrote:
> Hello!
> 
> As Vladimir Ozerov mentioned, the general problem here is that
> VolcanoPlanner applies both logical and physical rules in a top-down
> manner. This is more like the original Volcano paper approach [1]:
> 
>> In the Volcano search strategy, a first phase applied all
>> transformation rules to create all possible logical expressions for a query 
>> and all its subtrees. The second phase,
>> which performed the actual optimization, navigated within that network of 
>> equivalence classes and expressions,
>> applied implementation rules to obtain plans, and determined the best plan.
> The Cascades paper [2] says
> 
>> In the Cascades optimizer, this separation into two phases is abolished, 
>> because it is not useful to derive all
>> logically equivalent forms of all expressions, e.g., of a predicate. A group 
>> is explored using transformation rules
>> only on demand
> 
> Unfortunately, it is not clear from this paper what "on demand"
> actually means. But a great explanation of the Cascades optimization
> process can be found in [3]:
> 
>> It begins with the input query and, ..., 
>> proceeds top-down, using recursion on the inputs
>> of the current multiexpression MExpr. However, plan
>> costing actually proceeds bottom-up, based on the order
>> of the returns from the top-down recursive calls. 
> 
> and this is what VolcanoPlanner lacks: when a logical rule is
> applied in a top-down way and a new LogicalRelNode is created, we need
> to understand whether this move improved the overall plan cost. To
> understand that, we need to know the actual cost of the new plan. As
> [3] says:
> 
>> A plan is an expression made up entirely of physical operators.
> 
> Hence, to evaluate the plan cost we need a fully implemented physical
> tree of operators for the new LogicalRelNode that was created by
> applying the logical rule above.
> 
> This process can be described with the following pseudo-code:
> 
> for (LogicalRuleMatch rule : logicalRuleMatchesSortedTopDown) {
>   LogicalNode newNode = rule.apply();
>   implementRecursivelyBottomUp(newNode);
> }
> 
> We first apply logical rules in a top-down fashion, and after each
> logical rule invocation we implement the newly created nodes with the
> physical rules recursively, in a bottom-up way, as sketched below.
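> 
> A minimal sketch of what that bottom-up implementation step could look
> like (the LogicalNode interface and applyPhysicalRules hook are
> hypothetical, not Calcite's actual API):
> 
> // Hypothetical node interface; applyPhysicalRules stands for firing
> // the implementation/converter rules registered for the node.
> interface LogicalNode { java.util.List<LogicalNode> getInputs(); }
> 
> void implementRecursivelyBottomUp(LogicalNode node) {
>   // Recurse into the inputs first, so that every child is physically
>   // implemented before its parent.
>   for (LogicalNode input : node.getInputs()) {
>     implementRecursivelyBottomUp(input);
>   }
>   // All children are implemented now: their traits can be derived and
>   // their costs are known, so the node itself can be implemented.
>   applyPhysicalRules(node);
> }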
> 
> The outcome here is that when a physical converter rule is applied to a
> RelNode, all children of this node are already implemented, so their
> traits can easily be derived and their precise cost can be calculated.
> 
> I tried to simulate this behavior with minor changes in
> RuleQueue.RuleMatchImportanceComparator (see PR [4]) and it worked well
> for me. Tests started to perform better. I counted planner ticks (see
> VolcanoPlanner#findBestExp) to evaluate the performance improvements. For
> example, the number of ticks in JdbcTest#testJoinManyWay before the
> comparator change:
> 
> ticks=11
> ticks=1041
> ticks=31362
> 
> and after:
> 
> ticks=11
> ticks=678
> ticks=26231
> 
> In some Apache Ignite test scenarios the number of ticks was reduced by
> half.
> 
> The essence of this change is a minor adjustment of
> RuleQueue.RuleMatchImportanceComparator:
> 
> - Converter rules (physical rules) always have priority over the logical
> ones (if there are physical rules in the queue, they will be applied first).
> - Physical rules are ordered in a bottom-up fashion.
> - Logical rules are ordered in a top-down fashion.
> 
> For example, if we have the initial rel:
> 
> LogicalProject
>   LogicalFilter
> 
> the rule queue for this RelNode would be ordered like this:
> 
> 1. PhysicalFilterRule <- child physical rules go first
> 2. PhysicalProjectRule <- then parent physical rules
> 3. FilterProjectTransposeRule <- logical rules go last
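> 
> In code, that ordering could be expressed by a comparator along these
> lines (a self-contained sketch with a hypothetical RuleMatch type; this
> is not the actual comparator from the PR):
> 
> import java.util.Comparator;
> 
> // physical: true for converter/implementation rule matches;
> // depth: distance of the matched node from the root of the plan.
> record RuleMatch(boolean physical, int depth) {}
> 
> Comparator<RuleMatch> importance = Comparator
>     // 1. Physical rule matches always go before logical ones.
>     .comparing((RuleMatch m) -> !m.physical())
>     // 2. Among physical matches, deeper nodes go first (bottom-up);
>     //    among logical matches, shallower nodes go first (top-down).
>     .thenComparingInt(m -> m.physical() ? -m.depth() : m.depth());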
> 
> This minor improvement might help in some cases:
> - slightly reducing the planning time;
> - simplifying bottom-up trait propagation, because when it's time to
> physically implement a parent node, all its children are already
> physically implemented.
> 
> But in general this approach is not an actual Cascades implementation,
> because it is still missing one of the most useful Cascades features:
> group pruning. Details can be found in [3]. Long story short: we can
> skip the optimization of some RelNodes if it is known that their
> children have a bigger cost than the best RelNode in the same Subset,
> as sketched below. More fundamental changes are required to fix this.
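> 
> Schematically, the pruning test could look like this (the Expr
> interface and its methods are hypothetical, not a Calcite API):
> 
> interface Expr {
>   double selfCost();               // cost of this operator alone
>   java.util.List<Expr> inputs();
>   double bestKnownCost();          // best cost found so far for this input's group
> }
> 
> // Skip implementing 'expr' when even an optimistic cost estimate
> // already exceeds the best plan found so far for its equivalence class.
> boolean canPrune(Expr expr, double bestCostInGroup) {
>   double lowerBound = expr.selfCost();
>   for (Expr input : expr.inputs()) {
>     lowerBound += input.bestKnownCost();  // optimistic bound per child
>   }
>   return lowerBound >= bestCostInGroup;
> }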
> 
> @Haisheng, are you doing something like that?
> 
> 
> [1] https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf
> [2]
> https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Cascades-graefe.pdf
> [3]
> https://15721.courses.cs.cmu.edu/spring2019/papers/22-optimizer1/shapiro-ideas2001.pdf
> [4] https://github.com/apache/calcite/pull/1732
> 
> 
