Re: Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Haisheng, Thank you for pointing to 1.23.0 changes, I'll try to use this version. Regards, Vladimir. вс, 15 мар. 2020 г. в 08:05, Haisheng Yuan : > Thanks for your detailed explanation, Vladimir. > > Indeed, the only way to propagate traits in Calcite currently is using > rules, which is a big pain. I can feel your pain. I tried to come up ways > to implement the trait derivation and requirement in the framwork without > breaking current usages, only turns out it is almost impossible. It has too > many stakeholders, even a small change may incur opposition. > > But before we get the real top-down cascades framework, there are still a > lot you can do to improve your planner's performance. > > Since Calcite 1.22.0, I committed a change that enabes RelSubset to be > used to trigger a rule, which can greatly reduce the number of rule calls > for trait propagation. With your example, you need 2 rules: > 1. Physical implementation rule > match pattern: operand(LogicalProject.class) > Produce PhysicalProject without trait > 2. Project trait propagtion rule > match pattern: operand(PhysicalProject.class, operand(RelSubset.class)) > Produce PhysicalProject with derived trait. > > Since 1.23.0, we removed the rule match importances and ordering, I guess > the can reduce the optimizatino time around 10~20% for some complex queries > with many rule calls. > > - Haisheng > > -- > 发件人:Vladimir Ozerov > 日 期:2020年03月15日 04:18:53 > 收件人:Haisheng Yuan > 抄 送:dev@calcite.apache.org (dev@calcite.apache.org) > > 主 题:Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching > specific rels > > Hi Haisheng, > > You are right, the behavior I showed is not the default one, I should > provide more context here. This is how we (Hazelcast) and at least Drill > use the engine to ensure that the produced plan is optimal, I gave an > example in [1]. > In real distributed engines, we rely on physical properties heavily. Most > notably, distribution and collation. And the fundamental problem with the > VolcanoOptimizer, is that it cannot propagate traits in a controlled > manner. This, in turn, forces us to use AbstractConverters and implement > rules in ways, which appear strange to Calcite community :-). And this, in > turn, leads to excessive rule calls and failure to plan complex queries. > > Let's consider the same tree again, but now assuming that this is not the > complete tree, but a subtree, and there are some parent operators. Let's > also assume that the ScanRule may produce two equivalent operators with > different physical properties: PhysicalScan and PhysicalIndexScan[a ASC]. > It is important to consider both alternatives in parent operators. Now > let's consider two different ways to optimize that subtree. > > 1. Canonical Calcite way (default) > 1.1 Perform initial rules ordering, parent rules fire first: [ProjectRule, > ScanRule] > 1.2 Invoke ProjectRule, which produces physical project without any > physical traits > 1.3 Invoke ScanRule, which produces, PhysicalScan and PhysicalIndexScan[a > ASC] > Since ProjectRule was not aware of different scan alternatives, it missed > collation, and resulting hypergraph looks like this: > > -- PhysicalProject > [PhysicalScan, PhysicalIndexScan[a ASC]] > > This plan is suboptimal, because of parent operators cannot take advantage > of collation. > > 2. Hazelast/Drill way: > 2.1 Enable abstract converters > 2.2 Rules are ordered in the same way as in example 1: [ProjectRule, > ScanRule] > 2.3 ProjectRule fires, enumerates physical implementations of the input. > Since there are no physical inputs yet, it exits without any > transformations > 2.4 ScanRule fires and produces two physical scans > 2.5 Abstract converters ensure that the ProjectRule is scheduled for > execution again because it's input RelSet has new nodes > 2.6 ProjectRule fires again, now having physical inputs, and generates > one implementation per unique combination of physical input traits. > > As a result, we have two graphs now: > > Graph 1: > -- PhysicalProject > PhysicalScan > > Graph 2: > -- PhysicalProject[a ASC] > PhysicalIndexScan[a ASC] > > Notice how we propagated physical collation. Now parent operators may take > advantage of it, e.g. eliminate sorting, or perform streaming aggregation > instead of blocking hash aggregation. > > This is the fundamental problem we have in Hazelcast: how to ensure the > complete exploration of the search space without excessive rule calls. > > Very short summary: > 1. The default behavior of VolcanoOptimizer cannot explore the physical > search space, so
Re: Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Thanks for your detailed explanation, Vladimir. Indeed, the only way to propagate traits in Calcite currently is using rules, which is a big pain. I can feel your pain. I tried to come up ways to implement the trait derivation and requirement in the framwork without breaking current usages, only turns out it is almost impossible. It has too many stakeholders, even a small change may incur opposition. But before we get the real top-down cascades framework, there are still a lot you can do to improve your planner's performance. Since Calcite 1.22.0, I committed a change that enabes RelSubset to be used to trigger a rule, which can greatly reduce the number of rule calls for trait propagation. With your example, you need 2 rules: 1. Physical implementation rule match pattern: operand(LogicalProject.class) Produce PhysicalProject without trait 2. Project trait propagtion rule match pattern: operand(PhysicalProject.class, operand(RelSubset.class)) Produce PhysicalProject with derived trait. Since 1.23.0, we removed the rule match importances and ordering, I guess the can reduce the optimizatino time around 10~20% for some complex queries with many rule calls. - Haisheng -- 发件人:Vladimir Ozerov 日 期:2020年03月15日 04:18:53 收件人:Haisheng Yuan 抄 送:dev@calcite.apache.org (dev@calcite.apache.org) 主 题:Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels Hi Haisheng, You are right, the behavior I showed is not the default one, I should provide more context here. This is how we (Hazelcast) and at least Drill use the engine to ensure that the produced plan is optimal, I gave an example in [1]. In real distributed engines, we rely on physical properties heavily. Most notably, distribution and collation. And the fundamental problem with the VolcanoOptimizer, is that it cannot propagate traits in a controlled manner. This, in turn, forces us to use AbstractConverters and implement rules in ways, which appear strange to Calcite community :-). And this, in turn, leads to excessive rule calls and failure to plan complex queries. Let's consider the same tree again, but now assuming that this is not the complete tree, but a subtree, and there are some parent operators. Let's also assume that the ScanRule may produce two equivalent operators with different physical properties: PhysicalScan and PhysicalIndexScan[a ASC]. It is important to consider both alternatives in parent operators. Now let's consider two different ways to optimize that subtree. 1. Canonical Calcite way (default) 1.1 Perform initial rules ordering, parent rules fire first: [ProjectRule, ScanRule] 1.2 Invoke ProjectRule, which produces physical project without any physical traits 1.3 Invoke ScanRule, which produces, PhysicalScan and PhysicalIndexScan[a ASC] Since ProjectRule was not aware of different scan alternatives, it missed collation, and resulting hypergraph looks like this: -- PhysicalProject [PhysicalScan, PhysicalIndexScan[a ASC]] This plan is suboptimal, because of parent operators cannot take advantage of collation. 2. Hazelast/Drill way: 2.1 Enable abstract converters 2.2 Rules are ordered in the same way as in example 1: [ProjectRule, ScanRule] 2.3 ProjectRule fires, enumerates physical implementations of the input. Since there are no physical inputs yet, it exits without any transformations 2.4 ScanRule fires and produces two physical scans 2.5 Abstract converters ensure that the ProjectRule is scheduled for execution again because it's input RelSet has new nodes 2.6 ProjectRule fires again, now having physical inputs, and generates one implementation per unique combination of physical input traits. As a result, we have two graphs now: Graph 1: -- PhysicalProject PhysicalScan Graph 2: -- PhysicalProject[a ASC] PhysicalIndexScan[a ASC] Notice how we propagated physical collation. Now parent operators may take advantage of it, e.g. eliminate sorting, or perform streaming aggregation instead of blocking hash aggregation. This is the fundamental problem we have in Hazelcast: how to ensure the complete exploration of the search space without excessive rule calls. Very short summary: 1. The default behavior of VolcanoOptimizer cannot explore the physical search space, so plans are not optimal 2. Abstract converters fix this if you follow a certain pattern in rule implementations (see 2.3), but generate too many rule calls, so join planning rules cannot be called together with other rules, which again lead to not optimal plans (yet, better than with p.1) 3. "Trait pull-up" proposal may fix it. But I have a feeling that pulling up possible trait combinations from a child node is indistinguishable from child node exploration, so it may be not very efficient again 4. A brand new optimizer implementation with recursive top-down approach may address all the concerns fr
Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Haisheng, You are right, the behavior I showed is not the default one, I should provide more context here. This is how we (Hazelcast) and at least Drill use the engine to ensure that the produced plan is optimal, I gave an example in [1]. In real distributed engines, we rely on physical properties heavily. Most notably, distribution and collation. And the fundamental problem with the VolcanoOptimizer, is that it cannot propagate traits in a controlled manner. This, in turn, forces us to use AbstractConverters and implement rules in ways, which appear strange to Calcite community :-). And this, in turn, leads to excessive rule calls and failure to plan complex queries. Let's consider the same tree again, but now assuming that this is not the complete tree, but a subtree, and there are some parent operators. Let's also assume that the ScanRule may produce two equivalent operators with different physical properties: PhysicalScan and PhysicalIndexScan[a ASC]. It is important to consider both alternatives in parent operators. Now let's consider two different ways to optimize that subtree. 1. Canonical Calcite way (default) 1.1 Perform initial rules ordering, parent rules fire first: [ProjectRule, ScanRule] 1.2 Invoke ProjectRule, which produces physical project without any physical traits 1.3 Invoke ScanRule, which produces, PhysicalScan and PhysicalIndexScan[a ASC] Since ProjectRule was not aware of different scan alternatives, it missed collation, and resulting hypergraph looks like this: -- PhysicalProject [PhysicalScan, PhysicalIndexScan[a ASC]] This plan is suboptimal, because of parent operators cannot take advantage of collation. 2. Hazelast/Drill way: 2.1 Enable abstract converters 2.2 Rules are ordered in the same way as in example 1: [ProjectRule, ScanRule] 2.3 ProjectRule fires, enumerates physical implementations of the input. Since there are no physical inputs yet, it exits without any transformations 2.4 ScanRule fires and produces two physical scans 2.5 Abstract converters ensure that the ProjectRule is scheduled for execution again because it's input RelSet has new nodes 2.6 ProjectRule fires again, now having physical inputs, and generates one implementation per unique combination of physical input traits. As a result, we have two graphs now: Graph 1: -- PhysicalProject PhysicalScan Graph 2: -- PhysicalProject[a ASC] PhysicalIndexScan[a ASC] Notice how we propagated physical collation. Now parent operators may take advantage of it, e.g. eliminate sorting, or perform streaming aggregation instead of blocking hash aggregation. This is the fundamental problem we have in Hazelcast: how to ensure the complete exploration of the search space without excessive rule calls. Very short summary: 1. The default behavior of VolcanoOptimizer cannot explore the physical search space, so plans are not optimal 2. Abstract converters fix this if you follow a certain pattern in rule implementations (see 2.3), but generate too many rule calls, so join planning rules cannot be called together with other rules, which again lead to not optimal plans (yet, better than with p.1) 3. "Trait pull-up" proposal may fix it. But I have a feeling that pulling up possible trait combinations from a child node is indistinguishable from child node exploration, so it may be not very efficient again 4. A brand new optimizer implementation with recursive top-down approach may address all the concerns from p.1-p.3, but appears to be complex to implement and may be incompatible with existing rules Not an easy choice. Regards, Vladimir. [1] https://mail-archives.apache.org/mod_mbox/calcite-dev/201910.mbox/%3CCAJJmzpU9_75O48WeEKgHKg3fTMhK3eAMmHNOVvczj_uUTBxHkA%40mail.gmail.com%3E сб, 14 мар. 2020 г. в 21:53, Haisheng Yuan : > I agree that there should be no global rule queue, we should it do it on > per-operator basis, which is also how other major Cascades frameworks do. > > However, Calcite's VolcanoPlanner doesn't generate unnecessary rule calls > as you described. The current process is: > 1. global rule queue: ScanRule, ProjectRule > 2. Call ScanRule, produce physical scan > 3. Call ProjectRule, produce physical project. > > Even with global rule queue of reversed order ProjectRule, ScanRule, there > are still 2 rule calls. In your step 2, ProjectRule doesn't produce > physical node, which is incorrect. Any rule is, and should be independent > with each other rule. If your rule relies on other operators or rules to be > explored first, then you should think about it twice. > > - Haisheng > > -- > 发件人:Vladimir Ozerov > 日 期:2020年03月15日 01:50:10 > 收件人:dev@calcite.apache.org (dev@calcite.apache.org) > > 主 题:Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching > specific rels > > Hi Roman, > > In my understanding, the proposed mino
Re: Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
I agree that there should be no global rule queue, we should it do it on per-operator basis, which is also how other major Cascades frameworks do. However, Calcite's VolcanoPlanner doesn't generate unnecessary rule calls as you described. The current process is: 1. global rule queue: ScanRule, ProjectRule 2. Call ScanRule, produce physical scan 3. Call ProjectRule, produce physical project. Even with global rule queue of reversed order ProjectRule, ScanRule, there are still 2 rule calls. In your step 2, ProjectRule doesn't produce physical node, which is incorrect. Any rule is, and should be independent with each other rule. If your rule relies on other operators or rules to be explored first, then you should think about it twice. - Haisheng -- 发件人:Vladimir Ozerov 日 期:2020年03月15日 01:50:10 收件人:dev@calcite.apache.org (dev@calcite.apache.org) 主 题:Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels Hi Roman, In my understanding, the proposed minor changes may only decrease the total number of rule invocations slightly, but all principal problems remain the same. In the top-down approach, you do not want to implement bottom logical nodes unless it is requested explicitly by a parent operator. It seems to me that the very first step to efficient optimizer could be a new rule invocation infrastructure. There should be *no global rule queue* at all. Instead, we may introduce the per-node rule queue. Then, the optimizer performs a recursive top-down optimization dive, invoking the rules for every operator. Consider the following simple tree: -- LogicalProject LogicalScan Assuming that we have two implementation rules ProjectRule, ScanRule, and abstract converters enabled, VolcanoOptimizer will proceed as follows, generating one unnecessary rule call: 1. Define global rule call queue: ProjectRule, ScanRule 2. Call ProjectRule, no new nodes are produced 3. Call ScanRule, produce physical scans, reschedule ProjectRule 4. Call ProjectRule again, produce the physical project With the proposed approach, it will work differently: 1. Define per-operator queues: LogicalProject -> ProjectRule LogicalScan -> ScanRule 2. Call optimize(LogicalProject) 3. Invoke ProjectRule, which calls optimize(LogicalScan) on the input 4. Invoke ScanRule, produce physical scans, return control back to ProjectRule 5. Produce the physical project, finish optimization Now we have only 2 rule invocations as expected, and we reached the same result as in the proposed minor changes. But the crucial difference is that now we have well-defined control flow between operators: start at the top, delegate to children. With this infrastructure in place, we will be able to introduce more complex features, such as pruning, or partial exploration later on. But notice that this change will be incompatible with the current rules, i.e. they should be re-written for the new optimization algorithm (e.g. see step 3), which might be a blocker for the current Calcite users. So maybe it is better to think of a new optimizer, rather than fixing VolcanoOptimizer. Regards, Vladimir. вт, 14 янв. 2020 г. в 23:52, Haisheng Yuan : > On the other hand, if we don't preprocess and normalize the rel expression > before going to valcano planner, still compute and keep logical/relational > properties, like cardinality, for each operator, how can we achieve group > seach space pruning? Given a physical group expression, its required > property and upper bound cost C_limit, we need to get its child group > cardinality, compute local cost C_local, so that we can calculate the > child group's upper bound cost and pass it down. > > I can't figure out how we can do group pruning without shared relational > properties. > > - Haisheng > > -- > 发件人:Haisheng Yuan > 日 期:2020年01月14日 12:06:17 > 收件人:dev@calcite.apache.org > 主 题:Re: Re: [DISCUSS] Proposal to add API to force rules matching specific > rels > > The example is valid if Calcite doesn't do normalization or preprocessing > before going to VolcanoPlanner. > But many databases and big data systems will try to preprocess the > expression (push down predicates etc.) so that expressions in the same > group can share the logical properties, for most case if not all. You may > argue that it should be cost based, e.g. evaluating filter early can still > be bad. It is true, but how accurate will the statistics be, how accurate > will the cost model be? > > - Haisheng > > ---------- > 发件人:Julian Hyde > 日 期:2020年01月13日 08:44:54 > 收件人:dev@calcite.apache.org > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > > MEMO group (RelSet) represents logically equivalent
Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Roman, In my understanding, the proposed minor changes may only decrease the total number of rule invocations slightly, but all principal problems remain the same. In the top-down approach, you do not want to implement bottom logical nodes unless it is requested explicitly by a parent operator. It seems to me that the very first step to efficient optimizer could be a new rule invocation infrastructure. There should be *no global rule queue* at all. Instead, we may introduce the per-node rule queue. Then, the optimizer performs a recursive top-down optimization dive, invoking the rules for every operator. Consider the following simple tree: -- LogicalProject LogicalScan Assuming that we have two implementation rules ProjectRule, ScanRule, and abstract converters enabled, VolcanoOptimizer will proceed as follows, generating one unnecessary rule call: 1. Define global rule call queue: ProjectRule, ScanRule 2. Call ProjectRule, no new nodes are produced 3. Call ScanRule, produce physical scans, reschedule ProjectRule 4. Call ProjectRule again, produce the physical project With the proposed approach, it will work differently: 1. Define per-operator queues: LogicalProject -> ProjectRule LogicalScan -> ScanRule 2. Call optimize(LogicalProject) 3. Invoke ProjectRule, which calls optimize(LogicalScan) on the input 4. Invoke ScanRule, produce physical scans, return control back to ProjectRule 5. Produce the physical project, finish optimization Now we have only 2 rule invocations as expected, and we reached the same result as in the proposed minor changes. But the crucial difference is that now we have well-defined control flow between operators: start at the top, delegate to children. With this infrastructure in place, we will be able to introduce more complex features, such as pruning, or partial exploration later on. But notice that this change will be incompatible with the current rules, i.e. they should be re-written for the new optimization algorithm (e.g. see step 3), which might be a blocker for the current Calcite users. So maybe it is better to think of a new optimizer, rather than fixing VolcanoOptimizer. Regards, Vladimir. вт, 14 янв. 2020 г. в 23:52, Haisheng Yuan : > On the other hand, if we don't preprocess and normalize the rel expression > before going to valcano planner, still compute and keep logical/relational > properties, like cardinality, for each operator, how can we achieve group > seach space pruning? Given a physical group expression, its required > property and upper bound cost C_limit, we need to get its child group > cardinality, compute local cost C_local, so that we can calculate the > child group's upper bound cost and pass it down. > > I can't figure out how we can do group pruning without shared relational > properties. > > - Haisheng > > -- > 发件人:Haisheng Yuan > 日 期:2020年01月14日 12:06:17 > 收件人:dev@calcite.apache.org > 主 题:Re: Re: [DISCUSS] Proposal to add API to force rules matching specific > rels > > The example is valid if Calcite doesn't do normalization or preprocessing > before going to VolcanoPlanner. > But many databases and big data systems will try to preprocess the > expression (push down predicates etc.) so that expressions in the same > group can share the logical properties, for most case if not all. You may > argue that it should be cost based, e.g. evaluating filter early can still > be bad. It is true, but how accurate will the statistics be, how accurate > will the cost model be? > > - Haisheng > > ---------- > 发件人:Julian Hyde > 日 期:2020年01月13日 08:44:54 > 收件人:dev@calcite.apache.org > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > > MEMO group (RelSet) represents logically equivalent expressions. > > All the expressions in one group should share the same logical > > properties, e.g. functional dependency, constraint properties etc. > > But they are not sharing it. Computation is done for each individual > operator. > > It's good, and correct, that we compute for each individual operator. > > Suppose that a RelSubset s contains RelNode r1 and we know that the > constraint x > 0 holds. Suppose that we also have r2 with constraint y > < 10, and we discover that r1 and r2 are equivalent and belong > together in s. Now we can safely say that both constraints (x > 0 and > y < 10) apply to both r1 and r2. > > Deducing additional constraints in this way is a big win. The effort > to compute constraints for each RelNode is well-spent. > > This kind of deduction applies to other logical properties (e.g. > unique keys) and it applies to RelSet as well as RelSubset. > > Julian > > > On Sun, Jan
Re: Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
On the other hand, if we don't preprocess and normalize the rel expression before going to valcano planner, still compute and keep logical/relational properties, like cardinality, for each operator, how can we achieve group seach space pruning? Given a physical group expression, its required property and upper bound cost C_limit, we need to get its child group cardinality, compute local cost C_local, so that we can calculate the child group's upper bound cost and pass it down. I can't figure out how we can do group pruning without shared relational properties. - Haisheng -- 发件人:Haisheng Yuan 日 期:2020年01月14日 12:06:17 收件人:dev@calcite.apache.org 主 题:Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels The example is valid if Calcite doesn't do normalization or preprocessing before going to VolcanoPlanner. But many databases and big data systems will try to preprocess the expression (push down predicates etc.) so that expressions in the same group can share the logical properties, for most case if not all. You may argue that it should be cost based, e.g. evaluating filter early can still be bad. It is true, but how accurate will the statistics be, how accurate will the cost model be? - Haisheng -- 发件人:Julian Hyde 日 期:2020年01月13日 08:44:54 收件人:dev@calcite.apache.org 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > MEMO group (RelSet) represents logically equivalent expressions. > All the expressions in one group should share the same logical > properties, e.g. functional dependency, constraint properties etc. > But they are not sharing it. Computation is done for each individual operator. It's good, and correct, that we compute for each individual operator. Suppose that a RelSubset s contains RelNode r1 and we know that the constraint x > 0 holds. Suppose that we also have r2 with constraint y < 10, and we discover that r1 and r2 are equivalent and belong together in s. Now we can safely say that both constraints (x > 0 and y < 10) apply to both r1 and r2. Deducing additional constraints in this way is a big win. The effort to compute constraints for each RelNode is well-spent. This kind of deduction applies to other logical properties (e.g. unique keys) and it applies to RelSet as well as RelSubset. Julian On Sun, Jan 12, 2020 at 10:10 AM Roman Kondakov wrote: > > @Haisheng > > > Calcite uses Project operator and all kinds of ProjectXXXTranposeRule to > > prune unused columns. > > I also noticed that in most cases Project-related rules took significant > part of the planning time. But I didn't explore this problem yet. > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties etc. But they are not sharing > > it. Computation is done for each individual operator. > > I thought the equivalence of logical properties within groups (RelSets) > are implicit. For example, in RelSet#addInternal it is always verified > that the new added node has the same type as other members of the set. > > Anyway I absolutely agree with you that problems with traits > propagation, rules matching and other that you mentioned in the previous > e-mails should be solved in the first place. We need first to make > Volcano optimizer work right and only then make some improvements like > search space pruning. > > I would love to join this work to improve Volcano planner. Looking > forward for design doc. > > > -- > Kind Regards > Roman Kondakov > > > On 11.01.2020 11:42, Haisheng Yuan wrote: > > Currently, Calcite uses Project operator and all kinds of > > ProjectXXXTranposeRule to prune unused columns. Every operator's output > > columns use index to reference child operators' columns. If there is a > > Project operator with child operator of a Filter, if we push project down > > under Filter, we will have Project-Filter-Project-FilterInput. All the > > newly generated relnodes will trigger rule matches. e.g. If we already did > > ReduceExpressionRule on filter, but due to the new filter RexCall's input > > ref index changed, we have to apply ReduceExpressionRule on the new filter > > again, even there is nothing can be reduced. Similarly new operator > > transpose/merge rule will be triggered. This can trigger a lot of rule > > matches. > > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties e
Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
The example is valid if Calcite doesn't do normalization or preprocessing before going to VolcanoPlanner. But many databases and big data systems will try to preprocess the expression (push down predicates etc.) so that expressions in the same group can share the logical properties, for most case if not all. You may argue that it should be cost based, e.g. evaluating filter early can still be bad. It is true, but how accurate will the statistics be, how accurate will the cost model be? - Haisheng -- 发件人:Julian Hyde 日 期:2020年01月13日 08:44:54 收件人:dev@calcite.apache.org 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > MEMO group (RelSet) represents logically equivalent expressions. > All the expressions in one group should share the same logical > properties, e.g. functional dependency, constraint properties etc. > But they are not sharing it. Computation is done for each individual operator. It's good, and correct, that we compute for each individual operator. Suppose that a RelSubset s contains RelNode r1 and we know that the constraint x > 0 holds. Suppose that we also have r2 with constraint y < 10, and we discover that r1 and r2 are equivalent and belong together in s. Now we can safely say that both constraints (x > 0 and y < 10) apply to both r1 and r2. Deducing additional constraints in this way is a big win. The effort to compute constraints for each RelNode is well-spent. This kind of deduction applies to other logical properties (e.g. unique keys) and it applies to RelSet as well as RelSubset. Julian On Sun, Jan 12, 2020 at 10:10 AM Roman Kondakov wrote: > > @Haisheng > > > Calcite uses Project operator and all kinds of ProjectXXXTranposeRule to > > prune unused columns. > > I also noticed that in most cases Project-related rules took significant > part of the planning time. But I didn't explore this problem yet. > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties etc. But they are not sharing > > it. Computation is done for each individual operator. > > I thought the equivalence of logical properties within groups (RelSets) > are implicit. For example, in RelSet#addInternal it is always verified > that the new added node has the same type as other members of the set. > > Anyway I absolutely agree with you that problems with traits > propagation, rules matching and other that you mentioned in the previous > e-mails should be solved in the first place. We need first to make > Volcano optimizer work right and only then make some improvements like > search space pruning. > > I would love to join this work to improve Volcano planner. Looking > forward for design doc. > > > -- > Kind Regards > Roman Kondakov > > > On 11.01.2020 11:42, Haisheng Yuan wrote: > > Currently, Calcite uses Project operator and all kinds of > > ProjectXXXTranposeRule to prune unused columns. Every operator's output > > columns use index to reference child operators' columns. If there is a > > Project operator with child operator of a Filter, if we push project down > > under Filter, we will have Project-Filter-Project-FilterInput. All the > > newly generated relnodes will trigger rule matches. e.g. If we already did > > ReduceExpressionRule on filter, but due to the new filter RexCall's input > > ref index changed, we have to apply ReduceExpressionRule on the new filter > > again, even there is nothing can be reduced. Similarly new operator > > transpose/merge rule will be triggered. This can trigger a lot of rule > > matches. > > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties etc. But they are not sharing > > it. Computation is done for each individual operator. > > > > Without resolving those issue, space pruning won't help much. > > > > There are a lot of room for improvement. Hope the community can join the > > effort to make Calcite better. > > > > - Haisheng > > > > -- > > 发件人:Roman Kondakov > > 日 期:2020年01月10日 19:39:51 > > 收件人: > > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > > > @Haisheng, could you please clarify what you mean by these points? > > > >> - the poor-design of column pruning, > >> - lack of group properties etc. > > > > I guess I'm not aware of these problems. > >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
> MEMO group (RelSet) represents logically equivalent expressions. > All the expressions in one group should share the same logical > properties, e.g. functional dependency, constraint properties etc. > But they are not sharing it. Computation is done for each individual operator. It's good, and correct, that we compute for each individual operator. Suppose that a RelSubset s contains RelNode r1 and we know that the constraint x > 0 holds. Suppose that we also have r2 with constraint y < 10, and we discover that r1 and r2 are equivalent and belong together in s. Now we can safely say that both constraints (x > 0 and y < 10) apply to both r1 and r2. Deducing additional constraints in this way is a big win. The effort to compute constraints for each RelNode is well-spent. This kind of deduction applies to other logical properties (e.g. unique keys) and it applies to RelSet as well as RelSubset. Julian On Sun, Jan 12, 2020 at 10:10 AM Roman Kondakov wrote: > > @Haisheng > > > Calcite uses Project operator and all kinds of ProjectXXXTranposeRule to > > prune unused columns. > > I also noticed that in most cases Project-related rules took significant > part of the planning time. But I didn't explore this problem yet. > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties etc. But they are not sharing > > it. Computation is done for each individual operator. > > I thought the equivalence of logical properties within groups (RelSets) > are implicit. For example, in RelSet#addInternal it is always verified > that the new added node has the same type as other members of the set. > > Anyway I absolutely agree with you that problems with traits > propagation, rules matching and other that you mentioned in the previous > e-mails should be solved in the first place. We need first to make > Volcano optimizer work right and only then make some improvements like > search space pruning. > > I would love to join this work to improve Volcano planner. Looking > forward for design doc. > > > -- > Kind Regards > Roman Kondakov > > > On 11.01.2020 11:42, Haisheng Yuan wrote: > > Currently, Calcite uses Project operator and all kinds of > > ProjectXXXTranposeRule to prune unused columns. Every operator's output > > columns use index to reference child operators' columns. If there is a > > Project operator with child operator of a Filter, if we push project down > > under Filter, we will have Project-Filter-Project-FilterInput. All the > > newly generated relnodes will trigger rule matches. e.g. If we already did > > ReduceExpressionRule on filter, but due to the new filter RexCall's input > > ref index changed, we have to apply ReduceExpressionRule on the new filter > > again, even there is nothing can be reduced. Similarly new operator > > transpose/merge rule will be triggered. This can trigger a lot of rule > > matches. > > > > MEMO group (RelSet) represents logically equivalent expressions. All the > > expressions in one group should share the same logical properties, e.g. > > functional dependency, constraint properties etc. But they are not sharing > > it. Computation is done for each individual operator. > > > > Without resolving those issue, space pruning won't help much. > > > > There are a lot of room for improvement. Hope the community can join the > > effort to make Calcite better. > > > > - Haisheng > > > > -- > > 发件人:Roman Kondakov > > 日 期:2020年01月10日 19:39:51 > > 收件人: > > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > > > @Haisheng, could you please clarify what you mean by these points? > > > >> - the poor-design of column pruning, > >> - lack of group properties etc. > > > > I guess I'm not aware of these problems. > >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
@Haisheng > Calcite uses Project operator and all kinds of ProjectXXXTranposeRule to > prune unused columns. I also noticed that in most cases Project-related rules took significant part of the planning time. But I didn't explore this problem yet. > MEMO group (RelSet) represents logically equivalent expressions. All the > expressions in one group should share the same logical properties, e.g. > functional dependency, constraint properties etc. But they are not sharing > it. Computation is done for each individual operator. I thought the equivalence of logical properties within groups (RelSets) are implicit. For example, in RelSet#addInternal it is always verified that the new added node has the same type as other members of the set. Anyway I absolutely agree with you that problems with traits propagation, rules matching and other that you mentioned in the previous e-mails should be solved in the first place. We need first to make Volcano optimizer work right and only then make some improvements like search space pruning. I would love to join this work to improve Volcano planner. Looking forward for design doc. -- Kind Regards Roman Kondakov On 11.01.2020 11:42, Haisheng Yuan wrote: > Currently, Calcite uses Project operator and all kinds of > ProjectXXXTranposeRule to prune unused columns. Every operator's output > columns use index to reference child operators' columns. If there is a > Project operator with child operator of a Filter, if we push project down > under Filter, we will have Project-Filter-Project-FilterInput. All the newly > generated relnodes will trigger rule matches. e.g. If we already did > ReduceExpressionRule on filter, but due to the new filter RexCall's input ref > index changed, we have to apply ReduceExpressionRule on the new filter again, > even there is nothing can be reduced. Similarly new operator transpose/merge > rule will be triggered. This can trigger a lot of rule matches. > > MEMO group (RelSet) represents logically equivalent expressions. All the > expressions in one group should share the same logical properties, e.g. > functional dependency, constraint properties etc. But they are not sharing > it. Computation is done for each individual operator. > > Without resolving those issue, space pruning won't help much. > > There are a lot of room for improvement. Hope the community can join the > effort to make Calcite better. > > - Haisheng > > ---------- > 发件人:Roman Kondakov > 日 期:2020年01月10日 19:39:51 > 收件人: > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > @Haisheng, could you please clarify what you mean by these points? > >> - the poor-design of column pruning, >> - lack of group properties etc. > > I guess I'm not aware of these problems. >
Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Currently, Calcite uses Project operator and all kinds of ProjectXXXTranposeRule to prune unused columns. Every operator's output columns use index to reference child operators' columns. If there is a Project operator with child operator of a Filter, if we push project down under Filter, we will have Project-Filter-Project-FilterInput. All the newly generated relnodes will trigger rule matches. e.g. If we already did ReduceExpressionRule on filter, but due to the new filter RexCall's input ref index changed, we have to apply ReduceExpressionRule on the new filter again, even there is nothing can be reduced. Similarly new operator transpose/merge rule will be triggered. This can trigger a lot of rule matches. MEMO group (RelSet) represents logically equivalent expressions. All the expressions in one group should share the same logical properties, e.g. functional dependency, constraint properties etc. But they are not sharing it. Computation is done for each individual operator. Without resolving those issue, space pruning won't help much. There are a lot of room for improvement. Hope the community can join the effort to make Calcite better. - Haisheng -- 发件人:Roman Kondakov 日 期:2020年01月10日 19:39:51 收件人: 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels @Haisheng, could you please clarify what you mean by these points? > - the poor-design of column pruning, > - lack of group properties etc. I guess I'm not aware of these problems. -- Kind Regards Roman Kondakov On 08.01.2020 02:21, Haisheng Yuan wrote: >> @Haisheng, are you doing something like that? > Kind of, but not exactly. It is about on-demand trait propagation. > > @Roman seems to be keen on space pruning for Calcite. But IMHO, for now, the > main reason of Calcite's poor performance is not lack of branch & bound space > puning, but > - rule applying on physical nodes, > - randomness of rule matching, > - the poor-design of column pruning, > - lack of on-demand trait propagation, > - lack of group properties etc. > > We tried a similar change with Roman's on our product. We totally removed > rule match importance and its comparison, split it into exploration, > implementation, enforcement 3 phases with specific top-down/bottom-up order, > it achieved almost 100% speedup. > Even @vlsi's RexNode normalization can improve it to some degree. > > Calcite currently generates only 1 join-order alternative for 6-way joins in > testJoinManyWay, not even top 10, 100 or N! ordering alternatives, but it > still can't finish within reasonable amount of time when abstract converter > is allowed. If there is only 1 join order alternative, the query optimizer > should finish the optimization quickly even for clique or chain queries with > 20 way joins, without space pruning. But this is not the case for Calcite. > > Simply put it, space pruning is important for optimization, especially for > join-reordering, but not an urgent issue for Calcite. > > - Haisheng > > ------------------ > 发件人:Roman Kondakov > 日 期:2020年01月08日 02:39:19 > 收件人: > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > I forgot to mention that this approach was inspired by Stamatis's idea [1] > > [1] > https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E > >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
@Haisheng, could you please clarify what you mean by these points? > - the poor-design of column pruning, > - lack of group properties etc. I guess I'm not aware of these problems. -- Kind Regards Roman Kondakov On 08.01.2020 02:21, Haisheng Yuan wrote: >> @Haisheng, are you doing something like that? > Kind of, but not exactly. It is about on-demand trait propagation. > > @Roman seems to be keen on space pruning for Calcite. But IMHO, for now, the > main reason of Calcite's poor performance is not lack of branch & bound space > puning, but > - rule applying on physical nodes, > - randomness of rule matching, > - the poor-design of column pruning, > - lack of on-demand trait propagation, > - lack of group properties etc. > > We tried a similar change with Roman's on our product. We totally removed > rule match importance and its comparison, split it into exploration, > implementation, enforcement 3 phases with specific top-down/bottom-up order, > it achieved almost 100% speedup. > Even @vlsi's RexNode normalization can improve it to some degree. > > Calcite currently generates only 1 join-order alternative for 6-way joins in > testJoinManyWay, not even top 10, 100 or N! ordering alternatives, but it > still can't finish within reasonable amount of time when abstract converter > is allowed. If there is only 1 join order alternative, the query optimizer > should finish the optimization quickly even for clique or chain queries with > 20 way joins, without space pruning. But this is not the case for Calcite. > > Simply put it, space pruning is important for optimization, especially for > join-reordering, but not an urgent issue for Calcite. > > - Haisheng > > ------------------ > 发件人:Roman Kondakov > 日 期:2020年01月08日 02:39:19 > 收件人: > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > I forgot to mention that this approach was inspired by Stamatis's idea [1] > > [1] > https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E > >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
I see. But that’s unrelated to join ordering. > On Jan 7, 2020, at 11:29 PM, Danny Chan wrote: > > Internally we have a multi-inputs merge join, for each input there maybe a > collation permutations. > > Best, > Danny Chan > 在 2020年1月8日 +0800 AM1:20,Xiening Dai ,写道: >> Danny, I am not sure how this would affect join re-order. Could you >> elaborate? >> >> >>> On Jan 7, 2020, at 1:29 AM, Danny Chan wrote: >>> >>> Hi, guys, it seems that the discussion silent now, so do we have some >>> conclusion that can contribute to current code, i.e. the suggested API >>> change or new abstraction ? >>> >>> Or better, can someone give a design doc so that we can push and make that >>> implemented ? >>> >>> Personally I was always looking forward to the result, because Apache Flink >>> suffers for the bad planning performance for Join re-order or traits >>> auto-adapter. >>> >>> Best, >>> Danny Chan >>> 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: HI Igor, Thank you for the details. Meanwhile, I solved it with separation of conversion rules from the physical optimization rules. So the first pass creates physical nodes with unknown physical properties (top-bottom), while subsequent processing of the leaf nodes triggers rules which convert "bad" physical nodes to "good" physical nodes with know distribution and collation. Regards, Vladimir. пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : > Vladimir, > > Hope it may help you. > > Currently we applied the next way (just rough description): > > 1) We created an API to derive possible traits permutations on the basis > of children traits (quite similar to one, described in «On Demand trait > set request» topic) > > 2) added a general rule that copies Logical nodes, but requesting our > convention from their children (IGNITE convention, ANY distribution, EMPTY > collation) and sets importance of old Logical nodes to zero - so, we have > a > Logical parent which input satisfies any possible distribution and no > rules > matched to previous logical node any more. > > 3) Physical rules to create physical rel nodes only if physical traits may > be derived (there is no «barrier», described in one of previous messages) > - > derived traits are a collection, we don’t create a physical rel node for > each possible traits set, also we may set zero importance for previously > created rel nodes to decrease search space. > > Now we know actual and required distribution, we don’t need > AbstractConverters and able just call TraitDef.convert() method inside a > rule. > A rule still able to produce the same output several times, but > «memorization" inside the planner solves it for us. > > preliminary tests show almost zero overhead of the approach. > > Regards, > Igor > > >> 14 нояб. 2019 г., в 14:49, Vladimir Ozerov > написал(а): >> >> Hi Xing, >> >> Thanks for your suggestion. Yes, this may help, and if I get your idea >> right, I already had it in my reproducer: >> 1) Create the converted physical input: >> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 >> 2) Use it in case no physical children were found: >> > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 >> >> This idea is borrowed from Apache Drill physical rules. But the problem > is >> that this approach leads to a suboptimal plan - parent node doesn't know >> the future distribution of a child node. And as a result, it doesn't know >> it's own distribution. So the final plan is constructed in that way: >> 1.1) Root enforced "SINGLETON" on its child: >> -> PhysicalRoot[SINGLETON] >> -> Converter[SINGLETON] >> -> PhysicalProject[*ANY*] >> -> PhysicalScan[REPLICATED] >> >> 1.2) But since the child (PhysicalProject) failed to infer distribution >> during rule call, it falls back to ANY distribution. In order to satisfy >> SINGLETON distribution of a parent, we inject an exchange in the final > plan: >> -> PhysicalRoot[SINGLETON] >> * -> Exchange[SINGLETON]* >> -> PhysicalProject[*ANY*] >> -> PhysicalScan[REPLICATED] >> >> 2) But this is a suboptimal plan. The optimal plan is: >> -> PhysicalRoot[SINGLETON] >> -> PhysicalProject[REPLICATED] >> -> PhysicalScan[REPLICATED] >> >> You may observe it in my tests: >> 1) >> > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 >> - >> works as you described and produces not optimal plan with exchange >> 2) >> >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Internally we have a multi-inputs merge join, for each input there maybe a collation permutations. Best, Danny Chan 在 2020年1月8日 +0800 AM1:20,Xiening Dai ,写道: > Danny, I am not sure how this would affect join re-order. Could you elaborate? > > > > On Jan 7, 2020, at 1:29 AM, Danny Chan wrote: > > > > Hi, guys, it seems that the discussion silent now, so do we have some > > conclusion that can contribute to current code, i.e. the suggested API > > change or new abstraction ? > > > > Or better, can someone give a design doc so that we can push and make that > > implemented ? > > > > Personally I was always looking forward to the result, because Apache Flink > > suffers for the bad planning performance for Join re-order or traits > > auto-adapter. > > > > Best, > > Danny Chan > > 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: > > > HI Igor, > > > > > > Thank you for the details. Meanwhile, I solved it with separation of > > > conversion rules from the physical optimization rules. So the first pass > > > creates physical nodes with unknown physical properties (top-bottom), > > > while > > > subsequent processing of the leaf nodes triggers rules which convert "bad" > > > physical nodes to "good" physical nodes with know distribution and > > > collation. > > > > > > Regards, > > > Vladimir. > > > > > > пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : > > > > > > > Vladimir, > > > > > > > > Hope it may help you. > > > > > > > > Currently we applied the next way (just rough description): > > > > > > > > 1) We created an API to derive possible traits permutations on the basis > > > > of children traits (quite similar to one, described in «On Demand trait > > > > set request» topic) > > > > > > > > 2) added a general rule that copies Logical nodes, but requesting our > > > > convention from their children (IGNITE convention, ANY distribution, > > > > EMPTY > > > > collation) and sets importance of old Logical nodes to zero - so, we > > > > have a > > > > Logical parent which input satisfies any possible distribution and no > > > > rules > > > > matched to previous logical node any more. > > > > > > > > 3) Physical rules to create physical rel nodes only if physical traits > > > > may > > > > be derived (there is no «barrier», described in one of previous > > > > messages) - > > > > derived traits are a collection, we don’t create a physical rel node for > > > > each possible traits set, also we may set zero importance for previously > > > > created rel nodes to decrease search space. > > > > > > > > Now we know actual and required distribution, we don’t need > > > > AbstractConverters and able just call TraitDef.convert() method inside a > > > > rule. > > > > A rule still able to produce the same output several times, but > > > > «memorization" inside the planner solves it for us. > > > > > > > > preliminary tests show almost zero overhead of the approach. > > > > > > > > Regards, > > > > Igor > > > > > > > > > > > > > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov > > > > написал(а): > > > > > > > > > > Hi Xing, > > > > > > > > > > Thanks for your suggestion. Yes, this may help, and if I get your idea > > > > > right, I already had it in my reproducer: > > > > > 1) Create the converted physical input: > > > > > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 > > > > > 2) Use it in case no physical children were found: > > > > > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 > > > > > > > > > > This idea is borrowed from Apache Drill physical rules. But the > > > > > problem > > > > is > > > > > that this approach leads to a suboptimal plan - parent node doesn't > > > > > know > > > > > the future distribution of a child node. And as a result, it doesn't > > > > > know > > > > > it's own distribution. So the final plan is constructed in that way: > > > > > 1.1) Root enforced "SINGLETON" on its child: > > > > > -> PhysicalRoot[SINGLETON] > > > > > -> Converter[SINGLETON] > > > > > -> PhysicalProject[*ANY*] > > > > > -> PhysicalScan[REPLICATED] > > > > > > > > > > 1.2) But since the child (PhysicalProject) failed to infer > > > > > distribution > > > > > during rule call, it falls back to ANY distribution. In order to > > > > > satisfy > > > > > SINGLETON distribution of a parent, we inject an exchange in the final > > > > plan: > > > > > -> PhysicalRoot[SINGLETON] > > > > > * -> Exchange[SINGLETON]* > > > > > -> PhysicalProject[*ANY*] > > > > > -> PhysicalScan[REPLICATED] > > > > > > > > > > 2) But this is a suboptimal plan. The optimal plan is: > > > > > -> PhysicalRoot[SINGLETON] > > > > > -> PhysicalProject[REPLICATED] > > > > > -> PhysicalScan[REPLICATED] > > > > > > > > > > You may observe it in my tests: > > > > > 1) > > > > > > > > >
Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
> @Haisheng, are you doing something like that? Kind of, but not exactly. It is about on-demand trait propagation. @Roman seems to be keen on space pruning for Calcite. But IMHO, for now, the main reason of Calcite's poor performance is not lack of branch & bound space puning, but - rule applying on physical nodes, - randomness of rule matching, - the poor-design of column pruning, - lack of on-demand trait propagation, - lack of group properties etc. We tried a similar change with Roman's on our product. We totally removed rule match importance and its comparison, split it into exploration, implementation, enforcement 3 phases with specific top-down/bottom-up order, it achieved almost 100% speedup. Even @vlsi's RexNode normalization can improve it to some degree. Calcite currently generates only 1 join-order alternative for 6-way joins in testJoinManyWay, not even top 10, 100 or N! ordering alternatives, but it still can't finish within reasonable amount of time when abstract converter is allowed. If there is only 1 join order alternative, the query optimizer should finish the optimization quickly even for clique or chain queries with 20 way joins, without space pruning. But this is not the case for Calcite. Simply put it, space pruning is important for optimization, especially for join-reordering, but not an urgent issue for Calcite. - Haisheng -- 发件人:Roman Kondakov 日 期:2020年01月08日 02:39:19 收件人: 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels I forgot to mention that this approach was inspired by Stamatis's idea [1] [1] https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E -- Kind Regards Roman Kondakov On 07.01.2020 21:14, Roman Kondakov wrote: > Hello! > > As Vladimir Ozerov mentioned, the general problem here is that > VolcanoPlanner applies both logical and physical rules in a top-down > manner. It's more like the original volcano paper approach [1]: > >> In the Volcano search strategy, a first phase applied all >> transformation rules to create all possible logical expressions for a query >> and all its subtrees. The second phase, >> which performed the actual optimization, navigated within that network of >> equivalence classes and expressions, >> applied implementation rules to obtain plans, and determined the best plan. > The Cascades paper [2] says > >> In the Cascades optimizer, this separation into two phases is abolished, >> because it is not useful to derive all >> logically equivalent forms of all expressions, e.g., of a predicate. A group >> is explored using transformation rules >> only on demand > > Unfortunately, it is not clear from this paper what the actual "on > demand" phrase means. But there is a great explanation of the Cascades > optimization process can be found in [3]: > >> It begins with the input query and, ..., >> proceeds top-down, using recursion on the inputs >> of the current multiexpression MExpr. However, plan >> costing actually proceeds bottom-up, based on the order >> of the returns from the top-down recursive calls. > > and this is what VolcanoPlanner lacks for: when a logical rule is > applied in a top-down way and a new LogicalRelNode is created, we need > to understand whether the overall plan cost was improved by this move or > not. In order to understand it we need to know the actual cost of the > new plan. As [3] says: > >> A plan is an expression made up entirely of physical operators. > > Hence, to evaluate the plan cost we need to have fully implemented > physical tree of operators for our new LogicalRelNode which was created > during applying the logical rule above. > > This process can be described with the pseudo-code: > > for (LogicalRuleMatch rule : LogicalRuleMatchesSortedTopDown) { > LogicalNode newNode = rule.apply(); > implementRecursivelyBottomUp(newNode); > } > > we first apply logical rules in a top-down fashion and then after each > logical rule invocation we need to implement newly created nodes with > the physical rules recursively in a bottom-up way. > > The outcome here is when the physical converter rule is applied to a > RelNode, all children of this node are already implemented, so their > traits can easily be derived and the precise cost of them can be calculated. > > I tried to simulate this behavior with minor changes in > RuleQueue.RuleMatchImportanceComparator (see PR [4]) and it worked well > for me. Tests started perform better. I counted planner ticks (see > VolcanoPlanner#findBestExp) to evaluate performance improvements. For > example, the number of ticks
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
I forgot to mention that this approach was inspired by Stamatis's idea [1] [1] https://ponymail-vm.apache.org/_GUI_/thread.html/d8f8bc0efd091c0750534ca5cd224f4dfe8940c9d0a99ce486516fd5@%3Cdev.calcite.apache.org%3E -- Kind Regards Roman Kondakov On 07.01.2020 21:14, Roman Kondakov wrote: > Hello! > > As Vladimir Ozerov mentioned, the general problem here is that > VolcanoPlanner applies both logical and physical rules in a top-down > manner. It's more like the original volcano paper approach [1]: > >> In the Volcano search strategy, a first phase applied all >> transformation rules to create all possible logical expressions for a query >> and all its subtrees. The second phase, >> which performed the actual optimization, navigated within that network of >> equivalence classes and expressions, >> applied implementation rules to obtain plans, and determined the best plan. > The Cascades paper [2] says > >> In the Cascades optimizer, this separation into two phases is abolished, >> because it is not useful to derive all >> logically equivalent forms of all expressions, e.g., of a predicate. A group >> is explored using transformation rules >> only on demand > > Unfortunately, it is not clear from this paper what the actual "on > demand" phrase means. But there is a great explanation of the Cascades > optimization process can be found in [3]: > >> It begins with the input query and, ..., >> proceeds top-down, using recursion on the inputs >> of the current multiexpression MExpr. However, plan >> costing actually proceeds bottom-up, based on the order >> of the returns from the top-down recursive calls. > > and this is what VolcanoPlanner lacks for: when a logical rule is > applied in a top-down way and a new LogicalRelNode is created, we need > to understand whether the overall plan cost was improved by this move or > not. In order to understand it we need to know the actual cost of the > new plan. As [3] says: > >> A plan is an expression made up entirely of physical operators. > > Hence, to evaluate the plan cost we need to have fully implemented > physical tree of operators for our new LogicalRelNode which was created > during applying the logical rule above. > > This process can be described with the pseudo-code: > > for (LogicalRuleMatch rule : LogicalRuleMatchesSortedTopDown) { > LogicalNode newNode = rule.apply(); > implementRecursivelyBottomUp(newNode); > } > > we first apply logical rules in a top-down fashion and then after each > logical rule invocation we need to implement newly created nodes with > the physical rules recursively in a bottom-up way. > > The outcome here is when the physical converter rule is applied to a > RelNode, all children of this node are already implemented, so their > traits can easily be derived and the precise cost of them can be calculated. > > I tried to simulate this behavior with minor changes in > RuleQueue.RuleMatchImportanceComparator (see PR [4]) and it worked well > for me. Tests started perform better. I counted planner ticks (see > VolcanoPlanner#findBestExp) to evaluate performance improvements. For > example, the number of ticks in JdbcTest#testJoinManyWay before > comparator change: > > ticks=11 > ticks=1041 > ticks=31362 > > and after: > > ticks=11 > ticks=678 > ticks=26231 > > In some Apache Ignite test scenarios the number of ticks was reduced by > half. > > The essence of this change is the minor adjustment of > RuleQueue.RuleMatchImportanceComparator: > > - converter rules (physical rules) always have priority over the logical > ones (if there are physical rules in the queue, they will be applied first) > - Physical rules are ordered in the bottom-up fashion > - Logical rules are ordered in the top-down fashion > > For example, if we have the initial rel: > > LogicalProject > LogicalFilter > > the rule queue for this RelNode would be ordered like this: > > 1. PhysicalFilterRule <-children physical rules go first > 2. PhysicalProjectRule <- then go parent physical rules > 3. FilterProjectTransposeRule <- logical rules go last > > This minor improvement might help in some cases: > - slight reducing the planning time. > - simplification of the bottom-up trait propagation because when it's > time to physically implement a parent node, all it's children are > already physically implemented. > > But in general this approach is not the actual Cascades implementation > because it still missing one of the most useful Cascades feature: groups > prunning. Details can be found in [3]. Long story short: we can skip > optimization of some RelNodes if it is known that it's children have the > bigger cost than the best RelNode in the same Subset. The more > fundamental changes are required to fix it. > > @Haisheng, are you doing something like that? > > > [1] https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf > [2] > https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Cascades-graefe.pdf > [3] >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
> ------------------ > 发件人:Danny Chan > 日 期:2020年01月07日 17:29:46 > 收件人: > 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels > > Hi, guys, it seems that the discussion silent now, so do we have some > conclusion that can contribute to current code, i.e. the suggested API change > or new abstraction ? > > Or better, can someone give a design doc so that we can push and make that > implemented ? > > Personally I was always looking forward to the result, because Apache Flink > suffers for the bad planning performance for Join re-order or traits > auto-adapter. > > Best, > Danny Chan > 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: >> HI Igor, >> >> Thank you for the details. Meanwhile, I solved it with separation of >> conversion rules from the physical optimization rules. So the first pass >> creates physical nodes with unknown physical properties (top-bottom), while >> subsequent processing of the leaf nodes triggers rules which convert "bad" >> physical nodes to "good" physical nodes with know distribution and >> collation. >> >> Regards, >> Vladimir. >> >> пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : >> >>> Vladimir, >>> >>> Hope it may help you. >>> >>> Currently we applied the next way (just rough description): >>> >>> 1) We created an API to derive possible traits permutations on the basis >>> of children traits (quite similar to one, described in «On Demand trait >>> set request» topic) >>> >>> 2) added a general rule that copies Logical nodes, but requesting our >>> convention from their children (IGNITE convention, ANY distribution, EMPTY >>> collation) and sets importance of old Logical nodes to zero - so, we have a >>> Logical parent which input satisfies any possible distribution and no rules >>> matched to previous logical node any more. >>> >>> 3) Physical rules to create physical rel nodes only if physical traits may >>> be derived (there is no «barrier», described in one of previous messages) - >>> derived traits are a collection, we don’t create a physical rel node for >>> each possible traits set, also we may set zero importance for previously >>> created rel nodes to decrease search space. >>> >>> Now we know actual and required distribution, we don’t need >>> AbstractConverters and able just call TraitDef.convert() method inside a >>> rule. >>> A rule still able to produce the same output several times, but >>> «memorization" inside the planner solves it for us. >>> >>> preliminary tests show almost zero overhead of the approach. >>> >>> Regards, >>> Igor >>> >>> >>>> 14 нояб. 2019 г., в 14:49, Vladimir Ozerov >>> написал(а): >>>> >>>> Hi Xing, >>>> >>>> Thanks for your suggestion. Yes, this may help, and if I get your idea >>>> right, I already had it in my reproducer: >>>> 1) Create the converted physical input: >>>> >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 >>>> 2) Use it in case no physical children were found: >>>> >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 >>>> >>>> This idea is borrowed from Apache Drill physical rules. But the problem >>> is >>>> that this approach leads to a suboptimal plan - parent node doesn't know >>>> the future distribution of a child node. And as a result, it doesn't know >>>> it's own distribution. So the final plan is constructed in that way: >>>> 1.1) Root enforced "SINGLETON" on its child: >>>> -> PhysicalRoot[SINGLETON] >>>> -> Converter[SINGLETON] >>>> -> PhysicalProject[*ANY*] >>>> -> PhysicalScan[REPLICATED] >>>> >>>> 1.2) But since the child (PhysicalProject) failed to infer distribution >>>> during rule call, it falls back to ANY distribution. In order to satisfy >>>> SINGLETON distribution of a parent, we inject an exchange in the final >>> plan: >>>> -> PhysicalRoot[SINGLETON] >>>> * -> Exchange[SINGLETON]* >>>> -> PhysicalProject[*ANY*] >>>> -> PhysicalScan[REPLICATED] >>>> >>>> 2) But this is a subopt
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Danny, I am not sure how this would affect join re-order. Could you elaborate? > On Jan 7, 2020, at 1:29 AM, Danny Chan wrote: > > Hi, guys, it seems that the discussion silent now, so do we have some > conclusion that can contribute to current code, i.e. the suggested API change > or new abstraction ? > > Or better, can someone give a design doc so that we can push and make that > implemented ? > > Personally I was always looking forward to the result, because Apache Flink > suffers for the bad planning performance for Join re-order or traits > auto-adapter. > > Best, > Danny Chan > 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: >> HI Igor, >> >> Thank you for the details. Meanwhile, I solved it with separation of >> conversion rules from the physical optimization rules. So the first pass >> creates physical nodes with unknown physical properties (top-bottom), while >> subsequent processing of the leaf nodes triggers rules which convert "bad" >> physical nodes to "good" physical nodes with know distribution and >> collation. >> >> Regards, >> Vladimir. >> >> пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : >> >>> Vladimir, >>> >>> Hope it may help you. >>> >>> Currently we applied the next way (just rough description): >>> >>> 1) We created an API to derive possible traits permutations on the basis >>> of children traits (quite similar to one, described in «On Demand trait >>> set request» topic) >>> >>> 2) added a general rule that copies Logical nodes, but requesting our >>> convention from their children (IGNITE convention, ANY distribution, EMPTY >>> collation) and sets importance of old Logical nodes to zero - so, we have a >>> Logical parent which input satisfies any possible distribution and no rules >>> matched to previous logical node any more. >>> >>> 3) Physical rules to create physical rel nodes only if physical traits may >>> be derived (there is no «barrier», described in one of previous messages) - >>> derived traits are a collection, we don’t create a physical rel node for >>> each possible traits set, also we may set zero importance for previously >>> created rel nodes to decrease search space. >>> >>> Now we know actual and required distribution, we don’t need >>> AbstractConverters and able just call TraitDef.convert() method inside a >>> rule. >>> A rule still able to produce the same output several times, but >>> «memorization" inside the planner solves it for us. >>> >>> preliminary tests show almost zero overhead of the approach. >>> >>> Regards, >>> Igor >>> >>> 14 нояб. 2019 г., в 14:49, Vladimir Ozerov >>> написал(а): Hi Xing, Thanks for your suggestion. Yes, this may help, and if I get your idea right, I already had it in my reproducer: 1) Create the converted physical input: >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 2) Use it in case no physical children were found: >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 This idea is borrowed from Apache Drill physical rules. But the problem >>> is that this approach leads to a suboptimal plan - parent node doesn't know the future distribution of a child node. And as a result, it doesn't know it's own distribution. So the final plan is constructed in that way: 1.1) Root enforced "SINGLETON" on its child: -> PhysicalRoot[SINGLETON] -> Converter[SINGLETON] -> PhysicalProject[*ANY*] -> PhysicalScan[REPLICATED] 1.2) But since the child (PhysicalProject) failed to infer distribution during rule call, it falls back to ANY distribution. In order to satisfy SINGLETON distribution of a parent, we inject an exchange in the final >>> plan: -> PhysicalRoot[SINGLETON] * -> Exchange[SINGLETON]* -> PhysicalProject[*ANY*] -> PhysicalScan[REPLICATED] 2) But this is a suboptimal plan. The optimal plan is: -> PhysicalRoot[SINGLETON] -> PhysicalProject[REPLICATED] -> PhysicalScan[REPLICATED] You may observe it in my tests: 1) >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 - works as you described and produces not optimal plan with exchange 2) >>> https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 - rely on AbstractConverter-s and produce an optimal plan with bottom-up trait propagation at the cost of significantly increased planning time Regards, Vladimir. пт, 8 нояб. 2019 г. в 16:15, XING JIN : > Hi Vladimir, > > I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work >>> may > help you ~ > They work by a top-down fashion, but when matching parent, they
Re: Re: [DISCUSS] Proposal to add API to force rules matching specific rels
It is still on my radar. I have been busy these days, but will send out a design doc in a few days. But just a heads up, the change would be larger than everyone's expected. - Haisheng -- 发件人:Danny Chan 日 期:2020年01月07日 17:29:46 收件人: 主 题:Re: [DISCUSS] Proposal to add API to force rules matching specific rels Hi, guys, it seems that the discussion silent now, so do we have some conclusion that can contribute to current code, i.e. the suggested API change or new abstraction ? Or better, can someone give a design doc so that we can push and make that implemented ? Personally I was always looking forward to the result, because Apache Flink suffers for the bad planning performance for Join re-order or traits auto-adapter. Best, Danny Chan 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: > HI Igor, > > Thank you for the details. Meanwhile, I solved it with separation of > conversion rules from the physical optimization rules. So the first pass > creates physical nodes with unknown physical properties (top-bottom), while > subsequent processing of the leaf nodes triggers rules which convert "bad" > physical nodes to "good" physical nodes with know distribution and > collation. > > Regards, > Vladimir. > > пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : > > > Vladimir, > > > > Hope it may help you. > > > > Currently we applied the next way (just rough description): > > > > 1) We created an API to derive possible traits permutations on the basis > > of children traits (quite similar to one, described in «On Demand trait > > set request» topic) > > > > 2) added a general rule that copies Logical nodes, but requesting our > > convention from their children (IGNITE convention, ANY distribution, EMPTY > > collation) and sets importance of old Logical nodes to zero - so, we have a > > Logical parent which input satisfies any possible distribution and no rules > > matched to previous logical node any more. > > > > 3) Physical rules to create physical rel nodes only if physical traits may > > be derived (there is no «barrier», described in one of previous messages) - > > derived traits are a collection, we don’t create a physical rel node for > > each possible traits set, also we may set zero importance for previously > > created rel nodes to decrease search space. > > > > Now we know actual and required distribution, we don’t need > > AbstractConverters and able just call TraitDef.convert() method inside a > > rule. > > A rule still able to produce the same output several times, but > > «memorization" inside the planner solves it for us. > > > > preliminary tests show almost zero overhead of the approach. > > > > Regards, > > Igor > > > > > > > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov > > написал(а): > > > > > > Hi Xing, > > > > > > Thanks for your suggestion. Yes, this may help, and if I get your idea > > > right, I already had it in my reproducer: > > > 1) Create the converted physical input: > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 > > > 2) Use it in case no physical children were found: > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 > > > > > > This idea is borrowed from Apache Drill physical rules. But the problem > > is > > > that this approach leads to a suboptimal plan - parent node doesn't know > > > the future distribution of a child node. And as a result, it doesn't know > > > it's own distribution. So the final plan is constructed in that way: > > > 1.1) Root enforced "SINGLETON" on its child: > > > -> PhysicalRoot[SINGLETON] > > > -> Converter[SINGLETON] > > > -> PhysicalProject[*ANY*] > > > -> PhysicalScan[REPLICATED] > > > > > > 1.2) But since the child (PhysicalProject) failed to infer distribution > > > during rule call, it falls back to ANY distribution. In order to satisfy > > > SINGLETON distribution of a parent, we inject an exchange in the final > > plan: > > > -> PhysicalRoot[SINGLETON] > > > * -> Exchange[SINGLETON]* > > > -> PhysicalProject[*ANY*] > > > -> PhysicalScan[REPLICATED] > > > > > > 2) But this is a suboptimal plan. The optimal plan is: > > > -> PhysicalRoot[SINGLETON] > > > -> PhysicalProject[REPLICATED] > > >
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi, guys, it seems that the discussion silent now, so do we have some conclusion that can contribute to current code, i.e. the suggested API change or new abstraction ? Or better, can someone give a design doc so that we can push and make that implemented ? Personally I was always looking forward to the result, because Apache Flink suffers for the bad planning performance for Join re-order or traits auto-adapter. Best, Danny Chan 在 2019年11月20日 +0800 AM2:14,Vladimir Ozerov ,写道: > HI Igor, > > Thank you for the details. Meanwhile, I solved it with separation of > conversion rules from the physical optimization rules. So the first pass > creates physical nodes with unknown physical properties (top-bottom), while > subsequent processing of the leaf nodes triggers rules which convert "bad" > physical nodes to "good" physical nodes with know distribution and > collation. > > Regards, > Vladimir. > > пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : > > > Vladimir, > > > > Hope it may help you. > > > > Currently we applied the next way (just rough description): > > > > 1) We created an API to derive possible traits permutations on the basis > > of children traits (quite similar to one, described in «On Demand trait > > set request» topic) > > > > 2) added a general rule that copies Logical nodes, but requesting our > > convention from their children (IGNITE convention, ANY distribution, EMPTY > > collation) and sets importance of old Logical nodes to zero - so, we have a > > Logical parent which input satisfies any possible distribution and no rules > > matched to previous logical node any more. > > > > 3) Physical rules to create physical rel nodes only if physical traits may > > be derived (there is no «barrier», described in one of previous messages) - > > derived traits are a collection, we don’t create a physical rel node for > > each possible traits set, also we may set zero importance for previously > > created rel nodes to decrease search space. > > > > Now we know actual and required distribution, we don’t need > > AbstractConverters and able just call TraitDef.convert() method inside a > > rule. > > A rule still able to produce the same output several times, but > > «memorization" inside the planner solves it for us. > > > > preliminary tests show almost zero overhead of the approach. > > > > Regards, > > Igor > > > > > > > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov > > написал(а): > > > > > > Hi Xing, > > > > > > Thanks for your suggestion. Yes, this may help, and if I get your idea > > > right, I already had it in my reproducer: > > > 1) Create the converted physical input: > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 > > > 2) Use it in case no physical children were found: > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 > > > > > > This idea is borrowed from Apache Drill physical rules. But the problem > > is > > > that this approach leads to a suboptimal plan - parent node doesn't know > > > the future distribution of a child node. And as a result, it doesn't know > > > it's own distribution. So the final plan is constructed in that way: > > > 1.1) Root enforced "SINGLETON" on its child: > > > -> PhysicalRoot[SINGLETON] > > > -> Converter[SINGLETON] > > > -> PhysicalProject[*ANY*] > > > -> PhysicalScan[REPLICATED] > > > > > > 1.2) But since the child (PhysicalProject) failed to infer distribution > > > during rule call, it falls back to ANY distribution. In order to satisfy > > > SINGLETON distribution of a parent, we inject an exchange in the final > > plan: > > > -> PhysicalRoot[SINGLETON] > > > * -> Exchange[SINGLETON]* > > > -> PhysicalProject[*ANY*] > > > -> PhysicalScan[REPLICATED] > > > > > > 2) But this is a suboptimal plan. The optimal plan is: > > > -> PhysicalRoot[SINGLETON] > > > -> PhysicalProject[REPLICATED] > > > -> PhysicalScan[REPLICATED] > > > > > > You may observe it in my tests: > > > 1) > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 > > > - > > > works as you described and produces not optimal plan with exchange > > > 2) > > > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 > > > - > > > rely on AbstractConverter-s and produce an optimal plan with bottom-up > > > trait propagation at the cost of significantly increased planning time > > > > > > Regards, > > > Vladimir. > > > > > > пт, 8 нояб. 2019 г. в 16:15, XING JIN : > > > > > > > Hi Vladimir, > > > > > > > > I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work > > may > > > > help you ~ > > > > They work by a top-down fashion, but when matching parent, they convert > > the > > > > children explicitly. > > > > You may try below steps: > > > > 1. Construct a rule LogicalParentRule to match the
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
HI Igor, Thank you for the details. Meanwhile, I solved it with separation of conversion rules from the physical optimization rules. So the first pass creates physical nodes with unknown physical properties (top-bottom), while subsequent processing of the leaf nodes triggers rules which convert "bad" physical nodes to "good" physical nodes with know distribution and collation. Regards, Vladimir. пн, 18 нояб. 2019 г. в 13:43, Seliverstov Igor : > Vladimir, > > Hope it may help you. > > Currently we applied the next way (just rough description): > > 1) We created an API to derive possible traits permutations on the basis > of children traits (quite similar to one, described in «On Demand trait > set request» topic) > > 2) added a general rule that copies Logical nodes, but requesting our > convention from their children (IGNITE convention, ANY distribution, EMPTY > collation) and sets importance of old Logical nodes to zero - so, we have a > Logical parent which input satisfies any possible distribution and no rules > matched to previous logical node any more. > > 3) Physical rules to create physical rel nodes only if physical traits may > be derived (there is no «barrier», described in one of previous messages) - > derived traits are a collection, we don’t create a physical rel node for > each possible traits set, also we may set zero importance for previously > created rel nodes to decrease search space. > > Now we know actual and required distribution, we don’t need > AbstractConverters and able just call TraitDef.convert() method inside a > rule. > A rule still able to produce the same output several times, but > «memorization" inside the planner solves it for us. > > preliminary tests show almost zero overhead of the approach. > > Regards, > Igor > > > > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov > написал(а): > > > > Hi Xing, > > > > Thanks for your suggestion. Yes, this may help, and if I get your idea > > right, I already had it in my reproducer: > > 1) Create the converted physical input: > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 > > 2) Use it in case no physical children were found: > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 > > > > This idea is borrowed from Apache Drill physical rules. But the problem > is > > that this approach leads to a suboptimal plan - parent node doesn't know > > the future distribution of a child node. And as a result, it doesn't know > > it's own distribution. So the final plan is constructed in that way: > > 1.1) Root enforced "SINGLETON" on its child: > > -> PhysicalRoot[SINGLETON] > > -> Converter[SINGLETON] > > -> PhysicalProject[*ANY*] > > -> PhysicalScan[REPLICATED] > > > > 1.2) But since the child (PhysicalProject) failed to infer distribution > > during rule call, it falls back to ANY distribution. In order to satisfy > > SINGLETON distribution of a parent, we inject an exchange in the final > plan: > > -> PhysicalRoot[SINGLETON] > > * -> Exchange[SINGLETON]* > > -> PhysicalProject[*ANY*] > > -> PhysicalScan[REPLICATED] > > > > 2) But this is a suboptimal plan. The optimal plan is: > > -> PhysicalRoot[SINGLETON] > > -> PhysicalProject[REPLICATED] > > -> PhysicalScan[REPLICATED] > > > > You may observe it in my tests: > > 1) > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 > > - > > works as you described and produces not optimal plan with exchange > > 2) > > > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 > > - > > rely on AbstractConverter-s and produce an optimal plan with bottom-up > > trait propagation at the cost of significantly increased planning time > > > > Regards, > > Vladimir. > > > > пт, 8 нояб. 2019 г. в 16:15, XING JIN : > > > >> Hi Vladimir, > >> > >> I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work > may > >> help you ~ > >> They work by a top-down fashion, but when matching parent, they convert > the > >> children explicitly. > >> You may try below steps: > >> 1. Construct a rule LogicalParentRule to match the LogicalParent without > >> distribution/physical requirement for the LogicalChild; > >> 2. In this rule, you call 'planner.changeTraits' on the LogicalChild to > >> build a new child with physical convention. Note that at this moment > only > >> an empty RelSubset is created and no PhysicalChild exists. > >> 3. Then set the RelNode to be the new input of LogicalParent; > >> > >> By above steps, you can build a parent-child relationship between > >> LogicalParent and PhysicalChild, and at last the PhysicalParentRule > will be > >> fired based on this relationship. > >> > >> I have a commit to illustrate my idea, check VolcanoPlannerTest#testDEV > in > >> below link, hope it may help you ~ > >>
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Vladimir, Hope it may help you. Currently we applied the next way (just rough description): 1) We created an API to derive possible traits permutations on the basis of children traits (quite similar to one, described in «On Demand trait set request» topic) 2) added a general rule that copies Logical nodes, but requesting our convention from their children (IGNITE convention, ANY distribution, EMPTY collation) and sets importance of old Logical nodes to zero - so, we have a Logical parent which input satisfies any possible distribution and no rules matched to previous logical node any more. 3) Physical rules to create physical rel nodes only if physical traits may be derived (there is no «barrier», described in one of previous messages) - derived traits are a collection, we don’t create a physical rel node for each possible traits set, also we may set zero importance for previously created rel nodes to decrease search space. Now we know actual and required distribution, we don’t need AbstractConverters and able just call TraitDef.convert() method inside a rule. A rule still able to produce the same output several times, but «memorization" inside the planner solves it for us. preliminary tests show almost zero overhead of the approach. Regards, Igor > 14 нояб. 2019 г., в 14:49, Vladimir Ozerov написал(а): > > Hi Xing, > > Thanks for your suggestion. Yes, this may help, and if I get your idea > right, I already had it in my reproducer: > 1) Create the converted physical input: > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 > 2) Use it in case no physical children were found: > https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 > > This idea is borrowed from Apache Drill physical rules. But the problem is > that this approach leads to a suboptimal plan - parent node doesn't know > the future distribution of a child node. And as a result, it doesn't know > it's own distribution. So the final plan is constructed in that way: > 1.1) Root enforced "SINGLETON" on its child: > -> PhysicalRoot[SINGLETON] > -> Converter[SINGLETON] > -> PhysicalProject[*ANY*] > -> PhysicalScan[REPLICATED] > > 1.2) But since the child (PhysicalProject) failed to infer distribution > during rule call, it falls back to ANY distribution. In order to satisfy > SINGLETON distribution of a parent, we inject an exchange in the final plan: > -> PhysicalRoot[SINGLETON] > * -> Exchange[SINGLETON]* > -> PhysicalProject[*ANY*] > -> PhysicalScan[REPLICATED] > > 2) But this is a suboptimal plan. The optimal plan is: > -> PhysicalRoot[SINGLETON] > -> PhysicalProject[REPLICATED] > -> PhysicalScan[REPLICATED] > > You may observe it in my tests: > 1) > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 > - > works as you described and produces not optimal plan with exchange > 2) > https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 > - > rely on AbstractConverter-s and produce an optimal plan with bottom-up > trait propagation at the cost of significantly increased planning time > > Regards, > Vladimir. > > пт, 8 нояб. 2019 г. в 16:15, XING JIN : > >> Hi Vladimir, >> >> I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may >> help you ~ >> They work by a top-down fashion, but when matching parent, they convert the >> children explicitly. >> You may try below steps: >> 1. Construct a rule LogicalParentRule to match the LogicalParent without >> distribution/physical requirement for the LogicalChild; >> 2. In this rule, you call 'planner.changeTraits' on the LogicalChild to >> build a new child with physical convention. Note that at this moment only >> an empty RelSubset is created and no PhysicalChild exists. >> 3. Then set the RelNode to be the new input of LogicalParent; >> >> By above steps, you can build a parent-child relationship between >> LogicalParent and PhysicalChild, and at last the PhysicalParentRule will be >> fired based on this relationship. >> >> I have a commit to illustrate my idea, check VolcanoPlannerTest#testDEV in >> below link, hope it may help you ~ >> https://github.com/jinxing64/calcite/tree/demo >> >> Also I'm +1 with Seliverstov that to get all parents of a set, which >> against the current check in RelSubset#getParentRels >> >> Best, >> Jin >> >> Vladimir Ozerov 于2019年11月5日周二 下午6:41写道: >> >>> Hi Xiening, >>> >>> I read the thread about on-demand trait requests. It seems pretty similar >>> to what I am trying to achieve, as it facilitates the bottom-up >> propagation >>> of physical traits. In fact, both your and my strategy propagate traits >>> bottom-up, but I do this through rules, which also fire bottom-up, while >> in >>> your case only the traits are propagated bottom-up, while rules continue >>>
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Xing, Thanks for your suggestion. Yes, this may help, and if I get your idea right, I already had it in my reproducer: 1) Create the converted physical input: https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L49 2) Use it in case no physical children were found: https://github.com/devozerov/calcite-optimizer/blob/master/src/main/java/devozerov/physical/ProjectPhysicalRule.java#L79 This idea is borrowed from Apache Drill physical rules. But the problem is that this approach leads to a suboptimal plan - parent node doesn't know the future distribution of a child node. And as a result, it doesn't know it's own distribution. So the final plan is constructed in that way: 1.1) Root enforced "SINGLETON" on its child: -> PhysicalRoot[SINGLETON] -> Converter[SINGLETON] -> PhysicalProject[*ANY*] -> PhysicalScan[REPLICATED] 1.2) But since the child (PhysicalProject) failed to infer distribution during rule call, it falls back to ANY distribution. In order to satisfy SINGLETON distribution of a parent, we inject an exchange in the final plan: -> PhysicalRoot[SINGLETON] * -> Exchange[SINGLETON]* -> PhysicalProject[*ANY*] -> PhysicalScan[REPLICATED] 2) But this is a suboptimal plan. The optimal plan is: -> PhysicalRoot[SINGLETON] -> PhysicalProject[REPLICATED] -> PhysicalScan[REPLICATED] You may observe it in my tests: 1) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L46 - works as you described and produces not optimal plan with exchange 2) https://github.com/devozerov/calcite-optimizer/blob/master/src/test/java/devozerov/OptimizerTest.java#L30 - rely on AbstractConverter-s and produce an optimal plan with bottom-up trait propagation at the cost of significantly increased planning time Regards, Vladimir. пт, 8 нояб. 2019 г. в 16:15, XING JIN : > Hi Vladimir, > > I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may > help you ~ > They work by a top-down fashion, but when matching parent, they convert the > children explicitly. > You may try below steps: > 1. Construct a rule LogicalParentRule to match the LogicalParent without > distribution/physical requirement for the LogicalChild; > 2. In this rule, you call 'planner.changeTraits' on the LogicalChild to > build a new child with physical convention. Note that at this moment only > an empty RelSubset is created and no PhysicalChild exists. > 3. Then set the RelNode to be the new input of LogicalParent; > > By above steps, you can build a parent-child relationship between > LogicalParent and PhysicalChild, and at last the PhysicalParentRule will be > fired based on this relationship. > > I have a commit to illustrate my idea, check VolcanoPlannerTest#testDEV in > below link, hope it may help you ~ > https://github.com/jinxing64/calcite/tree/demo > > Also I'm +1 with Seliverstov that to get all parents of a set, which > against the current check in RelSubset#getParentRels > > Best, > Jin > > Vladimir Ozerov 于2019年11月5日周二 下午6:41写道: > > > Hi Xiening, > > > > I read the thread about on-demand trait requests. It seems pretty similar > > to what I am trying to achieve, as it facilitates the bottom-up > propagation > > of physical traits. In fact, both your and my strategy propagate traits > > bottom-up, but I do this through rules, which also fire bottom-up, while > in > > your case only the traits are propagated bottom-up, while rules continue > > working in a top-down fashion. > > > > However, I am thinking of how I would potentially implement my optimizer > > with your approach, and it feels like with on-demand traits resulting > > implementation of metadata queries may become very complex to that point > > that it will look like another set of rules, parallel to the already > > existing ruleset. For example, consider that I have a couple of > distributed > > tables in an OLTP application. These tables have a number of indexes, > and I > > would like to join them. First, I have a number of choices on how to join > > tables with respect to distribution. Then, I have a number of choices on > > which access method to use. Because sometimes it is beneficial to pick > > index scans instead of table scans even without index conditions, for > > example, to preserve a comfortable collation. So when my logical scan > > receives such metadata request, it typically cannot return all possible > > combinations, because there are too many of them. Instead, some > heuristical > > or cost-based logic will be used to calculate a couple of most > prospective > > ones. But it seems that we will have to duplicate the same logic in the > > corresponding rule, aren't we? > > > > I would love to read your design because this is a really interesting > > topic, and it is of great importance for the distributed engines > developed > > on top of Calcite since proper use of distribution and collation is the > key > > success
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Vladimir, I think the way PlannerTests#GoodSingleRule and EnumerableXXXRule work may help you ~ They work by a top-down fashion, but when matching parent, they convert the children explicitly. You may try below steps: 1. Construct a rule LogicalParentRule to match the LogicalParent without distribution/physical requirement for the LogicalChild; 2. In this rule, you call 'planner.changeTraits' on the LogicalChild to build a new child with physical convention. Note that at this moment only an empty RelSubset is created and no PhysicalChild exists. 3. Then set the RelNode to be the new input of LogicalParent; By above steps, you can build a parent-child relationship between LogicalParent and PhysicalChild, and at last the PhysicalParentRule will be fired based on this relationship. I have a commit to illustrate my idea, check VolcanoPlannerTest#testDEV in below link, hope it may help you ~ https://github.com/jinxing64/calcite/tree/demo Also I'm +1 with Seliverstov that to get all parents of a set, which against the current check in RelSubset#getParentRels Best, Jin Vladimir Ozerov 于2019年11月5日周二 下午6:41写道: > Hi Xiening, > > I read the thread about on-demand trait requests. It seems pretty similar > to what I am trying to achieve, as it facilitates the bottom-up propagation > of physical traits. In fact, both your and my strategy propagate traits > bottom-up, but I do this through rules, which also fire bottom-up, while in > your case only the traits are propagated bottom-up, while rules continue > working in a top-down fashion. > > However, I am thinking of how I would potentially implement my optimizer > with your approach, and it feels like with on-demand traits resulting > implementation of metadata queries may become very complex to that point > that it will look like another set of rules, parallel to the already > existing ruleset. For example, consider that I have a couple of distributed > tables in an OLTP application. These tables have a number of indexes, and I > would like to join them. First, I have a number of choices on how to join > tables with respect to distribution. Then, I have a number of choices on > which access method to use. Because sometimes it is beneficial to pick > index scans instead of table scans even without index conditions, for > example, to preserve a comfortable collation. So when my logical scan > receives such metadata request, it typically cannot return all possible > combinations, because there are too many of them. Instead, some heuristical > or cost-based logic will be used to calculate a couple of most prospective > ones. But it seems that we will have to duplicate the same logic in the > corresponding rule, aren't we? > > I would love to read your design because this is a really interesting > topic, and it is of great importance for the distributed engines developed > on top of Calcite since proper use of distribution and collation is the key > success factor for efficient query optimization. > > Regards, > Vladimir. > > пт, 1 нояб. 2019 г. в 00:40, Xiening Dai : > > > Actually we solved this problem in our setup using a mechanism called > > “Pull-Up Traits”, which explores the possible trait set of children’s > input > > to decide parent’s physical properties. In order to determine child input > > trait, you would have to look at child’s children, and all the way to the > > leaves nodes or a barrier. A barrier is a rel node which cannot derive > any > > traits regardless the input. A good example would be a user define > function > > which would throw off any distribution or collation. Then we realize just > > pulling up is not enough, sometimes we would need to look at parent’s > > requirement as well. So we try to solve this in a unified framework, > which > > we call “On Demand Trait” and implement it as part of the framework so > > anyone can be benefited. I hope Haisheng can share a design doc once we > > have more concrete ideas. > > > > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: > > > > > > Hi Vladimir, > > > > > > The SubsetTransformer interface and the iterating over the RelNodes > > > within a RelSubset in Drill is exactly implemented to do the trait > > > propagation. We also had to rely on AbstractConverter to fire > > > necessary rule to avoid the CanNotPlan issue. At some point, Calcite > > > community chooses to remove AbstractConverter and Drill had to add it > > > back, which is probably one of the main reasons for us to continue > > > using a Calcite fork. I still remember we constantly had to deal with > > > the dilemma between "CanNotPlan" and long planing time due to explored > > > search space. > > > > > > Glad to see more people are joining the effort to solve this long > > > overdue issue, something missing in Calcite's core optimizer framework > > > "since before Calcite was Calcite" (Jacques's words). > > > > > > Jinfeng > > > > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov > > wrote: > > >> > > >> Hi
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Xiening, I read the thread about on-demand trait requests. It seems pretty similar to what I am trying to achieve, as it facilitates the bottom-up propagation of physical traits. In fact, both your and my strategy propagate traits bottom-up, but I do this through rules, which also fire bottom-up, while in your case only the traits are propagated bottom-up, while rules continue working in a top-down fashion. However, I am thinking of how I would potentially implement my optimizer with your approach, and it feels like with on-demand traits resulting implementation of metadata queries may become very complex to that point that it will look like another set of rules, parallel to the already existing ruleset. For example, consider that I have a couple of distributed tables in an OLTP application. These tables have a number of indexes, and I would like to join them. First, I have a number of choices on how to join tables with respect to distribution. Then, I have a number of choices on which access method to use. Because sometimes it is beneficial to pick index scans instead of table scans even without index conditions, for example, to preserve a comfortable collation. So when my logical scan receives such metadata request, it typically cannot return all possible combinations, because there are too many of them. Instead, some heuristical or cost-based logic will be used to calculate a couple of most prospective ones. But it seems that we will have to duplicate the same logic in the corresponding rule, aren't we? I would love to read your design because this is a really interesting topic, and it is of great importance for the distributed engines developed on top of Calcite since proper use of distribution and collation is the key success factor for efficient query optimization. Regards, Vladimir. пт, 1 нояб. 2019 г. в 00:40, Xiening Dai : > Actually we solved this problem in our setup using a mechanism called > “Pull-Up Traits”, which explores the possible trait set of children’s input > to decide parent’s physical properties. In order to determine child input > trait, you would have to look at child’s children, and all the way to the > leaves nodes or a barrier. A barrier is a rel node which cannot derive any > traits regardless the input. A good example would be a user define function > which would throw off any distribution or collation. Then we realize just > pulling up is not enough, sometimes we would need to look at parent’s > requirement as well. So we try to solve this in a unified framework, which > we call “On Demand Trait” and implement it as part of the framework so > anyone can be benefited. I hope Haisheng can share a design doc once we > have more concrete ideas. > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: > > > > Hi Vladimir, > > > > The SubsetTransformer interface and the iterating over the RelNodes > > within a RelSubset in Drill is exactly implemented to do the trait > > propagation. We also had to rely on AbstractConverter to fire > > necessary rule to avoid the CanNotPlan issue. At some point, Calcite > > community chooses to remove AbstractConverter and Drill had to add it > > back, which is probably one of the main reasons for us to continue > > using a Calcite fork. I still remember we constantly had to deal with > > the dilemma between "CanNotPlan" and long planing time due to explored > > search space. > > > > Glad to see more people are joining the effort to solve this long > > overdue issue, something missing in Calcite's core optimizer framework > > "since before Calcite was Calcite" (Jacques's words). > > > > Jinfeng > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov > wrote: > >> > >> Hi Danny, > >> > >> Thank you very much for the links. What is described here is pretty much > >> similar to the problem I describe. Especially the discussion about trait > >> propagation, as this is basically what I need - to explore potential > traits > >> of children before optimizing parents. And this is basically what Drill > >> already does with it's SubsetTransformer: > >> 1) There is a SubsetTransformer interface, which iterates over physical > >> relations of the given subset [1] > >> 2) If you want to make a physical project, you iterate over physical > >> relations of the input subset and create possible physical projects [2] > >> 3) But if you cannot find any physical input, then you trigger creation > of > >> a "bad" physical project, which is very likely to have poor cost > because it > >> cannot take advantage of input's distribution information [3] > >> So, step 2 - is a trait set propagation which is needed by many > >> distributed engines. Step 3 - an attempt to workaround current > >> VolcanoPlanner behavior, when a parent rule is fired only if produced > child > >> node has compatible trait set. > >> > >> I do not know Calcite's architecture that good but at first glance, the > >> proposed ability to re-fire rules of a specific Rel seems good
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
I think the Convention as a trait is something special to Calcite (not a Volcano concept). Calcite uses it for federation queries over heterogeneous data source. Look at Calcite paper[1] 4 Traits. It explains the idea quite well. [1] https://arxiv.org/pdf/1802.10233.pdf > On Nov 1, 2019, at 1:42 AM, Stamatis Zampetakis wrote: > > @Vladimir: Given that you want to make the optimizer to work in bottom-up > fashion maybe it suffices to inverse the comparator in the RuleQueue [1]. > For sure there are rules which would not be compatible with this change but > maybe for your project it makes sense. > I never tried it my self but I would be curious to see if it works. > > @Jinfeng: I guess that the reason that we have the convention as a trait is > because results from different systems need to be combined together. > Given that these results arrive in different forms we need a way to know > how to transform one to the other. In some sense the convention trait can > be thought to be similar to the "assembledness" property mentioned in the > paper. > "For query optimization in object-oriented systems, we plan on defining > "assembledness" of complex objects in memory as a physical property and > using the assembly operator described in ... as the enforcer for this > property." [2] > For sure I am not the best person to talk about this choice so don't take > anything that I say as 100% accurate. I am sure other people can provide a > much better explanation. > > Best, > Stamatis > > [1] > https://github.com/apache/calcite/blob/65043f290a7be45a668cf941ab48ee3c30efbe4e/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L94 > [2] > https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Volcano-graefe.pdf > > On Fri, Nov 1, 2019 at 7:53 AM Jinfeng Ni wrote: > >> I think "pull-up traits" is necessary, since the physical traits are >> propagated upwards. However, I'm not fully convinced "On Demand >> Trait" is the best solution, as I feel it may restrict the choices the >> planner may consider. Maybe after the proposed design doc is ready, >> we may look into the detail and reevaluate. >> >> One question that I have always had ever since I started using Calcite >> couple of years ago, is the concept of Convention being a trait, and >> introduction of AbstractConverter in Calcite's VolcanoPlanner. >> Quickly looking through the original paper of Volcano [1], or the new >> one [2], I did not see mentioning of such concepts. The original >> paper has a classification of "transformation rules" which operates on >> logical relation expression and "implementation rules" which provides >> the mapping to physical operators. >> >> >> 1. https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf >> 2. >> https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/volcano-prasan.pdf >> >> On Thu, Oct 31, 2019 at 2:40 PM Xiening Dai wrote: >>> >>> Actually we solved this problem in our setup using a mechanism called >> “Pull-Up Traits”, which explores the possible trait set of children’s input >> to decide parent’s physical properties. In order to determine child input >> trait, you would have to look at child’s children, and all the way to the >> leaves nodes or a barrier. A barrier is a rel node which cannot derive any >> traits regardless the input. A good example would be a user define function >> which would throw off any distribution or collation. Then we realize just >> pulling up is not enough, sometimes we would need to look at parent’s >> requirement as well. So we try to solve this in a unified framework, which >> we call “On Demand Trait” and implement it as part of the framework so >> anyone can be benefited. I hope Haisheng can share a design doc once we >> have more concrete ideas. >>> >>> On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: Hi Vladimir, The SubsetTransformer interface and the iterating over the RelNodes within a RelSubset in Drill is exactly implemented to do the trait propagation. We also had to rely on AbstractConverter to fire necessary rule to avoid the CanNotPlan issue. At some point, Calcite community chooses to remove AbstractConverter and Drill had to add it back, which is probably one of the main reasons for us to continue using a Calcite fork. I still remember we constantly had to deal with the dilemma between "CanNotPlan" and long planing time due to explored search space. Glad to see more people are joining the effort to solve this long overdue issue, something missing in Calcite's core optimizer framework "since before Calcite was Calcite" (Jacques's words). Jinfeng On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov >> wrote: > > Hi Danny, > > Thank you very much for the links. What is described here is pretty >> much > similar to the problem I describe. Especially the discussion about >> trait
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
@Vladimir: Given that you want to make the optimizer to work in bottom-up fashion maybe it suffices to inverse the comparator in the RuleQueue [1]. For sure there are rules which would not be compatible with this change but maybe for your project it makes sense. I never tried it my self but I would be curious to see if it works. @Jinfeng: I guess that the reason that we have the convention as a trait is because results from different systems need to be combined together. Given that these results arrive in different forms we need a way to know how to transform one to the other. In some sense the convention trait can be thought to be similar to the "assembledness" property mentioned in the paper. "For query optimization in object-oriented systems, we plan on defining "assembledness" of complex objects in memory as a physical property and using the assembly operator described in ... as the enforcer for this property." [2] For sure I am not the best person to talk about this choice so don't take anything that I say as 100% accurate. I am sure other people can provide a much better explanation. Best, Stamatis [1] https://github.com/apache/calcite/blob/65043f290a7be45a668cf941ab48ee3c30efbe4e/core/src/main/java/org/apache/calcite/plan/volcano/RuleQueue.java#L94 [2] https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/Volcano-graefe.pdf On Fri, Nov 1, 2019 at 7:53 AM Jinfeng Ni wrote: > I think "pull-up traits" is necessary, since the physical traits are > propagated upwards. However, I'm not fully convinced "On Demand > Trait" is the best solution, as I feel it may restrict the choices the > planner may consider. Maybe after the proposed design doc is ready, > we may look into the detail and reevaluate. > > One question that I have always had ever since I started using Calcite > couple of years ago, is the concept of Convention being a trait, and > introduction of AbstractConverter in Calcite's VolcanoPlanner. > Quickly looking through the original paper of Volcano [1], or the new > one [2], I did not see mentioning of such concepts. The original > paper has a classification of "transformation rules" which operates on > logical relation expression and "implementation rules" which provides > the mapping to physical operators. > > > 1. https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf > 2. > https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/volcano-prasan.pdf > > On Thu, Oct 31, 2019 at 2:40 PM Xiening Dai wrote: > > > > Actually we solved this problem in our setup using a mechanism called > “Pull-Up Traits”, which explores the possible trait set of children’s input > to decide parent’s physical properties. In order to determine child input > trait, you would have to look at child’s children, and all the way to the > leaves nodes or a barrier. A barrier is a rel node which cannot derive any > traits regardless the input. A good example would be a user define function > which would throw off any distribution or collation. Then we realize just > pulling up is not enough, sometimes we would need to look at parent’s > requirement as well. So we try to solve this in a unified framework, which > we call “On Demand Trait” and implement it as part of the framework so > anyone can be benefited. I hope Haisheng can share a design doc once we > have more concrete ideas. > > > > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: > > > > > > Hi Vladimir, > > > > > > The SubsetTransformer interface and the iterating over the RelNodes > > > within a RelSubset in Drill is exactly implemented to do the trait > > > propagation. We also had to rely on AbstractConverter to fire > > > necessary rule to avoid the CanNotPlan issue. At some point, Calcite > > > community chooses to remove AbstractConverter and Drill had to add it > > > back, which is probably one of the main reasons for us to continue > > > using a Calcite fork. I still remember we constantly had to deal with > > > the dilemma between "CanNotPlan" and long planing time due to explored > > > search space. > > > > > > Glad to see more people are joining the effort to solve this long > > > overdue issue, something missing in Calcite's core optimizer framework > > > "since before Calcite was Calcite" (Jacques's words). > > > > > > Jinfeng > > > > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov > wrote: > > >> > > >> Hi Danny, > > >> > > >> Thank you very much for the links. What is described here is pretty > much > > >> similar to the problem I describe. Especially the discussion about > trait > > >> propagation, as this is basically what I need - to explore potential > traits > > >> of children before optimizing parents. And this is basically what > Drill > > >> already does with it's SubsetTransformer: > > >> 1) There is a SubsetTransformer interface, which iterates over > physical > > >> relations of the given subset [1] > > >> 2) If you want to make a physical project, you iterate over physical > > >>
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
I think "pull-up traits" is necessary, since the physical traits are propagated upwards. However, I'm not fully convinced "On Demand Trait" is the best solution, as I feel it may restrict the choices the planner may consider. Maybe after the proposed design doc is ready, we may look into the detail and reevaluate. One question that I have always had ever since I started using Calcite couple of years ago, is the concept of Convention being a trait, and introduction of AbstractConverter in Calcite's VolcanoPlanner. Quickly looking through the original paper of Volcano [1], or the new one [2], I did not see mentioning of such concepts. The original paper has a classification of "transformation rules" which operates on logical relation expression and "implementation rules" which provides the mapping to physical operators. 1. https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf 2. https://www.cse.iitb.ac.in/infolab/Data/Courses/CS632/Papers/volcano-prasan.pdf On Thu, Oct 31, 2019 at 2:40 PM Xiening Dai wrote: > > Actually we solved this problem in our setup using a mechanism called > “Pull-Up Traits”, which explores the possible trait set of children’s input > to decide parent’s physical properties. In order to determine child input > trait, you would have to look at child’s children, and all the way to the > leaves nodes or a barrier. A barrier is a rel node which cannot derive any > traits regardless the input. A good example would be a user define function > which would throw off any distribution or collation. Then we realize just > pulling up is not enough, sometimes we would need to look at parent’s > requirement as well. So we try to solve this in a unified framework, which we > call “On Demand Trait” and implement it as part of the framework so anyone > can be benefited. I hope Haisheng can share a design doc once we have more > concrete ideas. > > > > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: > > > > Hi Vladimir, > > > > The SubsetTransformer interface and the iterating over the RelNodes > > within a RelSubset in Drill is exactly implemented to do the trait > > propagation. We also had to rely on AbstractConverter to fire > > necessary rule to avoid the CanNotPlan issue. At some point, Calcite > > community chooses to remove AbstractConverter and Drill had to add it > > back, which is probably one of the main reasons for us to continue > > using a Calcite fork. I still remember we constantly had to deal with > > the dilemma between "CanNotPlan" and long planing time due to explored > > search space. > > > > Glad to see more people are joining the effort to solve this long > > overdue issue, something missing in Calcite's core optimizer framework > > "since before Calcite was Calcite" (Jacques's words). > > > > Jinfeng > > > > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov wrote: > >> > >> Hi Danny, > >> > >> Thank you very much for the links. What is described here is pretty much > >> similar to the problem I describe. Especially the discussion about trait > >> propagation, as this is basically what I need - to explore potential traits > >> of children before optimizing parents. And this is basically what Drill > >> already does with it's SubsetTransformer: > >> 1) There is a SubsetTransformer interface, which iterates over physical > >> relations of the given subset [1] > >> 2) If you want to make a physical project, you iterate over physical > >> relations of the input subset and create possible physical projects [2] > >> 3) But if you cannot find any physical input, then you trigger creation of > >> a "bad" physical project, which is very likely to have poor cost because it > >> cannot take advantage of input's distribution information [3] > >> So, step 2 - is a trait set propagation which is needed by many > >> distributed engines. Step 3 - an attempt to workaround current > >> VolcanoPlanner behavior, when a parent rule is fired only if produced child > >> node has compatible trait set. > >> > >> I do not know Calcite's architecture that good but at first glance, the > >> proposed ability to re-fire rules of a specific Rel seems good enough. It > >> doesn't expand search space, because no new nodes are created, and it seems > >> to be relatively simple to implement. > >> > >> [1] > >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java > >> [2] > >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 > >> [3] > >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 > >> > >> чт, 31 окт. 2019 г. в 12:21, Danny Chan : > >> > >>> Thanks Vladimir for bringing up this discussion ! > >>> > >>> Did you notice that there is a JIRA issue about this problem ? [1] Also a > >>> discussion about how to propagate
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Actually we solved this problem in our setup using a mechanism called “Pull-Up Traits”, which explores the possible trait set of children’s input to decide parent’s physical properties. In order to determine child input trait, you would have to look at child’s children, and all the way to the leaves nodes or a barrier. A barrier is a rel node which cannot derive any traits regardless the input. A good example would be a user define function which would throw off any distribution or collation. Then we realize just pulling up is not enough, sometimes we would need to look at parent’s requirement as well. So we try to solve this in a unified framework, which we call “On Demand Trait” and implement it as part of the framework so anyone can be benefited. I hope Haisheng can share a design doc once we have more concrete ideas. > On Oct 31, 2019, at 11:37 AM, Jinfeng Ni wrote: > > Hi Vladimir, > > The SubsetTransformer interface and the iterating over the RelNodes > within a RelSubset in Drill is exactly implemented to do the trait > propagation. We also had to rely on AbstractConverter to fire > necessary rule to avoid the CanNotPlan issue. At some point, Calcite > community chooses to remove AbstractConverter and Drill had to add it > back, which is probably one of the main reasons for us to continue > using a Calcite fork. I still remember we constantly had to deal with > the dilemma between "CanNotPlan" and long planing time due to explored > search space. > > Glad to see more people are joining the effort to solve this long > overdue issue, something missing in Calcite's core optimizer framework > "since before Calcite was Calcite" (Jacques's words). > > Jinfeng > > > On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov wrote: >> >> Hi Danny, >> >> Thank you very much for the links. What is described here is pretty much >> similar to the problem I describe. Especially the discussion about trait >> propagation, as this is basically what I need - to explore potential traits >> of children before optimizing parents. And this is basically what Drill >> already does with it's SubsetTransformer: >> 1) There is a SubsetTransformer interface, which iterates over physical >> relations of the given subset [1] >> 2) If you want to make a physical project, you iterate over physical >> relations of the input subset and create possible physical projects [2] >> 3) But if you cannot find any physical input, then you trigger creation of >> a "bad" physical project, which is very likely to have poor cost because it >> cannot take advantage of input's distribution information [3] >> So, step 2 - is a trait set propagation which is needed by many >> distributed engines. Step 3 - an attempt to workaround current >> VolcanoPlanner behavior, when a parent rule is fired only if produced child >> node has compatible trait set. >> >> I do not know Calcite's architecture that good but at first glance, the >> proposed ability to re-fire rules of a specific Rel seems good enough. It >> doesn't expand search space, because no new nodes are created, and it seems >> to be relatively simple to implement. >> >> [1] >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java >> [2] >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 >> [3] >> https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 >> >> чт, 31 окт. 2019 г. в 12:21, Danny Chan : >> >>> Thanks Vladimir for bringing up this discussion ! >>> >>> Did you notice that there is a JIRA issue about this problem ? [1] Also a >>> discussion about how to propagate the traits [2] >>> >>> [1] https://issues.apache.org/jira/browse/CALCITE-2970 >>> [2] >>> https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E >>> >>> Best, >>> Danny Chan >>> 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: Hi colleagues, I would like to discuss with the community the possibility of adding a >>> new public method to VolcanoPlanner which will forcefully re-trigger the >>> rules for the specific rel. This is a follow up of a discussion started in the other thread [1]. **Problem statement** When converting between conventions during optimization VolcanoPlanner prefers the top-bottom approach, so that the nodes are converted from the root. But in some cases, the intermediate node must be converted after >>> its children. This is especially true for distributed SQL engines, which rely on distribution traits during the optimization process: it is not >>> possible to efficiently choose a proper physical implementation of a parent node unless the physical representation of a child node is known.
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Vladimir, The SubsetTransformer interface and the iterating over the RelNodes within a RelSubset in Drill is exactly implemented to do the trait propagation. We also had to rely on AbstractConverter to fire necessary rule to avoid the CanNotPlan issue. At some point, Calcite community chooses to remove AbstractConverter and Drill had to add it back, which is probably one of the main reasons for us to continue using a Calcite fork. I still remember we constantly had to deal with the dilemma between "CanNotPlan" and long planing time due to explored search space. Glad to see more people are joining the effort to solve this long overdue issue, something missing in Calcite's core optimizer framework "since before Calcite was Calcite" (Jacques's words). Jinfeng On Thu, Oct 31, 2019 at 3:38 AM Vladimir Ozerov wrote: > > Hi Danny, > > Thank you very much for the links. What is described here is pretty much > similar to the problem I describe. Especially the discussion about trait > propagation, as this is basically what I need - to explore potential traits > of children before optimizing parents. And this is basically what Drill > already does with it's SubsetTransformer: > 1) There is a SubsetTransformer interface, which iterates over physical > relations of the given subset [1] > 2) If you want to make a physical project, you iterate over physical > relations of the input subset and create possible physical projects [2] > 3) But if you cannot find any physical input, then you trigger creation of > a "bad" physical project, which is very likely to have poor cost because it > cannot take advantage of input's distribution information [3] > So, step 2 - is a trait set propagation which is needed by many > distributed engines. Step 3 - an attempt to workaround current > VolcanoPlanner behavior, when a parent rule is fired only if produced child > node has compatible trait set. > > I do not know Calcite's architecture that good but at first glance, the > proposed ability to re-fire rules of a specific Rel seems good enough. It > doesn't expand search space, because no new nodes are created, and it seems > to be relatively simple to implement. > > [1] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java > [2] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 > [3] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 > > чт, 31 окт. 2019 г. в 12:21, Danny Chan : > > > Thanks Vladimir for bringing up this discussion ! > > > > Did you notice that there is a JIRA issue about this problem ? [1] Also a > > discussion about how to propagate the traits [2] > > > > [1] https://issues.apache.org/jira/browse/CALCITE-2970 > > [2] > > https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E > > > > Best, > > Danny Chan > > 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: > > > Hi colleagues, > > > > > > I would like to discuss with the community the possibility of adding a > > new > > > public method to VolcanoPlanner which will forcefully re-trigger the > > rules > > > for the specific rel. This is a follow up of a discussion started in the > > > other thread [1]. > > > > > > **Problem statement** > > > When converting between conventions during optimization VolcanoPlanner > > > prefers the top-bottom approach, so that the nodes are converted from the > > > root. But in some cases, the intermediate node must be converted after > > its > > > children. This is especially true for distributed SQL engines, which rely > > > on distribution traits during the optimization process: it is not > > possible > > > to efficiently choose a proper physical implementation of a parent node > > > unless the physical representation of a child node is known. > > > > > > It seems that presently VolcanoPlanner cannot address such cases by > > > default. Consider that we have two nodes and associated rules which > > convert > > > them to physical counterparts: > > > [LogicalParent <- LogicalChild] > > > The parent should be converted after the child. When the child is > > > converted, the physical node is created: > > > [LogicalParent <- {LogicalChild, PhysicalChild}] > > > In order to finish the optimization process, we need to convert the > > parent. > > > But parent rules are not fired, because PhysicalChild has traits > > > incompatible with LogicalParent. > > > > > > Presently the problem could be solved in two ways: > > > 1) Always produce conversions when going top-down. In this case, I first > > > create a physical parent, then a physical child. The problem is that > > > created parent is not optimal because it didn't know child distribution > > at > > > the time of creation. So the full flow would be: create
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Xiening, Yes, I think that the manual creation of converters to trigger parent rules should be enough at the moment. I'll try to explore this possibility. Thank you. Regards, Vladimir чт, 31 окт. 2019 г. в 20:11, Xiening Dai : > Hi Vladimir, > > I think for short/mid term, #2 way (using AbstractConverter) should work > after we fix CALCITE-2970. We already understand the root cause, now are > looking at the best way to fix it. If you cannot wait, you can also create > your own converter rule so it won’t generate logical node, and the > performance should be much better. And to your concern regarding the > overhead of creating AbstractConverter objects, I think they are just minor > overheads compared to the rest of part of the work framework’s doing (rule > matching, merging set, etc). > > Regarding the comment "to explore potential traits of children before > optimizing parents”, I totally agree and want to point out we should also > consider parent requirements. Currently such mechanism is missing in > Calcite, and we are having discussion on how to add it as part of the > Volcano planner. Danny'd shared the mail archive previously. Welcome to > join the discussion. > > > > On Oct 31, 2019, at 3:37 AM, Vladimir Ozerov wrote: > > > > Hi Danny, > > > > Thank you very much for the links. What is described here is pretty much > > similar to the problem I describe. Especially the discussion about trait > > propagation, as this is basically what I need - to explore potential > traits > > of children before optimizing parents. And this is basically what Drill > > already does with it's SubsetTransformer: > > 1) There is a SubsetTransformer interface, which iterates over physical > > relations of the given subset [1] > > 2) If you want to make a physical project, you iterate over physical > > relations of the input subset and create possible physical projects [2] > > 3) But if you cannot find any physical input, then you trigger creation > of > > a "bad" physical project, which is very likely to have poor cost because > it > > cannot take advantage of input's distribution information [3] > > So, step 2 - is a trait set propagation which is needed by many > > distributed engines. Step 3 - an attempt to workaround current > > VolcanoPlanner behavior, when a parent rule is fired only if produced > child > > node has compatible trait set. > > > > I do not know Calcite's architecture that good but at first glance, the > > proposed ability to re-fire rules of a specific Rel seems good enough. It > > doesn't expand search space, because no new nodes are created, and it > seems > > to be relatively simple to implement. > > > > [1] > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java > > [2] > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 > > [3] > > > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 > > > > чт, 31 окт. 2019 г. в 12:21, Danny Chan : > > > >> Thanks Vladimir for bringing up this discussion ! > >> > >> Did you notice that there is a JIRA issue about this problem ? [1] Also > a > >> discussion about how to propagate the traits [2] > >> > >> [1] https://issues.apache.org/jira/browse/CALCITE-2970 > >> [2] > >> > https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E > >> > >> Best, > >> Danny Chan > >> 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: > >>> Hi colleagues, > >>> > >>> I would like to discuss with the community the possibility of adding a > >> new > >>> public method to VolcanoPlanner which will forcefully re-trigger the > >> rules > >>> for the specific rel. This is a follow up of a discussion started in > the > >>> other thread [1]. > >>> > >>> **Problem statement** > >>> When converting between conventions during optimization VolcanoPlanner > >>> prefers the top-bottom approach, so that the nodes are converted from > the > >>> root. But in some cases, the intermediate node must be converted after > >> its > >>> children. This is especially true for distributed SQL engines, which > rely > >>> on distribution traits during the optimization process: it is not > >> possible > >>> to efficiently choose a proper physical implementation of a parent node > >>> unless the physical representation of a child node is known. > >>> > >>> It seems that presently VolcanoPlanner cannot address such cases by > >>> default. Consider that we have two nodes and associated rules which > >> convert > >>> them to physical counterparts: > >>> [LogicalParent <- LogicalChild] > >>> The parent should be converted after the child. When the child is > >>> converted, the physical node is created: > >>> [LogicalParent <- {LogicalChild, PhysicalChild}] > >>> In order to finish
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Vladimir, I think for short/mid term, #2 way (using AbstractConverter) should work after we fix CALCITE-2970. We already understand the root cause, now are looking at the best way to fix it. If you cannot wait, you can also create your own converter rule so it won’t generate logical node, and the performance should be much better. And to your concern regarding the overhead of creating AbstractConverter objects, I think they are just minor overheads compared to the rest of part of the work framework’s doing (rule matching, merging set, etc). Regarding the comment "to explore potential traits of children before optimizing parents”, I totally agree and want to point out we should also consider parent requirements. Currently such mechanism is missing in Calcite, and we are having discussion on how to add it as part of the Volcano planner. Danny'd shared the mail archive previously. Welcome to join the discussion. > On Oct 31, 2019, at 3:37 AM, Vladimir Ozerov wrote: > > Hi Danny, > > Thank you very much for the links. What is described here is pretty much > similar to the problem I describe. Especially the discussion about trait > propagation, as this is basically what I need - to explore potential traits > of children before optimizing parents. And this is basically what Drill > already does with it's SubsetTransformer: > 1) There is a SubsetTransformer interface, which iterates over physical > relations of the given subset [1] > 2) If you want to make a physical project, you iterate over physical > relations of the input subset and create possible physical projects [2] > 3) But if you cannot find any physical input, then you trigger creation of > a "bad" physical project, which is very likely to have poor cost because it > cannot take advantage of input's distribution information [3] > So, step 2 - is a trait set propagation which is needed by many > distributed engines. Step 3 - an attempt to workaround current > VolcanoPlanner behavior, when a parent rule is fired only if produced child > node has compatible trait set. > > I do not know Calcite's architecture that good but at first glance, the > proposed ability to re-fire rules of a specific Rel seems good enough. It > doesn't expand search space, because no new nodes are created, and it seems > to be relatively simple to implement. > > [1] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java > [2] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 > [3] > https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 > > чт, 31 окт. 2019 г. в 12:21, Danny Chan : > >> Thanks Vladimir for bringing up this discussion ! >> >> Did you notice that there is a JIRA issue about this problem ? [1] Also a >> discussion about how to propagate the traits [2] >> >> [1] https://issues.apache.org/jira/browse/CALCITE-2970 >> [2] >> https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E >> >> Best, >> Danny Chan >> 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: >>> Hi colleagues, >>> >>> I would like to discuss with the community the possibility of adding a >> new >>> public method to VolcanoPlanner which will forcefully re-trigger the >> rules >>> for the specific rel. This is a follow up of a discussion started in the >>> other thread [1]. >>> >>> **Problem statement** >>> When converting between conventions during optimization VolcanoPlanner >>> prefers the top-bottom approach, so that the nodes are converted from the >>> root. But in some cases, the intermediate node must be converted after >> its >>> children. This is especially true for distributed SQL engines, which rely >>> on distribution traits during the optimization process: it is not >> possible >>> to efficiently choose a proper physical implementation of a parent node >>> unless the physical representation of a child node is known. >>> >>> It seems that presently VolcanoPlanner cannot address such cases by >>> default. Consider that we have two nodes and associated rules which >> convert >>> them to physical counterparts: >>> [LogicalParent <- LogicalChild] >>> The parent should be converted after the child. When the child is >>> converted, the physical node is created: >>> [LogicalParent <- {LogicalChild, PhysicalChild}] >>> In order to finish the optimization process, we need to convert the >> parent. >>> But parent rules are not fired, because PhysicalChild has traits >>> incompatible with LogicalParent. >>> >>> Presently the problem could be solved in two ways: >>> 1) Always produce conversions when going top-down. In this case, I first >>> create a physical parent, then a physical child. The problem is that >>> created parent is not optimal because it
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Hi Danny, Thank you very much for the links. What is described here is pretty much similar to the problem I describe. Especially the discussion about trait propagation, as this is basically what I need - to explore potential traits of children before optimizing parents. And this is basically what Drill already does with it's SubsetTransformer: 1) There is a SubsetTransformer interface, which iterates over physical relations of the given subset [1] 2) If you want to make a physical project, you iterate over physical relations of the input subset and create possible physical projects [2] 3) But if you cannot find any physical input, then you trigger creation of a "bad" physical project, which is very likely to have poor cost because it cannot take advantage of input's distribution information [3] So, step 2 - is a trait set propagation which is needed by many distributed engines. Step 3 - an attempt to workaround current VolcanoPlanner behavior, when a parent rule is fired only if produced child node has compatible trait set. I do not know Calcite's architecture that good but at first glance, the proposed ability to re-fire rules of a specific Rel seems good enough. It doesn't expand search space, because no new nodes are created, and it seems to be relatively simple to implement. [1] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/SubsetTransformer.java [2] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L66 [3] https://github.com/apache/drill/blob/1.16.0/exec/java-exec/src/main/java/org/apache/drill/exec/planner/physical/ProjectPrule.java#L69 чт, 31 окт. 2019 г. в 12:21, Danny Chan : > Thanks Vladimir for bringing up this discussion ! > > Did you notice that there is a JIRA issue about this problem ? [1] Also a > discussion about how to propagate the traits [2] > > [1] https://issues.apache.org/jira/browse/CALCITE-2970 > [2] > https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E > > Best, > Danny Chan > 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: > > Hi colleagues, > > > > I would like to discuss with the community the possibility of adding a > new > > public method to VolcanoPlanner which will forcefully re-trigger the > rules > > for the specific rel. This is a follow up of a discussion started in the > > other thread [1]. > > > > **Problem statement** > > When converting between conventions during optimization VolcanoPlanner > > prefers the top-bottom approach, so that the nodes are converted from the > > root. But in some cases, the intermediate node must be converted after > its > > children. This is especially true for distributed SQL engines, which rely > > on distribution traits during the optimization process: it is not > possible > > to efficiently choose a proper physical implementation of a parent node > > unless the physical representation of a child node is known. > > > > It seems that presently VolcanoPlanner cannot address such cases by > > default. Consider that we have two nodes and associated rules which > convert > > them to physical counterparts: > > [LogicalParent <- LogicalChild] > > The parent should be converted after the child. When the child is > > converted, the physical node is created: > > [LogicalParent <- {LogicalChild, PhysicalChild}] > > In order to finish the optimization process, we need to convert the > parent. > > But parent rules are not fired, because PhysicalChild has traits > > incompatible with LogicalParent. > > > > Presently the problem could be solved in two ways: > > 1) Always produce conversions when going top-down. In this case, I first > > create a physical parent, then a physical child. The problem is that > > created parent is not optimal because it didn't know child distribution > at > > the time of creation. So the full flow would be: create a bad parent, > > create a good child, create a good parent. > > 1.1) [LogicalParent <- LogicalChild] > > 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild] > > 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, > PhysicalChild}] > > 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- > > {LogicalChild, PhysicalChild}] > > What is worse, the creation of a not optimal parent will trigger rules on > > its parent, which in turn may create a not optimal parent-parent node, > etc. > > > > 2) Make sure that your convention returns true for > > Convention.canConvertConvention. In this case, every time a new rel is > > added to a RelSet, its traitset will be compared to traitsets of all > other > > rels in the set. For every pair of traitset we may ask the engine to > create > > a relevant AbstractConverter. The net effect is that > "physical-to-logical" > > converter is created, which re-triggers the rule on the logical parent > > since their conventions are
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Thanks Vladimir for bringing up this discussion ! Did you notice that there is a JIRA issue about this problem ? [1] Also a discussion about how to propagate the traits [2] [1] https://issues.apache.org/jira/browse/CALCITE-2970 [2] https://ponymail-vm.apache.org/_GUI_/thread.html/79dac47ea50b5dfbd3f234e368ed61d247fb0eb989f87fe01aedaf25@%3Cdev.calcite.apache.org%3E Best, Danny Chan 在 2019年10月31日 +0800 PM3:56,Vladimir Ozerov ,写道: > Hi colleagues, > > I would like to discuss with the community the possibility of adding a new > public method to VolcanoPlanner which will forcefully re-trigger the rules > for the specific rel. This is a follow up of a discussion started in the > other thread [1]. > > **Problem statement** > When converting between conventions during optimization VolcanoPlanner > prefers the top-bottom approach, so that the nodes are converted from the > root. But in some cases, the intermediate node must be converted after its > children. This is especially true for distributed SQL engines, which rely > on distribution traits during the optimization process: it is not possible > to efficiently choose a proper physical implementation of a parent node > unless the physical representation of a child node is known. > > It seems that presently VolcanoPlanner cannot address such cases by > default. Consider that we have two nodes and associated rules which convert > them to physical counterparts: > [LogicalParent <- LogicalChild] > The parent should be converted after the child. When the child is > converted, the physical node is created: > [LogicalParent <- {LogicalChild, PhysicalChild}] > In order to finish the optimization process, we need to convert the parent. > But parent rules are not fired, because PhysicalChild has traits > incompatible with LogicalParent. > > Presently the problem could be solved in two ways: > 1) Always produce conversions when going top-down. In this case, I first > create a physical parent, then a physical child. The problem is that > created parent is not optimal because it didn't know child distribution at > the time of creation. So the full flow would be: create a bad parent, > create a good child, create a good parent. > 1.1) [LogicalParent <- LogicalChild] > 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild] > 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}] > 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- > {LogicalChild, PhysicalChild}] > What is worse, the creation of a not optimal parent will trigger rules on > its parent, which in turn may create a not optimal parent-parent node, etc. > > 2) Make sure that your convention returns true for > Convention.canConvertConvention. In this case, every time a new rel is > added to a RelSet, its traitset will be compared to traitsets of all other > rels in the set. For every pair of traitset we may ask the engine to create > a relevant AbstractConverter. The net effect is that "physical-to-logical" > converter is created, which re-triggers the rule on the logical parent > since their conventions are compatible: > 2.1) [LogicalParent <- LogicalChild] > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] > 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, > PhysicalToLogicalConverter}] > 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, > PhysicalToLogicalConverter}] > > The problem with that approach is that it is too coarse-grained since we > operate on traitsets rather than rels. As a result, additional memory and > CPU pressure are introduced because usually too many converters are > created, which triggers the rules over and over again. > > **Affected products** > At the moment two distributed engines are being developed for Hazelcast and > Apache Ignite. Both require bottom-up optimization and currently rely on > the second workaround. > Another example is Apache Drill. I do not know whether its community is > concerned with the issue, but it also uses bottom-up optimization for many > rules and employs both the aforementioned workarounds. As a result, I guess > that Drill's optimizer also creates too many rels during optimization and > suffer from huge search space. Please correct me if I am wrong. > > **Proposal** > The key problem is that there is no way to re-fire rules on a specific > node. The original problem could have been solved, if it would be possible > to re-fire rules on a LogicalParent without creating any additional rels. > That would lead to a clear optimization flow: > 2.1) [LogicalParent <- LogicalChild] > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] > 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}] > > We can add the following method to VolcanoPlanner (RelOptPlanner?) > interface: > void fireRules(RelNode rel) > > This method will fire the rules on a passed node in a deferred mode as if > it was the new node just added to the optimizer. This would require slight > changes to
Re: [DISCUSS] Proposal to add API to force rules matching specific rels
Not only "fireRule" method is needed, but the way to get all parents of a set, containing a subset, produced by the transformation. Or a way to fire rules without traits satisfaction check. чт, 31 окт. 2019 г., 10:56 Vladimir Ozerov : > Hi colleagues, > > I would like to discuss with the community the possibility of adding a new > public method to VolcanoPlanner which will forcefully re-trigger the rules > for the specific rel. This is a follow up of a discussion started in the > other thread [1]. > > **Problem statement** > When converting between conventions during optimization VolcanoPlanner > prefers the top-bottom approach, so that the nodes are converted from the > root. But in some cases, the intermediate node must be converted after its > children. This is especially true for distributed SQL engines, which rely > on distribution traits during the optimization process: it is not possible > to efficiently choose a proper physical implementation of a parent node > unless the physical representation of a child node is known. > > It seems that presently VolcanoPlanner cannot address such cases by > default. Consider that we have two nodes and associated rules which convert > them to physical counterparts: > [LogicalParent <- LogicalChild] > The parent should be converted after the child. When the child is > converted, the physical node is created: > [LogicalParent <- {LogicalChild, PhysicalChild}] > In order to finish the optimization process, we need to convert the parent. > But parent rules are not fired, because PhysicalChild has traits > incompatible with LogicalParent. > > Presently the problem could be solved in two ways: > 1) Always produce conversions when going top-down. In this case, I first > create a physical parent, then a physical child. The problem is that > created parent is not optimal because it didn't know child distribution at > the time of creation. So the full flow would be: create a bad parent, > create a good child, create a good parent. > 1.1) [LogicalParent <- LogicalChild] > 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild] > 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}] > 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- > {LogicalChild, PhysicalChild}] > What is worse, the creation of a not optimal parent will trigger rules on > its parent, which in turn may create a not optimal parent-parent node, etc. > > 2) Make sure that your convention returns true for > Convention.canConvertConvention. In this case, every time a new rel is > added to a RelSet, its traitset will be compared to traitsets of all other > rels in the set. For every pair of traitset we may ask the engine to create > a relevant AbstractConverter. The net effect is that "physical-to-logical" > converter is created, which re-triggers the rule on the logical parent > since their conventions are compatible: > 2.1) [LogicalParent <- LogicalChild] > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] > 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, > PhysicalToLogicalConverter}] > 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, > PhysicalToLogicalConverter}] > > The problem with that approach is that it is too coarse-grained since we > operate on traitsets rather than rels. As a result, additional memory and > CPU pressure are introduced because usually too many converters are > created, which triggers the rules over and over again. > > **Affected products** > At the moment two distributed engines are being developed for Hazelcast and > Apache Ignite. Both require bottom-up optimization and currently rely on > the second workaround. > Another example is Apache Drill. I do not know whether its community is > concerned with the issue, but it also uses bottom-up optimization for many > rules and employs both the aforementioned workarounds. As a result, I guess > that Drill's optimizer also creates too many rels during optimization and > suffer from huge search space. Please correct me if I am wrong. > > **Proposal** > The key problem is that there is no way to re-fire rules on a specific > node. The original problem could have been solved, if it would be possible > to re-fire rules on a LogicalParent without creating any additional rels. > That would lead to a clear optimization flow: > 2.1) [LogicalParent <- LogicalChild] > 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] > 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}] > > We can add the following method to VolcanoPlanner (RelOptPlanner?) > interface: > void fireRules(RelNode rel) > > This method will fire the rules on a passed node in a deferred mode as if > it was the new node just added to the optimizer. This would require slight > changes to RuleQueue.addMatch method, and possibly some other places. > > Usage example: > class PhysicalChildRule extends RelOptRule { > void onMatch(RelOptRuleCall call) { > LogicalChild logicalRel =
[DISCUSS] Proposal to add API to force rules matching specific rels
Hi colleagues, I would like to discuss with the community the possibility of adding a new public method to VolcanoPlanner which will forcefully re-trigger the rules for the specific rel. This is a follow up of a discussion started in the other thread [1]. **Problem statement** When converting between conventions during optimization VolcanoPlanner prefers the top-bottom approach, so that the nodes are converted from the root. But in some cases, the intermediate node must be converted after its children. This is especially true for distributed SQL engines, which rely on distribution traits during the optimization process: it is not possible to efficiently choose a proper physical implementation of a parent node unless the physical representation of a child node is known. It seems that presently VolcanoPlanner cannot address such cases by default. Consider that we have two nodes and associated rules which convert them to physical counterparts: [LogicalParent <- LogicalChild] The parent should be converted after the child. When the child is converted, the physical node is created: [LogicalParent <- {LogicalChild, PhysicalChild}] In order to finish the optimization process, we need to convert the parent. But parent rules are not fired, because PhysicalChild has traits incompatible with LogicalParent. Presently the problem could be solved in two ways: 1) Always produce conversions when going top-down. In this case, I first create a physical parent, then a physical child. The problem is that created parent is not optimal because it didn't know child distribution at the time of creation. So the full flow would be: create a bad parent, create a good child, create a good parent. 1.1) [LogicalParent <- LogicalChild] 1.2) [{LogicalParent, PhysicalParentBad} <- LogicalChild] 1.3) [{LogicalParent, PhysicalParentBad} <- {LogicalChild, PhysicalChild}] 1.4) [{LogicalParent, PhysicalParentBad, PhysicalParentGood} <- {LogicalChild, PhysicalChild}] What is worse, the creation of a not optimal parent will trigger rules on its parent, which in turn may create a not optimal parent-parent node, etc. 2) Make sure that your convention returns true for Convention.canConvertConvention. In this case, every time a new rel is added to a RelSet, its traitset will be compared to traitsets of all other rels in the set. For every pair of traitset we may ask the engine to create a relevant AbstractConverter. The net effect is that "physical-to-logical" converter is created, which re-triggers the rule on the logical parent since their conventions are compatible: 2.1) [LogicalParent <- LogicalChild] 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] 2.3) [LogicalParent <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}] 2.4) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild, PhysicalToLogicalConverter}] The problem with that approach is that it is too coarse-grained since we operate on traitsets rather than rels. As a result, additional memory and CPU pressure are introduced because usually too many converters are created, which triggers the rules over and over again. **Affected products** At the moment two distributed engines are being developed for Hazelcast and Apache Ignite. Both require bottom-up optimization and currently rely on the second workaround. Another example is Apache Drill. I do not know whether its community is concerned with the issue, but it also uses bottom-up optimization for many rules and employs both the aforementioned workarounds. As a result, I guess that Drill's optimizer also creates too many rels during optimization and suffer from huge search space. Please correct me if I am wrong. **Proposal** The key problem is that there is no way to re-fire rules on a specific node. The original problem could have been solved, if it would be possible to re-fire rules on a LogicalParent without creating any additional rels. That would lead to a clear optimization flow: 2.1) [LogicalParent <- LogicalChild] 2.2) [LogicalParent <- {LogicalChild, PhysicalChild}] 2.3) [{LogicalParent, PhysicalParent} <- {LogicalChild, PhysicalChild}] We can add the following method to VolcanoPlanner (RelOptPlanner?) interface: void fireRules(RelNode rel) This method will fire the rules on a passed node in a deferred mode as if it was the new node just added to the optimizer. This would require slight changes to RuleQueue.addMatch method, and possibly some other places. Usage example: class PhysicalChildRule extends RelOptRule { void onMatch(RelOptRuleCall call) { LogicalChild logicalRel = call.get(0); PhysicalChild physicalRel = ...; call.transformTo(physicalRel); optimizer.fireRules(logicalRel); } } What does the community think about such a method? Does it make any sense to you? If not, do you aware of any other ways on how to organize bottom-up optimization with VolcanoPlanner without the creation of additional rels? If the community is OK in general, I can