[ https://issues.apache.org/jira/browse/CALCITE-7125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18011931#comment-18011931 ]
Alessandro Solimando commented on CALCITE-7125: ----------------------------------------------- Thanks for your reply [~jensen], and no problem about it, it's part of the process, I miss that too during the release process. Let's just find a good way to move forward on this, as you seem to agree on the regression. I think too that it's interesting to keep both versions of the rule, but since we want to reintroduce the original behavior, I would be inclined to keep that one as the default one, and provide the new behavior via an option (that's *not* what it's currently implemented in my branch, which is just a PoC for checking details). I prefer change in plans to be a conscious effort whenever it's controllable, rather than have it change under the hood, whoever hasn't upgraded yet to 1.40.0, and will upgrade at 1.41.0 or later, will see no change here, unless they change their way to register planning rules. Let's see if others have any opinions on this, as it will be a breaking change for 1.41.0, if not I will move forward with my proposal. > Impossible to get a plan with partial aggregate push-down via > IntersectToDistinctRule > ------------------------------------------------------------------------------------- > > Key: CALCITE-7125 > URL: https://issues.apache.org/jira/browse/CALCITE-7125 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.40.0 > Reporter: Alessandro Solimando > Priority: Major > Labels: breaking_change > > CALCITE-6893 changed _IntersectToDistinctRule_ to rewrite "INTERSECT > DISTINCT" queries using a tagging plus filtering strategy rather than pushing > aggregates into the "UNION" inputs. > Rationale was to let other rules (_AGGREGATE_UNION_TRANSPOSE_ is mentioned in > the ticket description) produce the plan with partial aggregation push-down, > and let the CBO do the choice. > Consider the following query: > {noformat} > SELECT empno FROM emp > INTERSECT > SELECT empno FROM emp > {noformat} > _IntersectToDistinctRule_, as per CALCITE-6893, would rewrite it as follows: > {noformat} > LogicalProject(EMPNO=[$0]) > LogicalFilter(condition=[AND(>($1, 0), >($2, 0))]) > LogicalAggregate(group=[{0}], count_i0=[COUNT() FILTER (WHERE i = 0)], > count_i1=[COUNT() FILTER (WHERE i = 1)]) > LogicalUnion(all=[true]) > LogicalProject(EMPNO=[$0], i=[0]) // "tag" i=0 > TableScan(emp) > LogicalProject(EMPNO=[$0], i=[1]) // "tag" i=1 > TableScan(emp) > {noformat} > This structure introduces: > - A tagging column "i" added via projection > - Two Projects wrapping each input > - A single Aggregate with COUNT() FILTER expressions over those tags > - A final filter requiring count_iN > 0, for N in "1..#operands" > With this rewrite, the planner: > - Can no longer push aggregates into the union inputs > - Cannot apply rules like > _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ or > _AGGREGATE_PROJECT_MERGE_/_AggregateProjectMergeRule_ > - Fails to recover the earlier, sometimes more efficient plan shape: > {noformat} > LogicalAggregate(group=[{0}]) > LogicalUnion(all=[true]) > LogicalAggregate(group=[{0}]) > ... > LogicalAggregate(group=[{0}]) > ... > {noformat} > This prevents early row reduction and downstream rule composition, even when > all relevant transformation rules are enabled. > Prior to CALCITE-6893, the same query with the same planner configuration > produced the optimized form above. That plan is no longer reachable with the > current {_}IntersectToDistinctRule{_}, and no combination of core rules > (including _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_, > _PROJECT_SET_OP_TRANSPOSE_/_ProjectSetOpTransposeRule_, etc.) can recover it. > Unless I am mistaken and someone can suggest further rules that can lead to > the same plan as before, this seems like a regression. > My proposal would be to either revert the change, or to at least expose the > "old" behavior via a configuration flag. > My understanding is that, the original rewrite proposed in CALCITE-6893 was > supposed to rewritable (by means of > _AGGREGATE_UNION_TRANSPOSE_/_AggregateUnionTransposeRule_ rule). The rewrite > was incorrect and the currently implemented form has been suggested, but it > hasn't been discussed if the "new" plan shape (post CALCITE-6893) could be > transformed into the "old" plan shape (pre CALCITE-6893). -- This message was sent by Atlassian Jira (v8.20.10#820010)