[ https://issues.apache.org/jira/browse/CALCITE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511457#comment-17511457 ]
Zachary Gramana commented on CALCITE-5051: ------------------------------------------ I looked into this yesterday, and indeed CALCITE-3399 is the culprit. Removing the {{&& setOp.all}} condition restores the 1.24 UNION behavior. Doing so, however, does cause one test to fail: {{SqlToRelConverterExtendedTest.testTrimUnionDistinct()}}. In looking at the original issue prompting the change, [CALCITE-2260|https://issues.apache.org/jira/browse/CALCITE-2260], I think the path forward is a bit more complex, but I'd like to propose an additional solution besides whether to enable trimming for UNION or not. Taking inspiration from [Sqlite|https://sqlite.org/eqp.html#compound_queries], what if we recursively call {{trimFields}} on each subquery in a compound query? Sort of the reverse of how Calcite handles view queries, where we treat the component queries/inline views as tables. Something along the lines of: # Compute the rowType from the component queries/inline views # Replace the subquery with a "placeholder table" with the computed rowType from the previous step # Call {{trimFields}} on the simplified root query # Separately call {{trimFields}} on the now isolated subquery/queries # Replace the "placeholder table" with the corresponding {{TrimResult}} This process could be done recursively to handle particularly complex queries. This approach would potentially have the advantage of both correctness while still allowing trimming where it's safe to yield a more efficient plan. > UNION query plan prevents projection push down > ---------------------------------------------- > > Key: CALCITE-5051 > URL: https://issues.apache.org/jira/browse/CALCITE-5051 > Project: Calcite > Issue Type: Bug > Components: core > Affects Versions: 1.29.0 > Reporter: Zachary Gramana > Priority: Major > > As a user with a custom Calcite adapter that does push down, I should be able > to run a UNION query of statements containing joins and still get the benefit > of projection push down. > Given a query such as: > {code:sql} > SELECT Id > FROM MySchema.t1 > UNION > SELECT t3.Id > FROM MySchema.t2 > JOIN MySchema.t3 ON (t3.Id = t2.t3_Id) > {code} > I expect a resulting query plan that looks like: > {code:lua} > EnumerableUnion(all=[true]) > MyEnumerableConverter > MyProject(Id=[$0]) > MyTableScan(table=[[MySchema, t1]]) > EnumerableCalc(expr#0..1=[{inputs}], Id=[$t1]) > EnumerableMergeJoin(condition=[=($0, $1)], joinType=[inner]) > EnumerableSort(sort0=[$0], dir0=[ASC]) > EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT > NULL], t3_Id0=[$t101]) > MyEnumerableConverter > MyTableScan(table=[[MySchema, t2]]) > EnumerableSort(sort0=[$0], dir0=[ASC]) > MyEnumerableConverter > MyProject(Id=[$0]) > MyTableScan(table=[[MySchema, t3]]) > {code} > But instead I observed: > {code:java} > EnumerableUnion(all=[false]) > MyEnumerableConverter > MyProject(Id=[$0]) > MyTableScan(table=[[MySchema, t1]]) > EnumerableCalc(expr#0..251=[{inputs}], Id=[$t102]) > EnumerableMergeJoin(condition=[=($101, $102)], joinType=[inner]) > EnumerableSort(sort0=[$101], dir0=[ASC]) > EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT > NULL], proj#0..101=[{exprs}]) > MyEnumerableConverter > MyTableScan(table=[[MySchema, t2]]) > EnumerableSort(sort0=[$0], dir0=[ASC]) > MyEnumerableConverter > MyTableScan(table=[[MySchema, t3]]) > {code} > Note that: > # The {{EnumerableCalc}} node applied to the {{EnumerableMergeJoin}} goes > from taking 1 expected input field to taking 251 input fields > # The {{MyProject}} node expected to be applied to > {{MyTableScan(table=[[MySchema, t3]])}} is missing from the observed plan > # Issue was observed after upgrading from 1.24 to 1.29, so may affect one or > more intervening releases > # PR containing reproducing unit test: > https://github.com/apache/calcite/pull/2747 -- This message was sent by Atlassian Jira (v8.20.1#820001)