[ 
https://issues.apache.org/jira/browse/CALCITE-5051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17509048#comment-17509048
 ] 

Julian Hyde commented on CALCITE-5051:
--------------------------------------

There was a bug logged not too long ago that said that we should be able to 
push projects through UNION. But IIRC that wasn't safe, because a narrower 
projection caused more rows to become duplicates. Can you find that bug and 
make sure that it doesn't apply here?
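
To illustrate the concern, here is a minimal sketch in plain Python (not 
Calcite code). The two-column rows and the helper names project_id and 
union_distinct are made up for this example. It shows that projecting on top 
of a distinct UNION and projecting below it can return different results:

```python
# Hypothetical (Id, Name) rows; these values are not taken from the issue.
left = [(1, "a"), (2, "b")]
right = [(1, "c")]


def project_id(rows):
    """Keep only the first column (Id), preserving duplicates."""
    return [(row[0],) for row in rows]


def union_distinct(*inputs):
    """UNION with all=false: concatenate the inputs, then drop duplicate rows."""
    seen = set()
    out = []
    for rows in inputs:
        for row in rows:
            if row not in seen:
                seen.add(row)
                out.append(row)
    return out


# Project above the UNION: (1, "a") and (1, "c") are distinct rows,
# so Id 1 appears twice after the projection.
project_above = project_id(union_distinct(left, right))

# Project below the UNION: both rows collapse to (1,) before deduplication,
# so Id 1 appears only once.
project_below = union_distinct(project_id(left), project_id(right))

print(project_above)
print(project_below)
```

Whether this applies to CALCITE-5051 depends on whether the pushed-down 
projection changes the row type that the UNION itself sees.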

Can you explain how you are able to convert "EnumerableUnion(all=[true])" to 
"EnumerableUnion(all=[false])"?

Can you identify which commit (or JIRA case) caused the issue you are seeing?

I am supportive of this change. I just want to make sure we have done our due 
diligence.

> UNION query plan prevents projection push down
> ----------------------------------------------
>
>                 Key: CALCITE-5051
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5051
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.29.0
>            Reporter: Zachary Gramana
>            Priority: Major
>
> As a user with a custom Calcite adapter that does push down, I should be able 
> to run a UNION query of statements containing joins and still get the benefit 
> of projection push down.
> Given a query such as:
> {code:sql}
> SELECT Id
>   FROM MySchema.t1
> UNION
> SELECT t3.Id
>   FROM MySchema.t2
>   JOIN MySchema.t3 ON (t3.Id = t2.t3_Id)
> {code}
> I expect a resulting query plan that looks like:
> {code:lua}
> EnumerableUnion(all=[true])
>   MyEnumerableConverter
>     MyProject(Id=[$0])
>       MyTableScan(table=[[MySchema, t1]])
>   EnumerableCalc(expr#0..1=[{inputs}], Id=[$t1])
>     EnumerableMergeJoin(condition=[=($0, $1)], joinType=[inner])
>       EnumerableSort(sort0=[$0], dir0=[ASC])
>         EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT 
> NULL], t3_Id0=[$t101])
>           MyEnumerableConverter
>             MyTableScan(table=[[MySchema, t2]])
>       EnumerableSort(sort0=[$0], dir0=[ASC])
>         MyEnumerableConverter
>           MyProject(Id=[$0])
>             MyTableScan(table=[[MySchema, t3]])
> {code}
> But instead I observed:
> {code:java}
> EnumerableUnion(all=[false])
>   MyEnumerableConverter
>     MyProject(Id=[$0])
>       MyTableScan(table=[[MySchema, t1]])
>   EnumerableCalc(expr#0..251=[{inputs}], Id=[$t102])
>     EnumerableMergeJoin(condition=[=($101, $102)], joinType=[inner])
>       EnumerableSort(sort0=[$101], dir0=[ASC])
>         EnumerableCalc(expr#0..100=[{inputs}], expr#101=[CAST($t1):BIGINT NOT 
> NULL], proj#0..101=[{exprs}])
>           MyEnumerableConverter
>             MyTableScan(table=[[MySchema, t2]])
>       EnumerableSort(sort0=[$0], dir0=[ASC])
>         MyEnumerableConverter
>           MyTableScan(table=[[MySchema, t3]])
> {code}
> Note that:
>  # The {{EnumerableCalc}} node applied to the {{EnumerableMergeJoin}} goes 
> from the expected 2 input fields ({{expr#0..1}}) to 252 input fields 
> ({{expr#0..251}})
>  # The {{MyProject}} node expected to be applied to 
> {{MyTableScan(table=[[MySchema, t3]])}} is missing from the observed plan
>  # Issue was observed after upgrading from 1.24 to 1.29, so may affect one or 
> more intervening releases
>  # PR containing reproducing unit test: 
> https://github.com/apache/calcite/pull/2747



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
