[ 
https://issues.apache.org/jira/browse/CALCITE-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14643488#comment-14643488
 ] 

Maryann Xue commented on CALCITE-818:
-------------------------------------

The optimizer requires the output rel (the final plan) to have the same trait 
set, except for convention, as the input rel (the rel created by 
SqlToRelConverter). Here the input rel is 
LogicalProject(input=LogicalValues(...), projects=...). For the first case, 
"select p0+p1 ...", the LogicalProject has collation trait "[0]", inferred from 
its child, so the output rel must have that collation as well. However, the 
collation trait of the child of the XXXProject is lost during subset creation, 
so the XXXProject cannot provide that collation, and that is why there is an 
extra EnumerableSort.
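
To make the mechanism concrete, here is a minimal sketch as a toy Python 
model. It is not Calcite code: project_collations and needs_sort are 
hypothetical helpers, collations are plain lists of column indexes, and only 
bare column references are modeled, so it follows the "select p1 ..." query 
from the description rather than the "p0+p1" expression.

```python
# Toy model only: these helpers are hypothetical and are NOT Calcite APIs.

def project_collations(input_collations, projects):
    """Map input collations through a projection of bare column references.

    projects[i] is the input column index that output column i refers to.
    Each collation is kept up to the first key that is not projected.
    """
    out = []
    for collation in input_collations:
        mapped = []
        for key in collation:
            if key not in projects:
                break  # the rest of this collation cannot be expressed
            mapped.append(projects.index(key))
        if mapped and mapped not in out:
            out.append(mapped)
    return out


def needs_sort(required, delivered):
    """True unless some delivered collation starts with the required keys."""
    if not required:
        return False
    return not any(d[:len(required)] == required for d in delivered)


# Collations inferred for the values rel, as stated in the description.
values_collations = [[0, 1], [1]]

# "select p1": output column 0 refers to input column 1.
required = project_collations(values_collations, [1])[0]

# Assumption of this sketch: subset creation leaves the child with no
# collation at all, so the physical project has nothing to derive from.
delivered_after_subset = project_collations([], [1])
delivered_if_preserved = project_collations(values_collations, [1])

print(needs_sort(required, delivered_after_subset))
print(needs_sort(required, delivered_if_preserved))
```

In this model the sort is needed only when the child's collations have been 
dropped. If they were preserved, the project would deliver the required 
collation itself.
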
CALCITE-793 fixes this case by clearing the collation trait of the input rel 
when there is no ORDER BY in the SQL, but I think it does not fix the real 
problem. For example, if we are doing a merge join and one of the tables has 
multiple collation traits, we might still see a sort that could have been 
avoided.
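
The merge-join concern can be illustrated with a similar toy model. This is a 
hypothetical scenario, not taken from an actual plan: the table collations, 
join keys, and the helper merge_join_sorts are all assumptions made for 
illustration, and nothing here is a Calcite API.

```python
# Toy model only: hypothetical helpers and data, NOT Calcite APIs.

def side_needs_sort(collations, join_keys):
    """True unless one of the input's collations starts with the join keys."""
    return not any(c[:len(join_keys)] == join_keys for c in collations)


def merge_join_sorts(left_collations, left_keys, right_collations, right_keys):
    """Return which inputs of a merge join would need an explicit sort."""
    sides = (
        ("left", left_collations, left_keys),
        ("right", right_collations, right_keys),
    )
    return [name for name, cols, keys in sides if side_needs_sort(cols, keys)]


# Assumed scenario: the left table is sorted both on (c0, c1) and on (c1),
# the right table is sorted on (c0), and the join is left.c1 = right.c0.
left_full = [[0, 1], [1]]
right = [[0]]

# With both collations kept, neither input needs a sort.
print(merge_join_sorts(left_full, [1], right, [0]))

# If the multiple collation is simplified to empty, the left input is sorted
# again even though it was already ordered on the join key.
print(merge_join_sorts([], [1], right, [0]))
```

Keeping only the first of the two collations gives the same result as 
dropping both in this scenario, since [0, 1] does not start with the join key.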

> Multiple collation traits get wiped out when creating subset, thus cause 
> unnecessary sort
> -----------------------------------------------------------------------------------------
>
>                 Key: CALCITE-818
>                 URL: https://issues.apache.org/jira/browse/CALCITE-818
>             Project: Calcite
>          Issue Type: Bug
>    Affects Versions: 1.4.0-incubating
>            Reporter: Maryann Xue
>            Assignee: Julian Hyde
>            Priority: Minor
>
> "select p1 from (values (2, 1)) as t(p0, p1)"
> or
> "select p0+p1 from (values (2, 1)) as t(p0, p1)"
> would return a plan (with VolcanoPlanner) like:
> {code}
> EnumerableSort(...)
>   EnumerableCalc(...)
>     EnumerableValues(...)
> {code}
> This happens because a multiple collation trait, [[0,1], [1]], is inferred 
> from the LogicalValues rel, and the LogicalProject gets a corresponding 
> collation trait based on the project expressions. During optimization, 
> however, the multiple collation trait is simplified to empty when a subset 
> for the LogicalValues rel is created, so EnumerableCalc cannot infer its 
> collation accordingly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
