[ https://issues.apache.org/jira/browse/CALCITE-2648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16846193#comment-16846193 ]
Haisheng Yuan commented on CALCITE-2648:
----------------------------------------

Whether EnumerableWindow should preserve the order depends heavily on the runtime implementation. In Postgres/GPDB, the window operator is sort-based, so the optimizer assumes that it preserves the order. In Calcite, I suppose it uses a hash map for window partitioning?

{quote}
We should not necessarily preserve order, if doing so would be expensive (and/or more complicated).
{quote}
I don't agree with this. If we don't preserve order, we will lose a lot of optimization opportunities. For example:

    select row_number() over (partition by a order by b, c),
           row_number() over (partition by a order by b)
    from foo;

This query needs only one sort, because rows sorted on (a, b, c) are also sorted on (a, b). The Calcite plan, however, evaluates the two windows separately, which is wasteful.

> Output collation of EnumerableWindow is not consistent with its implementation
> ------------------------------------------------------------------------------
>
>                 Key: CALCITE-2648
>                 URL: https://issues.apache.org/jira/browse/CALCITE-2648
>             Project: Calcite
>          Issue Type: Bug
>   Affects Versions: 1.17.0
>            Reporter: Hongze Zhang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: postgresql_96_doesnt_care_to_keep_collation_for_project_over_expression.png
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> Here is a case:
> {code:sql}
> select x, COUNT(*) OVER (PARTITION BY x) from (values (20), (35)) as t(x) ORDER BY x
> {code}
> Final plan:
> {code:java}
> EnumerableWindow(window#0=[window(partition {0} order by [] range between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT()])])
>   EnumerableValues(tuples=[[{ 20 }, { 35 }]])
> {code}
> Output rows:
> {code:java}
> X  |EXPR$1 |
> ---|-------|
> 35 |1      |
> 20 |1      |
> {code}
> EnumerableWindow is supposed to preserve input collations; as a result, EnumerableSort is ignored. However, the implementation of EnumerableWindow generates non-ordered output (when a PARTITION BY clause is used).
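
To illustrate why the partitioning strategy matters for output order, here is a minimal sketch. It assumes hash-based partitioning, as the comment supposes, and it is not Calcite's actual EnumerableWindow code; the class `WindowOrderSketch` and the method `countOverPartition` are hypothetical names introduced only for this example. It computes COUNT(*) OVER (PARTITION BY x) and emits rows partition by partition, in whatever order the supplied map iterates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only; this is not code from Calcite.
public class WindowOrderSketch {

  /**
   * Computes COUNT(*) OVER (PARTITION BY x) for a single-column input.
   * Rows are emitted partition by partition, in the iteration order of
   * the supplied map, so the map type decides the output order.
   */
  static List<int[]> countOverPartition(int[] input,
      Map<Integer, List<Integer>> partitions) {
    // Group the input rows by partition key.
    for (int x : input) {
      partitions.computeIfAbsent(x, k -> new ArrayList<>()).add(x);
    }
    // Emit {x, count} for every row of every partition.
    List<int[]> out = new ArrayList<>();
    for (List<Integer> rows : partitions.values()) {
      for (int x : rows) {
        out.add(new int[] {x, rows.size()});
      }
    }
    return out;
  }

  public static void main(String[] args) {
    int[] input = {20, 35}; // already sorted on x, as in the reported case

    // Hash-based partitioning: iteration order is unspecified, so the
    // input order is not guaranteed to survive.
    System.out.println("hash-based:");
    for (int[] row : countOverPartition(input, new HashMap<>())) {
      System.out.println(row[0] + " | " + row[1]);
    }

    // Sort-based partitioning: partitions come out in key order.
    System.out.println("sort-based:");
    for (int[] row : countOverPartition(input, new TreeMap<>())) {
      System.out.println(row[0] + " | " + row[1]);
    }
  }
}
```

With the hash map, the rows may come out as 35 before 20 even though the input was sorted on x, because `HashMap` makes no ordering guarantee. With the sorted map, the output keeps the key order, which is the property a sort-based window operator can advertise to the planner as a preserved collation.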