[jira] [Updated] (DRILL-3420) Final sort can be dropped in some cases when result of window operator is already sorted on the same columns

Deneche A. Hakim (JIRA) Mon, 29 Jun 2015 14:56:24 -0700

     [ 
https://issues.apache.org/jira/browse/DRILL-3420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Deneche A. Hakim updated DRILL-3420:
------------------------------------
    Description: 
In the example query, the output of the window operator is already sorted on the 
same columns that are specified in the ORDER BY clause, so the last sort is redundant.

{noformat}
0: jdbc:drill:schema=dfs> explain plan for select b1, c1, a1, sum(a1) over(partition by b1 order by c1, a1) from t1 order by 1,2,3;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
00-04            Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[CASE(>($3, 0), $4, null)])
00-05              Window(window#0=[window(partition {0} order by [1, 2] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($2), $SUM0($2)])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`b1`, `c1`, `a1`]]])
{noformat}
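
The redundancy in the plan above can be checked mechanically: the Sort at 00-07 orders the rows on $0, $1, $2, and the final Sort at 00-03 asks for the same columns in the same directions. The following is a minimal Python sketch of that check, not Drill's planner code. The names `window_output_collation` and `sort_is_redundant` are hypothetical, and the sketch assumes that the window operator and the pass-through Project above it preserve the row order of their input.

```python
from typing import List, Sequence, Tuple

# A collation is an ordered list of (field_index, direction) pairs,
# mirroring sort0=[$0], dir0=[ASC], ... in the plan text.
Collation = List[Tuple[int, str]]


def window_output_collation(partition_keys: Sequence[int],
                            order_keys: Sequence[int]) -> Collation:
    """Collation assumed for the window output: partition keys first,
    then the window's ORDER BY keys, all ascending (an assumption)."""
    return [(k, "ASC") for k in list(partition_keys) + list(order_keys)]


def sort_is_redundant(existing: Collation, required: Collation) -> bool:
    """A sort is redundant when the required collation is a prefix of
    the collation the input already has."""
    return (len(required) <= len(existing)
            and existing[:len(required)] == required)


# window(partition {0} order by [1, 2]) from the plan
existing = window_output_collation([0], [1, 2])
# ORDER BY 1,2,3 maps to fields $0, $1, $2
required = [(0, "ASC"), (1, "ASC"), (2, "ASC")]
print(sort_is_redundant(existing, required))
```

A final ORDER BY on a different column order, for example [(1, "ASC"), (0, "ASC")], fails the prefix test, so the sort would have to stay.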

Note to QA: when this enhancement is implemented, we need to make sure that we 
have cases where the sort order is destroyed by a subsequent operation on top of 
the window. In these cases the sort should still be planned.
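
One way to picture such a case is a projection on top of the window that replaces a sort column with a computed expression. The sketch below uses a deliberately simplified model and is not how Drill tracks ordering: a projection is a list whose entries are either the input field index passed through unchanged or None for a computed expression, and `collation_after_project` is a hypothetical helper.

```python
from typing import List, Optional, Tuple

Collation = List[Tuple[int, str]]


def collation_after_project(collation: Collation,
                            outputs: List[Optional[int]]) -> Collation:
    """Map an input collation through a projection.

    outputs[i] is the input field index that output column i passes
    through, or None when the column is a computed expression. The
    surviving collation is the longest prefix of sort keys that are
    all passed through unchanged.
    """
    result: Collation = []
    for field, direction in collation:
        if field not in outputs:
            break  # sort key dropped or rewritten: ordering past here is lost
        result.append((outputs.index(field), direction))
    return result


window_sorted = [(0, "ASC"), (1, "ASC"), (2, "ASC")]

# Pass-through projection as in the plan: b1=$0, c1=$1, a1=$2, EXPR$3 computed.
kept = collation_after_project(window_sorted, [0, 1, 2, None])
# Projection that replaces c1 with an expression: only the b1 ordering survives.
lost = collation_after_project(window_sorted, [0, None, 2, None])
print(kept)
print(lost)
```

In the second case the surviving collation no longer covers ORDER BY 1,2,3, so a test for this enhancement would expect the final Sort to remain in the plan.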

  was:
In the example query, the output of the window operator is already sorted on the 
same columns that are specified in the ORDER BY clause, so the last sort is redundant.

{code}
0: jdbc:drill:schema=dfs> explain plan for select b1, c1, a1, sum(a1) over(partition by b1 order by c1, a1) from t1 order by 1,2,3;
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[$3])
00-02        SelectionVectorRemover
00-03          Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
00-04            Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[CASE(>($3, 0), $4, null)])
00-05              Window(window#0=[window(partition {0} order by [1, 2] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($2), $SUM0($2)])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`b1`, `c1`, `a1`]]])
{code}

Note to QA: when this enhancement is implemented, we need to make sure that we 
have cases where the sort order is destroyed by a subsequent operation on top of 
the window. In these cases the sort should still be planned.


> Final sort can be dropped in some cases when result of window operator is 
> already sorted on the same columns
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-3420
>                 URL: https://issues.apache.org/jira/browse/DRILL-3420
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.0.0
>            Reporter: Victoria Markman
>            Assignee: Jinfeng Ni
>              Labels: window_function
>             Fix For: 1.2.0
>
>
> In the example query, the output of the window operator is already sorted on the 
> same columns that are specified in the ORDER BY clause, so the last sort is redundant.
> {noformat}
> 0: jdbc:drill:schema=dfs> explain plan for select b1, c1, a1, sum(a1) over(partition by b1 order by c1, a1) from t1 order by 1,2,3;
> +------+------+
> | text | json |
> +------+------+
> | 00-00    Screen
> 00-01      Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[$3])
> 00-02        SelectionVectorRemover
> 00-03          Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
> 00-04            Project(b1=[$0], c1=[$1], a1=[$2], EXPR$3=[CASE(>($3, 0), $4, null)])
> 00-05              Window(window#0=[window(partition {0} order by [1, 2] range between UNBOUNDED PRECEDING and CURRENT ROW aggs [COUNT($2), $SUM0($2)])])
> 00-06                SelectionVectorRemover
> 00-07                  Sort(sort0=[$0], sort1=[$1], sort2=[$2], dir0=[ASC], dir1=[ASC], dir2=[ASC])
> 00-08                    Scan(groupscan=[ParquetGroupScan [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`b1`, `c1`, `a1`]]])
> {noformat}
> Note to QA: when this enhancement is implemented, we need to make sure that 
> we have cases where the sort order is destroyed by a subsequent operation on top of 
> the window. In these cases the sort should still be planned.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
