[ https://issues.apache.org/jira/browse/DRILL-3657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14698885#comment-14698885 ]
Sean Hsuan-Yi Chu commented on DRILL-3657: ------------------------------------------ The window function in the top window should not have pointed at $2, which represents a the output of the window function below. > Wrong result with SUM(1) window function when multiple partitions are present > ----------------------------------------------------------------------------- > > Key: DRILL-3657 > URL: https://issues.apache.org/jira/browse/DRILL-3657 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization > Affects Versions: 1.2.0 > Reporter: Victoria Markman > Assignee: Sean Hsuan-Yi Chu > Priority: Critical > Labels: window_function > Fix For: 1.2.0 > > > The common use case for this would be: SUM( CASE WHEN x=y THEN 1 ELSE 0 END) > Wrong result: > {code} > 0: jdbc:drill:schema=dfs> select > . . . . . . . . . . . . > b2, > . . . . . . . . . . . . > c2, > . . . . . . . . . . . . > sum(1) over(partition by b2 order by c2), > . . . . . . . . . . . . > sum(1) over(partition by c2) > . . . . . . . . . . . . > from > . . . . . . . . . . . . > t2 > . . . . . . . . . . . . > order by > . . . . . . . . . . . . > 1,2; > +--------+-------------+---------+---------+ > | b2 | c2 | EXPR$2 | EXPR$3 | > +--------+-------------+---------+---------+ > | aaaaa | 2015-01-01 | 1 | 1 | > | bbbbb | 2015-01-02 | 3 | 9 | > | bbbbb | 2015-01-02 | 3 | 9 | > | bbbbb | 2015-01-02 | 3 | 9 | > | ccccc | 2015-01-03 | 1 | 1 | > | ddddd | 2015-01-04 | 1 | 1 | > | eeeee | 2015-01-05 | 1 | 1 | > | fffff | 2015-01-06 | 1 | 1 | > | ggggg | 2015-01-07 | 2 | 4 | > | ggggg | 2015-01-07 | 2 | 4 | > | hhhhh | 2015-01-08 | 1 | 1 | > | iiiii | 2015-01-09 | 1 | 1 | > | zzz | 2014-12-31 | 1 | 1 | > +--------+-------------+---------+---------+ > 13 rows selected (0.204 seconds) > {code} > Explain plan: > {code} > | 00-00 Screen > 00-01 Project(b2=[$0], c2=[$1], EXPR$2=[$2], EXPR$3=[$3]) > 00-02 SelectionVectorRemover > 00-03 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > 00-04 Window(window#0=[window(partition {1} order by [] range > between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($2)])]) > 00-05 SelectionVectorRemover > 00-06 Sort(sort0=[$1], dir0=[ASC]) > 00-07 Window(window#0=[window(partition {0} order by [1] > range between UNBOUNDED PRECEDING and CURRENT ROW aggs [SUM($2)])]) > 00-08 SelectionVectorRemover > 00-09 Sort(sort0=[$0], sort1=[$1], dir0=[ASC], > dir1=[ASC]) > 00-10 Project(b2=[$1], c2=[$0]) > 00-11 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t2]], > selectionRoot=maprfs:/drill/testdata/subqueries/t2, numFiles=1, > columns=[`b2`, `c2`]]]) > {code} > If you have a query with only one of these partitions, result is correct : > {code} > 0: jdbc:drill:schema=dfs> select > . . . . . . . . . . . . > b2, > . . . . . . . . . . . . > c2, > . . . . . . . . . . . . > sum(1) over(partition by b2 order by c2) > . . . . . . . . . . . . > from > . . . . . . . . . . . . > t2 > . . . . . . . . . . . . > order by > . . . . . . . . . . . . > 1,2; > +--------+-------------+---------+ > | b2 | c2 | EXPR$2 | > +--------+-------------+---------+ > | aaaaa | 2015-01-01 | 1 | > | bbbbb | 2015-01-02 | 3 | > | bbbbb | 2015-01-02 | 3 | > | bbbbb | 2015-01-02 | 3 | > | ccccc | 2015-01-03 | 1 | > | ddddd | 2015-01-04 | 1 | > | eeeee | 2015-01-05 | 1 | > | fffff | 2015-01-06 | 1 | > | ggggg | 2015-01-07 | 2 | > | ggggg | 2015-01-07 | 2 | > | hhhhh | 2015-01-08 | 1 | > | iiiii | 2015-01-09 | 1 | > | zzz | 2014-12-31 | 1 | > +--------+-------------+---------+ > 13 rows selected (0.196 seconds) > 0: jdbc:drill:schema=dfs> select > . . . . . . . . . . . . > b2, > . . . . . . . . . . . . > c2, > . . . . . . . . . . . . > sum(1) over(partition by c2) > . . . . . . . . . . . . > from > . . . . . . . . . . . . > t2 > . . . . . . . . . . . . > order by > . . . . . . . . . . . . > 1,2; > +--------+-------------+---------+ > | b2 | c2 | EXPR$2 | > +--------+-------------+---------+ > | aaaaa | 2015-01-01 | 1 | > | bbbbb | 2015-01-02 | 3 | > | bbbbb | 2015-01-02 | 3 | > | bbbbb | 2015-01-02 | 3 | > | ccccc | 2015-01-03 | 1 | > | ddddd | 2015-01-04 | 1 | > | eeeee | 2015-01-05 | 1 | > | fffff | 2015-01-06 | 1 | > | ggggg | 2015-01-07 | 2 | > | ggggg | 2015-01-07 | 2 | > | hhhhh | 2015-01-08 | 1 | > | iiiii | 2015-01-09 | 1 | > | zzz | 2014-12-31 | 1 | > +--------+-------------+---------+ > 13 rows selected (0.179 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)