[ https://issues.apache.org/jira/browse/DRILL-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Khurram Faraaz updated DRILL-3404: ---------------------------------- Attachment: 0_0_0.parquet Attached parquet input file used in the test. > Filter on window function does not appear in query plan > ------------------------------------------------------- > > Key: DRILL-3404 > URL: https://issues.apache.org/jira/browse/DRILL-3404 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization > Affects Versions: 1.1.0 > Environment: 4 node cluster on CentOS > Reporter: Khurram Faraaz > Assignee: Jinfeng Ni > Priority: Critical > Attachments: 0_0_0.parquet > > > Filter is missing in the query plan for the below query in Drill, and hence > wrong results are returned. > Results from Drill > {code} > 0: jdbc:drill:schema=dfs.tmp> select c1, c2, w_sum from ( select c1, c2, sum > ( c1 ) over ( partition by c2 order by c1 asc nulls first ) w_sum from > `tblWnulls` ) sub_query where w_sum is not null; > +-------------+-------+-------------+ > | c1 | c2 | w_sum | > +-------------+-------+-------------+ > | 0 | a | 0 | > | 1 | a | 1 | > | 5 | a | 6 | > | 10 | a | 16 | > | 11 | a | 27 | > | 14 | a | 41 | > | 11111 | a | 11152 | > | 2 | b | 2 | > | 9 | b | 11 | > | 13 | b | 24 | > | 17 | b | 41 | > | null | c | null | > | 4 | c | 4 | > | 6 | c | 10 | > | 8 | c | 18 | > | 12 | c | 30 | > | 13 | c | 56 | > | 13 | c | 56 | > | null | d | null | > | null | d | null | > | 10 | d | 10 | > | 11 | d | 21 | > | 2147483647 | d | 4294967315 | > | 2147483647 | d | 4294967315 | > | -1 | e | -1 | > | 15 | e | 14 | > | null | null | null | > | 19 | null | 19 | > | 65536 | null | 65555 | > | 1000000 | null | 1065555 | > +-------------+-------+-------------+ > 30 rows selected (0.337 seconds) > {code} > Explain plan for the above query from Drill > {code} > 0: jdbc:drill:schema=dfs.tmp> explain plan for select c1, c2, w_sum from ( > select c1, c2, sum ( c1 ) over ( partition by c2 order by c1 asc nulls first > ) w_sum from `tblWnulls` ) sub_query where w_sum is not null; > +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ > | > > > > text > > > > | json | > +----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------+ > | 00-00 Screen > 00-01 Project(c1=[$0], c2=[$1], w_sum=[$2]) > 00-02 Project(c1=[$0], c2=[$1], w_sum=[CASE(>($2, 0), $3, null)]) > 00-03 Window(window#0=[window(partition {1} order by [0 > ASC-nulls-first] range between UNBOUNDED PRECEDING and CURRENT ROW aggs > [COUNT($0), $SUM0($0)])]) > 00-04 SelectionVectorRemover > 00-05 Sort(sort0=[$1], sort1=[$0], dir0=[ASC], > dir1=[ASC-nulls-first]) > 00-06 Project(c1=[$1], c2=[$0]) > 00-07 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls]], > selectionRoot=/tmp/tblWnulls, numFiles=1, columns=[`c1`, `c2`]]]) > {code} > Results from Postgres 9.3 > {code} > postgres=# select c1, c2, w_sum from ( select c1, c2, sum ( c1 ) over ( > partition by c2 order by c1 asc nulls first ) w_sum from t222 ) sub_query > where w_sum is not null; > c1 | c2 | w_sum > ------------+----+------------ > 0 | a | 0 > 1 | a | 1 > 5 | a | 6 > 10 | a | 16 > 11 | a | 27 > 14 | a | 41 > 11111 | a | 11152 > 2 | b | 2 > 9 | b | 11 > 13 | b | 24 > 17 | b | 41 > 4 | c | 4 > 6 | c | 10 > 8 | c | 18 > 12 | c | 30 > 13 | c | 56 > 13 | c | 56 > 10 | d | 10 > 11 | d | 21 > 2147483647 | d | 4294967315 > 2147483647 | d | 4294967315 > -1 | e | -1 > 15 | e | 14 > 19 | | 19 > 65536 | | 65555 > 1000000 | | 1065555 > (26 rows) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)