Khurram Faraaz created DRILL-3680:
-------------------------------------

             Summary: window function query returns Incorrect results 
                 Key: DRILL-3680
                 URL: https://issues.apache.org/jira/browse/DRILL-3680
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.2.0
         Environment: private-branch 
https://github.com/adeneche/incubator-drill/tree/new-window-funcs
            Reporter: Khurram Faraaz
            Assignee: Chris Westin
            Priority: Critical



Query plan from Drill for the query that returns wrong results
{code}
0: jdbc:drill:schema=dfs.tmp> explain plan for select c1 , c2 , lead(c2) OVER ( 
PARTITION BY c2 ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) 
over(PARTITION BY c2 ORDER BY c1) FROM `tblWnulls.parquet`);
+------+------+
| text | json |
+------+------+
| 00-00    Screen
00-01      Project(c1=[$0], c2=[$1], lead_c2=[$2])
00-02        Project(c1=[$0], c2=[$1], lead_c2=[$2])
00-03          Project(c1=[$0], c2=[$1], $2=[$3])
00-04            Window(window#0=[window(partition {1} order by [0] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [LEAD($1)])])
00-05              Window(window#0=[window(partition {1} order by [0] range 
between UNBOUNDED PRECEDING and CURRENT ROW aggs [NTILE($2)])])
00-06                SelectionVectorRemover
00-07                  Sort(sort0=[$1], sort1=[$0], dir0=[ASC], dir1=[ASC])
00-08                    Project(c1=[$1], c2=[$0])
00-09                      Scan(groupscan=[ParquetGroupScan 
[entries=[ReadEntryWithPath [path=maprfs:///tmp/tblWnulls.parquet]], 
selectionRoot=maprfs:/tmp/tblWnulls.parquet, numFiles=1, columns=[`c1`, `c2`]]])
{code}

Results returned by Drill.
{code}
0: jdbc:drill:schema=dfs.tmp> select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 
ORDER BY c1) lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER 
BY c1) FROM `tblWnulls.parquet`);
+-------------+-------+----------+
|     c1      |  c2   | lead_c2  |
+-------------+-------+----------+
| 0           | a     | null     |
| 1           | a     | null     |
| 5           | a     | null     |
| 10          | a     | null     |
| 11          | a     | null     |
| 14          | a     | null     |
| 11111       | a     | null     |
| 2           | b     | null     |
| 9           | b     | null     |
| 13          | b     | null     |
| 17          | b     | null     |
| 4           | c     | null     |
| 6           | c     | null     |
| 8           | c     | null     |
| 12          | c     | null     |
| 13          | c     | null     |
| 13          | c     | null     |
| null        | c     | null     |
| 10          | d     | null     |
| 11          | d     | null     |
| 2147483647  | d     | null     |
| 2147483647  | d     | null     |
| null        | d     | null     |
| null        | d     | null     |
| -1          | e     | null     |
| 15          | e     | null     |
| 19          | null  | null     |
| 65536       | null  | null     |
| 1000000     | null  | null     |
| null        | null  | null     |
+-------------+-------+----------+
30 rows selected (0.339 seconds)
{code}

Results returned by Postgres

{code}
postgres=# select c1 , c2 , lead(c2) OVER ( PARTITION BY c2 ORDER BY c1) 
lead_c2 FROM (SELECT c1 , c2, ntile(3) over(PARTITION BY c2 ORDER BY c1) FROM 
t222) sub_query;
     c1     | c2 | lead_c2 
------------+----+---------
          0 | a  | a
          1 | a  | a
          5 | a  | a
         10 | a  | a
         11 | a  | a
         14 | a  | a
      11111 | a  | 
          2 | b  | b
          9 | b  | b
         13 | b  | b
         17 | b  | 
          4 | c  | c
          6 | c  | c
          8 | c  | c
         12 | c  | c
         13 | c  | c
         13 | c  | c
            | c  | 
         10 | d  | d
         11 | d  | d
 2147483647 | d  | d
 2147483647 | d  | d
            | d  | d
            | d  | 
         -1 | e  | e
         15 | e  | 
         19 |    | 
      65536 |    | 
    1000000 |    | 
            |    | 
(30 rows)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to