Khurram Faraaz created DRILL-4469:
-------------------------------------

             Summary: SUM window query returns incorrect results over integer 
data
                 Key: DRILL-4469
                 URL: https://issues.apache.org/jira/browse/DRILL-4469
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.6.0
         Environment: 4 node CentOS cluster
            Reporter: Khurram Faraaz
            Priority: Critical


SUM window query returns incorrect results as compared to Postgres, with or 
without the frame clause in the window definition. Note that there is a sub 
query involved and data in column c1 is sorted integer data with no nulls.

Drill 1.6.0 commit ID: 6d5f4983

Results from Drill 1.6.0

{noformat}
0: jdbc:drill:schema=dfs.tmp> SELECT SUM(c1) OVER w FROM (select * from 
dfs.tmp.`t_alltype`) subQry WINDOW w AS (PARTITION BY c8 ORDER BY c1 RANGE 
BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING);
+---------+
| EXPR$0  |
+---------+
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
...
| 10585  |
| 10585  |
| 10585  |
+--------+
145 rows selected (0.257 seconds)
{noformat}

results from Postgres 9.3

{noformat}
postgres=# SELECT SUM(c1) OVER w FROM (select * from t_alltype) subQry WINDOW w 
AS (PARTITION BY c8 ORDER BY c1 RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED 
FOLLOWING);
 sum
------
 4499
 4499
 4499
 4499
 4499
 4499
...
 5613
 5613
 5613
  473
  473
  473
  473
  473
(145 rows)
{noformat}

Removing the frame clause from window definition, still results in completely 
different results on Postgres vs Drill

Results from Drill 1.6.0

{noformat}
0: jdbc:drill:schema=dfs.tmp>    SELECT SUM(c1) OVER w FROM (select * from 
t_alltype) subQry WINDOW w AS (PARTITION BY c8 ORDER BY c1);
+---------+
| EXPR$0  |
+---------+
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
| 10585   |
...
| 10585  |
| 10585  |
| 10585  |
| 10585  |
| 10585  |
+--------+
145 rows selected (0.28 seconds)
{noformat}

Results from Postgres

{noformat}
postgres=# SELECT SUM(c1) OVER w FROM (select * from t_alltype) subQry WINDOW w 
AS (PARTITION BY c8 ORDER BY c1);
 sum
------
    5
   12
   21
   33
   47
   62
   78
   96
  115
  135
  158
  182
  207
  233
  260
  289
...
4914
 5051
 5189
 5328
 5470
 5613
    8
   70
  198
  332
  473
(145 rows)
{noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to