Rahul Challapalli created DRILL-2264:
----------------------------------------

             Summary: Incorrect data when we use aggregate functions with flatten
                 Key: DRILL-2264
                 URL: https://issues.apache.org/jira/browse/DRILL-2264
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill
            Reporter: Rahul Challapalli
            Assignee: Jason Altekruse
            Priority: Critical


git.commit.id.abbrev=6676f2d

Data set:
{code}
{
  "uid":1,
  "lst_lst" : [[1,2],[3,4]]
}
{
  "uid":2,
  "lst_lst" : [[1,2],[3,4]]
}
{code}

The query below returns incorrect results (every group reports 6):
{code}
select uid,MAX( flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` 
group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 6          |
| 1          | 6          |
| 1          | 6          |
| 1          | 6          |
| 2          | 6          |
| 2          | 6          |
| 2          | 6          |
| 2          | 6          |
+------------+------------+
{code}

However, if we use a subquery, Drill returns the correct data:
{code}
select uid, MAX(l1+l2) from (select uid,flatten(lst_lst[1]) l1, 
flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 4          |
| 1          | 5          |
| 1          | 5          |
| 1          | 6          |
| 2          | 4          |
| 2          | 5          |
| 2          | 5          |
| 2          | 6          |
+------------+------------+
{code}
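
For reference, the expected values can be checked by hand. The sketch below is only an illustration in Python, not Drill code. It assumes that two flattens in one select behave like a per-record cross product, which is consistent with the four rows per uid in the subquery output above. The names `records` and `expected_rows` are made up for this example.

```python
from itertools import product

# The two records from the data set in this report.
records = [
    {"uid": 1, "lst_lst": [[1, 2], [3, 4]]},
    {"uid": 2, "lst_lst": [[1, 2], [3, 4]]},
]


def expected_rows(records):
    """Return sorted (uid, MAX(l1 + l2)) rows, grouped by (uid, l1, l2).

    Assumption: flatten(lst_lst[1]) and flatten(lst_lst[0]) in the same
    select are modeled as a cross product within each record.
    """
    groups = {}
    for rec in records:
        for l1, l2 in product(rec["lst_lst"][1], rec["lst_lst"][0]):
            key = (rec["uid"], l1, l2)
            total = l1 + l2
            # MAX over the group; each group holds one row for this data set.
            groups[key] = max(groups.get(key, total), total)
    return sorted((uid, value) for (uid, _, _), value in groups.items())


if __name__ == "__main__":
    for uid, value in expected_rows(records):
        print(uid, value)
```

Under that assumption the sums per uid are 4, 5, 5 and 6, matching the subquery result rather than the all-6 output of the first query.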


Also, using a single flatten yields correct results:
{code}
select uid,MAX(flatten(lst_lst[0])) from `temp.json` group by uid, 
flatten(lst_lst[0]);
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 1          |
| 1          | 2          |
| 2          | 1          |
| 2          | 2          |
+------------+------------+
{code}

Marked it as critical since we return incorrect data. Let me know if you have 
any other questions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
