Rahul Challapalli created DRILL-2264:
----------------------------------------
Summary: Incorrect data when we use aggregate functions with flatten
Key: DRILL-2264
URL: https://issues.apache.org/jira/browse/DRILL-2264
Project: Apache Drill
Issue Type: Bug
Components: Functions - Drill
Reporter: Rahul Challapalli
Assignee: Jason Altekruse
Priority: Critical

git.commit.id.abbrev=6676f2d

Data set:
{code}
{
  "uid":1,
  "lst_lst" : [[1,2],[3,4]]
}
{
  "uid":2,
  "lst_lst" : [[1,2],[3,4]]
}
{code}

The query below returns incorrect results:
{code}
select uid, MAX(flatten(lst_lst[1]) + flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[1]), flatten(lst_lst[0]);
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 6          |
| 1          | 6          |
| 1          | 6          |
| 1          | 6          |
| 2          | 6          |
| 2          | 6          |
| 2          | 6          |
| 2          | 6          |
+------------+------------+
{code}

However, if we use a sub-query, Drill returns the right data:
{code}
select uid, MAX(l1+l2) from (select uid, flatten(lst_lst[1]) l1, flatten(lst_lst[0]) l2 from `temp.json`) sub group by uid, l1, l2;
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 4          |
| 1          | 5          |
| 1          | 5          |
| 1          | 6          |
| 2          | 4          |
| 2          | 5          |
| 2          | 5          |
| 2          | 6          |
+------------+------------+
{code}

Also, using a single flatten yields proper results:
{code}
select uid, MAX(flatten(lst_lst[0])) from `temp.json` group by uid, flatten(lst_lst[0]);
+------------+------------+
|    uid     |   EXPR$1   |
+------------+------------+
| 1          | 1          |
| 1          | 2          |
| 2          | 1          |
| 2          | 2          |
+------------+------------+
{code}

Marked as critical since we return incorrect data. Let me know if you have any other questions.
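
For reference, the expected values can be reproduced outside Drill. The sketch below is a minimal Python model, not Drill's implementation. It assumes that several flatten calls in one select list yield the cross product of the flattened arrays for each input row, which is what the sub-query output above suggests. The function name `expected_max_per_group` is hypothetical.

```python
from itertools import product

# The two records from the data set in this report.
rows = [
    {"uid": 1, "lst_lst": [[1, 2], [3, 4]]},
    {"uid": 2, "lst_lst": [[1, 2], [3, 4]]},
]


def expected_max_per_group(rows):
    """Model of: select uid, MAX(l1 + l2) ... group by uid, l1, l2.

    Assumption: flattening lst_lst[1] and lst_lst[0] in the same select
    produces every (l1, l2) pair for a row, i.e. a per-row cross product.
    """
    groups = {}
    for row in rows:
        for l1, l2 in product(row["lst_lst"][1], row["lst_lst"][0]):
            key = (row["uid"], l1, l2)
            total = l1 + l2
            # MAX aggregate within the (uid, l1, l2) group.
            groups[key] = max(groups.get(key, total), total)
    # Project only uid and the aggregate, as the query does.
    return [(uid, value) for (uid, _l1, _l2), value in groups.items()]


result = expected_max_per_group(rows)
for uid, value in result:
    print(uid, value)
```

Under that assumption each (uid, l1, l2) group holds a single pair, so MAX simply returns l1 + l2. This gives 4, 5, 5, 6 per uid rather than the constant 6 returned by the failing query.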