[ https://issues.apache.org/jira/browse/DRILL-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Khurram Faraaz reopened DRILL-2953: ----------------------------------- > Group By + Order By query results are not ordered. > -------------------------------------------------- > > Key: DRILL-2953 > URL: https://issues.apache.org/jira/browse/DRILL-2953 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization > Affects Versions: 0.9.0 > Environment: 10833d2cae9f5312cf0e31f8c9f3f8a9dcdc0c45 | Commit 0.9.0 > release version. | 03.05.2015 @ 14:56:56 EDT > Reporter: Khurram Faraaz > Assignee: Jinfeng Ni > Priority: Critical > > Group by + order by query does not return results in correct order. Sort is > performed before the aggregation is done, which should not be the case. > Test was performed on 4 node cluster on CentOS. > {code} > 0: jdbc:drill:> select cast(columns[0] as int) c1 from `testWindow.csv` t2 > where t2.columns[0] is not null group by columns[0] order by columns[0]; > +------------+ > | c1 | > +------------+ > | 10 | > | 100 | > | 113 | > | 119 | > | 2 | > | 50 | > | 55 | > | 57 | > | 61 | > | 67 | > | 89 | > +------------+ > 11 rows selected (0.218 seconds) > {code} > Explain plan for that query that returns wrong results. > {code} > 0: jdbc:drill:> explain plan for select cast(columns[0] as int) c1 from > `testWindow.csv` t2 where t2.columns[0] is not null group by columns[0] order > by columns[0]; > +------------+------------+ > | text | json | > +------------+------------+ > | 00-00 Screen > 00-01 Project(c1=[$0]) > 00-02 Project(c1=[CAST($0):INTEGER], EXPR$1=[$0]) > 00-03 StreamAgg(group=[{0}]) > 00-04 Sort(sort0=[$0], dir0=[ASC]) > 00-05 Filter(condition=[IS NOT NULL($0)]) > 00-06 Project(ITEM=[ITEM($0, 0)]) > 00-07 Scan(groupscan=[EasyGroupScan > [selectionRoot=/tmp/testWindow.csv, numFiles=1, columns=[`columns`[0]], > files=[maprfs:/tmp/testWindow.csv]]]) > {code} > Incorrect results , not in order. > {code} > 0: jdbc:drill:> select cast(columns[0] as int) from `testWindow.csv` t2 where > t2.columns[0] is not null group by columns[0] order by columns[0]; > +------------+ > | EXPR$0 | > +------------+ > | 10 | > | 100 | > | 113 | > | 119 | > | 2 | > | 50 | > | 55 | > | 57 | > | 61 | > | 67 | > | 89 | > +------------+ > 11 rows selected (0.214 seconds) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)