Sudheesh Katkam created DRILL-2801:
--------------------------------------

             Summary: ORDER BY produces extra records
                 Key: DRILL-2801
                 URL: https://issues.apache.org/jira/browse/DRILL-2801
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Relational Operators
    Affects Versions: 0.8.0
            Reporter: Sudheesh Katkam
            Assignee: Chris Westin
            Priority: Critical
         Attachments: data.csv

Running in embedded mode on my Mac.

{code}
$ wc -w data.csv
50000 data.csv
{code}

Here are the queries:

{code}
0: jdbc:drill:zk=local> SELECT count(*) FROM dfs.`data.csv`;
+------------+
|   EXPR$0   |
+------------+
| 50000      |
+------------+
1 row selected (0.223 seconds)
0: jdbc:drill:zk=local> SELECT columns[0] FROM dfs.`data.csv` ORDER BY columns[0];
+------------+
|   EXPR$0   |
+------------+
...
| 6          |
+------------+
50,001 rows selected (0.928 seconds)
0: jdbc:drill:zk=local> SELECT tab.col, COUNT(tab.col) FROM (SELECT columns[0] col FROM dfs.`data.csv` ORDER BY columns[0]) tab GROUP BY tab.col;
+------------+------------+
|    col     |   EXPR$1   |
+------------+------------+
| 2          | 10000      |
| 3          | 10000      |
| 4          | 10000      |
| 5          | 10001      |
| 6          | 10000      |
+------------+------------+
5 rows selected (0.704 seconds)
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)