[jira] [Commented] (DRILL-2167) Order by on a repeated index from the output of a flatten on large no of records results in incorrect results

Jason Altekruse (JIRA) Wed, 04 Feb 2015 14:13:29 -0800

    [ https://issues.apache.org/jira/browse/DRILL-2167?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14306080#comment-14306080 ]


Jason Altekruse commented on DRILL-2167:
----------------------------------------

Rahul, can you check whether writing the output of flatten to a new file, and 
then reading that file back in and sorting it, produces the same result? This 
may be caused by an interaction between the two operations, but it might just 
be an issue with sort. It would also be useful to compare the query plans to 
see how they differ.
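
A rough sketch of that check, reusing the queries from the report. The 
workspace dfs.tmp, the table name flattened_rms, and the choice of JSON as the 
output format are assumptions for illustration, not details from the issue; 
adjust them to whatever writable workspace you have.
{code}
-- Assumption: write the intermediate table as JSON so the nested data is kept as-is.
alter session set `store.format` = 'json';

-- Step 1: materialize only the flatten output (hypothetical table name).
create table dfs.tmp.`flattened_rms` as
select d.uid, flatten(d.map.rm) rms from `data.json` d;

-- Step 2: sort the materialized data, with no flatten in this query.
select s.uid from dfs.tmp.`flattened_rms` s
order by s.rms.rptd[1].d;

-- Step 3: get the plan for the original query and for the sort-only query.
explain plan for
select s.uid from (select d.uid, flatten(d.map.rm) rms from `data.json` d) s
order by s.rms.rptd[1].d;

explain plan for
select s.uid from dfs.tmp.`flattened_rms` s
order by s.rms.rptd[1].d;
{code}
If step 2 still returns extra records, the problem is probably in sort alone; 
if it returns the expected count, the interaction with flatten is the more 
likely cause.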

> Order by on a repeated index from the output of a flatten on large no of 
> records results in incorrect results
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: DRILL-2167
>                 URL: https://issues.apache.org/jira/browse/DRILL-2167
>             Project: Apache Drill
>          Issue Type: Bug
>            Reporter: Rahul Challapalli
>            Assignee: Jason Altekruse
>            Priority: Critical
>         Attachments: data.json
>
>
> git.commit.id.abbrev=3e33880
> The query below returns 200006 records. Based on the data set, we should 
> receive only 200000 records.
> {code}
> select s.uid from (select d.uid, flatten(d.map.rm) rms from `data.json` d) s 
> order by s.rms.rptd[1].d;
> {code}
> When I removed the order by clause, Drill correctly reported 200000 records.
> {code}
> select s.uid from (select d.uid, flatten(d.map.rm) rms from `data.json` d) s;
> {code}
> I attached a data set with 2 records. I copied the data set 50000 times and 
> ran the queries on the combined data. Let me know if you have any other 
> questions.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
