[ https://issues.apache.org/jira/browse/DRILL-1671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jacques Nadeau reassigned DRILL-1671:
-------------------------------------

    Assignee: Jacques Nadeau

> Incorrect results reported by Drill when we have more than 10 flattens (2048 
> records)
> --------------------------------------------------------------------------------------
>
>                 Key: DRILL-1671
>                 URL: https://issues.apache.org/jira/browse/DRILL-1671
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Drill, Storage - JSON
>            Reporter: Rahul Challapalli
>            Assignee: Jacques Nadeau
>         Attachments: many-arrays-50.json
>
>
> git.commit.id.abbrev=60aa446
> I ran the test below against Jason's private branch, which contains patches 
> for flatten-related bugs that are not yet merged into master.
> Each array within the record contains only 2 elements, so each flatten added 
> to the query should double the number of rows.
> The query below works as expected:
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir>select count(*) from (select id, 
> flatten(evnts1), flatten(evnts2), flatten(evnts3), flatten(evnts4), 
> flatten(evnts5), flatten(evnts6), flatten(evnts7), flatten(evnts8), 
> flatten(evnts9), flatten(evnts10) from 
> `json_kvgenflatten/many-arrays-50.json`) ;
> +------------+
> |   EXPR$0   |
> +------------+
> | 1024       |
> +------------+
> {code}
> However, the query below reports an incorrect result. The correct output is 2048.
> {code}
> 0: jdbc:drill:schema=dfs.drillTestDir> select count(*) from (select id, 
> flatten(evnts1), flatten(evnts2), flatten(evnts3), flatten(evnts4), 
> flatten(evnts5), flatten(evnts6), flatten(evnts7), flatten(evnts8), 
> flatten(evnts9), flatten(evnts10), flatten(evnts11) from 
> `json_kvgenflatten/many-arrays-50.json`) ;
> +------------+
> |   EXPR$0   |
> +------------+
> | 2047       |
> +------------+
> {code}
> From here on, no matter how many flattens we add to the query, the output 
> remains the same. However, the duration of the query seems to increase 
> with each new flatten added.
> I attached the data file. Let me know if you have any questions.
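
The expected counts in the description can be checked with a small sketch. It assumes, as the report states, that every flattened array holds exactly 2 elements, and it further assumes that the data file holds a single record (consistent with the count of 1024 for 10 flattens). The helper names and the stand-in record are hypothetical and are not taken from Drill or from the attached file.

```python
from itertools import product


def flatten_row_count(array_lengths):
    """Rows produced by flattening every array of one record.

    Each flatten multiplies the row count by the length of its array,
    so the total is the product of all array lengths.
    """
    count = 1
    for length in array_lengths:
        count *= length
    return count


def flatten_rows(arrays):
    """Enumerate the flattened rows as the cross product of the arrays."""
    return list(product(*arrays))


# Hypothetical stand-in for one record: 11 arrays of 2 elements each,
# mirroring evnts1 .. evnts11 in the queries above.
arrays = [[f"evnts{i}_a", f"evnts{i}_b"] for i in range(1, 12)]

expected_10 = flatten_row_count([2] * 10)  # 2**10
expected_11 = flatten_row_count([2] * 11)  # 2**11

# Cross-check the closed form against an explicit enumeration.
rows_10 = len(flatten_rows(arrays[:10]))
rows_11 = len(flatten_rows(arrays))

print(expected_10, rows_10)
print(expected_11, rows_11)
```

Under these assumptions the 11-flatten query should return 2048 rows, so the reported 2047 is one row short of the expected value.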


