[jira] [Updated] (DRILL-7268) Read Hive array with parquet native reader

Igor Guzenko (JIRA) Wed, 26 Jun 2019 02:16:17 -0700


     [ 
https://issues.apache.org/jira/browse/DRILL-7268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Igor Guzenko updated DRILL-7268:
--------------------------------
    Description: 
When Hive stores array data in Parquet format, it creates a nested group 
schema for each array column. For example, for a column declared as 
arr_n_0 ARRAY<INT> the schema is:

{code}
optional group arr_n_0 (LIST) {
  repeated group bag {
    optional int32 array_element;
  }
}
{code}

Before the changes, a sample result for such a column was:

{code}{"bag":[{"array_element":1},{"array_element":2}]}{code}

After the changes, Drill reads only the array element data, without the 
additional keys such as "bag" or "array_element":

{code}[1,2]{code}
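
The before/after difference can be illustrated with a small Python sketch. This is not Drill's implementation (the actual change is in Drill's native Parquet reader); the function name flatten_hive_list is hypothetical, and the sketch assumes that any map whose only key is "bag" is a Hive LIST wrapper:

```python
def flatten_hive_list(value):
    """Unwrap {"bag": [{"array_element": x}, ...]} into [x, ...].

    Hypothetical helper for illustration only. Assumes a dict whose single
    key is "bag" is a Hive LIST wrapper; anything else is returned as-is.
    """
    if isinstance(value, dict) and set(value) == {"bag"}:
        # Each repeated "bag" entry holds one optional "array_element";
        # recurse so that nested arrays are unwrapped as well.
        return [flatten_hive_list(item.get("array_element"))
                for item in value["bag"]]
    return value


before = {"bag": [{"array_element": 1}, {"array_element": 2}]}
print(flatten_hive_list(before))  # prints [1, 2]
```

A missing "array_element" in a "bag" entry is treated as a null element here, which matches the field being declared optional in the schema above.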

 

Please read the design doc linked to the parent task for more details. 

> Read Hive array with parquet native reader
> ------------------------------------------
>
>                 Key: DRILL-7268
>                 URL: https://issues.apache.org/jira/browse/DRILL-7268
>             Project: Apache Drill
>          Issue Type: Sub-task
>            Reporter: Igor Guzenko
>            Assignee: Igor Guzenko
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.17.0



