[ 
https://issues.apache.org/jira/browse/PARQUET-221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14369847#comment-14369847
 ] 

Julien Le Dem commented on PARQUET-221:
---------------------------------------

[~rdblue] has been working on standardizing this.
See: 
https://github.com/apache/incubator-parquet-format/blob/master/LogicalTypes.md#lists

> For array type, inconsistent names are used as the array element name.
> ----------------------------------------------------------------------
>
>                 Key: PARQUET-221
>                 URL: https://issues.apache.org/jira/browse/PARQUET-221
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-mr
>    Affects Versions: 1.6.0
>            Reporter: Yin Huai
>
> When creating a converter for an array, Parquet Avro uses "array" as the
> field name ([see 
> here|https://github.com/apache/incubator-parquet-mr/blob/parquet-1.6.0rc7/parquet-avro/src/main/java/parquet/avro/AvroSchemaConverter.java#L131]),
> but Parquet Hive SerDe uses "array_element" as the field name ([see 
> here|https://github.com/apache/incubator-parquet-mr/blob/parquet-1.6.0rc7/parquet-hive/parquet-hive-storage-handler/src/main/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java#L109]).
> Spark SQL's native Parquet support follows Parquet Avro's convention, so for
> data generated by Parquet Hive SerDe the array value cannot be read
> correctly and null is returned instead.
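
To illustrate the mismatch described above, here is a minimal sketch of how
a reader could locate the array element by structure instead of by field
name. It is not the parquet-mr or Spark SQL implementation. The Field class
and the element_of function are hypothetical stand-ins, and the Hive wrapper
group name "bag" and the element repetition are assumptions not stated in
this issue. The rule is also deliberately simplified: a repeated group with
exactly one field is always treated as a wrapper, which would misread an
array whose elements are single-field records.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Field:
    """Hypothetical stand-in for a Parquet schema node (not a parquet-mr class)."""
    name: str
    repetition: str              # "required", "optional", or "repeated"
    type: Optional[str] = None   # primitive type name; None means a group
    children: List["Field"] = field(default_factory=list)

    @property
    def is_group(self) -> bool:
        return self.type is None


def element_of(list_group: Field) -> Field:
    """Locate the element field of a LIST group by structure, ignoring names."""
    if len(list_group.children) != 1:
        raise ValueError("a LIST group must contain exactly one field")
    repeated = list_group.children[0]
    if repeated.repetition != "repeated":
        raise ValueError("the field inside a LIST group must be repeated")
    # Avro-style layout: the repeated field is itself the element.
    if not repeated.is_group or len(repeated.children) != 1:
        return repeated
    # Hive-style layout: a repeated wrapper group holding one element field.
    return repeated.children[0]


# Layout in the style of Parquet Avro 1.6.0: element field named "array".
avro_list = Field("values", "optional", children=[
    Field("array", "repeated", type="int32"),
])

# Layout in the style of Parquet Hive SerDe: element field named
# "array_element" inside a wrapper group (the name "bag" is an assumption).
hive_list = Field("values", "optional", children=[
    Field("bag", "repeated", children=[
        Field("array_element", "optional", type="int32"),
    ]),
])

print(element_of(avro_list).name)
print(element_of(hive_list).name)
```

A reader that looks up the element only by the name "array" finds nothing in
the second layout, which matches the null result reported here. Resolving by
structure returns the element field in both cases.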



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
