[
https://issues.apache.org/jira/browse/PARQUET-221?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yin Huai updated PARQUET-221:
-----------------------------
Summary: For array type, inconsistent names are used as the array element
name. (was: For array type, inconsistent names are used as the field name.)
> For array type, inconsistent names are used as the array element name.
> ----------------------------------------------------------------------
>
> Key: PARQUET-221
> URL: https://issues.apache.org/jira/browse/PARQUET-221
> Project: Parquet
> Issue Type: Bug
> Components: parquet-mr
> Affects Versions: 1.6.0
> Reporter: Yin Huai
>
> When creating a converter for an array, Parquet Avro uses "array" as the
> element field name ([see
> here|https://github.com/apache/incubator-parquet-mr/blob/parquet-1.6.0rc7/parquet-avro/src/main/java/parquet/avro/AvroSchemaConverter.java#L131]),
> but Parquet Hive SerDe uses "array_element" as the element field name [see
> here|https://github.com/apache/incubator-parquet-mr/blob/parquet-1.6.0rc7/parquet-hive/parquet-hive-storage-handler/src/main/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveSchemaConverter.java#L109].
> Spark SQL's native Parquet support follows Parquet Avro's convention, so for
> data generated by Parquet Hive SerDe the array value cannot be read
> correctly and null is returned instead.
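
The mismatch described above can be illustrated with a minimal sketch. This is a simplified, hypothetical model and not the parquet-mr or Spark SQL code: a list column is represented as a dict keyed by the element field name, and write_list / read_list are made-up helpers. Only the two names "array" and "array_element" come from the report.

```python
# Simplified, hypothetical model: a list column is a dict keyed by the
# element field name. Real Parquet schemas are more involved than this.
AVRO_ELEMENT_NAME = "array"          # name used by Parquet Avro (per the report)
HIVE_ELEMENT_NAME = "array_element"  # name used by Parquet Hive SerDe (per the report)


def write_list(values, element_name):
    """Store the list values under the writer's chosen element field name."""
    return {element_name: list(values)}


def read_list(column, expected_element_name=AVRO_ELEMENT_NAME):
    """Resolve the element strictly by name; a mismatch yields None."""
    return column.get(expected_element_name)


avro_written = write_list([1, 2, 3], AVRO_ELEMENT_NAME)
hive_written = write_list([1, 2, 3], HIVE_ELEMENT_NAME)

print(read_list(avro_written))  # the reader finds the values
print(read_list(hive_written))  # the reader finds nothing under "array"
```

A reader that follows only one naming convention silently loses the data written under the other, which matches the null result reported here.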