[ https://issues.apache.org/jira/browse/HIVE-8205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14204434#comment-14204434 ]
Lefty Leverenz commented on HIVE-8205:
--------------------------------------

Does this need documentation? The Parquet wikidoc has examples with strings in map, array, and struct for Hive 0.10 - 0.12 as well as Hive 0.13+.

* [Parquet -- HiveQL Syntax -- Hive 0.10 - 0.12 | https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.10-0.12]
* [Parquet -- HiveQL Syntax -- Hive 0.13 and later | https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Hive0.13andlater]
* [Parquet -- Limitations | https://cwiki.apache.org/confluence/display/Hive/Parquet#Parquet-Limitations]

> Using strings in group type fails in ParquetSerDe
> -------------------------------------------------
>
>                 Key: HIVE-8205
>                 URL: https://issues.apache.org/jira/browse/HIVE-8205
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>            Reporter: Mohit Sabharwal
>            Assignee: Mohit Sabharwal
>              Labels: parquet
>             Fix For: 0.14.0
>
>         Attachments: HIVE-8205.1.patch, HIVE-8205.1.patch, HIVE-8205.patch
>
>
> In HIVE-7735, schema info was plumbed to ETypeConverter to disambiguate
> between hive Char, Varchar and String types, which are all represented as
> PrimitiveType "binary" and OriginalType "utf8" in parquet.
> However, this does not work for parquet nested types (that map to hive Array,
> Map, etc.) containing these values, because schema lookup for nested values
> was not implemented. It's also non-trivial to do that in the current parquet
> serde implementation. Instead of plumbing in the schema, we should convert
> these types to the same Text writeable and let the object inspectors handle
> the final conversion.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
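
The approach proposed in the issue description (decode every string-like value to one common text form, then let the object inspector apply the declared type) can be sketched roughly as follows. This is a standalone illustration and not the code from the attached patches: `NestedStringSketch`, `HiveStringKind`, `toText`, and `inspect` are hypothetical names, no Hive or Parquet classes are used, and the truncate and pad rules are only a simplified model of varchar and char semantics.

```java
import java.nio.charset.StandardCharsets;

public class NestedStringSketch {

    // Hypothetical stand-in for the three Hive string-like types.
    enum HiveStringKind { STRING, VARCHAR, CHAR }

    // Converter side: Parquet stores all three kinds as UTF-8 bytes,
    // so they are decoded the same way and no schema lookup is needed.
    static String toText(byte[] utf8) {
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Inspector side: the declared type is known here, so the final
    // conversion happens at this point (simplified assumption:
    // varchar truncates, char truncates and then pads with spaces).
    static String inspect(String text, HiveStringKind kind, int maxLength) {
        switch (kind) {
            case VARCHAR:
                return truncate(text, maxLength);
            case CHAR:
                return pad(truncate(text, maxLength), maxLength);
            default:
                return text;
        }
    }

    // Cut the string to at most maxLength code points.
    static String truncate(String s, int maxLength) {
        if (s.codePointCount(0, s.length()) <= maxLength) {
            return s;
        }
        return s.substring(0, s.offsetByCodePoints(0, maxLength));
    }

    // Append spaces until the string has the requested number of code points.
    static String pad(String s, int length) {
        StringBuilder sb = new StringBuilder(s);
        for (int n = s.codePointCount(0, s.length()); n < length; n++) {
            sb.append(' ');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] raw = "hello".getBytes(StandardCharsets.UTF_8);
        String text = toText(raw);
        System.out.println("[" + inspect(text, HiveStringKind.STRING, 0) + "]");
        System.out.println("[" + inspect(text, HiveStringKind.VARCHAR, 3) + "]");
        System.out.println("[" + inspect(text, HiveStringKind.CHAR, 8) + "]");
    }
}
```

The point of the split is that `toText` never needs to know whether the value sits inside an array, map, or struct; only `inspect`, which already has the declared type, distinguishes the three kinds.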