[ https://issues.apache.org/jira/browse/PARQUET-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242204#comment-14242204 ]
Ashish Kumar Singh commented on PARQUET-47:
-------------------------------------------
[~julienledem] Could you assign this to me?
> SERDE backed schema for parquet storage in Hive
> -----------------------------------------------
>
> Key: PARQUET-47
> URL: https://issues.apache.org/jira/browse/PARQUET-47
> Project: Parquet
> Issue Type: Improvement
> Components: parquet-mr
> Reporter: Abhishek Agarwal
>
> As of now, for a Hive table stored as Parquet, the schema can only be
> specified in the Hive MetaStore. For our use case, we want the schema
> to be provided by the Thrift SerDe rather than the MetaStore. Using Thrift IDL as a
> schema provider allows us to maintain a consistent schema across execution
> engines other than Hive, such as Pig and native MR.
> Additionally, for a large sparse schema, it is much easier to build Thrift
> objects and use parquet-thrift/elephant-bird to convert them into
> columns/tuples than to construct the whole large tuple directly.