[ 
https://issues.apache.org/jira/browse/PARQUET-47?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14242204#comment-14242204
 ] 

Ashish Kumar Singh commented on PARQUET-47:
-------------------------------------------

[~julienledem] could you assign this to me?

> SERDE backed schema for parquet storage in Hive
> -----------------------------------------------
>
>                 Key: PARQUET-47
>                 URL: https://issues.apache.org/jira/browse/PARQUET-47
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>            Reporter: Abhishek Agarwal
>
> Currently, for a Hive table stored as Parquet, the schema can only be 
> specified in the Hive MetaStore. For our use case, we want the schema to 
> be provided by a Thrift SerDe rather than the MetaStore. Using Thrift IDL 
> as the schema provider lets us maintain a consistent schema across 
> execution engines other than Hive, such as Pig and native MR. 
> Additionally, for a large sparse schema, it is much easier to build Thrift 
> objects and use parquet-thrift/elephant-bird to convert them into 
> columns/tuples than to construct the whole large tuple directly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
