[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

Brock Noland (JIRA) Fri, 06 Dec 2013 11:56:02 -0800

    [ 
https://issues.apache.org/jira/browse/HIVE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841618#comment-13841618
 ]


Brock Noland commented on HIVE-5783:
------------------------------------

bq. Why does support need to be build directly into the semantic analyzer?

At present this is required to get STORED AS. 

bq. I think input format/serde's should be decoupled from the hive code as much 
as possible. hard codes like this make it hard to evolve support.

Yes, I agree. We should have some kind of registration system. I have created a 
jira for that HIVE-5976, but I don't see that as a blocker.

> Native Parquet Support in Hive
> ------------------------------
>
>                 Key: HIVE-5783
>                 URL: https://issues.apache.org/jira/browse/HIVE-5783
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Justin Coffey
>            Assignee: Justin Coffey
>            Priority: Minor
>             Fix For: 0.11.0
>
>         Attachments: hive-0.11-parquet.patch
>
>
> Problem Statement:
> Hive would be easier to use if it had native Parquet support. Our 
> organization, Criteo, uses Hive extensively. Therefore we built the Parquet 
> Hive integration and would like to now contribute that integration to Hive.
> About Parquet:
> Parquet is a columnar storage format for Hadoop and integrates with many 
> Hadoop ecosystem tools such as Thrift, Avro, Hadoop MapReduce, Cascading, 
> Pig, Drill, Crunch, and Hive. Pig, Crunch, and Drill all contain native 
> Parquet integration.
> Changes Details:
> Parquet was built with dependency management in mind and therefore only a 
> single Parquet jar will be added as a dependency.



--
This message was sent by Atlassian JIRA
(v6.1#6144)

[jira] [Commented] (HIVE-5783) Native Parquet Support in Hive

Reply via email to