[ 
https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716470#action_12716470
 ] 

Hong Tang commented on PIG-833:
-------------------------------

Jeff, just like the SQL effort, the space of columnar storage is also wide 
open, and I think it is more beneficial to the overall health of the hadoop 
ecosystem.

With that being said, I also looked at the patch attached to HIVE-352. It 
appears that what the patch does is a level below our stated objectives. 
Specifically, the guts of the implementation (RCFile) are very close in spirit 
to TFile as described in HADOOP-3315, which seems to have had its first 
comprehensive patch back in December 2008. 

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>
> A layer is needed to provide a high level data access abstraction and a 
> tabular view of data in Hadoop, and could free Pig users from implementing 
> their own data storage/retrieval code.  This layer should also include a 
> columnar storage format in order to provide fast data projection, 
> CPU/space-efficient data serialization, and a schema language to manage 
> physical storage metadata.  Eventually it could also support predicate 
> pushdown for further performance improvement.  Initially, this layer could be 
> a contrib project in Pig and become a hadoop subproject later on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
