[ https://issues.apache.org/jira/browse/PIG-833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12716470#action_12716470 ]
Hong Tang commented on PIG-833:
-------------------------------

Jeff, just like the SQL effort, the space of columnar storage is also wide open, and I think it is more beneficial to the overall health of the Hadoop ecosystem.

That said, I also looked at the patch attached to HIVE-352. It appears that what the patch does is a level below our stated objectives. Specifically, the guts of the implementation (RCFile) are very close in spirit to TFile as described in HADOOP-3315, which seems to have had its first comprehensive patch back in December 2008.

> Storage access layer
> --------------------
>
>                 Key: PIG-833
>                 URL: https://issues.apache.org/jira/browse/PIG-833
>             Project: Pig
>          Issue Type: New Feature
>            Reporter: Jay Tang
>
> A layer is needed to provide a high level data access abstraction and a
> tabular view of data in Hadoop, and could free Pig users from implementing
> their own data storage/retrieval code. This layer should also include a
> columnar storage format in order to provide fast data projection,
> CPU/space-efficient data serialization, and a schema language to manage
> physical storage metadata. Eventually it could also support predicate
> pushdown for further performance improvement. Initially, this layer could be
> a contrib project in Pig and become a hadoop subproject later on.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.