[ https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689151#action_12689151 ]
He Yongqiang commented on HIVE-352:
-----------------------------------

One problem with this RCFile is that it needs to know the needed columns in advance, so that it can skip unneeded columns and avoid decompressing them. I took a look at Hive's operators and SerDe; it seems that they all take a whole row object as input and do not know which columns are needed before processing. For example, LazyStruct and StructObjectInspector only learn which column is needed when getField/getStructFieldData is invoked by the operators' evaluators (such as ExprNodeColumnEvaluator).

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>
> Column based storage has been proven to be a better storage layout for OLAP.
> Hive does a great job on raw row oriented storage. In this issue, we will
> enhance Hive to support column based storage.
> Actually, we have done some work on column based storage on top of HDFS; I
> think it will need some review and refactoring to port it to Hive.
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
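
The trade-off raised in the comment can be sketched roughly as follows. This is only an illustration, not Hive or RCFile code: `LazyColumnSketch`, `RowGroup`, `decompressUpFront`, and this `getField` are hypothetical names, and "decompression" is simulated by copying an array. It contrasts a reader that is told the needed columns in advance with one that decompresses a column only when a field is first requested, which is the point at which an evaluator reveals what it needs.

```java
import java.util.HashSet;
import java.util.Set;

public class LazyColumnSketch {

    /** Hypothetical row group: one "compressed" buffer per column. */
    static final class RowGroup {
        private final String[][] stored;   // stored[col][row]; stands in for compressed bytes
        private final String[][] decoded;  // decoded[col] is filled only once that column is decompressed
        final Set<Integer> decompressed = new HashSet<>();

        RowGroup(String[][] stored) {
            this.stored = stored;
            this.decoded = new String[stored.length][];
        }

        /** Strategy 1: the reader is told the needed columns before any row is read. */
        void decompressUpFront(int... neededColumns) {
            for (int col : neededColumns) {
                decompress(col);
            }
        }

        /** Strategy 2: decompress a column lazily, on the first field access. */
        String getField(int row, int col) {
            if (decoded[col] == null) {
                decompress(col);
            }
            return decoded[col][row];
        }

        private void decompress(int col) {
            decoded[col] = stored[col].clone(); // placeholder for real decompression
            decompressed.add(col);
        }
    }

    public static void main(String[] args) {
        String[][] columns = {
            {"1", "2"},        // column 0: id
            {"alice", "bob"},  // column 1: name
            {"x", "y"}         // column 2: never referenced by the query
        };
        RowGroup group = new RowGroup(columns);

        // Simulates an evaluator for "SELECT name": only column 1 is ever requested.
        for (int row = 0; row < 2; row++) {
            System.out.println(group.getField(row, 1));
        }
        System.out.println("decompressed columns: " + group.decompressed);
    }
}
```

In this sketch the lazy path touches only column 1 without being told the projection beforehand, while `decompressUpFront` models a reader that must receive the column list before processing starts. Whether either approach fits Hive's operator and SerDe interfaces is exactly the open question in the comment.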