[ 
https://issues.apache.org/jira/browse/HIVE-352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12689796#action_12689796
 ] 

He Yongqiang commented on HIVE-352:
-----------------------------------

Also, tuple reconstruction accounts for a large proportion of the whole 
execution time. In our initial experiments, the reconstruction cost was much 
higher than the benefit of integrating column-oriented execution with the 
underlying column storage. The reconstruction is a Map-Reduce join operation. 
Its cost can be greatly reduced in some queries when we can reduce the number 
of tuples that need to be reconstructed. The key to this is late materialization.
The current B2.2, however, localizes rows in a single file and adopts 
record-level columnar storage, so it does not have the tuple reconstruction cost. 
But it needs more specific and more flexible compression algorithms, and I 
strongly recommend supporting bitmap files in the future. To get the main 
benefit of a columnar strategy, we will also need to add some columnar 
operators next.
But for now let us take the first step, and then add more optimizations.
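
To illustrate why late materialization reduces the number of tuples to 
reconstruct, here is a minimal sketch in Python. It is not Hive code and not 
a Map-Reduce join: the in-memory column lists and the function names 
(early_materialization, late_materialization) are hypothetical and only show 
the idea of filtering on one column first and rebuilding tuples afterwards.

```python
# Hypothetical column store: one list per column, aligned by row position.
columns = {
    "id": [1, 2, 3, 4, 5],
    "price": [10, 250, 40, 900, 75],
    "name": ["a", "b", "c", "d", "e"],
}


def early_materialization(columns, predicate_col, predicate, wanted):
    """Reconstruct every tuple first, then apply the predicate."""
    n = len(columns[predicate_col])
    needed = list(dict.fromkeys(list(wanted) + [predicate_col]))
    # Every row is rebuilt, even rows the predicate will discard.
    rows = [{c: columns[c][i] for c in needed} for i in range(n)]
    result = [
        {c: row[c] for c in wanted}
        for row in rows
        if predicate(row[predicate_col])
    ]
    return result, len(rows)


def late_materialization(columns, predicate_col, predicate, wanted):
    """Scan only the predicate column, then rebuild the surviving rows."""
    positions = [
        i for i, value in enumerate(columns[predicate_col]) if predicate(value)
    ]
    # Only the qualifying positions are reconstructed into tuples.
    result = [{c: columns[c][i] for c in wanted} for i in positions]
    return result, len(positions)


def expensive(price):
    return price > 100


early_rows, early_count = early_materialization(
    columns, "price", expensive, ["id", "name"]
)
late_rows, late_count = late_materialization(
    columns, "price", expensive, ["id", "name"]
)
print(early_rows == late_rows, early_count, late_count)
```

Both functions return the same rows, but the second one reconstructs only the 
tuples that pass the filter, which is where the saving comes from when the 
predicate is selective.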

> Make Hive support column based storage
> --------------------------------------
>
>                 Key: HIVE-352
>                 URL: https://issues.apache.org/jira/browse/HIVE-352
>             Project: Hadoop Hive
>          Issue Type: New Feature
>            Reporter: He Yongqiang
>
> Column based storage has been proven to be a better storage layout for OLAP. 
> Hive does a great job on raw row oriented storage. In this issue, we will 
> enhance Hive to support column based storage. 
> Actually, we have done some work on column based storage on top of HDFS; I 
> think it will need some review and refactoring to port it to Hive.
> Any thoughts?

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
