[ 
https://issues.apache.org/jira/browse/HIVE-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874144#comment-15874144
 ] 

Gopal V commented on HIVE-15987:
--------------------------------

-1 for the Hive-2.x branch storage-api implementation. We can consider this for
the Hive-3.0 branch, since it breaks the external interfaces to ORC and
third-party vectorized UDFs.

> Replace ColumnVector.isNull boolean[] impl. with BitSet
> -------------------------------------------------------
>
>                 Key: HIVE-15987
>                 URL: https://issues.apache.org/jira/browse/HIVE-15987
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>              Labels: incompatibleChange
>
> Most data operations in Hive use null operations. The current
> implementation of ColumnVector.isNull uses a boolean array, which takes 8 bits
> per boolean. BitSet is a more compact representation, as it uses 1 bit per
> boolean with a backing long array. Logical operations on longs are also much
> faster than the same operations on bytes, as they need fewer instructions per
> byte of data. So it will bring an 8x or greater speedup for null operations.
> However, there are also several cases that this change will make slower. For
> example, simple reads will require more instructions per row. So the change
> should include benchmark tests to show its performance impact.
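
To make the trade-off in the description concrete, here is a minimal sketch in
plain Java. It is not Hive code: the class NullMaskSketch and its methods
orNulls, isNullAt and toBitSet are hypothetical names made up for illustration.
It only contrasts a boolean[] null mask with a java.util.BitSet one, for a bulk
operation (merging the null masks of two inputs) and for a single-row read.

```java
import java.util.BitSet;

// Hypothetical illustration only; none of these names come from Hive.
final class NullMaskSketch {

    // boolean[] style: the output row is null if either input row is null.
    // One array element is read and written per row.
    static boolean[] orNulls(boolean[] left, boolean[] right) {
        boolean[] out = new boolean[left.length];
        for (int i = 0; i < left.length; i++) {
            out[i] = left[i] || right[i];
        }
        return out;
    }

    // BitSet style: BitSet.or combines the backing long words,
    // so each word-level OR covers 64 rows.
    static BitSet orNulls(BitSet left, BitSet right) {
        BitSet out = (BitSet) left.clone();
        out.or(right);
        return out;
    }

    // Single-row read on a boolean[]: a plain array load.
    static boolean isNullAt(boolean[] isNull, int row) {
        return isNull[row];
    }

    // Single-row read on a BitSet: get() has to locate the word
    // and then mask out the bit for this row.
    static boolean isNullAt(BitSet isNull, int row) {
        return isNull.get(row);
    }

    // Helper to build a BitSet from boolean flags.
    static BitSet toBitSet(boolean[] flags) {
        BitSet bits = new BitSet(flags.length);
        for (int i = 0; i < flags.length; i++) {
            if (flags[i]) {
                bits.set(i);
            }
        }
        return bits;
    }

    public static void main(String[] args) {
        boolean[] left = {false, true, false, false};
        boolean[] right = {false, false, true, false};

        boolean[] merged = orNulls(left, right);
        BitSet mergedBits = orNulls(toBitSet(left), toBitSet(right));

        for (int row = 0; row < merged.length; row++) {
            System.out.println(row + ": " + isNullAt(merged, row)
                + " " + isNullAt(mergedBits, row));
        }
    }
}
```

Both variants compute the same mask. The sketch says nothing about actual
speed, which is what the requested benchmarks would have to show. It also
hints at the compatibility concern: any caller that indexes a boolean[] mask
directly would have to be rewritten to go through something like get().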



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
