[ https://issues.apache.org/jira/browse/HIVE-15987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874144#comment-15874144 ]
Gopal V commented on HIVE-15987:
--------------------------------

-1 for the Hive-2.x branch storage-api implementation. We should consider this for the Hive-3.0 branch instead, since it breaks external interfaces to ORC and third-party vectorized UDFs.

> Replace ColumnVector.isNull boolean[] impl. with BitSet
> -------------------------------------------------------
>
>                 Key: HIVE-15987
>                 URL: https://issues.apache.org/jira/browse/HIVE-15987
>             Project: Hive
>          Issue Type: Improvement
>          Components: Vectorization
>            Reporter: Teddy Choi
>            Assignee: Teddy Choi
>              Labels: incompatibleChange
>
> Most data operations in Hive involve null checks. The current
> implementation of ColumnVector.isNull uses a boolean array, which uses 8 bits
> per boolean. BitSet is a more compact representation, as it uses 1 bit per
> boolean with a backing long array. Logical operations between longs are
> also much faster than the same operations on bytes, because they need fewer
> instructions per null flag. So it could speed up null operations by 8x or more.
> However, there are also several cases that this change will make slower.
> For example, simple reads will require more instructions per row. So the
> change should include benchmark tests to show its performance impact.
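
To make the trade-off in the description concrete, here is a minimal Java sketch. It is not the Hive storage-api code: the class NullMaskSketch and its methods are hypothetical names introduced for illustration, and only java.util.BitSet is a real API. It combines two null masks once with a per-row loop over boolean[] and once with BitSet.or, which works on the backing long words, then does a single-row read, the case the description expects to get slower.

```java
import java.util.BitSet;

// Hypothetical illustration only; not part of Hive or its storage-api.
class NullMaskSketch {

    // boolean[] mask: one array element per row, visited one row at a time.
    static int countNulls(boolean[] isNull, int size) {
        int count = 0;
        for (int i = 0; i < size; i++) {
            if (isNull[i]) {
                count++;
            }
        }
        return count;
    }

    // Row-by-row OR: the result row is null if either input row is null.
    static boolean[] orMasks(boolean[] a, boolean[] b, int size) {
        boolean[] out = new boolean[size];
        for (int i = 0; i < size; i++) {
            out[i] = a[i] || b[i];
        }
        return out;
    }

    // BitSet OR: each backing long word covers 64 rows.
    static BitSet orMasks(BitSet a, BitSet b) {
        BitSet out = (BitSet) a.clone();
        out.or(b);
        return out;
    }

    public static void main(String[] args) {
        int size = 1024; // assumed batch size for this sketch

        boolean[] a = new boolean[size];
        boolean[] b = new boolean[size];
        BitSet bitsA = new BitSet(size);
        BitSet bitsB = new BitSet(size);

        // Mark the same rows as null in both representations.
        a[3] = true;
        bitsA.set(3);
        b[3] = true;
        bitsB.set(3);
        b[700] = true;
        bitsB.set(700);

        boolean[] arrayResult = orMasks(a, b, size);
        BitSet bitResult = orMasks(bitsA, bitsB);

        System.out.println("boolean[] nulls: " + countNulls(arrayResult, size));
        System.out.println("BitSet nulls:    " + bitResult.cardinality());

        // Single-row read: boolean[] is a plain array load, while BitSet.get
        // has to locate the word, then shift and mask to extract the bit.
        System.out.println("row 700 null?    " + bitResult.get(700));
    }
}
```

This only shows the two access patterns side by side. Measuring the actual performance impact would still need proper benchmark tests, as the description asks for.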