Amruth S created HIVE-14741:
-------------------------------

             Summary: Incorrect results on boolean col when vectorization is ON
                 Key: HIVE-14741
                 URL: https://issues.apache.org/jira/browse/HIVE-14741
             Project: Hive
          Issue Type: Bug
    Affects Versions: 2.1.0, 2.0.0
            Reporter: Amruth S

I have attached the ORC part file on which the issue manifests. It has just one boolean column (a lot of nulls and 231 true values, verified using the ORC file dump utility).

1) Create an external table on the attached part file:

   CREATE EXTERNAL TABLE bool_vect_issue (
     `bool_col` BOOLEAN)
   ROW FORMAT SERDE 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
   STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
   OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
   LOCATION '<loc to which the part file is copied>';

2) With vectorization enabled:

   set hive.vectorized.execution.enabled = true;
   select sum(if((bool_col) , 1, 0)) from bool_vect_issue;

   gives 708206

3) With vectorization disabled:

   set hive.vectorized.execution.enabled = false;
   select sum(if((bool_col) , 1, 0)) from bool_vect_issue;

   gives 231, which matches the count of true values from the file dump.

The issue seems to have the same impact as HIVE-12435.
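
A possible cross-check, offered only as a sketch and not part of the original report: the query below breaks the rows of bool_vect_issue down into null, true, and false counts. The column aliases (total_rows, null_rows, true_rows, false_rows) are made up for illustration. Running it once with vectorization disabled and once with it enabled, then comparing the two results, might help show whether the inflated sum is related to how null rows are handled. That is an assumption to be tested, not a confirmed diagnosis, and the CASE expressions may themselves be affected when vectorization is on.

```sql
-- Baseline run with vectorization off; repeat with the setting
-- changed to true and compare the four counts.
set hive.vectorized.execution.enabled = false;

select
  count(*)                                              as total_rows,  -- all rows, nulls included
  sum(case when bool_col is null then 1 else 0 end)     as null_rows,   -- rows with no value
  sum(case when bool_col = true  then 1 else 0 end)     as true_rows,   -- 231 per the ORC file dump
  sum(case when bool_col = false then 1 else 0 end)     as false_rows   -- explicit false values
from bool_vect_issue;
```

If the three partial counts do not add up to total_rows in the vectorized run, or true_rows differs between the two runs, that would narrow down which values are being miscounted.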