[
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365222#comment-16365222
]
Teddy Choi edited comment on HIVE-18622 at 2/15/18 8:11 AM:
------------------------------------------------------------
I tested HIVE-18622.098. Two files may need to be changed:
vector_decimal_math_funcs.q.out and vectorization_part_project.q.out. I
attached the changes I found. Please check them. Thanks.
{{diff --git
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{index b1c25c4180..526aa16445 100644}}
{{--- ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{+++ ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{@@ -326,7 +326,7 @@ from decimal_test) as q}}
{{ POSTHOOK: type: QUERY}}
{{ POSTHOOK: Input: default@decimal_test}}
{{ #### A masked pattern was here ####}}
{{--1988477779865}}
{{+-1989051768985}}
{{ PREHOOK: query: CREATE TABLE decimal_test_small STORED AS ORC AS SELECT
cbigint, cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(12,4)) AS cdecimal1,
CAST (((cdouble*9.3)/13) AS DECIMAL(14,8)) AS cdecimal2 FROM alltypesorc}}
{{ PREHOOK: type: CREATETABLE_AS_SELECT}}
{{ PREHOOK: Input: default@alltypesorc}}
{{diff --git
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{index 130e1376ce..e46c7f4524 100644}}
{{--- ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{+++ ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{@@ -70,15 +70,15 @@ STAGE PLANS:}}
{{ Map Operator Tree:}}
{{ TableScan}}
{{ alias: alltypesorc_part}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1592 Basic stats: COMPLETE Column
stats: COMPLETE}}
{{ Select Operator}}
{{ expressions: (cdouble + 2.0) (type: double)}}
{{ outputColumnNames: _col0}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: COMPLETE}}
{{ Reduce Output Operator}}
{{ key expressions: _col0 (type: double)}}
{{ sort order: +}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: COMPLETE}}
{{ TopN Hash Memory Usage: 0.1}}
{{ Execution mode: vectorized, llap}}
{{ LLAP IO: all inputs}}
{{@@ -103,13 +103,13 @@ STAGE PLANS:}}
{{ Select Operator}}
{{ expressions: KEY.reducesinkkey0 (type: double)}}
{{ outputColumnNames: _col0}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column
stats: COMPLETE}}
{{ Limit}}
{{ Number of rows: 10}}
{{- Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats:
PARTIAL}}
{{+ Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats:
COMPLETE}}
{{ File Output Operator}}
{{ compressed: false}}
{{- Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats:
PARTIAL}}
{{+ Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats:
COMPLETE}}
{{ table:}}
{{ input format: org.apache.hadoop.mapred.SequenceFileInputFormat}}
{{ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat}}
> Vectorization: IF Statements, Comparisons, and more do not handle NULLs
> correctly
> ---------------------------------------------------------------------------------
>
> Key: HIVE-18622
> URL: https://issues.apache.org/jira/browse/HIVE-18622
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 3.0.0
>
> Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch,
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch,
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch,
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch,
> HIVE-18622.095.patch, HIVE-18622.096.patch, HIVE-18622.097.patch,
> HIVE-18622.098.patch
>
>
>
> Many vector expression classes set noNulls to true, which does not work if
> the output column is a VRB scratch column being reused. The previous use may
> have set noNulls to false, leaving some rows in the isNull array marked as NULL.
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
> // Carefully handle NULLs...
> /*
> * For better performance on LONG/DOUBLE we don't want the conditional
> * statements inside the for loop.
> */
> outputColVector.noNulls = false;
> {code}
> And, vector expressions need to make sure the isNull array entry is set when
> outputColVector.noNulls is false.
> And, all places that assign a column value need to set noNulls to false when
> the value is NULL.
> Almost all cases where noNulls is set to true are incorrect.
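
The scratch-column problem described above can be illustrated with a minimal, self-contained sketch. It does not use Hive's classes: SketchColumn is a hypothetical stand-in for a column vector, and divideBuggy/divideFixed are made-up expressions that return NULL on a zero divisor. The only assumption carried over from Hive is that a reader treats a row as NULL when noNulls is false and isNull[i] is true.

```java
final class ScratchReuseSketch {

    /** Buggy pattern: declares noNulls up front without clearing isNull. */
    static void divideBuggy(long[] a, long[] b, SketchColumn out) {
        out.noNulls = true; // stale isNull entries are hidden, not cleared
        for (int i = 0; i < a.length; i++) {
            if (b[i] == 0) {
                out.isNull[i] = true;
                out.noNulls = false; // stale entries become visible again
            } else {
                out.vector[i] = a[i] / b[i];
            }
        }
    }

    /** Safer pattern: never force noNulls to true; write isNull for every row. */
    static void divideFixed(long[] a, long[] b, SketchColumn out) {
        for (int i = 0; i < a.length; i++) {
            if (b[i] == 0) {
                out.isNull[i] = true;
                out.noNulls = false;
            } else {
                out.isNull[i] = false; // overwrite whatever the last use left
                out.vector[i] = a[i] / b[i];
            }
        }
    }

    /** Simulates a scratch column whose previous use marked row 0 as NULL. */
    static SketchColumn reusedScratch() {
        SketchColumn c = new SketchColumn(3);
        c.isNull[0] = true;
        c.noNulls = false;
        return c;
    }

    public static void main(String[] args) {
        long[] a = {10, 20, 30};
        long[] b = {2, 5, 0}; // row 2 divides by zero, so it becomes NULL

        SketchColumn buggy = reusedScratch();
        divideBuggy(a, b, buggy);
        SketchColumn fixed = reusedScratch();
        divideFixed(a, b, fixed);

        System.out.println("buggy row 0 is NULL: " + buggy.isNullAt(0));
        System.out.println("fixed row 0 is NULL: " + fixed.isNullAt(0));
    }
}

/** Hypothetical stand-in for a long column vector; not Hive's real class. */
final class SketchColumn {
    final long[] vector;
    final boolean[] isNull;
    boolean noNulls = true;

    SketchColumn(int size) {
        vector = new long[size];
        isNull = new boolean[size];
    }

    /** A row counts as NULL only when noNulls is false and its flag is set. */
    boolean isNullAt(int i) {
        return !noNulls && isNull[i];
    }
}
```

In the buggy variant, row 0 holds a valid quotient but is still reported as NULL once row 2 flips noNulls back to false, which is the kind of wrong result the description mentions. The fixed variant trades one extra store per row for an isNull array that is always accurate.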