[ 
https://issues.apache.org/jira/browse/HIVE-18622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16365222#comment-16365222
 ] 

Teddy Choi edited comment on HIVE-18622 at 2/15/18 8:11 AM:
------------------------------------------------------------

I tested HIVE-18622.098. Two files may need to be changed: 
vector_decimal_math_funcs.q.out and vectorization_part_project.q.out. I 
attached the changes I found. Please check it. Thanks.
{noformat}
diff --git 
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out 
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out
index b1c25c4180..526aa16445 100644
--- ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out
+++ ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out
@@ -326,7 +326,7 @@ from decimal_test) as q
 POSTHOOK: type: QUERY
 POSTHOOK: Input: default@decimal_test
 #### A masked pattern was here ####
--1988477779865
+-1989051768985
 PREHOOK: query: CREATE TABLE decimal_test_small STORED AS ORC AS SELECT 
cbigint, cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(12,4)) AS cdecimal1, 
CAST (((cdouble*9.3)/13) AS DECIMAL(14,8)) AS cdecimal2 FROM alltypesorc
 PREHOOK: type: CREATETABLE_AS_SELECT
 PREHOOK: Input: default@alltypesorc
diff --git 
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out 
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out
index 130e1376ce..e46c7f4524 100644
--- ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out
+++ ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out
@@ -70,15 +70,15 @@ STAGE PLANS:
             Map Operator Tree:
                 TableScan
                   alias: alltypesorc_part
-                  Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: PARTIAL
+                  Statistics: Num rows: 200 Data size: 1592 Basic stats: 
COMPLETE Column stats: COMPLETE
                   Select Operator
                     expressions: (cdouble + 2.0) (type: double)
                     outputColumnNames: _col0
-                    Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: PARTIAL
+                    Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: COMPLETE
                     Reduce Output Operator
                       key expressions: _col0 (type: double)
                       sort order: +
-                      Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: PARTIAL
+                      Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: COMPLETE
                       TopN Hash Memory Usage: 0.1
             Execution mode: vectorized, llap
             LLAP IO: all inputs
@@ -103,13 +103,13 @@ STAGE PLANS:
               Select Operator
                 expressions: KEY.reducesinkkey0 (type: double)
                 outputColumnNames: _col0
-                Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: PARTIAL
+                Statistics: Num rows: 200 Data size: 1600 Basic stats: 
COMPLETE Column stats: COMPLETE
                 Limit
                   Number of rows: 10
-                  Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE 
Column stats: PARTIAL
+                  Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE 
Column stats: COMPLETE
                   File Output Operator
                     compressed: false
-                    Statistics: Num rows: 10 Data size: 80 Basic stats: 
COMPLETE Column stats: PARTIAL
+                    Statistics: Num rows: 10 Data size: 80 Basic stats: 
COMPLETE Column stats: COMPLETE
                     table:
                         input format: 
org.apache.hadoop.mapred.SequenceFileInputFormat
                         output format: 
org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
{noformat}
 


was (Author: teddy.choi):
I tested HIVE-18622.098. Two files may need to be changed: 
vector_decimal_math_funcs.q.out and vectorization_part_project.q.out. I 
attached the changes I found. Please check it. Thanks.

{{diff --git 
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out 
ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{index b1c25c4180..526aa16445 100644}}
{{--- ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{+++ ql/src/test/results/clientpositive/llap/vector_decimal_math_funcs.q.out}}
{{@@ -326,7 +326,7 @@ from decimal_test) as q}}
{{ POSTHOOK: type: QUERY}}
{{ POSTHOOK: Input: default@decimal_test}}
{{ #### A masked pattern was here ####}}
{{--1988477779865}}
{{+-1989051768985}}
{{ PREHOOK: query: CREATE TABLE decimal_test_small STORED AS ORC AS SELECT 
cbigint, cdouble, CAST (((cdouble*22.1)/37) AS DECIMAL(12,4)) AS cdecimal1, 
CAST (((cdouble*9.3)/13) AS DECIMAL(14,8)) AS cdecimal2 FROM alltypesorc}}
{{ PREHOOK: type: CREATETABLE_AS_SELECT}}
{{ PREHOOK: Input: default@alltypesorc}}
{{diff --git 
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out 
ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{index 130e1376ce..e46c7f4524 100644}}
{{--- ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{+++ ql/src/test/results/clientpositive/llap/vectorization_part_project.q.out}}
{{@@ -70,15 +70,15 @@ STAGE PLANS:}}
{{ Map Operator Tree:}}
{{ TableScan}}
{{ alias: alltypesorc_part}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1592 Basic stats: COMPLETE Column 
stats: COMPLETE}}
{{ Select Operator}}
{{ expressions: (cdouble + 2.0) (type: double)}}
{{ outputColumnNames: _col0}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: COMPLETE}}
{{ Reduce Output Operator}}
{{ key expressions: _col0 (type: double)}}
{{ sort order: +}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: COMPLETE}}
{{ TopN Hash Memory Usage: 0.1}}
{{ Execution mode: vectorized, llap}}
{{ LLAP IO: all inputs}}
{{@@ -103,13 +103,13 @@ STAGE PLANS:}}
{{ Select Operator}}
{{ expressions: KEY.reducesinkkey0 (type: double)}}
{{ outputColumnNames: _col0}}
{{- Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: PARTIAL}}
{{+ Statistics: Num rows: 200 Data size: 1600 Basic stats: COMPLETE Column 
stats: COMPLETE}}
{{ Limit}}
{{ Number of rows: 10}}
{{- Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: 
PARTIAL}}
{{+ Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: 
COMPLETE}}
{{ File Output Operator}}
{{ compressed: false}}
{{- Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: 
PARTIAL}}
{{+ Statistics: Num rows: 10 Data size: 80 Basic stats: COMPLETE Column stats: 
COMPLETE}}
{{ table:}}
{{ input format: org.apache.hadoop.mapred.SequenceFileInputFormat}}
{{ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat}}

> Vectorization: IF Statements, Comparisons, and more do not handle NULLs 
> correctly
> ---------------------------------------------------------------------------------
>
>                 Key: HIVE-18622
>                 URL: https://issues.apache.org/jira/browse/HIVE-18622
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 3.0.0
>
>         Attachments: HIVE-18622.03.patch, HIVE-18622.04.patch, 
> HIVE-18622.05.patch, HIVE-18622.06.patch, HIVE-18622.07.patch, 
> HIVE-18622.08.patch, HIVE-18622.09.patch, HIVE-18622.091.patch, 
> HIVE-18622.092.patch, HIVE-18622.093.patch, HIVE-18622.094.patch, 
> HIVE-18622.095.patch, HIVE-18622.096.patch, HIVE-18622.097.patch, 
> HIVE-18622.098.patch
>
>
>  
>  Many vector expression classes are setting noNulls to true which does not 
> work if the VRB is a scratch column being reused. The previous use may have 
> set noNulls to false and the isNull array will have some rows marked as NULL. 
> The result is wrong query results and sometimes NPEs (for BytesColumnVector).
> So, many vector expressions need this:
> {code:java}
>       // Carefully handle NULLs...
>       /*
>        * For better performance on LONG/DOUBLE we don't want the conditional
>        * statements inside the for loop.
>        */
>       outputColVector.noNulls = false;
>  {code}
> And, vector expressions need to make sure the isNull array entry is set when 
> outputColVector.noNulls is false.
> And, all place that assign column value need to set noNulls to false when the 
> value is NULL.
> Almost all cases where noNulls is set to true are incorrect.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to