[ 
https://issues.apache.org/jira/browse/HIVE-18421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16352725#comment-16352725
 ] 

Vihang Karajgaonkar commented on HIVE-18421:
--------------------------------------------

Copy-pasting the test result from the precommit job since it's not posting it 
to the Jira for some reason.

 [Test 
Result|https://builds.apache.org/job/PreCommit-HIVE-Build/9014/testReport/] 
(312 failures / +285)

Most of these test failures are due to explain plan differences, since 
{{hive.vectorized.use.checked.expressions}} is set to true by default in the 
patch. We can either update all these q.out files or change the default value 
of this config to false.

The TestColumnScalarOperationVectorExpressionEvaluation tests are failing 
because they do not set outputTypeInfo in the test case. This would also be 
resolved if we set outputTypeInfo appropriately in the test template. I 
think I will change the patch so that 
{{OverflowUtils.accountForOverflow[Double|Long]}} does not throw an exception 
if outputTypeInfo is null, since based on the tests it doesn't look like it's 
always going to be set for every expression. [~mmccline] do you 
know if this understanding is correct?
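
To make the proposed behaviour concrete, here is a minimal sketch of a null-tolerant overflow adjustment. It is not the code from the patch: the class name OverflowSketch, the method signature, and the use of a plain type-name String in place of Hive's TypeInfo are all assumptions made for illustration only.

```java
public class OverflowSketch {
    /**
     * Hypothetical stand-in for the overflow adjustment discussed above.
     * outputTypeName plays the role of outputTypeInfo (assumption: the real
     * code uses a Hive TypeInfo object, not a String).
     */
    public static long accountForOverflowLong(String outputTypeName, long value) {
        if (outputTypeName == null) {
            // No output type was set (as in the failing tests): leave the
            // value untouched instead of throwing.
            return value;
        }
        switch (outputTypeName) {
            case "tinyint":
                return (byte) value;   // wrap into the 8-bit range
            case "smallint":
                return (short) value;  // wrap into the 16-bit range
            case "int":
                return (int) value;    // wrap into the 32-bit range
            default:
                return value;          // bigint and anything else: unchanged
        }
    }

    public static void main(String[] args) {
        System.out.println(accountForOverflowLong("tinyint", -129L)); // prints 127
        System.out.println(accountForOverflowLong(null, -129L));      // prints -129
    }
}
```

With a guard like this, a test that never sets the output type would simply see the unadjusted value rather than an exception.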

I will update the patch, which should fix most of these test failures.

> Vectorized execution handles overflows in a different manner than 
> non-vectorized execution
> ------------------------------------------------------------------------------------------
>
>                 Key: HIVE-18421
>                 URL: https://issues.apache.org/jira/browse/HIVE-18421
>             Project: Hive
>          Issue Type: Bug
>          Components: Vectorization
>    Affects Versions: 2.1.1, 2.2.0, 3.0.0, 2.3.2
>            Reporter: Vihang Karajgaonkar
>            Assignee: Vihang Karajgaonkar
>            Priority: Major
>         Attachments: HIVE-18421.01.patch, HIVE-18421.02.patch
>
>
> In vectorized execution, arithmetic operations that cause integer overflows 
> can give wrong results. The issue is reproducible in both ORC and Parquet.
> Simple test case to reproduce this issue:
> {noformat}
> set hive.vectorized.execution.enabled=true;
> create table parquettable (t1 tinyint, t2 tinyint) stored as parquet;
> insert into parquettable values (-104, 25), (-112, 24), (54, 9);
> select t1, t2, (t1-t2) as diff from parquettable where (t1-t2) < 50 order by 
> diff desc;
> +-------+-----+-------+
> |  t1   | t2  | diff  |
> +-------+-----+-------+
> | -104  | 25  | 127   |
> | -112  | 24  | 120   |
> | 54    | 9   | 45    |
> +-------+-----+-------+
> {noformat}
> When vectorization is turned off, the same query produces only one row.
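
The arithmetic behind the three rows above can be reproduced in plain Java. This is only a sketch of one plausible reading of the discrepancy, assuming the vectorized path evaluates the subtraction in 64-bit longs while the row-mode path wraps to the tinyint range before the filter is applied. The class and method names are hypothetical and are not part of Hive.

```java
public class TinyintOverflowDemo {
    /** Subtracts in 64-bit arithmetic (assumed behaviour of a long-based column vector). */
    public static long subtractWide(byte a, byte b) {
        return (long) a - (long) b;
    }

    /** Subtracts and wraps the result back into the tinyint (byte) range. */
    public static byte subtractNarrow(byte a, byte b) {
        return (byte) (a - b);
    }

    public static void main(String[] args) {
        // Same values as the rows inserted into parquettable.
        byte[][] rows = { {-104, 25}, {-112, 24}, {54, 9} };
        for (byte[] r : rows) {
            long wide = subtractWide(r[0], r[1]);
            byte narrow = subtractNarrow(r[0], r[1]);
            System.out.println("t1=" + r[0] + " t2=" + r[1]
                + " wide=" + wide + " passes(<50)=" + (wide < 50)
                + " narrow=" + narrow + " passes(<50)=" + (narrow < 50));
        }
    }
}
```

The wide results are -129, -136, and 45, so all three rows pass the `< 50` filter. The wrapped results are 127, 120, and 45, so only the last row passes, which matches the single row reported with vectorization turned off.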



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
