[ 
https://issues.apache.org/jira/browse/HIVE-20174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16558991#comment-16558991
 ] 

Vihang Karajgaonkar commented on HIVE-20174:
--------------------------------------------

I see. The randomRowSource tests have been really great at flushing out these 
bugs. It is generally much harder to write queries that exercise such corner cases.

> Vectorization: Fix NULL / Wrong Results issues in GROUP BY Aggregation 
> Functions
> --------------------------------------------------------------------------------
>
>                 Key: HIVE-20174
>                 URL: https://issues.apache.org/jira/browse/HIVE-20174
>             Project: Hive
>          Issue Type: Bug
>          Components: Hive
>            Reporter: Matt McCline
>            Assignee: Matt McCline
>            Priority: Critical
>             Fix For: 4.0.0
>
>         Attachments: HIVE-20174.01.patch, HIVE-20174.02.patch, 
> HIVE-20174.03.patch, HIVE-20174.04.patch, HIVE-20174.05.patch
>
>
> Write new unit tests that use random data and intentional isRepeating batches 
> to check for NULL and Wrong Results for vectorized aggregation functions.
>  
> BUGs found:
> 1) AVG/VARIANCE (family) in PARTIAL1 mode was returning NULL instead of count 
> = 0, sum = 0 (All data types).  For AVG DECIMAL, only return NULL if there 
> was an overflow.
> 2) AVG/MIN/MAX was not detecting repeated NULL correctly for the TIMESTAMP, 
> INTERVAL_DAY_TIME, and String Family.  Eliminated redundant code.
> 3) Fix incorrect calculation for VARIANCE (family) in PARTIAL2 and FINAL 
> modes (HIVE-18758).
> 4) Fix row-mode AVG DECIMAL to enforce output type precision and scale in 
> COMPLETE and FINAL modes.
>  
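To illustrate the kind of check described above (random data plus intentional
isRepeating batches, compared against a row-mode result), here is a minimal
sketch in plain Java. It is not code from the patch: the LongColumn class is a
hypothetical, simplified stand-in for a vectorized column, and all class and
method names (RepeatingBatchSketch, toColumn, maxRowMode, maxVectorized,
randomRows, countMismatches) are assumptions made for this example. It covers
only MAX over a long column, including the repeated-NULL case from item 2.

```java
import java.util.Objects;
import java.util.Random;

public class RepeatingBatchSketch {

    /** Hypothetical, simplified column vector (an assumption, not Hive's class). */
    public static final class LongColumn {
        final long[] vector;
        final boolean[] isNull;
        boolean noNulls = true;
        boolean isRepeating = false;

        LongColumn(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    private static void setEntry(LongColumn col, int i, Long value) {
        if (value == null) {
            col.isNull[i] = true;
            col.noNulls = false;
        } else {
            col.vector[i] = value;
        }
    }

    /** Packs logical rows into a column; identical rows become one repeated entry. */
    public static LongColumn toColumn(Long[] rows) {
        LongColumn col = new LongColumn(rows.length);
        boolean allEqual = rows.length > 0;
        for (Long row : rows) {
            allEqual &= Objects.equals(row, rows[0]);
        }
        if (allEqual) {
            // Only entry 0 is populated; readers must not look past it.
            col.isRepeating = true;
            setEntry(col, 0, rows[0]);
            return col;
        }
        for (int i = 0; i < rows.length; i++) {
            setEntry(col, i, rows[i]);
        }
        return col;
    }

    /** Row-mode reference: MAX skips NULLs and is NULL when every row is NULL. */
    public static Long maxRowMode(Long[] rows) {
        Long max = null;
        for (Long row : rows) {
            if (row != null && (max == null || row > max)) {
                max = row;
            }
        }
        return max;
    }

    /** Vectorized MAX that checks isRepeating first, then the NULL flags. */
    public static Long maxVectorized(LongColumn col, int size) {
        if (size == 0) {
            return null;
        }
        if (col.isRepeating) {
            // A repeated NULL must yield NULL, not whatever sits in vector[0].
            if (!col.noNulls && col.isNull[0]) {
                return null;
            }
            return col.vector[0];
        }
        Long max = null;
        for (int i = 0; i < size; i++) {
            if (!col.noNulls && col.isNull[i]) {
                continue;
            }
            if (max == null || col.vector[i] > max) {
                max = col.vector[i];
            }
        }
        return max;
    }

    /** Random rows: mixed values and NULLs, one repeated value, or repeated NULL. */
    public static Long[] randomRows(Random random, int size) {
        Long[] rows = new Long[size];
        int kind = random.nextInt(3);
        Long repeated = kind == 1 ? Long.valueOf(random.nextInt(100)) : null;
        for (int i = 0; i < size; i++) {
            if (kind == 0) {
                rows[i] = random.nextInt(4) == 0 ? null : Long.valueOf(random.nextInt(100));
            } else {
                rows[i] = repeated;
            }
        }
        return rows;
    }

    /** Counts batches where the vectorized result differs from row mode. */
    public static int countMismatches(long seed, int batches) {
        Random random = new Random(seed);
        int mismatches = 0;
        for (int b = 0; b < batches; b++) {
            Long[] rows = randomRows(random, random.nextInt(8));
            Long expected = maxRowMode(rows);
            Long actual = maxVectorized(toColumn(rows), rows.length);
            if (!Objects.equals(expected, actual)) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches(42L, 1000));
    }
}
```

The generator deliberately produces repeated-value and repeated-NULL batches
about two thirds of the time, since those are the inputs that hand-written
queries rarely reach.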



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
