[ 
https://issues.apache.org/jira/browse/HIVE-5878?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13843882#comment-13843882
 ] 

Xuefu Zhang commented on HIVE-5878:
-----------------------------------

For reference, the following is the text from the SQL-92 standard [1] regarding
the AVG function:

{quote}
            c) If AVG is specified and DT is exact numeric, then the data
              type of the result is exact numeric with implementation-
              defined precision not less than the precision of DT and
              implementation-defined scale not less than the scale of DT.
{quote}

Hive currently deviates from this: it returns double for AVG over exact numeric
input. Both MySQL and SQL Server are in line with the standard.
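
To make the rule concrete, here is a rough sketch in Python of what clause (c) requires. It is not Hive code. The names ExactNumeric, avg_result_conforms and widened_avg_type are hypothetical, and the default of 4 extra digits is an assumption picked only because it reproduces the decimal(14,4) that MySQL reports for avg(i) in the issue description, with a 32-bit int modeled as precision 10, scale 0.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ExactNumeric:
    """Hypothetical model of an exact numeric type such as decimal(p, s)."""
    precision: int  # total number of decimal digits
    scale: int      # digits to the right of the decimal point


def avg_result_conforms(dt: ExactNumeric, result: Optional[ExactNumeric]) -> bool:
    """Check SQL-92 clause (c) for AVG over an exact numeric input type.

    `result` is None when the result type is not exact numeric (for
    example, double), which cannot satisfy the clause.
    """
    if result is None:
        return False
    return result.precision >= dt.precision and result.scale >= dt.scale


def widened_avg_type(dt: ExactNumeric, extra_digits: int = 4) -> ExactNumeric:
    """One possible implementation-defined choice: widen precision and scale.

    The default of 4 is an assumption made for illustration only.
    """
    return ExactNumeric(dt.precision + extra_digits, dt.scale + extra_digits)


# A 32-bit int holds at most 10 decimal digits and has no fractional part.
INT_TYPE = ExactNumeric(precision=10, scale=0)

if __name__ == "__main__":
    print(widened_avg_type(INT_TYPE))
    print(avg_result_conforms(INT_TYPE, widened_avg_type(INT_TYPE)))
    print(avg_result_conforms(INT_TYPE, None))  # a double result
```

Under this model any result type that is not exact numeric fails the check, which is the deviation described above. The exact precision and scale are left to the implementation, so the widening rule here is only one of many conforming choices.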

[1] http://www.contrib.andrew.cmu.edu/~shadow/sql/sql1992.txt

> Hive standard avg UDAF returns double as the return type for some exact input 
> types
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-5878
>                 URL: https://issues.apache.org/jira/browse/HIVE-5878
>             Project: Hive
>          Issue Type: Bug
>          Components: Types, UDF
>    Affects Versions: 0.12.0
>            Reporter: Xuefu Zhang
>            Assignee: Xuefu Zhang
>         Attachments: HIVE-5878.1.patch, HIVE-5878.patch
>
>
> For the standard (non-partial) avg result, Hive currently returns double as 
> the result type.
> {code}
> hive> desc test;
> OK
> d                     int                     None                
> Time taken: 0.051 seconds, Fetched: 1 row(s)
> hive> explain select avg(`d`) from test;  
> ...
>       Reduce Operator Tree:
>         Group By Operator
>           aggregations:
>                 expr: avg(VALUE._col0)
>           bucketGroup: false
>           mode: mergepartial
>           outputColumnNames: _col0
>           Select Operator
>             expressions:
>                   expr: _col0
>                   type: double
> {code}
> However, exact types, including integers and decimal, should yield an exact 
> type. Here is what MySQL does:
> {code}
> mysql> desc test;
> +-------+--------------+------+-----+---------+-------+
> | Field | Type         | Null | Key | Default | Extra |
> +-------+--------------+------+-----+---------+-------+
> | i     | int(11)      | YES  |     | NULL    |       |
> | b     | tinyint(1)   | YES  |     | NULL    |       |
> | d     | double       | YES  |     | NULL    |       |
> | s     | varchar(5)   | YES  |     | NULL    |       |
> | dd    | decimal(5,2) | YES  |     | NULL    |       |
> +-------+--------------+------+-----+---------+-------+
> mysql> create table test62 as select avg(i) from test;
> mysql> desc test62;
> +--------+---------------+------+-----+---------+-------+
> | Field  | Type          | Null | Key | Default | Extra |
> +--------+---------------+------+-----+---------+-------+
> | avg(i) | decimal(14,4) | YES  |     | NULL    |       |
> +--------+---------------+------+-----+---------+-------+
> 1 row in set (0.00 sec)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.4#6159)
