[ https://issues.apache.org/jira/browse/HIVE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13845634#comment-13845634 ]
Eric Hanson commented on HIVE-5996: ----------------------------------- -1 We should not be changing the output data types of expression results that are arguably reasonable. It causes code churn and can break existing apps. Having sum(bigint) return bigint is long standing behavior in Hive and is reasonable. As a side note, SQL Server returns bigint for sum(bigint). If users need more digits, they can cast the input to sum to a decimal. > Query for sum of a long column of a table with only two rows produces wrong > result > ---------------------------------------------------------------------------------- > > Key: HIVE-5996 > URL: https://issues.apache.org/jira/browse/HIVE-5996 > Project: Hive > Issue Type: Bug > Components: UDF > Affects Versions: 0.12.0 > Reporter: Xuefu Zhang > Assignee: Xuefu Zhang > Attachments: HIVE-5996.patch > > > {code} > hive> desc test2; > OK > l bigint None > hive> select * from test2; > OK > 6666666666666666666 > 5555555555555555555 > hive> select sum(l) from test2; > OK > -6224521851487329395 > {code} > It's believed that a wrap-around error occurred. It's surprising that it > happens only with two rows. Same query in MySQL returns: > {code} > mysql> select sum(l) from test; > +----------------------+ > | sum(l) | > +----------------------+ > | 12222222222222222221 | > +----------------------+ > 1 row in set (0.00 sec) > {code} > Hive should accommodate large number of rows. Overflowing with only two rows > is very unusable. -- This message was sent by Atlassian JIRA (v6.1.4#6159)