[ https://issues.apache.org/jira/browse/HIVE-9385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279769#comment-14279769 ]
Nick Martin commented on HIVE-9385:
-----------------------------------
[~damien.carol] I have ~150 million rows of sales data in an ORC table, with
the sales amount stored in a double column. When I sum that column I get the
value I reported in the issue description (4.79165141174808E9, i.e. roughly
4.79 billion). The true sum of that field is about $2.5 billion.
When I do the exact same thing (create the same table, store the sales column
as a double, sum on that column) but store the table as textfile I get the
correct amount.
So I suspect something is wrong with sum() on doubles in ORC tables, and I'm
hoping someone can try this in their own environment and let me know whether
it looks like a bug.
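A minimal sketch of that comparison might look like the following. The table
and column names (sales_text, sales_orc, amount) are hypothetical placeholders,
not the actual schema from this report, and the count/min/max and decimal-cast
queries are only suggested sanity checks.
{code:sql}
-- Hypothetical names: sales_text, sales_orc and amount are placeholders.
CREATE TABLE sales_text (amount DOUBLE) STORED AS TEXTFILE;
CREATE TABLE sales_orc  (amount DOUBLE) STORED AS ORC;

-- Assumption: sales_text has been loaded with the source data at this point.
-- Copy the same rows into the ORC table so both tables hold identical data.
INSERT OVERWRITE TABLE sales_orc SELECT amount FROM sales_text;

-- Run the same aggregates against both storage formats and compare.
SELECT COUNT(*), MIN(amount), MAX(amount), SUM(amount) FROM sales_text;
SELECT COUNT(*), MIN(amount), MAX(amount), SUM(amount) FROM sales_orc;

-- Cross-check: sum the ORC column as a decimal instead of a double.
SELECT SUM(CAST(amount AS DECIMAL(38,4))) FROM sales_orc;
{code}
If the row counts and min/max values match across the two tables but the sums
differ, that would suggest the problem is in the aggregation over ORC rather
than in the stored data.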
> Sum a Double using an ORC table
> -------------------------------
>
> Key: HIVE-9385
> URL: https://issues.apache.org/jira/browse/HIVE-9385
> Project: Hive
> Issue Type: Bug
> Affects Versions: 0.13.1
> Environment: HDP 2.x, Hive
> Reporter: Nick Martin
> Priority: Minor
>
> I’m storing a sales amount column as a double in an ORC table and when I do:
> {code:sql}
> select sum(x) from sometable
> {code}
> I get a value like {{4.79165141174808E9}}
> A visual inspection of the column values reveals no glaring anomalies…all
> looks pretty normal.
> If I do the same thing in a textfile table I get a perfectly fine aggregation
> of the double field.