>From the code here
http://svn.apache.org/viewvc/hive/branches/branch-0.7/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java?view=markup
For float , doble and string the implementation points to common function
GenericUDAFSumDouble()
if (parameters[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
54 throw new UDFArgumentTypeException(0,
55 "Only primitive type arguments are accepted but "
56 + parameters[0].getTypeName() + " is passed.");
57 }
58 switch (((PrimitiveTypeInfo) parameters[0]).getPrimitiveCategory()) {
59 case BYTE:
60 case SHORT:
61 case INT:
62 case LONG:
63 return new GenericUDAFSumLong();
*64 * *case FLOAT:
65 case DOUBLE:
66 case STRING:
67 return new GenericUDAFSumDouble();*
On Mon, Jun 11, 2012 at 3:43 PM, Gabi D <[email protected]> wrote:
> float is known to have precision issues, because of the way it is
> implemented. If you are working with money data you should definitely move
> to double.
> google 'float precision' and you'll find a bunch of explanations.
>
>
> On Mon, Jun 11, 2012 at 12:49 PM, Guillaume Polaert <[email protected]>wrote:
>
>> Hi,
>>
>> We're expecting some issue with the sum function in Hive 0.7.1.
>> The precision of float number isn't correct (0.320484484676 instead of
>> 0.32)
>> We aren't expecting this error with double format.
>>
>> For instance, "select id, sum(col1), sum(col2) from test_table group by
>> id" returns incorrect values.
>>
>> ------
>> CREATE TABLE test_table ( id int, col1 double, col2 float )
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\073'
>> LINES TERMINATED BY '\n'
>> STORED AS TEXTFILE;
>>
>> LOAD DATA INPATH '/user/hive/xxxxx/import.csv' INTO TABLE `test _table`
>>
>> Import.csv
>> 1;1.012;1.012
>> 2;2.1;2.1
>> 3;3.2;3.2
>> 4;4.323;4.323
>> 5;5;5
>> ----
>>
>> Is it a bug ? Or maybe did we make a error ?
>>
>>
>> Guillaume Polaert | Cyrès Conseil
>>
>>
>