Re: Trouble with sum function

Jagat Singh Mon, 11 Jun 2012 03:18:29 -0700

From the code here:

http://svn.apache.org/viewvc/hive/branches/branch-0.7/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFSum.java?view=markup


For float, double and string arguments the implementation points to a common
evaluator, GenericUDAFSumDouble, so float values are summed as doubles:

if (parameters[0].getCategory() != ObjectInspector.Category.PRIMITIVE) {
  throw new UDFArgumentTypeException(0,
      "Only primitive type arguments are accepted but "
      + parameters[0].getTypeName() + " is passed.");
}
switch (((PrimitiveTypeInfo) parameters[0]).getPrimitiveCategory()) {
case BYTE:
case SHORT:
case INT:
case LONG:
  return new GenericUDAFSumLong();
case FLOAT:
case DOUBLE:
case STRING:
  return new GenericUDAFSumDouble();
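
To see why this matters for a float column, here is a minimal standalone sketch
(plain Java, not Hive code). It assumes the float values end up widened to
double before being added, which is what the dispatch to GenericUDAFSumDouble
suggests. The class FloatSumDemo and the helper widen() are hypothetical names
used only for illustration; the input values are the ones from import.csv.

```java
public class FloatSumDemo {
    // Hypothetical helper: parse the text as a 32-bit float, then widen it
    // to a 64-bit double. Widening is exact, so the double carries the
    // float's rounding error and shows it as extra digits when printed.
    static double widen(String text) {
        return (double) Float.parseFloat(text);
    }

    public static void main(String[] args) {
        // Values taken from import.csv in the original question.
        String[] column = {"1.012", "2.1", "3.2", "4.323", "5"};
        for (String text : column) {
            double asDouble = Double.parseDouble(text); // like col1 (double)
            double viaFloat = widen(text);              // like col2 (float)
            System.out.println(text + "  double=" + asDouble
                    + "  float->double=" + viaFloat);
        }
    }
}
```

Values such as 2.1 have no exact binary representation, so the float-derived
double differs from the directly parsed double, while a value like 5 comes out
identical either way. If that is what happens in Hive, the extra digits in
sum(col2) are expected behaviour rather than a bug in sum.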



On Mon, Jun 11, 2012 at 3:43 PM, Gabi D <gabi...@gmail.com> wrote:

> float is known to have precision issues, because of the way it is
> implemented. If you are working with money data you should definitely move
> to double.
> google 'float precision' and you'll find a bunch of explanations.
>
>
> On Mon, Jun 11, 2012 at 12:49 PM, Guillaume Polaert <gpola...@cyres.fr> wrote:
>
>> Hi,
>>
>> We're experiencing an issue with the sum function in Hive 0.7.1.
>> The precision of float numbers isn't correct (0.320484484676 instead of
>> 0.32).
>> We don't see this error with the double type.
>>
>> For instance, "select id, sum(col1), sum(col2) from test_table group by
>> id" returns incorrect values.
>>
>> ------
>> CREATE TABLE test_table  ( id int,  col1 double,  col2 float )
>> ROW FORMAT DELIMITED FIELDS TERMINATED BY '\073'
>> LINES TERMINATED BY '\n'
>> STORED AS TEXTFILE;
>>
>> LOAD DATA INPATH '/user/hive/xxxxx/import.csv' INTO TABLE `test_table`;
>>
>> Import.csv
>> 1;1.012;1.012
>> 2;2.1;2.1
>> 3;3.2;3.2
>> 4;4.323;4.323
>> 5;5;5
>> ----
>>
>> Is it a bug? Or did we make an error?
>>
>>
>> Guillaume Polaert | Cyrès Conseil
>>
>>
>
