Dayakar M created CALCITE-6080:
----------------------------------

             Summary: The simplified form after applying 
AggregateReduceFunctionsRule is giving wrong results for STDDEV, Covariance 
with double and decimal types.
                 Key: CALCITE-6080
                 URL: https://issues.apache.org/jira/browse/CALCITE-6080
             Project: Calcite
          Issue Type: Bug
            Reporter: Dayakar M
            Assignee: Dayakar M


The simplified form after applying AggregateReduceFunctionsRule is giving wrong 
results for STDDEV, Covariance with double and decimal types.

For example, after applying AggregateReduceFunctionsRule 

STDDEV_POP(x) -> SQRT((SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x))

for x as double/decimal, it is giving wrong result which can be easily 
reproducible with below simple java code

 
{code:java}
    double input1 = 23.79d;
    double o1 = input1 * input1;
    System.out.println("ip*ip=" + o1);
    double sum = o1 + o1 + o1;
    System.out.println("Sum(ip*ip)="+sum);    double sum1 = input1 + input1 + 
input1;
    System.out.println("Sum(ip)="+sum1);
    double sum2 = sum1 * sum1;
    System.out.println("Sum(ip)*Sum(ip)="+ sum2);
    double fin = sum2/3d;
    System.out.println("Sum(ip)*Sum(ip)/3="+fin);
    double fin1 = sum - fin;
    System.out.println("Sum(ip*ip)-Sum(ip)*Sum(ip)/3=" + fin1); 
    System.out.println("SQRT((Sum(ip*ip)-Sum(ip)*Sum(ip)/3)/3)=" + 
Math.sqrt(fin1/3));{code}
The output is

 

 
{code:java}
ip*ip=565.9640999999999
Sum(ip*ip)=1697.8922999999998
Sum(ip)=71.37
Sum(ip)*Sum(ip)=5093.6769
Sum(ip)*Sum(ip)/3=1697.8923000000002
Sum(ip*ip)-Sum(ip)*Sum(ip)/3=-4.547473508864641E-13
SQRT((Sum(ip*ip)-Sum(ip)*Sum(ip)/3)/3)=NaN {code}
The final output should be *0.0* but here it is coming as {*}NaN{*}.

So for double and decimal type data we should not simplify STDDEV, Covariance 
functions as it leads to wrong results.

 

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to