[ 
https://issues.apache.org/jira/browse/ARROW-9779?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jorge closed ARROW-9779.
------------------------
    Resolution: Won't Fix

There are other trade-offs here, and there is no consensus that this is worth it.
Spark also uses sum / count.

> [Rust] [DataFusion] Increase stability of average accumulator
> -------------------------------------------------------------
>
>                 Key: ARROW-9779
>                 URL: https://issues.apache.org/jira/browse/ARROW-9779
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: Rust, Rust - DataFusion
>            Reporter: Jorge
>            Assignee: Jorge
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Currently, our method to compute the average is based on:
> 1. compute sum of all terms
> 2. compute count of all terms
> 3. compute sum / count
> However, the sum may overflow.
> There is a typical solution to this based on an online formula, described e.g. 
> [here|http://www.heikohoffmann.de/htmlthesis/node134.html], that keeps the 
> numbers small.
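
For illustration, a minimal sketch of the two approaches discussed above, assuming f64 inputs. The names `RunningMean` and `naive_mean` are hypothetical and are not DataFusion's accumulator API. The online version uses the usual incremental-mean update, mean_n = mean_(n-1) + (x_n - mean_(n-1)) / n, so the stored value stays on the scale of the inputs instead of growing like their sum.

```rust
/// Hypothetical running-mean accumulator (not DataFusion's actual type).
#[derive(Debug, Default)]
struct RunningMean {
    count: u64,
    mean: f64,
}

impl RunningMean {
    /// Incremental update: mean_n = mean_{n-1} + (x_n - mean_{n-1}) / n.
    fn update(&mut self, x: f64) {
        self.count += 1;
        self.mean += (x - self.mean) / self.count as f64;
    }

    /// Returns None when no values have been seen.
    fn value(&self) -> Option<f64> {
        if self.count == 0 {
            None
        } else {
            Some(self.mean)
        }
    }
}

/// The sum / count approach from steps 1-3 of the description.
/// The intermediate sum can overflow to infinity for large inputs,
/// and an empty slice yields NaN (0.0 / 0.0).
fn naive_mean(values: &[f64]) -> f64 {
    values.iter().sum::<f64>() / values.len() as f64
}
```

With two inputs equal to f64::MAX, `naive_mean` returns infinity because the sum overflows, while `RunningMean` returns f64::MAX. The trade-off is one division per value, and the difference x - mean can itself still overflow when the inputs have opposite signs and extreme magnitudes.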



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
