New submission from Raymond Hettinger <raymond.hettin...@gmail.com>:

The existing code makes two passes, one to compute the mean and another to 
compute the sum of squared differences from the mean.  A consequence of making 
two passes is that iterator inputs must be converted to a list before 
processing.  This throws away the memory-saving advantages of iterators.
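
For illustration only, the two-pass shape described above might look roughly 
like the sketch below.  It is not the code in the statistics module; the name 
two_pass_ss is made up for this example, and exact Fraction arithmetic is 
assumed.

```python
from fractions import Fraction


def two_pass_ss(data):
    """Return (n, ss), with ss the sum of squared deviations from the mean.

    Hypothetical helper for illustration; not the statistics module's code.
    """
    # The values are traversed twice, so an iterator has to be
    # materialized first.  This is where the memory saving is lost.
    data = list(data)
    n = len(data)
    if n == 0:
        raise ValueError("at least one data point is required")
    # First pass: the mean, computed exactly.
    mean = sum(Fraction(x) for x in data) / n
    # Second pass: squared differences from that mean.
    ss = sum((Fraction(x) - mean) ** 2 for x in data)
    return n, ss
```

Calling two_pass_ss(iter([1, 2, 3, 4])) works only because of the list() call 
at the top; without it the second pass would see an exhausted iterator.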

The ostensible reason for the two-pass code is that the single-pass variant is 
numerically unstable when implemented with floating-point accumulators.  
However, this code uses fractions throughout, so the accumulation is exact.

Changing to a single pass saves memory, doubles the speed, and simplifies the 
upstream code in variance(), pvariance(), stdev(), and pstdev().
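
A minimal sketch of the single-pass idea follows, assuming exact Fraction 
accumulators and the identity sum((x - mean)**2) == sum(x*x) - sum(x)**2 / n.  
The name single_pass_ss is hypothetical and this is not the proposed patch 
itself, only an illustration of the technique.

```python
from fractions import Fraction


def single_pass_ss(data):
    """Return (n, ss) from one traversal of data.

    Hypothetical helper for illustration; not the statistics module's code.
    """
    n = 0
    sx = Fraction(0)   # running sum of x
    sxx = Fraction(0)  # running sum of x*x
    for x in data:
        fx = Fraction(x)  # exact conversion for ints and floats
        n += 1
        sx += fx
        sxx += fx * fx
    if n == 0:
        raise ValueError("at least one data point is required")
    # Exact because every accumulator is a Fraction; the same formula
    # with float accumulators can lose precision to cancellation.
    return n, sxx - sx * sx / n


# A generator is consumed directly, with no intermediate list.
n, ss = single_pass_ss(10**9 + d for d in (4, 7, 13, 16))
```

The inputs in the usage line have a large common offset, which is the case 
where float accumulators typically go wrong.  With Fraction accumulators the 
result is exact.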

----------
components: Library (Lib)
messages: 409692
nosy: rhettinger, steven.daprano
priority: normal
severity: normal
status: open
title: Convert statistics sum of squares to a single pass algorithm
type: performance
versions: Python 3.11

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue46257>
_______________________________________