[issue25177] OverflowError in statistics.mean when summing large floats

2018-04-08 Thread Wolfgang Maier

Wolfgang Maier  added the comment:

Steven's commit here also fixed issue 24068.

--
nosy: +wolma

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2016-05-05 Thread Berker Peksag

Changes by Berker Peksag :


--
resolution:  -> fixed
stage:  -> resolved
status: open -> closed
type:  -> behavior

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-12-02 Thread Roundup Robot

Roundup Robot added the comment:

New changeset a7d2307055e7 by Victor Stinner in branch 'default':
Null merge 3.5, patch was already applied to default (isuse #25177)
https://hg.python.org/cpython/rev/a7d2307055e7

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-12-01 Thread Steven D'Aprano

Steven D'Aprano added the comment:

Larry,

Is it too late to get this into 3.5rc1?

changeset 99407:ed45a09e5a69

Thanks.

--
nosy: +larry

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-12-01 Thread Roundup Robot

Roundup Robot added the comment:

New changeset 4bc9405c4f7b by Steven D'Aprano in branch '3.4':
Fix for issue #25177 with the mean of very small and very large numbers.
https://hg.python.org/cpython/rev/4bc9405c4f7b

New changeset ed45a09e5a69 by Steven D'Aprano in branch '3.5':
Fixed issue #25177, problems with the mean of very small and very large numbers.
https://hg.python.org/cpython/rev/ed45a09e5a69

New changeset 0eeb39fc8ff5 by Steven D'Aprano in branch 'default':
Issue #25177: Fixed problem with the mean of very small and very large numbers.
https://hg.python.org/cpython/rev/0eeb39fc8ff5

--
nosy: +python-dev

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-11-07 Thread Bar Harel

Bar Harel added the comment:

Anyway, yes, it should be quite the same. I can provide some benchmarks 
tomorrow if you wish.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-11-06 Thread Mark Dickinson

Mark Dickinson added the comment:

I can reproduce here (OS X 10.9, Python 3.5), exactly as described in the 
original post.

Python 3.5.0 (default, Sep 22 2015, 18:26:54) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import statistics
>>> statistics.mean([8.988465674311579e+307, 8.98846567431158e+307])
Traceback (most recent call last):
  File "", line 1, in 
  File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/statistics.py",
 line 293, in mean
return _sum(data)/n
  File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/statistics.py",
 line 184, in _sum
return T(total)
  File 
"/opt/local/Library/Frameworks/Python.framework/Versions/3.5/lib/python3.5/numbers.py",
 line 291, in __float__
return self.numerator / self.denominator
OverflowError: integer division result too large for a float

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-11-06 Thread Mark Dickinson

Mark Dickinson added the comment:

Note that the two input values given in the original report are not the same:

[8.988465674311579e+307, 8.98846567431158e+307] != [8.988465674311579e+307] * 2.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-11-05 Thread Steven D'Aprano

Steven D'Aprano added the comment:

Has anyone confirmed that this bug actually exists? I'm afraid that I cannot 
verify it. I get these results on three different computers:

py> x = 8.988465674311579e+307
py> statistics.mean([x, x])
8.988465674311579e+307
py> statistics.mean([x, x]) == x
True

running Python 3.4.3, a backport on 3.3.0rc3, and the default branch in the 
repo (3.6.0a).

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-11-05 Thread Steven D'Aprano

Steven D'Aprano added the comment:

> Has anyone confirmed that this bug actually exists? 

Confirmed. The initial report is not quite correct: you need three 
values to trigger the overflow, not two:

py> x = 8.988465674311579e+307
py> statistics.mean([x]*2) == x
True
py> statistics.mean([x]*3) == x
Traceback (most recent call last):
  File "", line 1, in 
  File "./statistics.py", line 289, in mean
return _sum(data)/n
  File "./statistics.py", line 184, in _sum
return T(total)
  File "/usr/local/lib/python3.3/numbers.py", line 296, in __float__
return self.numerator / self.denominator
OverflowError: integer division result too large for a float

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-10-20 Thread Mark Dickinson

Mark Dickinson added the comment:

> I strongly suspect that moving from float to Fraction-based ratios is going 
> to kill performance in the common case

The existing code already converts each of the input items to Fraction; the 
only difference is that the old code converts the sum of those Fractions to 
float (or whatever the target type is) *before* dividing by the count, while 
the new code performs the sum/count division in Fraction-land, and only *then* 
converts to float.  That is, it's the difference between "float(exact_sum) / 
count" and "float(exact_sum / count)".

IOW, the performance is already dead.  Or rather, it's just resting: IIUC, the 
module design prioritises correctness over speed.  I'm sure Steven would be 
open to suggestions for faster algorithms that maintain the current accuracy.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-10-19 Thread Josh Rosenberg

Josh Rosenberg added the comment:

Do you have any benchmarks on the before and after? I strongly suspect that 
moving from float to Fraction-based ratios is going to kill performance in the 
common case, particularly for longer input sequences, but that's a hunch only.

--
nosy: +josh.r

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-10-19 Thread Steven D'Aprano

Changes by Steven D'Aprano :


--
assignee:  -> steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-10-10 Thread Bar Harel

Bar Harel added the comment:

Any comments on the patch?

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-10-10 Thread Steven D'Aprano

Steven D'Aprano added the comment:

On Sat, Oct 10, 2015 at 04:28:22PM +, Bar Harel wrote:
> Any comments on the patch?

Not yet, I've been unable to look at it, but thank you. If I haven't 
responded by the 17th of this month, please ping me again.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-27 Thread Bar Harel

Bar Harel added the comment:

Alright, this patch passed all tests.
I've changed the function sum to include a new parameter called fraction which 
decides whether to return a fraction or not. It might be useful in more 
scenarios due to the fact fractions are more accurate.
I'm still unsure why not to support mixed types, as we can just again, return a 
mixed type as fraction but I left it like this in order not to cause unexpected 
behavior. I still think however that supporting mixed types should be fine as 
the bottleneck regarding precision is fraction, which is used as the base for 
conversion anyway.
I've added a few doctests on the way including the float max and min, which did 
not previously pass on the current implementation, and changed the unittest to 
match.

--
Added file: http://bugs.python.org/file40597/Issue25177_rev2.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-27 Thread Bar Harel

Bar Harel added the comment:

Alright, I issued a fix, now testing it

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-22 Thread Steven D'Aprano

Steven D'Aprano added the comment:

Bar, thanks for the time you put into diagnosing this error, it is 
definitely a bug. The intention is for mean([huge, huge]) to return 
huge, not raise OverflowError.

I'm reluctant to say that mean() will *never* raise OverflowError, but 
it certainly shouldn't do so for this case if it can be avoided. I think 
Mark's diagnosis is probably correct that refactoring the code will fix 
this.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Bar Harel

Bar Harel added the comment:

Seems like this is the only viable option. It fixes the OverflowError but comes 
at the cost of precision and time.
Another option would be to try/except the overflow error and only then return 
the slower method.
Regarding the data [8.988465674311579e+307, 8.98846567431158e+307], float has 
lesser precision than other types like Decimal so the patched method would 
return 8.98846567431158e+307.
A dataset to test the fix is [8.99e+307, 8.989e+307] which gives a correct 
result.

--
keywords: +patch
Added file: http://bugs.python.org/file40519/issue25177.patch

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Mark Dickinson

Mark Dickinson added the comment:

That patch doesn't really help, I'm afraid, since it introduces problems at the 
other end of the floating-point range: for example, `mean([5e-324, 5e-324])` 
would give `0.0` with that patch (instead of the `5e-324` it currently gives).

So currently, when computing the mean of a sequence of floats (possibly mixed 
with ints), the code:

1. Computes the sum as a Fraction.
2. Converts that Fraction to a float.
3. Divides the result by the number of elements.

Reversing steps 2 and 3 here would solve the issue, but would require some 
refactoring.  It's really up to Steven whether he thinks that that refactoring 
is worth it for these corner cases at the extremes of the floating-point range. 
 (It's difficult to imagine that such numbers would turn up frequently in 
practical applications.)

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Berker Peksag

Changes by Berker Peksag :


--
nosy: +steven.daprano

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread David MacIver

New submission from David MacIver:

The following code produces an OverflowError:

import statistics
statistics.mean([8.988465674311579e+307, 8.98846567431158e+307])

The error is:

  File "/home/david/.pyenv/versions/3.5.0/lib/python3.5/statistics.py", line 
293, in mean
return _sum(data)/n
  File "/home/david/.pyenv/versions/3.5.0/lib/python3.5/statistics.py", line 
184, in _sum
return T(total)
  File "/home/david/.pyenv/versions/3.5.0/lib/python3.5/numbers.py", line 291, 
in __float__
return self.numerator / self.denominator
OverflowError: integer division result too large for a float


If this is intended behaviour then it is not documented: 
https://docs.python.org/3/library/statistics.html#statistics.mean only 
specifies that it may raise StatisticsError.

--
components: Library (Lib)
messages: 251076
nosy: David MacIver
priority: normal
severity: normal
status: open
title: OverflowError in statistics.mean when summing large floats
versions: Python 3.4, Python 3.5

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Bar Harel

Bar Harel added the comment:

I believe it's an intended behavior as python's float has a limit after all. 
It's hard to reach it but definitely possible.
A workaround is to internally use Decimal (also take the advantage that it's 
implementation is now way faster) but it will cause a precision loss, and 
somewhat unexpected behavior.
Last thing we can do is catch that OverflowError and raise StatisticsError from 
exc.
I believe catching and re-raising as Statistics or just writing that it may 
also cause an Overflow error are the most viable ways.
Either way, I'd like another opinion before implementing either method as a 
patch.

--
nosy: +bar.harel

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread David MacIver

David MacIver added the comment:

I'm not sure what you mean by float having a limit here. It's certainly finite 
precision, but there is still a representable value with that finite precision 
closest to the mean.

As an example where there is an obvious correct answer that will trigger this 
error: statistics.mean([sys.float_info.max, sys.float_info.max]), this should 
return sys.float_info.max (which is definitely representable!), but instead 
raises this overflow error.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Bar Harel

Bar Harel added the comment:

Whoop! I see the reason for it now. By limit I don't mean the precision limit, 
I mean the top limit in which float converts to "inf".
Seems like this bug is due to the change of python 3's division operator.
Under numbers it states:
"It's important that this conversion use the integer's "true"
division rather than casting one side to float before dividing
so that ratios of huge integers convert without overflowing."

It is meant to not to overflow but the "/" operator is now "//".
Lemme patch it up and see if it fixes the problem. If it states the integer's 
"true" division, I believe this small fix will be sufficient as it has been 
tested before.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Bar Harel

Bar Harel added the comment:

Yup, it indeed fixes the problem. Sorry for thinking it's intended.
Seems like it's a small problem but it does affects all the uses of Integral  
or Rational.
I'll test it with the suite hoping it has a sufficient coverage.

--

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue25177] OverflowError in statistics.mean when summing large floats

2015-09-19 Thread Bar Harel

Bar Harel added the comment:

Alright,
Seems like the problem is bigger than I thought. PEP 238 
(https://www.python.org/dev/peps/pep-0238/) mentions that problem.
Using the new division, // will cause a floor while / will cause an 
OverflowError.
An ugly way around it involves type checking or reverting back to the classic 
division.

--
nosy: +benjamin.peterson, mark.dickinson

___
Python tracker 

___
___
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com