Josh Rosenberg added the comment:

A few examples (some are patently ridiculous, since the range of values anyone 
would use ends long before you'd overflow a 32 bit integer, let alone a 64 bit 
value on my build of Python, but bear with me):

>>> datetime.datetime(2**64, 1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long
>>> datetime.datetime(-2**64, 1, 2)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

>>> time.mktime(time.struct_time([2**64]*9))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

>>> sqlite3.enable_callback_tracebacks(2**64)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
OverflowError: Python int too large to convert to C long

(That last one should really be changed to type code 'p' instead of 'i', or to 
'B', since it's just a boolean: overflow doesn't matter there, only 
truthy/falsy behavior.)

It also happens if you pass re functions/methods a flags value that is too large:
>>> re.sub(r'(abc)', r'\1', 'abcd', re.IGNORECASE << 64)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/shadowranger/src/cpython/Lib/re.py", line 175, in sub
    return _compile(pattern, flags).sub(repl, string, count)
OverflowError: Python int too large to convert to C ssize_t
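
For anyone who wants to repeat this kind of probing without reading tracebacks, a minimal sketch follows. The helper name exception_name is made up for illustration, and what it reports for the datetime calls depends on the build and Python version, so no output is shown:

```python
import datetime


def exception_name(func, *args, **kwargs):
    """Return the class name of the exception func raises, or None if it returns."""
    try:
        func(*args, **kwargs)
    except Exception as exc:
        return type(exc).__name__
    return None


if __name__ == "__main__":
    # The first two years mirror the sessions above; 10000 is merely out of
    # datetime's documented range, for comparison.
    for year in (2**64, -2**64, 10000):
        print(year, exception_name(datetime.datetime, year, 1, 2))
```

Comparing the name reported for an absurdly large argument with the one reported for a merely out-of-range argument is a quick way to spot where the exception type depends on the C conversion rather than on the function's own range check.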

Skipping the tracebacks, a few more examples of functions where at least one 
argument can raise OverflowError for implementation specific reasons, rather 
than a logical overflow of some kind (which is what math.factorial's is):

os.getpriority, os.setpriority, os.waitid, os.tcsetpgrp, a central utility 
function (get_data) in zipimport (I believe the value it's parsing is derived 
from a zip header, so it may not be possible to feed it too-large values; 
haven't checked), quite a number of socket functions (often for stuff that 
should really be ValueErrors, e.g. port numbers out of range), and more random 
things. I found all of these with a simple:

find cpython/ -type f -name '*.c' \
    -exec grep -nP 'PyArg_Parse.*?"\w*?[bhilL]"' {} + > exampleoverflow.txt

There aren't any other good examples in math, largely because the other 
functions there deal with floats (or have arbitrary precision integer fallback 
paths, in the case of the log suite of functions).

That find only scratches the surface; many PyArg_Parse* calls are split across 
lines (so my simple regex won't catch them), and old Python code has a habit of 
not using PyArg_Parse* even when it makes sense (presumably because they wanted 
to customize error messages, or didn't like the way the provided formatting 
codes handled edge cases).
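
To catch calls that are split across lines, something like the following rough Python sketch could be used. It is an illustration, not the search I actually ran: the name find_parse_calls is made up, it assumes the first string literal after the opening parenthesis is the format string, and it flags the b/h/i/l/L codes anywhere before the ':' or ';' suffix rather than only at the end of the string.

```python
import re

# The first string literal after "PyArg_Parse...(" is assumed to be the format
# string. [^";] also matches newlines, so calls split across lines are found,
# while the ';' keeps the match from running past the end of the statement.
CALL_RE = re.compile(r'PyArg_Parse\w*\s*\([^";]*"([^"]*)"')
OVERFLOWING_CODES = set("bhilL")


def find_parse_calls(source):
    """Return (line_number, format_string) pairs for suspicious calls."""
    hits = []
    for match in CALL_RE.finditer(source):
        fmt = match.group(1)
        # Drop the ":name" or ";message" suffix; it is not a format code.
        codes = re.split(r"[:;]", fmt, maxsplit=1)[0]
        if OVERFLOWING_CODES & set(codes):
            line_number = source.count("\n", 0, match.start()) + 1
            hits.append((line_number, fmt))
    return hits
```

It takes the text of one .c file and reports the line where each matching call starts. Format strings built from adjacent string literals or macros would still be missed.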

In reality, any place PyLong_As* is called (where * is not one of the masking 
functions) on an argument that came from the user, without explicitly checking 
for and replacing OverflowError, will potentially trigger this issue. A cursory 
search of locations where these functions are called reveals OverflowErrors in 
the r parameter to itertools.permutations, and shows that decimal is riddled 
with cases where the code returns if PyLong_As* has an error (including 
OverflowError) without changing the exception type; only if the value didn't 
overflow does a second round of range checking set ValueError. Examples include 
the Context object's prec and clamp properties, but there are a dozen or more 
functions doing this, and I don't know if all of them are publicly accessible.
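
As a Python-level analogy for "checking for and replacing OverflowError" (the real fixes would live in the C code), here is a small sketch. The decorator overflow_as_value_error and the helper pack_port are hypothetical names; int.to_bytes is used only because it is a stdlib call that raises OverflowError for out-of-range values.

```python
import functools


def overflow_as_value_error(func):
    """Re-raise an OverflowError from func as a ValueError, keeping the cause."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except OverflowError as exc:
            raise ValueError(f"argument out of range: {exc}") from exc
    return wrapper


@overflow_as_value_error
def pack_port(port):
    # Hypothetical helper: encode a port number as 2 big-endian bytes.
    # int.to_bytes raises OverflowError for values that do not fit.
    return port.to_bytes(2, "big")
```

With this, pack_port(70000) raises ValueError, and the original OverflowError stays available as __cause__ for debugging.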

Fewer of the calls will be publicly visible, so there's more to look through, 
but you can run the same search to find tons of places with potentially similar 
behavior:

find cpython/ -type f -name '*.c' \
    -exec grep -nP 'Py(Long|Number)_As(?!.*(?:Mask|NULL|PyExc_(?!Overflow)))' {} + \
    > exampleoverflowdirectcall.txt

I suspect that for every case where Python standard libs behave this way 
(raising OverflowErrors in ways that disregard the official docs' description of 
when it should be used), there are a dozen where a third party module behaves 
this way, since third party modules are more likely to use the standardized 
argument parsing and numeric parsing APIs without rejiggering the default 
exceptions, assuming that the common APIs raise the "correct" errors.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue20539>
_______________________________________