New submission from Greg Price <gnpr...@gmail.com>:

Currently `_PyObject_VAR_SIZE` effectively has a precondition that it must not 
be passed arguments which would overflow in its arithmetic. If that's violated, 
it overflows... which is undefined behavior already, and in fact the likely 
next thing that happens is we allocate a short buffer for a giant array of 
items, then overflow the buffer, bam!

(The arithmetic is on `Py_ssize_t`, a signed integer type; that's why overflow 
is UB.)

It's quite difficult to maintain that precondition at call sites, because it 
effectively means writing a check that duplicates the content of 
`_PyObject_VAR_SIZE` modulo some algebraic rearrangement.  Empirically we don't 
consistently manage that: we just fixed an 11-year-old bug where such a check 
was wrong (luckily in the safe/over-conservative direction), and some other 
call paths don't immediately appear to be checking at all.

So, I think it would be helpful to move that check into `_PyObject_VAR_SIZE` 
itself.

Cc'ing the three folks on the interest list "coverity scan", as that one seems 
the closest match for who'd likely be interested in commenting on this.

Details below, largely from GH-14838 which got me looking at this.

---

In GH-14838, @sir-sigurd fixed an overflow check on a call to 
PyObject_GC_NewVar.  The call site looked like this:

        if ((size_t)size > ((size_t)PY_SSIZE_T_MAX - sizeof(PyTupleObject) -
                    sizeof(PyObject *)) / sizeof(PyObject *)) {
            return (PyTupleObject *)PyErr_NoMemory();
        }
        op = PyObject_GC_NewVar(PyTupleObject, &PyTuple_Type, size);

but the `- sizeof(PyObject *)` had the wrong sign.  Happily the bug made the 
check too conservative, but that's basically a lucky coin-flip we had -- a 
similar bug could just as easily have gone the other direction, making the code 
vulnerable to arithmetic overflow, then memory corruption from overflowing a 
too-small buffer.

The bug was present for 11 years.

Sergey, I am still very curious how you discovered the bug. :-)

I feel like there's an opportunity for a more complete fix as a followup to 
this; that's this thread.

---

My next question on reading that fix was, OK, how can we write this so it's 
hard to get wrong?

I think the point is that `PyObject_GC_NewVar` really has a precondition, that 
the `nitems` it's passed (`size` in this caller) not be so big that the 
allocation will overflow. Specifically `_PyObject_VAR_SIZE(tp, nitems)` needs 
to not overflow. All the trickiness in this overflow check is basically an 
algebraic rearrangement of what `_PyObject_VAR_SIZE(&PyTuple_Type, size)` would 
say, plus substituting in the values actually found at `&PyTuple_Type`.

So, a sort of minimal fix would be to make something with a name like 
`PyObject_GC_NewVar_WouldOverflow` that encodes that. Maybe that in turn would 
just be a macro to delegate to a `_PyObject_VAR_SIZE_WOULD_OVERFLOW`, so that 
each layer only has to know what it actually knows.

I think that immediately raises a next question, though, which is: are other 
call sites of `PyObject_GC_NewVar` checking properly for overflow? At a quick 
grep, I see some that don't appear to be checking.

And: why should they have to? It seems like the best place for this check is 
immediately before the possibly-overflowing computation; or if not there, then 
the closer to it the better.

So, the ideal solution: `_PyObject_VAR_SIZE` itself would be augmented to first 
do this check, and only if it won't overflow then go and actually multiply.

There are only a handful of call sites of `_PyObject_VAR_SIZE`, so it wouldn't 
even be much of a migration to then take care of each of them.

Then we wouldn't need these prophylactic checks at callers' callers -- which 
(a) this bug demonstrates are hard to consistently get right, and (b) the grep 
mentioned above suggests we may not be consistently doing in the first place.

----------
components: Interpreter Core
messages: 351573
nosy: Greg Price, brett.cannon, christian.heimes, sir-sigurd, twouters
priority: normal
severity: normal
status: open
title: _PyObject_VAR_SIZE should avoid arithmetic overflow
versions: Python 3.9

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue38079>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to