[issue27945] five dictobject issues

2016-09-02 Thread tehybel

Changes by tehybel :


--
versions: +Python 3.5, Python 3.6



[issue27945] five dictobject issues

2016-09-02 Thread tehybel

New submission from tehybel:

Here I'll describe five distinct issues I found. Common to them all is that they
reside in the built-in dictionary object. 

Four of them are use-after-frees and one is an array-out-of-bounds indexing bug.


All of the described functions reside in /Objects/dictobject.c.


Issue 1: use-after-free when initializing a dictionary

Initialization of a dictionary happens via the function dict_init which calls
dict_update_common. From there, PyDict_MergeFromSeq2 may be called, and that is
where this issue resides.

In PyDict_MergeFromSeq2 we retrieve a sequence of size 2 with this line:

fast = PySequence_Fast(item, "");

After checking its size, we take out a key and value:

key = PySequence_Fast_GET_ITEM(fast, 0);
value = PySequence_Fast_GET_ITEM(fast, 1);

Then we call PyDict_GetItem, which hashes the key and therefore calls back into
Python code if the key's class defines a __hash__ method. From there the "item"
sequence can be modified, so "key" or "value" may be used after being freed.
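
The callback window can be observed without triggering the crash. The sketch
below is only an illustration: the class name Probe and the list "events" are
made up here, and __hash__ merely records the length of "pair" instead of
emptying it. It assumes nothing beyond the fact that building a dict hashes
each key.

```python
# Hypothetical probe: records that __hash__ ran, but mutates nothing.
events = []


class Probe:
    def __hash__(self):
        # This runs in the middle of dict construction. "pair" is reachable
        # from here, which is exactly what the crashing script exploits.
        events.append(len(pair))
        return 13


pair = [Probe(), 123]
result = dict([pair])
```

Every entry in "events" shows that the list was still fully populated when the
hash ran, so any mutation performed at that point happens while the C code
still holds its borrowed "key" and "value" pointers.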

Here's a PoC:

---

class X:
    def __hash__(self):
        pair[:] = []
        return 13

pair = [X(), 123]
dict([pair])

---

It crashes while trying to use freed memory as a PyObject:

(gdb) run ./poc24.py 
Program received signal SIGSEGV, Segmentation fault.
0x0048fe25 in insertdict (mp=mp@entry=0x76d5c4b8, 
key=key@entry=0x76d52538, hash=0xd, 
value=value@entry=0x8d1ac0 ) at Objects/dictobject.c:831
831 MAINTAIN_TRACKING(mp, key, value);
(gdb) print *key
$26 = {_ob_next = 0xdbdbdbdbdbdbdbdb, _ob_prev = 0xdbdbdbdbdbdbdbdb,
  ob_refcnt = 0xdbdbdbdbdbdbdbdb, ob_type = 0xdbdbdbdbdbdbdbdb}




Issue 2: use-after-free in dictitems_contains

In the function dictitems_contains we call PyDict_GetItem to look up a value in
the dictionary:

found = PyDict_GetItem((PyObject *)dv->dv_dict, key);

However, "found" is a borrowed reference. We then go ahead and compare it:

return PyObject_RichCompareBool(value, found, Py_EQ);

But PyObject_RichCompareBool can call back into Python code, which may, for
example, release the GIL. As a result, the dictionary may be mutated and
"found" may be freed.

Then, inside PyObject_RichCompareBool (actually in do_richcompare), the "found"
variable gets used after being freed.
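
As a harmless illustration of that callback, the sketch below uses a made-up
class Probe whose __eq__ only records that it was called. It shows that the
membership test on d.items() really does run user code while comparing the
stored value; the names "Probe" and "calls" are assumptions for this example.

```python
# Hypothetical probe: logs the comparison instead of clearing the dict.
calls = []


class Probe:
    def __eq__(self, other):
        # Runs while the membership test is comparing against the value
        # that was looked up in the dict.
        calls.append(type(other).__name__)
        return NotImplemented


d = {0: set()}
found = (0, Probe()) in d.items()
```

The recorded type name shows that the other operand handed to __eq__ is the
set stored in the dict, which is the object that the crashing script frees by
calling d.clear() at this point.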

PoC:

---

class X:
    def __eq__(self, other):
        d.clear()
        return NotImplemented

d = {0: set()}
(0, X()) in d.items()

---

Result:

(gdb) run ./poc25.py 
Program received signal SIGSEGV, Segmentation fault.
0x004a03b6 in do_richcompare (v=v@entry=0x76d52468, 
w=w@entry=0x76ddf7c8, op=op@entry=0x2)
at Objects/object.c:673
673         if (!checked_reverse_op && (f = w->ob_type->tp_richcompare) != NULL) {
(gdb) print w->ob_type
$26 = (struct _typeobject *) 0xdbdbdbdbdbdbdbdb




Issue 3: use-after-free in dict_equal

In the function dict_equal, we call the "lookdict" function via
b->ma_keys->dk_lookup to look up a value:

if ((b->ma_keys->dk_lookup)(b, key, ep->me_hash, &vaddr) == NULL)

This value's address is stored into the "vaddr" variable and the value is
fetched into the "bval" variable:

bval = *vaddr;

Then we call Py_DECREF(key), which can call back into Python code (for example
through the key's __del__ method). This could release the GIL and mutate
dictionary b, so "bval" may already be freed at this point. We then proceed to
use "bval":

cmp = PyObject_RichCompareBool(aval, bval, Py_EQ);

This results in a use-after-free.
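
The reason a plain Py_DECREF can run Python code is that, on CPython, dropping
the last reference to an object runs its finalizer immediately. The sketch
below shows this from the Python side; the class name Probe and the list "log"
are made up for illustration, and the ordering shown assumes CPython's
reference counting.

```python
# Hypothetical probe: records when its finalizer runs.
log = []


class Probe:
    def __del__(self):
        log.append("finalizer ran")


obj = Probe()
log.append("before del")
del obj  # last reference dropped; on CPython the finalizer runs right here
log.append("after del")
```

In dict_equal the same thing happens inside C code: if the key's last reference
is the temporary one, the finalizer runs between fetching "bval" and using it.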

PoC:

---

class X():
    def __del__(self):
        dict_b.clear()
    def __eq__(self, other):
        dict_a.clear()
        return True
    def __hash__(self):
        return 13

dict_a = {X(): 0}
dict_b = {X(): X()}
dict_a == dict_b

---

Result:

(gdb) run ./poc26.py 
Program received signal SIGSEGV, Segmentation fault.
PyType_IsSubtype (a=0xdbdbdbdbdbdbdbdb, b=0x87ec60 )
at Objects/typeobject.c:1343
1343        mro = a->tp_mro;
(gdb) print a
$59 = (PyTypeObject *) 0xdbdbdbdbdbdbdbdb



Issue 4: use-after-free in _PyDict_FromKeys

The function _PyDict_FromKeys takes an iterable as argument. If the iterable is
a dict, _PyDict_FromKeys loops over it like this:

while (_PyDict_Next(iterable, &pos, &key, &oldvalue, &hash)) {
    if (insertdict(mp, key, hash, value)) {
        ...
    }
}

However if we look at the comment for PyDict_Next, we see this:

 * CAUTION:  In general, it isn't safe to use PyDict_Next in a loop that
 * mutates the dict.

But insertdict can call into Python code, which might mutate the dict being
iterated. In that case we perform a use-after-free of the "key" variable.
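
For contrast, iteration at the Python level detects this kind of mutation: a
dict iterator raises RuntimeError when the dict changes size underneath it. The
sketch below shows that guard; the variable names are made up for illustration.
The C loop above is the case where, per the quoted comment, the caller is
responsible for not mutating the dict.

```python
# Python-level iteration notices that the dict changed size.
d = {1: "a", 2: "b"}
error = None

try:
    for k in d:
        d[k + 10] = "new"  # changes the dict's size mid-iteration
except RuntimeError as exc:
    error = str(exc)
```

After the loop, "error" holds the message from the RuntimeError raised by the
iterator.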

Here's a PoC:

---

class X(int):
    def __hash__(self):
        return 13
    def __eq__(self, other):
        if len(d) > 1:
            d.clear()
        return False

d = {}
d = {X(1): 1, X(2): 2}
x = {}.fromkeys(d)

---

And the result:

(gdb) run ./poc27.py 
Program received signal SIGSEGV, Segmentation fault.
0x00435122 in visit