[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Antoine Pitrou <pit...@free.fr> added the comment:

Patch committed in r84394 (py3k) and r84396 (3.1). Thank you Stefan!

----------
resolution:  -> fixed
stage: needs patch -> committed/rejected
status: open -> closed
versions:  -Python 2.7

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue7415>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Changes by Antoine Pitrou <pit...@free.fr>:

----------
nosy: +haypo
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

Here's a patch against the latest py3k. The following will call the new code, for example:

    str(memoryview(b'abc'), 'ASCII')

whereas bytes and bytearray continue to use their own special casing code (which has also changed a bit since I wanted to avoid code duplication).

For testing, I wrote a short Cython module that implements the buffer protocol in an extension type and freshly allocates a new bytes object as buffer on each access:

    from cpython.ref cimport Py_INCREF, Py_DECREF, PyObject

    cdef class Test:
        def __getbuffer__(self, Py_buffer* buffer, int flags):
            s = b'abcdefg' * 10
            buffer.buf = <char*> s
            buffer.obj = self
            buffer.len = len(s)
            # keep the bytes object alive until the buffer is released
            Py_INCREF(s)
            buffer.internal = <void*> s

        def __releasebuffer__(self, Py_buffer* buffer):
            # drop the reference taken in __getbuffer__
            Py_DECREF(<object>buffer.internal)

Put it into a file buftest.pyx, build it, start up Python 3.x and call

    import buftest
    print(len( str(buftest.Test(), 'ASCII') ))

Under the unpatched Py3, this raises a decoding exception for me when it tries to decode data from the deallocated bytes object. Other systems may happily crash here. The patched Python runtime prints '70' as expected.

----------
keywords: +patch
Added file: http://bugs.python.org/file18585/unicodeobject-PyUnicode_FromEncodedObject-buffer.patch
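The call shown in the message above can be tried from plain Python. This is a minimal sketch, assuming an interpreter that already contains the fix; the variable names are illustrative only and do not come from the patch. It decodes the same bytes through the three input types the message mentions.

```python
# Decode the same ASCII data through three buffer-providing types.
data = b"abc"

from_bytes = str(data, "ascii")                   # bytes special case
from_bytearray = str(bytearray(data), "ascii")    # bytearray
from_memoryview = str(memoryview(data), "ascii")  # generic buffer protocol path

print(from_bytes, from_bytearray, from_memoryview)
```

All three calls should give the same string; which internal code path each one takes depends on the interpreter version.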
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Antoine Pitrou <pit...@free.fr> added the comment:

I think the bytearray special-casing should be removed. Otherwise one can reallocate the buffer in another thread while it is being used for decoding.

----------
nosy: +pitrou
versions:  -Python 2.7
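The protection that the regular buffer protocol gives against this kind of reallocation can be observed from Python. The following is a small sketch, not code from the patch: while a memoryview holds an exported buffer, CPython refuses to resize the underlying bytearray, and the resize works again once the view is released.

```python
ba = bytearray(b"abc")
resize_blocked = False

# While the view is alive, the bytearray has an exported buffer.
with memoryview(ba) as view:
    try:
        ba.extend(b"def")  # would need to reallocate the storage
    except BufferError:
        resize_blocked = True

# Leaving the with-block releases the view, so resizing is allowed again.
ba.extend(b"def")
print(resize_blocked, ba)
```

A decoder that obtains the data through the buffer protocol gets the same guarantee for the duration of the decoding, which the special-cased direct pointer access does not provide.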
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

Doesn't the GIL protect the bytearray buffer? Or does decoding free the GIL?

----------
versions: +Python 2.7
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

Regardless of the answer, I think Antoine is right: special cases aren't special enough to break the rules, and this is a special case that's more safely handled as part of the normal buffer case.

Updated patch uploaded.

----------
Added file: http://bugs.python.org/file18587/unicodeobject-PyUnicode_FromEncodedObject-buffer2.patch
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Antoine Pitrou <pit...@free.fr> added the comment:

> Doesn't the GIL protect the bytearray buffer? Or does decoding free the GIL?

Well, decoding can call arbitrary Python code and therefore, yes, release the GIL.

Ironically, PyUnicode_Decode() itself (called from PyUnicode_FromEncodedObject()) fills a dummy Py_buffer object before wrapping it into a memoryview... ;)
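That decoding can run arbitrary Python code is easy to demonstrate with a codec written in Python. The sketch below registers a hypothetical codec named "democodec" (the name and the helper functions are made up for illustration) through the standard `codecs.register()` hook; its decode function is ordinary Python, so the interpreter may switch threads while it runs.

```python
import codecs

calls = []  # records each time the Python-level decoder runs

def _decode(data, errors="strict"):
    # Arbitrary Python code executes here, in the middle of str(...).
    calls.append(type(data).__name__)
    raw = bytes(data)  # the input may be any bytes-like object
    return raw.decode("ascii", errors), len(raw)

def _encode(text, errors="strict"):
    return text.encode("ascii", errors), len(text)

def _search(name):
    # Hypothetical codec name, used only for this illustration.
    if name == "democodec":
        return codecs.CodecInfo(name="democodec", encode=_encode, decode=_decode)
    return None

codecs.register(_search)

result = str(b"abc", "democodec")
print(result, calls)
```

Because `_decode` is plain Python, another thread could run between its bytecodes, which is the window in which an unprotected bytearray could be resized.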
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

... and another complete patch that refactors the complete function to make it clearer what happens. Includes a small code duplication for the bytes object case, which I think is acceptable.

----------
Added file: http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Changes by Stefan Behnel <sco...@users.sourceforge.net>:

Removed file: http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

Another updated patch with a readability fix (replacing the last one).

----------
Added file: http://bugs.python.org/file18589/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Marc-Andre Lemburg <m...@egenix.com> added the comment:

Stefan Behnel wrote:
>
> Stefan Behnel <sco...@users.sourceforge.net> added the comment:
>
> Another updated patch with a readability fix (replacing the last one).

While you're at it, you might as well remove references to the char buffer - there's no such thing in Python3 anymore. We only have read buffers in Python3.

----------
nosy: +lemburg
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Stefan Behnel <sco...@users.sourceforge.net> added the comment:

When I read the comments and exception texts in the function, it didn't occur to me that "char buffer" could have been used as a name for the old Py2 buffer interface. From the context, it totally makes sense to me that the function (which decodes a byte sequence into a unicode string) complains about not getting a bytes object or char buffer as input.

Admittedly, this might sound slightly different when read in Python space.
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
Mark Lawrence <breamore...@yahoo.co.uk> added the comment:

@Stefan can you provide a patch for this?

----------
nosy: +BreamoreBoy
stage:  -> needs patch
versions: +Python 2.7 -Python 3.0
[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
New submission from Stefan Behnel <sco...@users.sourceforge.net>:

PyUnicode_FromEncodedObject() currently calls PyObject_AsCharBuffer() to get the buffer pointer and length of a buffer supporting object. It should be changed to support the buffer protocol correctly instead.

I filed this as a crash bug as the buffer protocol allows a buffer supporting object to discard its buffer when the release function is called. The decode function uses the buffer only *after* releasing it, thus provoking a crash for objects that implement the buffer protocol correctly in that they do not allow access to the buffer after the release.

----------
components: Interpreter Core
messages: 95847
nosy: scoder
severity: normal
status: open
title: PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
type: crash
versions: Python 3.0, Python 3.1, Python 3.2
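The ordering rule described in this report (use the buffer first, release it afterwards) has a Python-level analogue in memoryview. The sketch below is only an illustration of that rule, not of the C code in question: decoding before the release works, while touching the view after the release is rejected instead of reading stale memory.

```python
view = memoryview(b"abcdefg" * 10)

# Correct order: decode while the buffer is still held...
text = str(view, "ascii")

# ...and only then release it.
view.release()

# Any later access to the released view is refused.
try:
    view.tobytes()
    error = None
except ValueError as exc:
    error = exc

print(len(text), type(error).__name__)
```

At the C level there is no such safety net: a pointer obtained from a buffer that has already been released may refer to freed memory, which is why the report is classified as a crash.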