[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-09-01 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

Patch committed in r84394 (py3k) and r84396 (3.1). Thank you Stefan!

--
resolution:  -> fixed
stage: needs patch -> committed/rejected
status: open -> closed
versions:  -Python 2.7

___
Python tracker rep...@bugs.python.org
http://bugs.python.org/issue7415
___
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-21 Thread Antoine Pitrou

Changes by Antoine Pitrou pit...@free.fr:


--
nosy: +haypo




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

Here's a patch against the latest py3k. The following will call the new code, 
for example:

  str(memoryview(b'abc'), 'ASCII')

whereas bytes and bytearray continue to use their own special-casing code 
(which has also changed a bit, since I wanted to avoid code duplication).
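
As a minimal sketch of the call above (assuming a current CPython 3 in which this fix is present), the two-argument str() constructor accepts each of the three objects mentioned and decodes them to the same text. The variable names are made up for the illustration:

```python
# Decode the same ASCII payload from three buffer-providing objects.
payload = b'abc'
sources = [payload, bytearray(payload), memoryview(payload)]

# str(obj, encoding) ends up in PyUnicode_FromEncodedObject() at the C level.
decoded = [str(source, 'ASCII') for source in sources]

for source, text in zip(sources, decoded):
    print(type(source).__name__, '->', text)
```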

For testing, I wrote a short Cython module that implements the buffer protocol 
in an extension type and freshly allocates a new bytes object as buffer on each 
access:

  from cpython.ref cimport Py_INCREF, Py_DECREF, PyObject

  cdef class Test:
      def __getbuffer__(self, Py_buffer* buffer, int flags):
          s = b'abcdefg' * 10
          buffer.buf = <char*> s
          buffer.obj = self
          buffer.len = len(s)
          Py_INCREF(s)
          buffer.internal = <void*> s

      def __releasebuffer__(self, Py_buffer* buffer):
          Py_DECREF(<object>buffer.internal)

Put it into a file buftest.pyx, build it, start up Python 3.x and call

  import buftest
  print(len( str(buftest.Test(), 'ASCII') ))

Under the unpatched Py3, this raises a decoding exception for me when it tries 
to decode data from the deallocated bytes object. Other systems may happily 
crash here. The patched Python runtime prints '70' as expected.

--
keywords: +patch
Added file: 
http://bugs.python.org/file18585/unicodeobject-PyUnicode_FromEncodedObject-buffer.patch




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

I think the bytearray special-casing should be removed. Otherwise one can 
reallocate the buffer in another thread while it is being used for decoding.
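
For illustration only (this is not taken from the patch): when a bytearray is accessed through the buffer protocol, an outstanding export blocks resizing, which is presumably the protection a special-cased path would not get. A small sketch in Python, using a memoryview to hold the export:

```python
data = bytearray(b'abc')
view = memoryview(data)      # holds a buffer export on the bytearray

try:
    data.extend(b'def')      # would have to resize the exported buffer
    resized_while_exported = True
except BufferError:
    resized_while_exported = False

view.release()               # export dropped, resizing is allowed again
data.extend(b'def')
print(resized_while_exported, bytes(data))
```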

--
nosy: +pitrou
versions:  -Python 2.7




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

Doesn't the GIL protect the bytearray buffer? Or does decoding free the GIL?

--
versions: +Python 2.7




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

Regardless of the answer, I think Antoine is right, special cases aren't 
special enough to break the rules, and this is a special case that's more 
safely handled as part of the normal buffer case.

Updated patch uploaded.

--
Added file: 
http://bugs.python.org/file18587/unicodeobject-PyUnicode_FromEncodedObject-buffer2.patch




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Antoine Pitrou

Antoine Pitrou pit...@free.fr added the comment:

> Doesn't the GIL protect the bytearray buffer? Or does decoding free the GIL?

Well, decoding can call arbitrary Python code and therefore, yes,
release the GIL.

Ironically, PyUnicode_Decode() itself (called from
PyUnicode_FromEncodedObject()) fills a dummy Py_buffer object before
wrapping it into a memoryview... ;)
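
A rough sketch of the point about arbitrary Python code: a codec registered through the codecs module has its decode function run as ordinary Python in the middle of the str() call. The codec name 'xdemo' and the helper names are invented for this example, and the observation that the decoder is handed a memoryview is an assumption about current CPython behaviour rather than a documented guarantee:

```python
import codecs

seen_types = []

def demo_decode(data, errors='strict'):
    # Ordinary Python code, executed while the decode call is in progress.
    seen_types.append(type(data).__name__)
    return bytes(data).decode('ascii', errors), len(data)

def demo_encode(text, errors='strict'):
    return text.encode('ascii', errors), len(text)

def demo_search(name):
    # 'xdemo' is a made-up codec name used only for this sketch.
    if name == 'xdemo':
        return codecs.CodecInfo(demo_encode, demo_decode, name='xdemo')
    return None

codecs.register(demo_search)

text = str(bytearray(b'abc'), 'xdemo')
print(text, seen_types)
```

Since the decoder is free to do anything, including touching the object being decoded or letting other threads run, the buffer has to stay acquired for the whole call.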

--




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

... and another complete patch that refactors the whole function to make it 
clearer what happens. It includes a small code duplication for the bytes object 
case, which I think is acceptable.

--
Added file: 
http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Changes by Stefan Behnel sco...@users.sourceforge.net:


Removed file: 
http://bugs.python.org/file18588/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

Another updated patch with a readability fix (replacing the last one).

--
Added file: 
http://bugs.python.org/file18589/unicodeobject-PyUnicode_FromEncodedObject-buffer-refactored.patch




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Marc-Andre Lemburg

Marc-Andre Lemburg m...@egenix.com added the comment:

Stefan Behnel wrote:
> 
> Stefan Behnel sco...@users.sourceforge.net added the comment:
> 
> Another updated patch with a readability fix (replacing the last one).

While you're at it, you might as well remove references to the
char buffer - there's no such thing in Python3 anymore.

We only have read buffers in Python3.

--
nosy: +lemburg




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-20 Thread Stefan Behnel

Stefan Behnel sco...@users.sourceforge.net added the comment:

When I read the comments and exception texts in the function, it didn't occur 
to me that "char buffer" could have been used as a name for the old Py2 buffer 
interface. From the context, it totally makes sense to me that the function 
(which decodes a byte sequence into a unicode string) complains about not 
getting a bytes object or char buffer as input. Admittedly, this might sound 
slightly different when read in Python space.

--




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2010-08-04 Thread Mark Lawrence

Mark Lawrence breamore...@yahoo.co.uk added the comment:

@Stefan can you provide a patch for this?

--
nosy: +BreamoreBoy
stage:  -> needs patch
versions: +Python 2.7 -Python 3.0




[issue7415] PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()

2009-11-30 Thread Stefan Behnel

New submission from Stefan Behnel sco...@users.sourceforge.net:

PyUnicode_FromEncodedObject() currently calls PyObject_AsCharBuffer() to
get the buffer pointer and length of a buffer supporting object. It
should be changed to support the buffer protocol correctly instead.

I filed this as a crash bug as the buffer protocol allows a buffer
supporting object to discard its buffer when the release function is
called. The decode function uses the buffer only *after* releasing it,
thus provoking a crash for objects that implement the buffer protocol
correctly in that they do not allow access to the buffer after the release.
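
The acquire/use/release ordering described above can be sketched at the Python level with a memoryview. This is only an analogy for the C-level problem, and decode_while_held is a hypothetical helper name, not something from the patch:

```python
def decode_while_held(obj, encoding):
    # Keep the buffer acquired for the whole duration of the decode.
    with memoryview(obj) as view:
        return str(view, encoding)

# The problematic order: release first, then try to use the buffer.
view = memoryview(bytearray(b'abc'))
view.release()
try:
    bytes(view)              # use after release
    use_after_release_ok = True
except ValueError:
    use_after_release_ok = False

print(decode_while_held(bytearray(b'abc'), 'ascii'), use_after_release_ok)
```

A memoryview refuses access after release; a raw pointer obtained in C has no such safety net, which is why the report is filed as a crash.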

--
components: Interpreter Core
messages: 95847
nosy: scoder
severity: normal
status: open
title: PyUnicode_FromEncodedObject() uses PyObject_AsCharBuffer()
type: crash
versions: Python 3.0, Python 3.1, Python 3.2
