[issue10181] Problems with Py_buffer management in memoryobject.c (and elsewhere?)

Nick Coghlan Sun, 26 Jun 2011 05:31:44 -0700

Nick Coghlan <[email protected]> added the comment:

The idea of PyManagedBuffer is for it to be an almost completely passive object 
that *just* acts as a refcounted wrapper around the Py_buffer structure, so it 
doesn't care about the actual contents. The only supplemental functionality I 
think it should provide is to disallow explicitly releasing the buffer while 
the reference count is greater than 1. I'm OK with my example cited above being 
unreliable. The correct way to write such code would then be:


  with memoryview(obj) as m:
    with m[:] as m2:
      ...
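
As a minimal runnable illustration of that nesting (the bytearray and the
variable names are assumptions made for this example, not part of the
proposal), the sketch below relies on the fact that a bytearray refuses to be
resized while any buffer export is still live:

```python
# Illustrative only: `data` and the flag names are invented for this sketch.
data = bytearray(b"abcdef")

with memoryview(data) as m:
    with m[:] as m2:
        first = m2[0]  # read through the sliced view
        try:
            data.extend(b"g")  # resizing while views are live is refused
            resized_while_exported = True
        except BufferError:
            resized_while_exported = False

# Both views were released in reverse order of creation, so the
# underlying object can be resized again.
data.extend(b"g")
```

Exiting the inner block releases the slice before the outer block releases the
original view, which is the ordering the nested form guarantees.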

I think separating the concerns this way, letting PyManagedBuffer worry about 
the lifecycle issues of the underlying buffer reference, while PyMemoryView 
deals with the *interpretation* of the buffer description (such as by providing 
useful slicing functionality) will make the whole arrangement easier to handle. 
When a memoryview is sliced, it would create a new memoryview that has a 
reference to the same PyManagedBuffer object, but different internal state that 
affects how that buffer is accessed. This is better than requiring every 
implementer of the buffer API to worry about the slicing logic - we can get it 
right once in memoryview, and implementers of producer objects then don't have 
to deal with it.
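
A rough pure-Python model of that split may help make it concrete. Everything
here is hypothetical: the class names ManagedBuffer and View, the hand-rolled
reference count, and the rule that a view's release() merely drops its
reference are one reading of the proposal, not the eventual C implementation.
A real memoryview stands in for the single underlying Py_buffer.

```python
class ManagedBuffer:
    """Hypothetical model of PyManagedBuffer: owns one buffer export."""

    def __init__(self, obj):
        self.raw = memoryview(obj)  # stands in for the Py_buffer struct
        self.refcount = 0           # counted by hand in this model

    def incref(self):
        self.refcount += 1
        return self

    def decref(self):
        self.refcount -= 1
        if self.refcount == 0:
            self.raw.release()      # last user gone: give the buffer back

    def release(self):
        # Explicit release is refused while the buffer is still shared.
        if self.refcount > 1:
            raise BufferError("managed buffer is still shared")
        self.raw.release()


class View:
    """Hypothetical model of memoryview: only interprets the buffer."""

    def __init__(self, managed, start, stop):
        self.managed = managed.incref()
        self.start, self.stop = start, stop

    @classmethod
    def of(cls, obj):
        managed = ManagedBuffer(obj)
        return cls(managed, 0, len(managed.raw))

    def __getitem__(self, key):
        # Slicing only changes the view's own offsets; the managed
        # buffer is shared, never copied or re-requested.
        start, stop, step = key.indices(self.stop - self.start)
        if step != 1:
            raise NotImplementedError("only contiguous slices are modelled")
        return View(self.managed, self.start + start, self.start + stop)

    def tobytes(self):
        return self.managed.raw[self.start:self.stop].tobytes()

    def release(self):
        self.managed.decref()
        self.managed = None


data = bytearray(b"abcdef")
m = View.of(data)
m2 = m[1:4]
m3 = m2[1:]
shared = m.managed is m2.managed is m3.managed
payload = m3.tobytes()

try:
    m.managed.release()   # three views still share the buffer
    refused = False
except BufferError:
    refused = True

for view in (m3, m2, m):
    view.release()        # the last release hands the buffer back
data.extend(b"g")         # resizing works again once nothing is exported
```

In this model the producer (the bytearray) only ever sees a single export, no
matter how many slices are taken from the view.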

Currently, however, memoryview gets tied up in knots since it is trying to do 
everything itself in a way that makes it unclear what is going on. The 
semantics of copying the Py_buffer struct or of accessing the PEP 3118 API on 
the underlying object when slicing or copying views are demonstrably broken. If 
we tried to shoehorn reference counting semantics into the current object model, 
we would end up with two distinct modes of operation for memoryview:

  Direct: the view is directly accessing an underlying object via the
    PEP 3118 API
  Indirect: the view has a reference to another memoryview object that
    it is using as a data source

That's complicated - hard to implement in the first place and hard to follow 
when reading the code. Adding the PyManagedBuffer object makes the object model 
more complex, but simplifies the runtime semantics: every memoryview instance 
will access a PyManagedBuffer object which takes care of the underlying PEP 
3118 details. Direct use of the PEP 3118 consumer API in third-party code will 
also be strongly discouraged, with PyManagedBuffer promoted as the preferred 
alternative (producers, of course, will still need to provide the raw Py_buffer 
data that PyManagedBuffer exposes).

At the Python level, I don't think it is necessary to expose a new object, so 
we can stick with Antoine's preferred model where memoryview is the only public 
API. My proposed new PyManagedBuffer object would just be about making life 
easier at the C level.

----------

_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10181>
_______________________________________