Nick Coghlan <ncogh...@gmail.com> added the comment:

I'll try to do a summary of the conversation so far, since it's quite long and 
hard to follow.

The basic issue is that memoryview needs to support copying and slicing that 
creates a new memoryview object. The major problem is that the PEP 3118 
semantics, as implemented, mean that neither copying the Py_buffer struct *nor* 
requesting a new copy of the struct from the underlying object will do the 
right thing in all cases. (According to the PEP *as written*, copying probably 
should have been OK, but the implementation doesn't match the PEP in several 
important respects, so copying is definitely wrong unless the lifecycles of the 
copies are tightly controlled relative to the original.)

Therefore, we either need to redesign the buffer export from memoryview to use 
daisy chaining (such that in "m = memoryview(obj); m2 = m[:]; m3 = m2[:]" m3 
references m2, which references m, which in turn references obj) or else we 
need to introduce an internal reference-counted object (PyManagedBuffer) which 
allows a single view of an underlying object to be safely shared amongst 
multiple clients (such that m, m2 and m3 would all reference the same managed 
buffer instance, which holds the reference to obj). My preference is strongly 
for the latter approach, as it prevents unbounded and wasteful daisy chaining 
while also providing a clean, easy-to-use interface that will make it easier 
for third parties to write PEP 3118 API consumers (by using PyManagedBuffer 
instead of the raw Py_buffer struct).
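
A minimal sketch of the shared managed buffer idea, written in plain C rather 
than against the real CPython API. The names Exporter, ManagedBuffer and the 
managed_* helpers are hypothetical and exist only for illustration; the 
"exports" counter stands in for whatever bookkeeping the underlying object does 
when it hands out a buffer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the exporting object (obj in the text). */
typedef struct {
    int exports;            /* number of outstanding buffer exports */
} Exporter;

/* Hypothetical model of PyManagedBuffer: one export, shared by refcount. */
typedef struct {
    int refcount;           /* number of clients sharing this export */
    Exporter *obj;          /* the single reference back to the exporter */
} ManagedBuffer;

/* Acquire the buffer from the exporter exactly once. */
static ManagedBuffer *managed_new(Exporter *obj)
{
    ManagedBuffer *mb = malloc(sizeof *mb);
    if (mb == NULL)
        return NULL;
    mb->refcount = 1;
    mb->obj = obj;
    obj->exports++;
    return mb;
}

/* A new client (e.g. a sliced view) shares the existing export. */
static ManagedBuffer *managed_incref(ManagedBuffer *mb)
{
    mb->refcount++;
    return mb;
}

/* Release the export only when the last client drops its reference. */
static void managed_decref(ManagedBuffer *mb)
{
    assert(mb->refcount > 0);
    if (--mb->refcount == 0) {
        mb->obj->exports--;
        free(mb);
    }
}
```

In this model m, m2 and m3 would each hold the same ManagedBuffer pointer, so 
the exporter sees a single export no matter how many views are derived from it.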

Once that basic lifecycle problem for the underlying buffers is dealt with, 
we can start worrying about other problems like exporting Py_buffer objects 
from memoryview instances correctly. The lifecycle problem is unrelated to the 
details of the buffer *contents* though - it's entirely about the fact that 
clients can't safely copy all those pointers (as some may refer to addresses 
inside the struct) and asking the original object for a fresh copy is permitted 
to return a different answer each time.
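
The copying hazard can be illustrated with a simplified struct whose shape 
pointer may refer to storage inside the struct itself. MiniBuffer and its 
helpers are hypothetical names for this sketch, not the real Py_buffer layout; 
the point is only that a plain struct copy leaves the copied pointer aimed at 
the original.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified buffer struct (not the real Py_buffer). */
typedef struct {
    ptrdiff_t *shape;           /* may point at inline_shape below */
    ptrdiff_t inline_shape[2];  /* small inline storage inside the struct */
} MiniBuffer;

/* Initialise a 1-D buffer whose shape lives in its own inline storage. */
static void mini_init(MiniBuffer *b, ptrdiff_t len)
{
    b->inline_shape[0] = len;
    b->inline_shape[1] = 0;
    b->shape = b->inline_shape;  /* pointer into the struct itself */
}

/* Returns 1 if shape refers to this struct's own inline storage. */
static int mini_shape_is_internal(const MiniBuffer *b)
{
    return b->shape == b->inline_shape;
}
```

After "copy = orig;" the copy's shape pointer still refers to orig's inline 
storage, so the copy is only valid for as long as orig stays alive and 
unmodified.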

The actual *slicing* code in memoryview isn't too bad - it just needs to use 
dedicated storage rather than messing with the contents of the Py_buffer struct 
it received from the underlying object. Probably the easiest way to handle that 
is by having the PyManagedBuffer reference be in *addition* to the current 
Py_buffer struct in the internal state - then the latter can be used to record 
the effects of the slicing, if any. Because we know the original Py_buffer 
struct is guaranteed to remain alive and unmodified, we don't need to worry 
about fiddling with any copied pointers - we can just leave them pointing into 
the original structure.

When accessed via the PEP 3118 API, memoryview objects would then export that 
modified Py_buffer struct rather than the original one. Daisy chaining would 
therefore still be possible, but we wouldn't make it easy to do from pure 
Python code, as both the memoryview constructor and slicing would give each new 
memoryview object a reference to the original managed buffer and just update 
the internal view details as appropriate.
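
A rough sketch of slicing under that scheme, restricted to one dimension and 
one-byte items. Source, View and the view_* helpers are hypothetical names for 
illustration only: Source plays the role of the unmodified original buffer, 
while each View records the effect of slicing in its own fields.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 1-D model; names are illustrative, not CPython API. */
typedef struct {
    const char *buf;        /* start of the exported memory */
    ptrdiff_t len;          /* number of one-byte items */
} Source;                   /* stands in for the original, unmodified buffer */

typedef struct {
    const Source *source;   /* shared, never modified through the view */
    const char *start;      /* first item of this view */
    ptrdiff_t shape;        /* item count, owned by this view */
    ptrdiff_t stride;       /* bytes between items, owned by this view */
} View;

/* A fresh view covering the whole source. */
static View view_new(const Source *src)
{
    View v = { src, src->buf, src->len, 1 };
    return v;
}

/* v[start:stop:step], assuming 0 <= start <= stop <= v->shape and step > 0. */
static View view_slice(const View *v, ptrdiff_t start, ptrdiff_t stop,
                       ptrdiff_t step)
{
    View out;
    assert(0 <= start && start <= stop && stop <= v->shape && step > 0);
    out.source = v->source;                       /* refer to the source, not to v */
    out.start = v->start + start * v->stride;
    out.shape = (stop - start + step - 1) / step; /* ceiling division */
    out.stride = v->stride * step;
    return out;
}

/* Read item i of the view. */
static char view_item(const View *v, ptrdiff_t i)
{
    assert(0 <= i && i < v->shape);
    return v->start[i * v->stride];
}
```

Slicing a slice yields a view that refers directly to the source, so no chain 
of views builds up and the source's own fields are never touched.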

Here's the current PyMemoryViewObject definition:

typedef struct {
    PyObject_HEAD
    Py_buffer view;
} PyMemoryViewObject;

The TL;DR version of the above is that I would like to see it become:

typedef struct {
    PyObject_HEAD
    PyManagedBuffer *source_data; // reference to shared read-only Py_buffer access
    Py_buffer view;  // shape, strides, etc potentially modified
} PyMemoryViewObject;

Once the internal Py_buffer had been initialised, the memoryview code actually 
wouldn't *use* the source data reference all that much (aside from eventually 
releasing the buffer, it wouldn't use it at all). Instead, that reference would 
be retained solely to control the lifecycle of the original Py_buffer object 
relative to the modified copies in the various memoryview instances.

Does all that make my perspective any clearer?

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10181>
_______________________________________