Nick Coghlan <[email protected]> added the comment:
I'll try to do a summary of the conversation so far, since it's quite long and
hard to follow.
The basic issue is that memoryview needs to support copying and slicing that
creates a new memoryview object. The major problem is that the PEP 3118
semantics, as implemented, are such that neither copying the Py_buffer struct
*nor* requesting a new copy of the struct from the underlying object will do
the right thing in all cases. (According to the PEP *as written*, copying
probably should have been OK, but the implementation differs from the PEP in
several important respects, so copying is definitely wrong unless the
lifecycles of the copies are tightly controlled relative to the original.)
Therefore, we either need to redesign the buffer export from memoryview to use
daisy chaining (such that in "m = memoryview(obj); m2 = m[:]; m3 = m2[:]" m3
references m2 which references m which in turn references obj) or else we need
to introduce an internal reference counted object (PyManagedBuffer) which
allows a single view of an underlying object to be safely shared amongst
multiple clients (such that m, m2 and m3 would all reference the same managed
buffer instance which holds the reference to obj). My preference is strongly
for the latter approach as it prevents unbounded and wasteful daisy chaining
while also providing a clean, easy to use interface that will make it easier
for 3rd parties to write PEP 3118 API consumers (by using PyManagedBuffer
instead of the raw Py_buffer struct).
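To make the second option concrete, here is a minimal standalone sketch of the reference counting idea. It deliberately avoids Python.h: ManagedBuffer, RawBuffer and the managed_* functions are hypothetical names invented for illustration, not a proposed API, and the release counter merely stands in for the point where PyBuffer_Release() would be called on the underlying object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the exporter's buffer; not the real Py_buffer. */
typedef struct {
    char *buf;
    size_t len;
} RawBuffer;

/* Hypothetical managed buffer: one acquired view plus a reference count. */
typedef struct {
    size_t refcount;
    RawBuffer raw;
    int *release_count;   /* counts how often the exporter was released */
} ManagedBuffer;

/* Acquire the buffer once; the caller owns the first reference. */
static ManagedBuffer *managed_new(char *data, size_t len, int *release_count)
{
    ManagedBuffer *mb = malloc(sizeof *mb);
    if (mb == NULL)
        return NULL;
    mb->refcount = 1;
    mb->raw.buf = data;
    mb->raw.len = len;
    mb->release_count = release_count;
    return mb;
}

/* Share the same managed buffer with one more client. */
static ManagedBuffer *managed_incref(ManagedBuffer *mb)
{
    assert(mb != NULL && mb->refcount > 0);
    mb->refcount++;
    return mb;
}

/* Drop one reference; release the exporter only when the last client is done. */
static void managed_decref(ManagedBuffer *mb)
{
    assert(mb != NULL && mb->refcount > 0);
    if (--mb->refcount == 0) {
        (*mb->release_count)++;   /* stands in for PyBuffer_Release() */
        free(mb);
    }
}
```

With this shape, m, m2 and m3 would each hold one reference to the same instance, and the underlying object would be released exactly once, when the last of them goes away.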
Once that basic lifecycle problem for the underlying buffers is dealt with then
we can start worrying about other problems like exporting Py_buffer objects
from memoryview instances correctly. The lifecycle problem is unrelated to the
details of the buffer *contents* though - it's entirely about the fact that
clients can't safely copy all those pointers (as some may refer to addresses
inside the struct) and asking the original object for a fresh copy is permitted
to return a different answer each time.
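The pointer problem can be shown without any Python headers. The sketch below uses MiniBuffer, a hypothetical trimmed-down stand-in for Py_buffer whose shape pointer refers to storage inside the struct itself. The field names are assumptions made for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for Py_buffer: `shape` may point at
   the struct's own inline storage. */
typedef struct {
    char *buf;
    ptrdiff_t *shape;
    ptrdiff_t inline_shape[1];
} MiniBuffer;

/* Fill in a one-dimensional view whose shape lives inside the struct. */
static void mini_init(MiniBuffer *b, char *data, ptrdiff_t len)
{
    b->buf = data;
    b->inline_shape[0] = len;
    b->shape = b->inline_shape;   /* pointer into the struct itself */
}

/* Returns 1 if the view's shape pointer refers to its own inline storage. */
static int mini_is_self_contained(const MiniBuffer *b)
{
    return b->shape == b->inline_shape;
}
```

After a plain struct assignment such as "MiniBuffer copy = original;", copy.shape still points at original.inline_shape, so the copy is only valid for as long as the original stays alive and unmodified.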
The actual *slicing* code in memoryview isn't too bad - it just needs to use
dedicated storage rather than messing with the contents of the Py_buffer struct
it received from the underlying object. Probably the easiest way to handle that
is by having the PyManagedBuffer reference be in *addition* to the current
Py_buffer struct in the internal state - then the latter can be used to record
the effects of the slicing, if any. Because we know the original Py_buffer
struct is guaranteed to remain alive and unmodified, we don't need to worry
about fiddling with any copied pointers - we can just leave them pointing into
the original structure.
When accessed via the PEP 3118 API, memoryview objects would then export that
modified Py_buffer struct rather than the original one (so daisy chaining would
be possible, but we wouldn't make it easy to do from pure Python code, as both
the memoryview constructor and slicing would give each new memoryview object a
reference to the original managed buffer and just update the internal view
details as appropriate).
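A rough standalone sketch of that slicing behaviour, restricted to a one-dimensional byte buffer, might look like the following. SourceView, SliceView and both functions are hypothetical names used only for illustration. The point is that a slice of a slice still refers to the shared source rather than to its parent view, and that the source description is never modified.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read-only description of the exporter's 1-D buffer. */
typedef struct {
    const char *buf;
    ptrdiff_t len;
} SourceView;

/* Hypothetical memoryview state: shared source plus private slice details. */
typedef struct {
    const SourceView *source;  /* shared, never modified */
    const char *buf;           /* start of this view's data */
    ptrdiff_t len;             /* number of bytes visible through this view */
} SliceView;

/* Create a view covering the whole source. */
static SliceView view_from_source(const SourceView *src)
{
    SliceView v = { src, src->buf, src->len };
    return v;
}

/* Slice [start, stop) of an existing view, clamping to its bounds.
   The result refers to the same source, not to the parent view. */
static SliceView view_slice(const SliceView *parent,
                            ptrdiff_t start, ptrdiff_t stop)
{
    SliceView v;
    if (start < 0)
        start = 0;
    if (start > parent->len)
        start = parent->len;
    if (stop > parent->len)
        stop = parent->len;
    if (stop < start)
        stop = start;
    v.source = parent->source;
    v.buf = parent->buf + start;
    v.len = stop - start;
    return v;
}
```

Here the equivalent of "m3 = m2[1:3]" records its own start and length but shares m's source, so no chain from m3 through m2 to m is ever built.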
Here's the current MemoryView definition:

  typedef struct {
      PyObject_HEAD
      Py_buffer view;
  } PyMemoryViewObject;
The TL;DR version of the above is that I would like to see it become:

  typedef struct {
      PyObject_HEAD
      PyManagedBuffer *source_data; /* shared read-only Py_buffer access */
      Py_buffer view;               /* shape, strides, etc. potentially modified */
  } PyMemoryViewObject;
Once the internal Py_buffer had been initialised, the memoryview code actually
wouldn't *use* the source data reference all that much (aside from eventually
releasing the buffer, it wouldn't use it at all). Instead, that reference would
be retained solely to control the lifecycle of the original Py_buffer object
relative to the modified copies in the various memoryview instances.
Does all that make my perspective any clearer?
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10181>
_______________________________________