Re: [Python-Dev] An updated extended buffer PEP
Carl Banks wrote:

> Now object B takes a view of A. If we don't have this field, then B
> will have to hold a reference to A, like this:
>
> B -> A -> R
>
> A would be responsible for keeping track of views,

A isn't keeping track of views, it's keeping track of the single object R, which it has to keep a reference to anyway.

> and A could not be
> garbage collected until B disappears.

I'm not convinced that this would be a serious problem. An object that's using a different object to manage the buffer is probably quite small, so it doesn't matter much if it stays around.

> Here's a concrete example of where it would be useful: consider a
> ByteBufferSlice object. Basically, the object represents a
> shared-memory slice of a 1-D array of bytes (for example, Python 3000
> bytes object, or an mmap object).

And this would be a very small object, not worth the trouble of caring whether it stays around a bit longer than needed, IMO.

> P.S. In thinking about this, it occurred to me that there should be a
> way to lock the buffer without requesting details.

Perhaps you could do this by calling getbuffer with NULL for the bufferinfo pointer, and similarly call releasebuffer with NULL to unlock it.

--
Greg
___
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
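Editorial sketch: the NULL convention suggested above could look roughly like the following. This is only an illustration under stated assumptions. The flat `struct bufferinfo`, the `byte_store` exporter and both function names are hypothetical stand-ins, not anything defined in the PEP draft or in CPython.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat info struct; the thread had not settled the fields. */
struct bufferinfo {
    void *buf;
    ptrdiff_t len;
};

/* Hypothetical exporting object that counts outstanding locks. */
struct byte_store {
    unsigned char bytes[8];
    int lock_count;
};

/* Lock the buffer; fill in details only if the caller asked for them. */
int byte_store_getbuffer(struct byte_store *obj, struct bufferinfo *info)
{
    if (info != NULL) {
        info->buf = obj->bytes;
        info->len = (ptrdiff_t)sizeof obj->bytes;
    }
    obj->lock_count++;
    return 0;
}

/* Unlock; a NULL info pairs with a lock-only getbuffer call. */
int byte_store_releasebuffer(struct byte_store *obj, struct bufferinfo *info)
{
    if (obj->lock_count <= 0)
        return -1;              /* release without a matching lock */
    if (info != NULL) {
        info->buf = NULL;
        info->len = 0;
    }
    obj->lock_count--;
    return 0;
}
```

With this shape a separate lockbuffer slot would not be needed, at the cost of every exporter having to check for NULL.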
Re: [Python-Dev] An updated extended buffer PEP
Carl Banks wrote:

> /* don't define releasebuffer or lockbuffer */
> /* only objects that manage buffers themselves would define these */

That's an advantage, but it's a pretty small one -- the releasebuffer implementation would be very simple in this case. I'm bothered that the releaser field makes the protocol asymmetrical and thus harder to reason about. It would cost me more mental effort to convince myself that a releasebuffer implementation wasn't needed in any particular case than it would to write the one-line implementation otherwise required.

--
Greg
Re: [Python-Dev] An updated extended buffer PEP
Carl Banks wrote:

> Here's a concrete example of where it would be useful: consider a
> ByteBufferSlice object. Basically, the object represents a
> shared-memory slice of a 1-D array of bytes (for example, Python 3000
> bytes object, or an mmap object).
>
> Now, if the ByteBufferSlice object could tell the consumer that someone
> else is managing the buffer, then it wouldn't have to keep track of
> views, thus simplifying things.
>
> P.S. In thinking about this, it occurred to me that there should be a
> way to lock the buffer without requesting details. ByteBufferSlice
> would already know the details of the buffer, but it would need to
> increment the original buffer's lock count. Thus, I propose a new function:
>
> typedef int (*lockbufferproc)(PyObject* self);

And, because real examples are better than philosophical speculations, here's a skeleton implementation of the ByteBufferSlice array, sans boilerplate and error checking, and with some educated guessing about future details:

    typedef struct {
        PyObject_HEAD
        PyObject* releaser;
        unsigned char* buf;
        Py_ssize_t length;
    } ByteBufferSliceObject;

    PyObject* ByteBufferSlice_new(PyObject* bufobj,
                                  Py_ssize_t start, Py_ssize_t end) {
        ByteBufferSliceObject* self;
        BufferInfoObject* bufinfo;

        /* "type" is the ByteBufferSlice type object (boilerplate omitted) */
        self = (ByteBufferSliceObject*)type->tp_alloc(type, 0);
        bufinfo = PyObject_GetBuffer(bufobj);

        self->releaser = bufinfo->releaser;
        self->buf = bufinfo->buf + start;
        self->length = end-start;

        /* look how soon we're done with this information */
        Py_DECREF(bufinfo);

        return (PyObject*)self;
    }

    void ByteBufferSlice_dealloc(ByteBufferSliceObject* self) {
        PyObject_ReleaseBuffer(self->releaser);
        self->ob_type->tp_free((PyObject*)self);
    }

    PyObject* ByteBufferSlice_getbuffer(ByteBufferSliceObject* self, int flags) {
        BufferInfoObject* bufinfo;
        static Py_ssize_t stridesarray[] = { 1 };

        bufinfo = BufferInfo_New();
        bufinfo->releaser = self->releaser;
        bufinfo->writable = 1;
        bufinfo->buf = self->buf;
        bufinfo->length = self->length;
        bufinfo->ndims = 1;
        bufinfo->strides = stridesarray;
        bufinfo->size = &self->length;
        bufinfo->subbufoffsets = NULL;

        /* Before we go, increase the original buffer's lock count */
        PyObject_LockBuffer(self->releaser);

        return (PyObject*)bufinfo;
    }

    /* don't define releasebuffer or lockbuffer */
    /* only objects that manage buffers themselves would define these */

    /* Now look how easy this is */
    /* Everything works out if ByteBufferSlice reexports the buffer */

    PyObject* ByteBufferSlice_getslice(ByteBufferSliceObject* self,
                                       Py_ssize_t start, Py_ssize_t end) {
        return ByteBufferSlice_new((PyObject*)self, start, end);
    }

The implementation of this is very straightforward, and it's easy to see why and how "bufinfo->releaser" works, and why it'd be useful.

It's almost like there are two protocols here: a buffer exporter protocol (getbuffer) and a buffer manager protocol (lockbuffer and releasebuffer). Some objects would support only the exporter protocol; others both.

Carl Banks
Re: [Python-Dev] An updated extended buffer PEP
Travis Oliphant wrote:

> Carl Banks wrote:
>> Travis E. Oliphant wrote:
>>> I think we are getting closer. What do you think about Greg's idea
>>> of basically making the provider the bufferinfo structure and having
>>> the exporter handle copying memory over for shape and strides if it
>>> wants to be able to change those before the lock is released.
>> It seems like it's just a different way to return the data. You could
>> do it by setting values through pointers, or do it by returning a
>> structure. Which way you choose is a minor detail in my opinion. I'd
>> probably favor returning the information in a structure.
>>
>> I would consider adding two fields to the structure:
>>
>> size_t structsize; /* size of the structure */
> Why is this necessary? can't you get that by sizeof(bufferinfo)?

In case you want to add something later. Though if you did that, it would be a different major release, meaning you'd have to rebuild anyway. They rashly add fields to the PyTypeObject in the same way. :) So never mind.

>> PyObject* releaser; /* the object you need to call releasebuffer on */
> Is this so that another object could be used to manage releases if desired?

Yes, that was a use case I saw for a different "view" object. I don't think it's crucially important to have it, but for exporting objects that delegate management of the buffer to another object, it would be very helpful if the exporter could tell consumers that the other object is managing the buffer.

Suppose A is an exporting object, but it uses a hidden object R to manage the buffer memory. Thus you have A referring to R, like this:

    A -> R

Now object B takes a view of A. If we don't have this field, then B will have to hold a reference to A, like this:

    B -> A -> R

A would be responsible for keeping track of views, and A could not be garbage collected until B disappears.

If we do have this field, then A could tell B to hold a reference to R instead:

    B -> R
    A -> R

A is no longer obliged to keep track of views, and it can be garbage collected even if B still exists.

Here's a concrete example of where it would be useful: consider a ByteBufferSlice object. Basically, the object represents a shared-memory slice of a 1-D array of bytes (for example, Python 3000 bytes object, or an mmap object).

Now, if the ByteBufferSlice object could tell the consumer that someone else is managing the buffer, then it wouldn't have to keep track of views, thus simplifying things.

P.S. In thinking about this, it occurred to me that there should be a way to lock the buffer without requesting details. ByteBufferSlice would already know the details of the buffer, but it would need to increment the original buffer's lock count. Thus, I propose a new function:

    typedef int (*lockbufferproc)(PyObject* self);

Carl Banks
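Editorial sketch: the B -> R arrangement described in this message can be modelled without any Python headers. Everything below is a hypothetical stand-in (plain integer refcounts, made-up struct and function names) meant only to show why A can go away while B keeps using the memory. It is not the API proposed in the PEP draft.

```c
#include <assert.h>
#include <stddef.h>

/* R: hypothetical hidden object that owns the memory and counts locks. */
struct manager {
    int refcount;
    int lock_count;
    unsigned char bytes[16];
};

/* A: exporter that delegates buffer management to R. */
struct exporter {
    struct manager *r;
};

/* What a consumer B receives; releaser names the object to release on. */
struct bufferinfo {
    struct manager *releaser;
    unsigned char *buf;
    ptrdiff_t length;
};

/* A hands out R's memory and tells the consumer to deal with R directly. */
void exporter_getbuffer(struct exporter *a, struct bufferinfo *info)
{
    info->releaser = a->r;
    info->buf = a->r->bytes;
    info->length = (ptrdiff_t)sizeof a->r->bytes;
    a->r->refcount++;       /* B now holds its own reference to R */
    a->r->lock_count++;
}

/* A is collected: it gives up its reference to R and nothing else. */
void exporter_drop(struct exporter *a)
{
    a->r->refcount--;
    a->r = NULL;
}

/* B is finished: unlock and drop its reference, both on R, never on A. */
void bufferinfo_release(struct bufferinfo *info)
{
    info->releaser->lock_count--;
    info->releaser->refcount--;
    info->releaser = NULL;
    info->buf = NULL;
}
```

In this model A never records which views exist. The only bookkeeping lives in R, which is the simplification argued for above.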
Re: [Python-Dev] An updated extended buffer PEP
Carl Banks wrote:

> Travis E. Oliphant wrote:
>> I think we are getting closer. What do you think about Greg's idea
>> of basically making the provider the bufferinfo structure and having
>> the exporter handle copying memory over for shape and strides if it
>> wants to be able to change those before the lock is released.
>
> It seems like it's just a different way to return the data. You could
> do it by setting values through pointers, or do it by returning a
> structure. Which way you choose is a minor detail in my opinion. I'd
> probably favor returning the information in a structure.
>
> I would consider adding two fields to the structure:
>
> size_t structsize; /* size of the structure */

Why is this necessary? Can't you get that by sizeof(bufferinfo)?

> PyObject* releaser; /* the object you need to call releasebuffer on */

Is this so that another object could be used to manage releases if desired?

-Travis
Re: [Python-Dev] An updated extended buffer PEP
Travis Oliphant wrote:

> Perhaps, though we can stick with an
> object-less buffer interface but have this "view object" as an expanded
> buffer object.

I like this idea.

--
Greg
Re: [Python-Dev] An updated extended buffer PEP
Travis E. Oliphant wrote:

> Greg Ewing wrote:
>> struct bufferinfo {
>>    ...
>> };
>>
>> int (*getbuffer)(PyObject *obj, struct bufferinfo *info);
>> int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);
> This is not much different from my original "view" object. Stick a
> PyObject_HEAD at the start of this bufferinfo and you have it.

The important difference is that it *doesn't* have PyObject_HEAD at the start of it. :-)

> I don't see why a PyObject_HEAD would make anything significantly
> slower. Then we could use Python's memory management very easily to
> create and destroy these things.

In the case where the shape/stride info is constant, and the caller is able to allocate the struct bufferinfo on the stack, my proposal requires no memory allocations at all. That's got to be faster than allocating and freeing a Python object.

When it is necessary to allocate memory for the shape/stride, some mallocs and frees (or Python equivalents) are going to be needed either way. I don't see how using a Python object makes this any easier.

--
Greg
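Editorial sketch: the allocation-free case argued for above might look like this. It assumes a flattened `struct bufferinfo` (plain fields rather than the pointer-to-pointer fields in the original proposal) and a made-up `fixed_matrix` exporter whose shape never changes. None of these names come from the PEP draft.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical flat info struct with no object header. */
struct bufferinfo {
    void *buf;
    ptrdiff_t len;               /* total size in bytes */
    int ndims;
    const ptrdiff_t *shape;
    const ptrdiff_t *strides;    /* in bytes */
};

/* Hypothetical exporter whose shape and strides are constant. */
struct fixed_matrix {
    double data[6];
    ptrdiff_t shape[2];
    ptrdiff_t strides[2];
    int lock_count;
};

void fixed_matrix_init(struct fixed_matrix *m)
{
    int i;
    for (i = 0; i < 6; i++)
        m->data[i] = (double)(i + 1);
    m->shape[0] = 2;
    m->shape[1] = 3;
    m->strides[0] = 3 * (ptrdiff_t)sizeof(double);
    m->strides[1] = (ptrdiff_t)sizeof(double);
    m->lock_count = 0;
}

/* Fill the caller's struct with pointers into the exporter's own memory. */
int fixed_matrix_getbuffer(struct fixed_matrix *m, struct bufferinfo *info)
{
    info->buf = m->data;
    info->len = (ptrdiff_t)sizeof m->data;
    info->ndims = 2;
    info->shape = m->shape;
    info->strides = m->strides;
    m->lock_count++;
    return 0;
}

/* Nothing was allocated, so releasing only unlocks. */
int fixed_matrix_releasebuffer(struct fixed_matrix *m, struct bufferinfo *info)
{
    (void)info;
    m->lock_count--;
    return 0;
}

/* Consumer: the info struct lives on the stack for the whole access. */
double sum_all(struct fixed_matrix *m)
{
    struct bufferinfo info;
    double total = 0.0;
    ptrdiff_t i, j;

    if (fixed_matrix_getbuffer(m, &info) != 0)
        return 0.0;
    for (i = 0; i < info.shape[0]; i++) {
        for (j = 0; j < info.shape[1]; j++) {
            const char *p = (const char *)info.buf
                            + i * info.strides[0] + j * info.strides[1];
            total += *(const double *)p;
        }
    }
    fixed_matrix_releasebuffer(m, &info);
    return total;
}
```

No heap allocation happens anywhere on this path, which is the cost comparison being made against an object with a PyObject_HEAD.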
Re: [Python-Dev] An updated extended buffer PEP
Lisandro Dalcin wrote:

> On 3/26/07, Travis Oliphant <[EMAIL PROTECTED]> wrote:
>> Here is my updated PEP which incorporates several parts of the
>> discussions we have been having.
>
> Travis, it looks really good, below my comments

I hope you don't mind me replying to python-dev.

>
> 1- Is it hard to EXTEND PyBufferProcs in order to be able to use all
> this machinery in Py 2.X series, not having to wait until Py3k?

No, I don't think it will be hard. I just wanted to focus on Py3k since it is going to happen before Python 2.6 and I wanted it discussed in that world.

>
> 2- It's not clear to me if this PEP will enable object types defined
> in the Python side to export buffer info. This is a feature I really
> like in numpy, and simplifies my life a lot when I need to export
> memory for C/C++ object wrapped with the help of tools like SWIG.

This PEP does not address that. You will have to rely on the objects themselves for any such information.

>
> 3- Why not constrain the returned 'view' object to be of a
> specific type defined in the C side (and perhaps available in the
> Python side)? This 'view' object could maintain a reference to the
> base object containing the data, could call releasebuffer using the
> base object when the view object is decref'ed, and can have a flag
> field for things like OWN_MEMORY, OWN_SHAPE, etc in order to properly
> manage memory deallocation. Does all this make sense?

Yes, that was my original thinking and we are kind of coming back to it after several iterations. Perhaps, though, we can stick with an object-less buffer interface but have this "view object" as an expanded buffer object.

-Travis
Re: [Python-Dev] An updated extended buffer PEP
Carl Banks wrote:

> Travis Oliphant wrote:
>> Travis Oliphant wrote:
>>> Hi Carl and Greg,
>>>
>>> Here is my updated PEP which incorporates several parts of the
>>> discussions we have been having.
>> And here is the actual link:
>>
>> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt
>
>
> What's the purpose of void** segments in PyObject_GetBuffer? It seems
> like it's leftover from an older incarnation?
>

Yeah, I forgot to change that location.

> I'd hope after more recent discussion, we'll end up simplifying
> releasebuffer. It seems like it'd be a nightmare to keep track of what
> you've released.

Yeah, I agree. I think I'm leaning toward the bufferinfo structure which allows the exporter to copy memory for things that it wants to be free to change while the buffer is exported.

>
>
> Finally, the isptr thing. It's just not sufficient. Frankly, I'm
> having doubts whether it's a good idea to support multibuffer at all.
> Sure, it brings generality, but I'm thinking it's too hard to explain and
> too hard to get one's head around, and will lead to lots of
> misunderstanding and bugginess. OTOH, it really doesn't burden anyone
> except those who want to export multi-buffered arrays, and we only have
> one shot to do it. I just hope it doesn't confuse everyone so much that
> no one bothers.

People used to have doubts about explaining strides in NumPy as well. I sure would have hated to see them eliminate the possibility because of those doubts. I think the addition you discuss is not difficult once you get a hold of it.

I also understand now why subbufferoffsets is needed. I was thinking that for slices you would just re-create a whole other array of pointers to contain that addition. But that is really not advisable. It makes sense when you are talking about a single pointer variable (like in NumPy) but it doesn't when you have an array of pointers.

Providing the example about how to extract the pointer from the returned information goes a long way towards clearing up any remaining confusion. Your ImageObject example is also helpful. I really like the addition and think it is clear enough and supports a lot of use cases with very little effort.

>
> Here's how I would update the isptr thing. I've changed "derefoff" to
> "subbufferoffsets" to describe it better.
>
>
> typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
>                                    Py_ssize_t *len, int *writeable,
>                                    char **format, int *ndims,
>                                    Py_ssize_t **shape,
>                                    Py_ssize_t **strides,
>                                    Py_ssize_t **subbufferoffsets);
>
>
> subbufferoffsets
>
>    Used to export information about multibuffer arrays. It is an
>    address of a ``Py_ssize_t *`` variable that will be set to point at
>    an array of ``Py_ssize_t`` of length ``*ndims``.
>
>    [I don't even want to try a verbal description.]
>
>    To demonstrate how subbufferoffsets works, here is an example of a
>    function that returns a pointer to an element of ANY N-dimensional
>    array, single- or multi-buffered.
>
>    void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
>                           Py_ssize_t* subarrayoffs, Py_ssize_t *indices) {
>        char* pointer = (char*)buf;
>        int i;
>        for (i = 0; i < ndim; i++) {
>            pointer += strides[i]*indices[i];
>            if (subarrayoffs[i] >= 0) {
>                pointer = *(char**)pointer + subarrayoffs[i];
>            }
>        }
>        return (void*)pointer;
>    }
>
>    For single buffers, subbufferoffsets is negative in every dimension
>    and it reduces to normal single-buffer indexing.

What about just having subbufferoffsets be NULL in this case? I.e., you don't need it. If some of the dimensions did not need dereferencing then they would be negative (how about we just say -1 to be explicit)?

>    For multi-buffers,
>    subbufferoffsets indicates when to dereference the pointer and switch
>    to the new buffer, and gives the offset into the buffer to start at.
>    In most cases, the subbufferoffset would be zero (indicating it should
>    start at the beginning of the new buffer), but can be a positive
>    number if the following dimension has been sliced, and thus the 0th
>    entry in that dimension would not be at the beginning of the new
>    buffer.
>
>
>
> Other than that, looks good. :)
>

I think we are getting closer. What do you think about Greg's idea of basically making the provider the bufferinfo structure and having the exporter handle copying memory over for shape and strides if it wants to be able to change those before the lock is released.

-Travis
Re: [Python-Dev] An updated extended buffer PEP
Greg Ewing wrote:

> Here's another idea, to allow multiple views of the same
> buffer with different shape/stride info to coexist, but
> without extra provider objects or refcount weirdness.
> Also it avoids using calls with a brazillion arguments.
>
>    struct bufferinfo {
>      void **buf;
>      Py_ssize_t *len;
>      int *writeable;
>      char **format;
>      int *ndims;
>      Py_ssize_t **shape;
>      Py_ssize_t **strides;
>      int **isptr;
>    };
>
>    int (*getbuffer)(PyObject *obj, struct bufferinfo *info);
>
>    int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);
>

This is not much different from my original "view" object. Stick a PyObject_HEAD at the start of this bufferinfo and you have it. Memory management was the big reason I wanted to do something like this.

I don't see why a PyObject_HEAD would make anything significantly slower. Then we could use Python's memory management very easily to create and destroy these things. This bufferinfo object would become the "provider" I was talking about.

> If the object has constant shape/stride info, it just fills
> in the info struct with pointers to its own memory, and does
> nothing when releasebuffer is called (other than unlocking
> its buffer).
>
> If its shape/stride info can change, it mallocs memory for
> them and copies them into the info struct. When releasebuffer
> is called, it frees this memory.
>
> It is the responsibility of the consumer to ensure that the
> base object remains alive until releasebuffer has been called
> on the info struct (to avoid leaking any memory that has
> been malloced for shapes/strides).

This is a reasonable design choice. I actually prefer to place all the buffer information in a single object rather than the multiple argument design because it scales better and is easier to explain and understand.

-Travis
Re: [Python-Dev] An updated extended buffer PEP
Travis Oliphant wrote:

> Travis Oliphant wrote:
>> Hi Carl and Greg,
>>
>> Here is my updated PEP which incorporates several parts of the
>> discussions we have been having.
>
> And here is the actual link:
>
> http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt

What's the purpose of void** segments in PyObject_GetBuffer? It seems like it's leftover from an older incarnation?

I'd hope after more recent discussion, we'll end up simplifying releasebuffer. It seems like it'd be a nightmare to keep track of what you've released.

Finally, the isptr thing. It's just not sufficient. Frankly, I'm having doubts whether it's a good idea to support multibuffer at all. Sure, it brings generality, but I'm thinking it's too hard to explain and too hard to get one's head around, and will lead to lots of misunderstanding and bugginess. OTOH, it really doesn't burden anyone except those who want to export multi-buffered arrays, and we only have one shot to do it. I just hope it doesn't confuse everyone so much that no one bothers.

Here's how I would update the isptr thing. I've changed "derefoff" to "subbufferoffsets" to describe it better.

    typedef PyObject *(*getbufferproc)(PyObject *obj, void **buf,
                                       Py_ssize_t *len, int *writeable,
                                       char **format, int *ndims,
                                       Py_ssize_t **shape,
                                       Py_ssize_t **strides,
                                       Py_ssize_t **subbufferoffsets);

subbufferoffsets

    Used to export information about multibuffer arrays. It is an
    address of a ``Py_ssize_t *`` variable that will be set to point at
    an array of ``Py_ssize_t`` of length ``*ndims``.

    [I don't even want to try a verbal description.]

    To demonstrate how subbufferoffsets works, here is an example of a
    function that returns a pointer to an element of ANY N-dimensional
    array, single- or multi-buffered.

    void* get_item_pointer(int ndim, void* buf, Py_ssize_t* strides,
                           Py_ssize_t* subarrayoffs, Py_ssize_t *indices) {
        char* pointer = (char*)buf;
        int i;
        for (i = 0; i < ndim; i++) {
            pointer += strides[i]*indices[i];
            if (subarrayoffs[i] >= 0) {
                /* follow the stored pointer into the next buffer */
                pointer = *(char**)pointer + subarrayoffs[i];
            }
        }
        return (void*)pointer;
    }

    For single buffers, subbufferoffsets is negative in every dimension
    and it reduces to normal single-buffer indexing. For multi-buffers,
    subbufferoffsets indicates when to dereference the pointer and switch
    to the new buffer, and gives the offset into the buffer to start at.
    In most cases, the subbufferoffset would be zero (indicating it should
    start at the beginning of the new buffer), but can be a positive
    number if the following dimension has been sliced, and thus the 0th
    entry in that dimension would not be at the beginning of the new
    buffer.

Other than that, looks good. :)

Carl Banks
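Editorial sketch: a small worked example may help with the single-buffer and multi-buffer cases described above. It restates get_item_pointer using `ptrdiff_t` in place of `Py_ssize_t` so that it compiles without Python headers. That substitution, the parameter name `suboffsets` and the two `lookup_*` helpers are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Same logic as the function in the message, with ptrdiff_t standing in
   for Py_ssize_t (an assumption, to stay independent of Python.h). */
void *get_item_pointer(int ndim, void *buf, const ptrdiff_t *strides,
                       const ptrdiff_t *suboffsets, const ptrdiff_t *indices)
{
    char *pointer = (char *)buf;
    int i;
    for (i = 0; i < ndim; i++) {
        pointer += strides[i] * indices[i];
        if (suboffsets[i] >= 0)
            pointer = *(char **)pointer + suboffsets[i];
    }
    return (void *)pointer;
}

/* Single buffer: one contiguous 2x3 block, so no dimension dereferences. */
int lookup_single(ptrdiff_t i, ptrdiff_t j)
{
    static int grid[2][3] = { {0, 1, 2}, {10, 11, 12} };
    ptrdiff_t strides[2] = { 3 * (ptrdiff_t)sizeof(int), (ptrdiff_t)sizeof(int) };
    ptrdiff_t suboffsets[2] = { -1, -1 };
    ptrdiff_t indices[2];

    indices[0] = i;
    indices[1] = j;
    return *(int *)get_item_pointer(2, grid, strides, suboffsets, indices);
}

/* Multi-buffer: an array of row pointers. After stepping along dimension 0
   the pointer is dereferenced to reach the row. A first_col above zero
   models a slice of dimension 1 that skips leading columns. */
int lookup_rows(ptrdiff_t i, ptrdiff_t j, ptrdiff_t first_col)
{
    static int row0[3] = { 0, 1, 2 };
    static int row1[3] = { 10, 11, 12 };
    static int *rows[2] = { row0, row1 };
    ptrdiff_t strides[2] = { (ptrdiff_t)sizeof(int *), (ptrdiff_t)sizeof(int) };
    ptrdiff_t suboffsets[2];
    ptrdiff_t indices[2];

    suboffsets[0] = first_col * (ptrdiff_t)sizeof(int);
    suboffsets[1] = -1;
    indices[0] = i;
    indices[1] = j;
    return *(int *)get_item_pointer(2, rows, strides, suboffsets, indices);
}
```

Both layouts hold the same values, so `lookup_single(1, 2)` and `lookup_rows(1, 2, 0)` reach the same element through different paths. Calling `lookup_rows(0, 0, 1)` lands on the second entry of the first row, which is the sliced case the message mentions.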
Re: [Python-Dev] An updated extended buffer PEP
Here's another idea, to allow multiple views of the same buffer with different shape/stride info to coexist, but without extra provider objects or refcount weirdness. Also it avoids using calls with a brazillion arguments.

    struct bufferinfo {
      void **buf;
      Py_ssize_t *len;
      int *writeable;
      char **format;
      int *ndims;
      Py_ssize_t **shape;
      Py_ssize_t **strides;
      int **isptr;
    };

    int (*getbuffer)(PyObject *obj, struct bufferinfo *info);

    int (*releasebuffer)(PyObject *obj, struct bufferinfo *info);

If the object has constant shape/stride info, it just fills in the info struct with pointers to its own memory, and does nothing when releasebuffer is called (other than unlocking its buffer).

If its shape/stride info can change, it mallocs memory for them and copies them into the info struct. When releasebuffer is called, it frees this memory.

It is the responsibility of the consumer to ensure that the base object remains alive until releasebuffer has been called on the info struct (to avoid leaking any memory that has been malloced for shapes/strides).

--
Greg
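Editorial sketch: the second case above, where shape/stride info can change, might be handled as follows. The flattened `struct bufferinfo`, the `resizable` exporter and the function names are all hypothetical. The point is only that getbuffer takes a private copy and releasebuffer frees it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat info struct; shape and strides are owned copies here. */
struct bufferinfo {
    void *buf;
    int ndims;
    ptrdiff_t *shape;
    ptrdiff_t *strides;
};

/* Hypothetical exporter that may be reshaped while a view is outstanding. */
struct resizable {
    char *data;
    int ndims;
    ptrdiff_t shape[2];
    ptrdiff_t strides[2];
    int lock_count;
};

/* Snapshot shape/strides so later reshapes cannot disturb this view. */
int resizable_getbuffer(struct resizable *obj, struct bufferinfo *info)
{
    size_t nbytes = (size_t)obj->ndims * sizeof(ptrdiff_t);

    info->shape = malloc(nbytes);
    info->strides = malloc(nbytes);
    if (info->shape == NULL || info->strides == NULL) {
        free(info->shape);
        free(info->strides);
        info->shape = NULL;
        info->strides = NULL;
        return -1;
    }
    memcpy(info->shape, obj->shape, nbytes);
    memcpy(info->strides, obj->strides, nbytes);
    info->buf = obj->data;
    info->ndims = obj->ndims;
    obj->lock_count++;
    return 0;
}

/* Free the snapshot and unlock; the object must still be alive here. */
int resizable_releasebuffer(struct resizable *obj, struct bufferinfo *info)
{
    free(info->shape);
    free(info->strides);
    info->shape = NULL;
    info->strides = NULL;
    info->buf = NULL;
    obj->lock_count--;
    return 0;
}
```

A consumer that forgets to call releasebuffer leaks the two copies, which is why the message makes the consumer responsible for keeping the base object alive until release.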
Re: [Python-Dev] An updated extended buffer PEP
Travis Oliphant wrote:

> > Here is my updated PEP which incorporates several parts of the
> > discussions we have been having.

It looks pretty good. However, I'm still having trouble seeing what use it is returning a different object from getbuffer. There seems to be no rationale set out for this in the PEP. Can you give me a concrete example of a case where it would be necessary?

Also it appears that you're returning a borrowed reference, so if the provider object is not the same as the main object, this would seem to require the main object to keep references to all the provider objects that it has handed out, until releasebuffer has been called on them. This seems very odd to me.

--
Greg
Re: [Python-Dev] An updated extended buffer PEP
Travis Oliphant wrote:

> Hi Carl and Greg,
>
> Here is my updated PEP which incorporates several parts of the
> discussions we have been having.
>

And here is the actual link:

http://projects.scipy.org/scipy/numpy/browser/trunk/numpy/doc/pep_buffer.txt

> -Travis
[Python-Dev] An updated extended buffer PEP
Hi Carl and Greg,

Here is my updated PEP which incorporates several parts of the discussions we have been having.

-Travis