Re: [Cython] Add support for list/tuple slicing
2013/3/7 Zaur Shibzukhov : > Current Cython generate for slicing of list/tuple general > PySequence_GetSlice/SetSlice call. > We could replace that to native call for Py{List|Tuple}_GetSlice and > PyList_SetSlice for lists/tuples. There is updated change that use utility code __Pyx_Py{List|Tuple}_GetSlice because Py{List|Tuple}_GetSlice dosn't support negative indices. That job do (in CPython) {list|tuple}slice function from type object's slot ({list|tuple}_subscript), but it handle both indices and slice objects which add overhead. That's the reason why PySequence_GetSlice is slower: it create slice object and falls to {list|tuple}_subscript. Therefore I added utility code. Here is utility code: /// PyList_GetSlice.proto /// static PyObject* __Pyx_PyList_GetSlice( PyObject* lst, Py_ssize_t start, Py_ssize_t stop); /// PyList_GetSlice /// PyObject* __Pyx_PyList_GetSlice( PyObject* lst, Py_ssize_t start, Py_ssize_t stop) { Py_ssize_t i, length; PyListObject* np; PyObject **src, **dest; PyObject *v; length = PyList_GET_SIZE(lst); if (start < 0) { start += length; if (start < 0) start = 0; } if (stop < 0) stop += length; else if (stop > length) stop = length; length = stop - start; if (length <= 0) return PyList_New(0); np = (PyListObject*) PyList_New(length); if (np == NULL) return NULL; src = ((PyListObject*)lst)->ob_item + start; dest = np->ob_item; for (i = 0; i < length; i++) { v = src[i]; Py_INCREF(v); dest[i] = v; } return (PyObject*)np; } /// PyTuple_GetSlice.proto /// static PyObject* __Pyx_PyTuple_GetSlice( PyObject* ob, Py_ssize_t start, Py_ssize_t stop); /// PyTuple_GetSlice /// PyObject* __Pyx_PyTuple_GetSlice( PyObject* ob, Py_ssize_t start, Py_ssize_t stop) { Py_ssize_t i, length; PyTupleObject* np; PyObject **src, **dest; PyObject *v; length = PyTuple_GET_SIZE(ob); if (start < 0) { start += length; if (start < 0) start = 0; } if (stop < 0) stop += length; else if (stop > length) stop = length; length = stop - start; if (length <= 0) return PyList_New(0); np = (PyTupleObject *) PyTuple_New(length); if (np == NULL) return NULL; src = ((PyTupleObject*)ob)->ob_item + start; dest = np->ob_item; for (i = 0; i < length; i++) { v = src[i]; Py_INCREF(v); dest[i] = v; } return (PyObject*)np; } Here is testing code: list_slice.pyx - from cpython.sequence cimport PySequence_GetSlice cdef extern from "list_tuple_slices.h": inline object __Pyx_PyList_GetSlice(object ob, int start, int stop) inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop) cdef list lst = list(range(10)) cdef list lst2 = list(range(7)) def get_slice1(list lst): cdef int i cdef list res = [] for i in range(20): res.append(PySequence_GetSlice(lst, 2, 8)) return res def get_slice2(list lst): cdef int i cdef list res = [] for i in range(20): res.append(__Pyx_PyList_GetSlice(lst, 2, 8)) return res def test_get_slice1(): get_slice1(lst) def test_get_slice2(): get_slice2(lst) tuple_slicing.pyx --- from cpython.sequence cimport PySequence_GetSlice cdef extern from "list_tuple_slices.h": inline object __Pyx_PyList_GetSlice(object lst, int start, int stop) inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop) cdef tuple lst = tuple(range(10)) def get_slice1(tuple lst): cdef int i cdef list res = [] for i in range(20): res.append(PySequence_GetSlice(lst, 2, 8)) return res def get_slice2(tuple lst): cdef int i cdef list res = [] for i in range(20): res.append(__Pyx_PyTuple_GetSlice(lst, 2, 8)) return res def test_get_slice1(): get_slice1(lst) def test_get_slice2(): get_slice2(lst) Here are timings: for list (py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice1" "test_get_slice1()" raw times: 10.2 10.3 10.4 10.1 10.2 100 loops, best of 5: 101 msec per loop (py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice1" "test_get_slice1()" raw times: 10.3 10.3 10.2 10.3 10.2 100 loops, best of 5: 102 msec per loop (py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice2" "test_get_slice2()" raw times: 8.16 8.19 8.17 8.2 8.16 100 loops, best of 5: 81.6 msec per loop (py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice2" "test_get_slice2()"
Re: [Cython] Cython syntax to pre-allocate lists for performance
On Thu, Mar 7, 2013 at 3:19 PM, Greg Ewing wrote: > Nikita Nemkin wrote: >> >> Sorry, accidental early send. Previous mail continued... >> >> [None] * N makes an extra pass over the list to assign None to each item >> (and also incref None n times). > > > Maybe this could be optimised by adding N to the reference > count instead of incrementing it N times? I'd be surprised if the C compiler doesn't. http://hg.python.org/cpython/file/1d4849f9e37d/Objects/listobject.c#l515 ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
Nikita Nemkin wrote: Sorry, accidental early send. Previous mail continued... [None] * N makes an extra pass over the list to assign None to each item (and also incref None n times). Maybe this could be optimised by adding N to the reference count instead of incrementing it N times? -- Greg ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
On Thu, Mar 7, 2013 at 10:26 AM, Yury V. Zaytsev wrote: > Hi Stefan, > > On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote: > >> Note that Python has an algorithm for shrinking a list on appending, so >> this might not be sufficient for your use case. What do you need it for? > > W00t! I didn't know about that. > > I'm wrapping a C++ code that should transmit large lists of objects to > Python, while these objects are stored into something vector-like, which > shouldn't get exposed directly. In the past, they did something like > > obj = PyList_New(a.size()); > for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...) > > I figured I can translate it into a while loop like > > obj = [] > while (it != a.end()): > obj.append(it) > inc(it) > > but then I'm not using the information about the size of a that I > already have, and for huge lists this tends to be quite slow. > > I think this must be quite a common use case for bindings... > >> And why won't [None]*N help you out? It should be pretty cheap. > > It probably will, at least a bit. It just stroke me that if I'm going to > do something along the lines of > > idx = 0 > obj = [None] * a.size() > while (it != a.end()): > obj[idx] = it > idx += 1 > inc(it) > > I could also squeeze the last bits of performance by avoiding the > creation of Nones and subsequently populating the list with them. > > If you say I have to use PyList_New directly, oh well... It's just that > now since I'm rewriting the bindings in Cython anyways, I'm also trying > to avoid using Python C API directly as much as possible. I would time the two approaches to see if it really matters. >> Won't list comprehensions work for you? They could potentially be adapted >> to presize the list. > > I guess not. [o for o in a] is nice and clean. If a has a size method (a common stl pattern), we could optimistically call that to do the pre-allocation. I don't know exactly what your usecase is, but you might consider simply exposing a list-like wrapper supporting __getitem__ and iteration, rather than eagerly converting the entire thing to a list. - Robert ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
Hi Stefan, On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote: > Note that Python has an algorithm for shrinking a list on appending, so > this might not be sufficient for your use case. What do you need it for? W00t! I didn't know about that. I'm wrapping a C++ code that should transmit large lists of objects to Python, while these objects are stored into something vector-like, which shouldn't get exposed directly. In the past, they did something like obj = PyList_New(a.size()); for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...) I figured I can translate it into a while loop like obj = [] while (it != a.end()): obj.append(it) inc(it) but then I'm not using the information about the size of a that I already have, and for huge lists this tends to be quite slow. I think this must be quite a common use case for bindings... > And why won't [None]*N help you out? It should be pretty cheap. It probably will, at least a bit. It just stroke me that if I'm going to do something along the lines of idx = 0 obj = [None] * a.size() while (it != a.end()): obj[idx] = it idx += 1 inc(it) I could also squeeze the last bits of performance by avoiding the creation of Nones and subsequently populating the list with them. If you say I have to use PyList_New directly, oh well... It's just that now since I'm rewriting the bindings in Cython anyways, I'm also trying to avoid using Python C API directly as much as possible. > Won't list comprehensions work for you? They could potentially be adapted > to presize the list. I guess not. -- Sincerely yours, Yury V. Zaytsev ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
On Thu, Mar 7, 2013 at 6:48 AM, Stefan Behnel wrote: > Zaur Shibzukhov, 07.03.2013 15:39: >> 2013/3/7 Zaur Shibzukhov >> >>> I guess the problem is to construct new (even empty) list with >>> pre-allocated memory exactly for N elements. >>> >>> N*[NULL] - changes semantics because there can't be list with N elements >>> and filled by NULL. >>> N*[None] - more expansive for further assignments because of Py_DECREFs. >>> >>> I suppose that N*[] could do the trick. > > That looks wrong to me. > > >>> It could be optimized so that N*[] >>> is equal to an empty list but with preallocated memory exactly for N >>> elements. Could it be? >> >> Cython optimize already PyList_Append very well. Theofore scenario when >> first one create empty list with exactly prealocated memory for N elements >> and second eval elements of the list and add them using plain list.append >> could optimized in Cython very well too. As result constructed list will >> contain memory only for N elements. This allows don't vast memory when one >> need to build many many lists with relative big size. > > I prefer not adding any new syntax as long as we are not sure we can't fix > it by making list comprehensions smarter. I tried this a while ago and have > some initial code lying around somewhere in my patch queue. Didn't have the > time to make it any usable, though, also because Cython didn't have its own > append() for list comprehensions at the time. It does now, as you noted. There are several cases where we can get the size of the result list upfront, we can certainly do better here now. I'm also -1 to adding special syntax for populating a list with NULL values, if you really want to do this (and I doubt it really matters in most cases) calling PyList_New is the "syntax" to use. - Robert ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Add support for list/tuple slicing
Hello! Current Cython generate for slicing of list/tuple general PySequence_GetSlice/SetSlice call. We could replace that to native call for Py{List|Tuple}_GetSlice and PyList_SetSlice for lists/tuples. Here are the changes: https://github.com/intellimath/cython/commit/27525a5dc9f6eba31b330a6ec04e7a105191d9f5 Zaur Shibzukhov ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
Zaur Shibzukhov, 07.03.2013 15:39: > 2013/3/7 Zaur Shibzukhov > >> I guess the problem is to construct new (even empty) list with >> pre-allocated memory exactly for N elements. >> >> N*[NULL] - changes semantics because there can't be list with N elements >> and filled by NULL. >> N*[None] - more expansive for further assignments because of Py_DECREFs. >> >> I suppose that N*[] could do the trick. That looks wrong to me. >> It could be optimized so that N*[] >> is equal to an empty list but with preallocated memory exactly for N >> elements. Could it be? > > Cython optimize already PyList_Append very well. Theofore scenario when > first one create empty list with exactly prealocated memory for N elements > and second eval elements of the list and add them using plain list.append > could optimized in Cython very well too. As result constructed list will > contain memory only for N elements. This allows don't vast memory when one > need to build many many lists with relative big size. I prefer not adding any new syntax as long as we are not sure we can't fix it by making list comprehensions smarter. I tried this a while ago and have some initial code lying around somewhere in my patch queue. Didn't have the time to make it any usable, though, also because Cython didn't have its own append() for list comprehensions at the time. It does now, as you noted. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
2013/3/7 Zaur Shibzukhov > I guess the problem is to construct new (even empty) list with > pre-allocated memory exactly for N elements. > > N*[NULL] - changes semantics because there can't be list with N elements > and filled by NULL. > N*[None] - more expansive for further assignments because of Py_DECREFs. > > I suppose that N*[] could do the trick. It could be optimized so that N*[] > is equal to an empty list but with preallocated memory exactly for N > elements. Could it be? > > Cython optimize already PyList_Append very well. Theofore scenario when first one create empty list with exactly prealocated memory for N elements and second eval elements of the list and add them using plain list.append could optimized in Cython very well too. As result constructed list will contain memory only for N elements. This allows don't vast memory when one need to build many many lists with relative big size. Zaur Shibzukhov ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
2013/3/7 Stefan Behnel > Yury V. Zaytsev, 07.03.2013 12:16: > > Is there any syntax that I can use to do something like this in Cython: > > > > py_object_ = PyList_New(123); ? > > Note that Python has an algorithm for shrinking a list on appending, so > this might not be sufficient for your use case. > > > > If not, do you think that this can be added in one way or another? > > > > Unfortunately, I can't think of a non-disruptive way of doing it. For > > instance, if this > > > > [None] * N > > > > is given a completely new meaning, like make an empty list (of NULLs), > > instead of making a real list of Nones, it will certainly break Python > > code. Besides, it would probably be still faster than no pre-allocation, > > but slower than an empty list with pre-allocation... > > > > Maybe > > > > [NULL] * N ? > > What do you need it for? > > Won't list comprehensions work for you? They could potentially be adapted > to presize the list. > > I guess the problem is to construct new (even empty) list with pre-allocated memory exactly for N elements. N*[NULL] - changes semantics because there can't be list with N elements and filled by NULL. N*[None] - more expansive for further assignments because of Py_DECREFs. I suppose that N*[] could do the trick. It could be optimized so that N*[] is equal to an empty list but with preallocated memory exactly for N elements. Could it be? Zaur Shibzukhov ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
Sorry, accidental early send. Previous mail continued... [None] * N makes an extra pass over the list to assign None to each item (and also incref None n times). This is useless extra work. The larget the list, the worse it gets. Best regards, Nikita Nemkin ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
On Thu, 07 Mar 2013 17:16:10 +0600, Yury V. Zaytsev wrote: Hi, Is there any syntax that I can use to do something like this in Cython: py_object_ = PyList_New(123); ? If not, do you think that this can be added in one way or another? Unfortunately, I can't think of a non-disruptive way of doing it. For instance, if this [None] * N is given a completely new meaning, like make an empty list (of NULLs), instead of making a real list of Nones, it will certainly break Python code. Besides, it would probably be still faster than no pre-allocation, but slower than an empty list with pre-allocation... Maybe [NULL] * N ? Any ideas? I really like the "[NULL] * N" thing. Efficient empty list allocation and filling is something I stumble upon quite often, especially in binding code. I doubt Cython will be able to automatically use PyList_SET_ITEM for assignment to such NULL list (it would require induction variable analysis), but eliminating one extra pass over the list is already helpful. Implementation note (if this gets implemented): Cython's optimized list assignment routine uses Py_DECREF, this will have to be changed to Py_XDECREF, otherwise NULL list items won't be directly assignable from Cython. (PyList_SetItem always uses Py_XDECREF on the old element). What do you need it for? Won't list comprehensions work for you? They could potentially be adapted to presize the list. List comprehensions do not preallocate the list. If they did, the need for the above would be somewhat diminished. And why won't [None]*N help you out? It should be pretty cheap. [None] * N makes ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
Re: [Cython] Cython syntax to pre-allocate lists for performance
Yury V. Zaytsev, 07.03.2013 12:16: > Is there any syntax that I can use to do something like this in Cython: > > py_object_ = PyList_New(123); ? Note that Python has an algorithm for shrinking a list on appending, so this might not be sufficient for your use case. > If not, do you think that this can be added in one way or another? > > Unfortunately, I can't think of a non-disruptive way of doing it. For > instance, if this > > [None] * N > > is given a completely new meaning, like make an empty list (of NULLs), > instead of making a real list of Nones, it will certainly break Python > code. Besides, it would probably be still faster than no pre-allocation, > but slower than an empty list with pre-allocation... > > Maybe > > [NULL] * N ? What do you need it for? Won't list comprehensions work for you? They could potentially be adapted to presize the list. And why won't [None]*N help you out? It should be pretty cheap. Stefan ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel
[Cython] Cython syntax to pre-allocate lists for performance
Hi, Is there any syntax that I can use to do something like this in Cython: py_object_ = PyList_New(123); ? If not, do you think that this can be added in one way or another? Unfortunately, I can't think of a non-disruptive way of doing it. For instance, if this [None] * N is given a completely new meaning, like make an empty list (of NULLs), instead of making a real list of Nones, it will certainly break Python code. Besides, it would probably be still faster than no pre-allocation, but slower than an empty list with pre-allocation... Maybe [NULL] * N ? Any ideas? -- Sincerely yours, Yury V. Zaytsev ___ cython-devel mailing list cython-devel@python.org http://mail.python.org/mailman/listinfo/cython-devel