Re: [Cython] Add support for list/tuple slicing

2013-03-07 Thread Zaur Shibzukhov
2013/3/7 Zaur Shibzukhov :
> Current Cython generate for slicing of list/tuple general
> PySequence_GetSlice/SetSlice call.
> We could replace that to native call for Py{List|Tuple}_GetSlice and
> PyList_SetSlice for lists/tuples.

There is updated change that use utility code
__Pyx_Py{List|Tuple}_GetSlice because Py{List|Tuple}_GetSlice dosn't
support negative indices. That job do (in CPython) {list|tuple}slice
function from type object's slot ({list|tuple}_subscript), but it
handle both indices and slice objects which add overhead. That's the
reason why PySequence_GetSlice is slower: it create slice object and
falls to {list|tuple}_subscript. Therefore I added utility code.

Here is utility code:

/// PyList_GetSlice.proto ///

static PyObject* __Pyx_PyList_GetSlice(
PyObject* lst, Py_ssize_t start, Py_ssize_t stop);

/// PyList_GetSlice ///

PyObject* __Pyx_PyList_GetSlice(
PyObject* lst, Py_ssize_t start, Py_ssize_t stop) {
Py_ssize_t i, length;
PyListObject* np;
PyObject **src, **dest;
PyObject *v;

length = PyList_GET_SIZE(lst);

if (start < 0) {
start += length;
if (start < 0)
start = 0;
}

if (stop < 0)
stop += length;
else if (stop > length)
stop = length;

length = stop - start;
if (length <= 0)
return PyList_New(0);

np = (PyListObject*) PyList_New(length);
if (np == NULL)
return NULL;

src = ((PyListObject*)lst)->ob_item + start;
dest = np->ob_item;
for (i = 0; i < length; i++) {
v = src[i];
Py_INCREF(v);
dest[i] = v;
}
return (PyObject*)np;
}

/// PyTuple_GetSlice.proto ///

static PyObject* __Pyx_PyTuple_GetSlice(
PyObject* ob, Py_ssize_t start, Py_ssize_t stop);

/// PyTuple_GetSlice ///

PyObject* __Pyx_PyTuple_GetSlice(
PyObject* ob, Py_ssize_t start, Py_ssize_t stop) {
Py_ssize_t i, length;
PyTupleObject* np;
PyObject **src, **dest;
PyObject *v;

length = PyTuple_GET_SIZE(ob);

if (start < 0) {
start += length;
if (start < 0)
start = 0;
}

if (stop < 0)
stop += length;
else if (stop > length)
stop = length;

length = stop - start;
if (length <= 0)
return PyList_New(0);

np = (PyTupleObject *) PyTuple_New(length);
if (np == NULL)
return NULL;

src = ((PyTupleObject*)ob)->ob_item + start;
dest = np->ob_item;
for (i = 0; i < length; i++) {
v = src[i];
Py_INCREF(v);
dest[i] = v;
}
return (PyObject*)np;
}

Here is testing code:

list_slice.pyx
-

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
inline object __Pyx_PyList_GetSlice(object ob, int start, int stop)
inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)


cdef list lst = list(range(10))
cdef list lst2 = list(range(7))

def get_slice1(list lst):
cdef int i
cdef list res = []

for i in range(20):
res.append(PySequence_GetSlice(lst, 2, 8))

return res

def get_slice2(list lst):
cdef int i
cdef list res = []

for i in range(20):
res.append(__Pyx_PyList_GetSlice(lst, 2, 8))

return res

def test_get_slice1():
get_slice1(lst)

def test_get_slice2():
get_slice2(lst)

tuple_slicing.pyx
---

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
inline object __Pyx_PyList_GetSlice(object lst, int start, int stop)
inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)

cdef tuple lst = tuple(range(10))

def get_slice1(tuple lst):
cdef int i
cdef list res = []

for i in range(20):
res.append(PySequence_GetSlice(lst, 2, 8))

return res

def get_slice2(tuple lst):
cdef int i
cdef list res = []

for i in range(20):
res.append(__Pyx_PyTuple_GetSlice(lst, 2, 8))

return res


def test_get_slice1():
get_slice1(lst)

def test_get_slice2():
get_slice2(lst)

Here are timings:

for list

(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.2 10.3 10.4 10.1 10.2
100 loops, best of 5: 101 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.3 10.3 10.2 10.3 10.2
100 loops, best of 5: 102 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice2" "test_get_slice2()"
raw times: 8.16 8.19 8.17 8.2 8.16
100 loops, best of 5: 81.6 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from
mytests.list_slice import test_get_slice2" "test_get_slice2()"

Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Robert Bradshaw
On Thu, Mar 7, 2013 at 3:19 PM, Greg Ewing  wrote:
> Nikita Nemkin wrote:
>>
>> Sorry, accidental early send. Previous mail continued...
>>
>> [None] * N makes an extra pass over the list to assign None to each item
>> (and also incref None n times).
>
>
> Maybe this could be optimised by adding N to the reference
> count instead of incrementing it N times?

I'd be surprised if the C compiler doesn't.

http://hg.python.org/cpython/file/1d4849f9e37d/Objects/listobject.c#l515
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Greg Ewing

Nikita Nemkin wrote:

Sorry, accidental early send. Previous mail continued...

[None] * N makes an extra pass over the list to assign None to each item
(and also incref None n times).


Maybe this could be optimised by adding N to the reference
count instead of incrementing it N times?

--
Greg
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Robert Bradshaw
On Thu, Mar 7, 2013 at 10:26 AM, Yury V. Zaytsev  wrote:
> Hi Stefan,
>
> On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote:
>
>> Note that Python has an algorithm for shrinking a list on appending, so
>> this might not be sufficient for your use case. What do you need it for?
>
> W00t! I didn't know about that.
>
> I'm wrapping a C++ code that should transmit large lists of objects to
> Python, while these objects are stored into something vector-like, which
> shouldn't get exposed directly. In the past, they did something like
>
> obj = PyList_New(a.size());
> for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...)
>
> I figured I can translate it into a while loop like
>
> obj = []
> while (it != a.end()):
> obj.append(it)
> inc(it)
>
> but then I'm not using the information about the size of a that I
> already have, and for huge lists this tends to be quite slow.
>
> I think this must be quite a common use case for bindings...
>
>> And why won't [None]*N help you out? It should be pretty cheap.
>
> It probably will, at least a bit. It just stroke me that if I'm going to
> do something along the lines of
>
> idx = 0
> obj = [None] * a.size()
> while (it != a.end()):
> obj[idx] = it
> idx += 1
> inc(it)
>
> I could also squeeze the last bits of performance by avoiding the
> creation of Nones and subsequently populating the list with them.
>
> If you say I have to use PyList_New directly, oh well... It's just that
> now since I'm rewriting the bindings in Cython anyways, I'm also trying
> to avoid using Python C API directly as much as possible.

I would time the two approaches to see if it really matters.

>> Won't list comprehensions work for you? They could potentially be adapted
>> to presize the list.
>
> I guess not.

[o for o in a]

is nice and clean. If a has a size method (a common stl pattern), we
could optimistically call that to do the pre-allocation.

I don't know exactly what your usecase is, but you might consider
simply exposing a list-like wrapper supporting __getitem__ and
iteration, rather than eagerly converting the entire thing to a list.

- Robert
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Yury V. Zaytsev
Hi Stefan,

On Thu, 2013-03-07 at 12:21 +0100, Stefan Behnel wrote:

> Note that Python has an algorithm for shrinking a list on appending, so
> this might not be sufficient for your use case. What do you need it for?

W00t! I didn't know about that.

I'm wrapping a C++ code that should transmit large lists of objects to
Python, while these objects are stored into something vector-like, which
shouldn't get exposed directly. In the past, they did something like

obj = PyList_New(a.size());
for (a.begin(); a.end(); ++a) PyList_SetItem(obj, ...)

I figured I can translate it into a while loop like

obj = []
while (it != a.end()):
obj.append(it)
inc(it)

but then I'm not using the information about the size of a that I
already have, and for huge lists this tends to be quite slow.

I think this must be quite a common use case for bindings...

> And why won't [None]*N help you out? It should be pretty cheap.

It probably will, at least a bit. It just stroke me that if I'm going to
do something along the lines of

idx = 0
obj = [None] * a.size()
while (it != a.end()):
obj[idx] = it
idx += 1
inc(it)

I could also squeeze the last bits of performance by avoiding the
creation of Nones and subsequently populating the list with them.

If you say I have to use PyList_New directly, oh well... It's just that
now since I'm rewriting the bindings in Cython anyways, I'm also trying
to avoid using Python C API directly as much as possible.

> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.

I guess not.

-- 
Sincerely yours,
Yury V. Zaytsev


___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Robert Bradshaw
On Thu, Mar 7, 2013 at 6:48 AM, Stefan Behnel  wrote:
> Zaur Shibzukhov, 07.03.2013 15:39:
>> 2013/3/7 Zaur Shibzukhov
>>
>>> I guess the problem is to construct new (even empty) list with
>>> pre-allocated memory exactly for N elements.
>>>
>>> N*[NULL] - changes semantics because there can't be list with N elements
>>> and filled by NULL.
>>> N*[None] - more expansive for further assignments because of Py_DECREFs.
>>>
>>> I suppose that N*[] could do the trick.
>
> That looks wrong to me.
>
>
>>> It could be optimized so that N*[]
>>> is equal to an empty list but with preallocated memory exactly for N
>>> elements. Could it be?
>>
>> Cython optimize already PyList_Append very well. Theofore scenario when
>> first one create empty list with exactly prealocated memory for N elements
>>  and second eval elements of the list and add them using plain list.append
>> could optimized in Cython very well too. As result constructed list will
>> contain memory only for N elements. This allows don't vast memory when one
>> need to build many many lists with relative big size.
>
> I prefer not adding any new syntax as long as we are not sure we can't fix
> it by making list comprehensions smarter. I tried this a while ago and have
> some initial code lying around somewhere in my patch queue. Didn't have the
> time to make it any usable, though, also because Cython didn't have its own
> append() for list comprehensions at the time. It does now, as you noted.

There are several cases where we can get the size of the result list
upfront, we can certainly do better here now. I'm also -1 to adding
special syntax for populating a list with NULL values, if you really
want to do this (and I doubt it really matters in most cases) calling
PyList_New is the "syntax" to use.

- Robert
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


[Cython] Add support for list/tuple slicing

2013-03-07 Thread Zaur Shibzukhov
Hello!

Current Cython generate for slicing of list/tuple general
PySequence_GetSlice/SetSlice call.
We could replace that to native call for Py{List|Tuple}_GetSlice and
PyList_SetSlice for lists/tuples.

Here are the changes:
https://github.com/intellimath/cython/commit/27525a5dc9f6eba31b330a6ec04e7a105191d9f5

Zaur Shibzukhov
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Stefan Behnel
Zaur Shibzukhov, 07.03.2013 15:39:
> 2013/3/7 Zaur Shibzukhov
> 
>> I guess the problem is to construct new (even empty) list with
>> pre-allocated memory exactly for N elements.
>>
>> N*[NULL] - changes semantics because there can't be list with N elements
>> and filled by NULL.
>> N*[None] - more expansive for further assignments because of Py_DECREFs.
>>
>> I suppose that N*[] could do the trick.

That looks wrong to me.


>> It could be optimized so that N*[]
>> is equal to an empty list but with preallocated memory exactly for N
>> elements. Could it be?
>
> Cython optimize already PyList_Append very well. Theofore scenario when
> first one create empty list with exactly prealocated memory for N elements
>  and second eval elements of the list and add them using plain list.append
> could optimized in Cython very well too. As result constructed list will
> contain memory only for N elements. This allows don't vast memory when one
> need to build many many lists with relative big size.

I prefer not adding any new syntax as long as we are not sure we can't fix
it by making list comprehensions smarter. I tried this a while ago and have
some initial code lying around somewhere in my patch queue. Didn't have the
time to make it any usable, though, also because Cython didn't have its own
append() for list comprehensions at the time. It does now, as you noted.

Stefan

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Zaur Shibzukhov
2013/3/7 Zaur Shibzukhov 

> I guess the problem is to construct new (even empty) list with
> pre-allocated memory exactly for N elements.
>
> N*[NULL] - changes semantics because there can't be list with N elements
> and filled by NULL.
> N*[None] - more expansive for further assignments because of Py_DECREFs.
>
> I suppose that N*[] could do the trick. It could be optimized so that N*[]
> is equal to an empty list but with preallocated memory exactly for N
> elements. Could it be?
>
>
Cython optimize already PyList_Append very well. Theofore scenario when
first one create empty list with exactly prealocated memory for N elements
 and second eval elements of the list and add them using plain list.append
could optimized in Cython very well too. As result constructed list will
contain memory only for N elements. This allows don't vast memory when one
need to build many many lists with relative big size.

Zaur Shibzukhov
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Zaur Shibzukhov
2013/3/7 Stefan Behnel 

> Yury V. Zaytsev, 07.03.2013 12:16:
> > Is there any syntax that I can use to do something like this in Cython:
> >
> > py_object_ = PyList_New(123); ?
>
> Note that Python has an algorithm for shrinking a list on appending, so
> this might not be sufficient for your use case.
>
>
> > If not, do you think that this can be added in one way or another?
> >
> > Unfortunately, I can't think of a non-disruptive way of doing it. For
> > instance, if this
> >
> > [None] * N
> >
> > is given a completely new meaning, like make an empty list (of NULLs),
> > instead of making a real list of Nones, it will certainly break Python
> > code. Besides, it would probably be still faster than no pre-allocation,
> > but slower than an empty list with pre-allocation...
> >
> > Maybe
> >
> > [NULL] * N ?
>
> What do you need it for?
>
> Won't list comprehensions work for you? They could potentially be adapted
> to presize the list.
>
>
I guess the problem is to construct new (even empty) list with
pre-allocated memory exactly for N elements.

N*[NULL] - changes semantics because there can't be list with N elements
and filled by NULL.
N*[None] - more expansive for further assignments because of Py_DECREFs.

I suppose that N*[] could do the trick. It could be optimized so that N*[]
is equal to an empty list but with preallocated memory exactly for N
elements. Could it be?

Zaur Shibzukhov
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Nikita Nemkin

Sorry, accidental early send. Previous mail continued...

[None] * N makes an extra pass over the list to assign None to each item
(and also incref None n times).
This is useless extra work. The larget the list, the worse it gets.


Best regards,
Nikita Nemkin
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Nikita Nemkin
On Thu, 07 Mar 2013 17:16:10 +0600, Yury V. Zaytsev   
wrote:



Hi,

Is there any syntax that I can use to do something like this in Cython:

py_object_ = PyList_New(123); ?

If not, do you think that this can be added in one way or another?

Unfortunately, I can't think of a non-disruptive way of doing it. For
instance, if this

[None] * N

is given a completely new meaning, like make an empty list (of NULLs),
instead of making a real list of Nones, it will certainly break Python
code. Besides, it would probably be still faster than no pre-allocation,
but slower than an empty list with pre-allocation...

Maybe

[NULL] * N ?

Any ideas?



I really like the "[NULL] * N" thing.

Efficient empty list allocation and filling is something I stumble upon
quite often, especially in binding code.

I doubt Cython will be able to automatically use PyList_SET_ITEM
for assignment to such NULL list (it would require induction variable
analysis), but eliminating one extra pass over the list is already helpful.

Implementation note (if this gets implemented):
Cython's optimized list assignment routine uses Py_DECREF, this will
have to be changed to Py_XDECREF, otherwise NULL list items won't be
directly assignable from Cython.
(PyList_SetItem always uses Py_XDECREF on the old element).



What do you need it for?

Won't list comprehensions work for you? They could potentially be adapted
to presize the list.


List comprehensions do not preallocate the list.
If they did, the need for the above would be somewhat diminished.


And why won't [None]*N help you out? It should be pretty cheap.


[None] * N makes
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Stefan Behnel
Yury V. Zaytsev, 07.03.2013 12:16:
> Is there any syntax that I can use to do something like this in Cython:
> 
> py_object_ = PyList_New(123); ?

Note that Python has an algorithm for shrinking a list on appending, so
this might not be sufficient for your use case.


> If not, do you think that this can be added in one way or another?
> 
> Unfortunately, I can't think of a non-disruptive way of doing it. For
> instance, if this
> 
> [None] * N
> 
> is given a completely new meaning, like make an empty list (of NULLs),
> instead of making a real list of Nones, it will certainly break Python
> code. Besides, it would probably be still faster than no pre-allocation,
> but slower than an empty list with pre-allocation...
> 
> Maybe
> 
> [NULL] * N ?

What do you need it for?

Won't list comprehensions work for you? They could potentially be adapted
to presize the list.

And why won't [None]*N help you out? It should be pretty cheap.

Stefan

___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


[Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Yury V. Zaytsev
Hi,

Is there any syntax that I can use to do something like this in Cython:

py_object_ = PyList_New(123); ?

If not, do you think that this can be added in one way or another?

Unfortunately, I can't think of a non-disruptive way of doing it. For
instance, if this

[None] * N

is given a completely new meaning, like make an empty list (of NULLs),
instead of making a real list of Nones, it will certainly break Python
code. Besides, it would probably be still faster than no pre-allocation,
but slower than an empty list with pre-allocation...

Maybe

[NULL] * N ?

Any ideas?

-- 
Sincerely yours,
Yury V. Zaytsev


___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel