Re: [Cython] Cython syntax to pre-allocate lists for performance

2013-03-07 Thread Zaur Shibzukhov
2013/3/7 Stefan Behnel stefan...@behnel.de

 Yury V. Zaytsev, 07.03.2013 12:16:
  Is there any syntax that I can use to do something like this in Cython:
 
  py_object_ = PyList_New(123); ?

 Note that Python has an algorithm for shrinking a list on appending, so
 this might not be sufficient for your use case.


  If not, do you think that this can be added in one way or another?
 
  Unfortunately, I can't think of a non-disruptive way of doing it. For
  instance, if this
 
  [None] * N
 
  is given a completely new meaning, like make an empty list (of NULLs),
  instead of making a real list of Nones, it will certainly break Python
  code. Besides, it would probably be still faster than no pre-allocation,
  but slower than an empty list with pre-allocation...
 
  Maybe
 
  [NULL] * N ?

 What do you need it for?

 Won't list comprehensions work for you? They could potentially be adapted
 to presize the list.


I guess the problem is how to construct a new (even empty) list with
memory pre-allocated for exactly N elements.

N*[NULL] - changes the semantics, because there can't be a list with N
elements that is filled with NULLs.
N*[None] - more expensive for later assignments because of the Py_DECREFs.

I suppose that N*[] could do the trick. It could be optimized so that N*[]
is equal to an empty list, but with memory pre-allocated for exactly N
elements. Could it be?
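
For comparison, here is a minimal pure-Python sketch of the two idioms that exist today: growing an empty list by appending, and pre-filling with None and then assigning by index. The function names are hypothetical and only serve as illustration; neither variant is the "empty list with room for N elements" discussed above, which plain Python syntax cannot express.

```python
def fill_by_append(n):
    """Start from an empty list and let it grow on every append."""
    result = []
    for i in range(n):
        result.append(i * i)
    return result


def fill_preallocated(n):
    """Create n slots holding None, then overwrite each slot by index."""
    result = [None] * n
    for i in range(n):
        # Each assignment drops the reference to the None stored before,
        # which is the extra cost mentioned for N*[None].
        result[i] = i * i
    return result
```

Both functions return the same list; they differ only in how the storage is obtained.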

Zaur Shibzukhov
___
cython-devel mailing list
cython-devel@python.org
http://mail.python.org/mailman/listinfo/cython-devel


Re: [Cython] Add support for list/tuple slicing

2013-03-07 Thread Zaur Shibzukhov
2013/3/7 Zaur Shibzukhov szp...@gmail.com:
 Currently Cython generates a general PySequence_GetSlice/SetSlice call
 for slicing of a list/tuple.
 We could replace that with native calls to Py{List|Tuple}_GetSlice and
 PyList_SetSlice for lists/tuples.

Here is an updated change that uses the utility code
__Pyx_Py{List|Tuple}_GetSlice, because Py{List|Tuple}_GetSlice doesn't
support negative indices. In CPython that job is done by the
{list|tuple}slice function from the type object's slot
({list|tuple}_subscript), but the slot handles both indices and slice
objects, which adds overhead. That's the reason why PySequence_GetSlice
is slower: it creates a slice object and falls through to
{list|tuple}_subscript. Therefore I added utility code.

Here is utility code:

/// PyList_GetSlice.proto ///

static PyObject* __Pyx_PyList_GetSlice(
        PyObject* lst, Py_ssize_t start, Py_ssize_t stop);

/// PyList_GetSlice ///

PyObject* __Pyx_PyList_GetSlice(
        PyObject* lst, Py_ssize_t start, Py_ssize_t stop) {
    Py_ssize_t i, length;
    PyListObject* np;
    PyObject **src, **dest;
    PyObject *v;

    length = PyList_GET_SIZE(lst);

    /* Negative start counts from the end, clamped at 0. */
    if (start < 0) {
        start += length;
        if (start < 0)
            start = 0;
    }

    /* Negative stop counts from the end; large stop is clamped. */
    if (stop < 0)
        stop += length;
    else if (stop > length)
        stop = length;

    length = stop - start;
    if (length <= 0)
        return PyList_New(0);

    np = (PyListObject*) PyList_New(length);
    if (np == NULL)
        return NULL;

    /* Copy the item pointers and take a new reference to each. */
    src = ((PyListObject*)lst)->ob_item + start;
    dest = np->ob_item;
    for (i = 0; i < length; i++) {
        v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    return (PyObject*)np;
}

/// PyTuple_GetSlice.proto ///

static PyObject* __Pyx_PyTuple_GetSlice(
        PyObject* ob, Py_ssize_t start, Py_ssize_t stop);

/// PyTuple_GetSlice ///

PyObject* __Pyx_PyTuple_GetSlice(
        PyObject* ob, Py_ssize_t start, Py_ssize_t stop) {
    Py_ssize_t i, length;
    PyTupleObject* np;
    PyObject **src, **dest;
    PyObject *v;

    length = PyTuple_GET_SIZE(ob);

    /* Negative start counts from the end, clamped at 0. */
    if (start < 0) {
        start += length;
        if (start < 0)
            start = 0;
    }

    /* Negative stop counts from the end; large stop is clamped. */
    if (stop < 0)
        stop += length;
    else if (stop > length)
        stop = length;

    length = stop - start;
    if (length <= 0)
        return PyTuple_New(0);  /* an empty tuple, not a list */

    np = (PyTupleObject *) PyTuple_New(length);
    if (np == NULL)
        return NULL;

    /* Copy the item pointers and take a new reference to each. */
    src = ((PyTupleObject*)ob)->ob_item + start;
    dest = np->ob_item;
    for (i = 0; i < length; i++) {
        v = src[i];
        Py_INCREF(v);
        dest[i] = v;
    }
    return (PyObject*)np;
}
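
The index handling in the two C helpers above can be modelled in pure Python, which makes it easy to check against the built-in slicing. This is only a sketch of the clamping logic; the name `get_slice` is hypothetical, and the element copy stands in for the pointer copy plus Py_INCREF done in C.

```python
def get_slice(seq, start, stop):
    """Mirror the start/stop handling of the C helpers for a list or tuple."""
    length = len(seq)

    # Negative start counts from the end and is clamped at 0.
    if start < 0:
        start += length
        if start < 0:
            start = 0

    # Negative stop counts from the end; a too-large stop is clamped.
    if stop < 0:
        stop += length
    elif stop > length:
        stop = length

    count = stop - start
    if count <= 0:
        # Empty result of the same type as the input.
        return type(seq)()

    # Copy the selected items one by one, as the C loop does.
    return type(seq)(seq[start + i] for i in range(count))
```

For any integer `start` and `stop` this should agree with `seq[start:stop]`, for example `get_slice(list(range(10)), -3, 100)` and `list(range(10))[-3:100]`.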

Here is testing code:

list_slice.pyx
--------------

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
    inline object __Pyx_PyList_GetSlice(object ob, int start, int stop)
    inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)


cdef list lst = list(range(10))
cdef list lst2 = list(range(7))

def get_slice1(list lst):
    cdef int i
    cdef list res = []

    for i in range(20):
        res.append(PySequence_GetSlice(lst, 2, 8))

    return res

def get_slice2(list lst):
    cdef int i
    cdef list res = []

    for i in range(20):
        res.append(__Pyx_PyList_GetSlice(lst, 2, 8))

    return res

def test_get_slice1():
    get_slice1(lst)

def test_get_slice2():
    get_slice2(lst)

tuple_slicing.pyx
-----------------

from cpython.sequence cimport PySequence_GetSlice

cdef extern from "list_tuple_slices.h":
    inline object __Pyx_PyList_GetSlice(object lst, int start, int stop)
    inline object __Pyx_PyTuple_GetSlice(object ob, int start, int stop)

cdef tuple lst = tuple(range(10))

def get_slice1(tuple lst):
    cdef int i
    cdef list res = []

    for i in range(20):
        res.append(PySequence_GetSlice(lst, 2, 8))

    return res

def get_slice2(tuple lst):
    cdef int i
    cdef list res = []

    for i in range(20):
        res.append(__Pyx_PyTuple_GetSlice(lst, 2, 8))

    return res


def test_get_slice1():
    get_slice1(lst)

def test_get_slice2():
    get_slice2(lst)

Here are timings:

for list

(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.2 10.3 10.4 10.1 10.2
100 loops, best of 5: 101 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice1" "test_get_slice1()"
raw times: 10.3 10.3 10.2 10.3 10.2
100 loops, best of 5: 102 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice2" "test_get_slice2()"
raw times: 8.16 8.19 8.17 8.2 8.16
100 loops, best of 5: 81.6 msec per loop
(py33) zbook:mytests $ python -m timeit -n 100 -r 5 -v -s "from mytests.list_slice import test_get_slice2" "test_get_slice2()"
raw times: 8.1 8.05

[Cython] nonecheck and as_none_safe_node method

2013-03-04 Thread Zaur Shibzukhov
In ExprNodes.py there are several places where the method
`as_none_safe_node` is applied in order to wrap a node in a NoneCheckNode.
I think it would be reasonable to apply that, for the most part, only in
cases when nonecheck=True.

Here are possible changes in ExprNodes.py:
https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c
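
To make the proposal concrete, here is a toy model in plain Python. It is not Cython's real code: the `Node` class, the `may_be_none` attribute, the `directives` dict and the function signature are all assumptions made up for this sketch; only the names `as_none_safe_node` and `NoneCheckNode` come from ExprNodes.py.

```python
class Node:
    """Hypothetical stand-in for an expression node."""

    def __init__(self, name, may_be_none=True):
        self.name = name
        self.may_be_none = may_be_none


class NoneCheckNode(Node):
    """Toy wrapper that marks where a runtime None check would be emitted."""

    def __init__(self, arg):
        super().__init__(arg.name, may_be_none=False)
        self.arg = arg


def as_none_safe_node(node, directives):
    """Wrap `node` only when the nonecheck directive is switched on."""
    if directives.get("nonecheck", False) and node.may_be_none:
        return NoneCheckNode(node)
    # With nonecheck=False the node is returned unwrapped: no check is made.
    return node
```

In this model `as_none_safe_node(Node("lst"), {"nonecheck": False})` returns the original node, while the same call with `{"nonecheck": True}` returns a `NoneCheckNode` around it.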

Zaur Shibzukhov


Re: [Cython] nonecheck and as_none_safe_node method

2013-03-04 Thread Zaur Shibzukhov
2013/3/5 Zaur Shibzukhov szp...@gmail.com

 2013/3/5 Zaur Shibzukhov szp...@gmail.com

 In ExprNodes.py there are several places where the method
 `as_none_safe_node` is applied in order to wrap a node in a NoneCheckNode.
 I think it would be reasonable to apply that, for the most part, only in
 cases when nonecheck=True.

 Here are possible changes in ExprNodes.py:

 https://github.com/intellimath/cython/commit/bd041680b78067007ad6b9894a2f2c18514e397c

 This change would prevent the generation of None checks for objects
 (lists, tuples, unicode) when nonecheck=True.


Sorry... when nonecheck=False


 Any ideas?




 Zaur Shibzukhov



-- 
Best regards,
Z. M. Shibzukhov


Re: [Cython] To Add datetime.pxd to cython.cpython

2013-03-03 Thread Zaur Shibzukhov
2013/3/3 Zaur Shibzukhov szp...@gmail.com:
 2013/3/2 Stefan Behnel stefan...@behnel.de:
 Hi,

 the last pull request looks good to me now.

 https://github.com/cython/cython/pull/189

 Any more comments on it?

 As was suggested earlier, I added an `import_datetime` inline function to
 initialize the PyDateTime C API instead of using the non-native C
 macros from datetime.h directly.
 Now you call `import_datetime()` first, in the same way as `import_array()`
 is called for `numpy`.
  This approach looks natural in the light of experience with numpy.

 I made some performance comparisons. Here is an example for dates.

# test_date.pyx


Here is the test code:

from cpython.datetime cimport import_datetime, date_new, date

import_datetime()

from datetime import date as pydate

def test_date1():
    cdef list lst = []
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = pydate(year, month, day)
                lst.append(d)
    return lst


def test_date2():
    cdef list lst = []
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date(year, month, day)
                lst.append(d)
    return lst

def test_date3():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date_new(year, month, day)
                lst.append(d)
    return lst

def test1():
    l = test_date1()
    return l

def test2():
    l = test_date2()
    return l

def test3():
    l = test_date3()
    return l

Here are timings:

(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test1" "test1()"
50 loops, best of 5: 83.2 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test2" "test2()"
50 loops, best of 5: 74.7 msec per loop
(py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test3" "test3()"
50 loops, best of 5: 20.9 msec per loop

OSX 10.6.8 64 bit python 3.2
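
For readers without a Cython build at hand, the same loop can be run in plain Python to get a baseline. This is only a sketch: `build_dates` and `best_msec_per_loop` are hypothetical helper names, and it measures the interpreted loop, not the Cython variants timed above, so the numbers are not comparable with them.

```python
import timeit
from datetime import date


def build_dates():
    """Create dates for years 1000..2000, months 1..12 and days 1..19."""
    result = []
    for year in range(1000, 2001):
        for month in range(1, 13):
            for day in range(1, 20):
                result.append(date(year, month, day))
    return result


def best_msec_per_loop(func, number=5, repeat=3):
    """Best time per call in milliseconds, similar to `timeit -n -r`."""
    totals = timeit.repeat(func, number=number, repeat=repeat)
    return min(totals) / number * 1000.0
```

A call such as `print(best_msec_per_loop(build_dates))` prints the best per-call time; each call creates 1001 * 12 * 19 date objects.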

Shibzukhov Zaur


Re: [Cython] To Add datetime.pxd to cython.cpython

2013-03-03 Thread Zaur Shibzukhov
2013/3/3 Zaur Shibzukhov szp...@gmail.com:
 2013/3/3 Zaur Shibzukhov szp...@gmail.com:
 2013/3/3 Zaur Shibzukhov szp...@gmail.com:
 More accurate test...

# coding: utf-8

from cpython.datetime cimport import_datetime, date_new, date

import_datetime()

from datetime import date as pydate

def test_date1():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = pydate(year, month, day)
                lst.append(d)
    return lst


def test_date2():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date(year, month, day)
                lst.append(d)
    return lst

def test_date3():
    cdef list lst = []
    cdef int year, month, day
    for year in range(1000, 2001):
        for month in range(1,13):
            for day in range(1, 20):
                d = date_new(year, month, day)
                lst.append(d)
    return lst

def test1():
    l = test_date1()
    return l

def test2():
    l = test_date2()
    return l

def test3():
    l = test_date3()
    return l

 Timings:

 (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test1" "test1()"
 50 loops, best of 5: 83.3 msec per loop
 (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test2" "test2()"
 50 loops, best of 5: 74.6 msec per loop
 (py32)zbook:mytests $ python -m timeit -n 50 -r 5 -s "from mytests.test_date import test3" "test3()"
 50 loops, best of 5: 20.8 msec per loop

Yet another performance comparison for `time`:

# coding: utf-8

from cpython.datetime cimport import_datetime, time_new, time

import_datetime()

from datetime import time as pytime

def test_time1():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0, 10, 5):
                    d = pytime(hour, minute, second, microsecond)
                    lst.append(d)
    return lst


def test_time2():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0, 10, 5):
                    d = time(hour, minute, second, microsecond)
                    lst.append(d)
    return lst

def test_time3():
    cdef list lst = []
    cdef int hour, minute, second, microsecond
    for hour in range(0, 24):
        for minute in range(0,60):
            for second in range(0, 60):
                for microsecond in range(0

Re: [Cython] About IndexNode and unicode[index]

2013-03-02 Thread Zaur Shibzukhov
2013/3/2 Stefan Behnel stefan...@behnel.de:
 I think you could even pass in two flags, one for wraparound and one for
 boundscheck, and then just evaluate them appropriately in the existing if
 tests above. That should allow both features to be supported independently
 in a fast way.

 https://github.com/scoder/cython/commit/cc4f7daec3b1f19b5acaed7766e2b6f86902ad94

Should the following directives be included at the beginning of the
tests (those that test indexing of lists, tuples and unicode):

#cython: boundscheck=True
#cython: wraparound=True

as the default mode for testing?

-- 
Best regards,
Z. M. Shibzukhov


Re: [Cython] About IndexNode and unicode[index]

2013-02-28 Thread Zaur Shibzukhov

 I think you could even pass in two flags, one for wraparound and one for
 boundscheck, and then just evaluate them appropriately in the existing if
 tests above. That should allow both features to be supported independently
 in a fast way.

 Interesting. Can C compilers in optimization mode eliminate unused
 evaluation paths in nested if statements with constant conditional
 expressions?

 They'd be worthless if they didn't do that. (Even Cython does it, BTW.)

Then it can simplify writing utility code in order to support
different optimization flags in other cases too.
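
The two-flag idea can be sketched in plain Python as follows. This is an illustration only: `get_item` is a hypothetical name, not the actual utility function, and Python cannot show the C compiler removing a branch whose flag is a compile-time constant. It only shows that the two features can be switched on and off independently inside the same pair of `if` tests.

```python
def get_item(seq, index, wraparound=True, boundscheck=True):
    """Index `seq` with independently switchable wraparound and bounds check."""
    n = len(seq)

    # Branch 1: only taken into account when wraparound is enabled.
    if wraparound and index < 0:
        index += n

    # Branch 2: only taken into account when boundscheck is enabled.
    if boundscheck and not (0 <= index < n):
        raise IndexError("index out of range")

    # With boundscheck=False the caller must guarantee a valid index;
    # in C this would be an unchecked memory access.
    return seq[index]
```

With both flags passed as constants, for example `get_item(items, 1, wraparound=False, boundscheck=False)`, neither condition can be true, which corresponds to the case where a C compiler would drop both tests.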