On Thu, Jun 11, 2009 at 2:01 AM, Dag Sverre Seljebotn<[email protected]> wrote:
> Brent Pedersen wrote:
>> On Wed, Jun 10, 2009 at 3:23 PM, Brent Pedersen<[email protected]> wrote:
>>> On Wed, Jun 10, 2009 at 12:08 PM, Dag Sverre
>>> Seljebotn<[email protected]> wrote:
>>>> Brent Pedersen wrote:
>>>>> On Thu, May 14, 2009 at 9:05 AM, Robert<[email protected]> wrote:
>>>>>> Dag Sverre Seljebotn wrote:
>>>>>>> Robert wrote:
>>>>>>>> How to deal with Python's array.array directly - as with numpy.pxd
>>>>>>>> is there a array.pxd ?
>>>>>>> Which Python version?
>>>>>> mainly 2.6; and 2.3
>>>>>>
>>>>>>> Under Python 3 this should probably happen automatically, try:
>>>>>>>
>>>>>>> cdef object[int, ndim=1, mode="c"] arr = yourarray
>>>>>>>
>>>>>>> Not sure about Python 2.6+
>>>>>> 2.6.2 didn't:
>>>>>>   File "calc_c.pyx", line 32, in calc_c.test1 (calc_c.c:808)
>>>>>>     cdef object[float, ndim=1, mode="c"] b = pyarray
>>>>>> TypeError: 'array.array' does not have the buffer interface
>>>>>>
>>>>>> in Python "buffer(myarray)" also behaves strange
>>>>>>
>>>>>>> For Python 2.5- an array.pxd must be written. It is not difficult, one
>>>>>>> simply follows the pattern in numpy.pxd (by implementing __getbuffer__
>>>>>>> and filling in the Py_buffer struct).
>>>>>>>
>>>>>>> If somebody ends up doing this, please submit it for inclusion in
>>>>>>> Python.
>>>>>> I've put a array.pxd here:
>>>>>> http://trac.cython.org/cython_trac/ticket/314
>>>>> hi, what's the status on this? it seemed very useful.
>>>>> if it needs tests, i could try to write some given some info on where
>>>>> to start, what to
>>>>> cover or maybe some cython-numpy tests to from which to crib?
>>>>> thanks,
>>>>> -brent
>>>> Great!
>>>>
>>>> The main issue with the patch as I see it is that it tries to hack on
>>>> multi-dimensionality. That is very easily done simply by writing a
>>>> subclass of array instead, and so doesn't belong in the pxd like this.
>>>>
>>>> If you could just remove the multi-dimensional stuff from it I'd be
>>>> happy to accept it. Accompanying tests are strongly preferred though.
>>>>
>>>> There's a section on writing tests here:
>>>> http://wiki.cython.org/HackerGuide
>>>>
>>>> I don't expect the numpy tests to be too useful (though they are in
>>>> tests/run), just make sure the basics work with a couple of different
>>>> datatypes. I'll be happy to suggest improvements if you make a first
>>>> iteration anyway.
>>>>
>>> thanks for the pointers,
>>> i have a start on this, but what should i do with the arrayarray.h?
>>> there doesn't seem to be any cases where a header is included with cython
>>> itself.
>>>
>>> -brent
>>>
>>
>> here's a first iteration with tests for some feedback.
>> i'm still not sure what to do with the .h file. currently
>
> Hmm. If it wasn't for the #ifdefs it could have been ported to Cython
> code and inserted in the pxd I think.
>
> Anyway, there's not much to do -- but you can have a look at runtests.py
> to make it add Cython/Includes to the C include path automatically?

i updated the patch to do that, but then what about when running
outside of tests? perhaps this is better as its own python module?
robert?
i'm attaching an updated patch here which adjusts runtests.py.
hopefully someone can add to the tests or suggest more. they're surely
not complete as i have only a superficial understanding of this.
-brent

>
> I'm a bit uncertain about this though. Is the faster creation etc.
> really necessary? It seems like adding a lot of complexity (basically
> one needs to keep up with all Python versions and verify that the header
> still works for each new Python release). So I'm not quite sure about
> it...thoughts? But if it is deemed important I think it can go in.
>
> Tests seem fine -- I only glanced superficially, I just see that they
> are there and trust the submitter of the patch to know the subject
> matter better than me.
>
> Finally, to properly support the PEP 3118, one should make sure that
> memory is not deallocated between a call to getbuffer and release. I.e.
> when the array is resized, one should not free the memory, but
> releasebuffer should do that.
>
> I realize that this is likely impossible to do though (the array can be
> resized through Python calls, right?) so we'll just have to drop that
> and make a big warning not to resize while a buffer is acquired.
>
> --
> Dag Sverre
> _______________________________________________
> Cython-dev mailing list
> [email protected]
> http://codespeak.net/mailman/listinfo/cython-dev
>
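
To make the PEP 3118 point above concrete, here is a small sketch (not
part of the patch). It assumes a current CPython 3 interpreter rather
than the 2.x versions discussed in the thread, where array.array exports
the buffer interface natively. The helper name try_append is made up for
illustration. It shows the same fields that __getbuffer__ fills in, and
that the interpreter refuses to resize the array while a buffer is held:

```python
import array

a = array.array('f', [1.0, 2.0, 3.0])

# Acquire a buffer; this is what a typed Cython buffer variable does
# under the hood via __getbuffer__.
view = memoryview(a)

# The Py_buffer fields discussed above, as seen from Python.
fields = (view.format, view.itemsize, view.ndim, view.shape, view.strides)


def try_append(arr, value):
    """Hypothetical helper: return True if the resize was allowed."""
    try:
        arr.append(value)
    except BufferError:
        # CPython 3 rejects a resize while a buffer is exported.
        return False
    return True


blocked = not try_append(a, 4.0)   # buffer still held: resize is refused
view.release()                     # corresponds to __releasebuffer__
allowed = try_append(a, 4.0)       # no exports left: resize succeeds
```

In other words, the interpreter-side answer to "the array can be resized
through Python calls" was to count exports and refuse the resize, which
is what the ob_exports field in the Python 3 struct layout is for.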
# HG changeset patch
# User Brent Pedersen <[email protected]>
# Date 1244678131 25200
# Node ID 4f02dff6b427c947903ab1fa0825fd62af3b552f
# Parent 51fa7e425dc815065c9e92af4e486d8d0afad326
add array.pxd by rh and include some tests.

diff -r 51fa7e425dc8 -r 4f02dff6b427 Cython/Includes/array.pxd
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Cython/Includes/array.pxd Wed Jun 10 16:55:31 2009 -0700
@@ -0,0 +1,197 @@
+"""
+  array.pxd
+
+  Cython direct interface to Python's array.array type (builtin module).
+
+    * 1D contiguous data view
+    * tools for fast array creation, maximum C-speed and handiness
+    * suitable as allround light weight auto-array within Cython code too
+
+  See also: array_example.pyx
+
+  Usage:
+
+  cimport array
+
+  Usage through Cython buffer interface (Py2.3+):
+
+    @cython.boundscheck(False)
+    def f(arg1, unsigned i, double dx):
+        cdef array.array[double] a = arg1
+        a[i] += dx
+
+  Fast C-level array_new, new_array_zeros, array_resize, array_copy,
+  .length, array_zero
+
+    cdef array.array[double] k = array.array_copy(d)
+    cdef array.array[double] n = array.array_new(d, d.length * 2 )
+    cdef array.array[double] m = array.new_array_zeros(FLOAT_TEMPLATE, 100)
+    array.array_resize(f, 200000)
+
+  Zero overhead with naked data pointer views by union:
+  _f, _d, _i, _c, _u, ...
+  => Original C array speed + Python dynamic memory management
+
+    cdef array.array a = inarray
+    if
+    a._d[2] += 0.66   # use as double array without extra casting
+
+    float *subview = vector._f + 10  # starting from 10th element
+    unsigned char *subview_buffer = vector._B + 4
+
+  Suitable as lightweight arrays intra Cython without speed penalty.
+  Replacement for C stack/malloc arrays; no trouble with refcounting,
+  mem.leaks; seamless Python compatibility, buffer() option
+
+
+  IMPORTANT: arrayarray.h (arrayobject, arraydescr) is not part of
+             the official Python C-API so far; arrayarray.h is located
+             next to this file copy it to PythonXX/include or local or
+             somewhere on your -I path
+
+  last changes: 2009-05-15 rk
+"""
+import os.path as op
+cimport stdlib
+import _cyarray
+
+cdef extern from "stdlib.h" nogil:
+    void *memset(void *str, int c, size_t n)
+    char *strcat(char *str1, char *str2)
+    char *strncat(char *str1, char *str2, size_t n)
+    void *memchr(void *str, int c, size_t n)
+    int memcmp(void *str1, void *str2, size_t n)
+    void *memcpy(void *str1, void *str2, size_t n)
+    void *memmove(void *str1, void *str2, size_t n)
+
+
+cdef extern from "arrayarray.h":
+    ctypedef void PyTypeObject
+    ctypedef short Py_UNICODE
+    int PyErr_BadArgument()
+    ctypedef class array.array [object arrayobject]
+    ctypedef object GETF(array a, Py_ssize_t ix)
+    ctypedef object SETF(array a, Py_ssize_t ix, object o)
+    ctypedef struct arraydescr:  # [object arraydescr]:
+            int typecode
+            int itemsize
+            GETF getitem    # PyObject * (*getitem)(struct arrayobject *, Py_ssize_t);
+            SETF setitem    # int (*setitem)(struct arrayobject *, Py_ssize_t, PyObject *);
+
+    ctypedef class array.array [object arrayobject]:
+        cdef __cythonbufferdefaults__ = {'ndim' : 1, 'mode':'c'}
+        ##cdef __cythonbufferdefaults__ = {"mode": "strided"}
+
+        cdef:
+            PyTypeObject* ob_type
+
+            int ob_size             # number of valid items;
+            unsigned length         # == ob_size (by union)
+
+            char* ob_item           # to first item
+
+            Py_ssize_t allocated    # bytes
+            arraydescr* ob_descr    # struct arraydescr *ob_descr;
+            object weakreflist      # /* List of weak references */
+
+            # view's of ob_item:
+            float* _f               # direct float pointer access to buffer
+            double* _d              # double ...
+            int* _i
+            unsigned *_I
+            unsigned char *_B
+            signed char *_b
+            char *_c
+            unsigned long *_L
+            long *_l
+            short *_h
+            unsigned short *_H
+            Py_UNICODE *_u
+            void* _v
+
+        #$9 method decorations don't work so far => module function
+        #$9 cdef inline resize(array self, int n):
+        #$9     PyMem_Realloc(self.ob_item, n * self.ob_descr.itemsize)
+
+        # Note: This syntax (function definition in pxd files) is an
+        # experimental exception made for __getbuffer__ and __releasebuffer__
+        # -- the details of this may change.
+        def __getbuffer__(array self, Py_buffer* info, int flags):
+            # This implementation of getbuffer is geared towards Cython
+            # requirements, and does not yet fulfill the PEP.
+            # In particular strided access is always provided regardless
+            # of flags
+            cdef unsigned rows, columns, itemsize, ndim = 1
+
+            info.suboffsets = NULL
+            info.buf = self.ob_item
+            info.readonly = 0
+            info.itemsize = itemsize = self.ob_descr.itemsize   # e.g. sizeof(float)
+
+            info.strides = <Py_ssize_t*> \
+                           stdlib.malloc(sizeof(Py_ssize_t) * ndim * 2 + 2)
+            info.shape = info.strides + 1
+            info.shape[0] = self.ob_size            # number of items
+            info.strides[0] = info.itemsize
+
+            info.ndim = ndim
+            info.format = <char*>(info.strides + 2 * ndim)
+            info.format[0] = self.ob_descr.typecode
+            info.format[1] = 0
+            info.obj = self
+            ##print "array.pyx NDIM rows columns", ndim, rows, columns
+
+        def __releasebuffer__(array self, Py_buffer* info):
+            #if PyArray_HASFIELDS(self):
+            #    stdlib.free(info.format)
+            #if sizeof(npy_intp) != sizeof(Py_ssize_t):
+            stdlib.free(info.strides)
+            ##print "__releasebuffer__"
+
+    array newarrayobject(PyTypeObject* type, Py_ssize_t size,
+                              arraydescr *descr)
+
+    # fast resize/realloc
+    # not suitable for small increments; reallocation 'to the point'
+    int array_resize(array self, Py_ssize_t n)
+    # efficient for small increments (not in Py2.3-)
+    int array_resize_smart(array self, Py_ssize_t n)
+
+
+# fast creation of a new array - init with zeros
+# yet you need a (any) template array of the same item type (but not same size)
+cdef inline array new_array_zeros(array sametype, unsigned n):
+    cdef array op = newarrayobject(<PyTypeObject*>sametype.ob_type, n, sametype.ob_descr)
+    if op:
+        memset(op.ob_item, 0, n * op.ob_descr.itemsize)
+    return op
+
+# fast creation of a new array - no init with zeros
+cdef inline array array_new(array sametype, unsigned n):
+    return newarrayobject(<PyTypeObject*>sametype.ob_type, n,
+                                    sametype.ob_descr)
+
+cdef inline array array_copy(array self):
+    cdef array op = newarrayobject(<PyTypeObject*>self.ob_type, self.ob_size,
+                                    self.ob_descr)
+    memcpy(op.ob_item, self.ob_item, op.ob_size * op.ob_descr.itemsize)
+    return op
+
+cdef inline int array_extend_buffer(array self, char* stuff, Py_ssize_t n):
+    """ efficient appending of new stuff of same type (e.g. of same array type)
+        n: number of elements (not number of bytes!)
+    """
+    cdef Py_ssize_t itemsize = self.ob_descr.itemsize, orgsize = self.ob_size
+    if -1 == array_resize_smart(self, orgsize + n):
+        return -1
+    memcpy( self.ob_item + orgsize * itemsize, stuff, n * itemsize )
+
+cdef inline int array_extend(array self, array other):
+    if self.ob_descr.typecode != other.ob_descr.typecode:
+        PyErr_BadArgument()
+        return -1
+    return array_extend_buffer(self, other.ob_item, other.ob_size)
+
+
+cdef inline void array_zero(array op):
+    memset(op.ob_item, 0, op.ob_size * op.ob_descr.itemsize)
diff -r 51fa7e425dc8 -r 4f02dff6b427 Cython/Includes/arrayarray.h
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/Cython/Includes/arrayarray.h Wed Jun 10 16:55:31 2009 -0700
@@ -0,0 +1,165 @@
+/* arrayarray.h
+
+    artificial C-API for Python's
+    <array.array> type.
+    copy this file to your -I path, e.g. .../pythonXX/include
+    See array.pxd next to this file
+
+    last changes: 2009-05-15 rk
+
+*/
+
+#ifndef _ARRAYARRAY_H
+#define _ARRAYARRAY_H
+
+#include <Python.h>
+
+struct arrayobject; /* Forward */
+
+/* All possible arraydescr values are defined in the vector "descriptors"
+ * below.  That's defined later because the appropriate get and set
+ * functions aren't visible yet.
+ */
+typedef struct arraydescr {
+    int typecode;
+    int itemsize;
+    PyObject * (*getitem)(struct arrayobject *, Py_ssize_t);
+    int (*setitem)(struct arrayobject *, Py_ssize_t, PyObject *);
+#if PY_VERSION_HEX >= 0x03000000
+    char *formats;
+#endif
+} arraydescr;
+
+
+typedef struct arrayobject {
+    PyObject_HEAD
+    union {
+        int ob_size;
+        unsigned length;
+    };
+    union {
+        char *ob_item;
+        float *_f;
+        double *_d;
+        int *_i;
+        unsigned *_I;
+        unsigned char *_B;
+        signed char *_b;
+        char *_c;
+        unsigned long *_L;
+        long *_l;
+        short *_h;
+        unsigned short *_H;
+        Py_UNICODE *_u;
+        void *_v;
+    };
+#if PY_VERSION_HEX >= 0x02040000
+    Py_ssize_t allocated;
+#endif
+    struct arraydescr *ob_descr;
+#if PY_VERSION_HEX >= 0x02040000
+    PyObject *weakreflist; /* List of weak references */
+#if PY_VERSION_HEX >= 0x03000000
+    int ob_exports;  /* Number of exported buffers */
+#endif
+#endif
+} arrayobject;
+
+
+#ifndef NO_NEWARRAY_INLINE
+/*
+ *
+ *  fast creation of a new array - init with zeros
+ */
+
+static inline PyObject *
+newarrayobject(PyTypeObject *type, Py_ssize_t size, struct arraydescr *descr)
+{
+    arrayobject *op;
+    size_t nbytes;
+
+    if (size < 0) {
+        PyErr_BadInternalCall();
+        return NULL;
+    }
+
+    nbytes = size * descr->itemsize;
+    /* Check for overflow */
+    if (nbytes / descr->itemsize != (size_t)size) {
+        return PyErr_NoMemory();
+    }
+    op = (arrayobject *) type->tp_alloc(type, 0);
+    if (op == NULL) {
+        return NULL;
+    }
+    op->ob_descr = descr;
+#if !( PY_VERSION_HEX < 0x02040000 )
+    op->allocated = size;
+    op->weakreflist = NULL;
+#endif
+    Py_SIZE(op) = size;
+    if (size <= 0) {
+        op->ob_item = NULL;
+    }
+    else {
+        op->ob_item = PyMem_NEW(char, nbytes);
+        if (op->ob_item == NULL) {
+            Py_DECREF(op);
+            return PyErr_NoMemory();
+        }
+    }
+    return (PyObject *) op;
+}
+#else
+PyObject *
+newarrayobject(PyTypeObject *type, Py_ssize_t size, struct arraydescr *descr);
+#endif
+
+/* fast resize (reallocation to the point)
+   not designed for filing small increments (but for fast opaque array apps) */
+static inline int array_resize(arrayobject *self, Py_ssize_t n)
+{
+    char *item=self->ob_item;
+    PyMem_RESIZE(item, char, n * (unsigned)self->ob_descr->itemsize);
+    if (item == NULL) {
+        PyErr_NoMemory();
+        return -1;
+    }
+    self->ob_item = item;
+    self->ob_size = n;
+#if PY_VERSION_HEX >= 0x02040000
+    self->allocated = n;
+#endif
+    return 0;
+}
+
+/* suitable for small increments; over allocation 50% ;
+   Remains non-smart in Python 2.3- ; but exists for compatibility */
+static inline int array_resize_smart(arrayobject *self, Py_ssize_t n)
+{
+    char *item=self->ob_item;
+#if PY_VERSION_HEX >= 0x02040000
+    if (n < self->allocated) {
+        if (n*4 > self->allocated) {
+            self->ob_size = n;
+            return 0;
+        }
+    }
+    Py_ssize_t newsize = n * 3 / 2 + 1;
+    PyMem_RESIZE(item, char, newsize * self->ob_descr->itemsize);
+    if (item == NULL) {
+        PyErr_NoMemory();
+        return -1;
+    }
+    self->ob_item = item;
+    self->ob_size = n;
+    self->allocated = newsize;
+    return 0;
+#else
+    return array_resize(self, n);   /* Python 2.3 has no 'allocated' */
+#endif
+}
+
+
+#endif
+/* _ARRAYARRAY_H */
diff -r 51fa7e425dc8 -r 4f02dff6b427 tests/run/pyarray.pyx
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tests/run/pyarray.pyx Wed Jun 10 16:55:31 2009 -0700
@@ -0,0 +1,103 @@
+__doc__ = u"""
+    >>> len(a)
+    3
+
+    >>> test_len(a)
+    3L
+
+    >>> test_copy(a)
+    array('f', [1.0, 2.0, 3.0])
+
+    >>> a[2]
+    3.5
+
+    >>> test_fast_access(a)
+
+    >>> test_new_zero(a)
+    array('f', [0.0, 0.0, 0.0, 0.0, 0.0])
+
+    >>> test_set_zero(a)
+    array('f', [0.0, 0.0, 0.0])
+
+    >>> test_resize(a)
+
+    >>> test_view()
+
+    >>> test_extend()
+
+    >>> test_extend_buffer()
+    array('c', 'abcdefghij')
+
+"""
+
+import array  # Python builtin module
+cimport array  # array.pxd / arrayarray.h
+
+a = array.array('f', [1.0, 2.0, 3.0])
+
+def test_len(a):
+    cdef array.array ca = a  # for C-fast array usage
+    return ca.length
+
+def test_copy(a):
+    cdef array.array ca = a
+    cdef array.array b
+    b = array.array_copy(a)
+    a[2] = 3.5
+    assert b[2] != a[2]
+    return b
+
+
+def test_fast_access(a):
+    cdef array.array ca = a
+    assert ca._f[1] == 2.0, ca._f[1]
+
+    assert ca._c[:5] == '\x00\x00\x80?\x00', ca._c[:5]
+
+    ca._f[1] += 2.0
+    assert ca._f[1] == 4.0
+
+
+def test_new_zero(a):
+    cdef array.array cb = array.new_array_zeros(a, 5)
+    assert cb.length == 5
+    return cb
+
+
+def test_set_zero(a):
+    cdef array.array cb = array.array_copy(a)
+    array.array_zero(cb)
+    return cb
+
+
+def test_resize(a):
+    cdef array.array cb = array.array_copy(a)
+    array.array_resize(cb, 10)
+    for i in range(10):
+        cb._f[i] = i
+    assert cb.length == 10
+    assert cb[9] == cb[-1] == cb._f[9] == 9
+
+
+def test_view():
+    a = array.array('i', [1, 2, 3])
+    cdef array.array[int] ca = a
+    assert ca._i[0] == 1
+    assert ca._i[2] == 3
+
+
+def test_extend():
+    cdef array.array ca = array.array('i', [1, 2, 3])
+    cdef array.array cb = array.array('i', range(4, 6))
+    array.array_extend(ca, cb)
+    assert list(ca) == range(1, 6), list(ca)
+
+
+def test_extend_buffer():
+    cdef array.array ca = array.array('c', "abcdef")
+    cdef char* s = "ghij"
+    array.array_extend_buffer(ca, s, len(s)) # or use stdlib.strlen
+
+    assert ca._c[9] == 'j'
+    assert ca.length == 10
+    return ca
# HG changeset patch
# User Brent Pedersen <[email protected]>
# Date 1244730768 25200
# Node ID 495e5b552636f28ee548baf481a8f4d57f728367
# Parent 4f02dff6b427c947903ab1fa0825fd62af3b552f
find arrayarray.h header automatically from runtests.
exclude extra stuffs from array.pxd

diff -r 4f02dff6b427 -r 495e5b552636 Cython/Includes/array.pxd
--- a/Cython/Includes/array.pxd Wed Jun 10 16:55:31 2009 -0700
+++ b/Cython/Includes/array.pxd Thu Jun 11 07:32:48 2009 -0700
@@ -51,9 +51,7 @@
 
   last changes: 2009-05-15 rk
 """
-import os.path as op
 cimport stdlib
-import _cyarray
 
 cdef extern from "stdlib.h" nogil:
     void *memset(void *str, int c, size_t n)
diff -r 4f02dff6b427 -r 495e5b552636 runtests.py
--- a/runtests.py Wed Jun 10 16:55:31 2009 -0700
+++ b/runtests.py Thu Jun 11 07:32:48 2009 -0700
@@ -12,12 +12,19 @@
 TEST_DIRS = ['compile', 'errors', 'run', 'pyregr']
 TEST_RUN_DIRS = ['run', 'pyregr']
 
+PATH = os.path.abspath(__file__)
+
 # Lists external modules, and a matcher matching tests
 # which should be excluded if the module is not present.
 EXT_DEP_MODULES = {
-    'numpy' : re.compile('.*\.numpy_.*').match
+    'numpy' : re.compile('.*\.numpy_.*').match,
+    'array' : re.compile('.*\.pyarray.*').match
 }
 
+def get_pyarray_include_dirs():
+    op = os.path
+    return [op.join(op.dirname(PATH), "Cython/Includes")]
+
 def get_numpy_include_dirs():
     import numpy
     return [numpy.get_include()]
@@ -25,6 +32,7 @@
 EXT_DEP_INCLUDES = [
     # test name matcher , callable returning list
     (re.compile('numpy_.*').match, get_numpy_include_dirs),
+    (re.compile('pyarray.*').match, get_pyarray_include_dirs),
 ]
 
 VER_DEP_MODULES = {
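
For readers who do not have runtests.py open: the patch only shows the
EXT_DEP_INCLUDES table, not the code that consumes it. The following is
a rough sketch of how such a (matcher, callable) table can be resolved
into include directories for a given test name. ROOT and
include_dirs_for are made-up names for illustration and are assumptions,
not code taken from runtests.py:

```python
import os
import re

# Hypothetical stand-in for the directory that contains runtests.py.
ROOT = os.path.join(os.sep, "src", "cython")


def get_pyarray_include_dirs():
    # Same idea as in the patch: point the C compiler at Cython/Includes,
    # where arrayarray.h lives.
    return [os.path.join(ROOT, "Cython", "Includes")]


# (test name matcher, callable returning a list of include dirs)
EXT_DEP_INCLUDES = [
    (re.compile(r'pyarray.*').match, get_pyarray_include_dirs),
]


def include_dirs_for(test_name):
    """Collect include dirs from every entry whose matcher accepts the name."""
    dirs = []
    for matches, get_dirs in EXT_DEP_INCLUDES:
        if matches(test_name):
            dirs.extend(get_dirs())
    return dirs


pyarray_dirs = include_dirs_for("pyarray")
other_dirs = include_dirs_for("numpy_test")
```

The callables are only invoked for matching tests, so a test that does
not need the header never pays for (or depends on) the lookup. This does
not answer the question of where the header should live when building
outside the test suite.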
