On 3/12/2009 4:05 PM, Andrew Straw wrote: > So, what's your take on having each row aligned? Is this also useful for > FFTW, for example? If so, we should perhaps come up with a better > routine for the cookbook.
Ok, so here is how it could be done. It fails for a reason I'll attribute to a bug in NumPy. import numpy as np def _nextpow(b,isize): i = 1 while b**i < isize: i += 1 return b**i def aligned_zeros(shape, boundary=16, dtype=float, order='C', imagealign=True): if (not imagealign) or (not hasattr(shape,'__len__')): N = np.prod(shape) d = np.dtype(dtype) tmp = np.zeros(N * d.itemsize + boundary, dtype=np.uint8) address = tmp.__array_interface__['data'][0] offset = (boundary - address % boundary) % boundary return tmp[offset:offset+N*d.itemsize]\ .view(dtype=d)\ .reshape(shape, order=order) else: if order == 'C': ndim0 = shape[-1] dim0 = -1 else: ndim0 = shape[0] dim0 = 0 d = np.dtype(dtype) bshape = [i for i in shape] padding = boundary + _nextpow(boundary, d.itemsize) - d.itemsize bshape[dim0] = ndim0*d.itemsize + padding print bshape tmp = np.zeros(bshape, dtype=np.uint8, order=order) address = tmp.__array_interface__['data'][0] offset = (boundary - address % boundary) % boundary aligned_slice = slice(offset, offset + ndim0*d.itemsize) if tmp.flags['C_CONTIGUOUS']: tmp = tmp[..., aligned_slice] print tmp.shape else: tmp = tmp[aligned_slice, ...] print tmp.shape return tmp.view(dtype=dtype) # this will often fail, # probably a bug in numpy So lets reproduce the NumPy issue: >>> a = zeros((10,52), dtype=uint8) >>> b = a[:, 3:8*2+3] >>> b.shape (10, 16) >>> b.view(dtype=float) Traceback (most recent call last): File "<pyshell#86>", line 1, in <module> b.view(dtype=float) ValueError: new type not compatible with array. However: >>> a = zeros((10,16), dtype=uint8) >>> a.view(dtype=float) array([[ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.], [ 0., 0.]]) Until we find a way to overcome this, it will be difficult to align rows to particular byte boundaries. It fails even if we make sure the padding is a multiple of the item size: padding = (boundary + _nextpow(boundary, d.itemsize) \ - d.itemsize) * d.itemsize Very annoying.. Using allocators in libraries (e.g. FFTW) would not help either, as NumPy would fail in the same way. Maybe we can force NumPy to do the right thing by hard-coding an array descriptor? We can do this in Cython though, as it supports pointers and double indirection. But it would be like using C. Sturla Molden _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion