Re: [Numpy-discussion] aligned matrix / ctypes
Hello all,

Attached is code (plus tests) for allocating aligned arrays -- I think this addresses all the requests in this thread, with regard to allowing for different kinds of alignment. Thanks, Robert and Anne, for your help and suggestions. Hopefully this will be useful.

The core is a function for allocating arrays with totally arbitrary alignment along each dimension (e.g. you could allocate a 10x20 array of uint16s where each uint16 is aligned to a 4-byte boundary, each row of 20 uint16s is aligned to a 32-byte boundary, and the entire buffer is aligned to a 128-byte boundary).

I've also included helper functions for two common cases: when you want everything aligned to a particular multiple (every element, row, etc., as well as the whole buffer), and when you want an array where the rows (second-fastest-moving index) are so aligned (this was my original use case, for fast image blitting).

Zach

def aligned_empty(shape, dtype, dim_alignments, array_alignment):
    '''Allocate an empty array with the given shape and dtype, where the array
    buffer starts at a memory address evenly-divisible by array_alignment, and
    where items along each dimension are offset from the first item on that
    dimension by a byte offset that is an integer multiple of the corresponding
    value in the dim_alignments tuple.

    Example: To allocate a 20x30 array of float32s, where individual data
    elements are aligned to 16-byte boundaries, each row is aligned to a
    64-byte boundary, and the array's buffer starts on a 128-byte boundary,
    call:
      aligned_empty((20,30), numpy.float32, (64, 16), 128)
    '''

def aligned_rows_empty(shape, dtype, alignment, order='C'):
    '''Return an array where the rows (second-fastest-moving index) are aligned
    to byte boundaries evenly-divisible by 'alignment'. If 'order' is 'C', then
    the indexing is such that the fastest-moving index is the last one; if the
    order is 'F', then the fastest-moving index is the first.'''

def aligned_elements_empty(shape, dtype, alignment, order='C'):
    '''Return an array where each element is aligned to byte boundaries evenly-
    divisible by 'alignment'.'''

import numpy

def aligned_empty(shape, dtype, dim_alignments, array_alignment):
    '''Allocate an empty array with the given shape and dtype, where the array
    buffer starts at a memory address evenly-divisible by array_alignment, and
    where items along each dimension are offset from the first item on that
    dimension by a byte offset that is an integer multiple of the corresponding
    value in the dim_alignments tuple.

    Example: To allocate a 20x30 array of float32s, where individual data
    elements are aligned to 16-byte boundaries, each row is aligned to a
    64-byte boundary, and the array's buffer starts on a 128-byte boundary,
    call:
      aligned_empty((20,30), numpy.float32, (64, 16), 128)
    '''
    if len(shape) != len(dim_alignments):
        raise ValueError('Alignments must be provided for each dimension.')
    dtype = numpy.dtype(dtype)
    strides = []
    current_size = dtype.itemsize
    for width, alignment in zip(shape[::-1], dim_alignments[::-1]):
        # build up new strides array in reverse, so that the fastest-moving index
        # is the last (C-ish indexing, but not necessarily contiguous)
        # round current_size up to the next multiple of alignment
        current_size += (alignment - current_size % alignment) % alignment
        strides.append(current_size)
        current_size *= width
    strides = strides[::-1]
    # over-allocate so that an aligned start address always fits in the buffer
    total_bytes = current_size + (array_alignment - 1)
    buffer = numpy.empty(total_bytes, dtype=numpy.uint8)
    address = buffer.ctypes.data
    # bytes to skip to reach the first address divisible by array_alignment
    offset = (array_alignment - address % array_alignment) % array_alignment
    return numpy.ndarray(shape=shape, dtype=dtype, buffer=buffer,
                         strides=strides, offset=offset)

def aligned_rows_empty(shape, dtype, alignment, order='C'):
    '''Return an array where the rows (second-fastest-moving index) are aligned
    to byte boundaries evenly-divisible by 'alignment'. If 'order' is 'C', then
    the indexing is such that the fastest-moving index is the last one; if the
    order is 'F', then the fastest-moving index is the first.'''
    if len(shape) < 2:
        raise ValueError('Need at least a 2D array to align rows.')
    order = order.upper()
    if order not in ('C', 'F'):
        raise ValueError("Order must be 'C' or 'F'.")
    dim_alignments = [1 for dim in shape]
    dim_alignments[-2] = alignment
    if order == 'F':
        # allocate with reversed shape, then transpose to get Fortran-style indexing
        shape = shape[::-1]
        return aligned_empty(shape, dtype, dim_alignments, alignment).T
    else:
        return aligned_empty(shape, dtype, dim_alignments, alignment)

def aligned_elements_empty(shape, dtype, alignment, order='C'):
    '''Return an array where each element is aligned to byte boundaries evenly-
    divisible by 'alignment'.'''
    order = order.upper()
    if order not in ('C', 'F'):
        raise ValueError("Order must be 'C' or 'F'.")
    dim_alignments = [alignment for dim in shape]
    if order == 'F':
        shape = shape[::-1]
        return aligned_empty(shape, dtype,
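As a rough illustration of the stride arithmetic in aligned_empty, the sketch below redoes the padding calculation in plain Python without allocating anything. The name padded_strides is hypothetical and not part of the attached code; it only assumes the same round-up rule as the loop in aligned_empty.

```python
def padded_strides(shape, itemsize, dim_alignments):
    """Return (strides, nbytes) using the same rounding rule as aligned_empty.

    Hypothetical helper for illustration only: it is not part of the
    attached code and allocates no memory.
    """
    if len(shape) != len(dim_alignments):
        raise ValueError('Alignments must be provided for each dimension.')
    strides = []
    current_size = itemsize
    # Walk the dimensions from fastest-moving (last) to slowest-moving (first).
    for width, alignment in zip(shape[::-1], dim_alignments[::-1]):
        # Round current_size up to the next multiple of alignment.
        current_size += -current_size % alignment
        strides.append(current_size)
        current_size *= width
    return tuple(strides[::-1]), current_size


# The docstring example: 20x30 float32s (4 bytes each), rows on 64-byte
# boundaries, elements on 16-byte boundaries.
print(padded_strides((20, 30), 4, (64, 16)))

# The example from the message body: 10x20 uint16s (2 bytes each), rows on
# 32-byte boundaries, elements on 4-byte boundaries.
print(padded_strides((10, 20), 2, (32, 4)))
```

For the docstring example this works out to strides of (512, 16) bytes over 10240 bytes, and for the 10x20 uint16 case to strides of (96, 4) over 960 bytes, before the extra array_alignment - 1 bytes of slack are added.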
Re: [Numpy-discussion] aligned matrix / ctypes
Robert,

Can we check this in somewhere under numpy.core? It seems very useful.

Stéfan

2008/4/25 Zachary Pincus:
> Hello all,
>
> Attached is code (plus tests) for allocating aligned arrays -- I think this
> addresses all the requests in this thread, with regard to allowing for
> different kinds of alignment. Thanks, Robert and Anne, for your help and
> suggestions. Hopefully this will be useful.

___
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] aligned matrix / ctypes
The problem with alignment on 3-byte boundaries is that 3 is a prime and not a factor of the size of any common data type. (The only exception I can think of is 24-bit RGB values.) So in general, the elements of an array whose first element is aligned on a 3-byte boundary may or may not be 3-byte aligned.

Byte boundary alignment should thus be a bit intelligent. If the size of the dtype is not divisible by the byte boundary, an exception should be raised.

In practice, only alignment on 2-, 4- and perhaps 8-byte boundaries is really required. Alignment on 2-byte boundaries should perhaps be NumPy's default (over no alignment), as MMX and SSE extensions depend on it. nVidia's CUDA also requires alignment on 2-byte boundaries.

Sturla Molden

> On Thu, Apr 24, 2008 at 4:57 PM, Zachary Pincus wrote:
> > The reason that one must slice before .view()ing to allow arbitrary
> > alignment is as follows. Imagine that we want rows of four 2-byte shorts
> > aligned to 3-byte boundaries. (Assume that we already have a buffer that
> > starts on a 3-byte boundary.) So we need an array that's 9 bytes wide by
> > however many rows, and then we just want to use the first eight bytes of
> > each row. If we slice first, we can get a strided array that is eight
> > bytes wide, and thus something that we can interpret as four shorts.
> > (That is, if .view() could handle strided arrays.)
> >
> > On the other hand, there's absolutely no way that we can .view() before
> > slicing, because our underlying array is 9 bytes wide, and you can't look
> > at 9 bytes as any integral number of 2-byte shorts. So .view() should
> > properly fail, and thus we can't get to the slicing.
>
> Yes, you are right, sorry.
>
> --
> Robert Kern
>
> "I have come to believe that the whole world is an enigma, a harmless
> enigma that is made terrible by our own mad attempt to interpret it as
> though it had an underlying truth."
>  -- Umberto Eco
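To make the divisibility argument concrete, here is a small plain-Python sketch that lists which element addresses of a contiguous array miss a given boundary, plus a check along the lines Sturla proposes. The function names are hypothetical, and the addresses are plain integers rather than real pointers.

```python
def misaligned_addresses(start, itemsize, count, boundary):
    """Addresses of contiguous elements that do not fall on the boundary."""
    addresses = [start + i * itemsize for i in range(count)]
    return [a for a in addresses if a % boundary]


def check_boundary(itemsize, boundary):
    """Raise if contiguous elements cannot all sit on the boundary.

    Hypothetical check following the rule suggested in the message above:
    the item size must be a multiple of the requested boundary.
    """
    if itemsize % boundary:
        raise ValueError('item size %d is not a multiple of boundary %d'
                         % (itemsize, boundary))


# Four 2-byte shorts starting at address 0 (which is 3-byte aligned):
# only some of the elements land on a 3-byte boundary.
print(misaligned_addresses(0, 2, 4, 3))

# Four 4-byte items on a 4-byte boundary: every element is aligned.
print(misaligned_addresses(0, 4, 4, 4))
```

Note that this rule only concerns per-element alignment of contiguous data; padding between elements or rows, as in the strided approach discussed elsewhere in the thread, sidesteps it.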
[Numpy-discussion] aligned matrix / ctypes
Hello all,

I need to allocate a numpy array that I will then pass to a camera driver (via ctypes) so that the driver can fill the array with pixels. The catch is that the driver requires that rows of pixels start at 4-byte boundaries.

The sample C++ code given for allocating memory for this is (pixels are unsigned shorts):

    // Two bytes for each pixel, then round
    // up to the next multiple of four.
    long width_bytes = ( ( 2 * width_pixels ) + 3 ) & -4;
    long allocated_size = width_bytes * height;
    unsigned char* image_data = new unsigned char[allocated_size];

I could make an empty uint8 numpy array of the required shape (allocated_size,) and adjust its dtype, shape, and strides to get the desired result. I'd then feed the array's ctypes data attribute to the driver to get filled in. Alternately, I could allocate the data buffer via ctypes and then construct an array around it.

Is either option better? How does one construct a numpy array around a ctypes memory object? Can the array take over the memory management for the buffer?

Thanks for any suggestions,

Zach
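For readers following along in Python, the driver's round-up arithmetic can be sketched as below. This is only a translation of the sample C++ under the same assumptions (2-byte pixels, rows padded to a multiple of four bytes); the function name row_bytes is hypothetical.

```python
def row_bytes(width_pixels, bytes_per_pixel=2, boundary=4):
    """Bytes per row, rounded up to the next multiple of boundary.

    boundary must be a power of two for the bit-mask trick to work;
    & -boundary clears the low bits, just like & -4 in the C++ sample.
    """
    return (bytes_per_pixel * width_pixels + boundary - 1) & -boundary


height = 3
for width_pixels in (5, 6, 7):
    width_bytes = row_bytes(width_pixels)
    print(width_pixels, width_bytes, width_bytes * height)
```

An odd pixel count gains two bytes of padding per row, while an even count already fills a multiple of four bytes.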
Re: [Numpy-discussion] aligned matrix / ctypes
On Wed, Apr 23, 2008 at 2:10 PM, Zachary Pincus wrote:
> Hello all,
>
> I need to allocate a numpy array that I will then pass to a camera driver
> (via ctypes) so that the driver can fill the array with pixels. The catch
> is that the driver requires that rows of pixels start at 4-byte boundaries.
>
> The sample C++ code given for allocating memory for this is (pixels are
> unsigned shorts):
>
>     // Two bytes for each pixel, then round
>     // up to the next multiple of four.
>     long width_bytes = ( ( 2 * width_pixels ) + 3 ) & -4;
>     long allocated_size = width_bytes * height;
>     unsigned char* image_data = new unsigned char[allocated_size];
>
> I could make an empty uint8 numpy array of the required shape
> (allocated_size,) and adjust its dtype, shape, and strides to get the
> desired result. I'd then feed the array's ctypes data attribute to the
> driver to get filled in. Alternately, I could allocate the data buffer via
> ctypes and then construct an array around it.

Note that the approach above doesn't ensure that the first row is correctly aligned. It just assumes that the allocator will always start a new block aligned at 4 bytes (which may be reasonable for the platforms you are targeting).

Ignoring that issue for a moment, the way to make sure that the rows are aligned is very similar to how you do it in C. Round up the row length, make an array with the larger dimensions (height, width_pixels+width_pixels%2), then slice out [:,:width_pixels].

To solve the initial alignment, you overallocate a 1D array by 3 bytes and find the offset from the allocated initial address which is correctly aligned. Slice out the [:allocated_size] portion starting at that offset, .view() it as uint16, reshape it to (height, width_pixels+width_pixels%2), then slice out [:,:width_pixels].

> Is either option better? How does one construct a numpy array around a
> ctypes memory object? Can the array take over the memory management for
> the buffer?

Whenever possible, I try to get numpy to do the allocating. That saves many headaches. It is possible to do otherwise, but let's cover that only if you need it.

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth."
 -- Umberto Eco
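The recipe above can be sketched roughly as follows. This is one reading of the steps rather than code from the thread: the function name aligned_image is hypothetical, and it assumes 2-byte pixels and 4-byte row alignment as in the original question.

```python
import numpy


def aligned_image(height, width_pixels):
    """uint16 image whose rows each start on a 4-byte boundary (sketch)."""
    # Round the row length up to an even number of 2-byte pixels.
    padded_width = width_pixels + width_pixels % 2
    allocated_size = height * padded_width * 2
    # Over-allocate by 3 bytes so an aligned start always fits.
    raw = numpy.empty(allocated_size + 3, dtype=numpy.uint8)
    # Bytes to skip to reach the first 4-byte-aligned address.
    offset = -raw.ctypes.data % 4
    block = raw[offset:offset + allocated_size]
    padded = block.view(numpy.uint16).reshape(height, padded_width)
    # Drop the padding column; the result still refers to raw's memory.
    return padded[:, :width_pixels]


image = aligned_image(5, 7)
print(image.shape, image.strides, image.ctypes.data % 4)
```

Because the result is a view, numpy keeps the over-allocated buffer alive for as long as the image array exists, which is the memory-management convenience Robert alludes to. The address to hand to the driver would be image.ctypes.data.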
Re: [Numpy-discussion] aligned matrix / ctypes
On 23/04/2008, Zachary Pincus wrote:
> Hi,
>
> Thanks a ton for the advice, Robert! Taking an array slice (instead of
> trying to set up the strides, etc. myself) is a slick way of getting this
> result indeed.

It's worth mentioning that there was some discussion of adding an allocator for aligned arrays. It got sidetracked into a discussion of SSE support for numpy, but there are all sorts of reasons to want aligned arrays in numpy, as this post demonstrates.

Seeing as it's just a few lines' worth of pure Python, is anyone interested in writing an aligned_empty() for numpy?

Anne
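To give a sense of how few lines that could be, here is a minimal sketch of the simplest variant: a C-contiguous array whose buffer merely starts on an aligned address. The name simple_aligned_empty and its signature are assumptions for illustration, not an existing numpy function.

```python
import numpy


def simple_aligned_empty(shape, dtype, alignment):
    """Contiguous empty array whose first byte sits on an aligned address.

    Hypothetical sketch; only the start of the buffer is aligned, with no
    padding between rows or elements.
    """
    dtype = numpy.dtype(dtype)
    nbytes = int(numpy.prod(shape)) * dtype.itemsize
    # Over-allocate so an aligned start address is guaranteed to fit.
    raw = numpy.empty(nbytes + alignment - 1, dtype=numpy.uint8)
    offset = -raw.ctypes.data % alignment
    return raw[offset:offset + nbytes].view(dtype).reshape(shape)


a = simple_aligned_empty((3, 5), numpy.float64, 64)
print(a.shape, a.ctypes.data % 64)
```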