On Fri, 2008-04-25 at 04:15 +0200, Sturla Molden wrote:
> The problem with alignment on 3 byte boundaries, is that 3 is a prime and
> not a factor of the size of any common data type. (The only exception I
> can think of is 24 bit RGB values.) So in general, the elements in an
> array for which the first element is aligned on a 3 byte boundary, may or
> may not be 3-byte aligned.
> Byte boundary alignment should thus be a bit intelligent. If the size of
> the dtype is not divisible by the byte boundary, an exception should be
> raised.
>
> In practice, only alignment on 2-, 4- and perhaps 8-byte boundaries are
> really required.
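To make the divisibility argument above concrete, here is a minimal sketch in C. The function names are hypothetical and chosen only for illustration; nothing here is NumPy API. It assumes a contiguous array and treats addresses as plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helpers for illustration only; not part of NumPy. */

/* Returns 1 if every element of a contiguous array that starts at byte
 * address `base`, with `count` elements of `itemsize` bytes each, lies on
 * a multiple of `boundary`; returns 0 otherwise. */
int all_elements_aligned(size_t base, size_t itemsize, size_t count,
                         size_t boundary)
{
    assert(boundary != 0);
    for (size_t i = 0; i < count; ++i) {
        if ((base + i * itemsize) % boundary != 0)
            return 0;
    }
    return 1;
}

/* The check proposed in the quoted text: when the first element is
 * aligned, an array of two or more elements keeps every element aligned
 * only if the item size is a multiple of the boundary. */
int itemsize_fits_boundary(size_t itemsize, size_t boundary)
{
    assert(boundary != 0);
    return itemsize % boundary == 0;
}
```

For instance, an array of 8-byte doubles starting at address 12 has its first element on a 3-byte boundary, but the second element sits at address 20, which is not a multiple of 3.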
There are other useful alignments. I don't know about MMX, but for SSE,
16-byte alignment is almost required for a useful speedup: it lets you load
data from memory into SSE registers with movaps instead of movups, which is
extremely slow. I once saw it mentioned that the MKL also sometimes requires
64-byte alignment.

> Alignment on 2 byte boundaries should perhaps be NumPy's
> default (over no alignment), as MMX and SSE extensions depend on it.

malloc in glibc aligns on 8-byte boundaries by default, and malloc on Mac OS X
on 16 bytes. I guess the same is true on Solaris, but I should check, since
SPARC does not like unusual alignment (bus errors if floats are not 4-byte
aligned, for example).

I have somewhere the code for portable aligned allocators (mostly given by
Steve Johnson of FFTW fame), plus a C API to access them in C extensions,
plus a default alignment of 16 bytes for PyDataMem_NEW (which is just a
wrapper around those aligned allocators).

cheers,

David
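P.S. For readers unfamiliar with the idea, a portable aligned allocator can be sketched roughly as follows. This is not the code mentioned above, and the names `aligned_malloc_sketch` and `aligned_free_sketch` are hypothetical. It assumes the common over-allocation technique: request extra bytes from malloc, round the address up to the requested power-of-two boundary, and stash the original pointer just below the aligned block so it can be freed later.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names for illustration; not the NumPy or FFTW API. */

/* Allocate `size` bytes at an address that is a multiple of `alignment`,
 * which must be a power of two. Returns NULL on failure. */
void *aligned_malloc_sketch(size_t size, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

    /* Extra room for worst-case padding plus the saved malloc pointer. */
    size_t extra = alignment - 1 + sizeof(void *);
    if (size > SIZE_MAX - extra)
        return NULL;

    void *raw = malloc(size + extra);
    if (raw == NULL)
        return NULL;

    /* Skip the slot reserved for the saved pointer, then round up. */
    uintptr_t start = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);

    /* Store the malloc pointer just below the aligned block. memcpy is
     * used so the store is valid even when that slot is not itself
     * pointer-aligned (small alignments). */
    memcpy((unsigned char *)aligned - sizeof(void *), &raw, sizeof(void *));
    return (void *)aligned;
}

/* Release a block obtained from aligned_malloc_sketch. */
void aligned_free_sketch(void *ptr)
{
    if (ptr == NULL)
        return;
    void *raw;
    memcpy(&raw, (unsigned char *)ptr - sizeof(void *), sizeof(void *));
    free(raw);
}
```

A block obtained with `aligned_malloc_sketch(n, 16)` would be suitable for aligned SSE loads. It must be released with `aligned_free_sketch`, not plain `free`, because the returned address is not the one malloc handed out.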