On Sat, 17 Sep 2011 14:01:19 -0400, Peter Alexander 
<peter.alexander...@gmail.com> wrote:
I posted this in d.learn, and also on stackoverflow.com, with no
satisfactory answer. Can anyone help me with this?

http://stackoverflow.com/questions/7375165/aligning-stack-variables-in-d

---

Is there a way to align data on the stack? In particular, I want to
create a 16-byte aligned array of floats to load into XMM registers
using movaps, which is significantly faster than movups.

e.g.

void foo()
{
     float[4] v = [1.0f, 2.0f, 3.0f, 4.0f];
     asm
     {
         movaps XMM0, v; // v must be 16-byte aligned for this to work.
         ...
     }
}



It depends. OS X requires 16-byte alignment, which DMD complies with, so on Mac
the above code is okay. On PC, however, the only way to get aligned memory is
to a) use the heap or b) request extra stack space and align it yourself (i.e.
declare a float[7] and then slice it appropriately).
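
A minimal sketch of option b), assuming only that a float array is at least 4-byte aligned. The helper name floatsToSkip is made up for illustration; it is not a Phobos or druntime function.

```d
// Hypothetical helper: number of floats to skip from p to reach the
// next 16-byte boundary (0 if p is already aligned).
size_t floatsToSkip(const(float)* p)
{
    size_t misalign = cast(size_t) p & 15;
    return ((16 - misalign) & 15) / float.sizeof;
}

void foo()
{
    float[7] buf;                       // 4 floats needed + 3 floats of slack
    size_t skip = floatsToSkip(buf.ptr);
    float[] v = buf[skip .. skip + 4];  // 16-byte aligned view into buf
    v[] = [1.0f, 2.0f, 3.0f, 4.0f];
    assert((cast(size_t) v.ptr & 15) == 0);
    // v.ptr can now be loaded into a register and used with movaps.
}

void main()
{
    foo();
}
```

Three floats of slack are enough because the start of buf can be off from a 16-byte boundary by at most 12 bytes.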

The other option is to just use movups. IIRC, movups on aligned data ran at the
same speed as movaps on my CPU (Core 2), and I'd really be surprised if this
weren't true on any modern architecture. (That said, movups does slow down on
unaligned memory.)

Also, you could use alloca or a region allocator to get aligned memory.
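
A rough sketch of the alloca route, assuming alloca from core.stdc.stdlib is usable in the function in question (some compilers restrict where it may appear). alloca itself makes no 16-byte promise here, so the sketch over-allocates by 15 bytes and rounds the address up. The name alignUp16 is made up for illustration.

```d
import core.stdc.stdlib : alloca;

// Hypothetical helper: round an address up to the next multiple of 16.
size_t alignUp16(size_t addr)
{
    return (addr + 15) & ~cast(size_t) 15;
}

void foo()
{
    // Over-allocate so a 16-byte boundary always falls inside the block.
    void* raw = alloca(4 * float.sizeof + 15);
    float[] v = (cast(float*) alignUp16(cast(size_t) raw))[0 .. 4];
    v[] = [1.0f, 2.0f, 3.0f, 4.0f];
    assert((cast(size_t) v.ptr & 15) == 0);
    // The memory is released automatically when foo returns.
}

void main()
{
    foo();
}
```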
