On Mon, Sep 16, 2013 at 10:29:37AM +0100, James Greenhalgh wrote:
> The little endian compiler would lay memory out as:
>
>   0x0                ...                0x8
>   | 0b | 0a | 1b | 1a | 2b | 2a | 3b | 3a |
>
> And the big endian compiler would lay out memory as:
>
>   0x0                ...                0x8
>   | 0a | 0b | 1a | 1b | 2a | 2b | 3a | 3b |
>
> In both cases, element 0 is '0x0a0b'. If we load this array as a
> vector with ld1.h both big and little-endian compilers will load
> the vector as:
>
>   bit 128 ..  bit 64                              bit 0
>   lane 16     | lane 3  |                   | lane 0  |
>   |.....      | 3b | 3a | 2b | 2a | 1b | 1a | 0b | 0a |
>

Ugh, I knew I would make a mistake somewhere!

This should, of course, be loaded as:

  bit 128 ..  bit 64                              bit 0
  lane 16     | lane 3  |                   | lane 0  |
  |.....      | 3a | 3b | 2a | 2b | 1a | 1b | 0a | 0b |

James