On Thu, 13 Oct 2011 16:12:17 +0100 Richard Earnshaw <rearn...@arm.com> wrote:
> On 13/10/11 15:56, Joseph S. Myers wrote:
> > Indeed, vector initializers are part of the target-independent GNU
> > C language and have target-independent semantics that the elements
> > go in memory order, corresponding to the target-independent
> > semantics of lane numbers where they appear in GENERIC, GIMPLE and
> > (non-UNSPEC) RTL and any target-independent built-in functions that
> > use such numbers. (The issue here being, as you saw, that the lane
> > numbers used in ARM-specific NEON intrinsics are for big-endian not
> > the same as those used in target-independent features of GNU C and
> > target-independent internal representations in GCC - hence various
> > code to translate them between the two conventions when processing
> > intrinsics into non-UNSPEC RTL, and to translate back when
> > generating assembly instructions that encode lane numbers with the
> > ARM conventions, as expounded at greater length at
> > <http://gcc.gnu.org/ml/gcc-patches/2010-06/msg00409.html>.)
> >
>
> This is all rather horrible, and leads to THREE different layouts for
> a 128-bit vector for big-endian Neon.
>
> GCC format
> 'VLD1.n' format
> 'ABI' format
>
> GCC format and 'ABI' format differ in that the 64-bit words of the
> 128-bit vector are swapped.
>
> All this and they are all expected to share a single machine mode.
>
> Furthermore, the definitions in GCC are broken, in that the types
> defined in arm_neon.h (eg int8x16_t) are supposed to be ABI format,
> not GCC format.
>
> Eukkkkkk! :-(

FWIW, I thought long and hard about this problem, and eventually gave
up trying to solve it. Note that many operations which depend on the
ordering of vectors are now disabled entirely (at least for Q regs) in
neon.md in big-endian mode to try and limit the damage. NEON is
basically only supported properly in little-endian mode, IMO.

I'd love to see this resolved properly.

Some random observations:

* The vectorizer can use whatever layout it wants for vectors in
  either endianness. Vectorizer vectors never interact with either GCC
  generic (source-level) vectors or the NEON intrinsics. Also they
  never cross ABI boundaries.

* GCC generic vectors aren't specified very formally, particularly
  wrt. their interaction with NEON intrinsics. If you stick *entirely*
  to accessing vectors via NEON intrinsics, the problems in big-endian
  mode (I think) don't ever materialise. This includes not using
  indirection to load/store vectors, and (of course) not constructing
  vectors using { x, y, z... } syntax. One possibility might be to
  detect and *disallow* code which attempts to mix vector operations
  like that.

I don't quite understand your comment about the GCC definitions of
int8x16_t etc. being broken, tbh...

Cheers,

Julian