http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56351

             Bug #: 56351
           Summary: ARM Big-Endian: storing local double to packed
                    variable causes corruption
    Classification: Unclassified
           Product: gcc
           Version: 4.7.2
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: set...@google.com


Created attachment 29478
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29478
Test case which demonstrates incorrect codegen

The attached code behaves incorrectly on my platform with gcc 4.7.2.  In
particular, the output is:

val is: 1.234567 (0x3FF3C0C9:539B8887)
Calling PrintAndStoreUnaligned:
57432423068808260924249171392224224725059031612325630140261797720764832869069412330679690067968.000000
(0x539B8887:3FF3C0C9)
unaligned_double.val is:
57432423068808260924249171392224224725059031612325630140261797720764832869069412330679690067968.000000
(0x539B8887:3FF3C0C9)

It appears that storing a double parameter into an unaligned variable can cause
all accesses to that parameter within the function to have the upper and lower
32 bits swapped.
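
For readers without access to the attachment, the pattern in question can be
sketched roughly as follows.  This is not the attached test case, only a guess
at its shape: the function and variable names are taken from the disassembly
and output in this report, while the struct definition and the body of
PrintDouble are assumptions.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed layout: a packed struct has alignment 1, so the compiler must
   treat any store to .val as potentially unaligned. */
struct __attribute__((packed)) UnalignedDouble {
    double val;
};

struct UnalignedDouble unaligned_double;

/* Assumed helper: prints the value followed by its high and low 32-bit
   words, in the "(0xHIGH:LOW)" style seen in the output above. */
void PrintDouble(double val) {
    uint64_t bits;
    memcpy(&bits, &val, sizeof bits);  /* inspect the bits without aliasing issues */
    printf("%f (0x%08X:%08X)\n", val,
           (unsigned)(bits >> 32), (unsigned)(bits & 0xFFFFFFFFu));
}

/* Uses the double parameter twice: once as a call argument and once as the
   source of a store into the packed variable. */
void PrintAndStoreUnaligned(double val) {
    PrintDouble(val);
    unaligned_double.val = val;
}
```

With a correct compiler, calling PrintAndStoreUnaligned(1.234567) should leave
unaligned_double.val equal to 1.234567.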

This code is being built for a TI TMS570-series processor, although I suspect
the problem would occur with any big-endian ARM target with VFPv3
floating-point support.  Compiler information follows.  Building the compiler
with these flags requires a minor patch:
http://gcc.gnu.org/ml/gcc-patches/2013-02/msg00791.html

% third_party/car/embedded/toolchains/gcc_tms570/bin/armeb-unknown-eabi-gcc -v
-save-temps -O1 -c gcc_bug.c -o gcc_bug.o -Wa,-adhlsn=gcc_bug.lst
Using built-in specs.
COLLECT_GCC=third_party/car/embedded/toolchains/gcc_tms570/bin/armeb-unknown-eabi-gcc
Target: armeb-unknown-eabi
Configured with: ../gcc-4.7.2/configure
--prefix=/usr/local/google/armeb/toolchain --build=x86_64-cross-linux-gnu
--target=armeb-unknown-eabi --host=x86_64-cross-linux-gnu
--with-sysroot=/usr/local/google/armeb/sysroot --with-newlib
--with-headers=../newlib-1.19.0/newlib/libc/include --disable-nls
--enable-languages=c,c++ --enable-c99 --enable-long-long
--with-mpfr=/usr/local/google/armeb/toolchain
--with-gmp=/usr/local/google/armeb/toolchain
--with-mpc=/usr/local/google/armeb/toolchain --disable-multilib
--with-abi=aapcs --with-arch=armv7-r --with-mode=thumb --with-float=hard
--with-fpu=vfpv3-d16 --disable-threads --disable-shared --disable-libgomp
--disable-libmudflap --disable-libssp
Thread model: single
gcc version 4.7.2 (GCC) 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O1' '-c' '-o' 'gcc_bug.o'
'-march=armv7-r' '-mfloat-abi=hard' '-mfpu=vfpv3-d16' '-mabi=aapcs' '-mthumb'

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../libexec/gcc/armeb-unknown-eabi/4.7.2/cc1
-E -quiet -v -iprefix
/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/
-D__USES_INITFINI__ gcc_bug.c -march=armv7-r -mfloat-abi=hard -mfpu=vfpv3-d16
-mabi=aapcs -mthumb -O1 -fpch-preprocess -o gcc_bug.i
ignoring duplicate directory
"/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/../../lib/gcc/armeb-unknown-eabi/4.7.2/include"
ignoring nonexistent directory
"/usr/local/google/armeb/sysroot/usr/local/include"
ignoring duplicate directory
"/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/../../lib/gcc/armeb-unknown-eabi/4.7.2/include-fixed"
ignoring duplicate directory
"/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/../../lib/gcc/armeb-unknown-eabi/4.7.2/../../../../armeb-unknown-eabi/include"
#include "..." search starts here:
#include <...> search starts here:

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/include

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/include-fixed

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/../../../../armeb-unknown-eabi/include
 /usr/local/google/armeb/sysroot/usr/include
End of search list.
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O1' '-c' '-o' 'gcc_bug.o'
'-march=armv7-r' '-mfloat-abi=hard' '-mfpu=vfpv3-d16' '-mabi=aapcs' '-mthumb'

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../libexec/gcc/armeb-unknown-eabi/4.7.2/cc1
-fpreprocessed gcc_bug.i -quiet -dumpbase gcc_bug.c -march=armv7-r
-mfloat-abi=hard -mfpu=vfpv3-d16 -mabi=aapcs -mthumb -auxbase-strip gcc_bug.o
-O1 -version -o gcc_bug.s
GNU C (GCC) version 4.7.2 (armeb-unknown-eabi)
        compiled by GNU C version 4.6.x-google 20120601 (prerelease), GMP
version 5.0.5, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
GNU C (GCC) version 4.7.2 (armeb-unknown-eabi)
        compiled by GNU C version 4.6.x-google 20120601 (prerelease), GMP
version 5.0.5, MPFR version 3.1.1, MPC version 1.0.1
GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
Compiler executable checksum: 67327bcd17af73e1cc289bfa68add0a9
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O1' '-c' '-o' 'gcc_bug.o'
'-march=armv7-r' '-mfloat-abi=hard' '-mfpu=vfpv3-d16' '-mabi=aapcs' '-mthumb'

/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/../../../../armeb-unknown-eabi/bin/as
-march=armv7-r -mfloat-abi=hard -mfpu=vfpv3-d16 -meabi=5 -adhlsn=gcc_bug.lst -o
gcc_bug.o gcc_bug.s
COMPILER_PATH=/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../libexec/gcc/armeb-unknown-eabi/4.7.2/:/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../libexec/gcc/:/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/../../../../armeb-unknown-eabi/bin/
LIBRARY_PATH=/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/:/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/:/google/src/cloud/sethml/head2/google3/third_party/car/embedded/toolchains/gcc_tms570/bin/../lib/gcc/armeb-unknown-eabi/4.7.2/../../../../armeb-unknown-eabi/lib/
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-O1' '-c' '-o' 'gcc_bug.o'
'-march=armv7-r' '-mfloat-abi=hard' '-mfpu=vfpv3-d16' '-mabi=aapcs' '-mthumb'

Here's disassembly of the bad code.  The problem seems to be the fmrs
instructions copying from s0/s1.  ARM document "DDI0363E ARM Cortex-R4-r1p3
technical reference" in section "12.2.1 FPU views of the register bank" says:
  The mapping between the registers is as follows:
  • S<2n> maps to the least significant half of D<n>
  • S<2n+1> maps to the most significant half of D<n>.
So, in the code below, r4 gets s0, which is the least significant word of d0,
and r5 gets s1, which is the most significant word of d0.  The code then stores
r4 at the lower address (offset 0) and r5 at offset 4, which is incorrect for a
big-endian architecture, where the most significant word belongs at the lower
address.  Likewise, the fmdrr instruction takes the least significant word from
its first core-register operand, so the fmdrr instruction on line 64
(fmdrr d0, r3, r2) reassembles d0 with its halves swapped.  (It's also worth
noting that the code below creates a lot of unnecessary temporaries, but that's
not my bug.)
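
The effect of the swap can be reproduced on any host with a small sketch like
the one below.  It is only an illustration of the arithmetic, not part of the
test case; DoubleBits and SwapWords are hypothetical helper names.

```c
#include <stdint.h>
#include <string.h>

/* Returns the IEEE 754 bit pattern of a double as a 64-bit integer. */
uint64_t DoubleBits(double val) {
    uint64_t bits;
    memcpy(&bits, &val, sizeof bits);
    return bits;
}

/* Exchanges the upper and lower 32-bit words of a double, mimicking what
   happens when s0 and s1 are placed in the wrong order. */
double SwapWords(double val) {
    uint64_t bits = DoubleBits(val);
    uint64_t swapped = (bits << 32) | (bits >> 32);
    double out;
    memcpy(&out, &swapped, sizeof out);
    return out;
}
```

DoubleBits(1.234567) is 0x3FF3C0C9539B8887, and SwapWords(1.234567) has the bit
pattern 0x539B88873FF3C0C9, a value of roughly 5.74e94, which matches the
corrupted output shown earlier.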

On gcc 4.7.2:

  56                    _ZN3car22PrintAndStoreUnalignedEd:
  59 002c B538                  push    {r3, r4, r5, lr}
  60 002e EE104A10              fmrs    r4, s0  @ int
  61 0032 EE105A90              fmrs    r5, s1  @ int
  62 0036 EE102A10              fmrs    r2, s0  @ int
  63 003a EE103A90              fmrs    r3, s1  @ int
  64 003e EC423B10              fmdrr   d0, r3, r2
  65 0042 F7FFFFFE              bl      _ZN3car11PrintDoubleEd
  66 0046 F2400300              movw    r3, #:lower16:.LANCHOR0
  67 004a F2C00300              movt    r3, #:upper16:.LANCHOR0
  68 004e 605D                  str     r5, [r3, #4]
  69 0050 601C                  str     r4, [r3, #0]
  70 0052 BD38                  pop     {r3, r4, r5, pc}

The latest gcc 4.8 snapshot produces correct code, although I'm not totally
convinced that it's fixed the underlying problem, as opposed to just happening
to avoid the problem by emitting slightly different instructions:

On gcc 4.8-20130210:

  44                PrintAndStoreUnaligned:
  47 0020 B538        push  {r3, r4, r5, lr}
  48 0022 EC523B10    fmrrd r3, r2, d0
  49 0026 4614        mov r4, r2
  50 0028 461D        mov r5, r3
  51 002a 4622        mov r2, r4
  52 002c EC423B10    fmdrr d0, r3, r2
  53 0030 F7FFFFFE    bl  PrintDouble
  54 0034 F2400300    movw  r3, #:lower16:unaligned_double
  55 0038 F2C00300    movt  r3, #:upper16:unaligned_double
  56 003c 605D        str r5, [r3, #4]
  57 003e 601C        str r4, [r3]
  58 0040 BD38        pop {r3, r4, r5, pc}

I'm currently building gcc 4.7-20130209 to see if the bug is already fixed in
the 4.7 branch.  I'll update this bug when my build completes.
