Am 24.02.2015 um 22:14 schrieb Andy Polyakov:
Basically I just want to say "good analysis" and confirm that yes,
everything points at compiler bug. I also don't sent this to rt, but to
openssl-dev, in order to prevent case reopen.

Looking at disassembly around CRYPTO_ccm128_decrypt+532 (decimal 532 =
0x214):

      CRYPTO_ccm128_decrypt+0x214: f4 1f 00 00  ldd       [%i4], %i2
      CRYPTO_ccm128_decrypt+0x218: b8 07 20 10  add       %i4, 0x10, %i4

I want to point out instruction that follows the ldd. If %i4 was temp,
then why would compiler have to increment it? Actually, if it was temp,
which can only be stack-based, then compiler would address it through
%fp, just like it addresses scratch in next instruction below. Also note
that %i4 value is the same as inp in your stack backtrace presented in
original message. Bottom line here is that compiler optimized away
memcpy by assuming that input is aligned, despite the fact that it's
declared char. That's compiler bug.

      CRYPTO_ccm128_decrypt+0x21c: c4 1f bf f0  ldd       [%fp - 0x10], %g2
      CRYPTO_ccm128_decrypt+0x220: d8 1f 3f f8  ldd       [%i4 - 0x8], %o4
      CRYPTO_ccm128_decrypt+0x224: b0 1e 80 02  xor       %i2, %g2, %i0
      CRYPTO_ccm128_decrypt+0x228: b2 1e c0 03  xor       %i3, %g3, %i1

If you still have 4.9.1 lying around, I wonder if it would help to
declare temp as volatile? It's really just a wonder, so don't feel
obligated to answer if you don't have time or energy.

Thanks for looking into it. I have everything in place for experiments.

Adding volatile first results in new compiler warnings:

ccm128.c: In function 'CRYPTO_ccm128_decrypt':
ccm128.c:300:9: warning: passing argument 1 of 'memcpy' discards 'volatile' qualifier from pointer target type
         memcpy(temp.c, inp, 16);
         ^
In file included from /usr/include/string.h:18:0,
                 from ccm128.c:53:
/usr/include/iso/string_iso.h:60:14: note: expected 'void *' but argument is of type 'volatile u8 *' extern void *memcpy(void *_RESTRICT_KYWD, const void *_RESTRICT_KYWD, size_t);
              ^


And then it also changes the resulting code:

-       ldd     [%i4], %i2
+       call    memcpy, 0
^^^ that's probably the memcpy() that was optimized away before
+        nop
+       ldd     [%fp-16], %i2
+       ldd     [%fp-32], %g2
+       add     %l6, %i4, %o0
+       mov     %l0, %o1
+       ldd     [%fp-8], %i0
+       mov     16, %o2
        add     %i4, 16, %i4
-       ldd     [%fp-16], %g2
-       ldd     [%i4-8], %o4
-       xor     %i2, %g2, %i0
-       xor     %i3, %g3, %i1
-       ldd     [%fp-8], %g2
-       std     %i0, [%fp-16]
-       xor     %o4, %g2, %g2
-       xor     %o5, %g3, %g3
        ldd     [%i5+16], %o4
-       std     %g2, [%fp-8]
-       xor     %o4, %i0, %i2
-       xor     %o5, %i1, %i3
+       xor     %i2, %g2, %i2
+       xor     %i3, %g3, %i3
+       ldd     [%fp-24], %g2
+       std     %i2, [%fp-32]
+       xor     %i0, %g2, %g2
+       xor     %i1, %g3, %g3
+       std     %g2, [%fp-24]
+       xor     %o4, %i2, %i0
+       xor     %o5, %i3, %i1
        ldd     [%i5+24], %o4
-       std     %i2, [%i5+16]
-       xor     %o4, %g2, %i2
-       xor     %o5, %g3, %i3
+       std     %i0, [%i5+16]
+       xor     %o4, %g2, %i0
+       xor     %o5, %g3, %i1
        call    memcpy, 0
-        std    %i2, [%i5+24]
-       mov     %l5, %o0
-       mov     %l5, %o1
+        std    %i0, [%i5+24]
+       mov     %l3, %o0
+       mov     %l3, %o1
        call    %l1, 0
         mov    %l2, %o2

And yes, with volatile the test no longer crashes even when compiled with the broken gcc 4.9.1 and -O3.

Regards,

Rainer
_______________________________________________
openssl-dev mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-dev

Reply via email to