http://gcc.gnu.org/bugzilla/show_bug.cgi?id=47000

Steven Bosscher <steven at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2010.12.18 12:39:26
     Ever Confirmed|0                           |1

--- Comment #3 from Steven Bosscher <steven at gcc dot gnu.org> 2010-12-18 12:39:26 UTC ---
Compiled like so:
$ gcc-4.4.2 -S -O2 sha256_4way.i -o sha256_4way-44.s
$ gcc-4.5.0 -S -O2 sha256_4way.i -o sha256_4way-45.s

$ grep -c call *.s
sha256_4way-44.s:0
sha256_4way-45.s:484
$ grep call *.s|head
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
sha256_4way-45.s:    call    ROTR
$ 

ROTR should have been inlined:

static inline __m128i ROTR(__m128i x, const int n) {
    return _mm_srli_epi32(x, n) | _mm_slli_epi32(x, 32 - n);
}

These out-of-line calls (484 in the gcc-4.5.0 output, none in the gcc-4.4.2
output) probably explain the slowdown.
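
For anyone hitting the same problem, one possible source-level workaround is
GCC's always_inline function attribute. This is only a sketch and not something
proposed in this report: the name ROTR_forced and the helpers rotr32,
rotr_lane0 and rotr_self_check are hypothetical, added so the rotate can be
checked against a scalar reference. Whether the attribute restores the 4.4
code generation for sha256_4way.i has not been verified here.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2: __m128i, _mm_srli_epi32, _mm_slli_epi32 */
#include <stdint.h>

/* Same body as ROTR in the report. The always_inline attribute is the
   assumption being illustrated; it is not part of the original source. */
static inline __attribute__((always_inline))
__m128i ROTR_forced(__m128i x, const int n) {
    return _mm_srli_epi32(x, n) | _mm_slli_epi32(x, 32 - n);
}

/* Hypothetical scalar reference rotate, valid for 0 < n < 32. */
static uint32_t rotr32(uint32_t v, int n) {
    return (v >> n) | (v << (32 - n));
}

/* Hypothetical helper: broadcast v to all four lanes, rotate, return lane 0. */
static uint32_t rotr_lane0(uint32_t v, int n) {
    __m128i r = ROTR_forced(_mm_set1_epi32((int)v), n);
    return (uint32_t)_mm_cvtsi128_si32(r);
}

/* Hypothetical self-check: the vector rotate must agree with the scalar one
   for every rotate count from 1 to 31. */
static void rotr_self_check(void) {
    for (int n = 1; n < 32; n++)
        assert(rotr_lane0(0x12345678u, n) == rotr32(0x12345678u, n));
}
```

Rerunning the same "grep -c call" on the assembly generated from a file using
such a variant would show whether any calls remain.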
