http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54089



--- Comment #29 from Oleg Endo <olegendo at gcc dot gnu.org> 2013-02-16 11:36:37 UTC ---

Another case, taken from CSiBE / bzip2, where reusing the intermediate shift
result would be better:



void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
   n->b[7] = (UChar)((hi32 >> 24) & 0xFF);
   n->b[6] = (UChar)((hi32 >> 16) & 0xFF);
   n->b[5] = (UChar)((hi32 >> 8) & 0xFF);
   n->b[4] = (UChar) (hi32 & 0xFF);
/*
   n->b[3] = (UChar)((lo32 >> 24) & 0xFF);
   n->b[2] = (UChar)((lo32 >> 16) & 0xFF);
   n->b[1] = (UChar)((lo32 >> 8) & 0xFF);
   n->b[0] = (UChar) (lo32 & 0xFF);
*/
}



on rev 196091 with -O2 -m4 compiles to:



        mov     r6,r0
        shlr16  r0
        shlr8   r0
        mov.b   r0,@(7,r4)
        mov     r6,r0
        shlr16  r0
        mov.b   r0,@(6,r4)
        mov     r6,r0
        shlr8   r0
        mov.b   r0,@(5,r4)
        mov     r6,r0
        mov.b   r0,@(4,r4)



which would be better as:

        mov     r6,r0
        mov.b   r0,@(4,r4)
        shlr8   r0
        mov.b   r0,@(5,r4)
        shlr8   r0
        mov.b   r0,@(6,r4)
        shlr8   r0
        mov.b   r0,@(7,r4)



This would require reordering the memory stores, which should be OK to do if
the memory is not volatile.



Reordering the stores manually:



void uInt64_from_UInt32s ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
   n->b[4] = (UChar) (hi32 & 0xFF);
   n->b[5] = (UChar)((hi32 >> 8) & 0xFF);
   n->b[6] = (UChar)((hi32 >> 16) & 0xFF);
   n->b[7] = (UChar)((hi32 >> 24) & 0xFF);
}



still results in:



        mov     r6,r0
        mov.b   r0,@(4,r4)
        mov     r6,r0
        shlr8   r0
        mov.b   r0,@(5,r4)
        mov     r6,r0
        shlr16  r0
        mov.b   r0,@(6,r4)
        mov     r6,r0
        shlr16  r0
        shlr8   r0
        mov.b   r0,@(7,r4)



... at least this case should be handled, I think.
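For reference, the shift-reuse pattern that the desired asm corresponds to can also be written out at the source level (a sketch only; the typedefs are assumed to match bzip2's, and the function name `uInt64_from_UInt32s_reuse` is hypothetical):

```c
typedef unsigned char UChar;
typedef unsigned int  UInt32;
typedef struct { UChar b[8]; } UInt64;

/* Reuse the running shift result instead of recomputing each shift
   from hi32: every step stores the low byte, then shifts right by 8,
   mirroring the mov.b/shlr8 sequence in the desired asm above.  */
void uInt64_from_UInt32s_reuse ( UInt64* n, UInt32 lo32, UInt32 hi32 )
{
   UInt32 t = hi32;
   n->b[4] = (UChar)(t & 0xFF);  t >>= 8;
   n->b[5] = (UChar)(t & 0xFF);  t >>= 8;
   n->b[6] = (UChar)(t & 0xFF);  t >>= 8;
   n->b[7] = (UChar)(t & 0xFF);
   (void)lo32;  /* lo32 half is commented out in the original, too */
}
```

Ideally the compiler would derive this form itself from the byte-extract source, rather than requiring the user to restructure the code.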
