In cases where the compiler has no alignment info, powerpc64le-linux gcc generates byte-at-a-time copies for -mstrict-align (which is on for little-endian power7). That's awful code, and a problem shared by other strict-align targets; see PR50417. However, we also have a case where -mno-strict-align generates less than ideal code, which I believe stems from using alignment as a proxy for testing an address offset. See http://gcc.gnu.org/ml/gcc-patches/1999-09n/msg01072.html.
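For illustration only, here is a minimal sketch of the kind of source that leaves the compiler without alignment info. It is a hypothetical test case, not one taken from the PR; the function names copy16 and clear16 are made up. Because the pointers are void *, gcc can assume no more than byte alignment when it expands the small fixed-size memcpy and memset inline.

```c
#include <string.h>

/* Hypothetical test case, not taken from the PR.  The pointers are
   void *, so the compiler knows nothing beyond byte alignment.  */

void
copy16 (void *dst, const void *src)
{
  /* Small constant size, so gcc is expected to expand this inline
     instead of calling memcpy.  */
  memcpy (dst, src, 16);
}

void
clear16 (void *dst)
{
  /* Small constant-size clear, the block-clear counterpart.  */
  memset (dst, 0, 16);
}
```

Compiling this with something like "gcc -O2 -S -mstrict-align" and again with "-mno-strict-align" on a powerpc64 target and comparing the assembly should show which load and store widths get chosen in each mode.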
So my first attempt at fixing this problem looked at address offsets directly. That worked fine too, but on thinking some more, I believe we no longer have the movdi restriction. Nowadays we'll reload the address if we have an offset that doesn't satisfy the "Y" constraint (ie. a multiple of 4 offset). Which led to this simpler patch.

Bootstrapped and regression tested powerpc64le-linux, powerpc64-linux and powerpc-linux. OK to apply?

	PR target/60737
	* config/rs6000/rs6000.c (expand_block_move): Allow 64-bit
	loads and stores when -mno-strict-align at any alignment.
	(expand_block_clear): Similarly.  Also correct calculation of
	instruction count.

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 209926)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -15438,7 +15422,7 @@ expand_block_clear (rtx operands[])
      load zero and three to do clearing.  */
   if (TARGET_ALTIVEC && align >= 128)
     clear_step = 16;
-  else if (TARGET_POWERPC64 && align >= 32)
+  else if (TARGET_POWERPC64 && (align >= 64 || !STRICT_ALIGNMENT))
     clear_step = 8;
   else if (TARGET_SPE && align >= 64)
     clear_step = 8;
@@ -15466,9 +15450,7 @@ expand_block_clear (rtx operands[])
           mode = V2SImode;
         }
       else if (bytes >= 8 && TARGET_POWERPC64
-               /* 64-bit loads and stores require word-aligned
-                  displacements.  */
-               && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32)))
+               && (align >= 64 || !STRICT_ALIGNMENT))
         {
           clear_bytes = 8;
           mode = DImode;
@@ -15599,9 +15581,7 @@ expand_block_move (rtx operands[])
           gen_func.movmemsi = gen_movmemsi_4reg;
         }
       else if (bytes >= 8 && TARGET_POWERPC64
-               /* 64-bit loads and stores require word-aligned
-                  displacements.  */
-               && (align >= 64 || (!STRICT_ALIGNMENT && align >= 32)))
+               && (align >= 64 || !STRICT_ALIGNMENT))
         {
           move_bytes = 8;
           mode = DImode;

-- 
Alan Modra
Australia Development Lab, IBM