http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50494
Bug #: 50494
Summary: gcc.dg/vect/vect-reduc-2char.c fails spuriously on ppc with -flto
Classification: Unclassified
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: vr...@gcc.gnu.org

The testcase contains 2 char array initializers:
...
__attribute__ ((noinline))
void main1 (signed char x, signed char max_result, signed char min_result)
{
  int i;
  signed char b[N] = {1,2,3,6,8,10,12,14,16,18,20,22,24,26,28,30};
  signed char c[N] = {1,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15};
...

The initializer data is put into .rodata, and copied to stack at the start of
the function by 2 blockmoves.

When we do expand_block_move during cc1, the blockmove source looks like this
as a tree:
...
<var_decl 0xf7e37180 *.LC0
    type <array_type 0xf7e317e0
        type <integer_type 0xf7d53180 signed char sizes-gimplified public string-flag QI
            size <integer_cst 0xf7d406c8 constant 8>
            unit size <integer_cst 0xf7d406e4 constant 1>
            align 8 symtab 0 alias set 0 canonical type 0xf7d53180 precision 8 min <integer_cst 0xf7d40674 -128> max <integer_cst 0xf7d406ac 127>
            pointer_to_this <pointer_type 0xf7e31840>>
        sizes-gimplified BLK
        size <integer_cst 0xf7d409f4 constant 128>
        unit size <integer_cst 0xf7d40a10 constant 16>
        align 8 symtab 0 alias set 0 canonical type 0xf7e317e0
        domain <integer_type 0xf7e1ff00
            type <integer_type 0xf7d53000 sizetype>
            sizes-gimplified SI
            size <integer_cst 0xf7d40508 constant 32>
            unit size <integer_cst 0xf7d40524 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0xf7e1ff00 precision 32 min <integer_cst 0xf7d40540 0> max <integer_cst 0xf7e1c524 15>>
        pointer_to_this <pointer_type 0xf7e37780>>
    readonly used static ignored in-constant-pool BLK file (null) line 0 col 0
    size <integer_cst 0xf7d409f4 128>
    unit size <integer_cst 0xf7d40a10 16>
    user align 128
    initial <constructor 0xf7db5860>
    (mem/s/c:BLK (symbol_ref/f:SI ("*.LC0") [flags 0x82] <var_decl 0xf7e37180 *.LC0>) [1 S16 A8])>
...
and like this as rtl:
...
(mem/s/c:BLK (reg/f:SI 180) [1 S16 A8])
...

This case is chosen in expand_block_move, and the blockmoves are expanded as
4-byte wordmoves.
...
  else if (bytes >= 4 && (align >= 32 || !STRICT_ALIGNMENT))
    { /* move 4 bytes */
      move_bytes = 4;
      mode = SImode;
      gen_func.mov = gen_movsi;
    }
...

The .rodata section written by cc1 has align 2^4 == 16 bytes.
...
        .section        .rodata
        .align 4
        .set    .LANCHOR0,. + 0
.LC0:
        .byte   1
        .byte   2
...

However, when we do expand_block_move during lto1, the blockmove source looks
like this as a tree:
...
<mem_ref 0xf7dc76c8
    type <array_type 0xf7dc67e0
        type <integer_type 0xf7d51180 signed char public string-flag QI
            size <integer_cst 0xf7d4071c constant 8>
            unit size <integer_cst 0xf7d40738 constant 1>
            align 8 symtab 0 alias set -1 canonical type 0xf7d51180 precision 8 min <integer_cst 0xf7d406ac -128> max <integer_cst 0xf7d40700 127>
            pointer_to_this <pointer_type 0xf7dce8a0>>
        BLK
        size <integer_cst 0xf7d40a48 constant 128>
        unit size <integer_cst 0xf7d40a64 constant 16>
        align 8 symtab 0 alias set 0 canonical type 0xf7dc67e0
        domain <integer_type 0xf7dc6840
            type <integer_type 0xf7d51000 sizetype>
            SI
            size <integer_cst 0xf7d40524 constant 32>
            unit size <integer_cst 0xf7d40540 constant 4>
            align 32 symtab 0 alias set -1 canonical type 0xf7d51360 precision 32 min <integer_cst 0xf7d4055c 0> max <integer_cst 0xf7dc7118 15>>
        pointer_to_this <pointer_type 0xf7dc6a20>>
    readonly
    arg 0 <addr_expr 0xf7dc0a38
        type <pointer_type 0xf7dc6a20 type <array_type 0xf7dc67e0>
            public unsigned SI
            size <integer_cst 0xf7d40524 32>
            unit size <integer_cst 0xf7d40540 4>
            align 32 symtab 0 alias set -1 canonical type 0xf7dc6a20>
        readonly constant
        arg 0 <var_decl 0xf7dc6780 *.LC0 type <array_type 0xf7dc67e0>
            readonly used static ignored in-constant-pool BLK file (null) line 0 col 0
            size <integer_cst 0xf7d40a48 128>
            unit size <integer_cst 0xf7d40a64 16>
            user align 128
            initial <constructor 0xf7dc51f0>
            (mem/s/c:BLK (symbol_ref/f:SI ("*.LC0") [flags 0x82] <var_decl 0xf7dd9060 *.LC0>) [1 S16 A8])>>
    arg 1 <integer_cst 0xf7dc76e4 type <pointer_type 0xf7dc6a20> constant 0>
    /scratch/vries/b6/pr43864.42.all-fsf-mainline-powerpc-linux-gnu.cfg/src/gcc-mainline/gcc/testsuite/gcc.dg/vect/vect-reduc-2char.c:11:15>
...

and like this as rtx:
...
(mem/s/u/c:BLK (reg/f:SI 180) [0 *.LC0+0 S16 A128])
...

This case is now chosen in expand_block_move, and the blockmoves are expanded
as 16-byte vectormoves.
...
  if (TARGET_ALTIVEC && bytes >= 16 && align >= 128)
    {
      move_bytes = 16;
      mode = V4SImode;
      gen_func.mov = gen_movv4si;
    }
...

The .rodata section written by lto1 however now has align 2^2 == 4 bytes.
...
        .section        .rodata
        .align 2
        .set    .LANCHOR0,. + 0
.LC0:
        .byte   1
        .byte   2
...

This alignment is not enough for the vector moves which require 16-byte
alignment. This causes the test to fail spuriously.
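
To make the mismatch concrete, here is a small standalone C sketch of the two decisions described above. It is a simplified model, not the rs6000 code: the function names pick_move_bytes, align_directive_bytes and vector_move_is_safe are hypothetical, only the two quoted branches of expand_block_move are modelled, and the 1-byte fallback is a placeholder. It also assumes STRICT_ALIGNMENT is 0 in the failing configuration, which is inferred from the 4-byte branch being taken for an A8 MEM.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the per-chunk choice in expand_block_move.
   align_bits is the alignment recorded on the MEM (the A8 / A128 field).  */
static unsigned
pick_move_bytes (unsigned bytes, unsigned align_bits,
                 bool target_altivec, bool strict_alignment)
{
  assert (bytes > 0);
  if (target_altivec && bytes >= 16 && align_bits >= 128)
    return 16;  /* V4SImode vector move, as in the lto1 case */
  if (bytes >= 4 && (align_bits >= 32 || !strict_alignment))
    return 4;   /* SImode word move, as in the cc1 case */
  return 1;     /* placeholder for the branches not quoted in the report */
}

/* The report reads ".align n" as a request for 2^n bytes.  */
static unsigned
align_directive_bytes (unsigned n)
{
  return 1u << n;
}

/* A 16-byte vector move needs the section to be at least 16-byte aligned.  */
static bool
vector_move_is_safe (unsigned align_directive)
{
  return align_directive_bytes (align_directive) >= 16;
}
```

With these definitions, pick_move_bytes (16, 8, true, false) returns 4 (the cc1 situation, A8) and pick_move_bytes (16, 128, true, false) returns 16 (the lto1 situation, A128). vector_move_is_safe (4) is true for the cc1 output, but vector_move_is_safe (2) is false for the lto1 output, which is exactly the combination lto1 produces: a vector move into a section emitted with .align 2.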