On 02/22/2015 11:45 AM, David Edelsohn wrote:
> Does this patch really fix the problem? The PR notes that the testcase fails and code quality has regressed. Has the code generation been corrected but the testcase looks for the wrong string? Presumably the message that basic block was vectorized means that the code generation is correct, but the commentary about the patch does not mention it.

There appear to be at least three problems at play here:

1) The test expects the wrong string to determine success.
2) GCC 4.9.0 and later emit suboptimal code compared to 4.8.4.
3) With (1) fixed, the test fails to detect (2).

During my initial investigation, besides trunk, I had only looked at the assembly emitted at revision 198852, since that is where the test is reported as passing in comment #2. The code appears comparable between the two. Now that I've also compared the assembly emitted by 4.8.4, I see what I suspect the original reporter was referring to: 4.9.0 and later both use vectorization to copy the arrays and assign the four elements using ordinary loads and stores. And since the code has been successfully vectorized (and GCC reports it in the dump), the test passes.

I'll need to spend some more time to find the revision that caused this.

Martin

PS For reference, the assembly emitted by 4.8.4 for powerpc-linux is as follows:

main1:
        lis 10,.LANCHOR0@ha
        la 10,.LANCHOR0@l(10)
        addi 9,10,4
        lvx 1,0,9
        neg 8,9
        lis 9,.LANCHOR1@ha
        lvsr 13,0,8
        addi 10,10,16
        la 9,.LANCHOR1@l(9)
        lvx 0,0,10
        li 3,0
        vperm 0,1,0,13
        stvx 0,0,9
        blr

...while 5.0.0 20150223 emits this:

main1:
        lis 7,.LANCHOR0@ha
        stwu 1,-32(1)
        la 7,.LANCHOR0@l(7)
        li 3,0
        addi 5,7,4
        addi 7,7,16
        rlwinm 6,5,0,0,27
        lvx 0,0,7
        lwz 9,4(6)
        addi 7,1,16
        lwz 11,12(6)
        neg 5,5
        lwz 10,8(6)
        lwz 8,0(6)
        lvsr 1,0,5
        stw 9,4(7)
        lis 9,.LANCHOR1@ha
        stw 8,0(7)
        la 9,.LANCHOR1@l(9)
        stw 10,8(7)
        stw 11,12(7)
        lvx 13,0,7
        vperm 0,13,0,1
        stvx 0,0,9
        addi 1,1,32
        blr

powerpc64-linux has a similar problem and emits:

main1:
        .quad .L.main1,.TOC.@tocbase,0
        .previous
        .type main1, @function
.L.main1:
        addis 10,2,.LANCHOR0@toc@ha
        li 3,0
        addi 10,10,.LANCHOR0@toc@l
        addi 9,10,4
        addi 8,10,16
        neg 7,9
        rldicr 9,9,0,59
        lvx 0,0,8
        addi 8,1,-16
        ld 11,8(9)
        ld 10,0(9)
        lvsr 1,0,7
        addis 9,2,.LANCHOR1@toc@ha
        addi 9,9,.LANCHOR1@toc@l
        std 10,0(8)
        std 11,8(8)
        ori 2,2,0
        lvx 13,0,8
        vperm 0,13,0,1
        stvx 0,0,9
        blr
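
To make point (3) concrete, here is a rough sketch of how the two powerpc-linux listings differ when only scalar loads and stores are counted. It is a standalone Python illustration, not part of the GCC testsuite. The helper names (count_mnemonics, scalar_mem_ops) and the choice of which mnemonics count as "scalar" are my own assumptions for the example.

```python
from collections import Counter

# Assumption for this sketch: these mnemonics are the "ordinary" (non-vector)
# loads and stores of interest. stwu is deliberately excluded because it only
# sets up the stack frame.
SCALAR_MEM_OPS = {"lwz", "stw", "ld", "std"}

# powerpc-linux listing from GCC 4.8.4, copied from the message.
ASM_484 = """
main1:
    lis 10,.LANCHOR0@ha
    la 10,.LANCHOR0@l(10)
    addi 9,10,4
    lvx 1,0,9
    neg 8,9
    lis 9,.LANCHOR1@ha
    lvsr 13,0,8
    addi 10,10,16
    la 9,.LANCHOR1@l(9)
    lvx 0,0,10
    li 3,0
    vperm 0,1,0,13
    stvx 0,0,9
    blr
"""

# powerpc-linux listing from 5.0.0 20150223, copied from the message.
ASM_500 = """
main1:
    lis 7,.LANCHOR0@ha
    stwu 1,-32(1)
    la 7,.LANCHOR0@l(7)
    li 3,0
    addi 5,7,4
    addi 7,7,16
    rlwinm 6,5,0,0,27
    lvx 0,0,7
    lwz 9,4(6)
    addi 7,1,16
    lwz 11,12(6)
    neg 5,5
    lwz 10,8(6)
    lwz 8,0(6)
    lvsr 1,0,5
    stw 9,4(7)
    lis 9,.LANCHOR1@ha
    stw 8,0(7)
    la 9,.LANCHOR1@l(9)
    stw 10,8(7)
    stw 11,12(7)
    lvx 13,0,7
    vperm 0,13,0,1
    stvx 0,0,9
    addi 1,1,32
    blr
"""


def count_mnemonics(asm):
    """Count instruction mnemonics, skipping labels and assembler directives."""
    counts = Counter()
    for line in asm.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].endswith(":") or tokens[0].startswith("."):
            continue
        counts[tokens[0]] += 1
    return counts


def scalar_mem_ops(asm):
    """Return how many scalar loads/stores (per SCALAR_MEM_OPS) a listing has."""
    counts = count_mnemonics(asm)
    return sum(counts[m] for m in SCALAR_MEM_OPS)


print("4.8.4 scalar loads/stores:", scalar_mem_ops(ASM_484))  # 0
print("5.0.0 scalar loads/stores:", scalar_mem_ops(ASM_500))  # 8
```

Both listings contain the same vector instructions (two lvx, one stvx), which is why a check that only looks for the "vectorized" message cannot tell them apart. The extra lwz/stw pairs are what distinguish the newer output.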