[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #20 from Segher Boessenkool --- (In reply to Wilco from comment #19) > Well insn_cost() uses COSTS_N_INSNS (1) for instructions with unknown (zero) > costs. That's a reasonable default and gives more accurate cost comparisons, > eg. 0 + 4 vs 0 + 8 now becomes 4 + 4 vs 4 + 8. You're talking about the default insn_cost, which uses pattern_cost to do the work; pattern_cost returns COSTS_N_INSNS (1) if there is a single SET (or a single SET and a single SET of COMPARE), and set_src_cost of that SET says it is cost 0. It does not say cost 0 for a PARALLEL of two SETs, like you have here. A target-specific insn_cost can return a saner cost for all insns, including for those that are a "real" PARALLEL. (COSTS_N_INSNS (1) isn't minimal of course, it is 4 like you already point out; it is the default insn cost). It never is clear that 0+4 is better than 0+8, because the 0's usually are not the cost of the same insn. > With those changes I think there will be far fewer cases with unknown costs. I'm not sure what changes you propose. > There will be cases where target insn_cost needs to be improved to more > accurately model complex instructions, but that will result in even better > code. The arm port does not _have_ the insn_cost hook implemented; doing that will probably help, sure.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #19 from Wilco --- (In reply to Segher Boessenkool from comment #16) > (In reply to Wilco from comment #14) > > Note there is also an issue with costs, if the cost is zero then it seems to > > behave like infinite cost. > > 0 means unknown cost. Any known cost is treated as at least as good as > unknown cost. > > > It would be better to properly cost an > > instruction sequence given there is a new interface for this now. > > Yup. > > > If the > > cost is still zero, it could get the default cost rather than being treated > > like infinite cost, resulting in non-optimal replacements. > > Treating unknown cost as minimal cost is still non-optimal. It might work > better for your case of course. But it means for example that combine will > modify parallels less often. Well insn_cost() uses COSTS_N_INSNS (1) for instructions with unknown (zero) costs. That's a reasonable default and gives more accurate cost comparisons, eg. 0 + 4 vs 0 + 8 now becomes 4 + 4 vs 4 + 8. With insn_cost() the ldm's get cost 8+8, so the comparison is 4 + 16 vs 8 + 8, and thus the substitution will happen (though it's not clear it's useful for the selected CPU). With those changes I think there will be far fewer cases with unknown costs. There will be cases where target insn_cost needs to be improved to more accurately model complex instructions, but that will result in even better code.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 Segher Boessenkool changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |segher at gcc dot gnu.org --- Comment #18 from Segher Boessenkool --- I have a patch.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #17 from Segher Boessenkool --- Please do the combine dumps with details enabled; these are pretty useless. (-fdump-rtl-combine-all) A C testcase would be very helpful, too (or some magic configure command to run on some cfarm machine).
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #16 from Segher Boessenkool --- (In reply to Wilco from comment #14) > Note there is also an issue with costs, if the cost is zero then it seems to > behave like infinite cost. 0 means unknown cost. Any known cost is treated as at least as good as unknown cost. > It would be better to properly cost an > instruction sequence given there is a new interface for this now. Yup. > If the > cost is still zero, it could get the default cost rather than being treated > like infinite cost, resulting in non-optimal replacements. Treating unknown cost as minimal cost is still non-optimal. It might work better for your case of course. But it means for example that combine will modify parallels less often.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #15 from Segher Boessenkool --- (In reply to Wilco from comment #13) > It seems the safest way > to split an instruction is to place the new instructions next to each other. combine can only place new insns at i2 and i3, in either order. There is code to figure out which way (if any) works; apparently it doesn't work here.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #14 from Wilco --- Note there is also an issue with costs, if the cost is zero then it seems to behave like infinite cost. It would be better to properly cost an instruction sequence given there is a new interface for this now. If the cost is still zero, it could get the default cost rather than being treated like infinite cost, resulting in non-optimal replacements.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 Wilco changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2018-08-20 CC||wilco at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #13 from Wilco --- (In reply to nsz from comment #12) > the wrong string seems to be caused by a missing ldm Yes what happens is that it decides to combine instruction 150 and 190 and split an ldm2: insn_cost 4 for 150: r197:SI=sfp:SI-0x24 ... insn_cost 0 for 190: {r0:SI=[r197:SI];r1:SI=[r197:SI+0x4];} allowing combination of insns 150 and 190 original costs 4 + 0 = 0 replacement costs 8 + 8 = 16 modifying insn i2 150: r0:SI=[sfp:SI-0x24] deferring rescan insn with uid = 150. modifying insn i3 190: r1:SI=[sfp:SI-0x20] deferring rescan insn with uid = 190. There are lots of instructions between 150 and 190, including another ldm2 which writes r0: 150: r0:SI=[sfp:SI-0x24] ... 163: {r0:SI=[r206:SI];r1:SI=[r206:SI+0x4];} ... 190: r1:SI=[sfp:SI-0x20] The combination ends up corrupting r0 because one of the loads is lifted across many instructions without checking liveness. It seems the safest way to split an instruction is to place the new instructions next to each other.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #12 from nsz at gcc dot gnu.org --- the wrong string seems to be caused by a missing ldm good main: ... mov r4, #0 str r4, [sp, #32] mov r2, #2 str r2, [sp, #36] add r2, sp, #40 str r2, [sp, #4] str r4, [sp, #8] ldm ip, {r0, r1}/// load r0 (content is 'hell') str r0, [sp, #40] /// store the right r0 strhr1, [sp, #44] @ movhi ldr r2, [sp, #66] @ unaligned str r2, [sp, #46] @ unaligned ldrhr2, [sp, #70] @ unaligned strhr2, [sp, #50] @ unaligned add r2, sp, #72 ldm r2, {r0, r1} str r0, [sp, #52] strhr1, [sp, #56] @ movhi mov r1, r3 add r0, sp, #4 bl option_stopwatch_a.5061 bad main: ... mov r4, #0 str r4, [sp, #32] mov r2, #2 str r2, [sp, #36] add r2, sp, #40 str r2, [sp, #4] str r4, [sp, #8] ldr r1, [sp, #64] str r0, [sp, #40] /// store a bad r0 (content is 'godd') strhr1, [sp, #44] @ movhi ldr r2, [sp, #66] @ unaligned str r2, [sp, #46] @ unaligned ldrhr2, [sp, #70] @ unaligned strhr2, [sp, #50] @ unaligned ldr r1, [sp, #76] str r0, [sp, #52] strhr1, [sp, #56] @ movhi mov r1, r3 add r0, sp, #4 bl option_stopwatch_a.5061
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #11 from nsz at gcc dot gnu.org --- Created attachment 44562 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44562&action=edit bad rtl dump of combine pass attached good and bad dump of combine, i also tried to look at the array contents, it seems to go from (good): hello hola! goddag to (bad): goddo hola! goddag in the last option_stopwatch_a call in main (even though the array 'a' seems to be 'hello' before the call).
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 nsz at gcc dot gnu.org changed: What|Removed |Added CC||nsz at gcc dot gnu.org --- Comment #10 from nsz at gcc dot gnu.org --- Created attachment 44561 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44561&action=edit good rtl dump of combine pass
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #9 from Andrey Guskov --- OK.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #8 from Segher Boessenkool --- So is it worse code, better code, is the testcase broken / suboptimal? The haswell problem seems to be completely unrelated, so open a separate PR please.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 Andrey Guskov changed: What|Removed |Added CC||andrey.y.guskov at intel dot com --- Comment #7 from Andrey Guskov --- On Haswell, r263067 is also responsible for that: spawn -ignore SIGHUP /work/gcc/xgcc -B/work/gcc/ /source/gcc/testsuite/gcc.target/i386/avx-cvt-2.c -fno-diagnostics-show-caret -fdiagnostics-color=never -O3 -mavx -mno-avx2 -mtune=generic -fdump-tree-vect-details -ffat-lto-objects -S -o avx-cvt-2.s PASS: gcc.target/i386/avx-cvt-2.c (test for excess errors) PASS: gcc.target/i386/avx-cvt-2.c scan-tree-dump-times vect "vectorized 1 loops in function" 6 PASS: gcc.target/i386/avx-cvt-2.c scan-assembler vcvttpd2dq(y[^\n\r]*%xmm|[^\n\r]*xmm[^\n\r]*YMMWORD PTR) PASS: gcc.target/i386/avx-cvt-2.c scan-assembler vcvtdq2ps[^\n\r]*ymm FAIL: gcc.target/i386/avx-cvt-2.c scan-assembler vcvtps2pd[^\n\r]*(%xmm[^\n\r]*%ymm|ymm[^\n\r]*xmm) Option set: -with-system-zlib --with-demangler-in-ld --with-fpmath=sse --enable-shared --enable-host-shared --enable-clocale=gnu --enable-cloog-backend=isl --enable-languages=c,c++,fortran,jit,lto -with-arch=haswell --with-cpu=haswell
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #6 from Segher Boessenkool --- So, what is happening at all? What is different during/after combine, etc.?
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #5 from Christophe Lyon --- I think in the "ok" version we have: add ip, sp, #60 ... ldm ip, {r0, r1} ... add r2, sp, #72 ldm r2, {r0, r1} in the "ko" version we have: ldr r1, [sp, #64] ... ldr r1, [sp, #76] So in the "ko" version we do not load r0
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #4 from Christophe Lyon --- Created attachment 44488 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44488&action=edit Good code This is with r263197 and r263067 (your patch) reverted
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #3 from Christophe Lyon --- Created attachment 44487 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=44487&action=edit Wrong code generated This is with trunk @r263197
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #2 from Christophe Lyon --- gfortran.log contains: STOP 4 STOP 4 STOP 4 before the execution fails I'll regenerate the 2 asm files.
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 --- Comment #1 from Segher Boessenkool --- Could you trace this down to some bad code generated, at least?
[Bug target/86771] [9 Regression] gfortran.dg/actual_array_constructor_1.f90 fails on arm after combine 2 insns to 2 insns patch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86771 Richard Biener changed: What|Removed |Added Keywords||wrong-code Target Milestone|--- |9.0 Summary|gfortran.dg/actual_array_co |[9 Regression] |nstructor_1.f90 fails on|gfortran.dg/actual_array_co |arm after combine 2 insns |nstructor_1.f90 fails on |to 2 insns patch|arm after combine 2 insns ||to 2 insns patch