[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #49 from Wilco ---
AArch64 does this:

(define_expand "vec_store_lanesoi"
  [(set (match_operand:OI 0 "aarch64_simd_struct_operand" "=Utv")
        (unspec:OI [(match_operand:OI 1 "register_operand" "w")
                    (unspec:VQ [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
                   UNSPEC_ST2))]
  "TARGET_SIMD"
{
  if (BYTES_BIG_ENDIAN)
    {
      rtx tmp = gen_reg_rtx (OImode);
      rtx mask = aarch64_reverse_mask (mode, );
      emit_insn (gen_aarch64_rev_reglistoi (tmp, operands[1], mask));
      emit_insn (gen_aarch64_simd_st2 (operands[0], tmp));
    }
  else
    emit_insn (gen_aarch64_simd_st2 (operands[0], operands[1]));
  DONE;
})

ARM seems to be missing the swap:

(define_expand "vec_store_lanesoi"
  [(set (match_operand:OI 0 "neon_struct_operand")
        (unspec:OI [(match_operand:OI 1 "s_register_operand")
                    (unspec:VQ2 [(const_int 0)] UNSPEC_VSTRUCTDUMMY)]
                   UNSPEC_VST2))]
  "TARGET_NEON")

So this clearly looks like a backend issue.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Jakub Jelinek changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rsandifo at gcc dot gnu.org

--- Comment #48 from Jakub Jelinek ---
Can someone familiar with ARM please take this over? Either this is a backend
bug, or a bug in the vectorizer part specific to only ARM/AArch64 (no other
target has STORE_LANES stuff). This is a P1 (for now), so it would be nice to
get it resolved soon.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #47 from Wilco ---
(In reply to Jakub Jelinek from comment #46)
> Wonder if that:
>   vect_array.11[0] = vect_vec_iv_.7_45;
>   vect_array.11[1] = vect__4.8_48;
> on armeb shouldn't have been [1] and [0] instead, otherwise we end up with:
> (insn 35 37 38 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 0)
>         (reg:V4SI 110 [ vect_vec_iv_.7 ])) "pr82518.c":8 939 {*neon_movv4si}
>      (nil))
> (insn 38 35 41 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 16)
>         (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
>             (reg:V4SI 171))) "pr82518.c":8 998 {*addv4si3_neon}
>      (nil))
> (insn 41 38 39 5 (set (reg:V4SI 110 [ vect_vec_iv_.7 ])
>         (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
>             (reg:V4SI 169))) 998 {*addv4si3_neon}
>      (nil))
> (insn 39 41 43 5 (set (mem:OI (post_inc:SI (reg:SI 152 [ ivtmp.31 ])) [2
> MEM[(int *)vectp_p.9_49]+0 S32 A32])
>         (unspec:OI [
>                 (reg:OI 155 [ vect_array.11 ])
>                 (unspec:V4SI [
>                         (const_int 0 [0])
>                     ] UNSPEC_VSTRUCTDUMMY)
>             ] UNSPEC_VST2)) "pr82518.c":8 2396 {neon_vst2v4si}
>      (expr_list:REG_INC (reg:SI 152 [ ivtmp.31 ])
>         (nil)))
> where pseudo 110 is the vect_vec_iv_.7_45 ({i, i + 1, i + 2, i + 3}) and
> insn 38 adds {1, 1, 1, 1} to that. It really depends on what exactly the
> neon_vst2v4si instruction does on armeb.
>         vmov.i32 q10, #4 @ v4si
>         vmov.i32 q9, #1 @ v4si
>         ...
>         vldr d16, .L19
>         vldr d17, .L19+8
> .L4:
>         vadd.i32 q11, q8, q9
>         vst1.64 {d16-d17}, [sp:64]
>         vadd.i32 q8, q8, q10
>         vstr d22, [sp, #16]
>         vstr d23, [sp, #24]
>         vld1.64 {d22-d25}, [sp:64]
>         vst2.32 {d22-d25}, [r3]!
> If it works like on armel, except the elements of the vectors are
> byte-swapped, then it should be [1] and [0].

The vst2 works on little endian, but in big-endian the lane numbering is
complex since all data is still treated as 64-bit quantities. The stores and
vld1.64 have no effect on data layout, so everything is still 64-bit data in
64-bit registers.

The vst2.32 can only be used in big-endian if the data is lane-swapped first.
AArch64 in big-endian does this:

.L26:
        mov     v2.16b, v0.16b
        add     v3.4s, v0.4s, v6.4s
        add     v0.4s, v0.4s, v7.4s
        tbl     v4.16b, {v2.16b}, v1.16b
        tbl     v5.16b, {v3.16b}, v1.16b
        st2     {v4.4s - v5.4s}, [x2], 32
        cmp     x2, x3
        bne     .L26
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #46 from Jakub Jelinek ---
Wonder if that:
  vect_array.11[0] = vect_vec_iv_.7_45;
  vect_array.11[1] = vect__4.8_48;
on armeb shouldn't have been [1] and [0] instead, otherwise we end up with:
(insn 35 37 38 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 0)
        (reg:V4SI 110 [ vect_vec_iv_.7 ])) "pr82518.c":8 939 {*neon_movv4si}
     (nil))
(insn 38 35 41 5 (set (subreg:V4SI (reg:OI 155 [ vect_array.11 ]) 16)
        (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
            (reg:V4SI 171))) "pr82518.c":8 998 {*addv4si3_neon}
     (nil))
(insn 41 38 39 5 (set (reg:V4SI 110 [ vect_vec_iv_.7 ])
        (plus:V4SI (reg:V4SI 110 [ vect_vec_iv_.7 ])
            (reg:V4SI 169))) 998 {*addv4si3_neon}
     (nil))
(insn 39 41 43 5 (set (mem:OI (post_inc:SI (reg:SI 152 [ ivtmp.31 ])) [2 MEM[(int *)vectp_p.9_49]+0 S32 A32])
        (unspec:OI [
                (reg:OI 155 [ vect_array.11 ])
                (unspec:V4SI [
                        (const_int 0 [0])
                    ] UNSPEC_VSTRUCTDUMMY)
            ] UNSPEC_VST2)) "pr82518.c":8 2396 {neon_vst2v4si}
     (expr_list:REG_INC (reg:SI 152 [ ivtmp.31 ])
        (nil)))
where pseudo 110 is the vect_vec_iv_.7_45 ({i, i + 1, i + 2, i + 3}) and
insn 38 adds {1, 1, 1, 1} to that. It really depends on what exactly the
neon_vst2v4si instruction does on armeb.
        vmov.i32 q10, #4 @ v4si
        vmov.i32 q9, #1 @ v4si
        ...
        vldr d16, .L19
        vldr d17, .L19+8
.L4:
        vadd.i32 q11, q8, q9
        vst1.64 {d16-d17}, [sp:64]
        vadd.i32 q8, q8, q10
        vstr d22, [sp, #16]
        vstr d23, [sp, #24]
        vld1.64 {d22-d25}, [sp:64]
        vst2.32 {d22-d25}, [r3]!
If it works like on armel, except the elements of the vectors are
byte-swapped, then it should be [1] and [0].
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #45 from Jakub Jelinek ---
Note the vectorized loop is pretty much the same on arm little-endian,
  # vect_vec_iv_.6_33 = PHI <{ 0, 1, 2, 3 }(4), vect_vec_iv_.6_34(5)>
  # ivtmp.12_14 = PHI
  vectp_p.8_37 = (int[8] *) ivtmp.12_14;
  vect_vec_iv_.6_34 = vect_vec_iv_.6_33 + { 4, 4, 4, 4 };
  vect__4.7_36 = vect_vec_iv_.6_33 + { 1, 1, 1, 1 };
  vect_array.10[0] = vect_vec_iv_.6_33;
  vect_array.10[1] = vect__4.7_36;
  MEM[(int *)vectp_p.8_37] = STORE_LANES (vect_array.10);
  ivtmp.12_23 = ivtmp.12_14 + 32;
  if (ivtmp.12_23 != _54)
    goto ; [83.33%]
  else
    goto ; [16.67%]
for which we emit:
        vmov.i32 q12, #4 @ v4si
        vmov.i32 q9, #1 @ v4si
        ...
        vldr d16, .L13
        vldr d17, .L13+8
.L4:
        vmov q10, q8 @ v4si
        vadd.i32 q11, q8, q9
        vadd.i32 q8, q8, q12
        vst2.32 {d20-d23}, [r3]!
        cmp r3, r2
        bne .L4
vst2.32 seems to be documented to do 32-bit interleaving, so if qN registers
overlap d{2*N} and d{2*N+1} registers, I guess this does the right thing.
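
For readers following along, the 32-bit interleaving described above can be modelled in plain C. This is only a scalar sketch of the little-endian behaviour as described in this comment; store_lanes2 and model_matches_scalar_loop are hypothetical names, not GCC or NEON APIs, and the model says nothing about big-endian lane numbering.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { LANES = 4 };

/* Scalar model of a 2-way 32-bit interleaving store:
   dst = a[0], b[0], a[1], b[1], ...  (hypothetical helper).  */
void store_lanes2 (int *dst, const int a[LANES], const int b[LANES])
{
  assert (dst != NULL && a != NULL && b != NULL);
  for (int lane = 0; lane < LANES; lane++)
    {
      dst[2 * lane] = a[lane];
      dst[2 * lane + 1] = b[lane];
    }
}

/* First vector iteration of the loop: the IV { 0, 1, 2, 3 } and IV + 1
   must land in memory as x0, y0, x1, y1, ...  Returns 1 on a match.  */
int model_matches_scalar_loop (void)
{
  const int iv[LANES] = { 0, 1, 2, 3 };
  const int iv_plus_1[LANES] = { 1, 2, 3, 4 };
  const int expected[2 * LANES] = { 0, 1, 1, 2, 2, 3, 3, 4 };
  int out[2 * LANES];

  store_lanes2 (out, iv, iv_plus_1);
  return memcmp (out, expected, sizeof out) == 0;
}
```

With this model, array element [0] supplies the even memory slots (the x fields) and element [1] the odd ones (the y fields), which is the layout the scalar loop needs.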
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #44 from Jakub Jelinek ---
Maybe -O3 -mcpu=cortex-a9 -mfpu=neon-fp16 -mfloat-abi=hard is needed.
With that I certainly see the #c42 loop vectorized.

On x86_64 we get in *.optimized:
  [local count: 567644349]:
  # vect_vec_iv_.4_33 = PHI <{ 0, 1, 2, 3, 4, 5, 6, 7 }(4), vect_vec_iv_.4_34(5)>
  # ivtmp.10_14 = PHI
  vect_vec_iv_.4_34 = vect_vec_iv_.4_33 + { 8, 8, 8, 8, 8, 8, 8, 8 };
  vect__4.5_36 = vect_vec_iv_.4_33 + { 1, 1, 1, 1, 1, 1, 1, 1 };
  vect_inter_high_39 = VEC_PERM_EXPR ;
  vect_inter_low_40 = VEC_PERM_EXPR ;
  _86 = (void *) ivtmp.10_14;
  MEM[base: _86, offset: 0B] = vect_inter_high_39;
  MEM[base: _86, offset: 32B] = vect_inter_low_40;
  ivtmp.10_23 = ivtmp.10_14 + 64;
  if (ivtmp.10_23 != _90)
    goto ; [83.33%]
  else
    goto ; [16.67%]
which doesn't look optimal either, in this case I'd say better would be to
have two IVs bumped by { 8, ... 8 } in each iteration, one starting with
{ 0, 1, 1, 2, 2, 3, 3, 4 } and another with { 4, 5, 5, 6, 6, 7, 7, 8 } or just
one and add { 4, ... 4 }; to it for the second store and avoid both
VEC_PERM_EXPRs in that case.

On armeb with the above options I see:
  [local count: 504572758]:
  # vect_vec_iv_.7_45 = PHI <{ 0, 1, 2, 3 }(4), vect_vec_iv_.7_46(5)>
  # ivtmp.31_128 = PHI
  vectp_p.9_49 = (int[8] *) ivtmp.31_128;
  vect_vec_iv_.7_46 = vect_vec_iv_.7_45 + { 4, 4, 4, 4 };
  vect__4.8_48 = vect_vec_iv_.7_45 + { 1, 1, 1, 1 };
  vect_array.11[0] = vect_vec_iv_.7_45;
  vect_array.11[1] = vect__4.8_48;
  MEM[(int *)vectp_p.9_49] = STORE_LANES (vect_array.11);
  ivtmp.31_129 = ivtmp.31_128 + 32;
  if (ivtmp.31_129 != _133)
    goto ; [83.33%]
  else
    goto ; [16.67%]
which looks wrong to me (because vect_vec_iv_.7_45 and vect__4.8_48 really
should be interleaved when stored into MEM[(int *)vectp_p.9_49]), but I really
don't know what exactly the STORE_LANES does.
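
The x86_64 suggestion above (one pre-interleaved IV, bumped by 8 per iteration, plus { 4, ... 4 } for the second store) can be sketched in scalar C to check that it produces the same memory contents as the original loop. This is only an illustration of the idea, not what the vectorizer emits; f_interleaved_iv and interleaved_iv_ok are hypothetical names, and the sketch assumes n is a multiple of 8 and that X has no padding.

```c
#include <assert.h>

typedef struct { int x, y; } X;

/* Scalar emulation of the "pre-interleaved IV" idea: each outer
   iteration writes 16 ints (8 structs) as two 8-lane stores.  */
void f_interleaved_iv (X *p, int n)
{
  int iv[8] = { 0, 1, 1, 2, 2, 3, 3, 4 };
  int *out = (int *) p;          /* assumes X is two adjacent ints */

  assert (n % 8 == 0);           /* no epilogue loop in this sketch */
  for (int i = 0; i < n; i += 8)
    for (int k = 0; k < 8; k++)
      {
        out[2 * i + k] = iv[k];          /* first 32-byte store */
        out[2 * i + 8 + k] = iv[k] + 4;  /* second store: IV + { 4, ... 4 } */
        iv[k] += 8;                      /* bump the IV by { 8, ... 8 } */
      }
}

/* Compare against what the scalar loop would have written.
   Returns 1 if every element matches.  */
int interleaved_iv_ok (void)
{
  X buf[16];

  f_interleaved_iv (buf, 16);
  for (int i = 0; i < 16; i++)
    if (buf[i].x != i || buf[i].y != i + 1)
      return 0;
  return 1;
}
```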
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #43 from Richard Biener ---
(In reply to Wilco from comment #42)
> Cut down example:
>
> typedef struct { int x, y; } X;
>
> void f (X *p, int n)
> {
>   for (int i = 0; i < n; i++)
>   { p[i].x = i;
>     p[i].y = i + 1;
>   }
> }

Can't reproduce your assembler with -O3 -mcpu=cortex-a9 -mfpu=neon-fp16
[-fno-vect-cost-model] [-mthumb]

Without -fno-vect-cost-model we don't vectorize anything. With it we only SLP
vectorize, and that using V1SI vector types (huh). For the loop case:

t.c:6:3: note: Build SLP for _3->x = i_15;
t.c:6:3: note: Build SLP for _3->y = _4;
t.c:6:3: note: vect_is_simple_use: operand i_15
t.c:6:3: note: def_stmt: i_15 = PHI <0(5), _4(6)>
t.c:6:3: note: type of def: induction
t.c:6:3: note: vect_is_simple_use: operand _4
t.c:6:3: note: def_stmt: _4 = i_15 + 1;
t.c:6:3: note: type of def: internal
t.c:6:3: note: Build SLP failed: different types

ok, known missed handling of SLP induction.

t.c:6:3: note: ==> examining statement: _3->x = i_15;
t.c:6:3: note: vect_is_simple_use: operand i_15
t.c:6:3: note: def_stmt: i_15 = PHI <0(5), _4(6)>
t.c:6:3: note: type of def: induction
t.c:6:3: note: no array mode for DI[2]
permutaion op not supported by target.

so we don't support interleaving either. Not sure why it talks about DI[2]
instead of SI[2].
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #42 from Wilco ---
Cut down example:

typedef struct { int x, y; } X;

void f (X *p, int n)
{
  for (int i = 0; i < n; i++)
  { p[i].x = i;
    p[i].y = i + 1;
  }
}
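
To turn the cut-down example into an executable check, something along these lines could be used. It is a minimal sketch, not part of the PR: first_mismatch is a hypothetical helper, and for a real reproduction f would probably need to live in a separate translation unit (or be kept from inlining) so that the vectorized loop is the code actually run.

```c
#include <assert.h>

typedef struct { int x, y; } X;

/* The reduced loop from the comment above, unchanged in behaviour.  */
void f (X *p, int n)
{
  for (int i = 0; i < n; i++)
    {
      p[i].x = i;
      p[i].y = i + 1;
    }
}

/* Hypothetical checker: runs f on a local buffer and returns the index
   of the first wrong element, or -1 if all n elements are correct.  */
int first_mismatch (int n)
{
  enum { MAX = 64 };
  X buf[MAX];

  assert (n >= 0 && n <= MAX);
  f (buf, n);
  for (int i = 0; i < n; i++)
    if (buf[i].x != i || buf[i].y != i + 1)
      return i;
  return -1;
}
```

A caller would then abort when first_mismatch returns anything other than -1, mirroring how the Fortran test signals failure.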
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Wilco changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco.dijkstra at arm dot com

--- Comment #41 from Wilco ---
I'm guessing it's this (64-bit loads reading 32-bit data):

        vldr d16, .L39
        vldr d17, .L39+8
.L5:
        vsub.i32 q10, q11, q8
        add r3, r3, #1
        vsub.i32 q9, q8, q11
.L39:
        .word 1
        .word 2
        .word 3
        .word 4
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Aldy Hernandez changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2018-02-09
     Ever confirmed|0                           |1

--- Comment #40 from Aldy Hernandez ---
*sigh*

Confirmed with:

$ sudo yum install qemu
$ git clone https://git.linaro.org/toolchain/abe.git
$ (cd abe; git checkout stable)
$ mkdir build-cortex-a9
$ cd build-cortex-a9
$ ../abe/configure
$ ../abe/abe.sh --target armeb-linux-gnueabihf --set languages=fortran --build gcc,glibc,binutils gcc=gcc.git~master --set cpu=cortex-a9

I avoided --build all because gdb fails to build with --set cpu=cortex-a9.

Then I ran tests like this:

../abe/abe.sh --target armeb-linux-gnueabihf --set runtestflags=execute.exp=in-pack.f90 --build gcc --check gcc gcc=gcc.git~master --disable update

which actually passed, because there doesn't seem to be a way to pass -mfpu=
to either the abe.sh configury, or the "runtestflags=" line.

So... then you do:

$ find . -name gfortran.log
./builds/x86_64-unknown-linux-gnu/armeb-linux-gnueabihf/gcc.git~master-stage2/gcc/testsuite/gfortran/gfortran.log

and fish out the gcc command line to run manually on the command line with
-mcpu=cortex-a9 -mfpu=neon-fp16 -O3 -o i-wanna-kill-myself.exe

And finally:

$ qemu-armeb -cpu any -R 0 -L /home/cygnus/aldyh/bld/arm-hell/build-cortex-a9/sysroots/armeb-linux-gnueabihf i-wanna-kill-myself.exe

Program aborted. Backtrace:
qemu: uncaught target signal 6 (Aborted) - core dumped

I've verified that the test succeeds with -O2 or without -mfpu=neon-fp16, so
it isn't just dying for all flags :).

Sigh (have I said that before?). I will now see if I can reduce the test and
see where we are dying.

p.s. gcc.git~master directory seems to have a pretty up to date upstream GCC
(as of today). I'll double check dropping in an actual FSF gcc doesn't change
the results.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #39 from Christophe Lyon --- Maybe we can demote this from P1? I'm sure armeb is getting a lot of attention, given other bug reports.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #38 from Christophe Lyon ---
(In reply to Aldy Hernandez from comment #37)
> (In reply to Christophe Lyon from comment #31)
> > Created attachment 43352 [details]
> > Reduced testcase
> >
> > I commented out most calls, since abort() is called from csub4.
>
> Can you also remove the csub8, isub4, and isub8 unused functions as well?
>
> I see you've commented out this in csub4:
>
> !! if (any(bb /= b)) call abort
>
> I assume this is irrelevant to the failure?
>
> Can you also verify that after these changes you have a revision of GCC for
> which this reduced testcase succeeds (regardless of the vect cost model
> rabbit hole), and a revision of GCC for which this fails?
>
> I'm trying to make sure all this removing of stuff didn't cause an
> unconditional abort.

I don't speak fortran, but I thought the program did:
main-> call csub4-> call abort if some condition

In my testing, removing all calls but csub4 from main is sufficient to make
the program fail, and then it seems the first call to abort in csub4 is taken
too. What would it change to remove csub8/isub4/isub8? (except not generating
dead code, which is irrelevant to the current problem)

>
> Also, is this only reproducible with -g?

I don't know, it's added by the torture harness. I wouldn't expect this to
change code generation, though.

>
> BTW, no need to include the assembly. I should be able to generate it with
> a cross ./cc1.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #37 from Aldy Hernandez ---
(In reply to Christophe Lyon from comment #31)
> Created attachment 43352 [details]
> Reduced testcase
>
> I commented out most calls, since abort() is called from csub4.

Can you also remove the csub8, isub4, and isub8 unused functions as well?

I see you've commented out this in csub4:

!! if (any(bb /= b)) call abort

I assume this is irrelevant to the failure?

Can you also verify that after these changes you have a revision of GCC for
which this reduced testcase succeeds (regardless of the vect cost model rabbit
hole), and a revision of GCC for which this fails?

I'm trying to make sure all this removing of stuff didn't cause an
unconditional abort.

Also, is this only reproducible with -g?

BTW, no need to include the assembly. I should be able to generate it with a
cross ./cc1.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #36 from Christophe Lyon --- The attachments were generated with trunk r257076
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #35 from Christophe Lyon --- Created attachment 43356 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43356&action=edit execution traces for armeb
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #34 from Christophe Lyon --- Created attachment 43355 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43355&action=edit execution traces for arm I have removed the logs/traces for -O1/-O2/-Os/etc... and kept only -O3 -g
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #33 from Christophe Lyon --- Created attachment 43354 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43354&action=edit assembly for armeb (big-endian)
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #32 from Christophe Lyon --- Created attachment 43353 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43353&action=edit assembly for arm (little-endian)
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #31 from Christophe Lyon --- Created attachment 43352 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=43352&action=edit Reduced testcase I commented out most calls, since abort() is called from csub4.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #30 from Aldy Hernandez ---
(In reply to Christophe Lyon from comment #29)
> I still haven't found a commit where the test passes with
> -fno-vect-cost-model (before -O3).
>
> I went back to r193053 (Nov 1, 2012), where I was able to build GCC but the
> test fails.
> With a revision 1 month earlier, the GCC fails to build.
> I tried with earlier revision, but soon reached a point where
> "-mfloat-abi=hard and VFP" is not implemented.

Ok, bisecting isn't getting us anywhere. I apologize for making you go through
all this in vain.

Could you further reduce the testcase as I suggested in comment 12, and
perhaps we can start looking at the assembly to see where things are going
wrong?
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #29 from Christophe Lyon --- I still haven't found a commit where the test passes with -fno-vect-cost-model (before -O3). I went back to r193053 (Nov 1, 2012), where I was able to build GCC but the test fails. With a revision 1 month earlier, the GCC fails to build. I tried with earlier revision, but soon reached a point where "-mfloat-abi=hard and VFP" is not implemented.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #28 from Christophe Lyon ---
It's possible that my bisect script got confused by the fact that GCC started
ICEing at -O2 on this test at r197671. Investigating.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #27 from Aldy Hernandez ---
(In reply to Christophe Lyon from comment #26)
> I've manually built or tried to build several revisions:
> * 197671: build OK, test fails to run at -fno-vect-cost-model -O3 -g
> * 197669: same (!)
> * 197815: GCC fails to build
> * 197816: same
> * 197900: same
>
> So although I still see the test failing, I don't understand why bisect
> said the guilty commit would be between 197671 and 197815, given that the
> test fails at 197669

What's the first revision before 197669 that shows the test succeeding?
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #26 from Christophe Lyon ---
I've manually built or tried to build several revisions:
* 197671: build OK, test fails to run at -fno-vect-cost-model -O3 -g
* 197669: same (!)
* 197815: GCC fails to build
* 197816: same
* 197900: same

So although I still see the test failing, I don't understand why bisect said
the guilty commit would be between 197671 and 197815, given that the test
fails at 197669
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #25 from Aldy Hernandez ---
> Aldy - the easiest thing for now would be to unilaterally relax the
> alignment test in Handle_Store_Double and see if that allows you to get
> further with your tests.

We're debugging past each other :). I was already doing that, but it only
mildly helps:

pc: 8358, ldm r3, {r0, r1, r2, r3}
pc: 835c, stm ip, {r0, r1, r2, r3}
pc: 8360, sub r3, fp, #40 ; 0x28
pc: 8364, mov r0, r3
pc: 8368, blx 0x0698
pc changed to 8a00
pc: 8a00, Thumb instr: f890|f000 emulate as: e5d0f000 ldrb pc, [r0] ;
pc changed to a

I'm done mucking around with the simulator. I'll file a GDB/sim PR for the
alignment issue though. Thanks.

>
> (But yes, I agree, a reduced testcase would be a much better help than all
> this mucking about in the simulator).

Yes please. Christophe?
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #24 from Aldy Hernandez ---
(In reply to Richard Earnshaw from comment #22)
> (In reply to Nick Clifton from comment #21)
> > Hi Aldy,
> >
> > >>> instruction. :-( Looking at the code in Handle_Store_Double() in
> > >>> sim/arm/armemu.c, I think that the reason is probably because the
> > >>> address for the store is not double word aligned. Which leads me to
> > >>> wonder, what value is stored in r5 when the STRD instruction is being
> > >>> executed ?
> >
> > >> => 0x8c24 : strd r2, [r5, #12]
> > >> (gdb) info reg r5
> > >> r5 0x1b6e8 112360
> >
> > >> ...which is 64 bit aligned.
> >
> > But, as you have just discovered, (r5 + 12) is not 64-bit aligned...
>
> But from ARMv7 onwards it only has to be 4-byte aligned, which it is. And
> this code was built for cortex-a9, which is ARMv7-a.

In that case, unless I'm missing something, the simulator looks wrong.

The unalignment occurs in initialise_monitor_files() here:

  openfiles[0].handle = monitor_stdin;

(gdb) p &openfiles[0].handle
$14 = (int *) 0x1b6f4
(gdb) p/x (unsigned int)$14 % 4
$15 = 0x0
(gdb) p/x (unsigned int)$14 % 8
$16 = 0x4

So openfiles[0].handle is aligned to 4 bytes, but not to 8. For that matter,
&openfiles is 4 byte aligned only. And Richard says that is ok.

So, why is Handle_Store_Double() unilaterally barfing on non 64-bit alignment?

  /* The address must be aligned on a 8 byte boundary. */
  if (addr & 0x7)
    {
#ifdef ABORTS
      ARMul_DATAABORT (addr);
#else
      ARMul_UndefInstr (state, instr);
#endif
      return;
    }
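
The address arithmetic behind this can be checked in a few lines of C. The sketch below only restates the numbers from the gdb session (r5 = 0x1b6e8, offset 12); aligned_to and strd_address are hypothetical helper names and are not taken from the simulator sources.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if addr is a multiple of align; align must be a power
   of two.  Hypothetical helper for illustration only.  */
int aligned_to (uint32_t addr, uint32_t align)
{
  assert (align != 0 && (align & (align - 1)) == 0);
  return (addr & (align - 1)) == 0;
}

/* Effective address of "strd r2, [r5, #12]" with r5 = 0x1b6e8,
   the value reported by gdb in this thread.  */
uint32_t strd_address (void)
{
  const uint32_t r5 = 0x1b6e8u;
  return r5 + 12u;
}
```

Here r5 itself is 8-byte aligned, while r5 + 12 = 0x1b6f4 is 4-byte aligned but not 8-byte aligned, so the (addr & 0x7) test in Handle_Store_Double() rejects an address that, per Richard's comment, is acceptable from ARMv7 onwards.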
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #23 from Nick Clifton ---
Hi Guys,

>> But, as you have just discovered, (r5 + 12) is not 64-bit aligned...
>
> But from ARMv7 onwards it only has to be 4-byte aligned, which it is. And
> this code was built for cortex-a9, which is ARMv7-a.

Ok, so this is a simulator bug.

Aldy - the easiest thing for now would be to unilaterally relax the alignment
test in Handle_Store_Double and see if that allows you to get further with
your tests.

(But yes, I agree, a reduced testcase would be a much better help than all
this mucking about in the simulator).

Cheers
Nick
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #22 from Richard Earnshaw ---
(In reply to Nick Clifton from comment #21)
> Hi Aldy,
>
> >>> instruction. :-( Looking at the code in Handle_Store_Double() in
> >>> sim/arm/armemu.c, I think that the reason is probably because the address
> >>> for the store is not double word aligned. Which leads me to wonder,
> >>> what value is stored in r5 when the STRD instruction is being executed ?
>
> >> => 0x8c24 : strd r2, [r5, #12]
> >> (gdb) info reg r5
> >> r5 0x1b6e8 112360
>
> >> ...which is 64 bit aligned.
>
> But, as you have just discovered, (r5 + 12) is not 64-bit aligned...

But from ARMv7 onwards it only has to be 4-byte aligned, which it is. And this
code was built for cortex-a9, which is ARMv7-a.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #21 from Nick Clifton ---
Hi Aldy,

>>> instruction. :-( Looking at the code in Handle_Store_Double() in
>>> sim/arm/armemu.c, I think that the reason is probably because the address
>>> for the store is not double word aligned. Which leads me to wonder,
>>> what value is stored in r5 when the STRD instruction is being executed ?

>> => 0x8c24 : strd r2, [r5, #12]
>> (gdb) info reg r5
>> r5 0x1b6e8 112360

>> ...which is 64 bit aligned.

But, as you have just discovered, (r5 + 12) is not 64-bit aligned...
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #20 from Nick Clifton ---
Hi Aldy,

>>> for the store is not double word aligned. Which leads me to wonder,
>>> what value is stored in r5 when the STRD instruction is being executed ?
>>
>> 1: x/i $pc
>> => 0x8c24 : strd r2, [r5, #12]
>> (gdb) info reg r5
>> r5 0x1b6e8 112360

>> ...which is 64 bit aligned.

Hmm, curious. OK - my next recommendation would be to add some printf()s to
the simulator to find out
  a) if Handle_Store_Double() really is being called, or if the abort is
     happening somewhere else. Plus, if it is being called, then
  b) where inside that function the abort is happening.
Maybe the store operations are triggering a memory access failure.

Cheers
Nick
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #19 from Aldy Hernandez ---
(In reply to Richard Earnshaw from comment #16)
> (In reply to Nick Clifton from comment #13)
> > Hi Aldy,
> >
> > > pc: 8ca4, instr: e1c520fc
> > > pc: 4, instr: ea00089b
> > >
> > > I took a peek at the executable being run with "/my-arm-build/objdump -D
> > > the-executable.exe", and I see we are failing in
> > > initialise_monitor_handles().
> >
> > This suggests we're failing during the start-up code:
> >
> > > 8ca4: e1c520fc strd r2, [r5, #12]
> > >
> > > It seems that last store is corrupting things and making us jump to a PC
> > > of 4???
> >
> > Address 4 is the "undefined instruction" vector. If the simulator thinks
> > that the instruction is illegal/unknown then it will branch to address 4
> > and start executing from there. (Or else it loads the value stored at
> > address 4 and starts executing from that address. I forget which).
> >
> > So, this basically means that the simulator does not like that STRD
> > instruction. :-( Looking at the code in Handle_Store_Double() in
> > sim/arm/armemu.c, I think that the reason is probably because the address
> > for the store is not double word aligned. Which leads me to wonder,
> > what value is stored in r5 when the STRD instruction is being executed ?
>
> You wouldn't take the undef vector for an alignment issue: that would take a
> data abort.
>
> Sounds like your simulator is built for an older architecture, that doesn't
> have strd (ie it's pre-armv5te).

Actually, Nick is correct. The simulator stops around here:

  /* The address must be aligned on a 8 byte boundary. */
  if (addr & 0x7)
    {
#ifdef ABORTS
      ARMul_DATAABORT (addr);
#else
      ARMul_UndefInstr (state, instr);
#endif

(gdb) p/x addr
$26 = 0x1b6f4
(gdb) p/x addr & 7
$27 = 0x4

and ARMul_UndefInstr() is:

void
ARMul_UndefInstr (ARMul_State * state, ARMword instr ATTRIBUTE_UNUSED)
{
  ARMul_Abort (state, ARMul_UndefinedInstrV);
}

#define ARMUndefinedInstrV 4L
...
#define ARMul_UndefinedInstrV ARMUndefinedInstrV

The 4 is not exactly intuitive...

I still think someone with access to a working environment should reduce or
debug this further :(.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #18 from Aldy Hernandez ---
(In reply to Christophe Lyon from comment #17)
> (In reply to Aldy Hernandez from comment #12)
>
> > along with the isub8 subroutine, and continue chopping things similarly
> > upward until you get to the abort that fails. Then see if you can chop
> > non-dependent things from the top down until you get to the smallest block
> > that has no problem before 197671 and a problem after 197815 (with -O3 -g
> > -fno-vect-cost-model as suggested before).
> >
> Hmmm does -O3 override -fno-vect-cost-model?
>
> When I run the testsuite with qemu/-fno-vect-cost-model (as target
> board), my logs show:
> [...]
> /home/christophe.lyon/src/GCC/sources/gcc-fsf/r197671/gcc/testsuite/gfortran.fortran-torture/execute/in-pack.f90 -fno-vect-cost-model
> -fno-diagnostics-show-caret -w -O3 -g []
>
> so I may not have been testing what I thought :(

-fno-vect-cost-model should take effect regardless of the -O3 specified. That
is, the order shouldn't matter.

BTW, I only mentioned -O3 -fno-vect-cost-model -g because that is what was
suggested by Jakub in comment #3.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #17 from Christophe Lyon ---
(In reply to Aldy Hernandez from comment #12)

> along with the isub8 subroutine, and continue chopping things similarly
> upward until you get to the abort that fails. Then see if you can chop
> non-dependent things from the top down until you get to the smallest block
> that has no problem before 197671 and a problem after 197815 (with -O3 -g
> -fno-vect-cost-model as suggested before).
>
Hmmm does -O3 override -fno-vect-cost-model?

When I run the testsuite with qemu/-fno-vect-cost-model (as target board), my
logs show:
[...]
/home/christophe.lyon/src/GCC/sources/gcc-fsf/r197671/gcc/testsuite/gfortran.fortran-torture/execute/in-pack.f90 -fno-vect-cost-model -fno-diagnostics-show-caret -w -O3 -g
[]

so I may not have been testing what I thought :(
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #16 from Richard Earnshaw --- (In reply to Nick Clifton from comment #13) > Hi Aldy, > > > > pc: 8ca4, instr: e1c520fc > > pc: 4, instr: ea00089b > > > > I took a peek at the executable being run with "/my-arm-build/objdudump -D > > the-executable.exe", and I see we are failing in > > initialise_monitor_handles(). > > This suggests we're failing during the start-up code: > > > 8ca4: e1c520fcstrdr2, [r5, #12] > > > > It seems that last store is corrupting things and making us jump to a PC of > > 4??? > > Address 4 is the "undefined instruction" vector. If the simulator thinks > that the instruction is illegal/unknown then it will branch to address 4 > and start executing from there. (Or else it loads the value stored at > address 4 and starts executing from that address. I forget which). > > So, this basically means that the simulator does not like that STRD > instruction. :-( Looking at the code in Handle_Store_Double() in > sim/arm/armemu.c, I think that the reason is probably because the address > for the store is not double word aligned. Which leads me to wonder, > what value is stored in r5 when the STRD instruction is being executed ? You wouldn't take the undef vector for an alignment issue: that would take a data abort. Sounds like your simulator is built for an older architecture, that doesn't have strd (ie it's pre-armv5te).
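The distinction drawn above (undefined-instruction vector versus data abort) can be checked against the trace from comment #10. The following is a minimal sketch, not part of the thread: it assumes the classic AArch32 low exception vector layout and ARM-state encodings, and the names `ARM_VECTORS` and `decode_branch` are made up for illustration. It ignores the condition field.

```python
# Classic ARM (AArch32) low exception vectors; each slot holds one instruction.
ARM_VECTORS = {
    0x00: "reset",
    0x04: "undefined instruction",
    0x08: "supervisor call (SWI/SVC)",
    0x0C: "prefetch abort",
    0x10: "data abort",
    0x14: "reserved",
    0x18: "IRQ",
    0x1C: "FIQ",
}


def decode_branch(pc, instr):
    """Return the target of an ARM-state B/BL located at pc, or None."""
    if ((instr >> 25) & 0b111) != 0b101:
        return None  # not a branch encoding
    offset = instr & 0x00FFFFFF
    if offset & 0x00800000:  # sign-extend the 24-bit word offset
        offset -= 1 << 24
    # The PC reads as the instruction address plus 8 in ARM state.
    return (pc + 8 + (offset << 2)) & 0xFFFFFFFF


# Last line of the trace: "pc: 4, instr: ea00089b".
print(ARM_VECTORS[0x04])
print(hex(decode_branch(0x4, 0xEA00089B)))
```

Under these assumptions the word at address 4 decodes as an unconditional branch to 0x2278, which is consistent with the simulator having entered the undefined-instruction vector rather than the data-abort vector at 0x10.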
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #15 from Aldy Hernandez --- (In reply to Aldy Hernandez from comment #14) > (In reply to Nick Clifton from comment #13) > > Hi Aldy, > > > > > > > pc: 8ca4, instr: e1c520fc > > > pc: 4, instr: ea00089b > > > > > > I took a peek at the executable being run with "/my-arm-build/objdudump -D > > > the-executable.exe", and I see we are failing in > > > initialise_monitor_handles(). > > > This suggests we're failing during the start-up code: > > > > > 8ca4: e1c520fcstrdr2, [r5, #12] > > > > > > It seems that last store is corrupting things and making us jump to a PC > > > of > > > 4??? > > > > Address 4 is the "undefined instruction" vector. If the simulator thinks > > that the instruction is illegal/unknown then it will branch to address 4 > > and start executing from there. (Or else it loads the value stored at > > address 4 and starts executing from that address. I forget which). > > > > So, this basically means that the simulator does not like that STRD > > instruction. :-( Looking at the code in Handle_Store_Double() in > > sim/arm/armemu.c, I think that the reason is probably because the address > > for the store is not double word aligned. Which leads me to wonder, > > what value is stored in r5 when the STRD instruction is being executed ? > > 1: x/i $pc > => 0x8c24 : strdr2, [r5, #12] > (gdb) info reg r5 > r5 0x1b6e8 112360 > (gdb) x/4x 0x1b6e8 > 0x1b6e8 :0x 0x0001 0x0001 > 0x > > ...which is 64 bit aligned. And if you're curious what the 12 offset points to: (gdb) x/4x $r5 + 12 0x1b6f4 :0x 0x 0x0001 0x00 00 (gdb) x/4x $r5 + 0x12 0x1b6fa : 0x 0x0001 0x 0x00 00
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #14 from Aldy Hernandez --- (In reply to Nick Clifton from comment #13) > Hi Aldy, > > > > pc: 8ca4, instr: e1c520fc > > pc: 4, instr: ea00089b > > > > I took a peek at the executable being run with "/my-arm-build/objdudump -D > > the-executable.exe", and I see we are failing in > > initialise_monitor_handles(). > > This suggests we're failing during the start-up code: > > > 8ca4: e1c520fcstrdr2, [r5, #12] > > > > It seems that last store is corrupting things and making us jump to a PC of > > 4??? > > Address 4 is the "undefined instruction" vector. If the simulator thinks > that the instruction is illegal/unknown then it will branch to address 4 > and start executing from there. (Or else it loads the value stored at > address 4 and starts executing from that address. I forget which). > > So, this basically means that the simulator does not like that STRD > instruction. :-( Looking at the code in Handle_Store_Double() in > sim/arm/armemu.c, I think that the reason is probably because the address > for the store is not double word aligned. Which leads me to wonder, > what value is stored in r5 when the STRD instruction is being executed ? 1: x/i $pc => 0x8c24 : strdr2, [r5, #12] (gdb) info reg r5 r5 0x1b6e8 112360 (gdb) x/4x 0x1b6e8 0x1b6e8 :0x 0x0001 0x0001 0x ...which is 64 bit aligned. The above maps to the source: newlib/libc/sys/arm/syscalls.c for (i = 0; i < MAX_OPEN_FILES; i ++) openfiles[i].handle = -1; openfiles[0].handle = monitor_stdin; > > Should I run the dejagnu tests with -mcpu= or whatever, or is the > > --with-cpu=cortex-a9 configury flag enough? > > Be paranoid - add the option. :-) No difference :(.
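The alignment question can be worked through with the numbers quoted above. This is a small sketch, not something posted in the thread: it decodes the traced instruction word `e1c520fc` as an ARM-state STRD with an immediate offset and reports the alignment of both the base register value and the effective address. The helper names are hypothetical, the decoder ignores the P/W addressing-mode bits, and nothing here asserts what alignment the GDB simulator actually requires.

```python
def decode_strd_imm(instr):
    """Decode an ARM-state STRD (immediate offset); return (rt, rn, offset) or None."""
    is_extra_store = ((instr >> 25) & 0b111) == 0b000 and ((instr >> 4) & 0xF) == 0xF
    imm_form = ((instr >> 22) & 1) == 1
    l_bit_clear = ((instr >> 20) & 1) == 0  # bits 7:4 == 1111 with L == 0 is STRD
    if not (is_extra_store and imm_form and l_bit_clear):
        return None
    rn = (instr >> 16) & 0xF
    rt = (instr >> 12) & 0xF
    imm = (((instr >> 8) & 0xF) << 4) | (instr & 0xF)
    if not (instr >> 23) & 1:  # U bit clear means the offset is subtracted
        imm = -imm
    return rt, rn, imm


def alignment(addr):
    """Largest power-of-two alignment, up to 8 bytes, that addr satisfies."""
    for n in (8, 4, 2):
        if addr % n == 0:
            return n
    return 1


r5 = 0x1B6E8  # value reported by "info reg r5"
rt, rn, offset = decode_strd_imm(0xE1C520FC)
print(rt, rn, offset)
print(alignment(r5), alignment(r5 + offset))
```

With these inputs the instruction decodes as a store of r2 (and r3) to r5 plus a decimal offset of 12. The base 0x1b6e8 is 8-byte aligned, as stated above, while the effective address 0x1b6f4 is only 4-byte aligned.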
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #13 from Nick Clifton ---
Hi Aldy,

> pc: 8ca4, instr: e1c520fc
> pc: 4, instr: ea00089b
>
> I took a peek at the executable being run with "/my-arm-build/objdump -D
> the-executable.exe", and I see we are failing in
> initialise_monitor_handles().
> This suggests we're failing during the start-up code:

> 8ca4: e1c520fc strd r2, [r5, #12]
>
> It seems that last store is corrupting things and making us jump to a PC of
> 4???

Address 4 is the "undefined instruction" vector. If the simulator thinks that the instruction is illegal/unknown then it will branch to address 4 and start executing from there. (Or else it loads the value stored at address 4 and starts executing from that address. I forget which.)

So, this basically means that the simulator does not like that STRD instruction. :-( Looking at the code in Handle_Store_Double() in sim/arm/armemu.c, I think that the reason is probably that the address for the store is not double-word aligned. Which leads me to wonder: what value is stored in r5 when the STRD instruction is being executed?

> Am I running the simulator correctly?

Yes.

> Does it require a special flag for
> cortex-a9?

No.

> Is the cortex-a9 CPU even handled by the simulator?

Yes.

> Should I run the dejagnu tests with -mcpu= or whatever, or is the
> --with-cpu=cortex-a9 configury flag enough?

Be paranoid - add the option. :-)

> Does the arm newlib/libgloss/whatever code have instructions that aren't
> handled by the GDB simulator?

No. Well, not in the assembler parts of it. The possible exceptions to this are the memory manipulation functions (memcpy, strlen, etc.) in newlib/libc/sys/arm/, which tend to be very tightly coded and will often be updated to take advantage of new instructions as they are added to the ISA.

Of course the C parts of these libraries might use unsupported instructions if gcc generates them. But if you have configured gcc correctly (and I think that you have) then this should not be an issue.

Cheers
Nick
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #12 from Aldy Hernandez ---
(In reply to Christophe Lyon from comment #11)
> My setup uses armeb-none-linux-gnueabihf (as opposed to armeb-eabi as you
> report). I have never tried armeb-eabi.
>
> I am also using qemu as simulator (in user mode, not in system mode).
>
> The failure in initialise_monitor_handles indicates a problem in the startup
> code, while initializing the semi-hosting interface.
>
> Can you try qemu?

Ughhh, off-list I've found that you build glibc/kernel and all that jazz. This is going to take forever if we (ahem I) rebuild the world in order to reproduce.

In comment #6 you mention that the regression started between r197671 and r197815. Could you reduce the fortran testcase to the absolute minimum, and then we can take a look at the assembly before r197671 and after r197815?

Perhaps start chopping off:

  i8 = (/(i,i=1,5)/)
  call isub8(i8(5:1:-1),5)
  ii8 = (/(5-i+1,i=1,5)/)
  if (any(ii8 /= i8)) call abort

along with the isub8 subroutine, and continue chopping things similarly upward until you get to the abort that fails. Then see if you can chop non-dependent things from the top down until you get to the smallest block that has no problem before 197671 and a problem after 197815 (with -O3 -g -fno-vect-cost-model as suggested before).

Once you have a reduced testcase, perhaps we could discern something from the assembly or gimple dumps. Again, please try to get the testcase as small as possible. That will exponentially improve the chances of things getting fixed :).
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #11 from Christophe Lyon --- My setup uses armeb-none-linux-gnueabihf (as opposed to armeb-eabi as you report). I have never tried armeb-eabi. I am also using qemu as simulator (in user mode, not in system mode). The failure in initialise_monitor_handles indicates a problem in the startup code, while initializing the semi-hosting interface. Can you try qemu?
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

--- Comment #10 from Aldy Hernandez ---
I'm having some trouble reproducing this bug. I'm a little rusty on cross builds, so perhaps someone can lend a hand.

I have a set of combined sources which I'm using to build a toolchain like this:

  ~/src/combined/configure --host=x86_64-linux --build=x86_64-linux --target=armeb-eabi --with-mode=arm --with-cpu=cortex-a9 --with-fpu=neon-fp16 --enable-languages=fortran --disable-libgomp --disable-libsanitizer --disable-werror

I then do:

  cd gcc && make check-fortran RUNTESTFLAGS="execute.exp=in-pack.f90 --target_board=arm-sim"

  === gfortran tests ===

  Schedule of variations:
      arm-sim

  Running target arm-sim
  Using /usr/share/dejagnu/baseboards/arm-sim.exp as board description file for target.
  Using /usr/share/dejagnu/config/sim.exp as generic interface file for target.
  Using /usr/share/dejagnu/baseboards/basic-sim.exp as board description file for target.
  Using /home/cygnus/aldyh/src/combined/gcc/testsuite/config/default.exp as tool-and-target-specific interface file.
  Running /home/cygnus/aldyh/src/combined/gcc/testsuite/gfortran.fortran-torture/execute/execute.exp ...
  FAIL: gfortran.fortran-torture/execute/in-pack.f90 execution, -O0
  FAIL: gfortran.fortran-torture/execute/in-pack.f90 execution, -O1
  ...

I get failures for everything, but it seems every execution test fails, even simple C tests. Tests fail with no messages, just a simple execution failure, so I had to dig into the simulator.

I patched the gdb simulator to trace the instructions to see where we are dying:

  diff --git a/sim/arm/wrapper.c b/sim/arm/wrapper.c
  index bc1a043..4492b19 100644
  --- a/sim/arm/wrapper.c
  +++ b/sim/arm/wrapper.c
  @@ -53,7 +53,7 @@ int stop_simulator;
   #include "dis-asm.h"

   /* TODO: Tracing should be converted to common tracing module. */
  -int trace = 0;
  +int trace = 1;
   int disas = 0;
   int trace_funcs = 0;

This presumably gives us some tracing in gcc/testsuite/*/*.log. The GCC testsuite log file now shows:

  pc: 8c8c, instr: e3530014
  pc: 8c90, instr: 1afb
  pc: 8c94, instr: e5952000
  pc: 8c98, instr: e3a03000
  pc: 8c9c, instr: e5856014
  pc: 8ca0, instr: e5853018
  pc: 8ca4, instr: e1c520fc
  pc: 4, instr: ea00089b

I took a peek at the executable being run with "/my-arm-build/objdump -D the-executable.exe", and I see we are failing in initialise_monitor_handles(). This suggests we're failing during the start-up code:

  8c8c: e3530014 cmp r3, #20
  8c90: 1afb bne 8c84
  8c94: e5952000 ldr r2, [r5]
  8c98: e3a03000 mov r3, #0
  8c9c: e5856014 str r6, [r5, #20]
  8ca0: e5853018 str r3, [r5, #24]
  8ca4: e1c520fc strd r2, [r5, #12]

It seems that last store is corrupting things and making us jump to a PC of 4???

Before I bark up the wrong trees, I have some questions.

Am I running the simulator correctly? Does it require a special flag for cortex-a9? Is the cortex-a9 CPU even handled by the simulator? Should I run the dejagnu tests with -mcpu= or whatever, or is the --with-cpu=cortex-a9 configury flag enough?

Does the arm newlib/libgloss/whatever code have instructions that aren't handled by the GDB simulator?

I don't want to dig too deep into this, only to find out that our simulator, newlib, or whatever cannot handle cortex-a9 + neon-fp16. For that matter, is this bug reproducible on a more generic Arm variant that *IS* supported by gdb?

Sorry for the barrage of questions, but this is a P1, and there doesn't seem to be an easy way to reproduce this.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #9 from Christophe Lyon --- (In reply to Aldy Hernandez from comment #7) > Hi Nick! Hi all! > > Do we have a way of testing armeb, either through a simulator or through > some aarch64 with magic flags? > Please note that the bug appears on armeb (ie. AArch32 big-endian), and not on aarch64.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #8 from Nick Clifton --- Hi Aldy, > Do we have a way of testing armeb, either through a simulator or through > some aarch64 with magic flags? GDB has an ARM simulator which is OK unless you need to test some of the newer features like scalable vector instructions. Just compile your code as normal and then build an armeb targeted version of gdb. Cheers Nick
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Aldy Hernandez changed:

           What    |Removed |Added
                 CC|        |aldyh at gcc dot gnu.org,
                   |        |nickc at gcc dot gnu.org

--- Comment #7 from Aldy Hernandez ---
Hi Nick! Hi all!

Do we have a way of testing armeb, either through a simulator or through some aarch64 with magic flags?

If anyone has a hint on how to reproduce this, I'll gladly take a stab at reproducing, bisecting, debugging, etc. Whatever it takes to inch this forward :).
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #6 from Christophe Lyon --- My bisect script cannot find the commit that introduced the problem with -fno-vect-cost-model, because the build was broken for quite some time. The regression seems to have been introduced between r197671 and r197815.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #5 from Christophe Lyon --- So far, my bisect has been unsuccessful, because my bisect script thinks the guilty commit is one of the exit-code-125 ones, and there are too many. I should probably re-try manually.
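The bisect script itself is not shown in the thread, so the following is only a sketch of why a run of unbuildable revisions stops an automated bisect from converging. It assumes the `git bisect run` convention, where exit status 125 means "this revision cannot be tested, skip it", 0 means good, and any other status from 1 to 127 means bad. The function names and the sample history are hypothetical.

```python
SKIP = 125  # `git bisect run`: revision cannot be tested (e.g. the build is broken)


def bisect_exit_code(build_ok, test_passed):
    """Map one revision's build and test outcome to a bisect exit status."""
    if not build_ok:
        return SKIP
    return 0 if test_passed else 1


def first_bad_candidates(outcomes):
    """Given per-revision outcomes in commit order ('good', 'skip' or 'bad'),
    return the indices that could still be the first bad revision."""
    last_good = max(i for i, o in enumerate(outcomes) if o == "good")
    first_bad = min(i for i, o in enumerate(outcomes) if o == "bad" and i > last_good)
    # Every skipped revision in between is a suspect, plus the first known-bad one.
    return list(range(last_good + 1, first_bad + 1))


history = ["good", "good", "skip", "skip", "skip", "bad", "bad"]
print(first_bad_candidates(history))
```

When the build is broken across the whole window where the behaviour changes, every skipped revision stays a candidate, so the script can only report a range. That matches the outcome in comment #6, where the regression is narrowed to an interval rather than a single commit.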
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518 --- Comment #4 from Christophe Lyon --- Indeed it looks like the testcase has been failing with -fno-vect-cost-model for a very long time. Trying to find the 'good' starting point for a bisect. (I'm using qemu; I have no such board either.)
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Jakub Jelinek changed:

           What    |Removed |Added
                 CC|        |jakub at gcc dot gnu.org

--- Comment #3 from Jakub Jelinek ---
Ping, can you bisect with -O3 -g -fno-vect-cost-model? I don't have access to any armeb box, so no idea what to even look at.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Jeffrey A. Law changed:

           What    |Removed |Added
           Priority|P3      |P1
                 CC|        |law at redhat dot com
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Richard Biener changed:

           What    |Removed |Added
                 CC|        |rguenth at gcc dot gnu.org
   Target Milestone|---     |8.0

--- Comment #2 from Richard Biener ---
So it was broken before the rev. with -fno-vect-cost-model added I guess.
[Bug tree-optimization/82518] [8 regression] gfortran.fortran-torture/execute/in-pack.f90 fails on armeb since r252917
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82518

Christophe Lyon changed:

           What    |Removed |Added
             Target|        |armeb

--- Comment #1 from Christophe Lyon ---
r252917 was a fix for PR tree-optimization/82220