[Bug target/113014] RISC-V: Redundant zeroing instructions in reduction due to r14-3998-g6223ea766daf7c

2023-12-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113014 --- Comment #4 from Robin Dapp --- Richard has posted it and asked for reviews. I have tested it and we have several testsuite regressions with it but no severe ones. Most or all of them are dump fails because we combine into vx variants that

[Bug target/113014] RISC-V: Redundant zeroing instructions in reduction due to r14-3998-g6223ea766daf7c

2023-12-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113014 --- Comment #2 from Robin Dapp --- Yes, that's right.

[Bug target/112999] riscv: Infinite loop with mask extraction

2023-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112999 --- Comment #1 from Robin Dapp --- What actually gets in the way of vec_extract here is changing to a "better" vector mode (which is RVVMF4QI here). If we tried to extract from the mask directly everything would work directly. I have a patch

[Bug target/112999] New: riscv: Infinite loop with mask extraction

2023-12-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112999 Bug ID: 112999 Summary: riscv: Infinite loop with mask extraction Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug middle-end/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2023-12-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971 --- Comment #8 from Robin Dapp --- Yes, can confirm that this helps.

[Bug target/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2023-12-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971 --- Comment #5 from Robin Dapp --- Yes that's what I just tried. No infinite loop anymore then. But that's not a new simplification and looks reasonable so there must be something special for our backend.

[Bug target/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2023-12-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971 --- Comment #3 from Robin Dapp --- In match.pd we do something like this: ;; Function e (e, funcdef_no=0, decl_uid=2751, cgraph_uid=1, symbol_order=4) Pass statistics of "forwprop": Matching expression match.pd:2771,

[Bug target/112971] [14] RISC-V rv64gcv_zvl256b vector -O3: internal compiler error: Segmentation fault signal terminated program cc1

2023-12-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112971 --- Comment #2 from Robin Dapp --- It doesn't look like the same issue to me. The other bug is related to TImode handling in combination with mask registers. I will also have a look at this one.

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929 --- Comment #15 from Robin Dapp --- I think we need to make sure that we're not writing out of bounds. In that case anything might happen and if we just don't happen to overwrite this variable we might hit another one but the test can still

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2023-12-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 --- Comment #10 from Robin Dapp --- I just realized that I forgot to post the comparison recently. With the patch now upstream I don't see any differences for zvl128b and different vlens anymore. What I haven't fully tested yet is zvl256b or

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929 --- Comment #13 from Robin Dapp --- I just built from the most recent commit and it still fails for me. Could there be a difference in qemu? I'm on qemu-riscv64 version 8.1.91 but yours is even newer so that might not explain it. You could

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929 --- Comment #9 from Robin Dapp --- In the good version the length is 32 here because directly before the vsetvl we have: li a4,32 That seems to get lost somehow.

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929 --- Comment #7 from Robin Dapp --- Here 0x105c6 vse8.v v8,(a5) is where we overwrite m. The vl is 128 but the preceding vsetvl gets a4 = 46912504507016 as AVL which already seems broken.
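The effect described in this comment can be sketched with a small model of how vsetvl derives vl from the requested AVL. This is only an illustration, not code from the bug report: the function name model_vsetvl is hypothetical, VLMAX is passed in directly instead of being computed from VLEN, SEW and LMUL, and for VLMAX < AVL < 2*VLMAX (where the RVV specification allows a range of results) the sketch simply returns VLMAX.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified model of vsetvl: the granted vector length
   is the requested AVL, capped at VLMAX.  Real hardware may pick a
   smaller value when VLMAX < AVL < 2*VLMAX; this sketch does not.  */
static uint64_t model_vsetvl(uint64_t avl, uint64_t vlmax)
{
    assert(vlmax > 0);
    return avl <= vlmax ? avl : vlmax;
}
```

Under these assumptions, an AVL of 32 yields vl = 32, while a garbage AVL such as 46912504507016 is capped to VLMAX, so with VLMAX = 128 a following vse8.v would store 128 bytes instead of the intended 32.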

[Bug target/112929] [14] RISC-V vector: Variable clobbered at runtime

2023-12-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112929 --- Comment #6 from Robin Dapp --- This seems to be gone when simple vsetvl (instead of lazy) is used or with -fno-schedule-insns which might indicate a vsetvl pass problem. We might have a few more of those. Maybe it would make sense to run

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2023-12-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 --- Comment #8 from Robin Dapp --- With Juzhe's latest fix that disables VLS modes >= 128 bit for zvl128b x264 runs without issues here and some of the additional execution failures are gone. Will post the current comparison later.

[Bug middle-end/112872] [14 Regression] RISCV ICE: in store_integral_bit_field, at expmed.cc:1049 with -03 rv64gcv_zvl1024b --param=riscv-autovec-preference=fixed-vlmax

2023-12-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112872 --- Comment #2 from Robin Dapp --- Thanks. Yes that's similar and also looks fixed by the introduction of the vec_init expander. Added this test case to the patch and will push it soon.

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2023-12-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 --- Comment #7 from Robin Dapp --- Ah, forgot three tests: FAIL: gcc.dg/vect/bb-slp-cond-1.c execution test FAIL: gcc.dg/vect/bb-slp-pr101668.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/bb-slp-pr101668.c execution test On

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2023-12-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 --- Comment #6 from Robin Dapp --- I indeed see more failures with _zvl128b, vlen=256 (than with _zvl128b, vlen=128): FAIL: gcc.dg/vect/pr66251.c -flto -ffat-lto-objects execution test FAIL: gcc.dg/vect/pr66251.c execution test FAIL:

[Bug middle-end/112854] [14] RISCV ICE: expand: in store_integral_bit_field, at expmed.cc:1049 on rv32gcv_zvl1024b --param=riscv-autovec-preference=fixed-vlmax

2023-12-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112854 --- Comment #3 from Robin Dapp --- The problem seems to be that we can overlay a 32-bit bitmask with an SImode subreg and work with it. For zvl1024b on rv32 we don't allow this causing the ICE. We might be able to work around it by providing

[Bug target/112853] RISC-V: RVV: SPEC2017 525.x264 regression

2023-12-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112853 --- Comment #5 from Robin Dapp --- Can confirm. The scalable build works with qemu vlen=128 but fails with vlen=256. That's a good data point as I'm not sure we're already covering this with the current runs? I'm going to start a testsuite

[Bug middle-end/112854] [14] RISCV ICE: expand: in store_integral_bit_field, at expmed.cc:1049 on rv32gcv_zvl1024b --param=riscv-autovec-preference=fixed-vlmax

2023-12-05 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112854 --- Comment #2 from Robin Dapp --- Hehe I was hoping we wouldn't hit a vec_set on a mask but apparently this happens as well. We don't have a pattern for that either, yet. Thanks for the test. I would expect this to be fixed in a similar way

[Bug target/112583] RISC-V regression testsuite errors with rv64gcv_zvl128b

2023-12-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112583 --- Comment #14 from Robin Dapp --- Yes, that's the culprit. I already pushed a fix yesterday.

[Bug target/112583] RISC-V regression testsuite errors with rv64gcv_zvl128b

2023-12-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112583 --- Comment #12 from Robin Dapp --- Ok, on my server the difference is that I didn't add vext_spec=v1.0 to the qemu options. This caused the qemu diagnostic which would of course not match the expected output.

[Bug target/112583] RISC-V regression testsuite errors with rv64gcv_zvl128b

2023-12-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112583 --- Comment #11 from Robin Dapp --- Verified they work locally but also fail on a different server. Also fail without vector and at -O0. Maybe it's different tcl versions or the shell doing wonky stuff?

[Bug target/112583] RISC-V regression testsuite errors with rv64gcv_zvl128b

2023-12-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112583 --- Comment #10 from Robin Dapp --- I didn't yet look at all those closer because they are more dump failures than real execution failures. The ones I checked are expected "^foobar$" but got: "foobar" so I considered this rather an

[Bug target/112773] [14 Regression] RISC-V ICE: in force_align_down_and_div, at poly-int.h:1828 on rv32gcv_zvl256b

2023-12-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112773 --- Comment #13 from Robin Dapp --- Mostly an issue because our expander is definitely not prepared to handle that :) It looks like aarch64's is, though, and ours can/should be changed then. aarch64 doesn't need to implement a qi/bi extract

[Bug target/112773] [14 Regression] RISC-V ICE: in force_align_down_and_div, at poly-int.h:1828 on rv32gcv_zvl256b

2023-12-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112773 --- Comment #11 from Robin Dapp --- When I define a vec_extract...bi pattern we don't enter the if (vec_extract) in expmed because e.g. bitsize = {1, 0} bitnum = {3, 4} and GET_MODE_BITSIZE (innermode) = {1, 0} with innermode = BImode. This

[Bug target/112773] [14 Regression] RISC-V ICE: in force_align_down_and_div, at poly-int.h:1828 on rv32gcv_zvl256b

2023-12-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112773 --- Comment #9 from Robin Dapp --- Ok, it's not the fold_extract_last expander. It just appeared that way here because I disabled some other things. What we want to do is extract the last element from a vector. This works as long as we have

[Bug target/112773] [14 Regression] RISC-V ICE: in force_align_down_and_div, at poly-int.h:1828 on rv32gcv_zvl256b

2023-12-01 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112773 --- Comment #8 from Robin Dapp --- Thanks for the testcase. It looks pretty similar to the situation why I introduced the bitmask extract in the first place and I don't think that's the root cause. As last time the problem is that the generic

[Bug target/112598] RISC-V regression testsuite errors with rv64gcv_zvl512b

2023-11-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112598 --- Comment #15 from Robin Dapp --- Does the =m fix your issue? Or is the code gen different then and we're just lucky? For my problem it doesn't help because we still don't recognize an alias between load and store and the load is moved.

[Bug rtl-optimization/110237] gcc.dg/torture/pr58955-2.c is miscompiled by RTL scheduling after reload

2023-11-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110237 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org

[Bug target/112598] RISC-V regression testsuite errors with rv64gcv_zvl512b

2023-11-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112598 --- Comment #13 from Robin Dapp --- It looks like the takeaway from the other thread is that there are many likewise assumptions about masked stores in the middle end. It's probably difficult to get them all right in a short time. Therefore I

[Bug target/112598] RISC-V regression testsuite errors with rv64gcv_zvl512b

2023-11-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112598 --- Comment #11 from Robin Dapp --- On Friday I looked into one of the Fortran fails, class_67.f90 and debugged it independently without reading here further. It is also due to the same reason - alias analysis finds that the predicated store

[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076

2023-11-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/112670] RISC-V: Run fail on pr65518.c with -flto

2023-11-22 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112670 --- Comment #1 from Robin Dapp --- The problem is exposed with the ipa copy propagation pass. I haven't narrowed it down yet but will continue tomorrow.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661 --- Comment #3 from Robin Dapp --- Yes, as agreed. Though today I probably won't be able to do much due to private matters.

[Bug tree-optimization/112661] [14] RISC-V ICE: in duplicate_and_interleave, at tree-vect-slp.cc:8025 with maxval_char_3.f90 vlen256b

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112661 --- Comment #1 from Robin Dapp --- Confirmed, smaller example: program main implicit none integer, parameter :: n=5 character(len=6), dimension(n,n) :: a character(len=6), dimension(n) :: r1 integer :: i logical, dimension(n,n) ::

[Bug target/111488] ICE ion riscv gcc.dg/vect/vect-126.c

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111488 Robin Dapp changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #21 from Robin Dapp --- Grml, ../../gcc/tree-vect-loop.cc:12248:1: fatal error: error writing to /tmp/ccsMqSV2.s: No space left on device on cfarm185, cannot even build anymore.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #20 from Robin Dapp --- Not really depending on an order but rather expecting that the reduction variable is in op[1] (as created by ifcvt). That might already be the problem because here the reduction index is 2. It just never

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #18 from Robin Dapp --- Already in ifcvt we have: _ifc__60 = .COND_ADD (_2, _6, MADPictureC1_lsm.10_25, MADPictureC1_lsm.10_25); which we should not. This is similar on riscv. But during value numbering it still is Value

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #17 from Robin Dapp --- Thanks, I reproduced it on the compile farm with this example. Going to have a look. riscv doesn't fail in a similar way this time.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #15 from Robin Dapp --- Hmm, that's definitely related to the original change but most likely not to the fixes. gcc_assert (code == IFN_COND_ADD || code == IFN_COND_SUB || code == IFN_COND_MUL || code ==

[Bug target/112583] RISC-V regression testsuite errors with rv64gcv_zvl128b

2023-11-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112583 --- Comment #3 from Robin Dapp --- I cannot reproduce this either. Just started with binop/* and don't see any fails locally. Patrick, could you check what caused this?

[Bug tree-optimization/111970] [14 regression] SLP for non-IFN gathers result in RISC-V test failure on gather since r14-4745-gbeab5b95c58145

2023-11-20 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111970 --- Comment #18 from Robin Dapp --- I did a quick testsuite run on rv32 and can confirm that this fixes the issue for me.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #47 from Robin Dapp --- And, just to confirm: Testsuite is unchanged on riscv with your patch.

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #46 from Robin Dapp --- (In reply to Jakub Jelinek from comment #43) > Now, the patch changed it to allow one extra use in certain cases (but I > think only on use_stmt, because there should be one use on use_stmt and if > there is

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #40 from Robin Dapp --- (In reply to Jakub Jelinek from comment #37) [..] > The above isn't complete, so one just has to guess what you mean outside of > that, but the above doesn't seem to be correct. There are many internal >

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #35 from Robin Dapp --- What does get rid of the comparison failures in the three last posted reduced examples is: gcall *call = dyn_cast <gcall *> (op_use_stmt); internal_fn ifn; if (call &&

[Bug tree-optimization/112374] [14 Regression] Failed bootstrap with `--with-arch=skylake-avx512 --with-cpu=skylake-avx512`, causes a comparison failure since r14-5076-g01c18f58d37865d5f3bbe93e666183b

2023-11-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #34 from Robin Dapp --- (In reply to Jakub Jelinek from comment #29) > --- gcc/tree-vect-loop.cc.jj 2023-11-14 10:35:52.0 +0100 > +++ gcc/tree-vect-loop.cc 2023-11-15 22:42:32.782007408 +0100 > @@ -4105,9 +4105,9 @@

[Bug middle-end/112552] [14 Regression] ICE: in expand_insn, at optabs.cc:8305

2023-11-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112552 --- Comment #7 from Robin Dapp --- Ah, it's even easier to trigger then. I already have a somewhat working solution by going with Richi's suggestion and adding the handling for COND_OPs in vect patterns. Still needs a bit more polishing and

[Bug target/112531] [14] RISC-V: gcc.dg/unroll-8.c rtl-dump scan errors with --param=riscv-autovec-preference=scalable

2023-11-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112531 --- Comment #4 from Robin Dapp --- Personally, I don't mind having some FAILs as long as we know them and understand the reason for them. I wouldn't insist on "fixing" them but don't mind if others prefer to have the results "clean".

[Bug target/112531] [14] RISC-V: gcc.dg/unroll-8.c rtl-dump scan errors with --param=riscv-autovec-preference=scalable

2023-11-15 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112531 --- Comment #2 from Robin Dapp --- Yes, I'd also argue in favor of -fno-tree-vectorize here.

[Bug target/112527] RVV integer vector instructions generated with rv64gc_zvfh

2023-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112527 Robin Dapp changed: What|Removed |Added Resolution|--- |INVALID Status|UNCONFIRMED

[Bug target/112527] RVV integer vector instructions generated with rv64gc_zvfh

2023-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112527 --- Comment #2 from Robin Dapp --- Ah, thanks, so it depends on zve32f which implies zve32x. Ok, then all good and we can close this.

[Bug target/112527] New: RVV integer vector instructions generated with rv64gc_zvfh

2023-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112527 Bug ID: 112527 Summary: RVV integer vector instructions generated with rv64gc_zvfh Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal

[Bug target/112481] [14 Regression] RISCV: ICE: Segmentation fault when compiling pr110817-3.c

2023-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112481 Robin Dapp changed: What|Removed |Added CC||palmer at dabbelt dot com --- Comment #10

[Bug target/112374] [14 Regression] `--with-arch=skylake-avx512 --with-cpu=skylake-avx512` causes a comparison failure

2023-11-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #11 from Robin Dapp --- Thanks for figuring that out. No idea if the pattern is the problem, most likely not? I rather suppose there is still a missing fixup somewhere in the vectorizer that I didn't encounter with my testing. So

[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076

2023-11-10 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 --- Comment #4 from Robin Dapp --- Is there another way to make it more robust? Or does the existing void vect_finish_replace_stmt (vec_info *vinfo, stmt_vec_info stmt_info, gimple *vec_stmt) { gimple *scalar_stmt

[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076

2023-11-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 --- Comment #2 from Robin Dapp --- I tested diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc index a544bc9b059..257fd40793e 100644 --- a/gcc/tree-vect-loop.cc +++ b/gcc/tree-vect-loop.cc @@ -7084,7 +7084,7 @@

[Bug tree-optimization/112464] [14 Regression] ICE avx512 with -ftrapv since r14-5076

2023-11-09 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112464 --- Comment #1 from Robin Dapp --- We fail at: void vect_finish_replace_stmt (vec_info *vinfo, stmt_vec_info stmt_info, gimple *vec_stmt) { gimple *scalar_stmt = vect_orig_stmt (stmt_info)->stmt; gcc_assert

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #11 from Robin Dapp --- Thanks, this is helpful. I have a patch that I just bootstrapped and ran the testsuite with on aarch64. Going to post it soon, maybe Richi still has a better idea how to work around this.

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #9 from Robin Dapp --- I believe the problem is that in if (vectype) vector_type = vectype; else if (VECT_SCALAR_BOOLEAN_TYPE_P (TREE_TYPE (op)) && VECTOR_BOOLEAN_TYPE_P (stmt_vectype))

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #8 from Robin Dapp --- Ah of course it's not the first argument but the mask. During vectorization we already create fail1.c:15:10: note: add new stmt: vect__ifc__141.81_358 = .COND_ADD (vect_cst__356,

[Bug middle-end/112406] [14 Regression] Several SPECCPU 2017 benchmarks fail with on internal compiler error: in expand_insn, at optabs.cc:8305 after g:01c18f58d37865d5f3bbe93e666183b54ec608c7

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112406 --- Comment #7 from Robin Dapp --- Ah, thanks, I can reproduce this on the cfarm/gcc185. We don't expand: vect__ifc__141.81_358 = .COND_ADD (vect_cst__356, vect_GetImageChannelMoments_M00_0_lsm.74_338, { 1.0e+0, ... },

[Bug target/112374] [14 Regression] `--with-arch=skylake-avx512 --with-cpu=skylake-avx512` causes a comparison failure

2023-11-08 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112374 --- Comment #6 from Robin Dapp --- How does the test suite look without bootstrapping? Are there still new FAILs?

[Bug middle-end/112359] [14 Regression] ICE: in expand_fn_using_insn, at internal-fn.cc:215 with -O -ftree-loop-if-convert -mavx512fp16

2023-11-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112359 --- Comment #2 from Robin Dapp --- Would something like + bool allow_cond_op = flag_tree_loop_vectorize +&& !gimple_bb (phi)->loop_father->dont_vectorize; in convert_scalar_cond_reduction be sufficient or are there more conditions to check

[Bug tree-optimization/112361] [14 Regression] avx512f-reduce-op-1.c miscompiled since r14-5076

2023-11-06 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361 --- Comment #6 from Robin Dapp --- So "before" we created vect__3.12_55 = MEM [(float *)vectp_a.10_53]; vect__ifc__43.13_57 = VEC_COND_EXPR ; // _ifc__43 = _24 ? _3 : 0.0; stmp__44.14_58 = BIT_FIELD_REF ; stmp__44.14_59 = r3_29 +

[Bug target/112363] GCN: 'FAIL: gcc.dg/vect/vect-cond-reduc-in-order-2-signed-zero.c execution test'

2023-11-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112363 --- Comment #1 from Robin Dapp --- This test was introduced in order to check that we correctly "reduce" with -0.0 as neutral element, i.e. a reduction preserves an initial -0.0 and doesn't turn it into 0.0 by adding 0.0. Kernel aborted means
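Why -0.0 rather than 0.0 has to be the neutral element can be shown with a scalar sketch. This is an illustration under IEEE 754 round-to-nearest, not the test from the testsuite: the function masked_sum and its parameters are hypothetical, and inactive elements are replaced by a caller-chosen neutral value before being added.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>

/* Hypothetical masked in-order sum: elements whose mask entry is zero
   contribute 'neutral' instead of their own value.  */
static double masked_sum(double init, const double *a, const int *mask,
                         size_t n, double neutral)
{
    assert(n == 0 || (a != NULL && mask != NULL));
    double acc = init;
    for (size_t i = 0; i < n; i++)
        acc += mask[i] ? a[i] : neutral;
    return acc;
}
```

With all mask entries zero and init = -0.0, passing neutral = -0.0 keeps the sign bit set, because -0.0 + -0.0 is -0.0. Passing neutral = 0.0 clears it, because -0.0 + 0.0 is +0.0. The sign can be inspected with signbit from math.h.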

[Bug tree-optimization/112361] [14 Regression] avx512f-reduce-op-1.c miscompiled since r14-5076

2023-11-03 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112361 --- Comment #2 from Robin Dapp --- I can have a look. Of course I tested it but neither the compile farm machine (gcc188) I used nor my local device have AVX512 run capability. Anywhere else I can test it?

[Bug target/111311] RISC-V regression testsuite errors with --param=riscv-autovec-preference=scalable

2023-11-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311 --- Comment #10 from Robin Dapp --- As a general remark: Some of those are present on other backends as well, some have been introduced by recent common-code changes and some are bogus test prerequisites or checks. I'm not saying we are in

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-11-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #30 from Robin Dapp --- On my machine it is not nearly as bad as insn-emit.cc. What dominates for me with a GCC 13 host compiler is the already fixed insn-opinit problem. How long does it take for you (maybe in % of the total

[Bug target/112109] New: Missing riscv vectorized strcmp (and other) expanders

2023-10-27 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112109 Bug ID: 112109 Summary: Missing riscv vectorized strcmp (and other) expanders Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3

[Bug tree-optimization/111791] RISC-V: Strange loop vectorizaion on popcount function

2023-10-18 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111791 --- Comment #4 from Robin Dapp --- This is a scalar popcount and as Kito already noted we will just emit cpop a0, a0 once the zbb extension is present. As to the question what is actually being vectorized here, I'm not so sure :D It looks
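For reference, a scalar popcount of the kind discussed here might look like the sketch below. The function names are hypothetical and this is not the test case from the bug report. The first version counts bits with a loop, the second uses GCC's __builtin_popcountll, which according to the comment above is what would end up as a single cpop once the zbb extension is available.

```c
#include <assert.h>
#include <stdint.h>

/* Loop-based popcount: clears the lowest set bit once per iteration.  */
static int popcount_loop(uint64_t x)
{
    int n = 0;
    while (x) {
        x &= x - 1;
        n++;
    }
    assert(n <= 64); /* A 64-bit value has at most 64 set bits.  */
    return n;
}

/* Same result via the GCC builtin.  */
static int popcount_builtin(uint64_t x)
{
    return __builtin_popcountll(x);
}
```

Both functions return the same count for any input; which instructions the compiler actually emits depends on the target flags and optimization level.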

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794 --- Comment #10 from Robin Dapp --- From what I can tell with my barely working connection, no regressions on x86, aarch64 or power10 with the adjusted check.

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794 --- Comment #9 from Robin Dapp --- Yes, that's from pattern recog: slp.c:11:20: note: === vect_pattern_recog === slp.c:11:20: note: vect_recog_mask_conversion_pattern: detected: _5 = _2 & _4; slp.c:11:20: note: mask_conversion pattern

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794 --- Comment #7 from Robin Dapp --- vectp.4_188 = x_50(D); vect__1.5_189 = MEM [(int *)vectp.4_188]; mask__2.6_190 = { 1, 1, 1, 1, 1, 1, 1, 1 } == vect__1.5_189; mask_patt_156.7_191 = VIEW_CONVERT_EXPR>(mask__2.6_190); _1 = *x_50(D);

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-16 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794 --- Comment #5 from Robin Dapp --- Disregarding the reasons for the precision adjustment, for this case here, we seem to fail at: /* We do not handle bit-precision changes. */ if ((CONVERT_EXPR_CODE_P (code) || code ==

[Bug c/111794] RISC-V: Missed SLP optimization due to mask mode precision

2023-10-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111794 --- Comment #4 from Robin Dapp --- Just to mention here as well. As this seems ninstance++ where the adjust_precision thing comes back to bite us, I'm going to go back and check if the issue why it was introduced (DCE?) cannot be solved

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #26 from Robin Dapp --- So insn-opinit.cc still takes 2-3 minutes to compile here, even though the file is not gigantic. With the same GCC 13.1 x86 host compiler I see: phase opt and generate : 170.28 ( 99%) 0.75 (

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #25 from Robin Dapp --- At least here locally the maximum I saw was 1.4 GB of RES for insn-emit-10.cc. That's still not ideal (especially when 8 or 10 of those files compile in parallel) but at least no 8 GB for a single file

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-12 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #23 from Robin Dapp --- For the lack of a better idea (and time constraints as looking for compiler bottlenecks is slow and tedious) I went with Kito's suggestion of splitting insn-emit.cc This reduces this part of the compilation

[Bug tree-optimization/111760] risc-v regression: COND_LEN_* incorrect fold/simplify in middle-end

2023-10-11 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111760 --- Comment #6 from Robin Dapp --- Yes, thanks for filing this bug separately. The patch doesn't disable all of those optimizations, of course I paid special attention not mess up with them. The difference here is that we valueize, add

[Bug tree-optimization/111760] risc-v regression: COND_LEN_* incorrect fold/simplify in middle-end

2023-10-10 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111760 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org,

[Bug target/111428] RISC-V vector: Flaky segfault in {min|max}val_char_{1|2}.f90

2023-10-10 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111428 --- Comment #3 from Robin Dapp --- Still difficult to track down. The following is a smaller reproducer: program main implicit none integer, parameter :: n=5, m=3 integer, dimension(n,m) :: v real, dimension(n,m) :: r do call

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #22 from Robin Dapp --- Ah, then it's not that different, your machine is just faster ;) callgraph ipa passes : 69.77 ( 11%) 5.97 ( 13%) 76.05 ( 12%) 2409M ( 10%) integration: 91.95 (

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #20 from Robin Dapp --- Mhm, why is your profile so different from mine? I'm also on an x86_64 host with a 13.2.1 host compiler (Fedora). Is it because of the preprocessed source? Or am I just reading the timing report wrong?

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #18 from Robin Dapp --- Just finished an initial timing run, sorted, first 10: Time variable usr sys wall GGC phase opt and generate : 567.60 ( 97%) 38.23

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-04 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 --- Comment #16 from Robin Dapp --- Confirming that it's the compilation of insn-emit.cc which takes > 10 minutes. The rest (including auto generating of files) is reasonably fast. Going to do some experiments with it and see which pass takes

[Bug target/111506] RISC-V: Failed to vectorize conversion from INT64 -> _Float16

2023-10-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111506 --- Comment #5 from Robin Dapp --- Ah, thanks Joseph, so this at least means that we do not need !flag_trapping_math here. However, the vectorizer emulates the 64-bit integer to _Float16 conversion via an intermediate int32_t and now the riscv

[Bug target/111600] [14 Regression] RISC-V bootstrap time regression

2023-10-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111600 Robin Dapp changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #12

[Bug target/111506] RISC-V: Failed to vectorize conversion from INT64 -> _Float16

2023-10-02 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111506 Robin Dapp changed: What|Removed |Added CC||joseph at codesourcery dot com ---

[Bug target/111428] RISC-V vector: Flaky segfault in {min|max}val_char_{1|2}.f90

2023-09-21 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111428 --- Comment #2 from Robin Dapp --- Reproduced locally. The identical binary sometimes works and sometimes doesn't so it must be a race...

[Bug target/111488] ICE ion riscv gcc.dg/vect/vect-126.c

2023-09-19 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111488 Robin Dapp changed: What|Removed |Added CC||juzhe.zhong at rivai dot ai --- Comment

[Bug target/111488] New: ICE ion riscv gcc.dg/vect/vect-126.c

2023-09-19 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111488 Bug ID: 111488 Summary: ICE ion riscv gcc.dg/vect/vect-126.c Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target

[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS

2023-09-14 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 --- Comment #6 from Robin Dapp --- Created attachment 55902 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55902&action=edit Tentative You're referring to the case where we have init = -0.0, the condition is false and we end up wrongly doing

[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS

2023-09-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 --- Comment #3 from Robin Dapp --- Several other things came up, so I'm just going to post the latest status here without having revised or tested it. Going to try fixing it and testing tomorrow. --- a/gcc/tree-vect-loop.cc +++

[Bug middle-end/111401] Middle-end: Missed optimization of MASK_LEN_FOLD_LEFT_PLUS

2023-09-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111401 Robin Dapp changed: What|Removed |Added CC||rdapp at gcc dot gnu.org --- Comment #2

[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction

2023-09-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153 --- Comment #4 from Robin Dapp --- Yes, with VLS reduction this will improve. On aarch64 + sve I see loop inside costs: 2 This is similar to our VLS costs. And their loop is indeed short: ld1w z30.s, p7/z, [x0, x2, lsl 2]

[Bug c/111153] RISC-V: Incorrect Vector cost model for reduction

2023-09-13 Thread rdapp at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111153 --- Comment #2 from Robin Dapp --- With the current trunk we don't spill anymore: (VLS) .L4: vle32.v v2,0(a5) vadd.vv v1,v1,v2 addi a5,a5,16 bne a5,a4,.L4 Considering just that loop I'd say costing works
