[Bug bootstrap/112497] [14 Regression] Bootstrap comparison failure: gcc/analyzer/constraint-manager.o differs on loongarch64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112497 --- Comment #5 from Jeffrey A. Law --- This failure means the stage1 and stage2 compilers generated different code for the same input. So when I need to debug this I usually start by first getting that source code. Based on the title of this bugzilla you're going to want the .ii file for constraint-manager as built by either the stage1 or stage2 compiler. Then I feed that into the stage1 and stage2 compilers with the same optimization options to verify that they indeed generate different code. Sometimes that doesn't work when the issue is debug insns, but that's where I start. Once I have confirmed the two compilers generate different code, then I try to isolate where/why. This can often be done by looking at debug dumps to narrow things down to a pass that's behaving differently. Alternatively you can replace objects in the stage2 compiler with those from the stage1 compiler to narrow it down to a single .o that causes the compiler's behavior to diverge. Then it's usually a matter of going into the debugger and understanding why the given pass is behaving differently. It's a long, painful process. *Sometimes* you can just build the stage1 compiler and run the testsuite and see if there are new failures on your target. It doesn't always generate something useful, but when it does it's often faster than the process I mentioned above.
[Bug bootstrap/112497] [14 Regression] Bootstrap comparison failure: gcc/analyzer/constraint-manager.o differs on loongarch64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112497 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #3 from Jeffrey A. Law --- If at all possible, cc Jin Ma on this since it's his change; I just reviewed and committed the bits on Jin's behalf.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #41 from Jeffrey A. Law --- I would agree. In fact, the whole point of the f-m-o pass is to bring those immediates into the memory reference. It'd be really useful to know why that isn't happening. The only thing I can think of would be if multiple instructions needed the %r20 in the RTL you attached. Which might point to a refinement we should make in f-m-o: specifically, the transformation isn't likely profitable if we aren't able to fold away a term or fold a constant term into the actual memory reference.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #31 from Jeffrey A. Law --- IIRC r21 is call-clobbered. So I guess the question turns into what was the sequence before f-m-o got involved -- was it assuming r21 would be preserved, or did f-m-o make r21 live across the call?
[Bug tree-optimization/112468] New: [14 Regression] Missed phi-opt after recent change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112468 Bug ID: 112468 Summary: [14 Regression] Missed phi-opt after recent change Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This change: commit 3f176e1adc6bc9cc2c21222d776b51d9f43cb66b (HEAD) Author: Tamar Christina Date: Thu Nov 9 13:59:39 2023 + middle-end: optimize fneg (fabs (x)) to copysign (x, -1) [PR109154] This patch transforms fneg (fabs (x)) into copysign (x, -1) which is more canonical and allows a target to expand this sequence efficiently. Such sequences are common in scientific code working with gradients. There is an existing canonicalization of copysign (x, -1) to fneg (fabs (x)) which I remove since this is a less efficient form. The testsuite is also updated in light of this. gcc/ChangeLog: PR tree-optimization/109154 * match.pd: Add new neg+abs rule, remove inverse copysign rule. gcc/testsuite/ChangeLog: PR tree-optimization/109154 * gcc.dg/fold-copysign-1.c: Updated. * gcc.dg/pr55152-2.c: Updated. * gcc.dg/tree-ssa/abs-4.c: Updated. * gcc.dg/tree-ssa/backprop-6.c: Updated. * gcc.dg/tree-ssa/copy-sign-2.c: Updated. * gcc.dg/tree-ssa/mult-abs-2.c: Updated. * gcc.target/aarch64/fneg-abs_1.c: New test. * gcc.target/aarch64/fneg-abs_2.c: New test. * gcc.target/aarch64/fneg-abs_3.c: New test. * gcc.target/aarch64/fneg-abs_4.c: New test. * gcc.target/aarch64/sve/fneg-abs_1.c: New test. * gcc.target/aarch64/sve/fneg-abs_2.c: New test. * gcc.target/aarch64/sve/fneg-abs_3.c: New test. * gcc.target/aarch64/sve/fneg-abs_4.c: New test. Is causing a testsuite regression on moxie-elf. This is a scan dump failure, so you don't need a full toolchain, just a cross compiler. moxie-sim: gcc.dg/tree-ssa/phi-opt-24.c scan-tree-dump-not phiopt2 "if"
[Bug target/112462] New: RISC-V zicond cost model enhancements
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112462 Bug ID: 112462 Summary: RISC-V zicond cost model enhancements Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- Currently the costing of zicond always returns COSTS_N_INSNS (1) which can be inaccurate. I see two primary issues that need to be fixed. First, for conditions which are not equality comparisons against zero the expander will need to emit a sCC insn. That additional instruction needs to be included in the cost. Second, the expander needs to look at the true/false arms and potentially emit additional code because of the limitations of the czero instruction. Those additional instructions need to be included in the cost as well. It's unclear if we should refactor the expander logic so that its basic structure can be used to drive costing as well as expansion logic or if we should just mirror the basic structure with new code and keep it in sync with the expander logic.
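The two issues above can be sketched as a small instruction-count estimate. This is only an illustration of the idea, not code from the RISC-V backend: `struct cmov_shape` and `zicond_insn_count` are hypothetical names, and the "two czero plus one combining instruction" figure for the general case is an assumption about the worst-case expansion rather than what the expander necessarily emits.

```c
#include <stdbool.h>

/* Hypothetical description of one conditional-move request.  None of
   these names come from the RISC-V backend; they only model the two
   issues described in the report.  */
struct cmov_shape {
  bool cond_is_eq_ne_zero;  /* condition is (x == 0) or (x != 0) */
  bool true_arm_is_zero;    /* value selected when the condition holds */
  bool false_arm_is_zero;   /* value selected otherwise */
};

/* Estimated number of instructions for a Zicond-based expansion.  */
int zicond_insn_count (const struct cmov_shape *s)
{
  int insns = 0;

  /* Issue 1: anything other than an equality test against zero first
     needs an sCC-style instruction to materialize the condition.  */
  if (!s->cond_is_eq_ne_zero)
    insns += 1;

  /* Issue 2: a czero instruction selects between a register and zero.
     With one zero arm a single czero is enough; otherwise assume two
     czero instructions plus one instruction to combine the results.  */
  if (s->true_arm_is_zero || s->false_arm_is_zero)
    insns += 1;
  else
    insns += 3;

  return insns;
}
```

A cost hook built this way would return something like COSTS_N_INSNS (zicond_insn_count (...)) instead of a flat COSTS_N_INSNS (1); whether that logic is shared with the expander or mirrored is exactly the open question in the report.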
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #26 from Jeffrey A. Law --- As a compiler junkie, I tend to think compiler first until I can prove it otherwise. I wouldn't get too hung up on aliasing issues and such at this point. Do we already have a dump for the key function? Presumably f-m-o doesn't trigger *that* much. And if this is triggering w/o LTO we can probably move to cross debugging and analysis of those dump files and assembly code with and without f-m-o enabled, narrowing our focus on the key function.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #19 from Jeffrey A. Law --- f-m-o runs post-allocation, so the scope of where its behavior can change things is narrower. So testing with -fno-schedule-insns isn't going to be useful, but -fno-schedule-insns2 might. I'm a bit concerned that we can't turn off f-m-o with an attribute. That would indicate something isn't wired up right in the options handling.
[Bug rtl-optimization/112415] [14 regression] Python 3.11 miscompiled on HPPA with new RTL fold mem offset pass, since r14-4664-g04c9cf5c786b94
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112415 --- Comment #6 from Jeffrey A. Law --- Do we have assembly code around the faulting point (x/20i $pc) and a register dump (i r)? The biggest concern I'd have with f-m-o on the PA would be the implicit segment selection that happens on the base register -- but it would only be an issue if we are faulting on an unscaled indexed addressing mode and only if the linux-gnu port was actually putting different values into the space registers. WRT testing -- we did test this on hppa1.1-linux-gnu. Just a bootstrap and regression test of the compiler itself.
[Bug target/111311] RISC-V regression testsuite errors with --param=riscv-autovec-preference=scalable
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111311 --- Comment #14 from Jeffrey A. Law --- As Andrew said, if there's a test that depends on behavior of -INT_MIN, then the test needs to be fixed. That's undefined behavior.
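For reference, a minimal sketch of how a test can avoid depending on -INT_MIN: do the negation in unsigned arithmetic, which is well defined for every input. The function name `magnitude` is made up for this illustration and does not come from the testsuite.

```c
#include <limits.h>

/* Magnitude of X as an unsigned value.  Negating in unsigned arithmetic
   is well defined for every input, including INT_MIN, whereas -x on a
   signed int overflows (undefined behavior) when x == INT_MIN.  */
unsigned int magnitude (int x)
{
  unsigned int ux = (unsigned int) x;
  return x < 0 ? 0u - ux : ux;
}
```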
[Bug rtl-optimization/109035] meaningless memory store on RISC-V and LoongArch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109035 --- Comment #8 from Jeffrey A. Law --- No spills on rv64 either.
[Bug rtl-optimization/104387] aarch64: Redundant SXTH for “bag of bits” moves
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104387 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- As noted in bz111384, this can be addressed via Joern's extension DCE pass that we're beating on right now. Conceptually it tracks liveness of sub-word objects within a register and when it encounters an extension that sets bits that are never read, it eliminates the extension. Conceptually simple and we've confirmed it addresses the issue in 111384. I strongly suspect it would fix this one as well. It's still got bugs and isn't ready for integration, but to date Joern's basic approach seems the most viable for eliminating unnecessary extensions.
[Bug tree-optimization/112320] [14 Regression] crash from insert_debug_temp_for_var_def since r14-5032-ge3da1d7bb288c8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112320 --- Comment #6 from Jeffrey A. Law --- Created attachment 56480 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=56480&action=edit Testcase for fr30-elf -Os -g
[Bug tree-optimization/112320] [14 Regression] crash from insert_debug_temp_for_var_def since r14-5032-ge3da1d7bb288c8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112320 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW CC||law at gcc dot gnu.org Last reconfirmed||2023-11-01 Ever confirmed|0 |1 --- Comment #5 from Jeffrey A. Law --- I've bisected a failure on fr30-elf to the same commit. The failure mode is different, but given it's the same commit, I'm attaching the testcase to this BZ. 0xba6a67 phi_nodes_ptr(basic_block_def*) /home/jlaw/test/gcc/gcc/gimple.h:4700 0xba6a67 gsi_start_phis(basic_block_def*) /home/jlaw/test/gcc/gcc/gimple-iterator.cc:935 0xba6a67 gsi_for_stmt(gimple*) /home/jlaw/test/gcc/gcc/gimple-iterator.cc:620 0xf477c1 replace_uses_by(tree_node*, tree_node*) /home/jlaw/test/gcc/gcc/tree-cfg.cc:2055 0x111b731 clean_up_loop_closed_phi(function*) /home/jlaw/test/gcc/gcc/tree-ssa-propagate.cc:1296 0xd23348 loop_optimizer_finalize(function*, bool) /home/jlaw/test/gcc/gcc/loop-init.cc:146 0x10e1448 tree_ssa_loop_done /home/jlaw/test/gcc/gcc/tree-ssa-loop.cc:478 0x10e1498 execute /home/jlaw/test/gcc/gcc/tree-ssa-loop.cc:507 Compile with -Os -g
[Bug target/112298] Poor code for DImode operations on H8 port
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112298 Jeffrey A. Law changed: What|Removed |Added Target||h8300 Priority|P3 |P4
[Bug target/112298] New: Poor code for DImode operations on H8 port
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112298 Bug ID: 112298 Summary: Poor code for DImode operations on H8 port Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- long long foo(long long x) { return x << 1; } Highlights several code inefficiencies WRT DImode values on the H8. I would expect that defining a reasonable adddi3 and some DImode shifts would likely help this problem considerably. I'm not currently working on this problem.
[Bug libstdc++/107885] H8/300: libsupc++/hash_bytes.cc fix shift-count-overflow warning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107885 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Jeffrey A. Law --- No plans to backport this.
[Bug target/111466] RISC-V: redundant sign extensions despite ABI guarantees
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111466 Jeffrey A. Law changed: What|Removed |Added Status|ASSIGNED|RESOLVED CC||law at gcc dot gnu.org Resolution|--- |FIXED --- Comment #5 from Jeffrey A. Law --- Fixed on the trunk now.
[Bug tree-optimization/111798] New: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111798 Bug ID: 111798 Summary: [14 Regression] Recent change causing testsuite regression and poor code on mcore-elf Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This change: commit 6decda1a35be5764101987c210b5693a0d914e58 Author: Richard Biener Date: Thu Oct 12 11:34:57 2023 +0200 tree-optimization/111779 - Handle some BIT_FIELD_REFs in SRA The following handles byte-aligned, power-of-two and byte-multiple sized BIT_FIELD_REF reads in SRA. In particular this should cover BIT_FIELD_REFs created by optimize_bit_field_compare. For gcc.dg/tree-ssa/ssa-dse-26.c we now SRA the BIT_FIELD_REF appearing there leading to more DSE, fully eliding the aggregates. This results in the same false positive -Wuninitialized as the older attempt to remove the folding from optimize_bit_field_compare, fixed by initializing part of the aggregate unconditionally. PR tree-optimization/111779 gcc/ * tree-sra.cc (sra_handled_bf_read_p): New function. (build_access_from_expr_1): Handle some BIT_FIELD_REFs. (sra_modify_expr): Likewise. (make_fancy_name_1): Skip over BIT_FIELD_REF. gcc/fortran/ * trans-expr.cc (gfc_trans_assignment_1): Initialize lhs_caf_attr and rhs_caf_attr codimension flag to avoid false positive -Wuninitialized. gcc/testsuite/ * gcc.dg/tree-ssa/ssa-dse-26.c: Adjust for more DSE. * gcc.dg/vect/vect-pr111779.c: New testcase. Causes execute/20040709-2.c to fail on mcore-elf at -O2. It also results in what appears to be significantly poorer code generation. Note I haven't managed to get mcore-elf-gdb to work, so debugging is, umm, painful. And I wouldn't put a lot of faith in the simulator correctness. 
I have simplified the test to this:

extern void abort (void);
extern void exit (int);

unsigned int myrnd (void)
{
  static unsigned int s = 1388815473;
  s *= 1103515245;
  s += 12345;
  return (s / 65536) % 2048;
}

struct __attribute__((packed)) K { unsigned int k:6, l:1, j:10, i:15; };
struct K sK;

unsigned int fn1K (unsigned int x)
{
  struct K y = sK;
  y.k += x;
  return y.k;
}

void testK (void)
{
  int i;
  unsigned int mask, v, a, r;
  struct K x;
  char *p = (char *) &sK;
  for (i = 0; i < sizeof (sK); ++i)
    *p++ = myrnd ();
  v = myrnd ();
  a = myrnd ();
  sK.k = v;
  x = sK;
  r = fn1K (a);
  if (x.j != sK.j || x.l != sK.l)
    abort ();
}

int main (void)
{
  testK ();
  exit (0);
}

Which should at least make the poor code gen obvious. I don't expect to have time to debug this further anytime in the near future.
[Bug middle-end/111777] [14 regression] build breaks after r14-4558-g400efdddf3d849
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111777 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED --- Comment #11 from Jeffrey A. Law --- Fixed by Mary's patch on the trunk.
[Bug middle-end/111777] [14 regression] build breaks after r14-4558-g400efdddf3d849
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111777 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-10-12 Ever confirmed|0 |1 --- Comment #6 from Jeffrey A. Law --- I would hazard a guess that Mary doesn't have a bugzilla account. I'll drop her a direct email.
[Bug target/93062] Failed to generate indirect branch for long branches on riscv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93062 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|NEW |RESOLVED CC||law at gcc dot gnu.org --- Comment #3 from Jeffrey A. Law --- This should be fixed on the trunk now. No plans to backport to the release branches.
[Bug bootstrap/111664] [14 regression] Fails to build with mawk (error in gcc/opt-read.awk) after r14-4354-ge4a4b8e983bac8
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111664 Jeffrey A. Law changed: What|Removed |Added Status|ASSIGNED|RESOLVED CC||law at gcc dot gnu.org Resolution|--- |FIXED --- Comment #6 from Jeffrey A. Law --- Fixed on the trunk.
[Bug rtl-optimization/111384] missed optimization: GCC adds extra any extend when storing subreg#0 multiple times
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111384 Jeffrey A. Law changed: What|Removed |Added Last reconfirmed||2023-10-07 Ever confirmed|0 |1 Status|UNCONFIRMED |NEW --- Comment #4 from Jeffrey A. Law --- So this is something we've been pondering over in rv64 land. Joern has an extension to DCE which tracks subobjects in an attempt to determine if bits set by sign/zero extensions are never read. If they aren't read, then the extension can be eliminated.
[Bug target/109414] RISC-V: unnecessary sext.w in rv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109414 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- These code generation inefficiencies have been fixed. I didn't bisect, but I would hazard a guess it was Jivan's work on exposing the widening nature of the 32-bit operations and extracting the result via a promoted subreg. ie, for the first example we now generate this during expand:

(insn 2 5 3 2 (set (reg/v:DI 136 [ x ]) (reg:DI 10 a0 [ x ])) "j.c":1:26 -1 (nil))
(insn 3 2 4 2 (set (reg/v:DI 137 [ n ]) (reg:DI 11 a1 [ n ])) "j.c":1:26 -1 (nil))
(note 4 3 7 2 NOTE_INSN_FUNCTION_BEG)
(insn 7 4 8 2 (set (reg:DI 140) (sign_extend:DI (plus:SI (subreg/s/u:SI (reg/v:DI 136 [ x ]) 0) (const_int 1 [0x1])))) "j.c":2:12 -1 (nil))
(insn 8 7 9 2 (set (reg:SI 139) (subreg/s/u:SI (reg:DI 140) 0)) "j.c":2:12 -1 (expr_list:REG_EQUAL (plus:SI (subreg/s/u:SI (reg/v:DI 136 [ x ]) 0) (const_int 1 [0x1])) (nil)))
(insn 9 8 10 2 (set (reg:DI 141) (xor:DI (reg/v:DI 137 [ n ]) (subreg:DI (reg:SI 139) 0))) "j.c":2:17 -1 (nil))
(insn 10 9 11 2 (set (reg:DI 142) (sign_extend:DI (subreg:SI (reg:DI 141) 0))) "j.c":2:17 discrim 1 -1 (nil))
(insn 11 10 15 2 (set (reg:DI 135 [ ]) (reg:DI 142)) "j.c":2:17 discrim 1 -1 (nil))
(insn 15 11 16 2 (set (reg/i:DI 10 a0) (reg:DI 135 [ ])) "j.c":3:1 -1 (nil))
(insn 16 15 0 2 (use (reg/i:DI 10 a0)) "j.c":3:1 -1 (nil))

Which is much easier for combine to analyze and prove the trailing sign extension is unnecessary.
[Bug target/106271] Bootstrap on RISC-V on Ubuntu 22.04 LTS: bits/libc-header-start.h: No such file or directory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106271 Jeffrey A. Law changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #8 from Jeffrey A. Law --- I wasn't aware of this BZ when I made the commit referenced in c#6. But yes, the whole point of that commit was to fix this problem.
[Bug target/64215] -Os misses an opportunity to merge two ret instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64215 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- Andrew, the patch you referenced doesn't help this case because we don't have an unconditional jump to a return-only block. To optimize this case we'd have to detect that we have a return-only block that is immediately preceded by another return block after bbro. ie:

(note 48 23 59 6 [bb 6] NOTE_INSN_BASIC_BLOCK)
(insn 59 48 49 6 (use (reg/i:SI 10 a0)) -1 (nil))
(jump_insn 49 59 37 6 (simple_return) 346 {simple_return} (nil) -> simple_return)
;; lr out 1 [ra] 2 [sp] 10 [a0]
;; live out 1 [ra] 2 [sp] 10 [a0]
;; succ: EXIT [always] count:52738306 (estimated locally, freq 0.4591)
;; basic block 7, loop depth 0, count 6317494 (estimated locally, freq 0.0550), maybe hot
;; prev block 6, next block 1, flags: (REACHABLE, RTL)
;; pred: 2 [5.5% (guessed)] count:6317494 (estimated locally, freq 0.0550) (CAN_FALLTHRU)
;; bb 7 artificial_defs: { }
;; bb 7 artificial_uses: { u-1(2){ }}
;; lr in 1 [ra] 2 [sp] 10 [a0]
;; lr use 2 [sp] 10 [a0]
;; lr def
;; live in 1 [ra] 2 [sp] 10 [a0]
;; live gen
;; live kill
(code_label 37 49 36 7 4 (nil) [1 uses])
(note 36 37 60 7 [bb 7] NOTE_INSN_BASIC_BLOCK)
(insn 60 36 51 7 (use (reg/i:SI 10 a0)) -1 (nil))
(jump_insn 51 60 41 7 (simple_return) 346 {simple_return} (nil) -> simple_return)
[Bug target/111670] H8/300 SX uses incorrect code sequences
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111670 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P4 Target||h8300
[Bug target/111670] New: H8/300 SX uses incorrect code sequences
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111670 Bug ID: 111670 Summary: H8/300 SX uses incorrect code sequences Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- The H8/SX port can create sequences like

(set (mem (autoinc (reg sp))) (reg sp))

where autoinc is a PRE_DECREMENT or PRE_INCREMENT addressing mode, which is invalid RTL. I believe this is the root cause of the following H8/SX failures in the testsuite:

h8300-sim/-msx/-mint32: gcc.c-torture/execute/920501-6.c -O1 execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/920501-6.c -Os execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/pr20466-1.c -O1 execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/pr20466-1.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/pr20466-1.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/pr39339.c -O2 -flto -fno-use-linker-plugin -flto-partition=none execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/ssad-run.c -O1 execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/ssad-run.c -Os execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/usad-run.c -O1 execution test
h8300-sim/-msx/-mint32: gcc.c-torture/execute/usad-run.c -Os execution test

I suspect we need to break the "Q" constraint into two variants: one which allows autoinc addressing modes and one which does not. For movsi/movhi we would use the version which does not allow autoinc addressing modes and instead use the Z0/ZA approach like the other H8 variants are using. I'm not currently working on this.
[Bug rtl-optimization/111467] REE failing to eliminate redundant extension due to multiple reaching def(s)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111467 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #2 from Jeffrey A. Law --- I thought REE handled multiple reaching definitions. So this is a bit of a surprise.
[Bug target/82666] [11/12/13/14 regression]: sum += (x>128 ? x : 0) puts the cmov on the critical path (at -O2)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82666 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #14 from Jeffrey A. Law --- A better approach might be to try and create COND_EXPRs for the conditional move in the gimple code. The biggest problem I see with that is the gimple->rtl converters aren't great at creating efficient code on targets without conditional moves. Meaning that we could well end up improving x86, but making several other targets worse. I know this because I was recently poking at a similar problem. We expressed a conditional move of 0, C as a multiply of a boolean by C in gimple. It really should just have been a COND_EXPR, but when we generate that form targets without good conditional move expanders will end up recreating branchy code :(
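To make the two forms concrete, here is a small source-level sketch of the equivalence being described. The function names are made up for this illustration, and the threshold 128 is simply borrowed from the bug title; the actual representation under discussion is in gimple, not C.

```c
/* Branch-free form: a boolean (0 or 1) multiplied by C.  */
unsigned int select_mul (unsigned int x, unsigned int c)
{
  return (unsigned int) (x > 128) * c;
}

/* Conditional form, which is what a COND_EXPR would express directly.  */
unsigned int select_cond (unsigned int x, unsigned int c)
{
  return x > 128 ? c : 0;
}
```

Both compute the same value for every input; the difference is only in how well a given target's expanders handle each form.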
[Bug driver/77576] gcc-ar doesn't work if all options are read from file
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77576 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #9 from Jeffrey A. Law --- Fixed on the trunk.
[Bug target/110748] RISC-V: optimize store of DF 0.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110748 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- I'd bet it's const_0_operand not allowing CONST_DOUBLE. The question is what unintended side effects we'd have if we allowed CONST_DOUBLE 0.0 in const_0_operand.
[Bug tree-optimization/105832] [13/14 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832 --- Comment #11 from Jeffrey A. Law --- Looks viable to me. Are you thinking match.pd?
[Bug target/110559] Bad mask_load/mask_store codegen of RVV
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110559 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-07-07 Ever confirmed|0 |1 --- Comment #2 from Jeffrey A. Law --- Yea, we definitely want pressure sensitive scheduling. While it's more valuable for scalar cases, it can help with some vector as well. Also note there's two variants of the pressure sensitive scheduler support. I think we use the newer one which is supposed to be better, but I don't think we've really evaluated one over the other. Setting issue rate to 1 for the first pass scheduler is a bit of a hack, though not terribly uncommon. It's something I've wanted to go back and review, so fully support you digging into that as well.
[Bug tree-optimization/110460] New: [14 Regression] ft32 ICE on 931110-1.c with new TYPE_PRECISION checking
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110460 Bug ID: 110460 Summary: [14 Regression] ft32 ICE on 931110-1.c with new TYPE_PRECISION checking Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- commit fe48f2651334bc4d96b6df6b2bb6b29fcb732a83 Author: Richard Biener Date: Fri Jun 9 09:31:14 2023 +0200 Prevent TYPE_PRECISION on VECTOR_TYPEs The following makes sure that using TYPE_PRECISION on VECTOR_TYPE ICEs when tree checking is enabled. This should avoid wrong-code in cases like PR110182 and instead ICE. It also introduces a TYPE_PRECISION_RAW accessor and adjusts places I found that are eligible to use that. * tree.h (TYPE_PRECISION): Check for non-VECTOR_TYPE. (TYPE_PRECISION_RAW): Provide raw access to the precision field. * tree.cc (verify_type_variant): Compare TYPE_PRECISION_RAW. (gimple_canonical_types_compatible_p): Likewise. * tree-streamer-out.cc (pack_ts_type_common_value_fields): Stream TYPE_PRECISION_RAW. * tree-streamer-in.cc (unpack_ts_type_common_value_fields): Likewise. * lto-streamer-out.cc (hash_tree): Hash TYPE_PRECISION_RAW. gcc/lto/ * lto-common.cc (compare_tree_sccs_1): Use TYPE_PRECISION_RAW. 
One example on ft32-elf: Tests that now fail, but worked before (13 tests):

ft32-sim: gcc.c-torture/execute/931110-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors)
ft32-sim: gcc.c-torture/execute/931110-1.c -O3 -g (test for excess errors)
ft32-sim: gcc.dg/pr108095.c (test for excess errors)

And if you dig into the 931110-1.c failure you find:

ft32-sim: gcc.c-torture/execute/931110-1.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (internal compiler error: tree check: expected none of vector_type, have vector_type in type_has_mode_precision_p, at tree.h:6644)
ft32-sim: gcc.c-torture/execute/931110-1.c -O3 -g (internal compiler error: tree check: expected none of vector_type, have vector_type in type_has_mode_precision_p, at tree.h:6644)

It looks like SCALAR_DEST in vectorizable_operation is actually a vector type -- meaning that STMT was already vectorized. This is the patch I'm testing. There are other failures that don't seem to be fixed by this patch. Anyway, the whole point of the change is to find these lurking bugs.

diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index d642d3c257f..3dd8a284577 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -6481,6 +6481,10 @@ vectorizable_operation (vec_info *vinfo,
   scalar_dest = gimple_assign_lhs (stmt);
   vectype_out = STMT_VINFO_VECTYPE (stmt_info);
 
+  /* STMT may have already been vectorized.  */
+  if (VECTOR_TYPE_P (TREE_TYPE (scalar_dest)))
+    return false;
+
   /* Most operations cannot handle bit-precision types without extra
      truncations.  */
   bool mask_op_p = VECTOR_BOOLEAN_TYPE_P (vectype_out);
[Bug rtl-optimization/110423] Redundant constants not getting eliminated on RISCV.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110423 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #2 from Jeffrey A. Law --- So there is another broad approach we can take here. As Vineet mentioned, this isn't really a job for PRE/LCM as those are formulated around a requirement that they never insert an expression evaluation in any path that did not have an evaluation before. ie no speculative constant loads. We could potentially relax that condition. I'm not sure we'd formulate it as a PRE/LCM problem, but it gives you a sense of how we could tackle this. The difficulty would be in the heuristics for when to apply this transformation since it will make some codes slower and may increase register pressure. This is derived heavily from Click's work in the 90s. This would happen in gimple most likely, though I guess one could do it in RTL if they have a high pain threshold. The simplest way to think about the placement algorithm is to find the blocks where all the uses of any given constant C occur. A trivially correct placement of the load of that constant would be the entry block as it must dominate every block in that set. Of course that would make the placement quite speculative and lengthen live ranges. That's usually referred to as an early placement. Next find the latest placement for the constant load that covers all the uses. That will be the lowest common ancestor in the dominator tree of the set of blocks that use the constant. If you were to imagine a path through the dominator tree starting at the early placement (entry) and ending at the lowest common ancestor, any block on that path could be selected for generating the constant load and would cover every use with that single load. Within the set of blocks on that path, find the set with the lowest loop nesting, then within that reduced set find those with the deepest control nesting (or lowest estimated frequency counts).
There may be more than one block in that final set. Any are valid and "reasonable" choices. Click's paper is much more general, but the same concepts apply. His paper doesn't cover anything like bifurcating the graph (thus allowing multiple constant loads in an effort to reduce undesired speculation or register allocation conflicts). We might be able to get away with this precisely because these are constant loads and thus subject to rematerialization later if register pressure is high. https://courses.cs.washington.edu/courses/cse501/06wi/reading/click-pldi95.pdf
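A rough sketch of that placement search, under stated assumptions: the dominator tree is given as an array of immediate-dominator indices with block 0 as the entry, each block carries a loop depth and an estimated frequency, and ties are broken in favor of the latest block on the path (my choice, to keep live ranges short). `struct block`, `dom_lca` and `place_constant_load` are hypothetical names for this illustration, not GCC interfaces.

```c
/* Hypothetical per-block data; block 0 is the entry block.  */
struct block {
  int idom;        /* immediate dominator, -1 for the entry block */
  int loop_depth;  /* loop nesting depth */
  int freq;        /* estimated execution frequency */
};

/* Depth of block B in the dominator tree.  */
static int dom_depth (const struct block *bbs, int b)
{
  int d = 0;
  while (bbs[b].idom >= 0)
    {
      b = bbs[b].idom;
      d++;
    }
  return d;
}

/* Lowest common ancestor of A and B in the dominator tree.  */
static int dom_lca (const struct block *bbs, int a, int b)
{
  int da = dom_depth (bbs, a), db = dom_depth (bbs, b);
  while (da > db) { a = bbs[a].idom; da--; }
  while (db > da) { b = bbs[b].idom; db--; }
  while (a != b) { a = bbs[a].idom; b = bbs[b].idom; }
  return a;
}

/* Pick a block for a single load of a constant used in USES[0..N_USES-1]
   (N_USES must be at least 1).  */
int place_constant_load (const struct block *bbs, const int *uses, int n_uses)
{
  /* Latest placement: LCA of all use blocks in the dominator tree.  */
  int late = uses[0];
  for (int i = 1; i < n_uses; i++)
    late = dom_lca (bbs, late, uses[i]);

  /* Walk from the latest placement up to the entry (the early
     placement).  Prefer the lowest loop depth, then the lowest
     frequency; on a full tie keep the later block.  */
  int best = late;
  for (int b = bbs[late].idom; b >= 0; b = bbs[b].idom)
    if (bbs[b].loop_depth < bbs[best].loop_depth
        || (bbs[b].loop_depth == bbs[best].loop_depth
            && bbs[b].freq < bbs[best].freq))
      best = b;
  return best;
}
```

For a constant used in two blocks inside a loop, this walks up from the loop header and settles on the block just outside the loop, which is the hoisting you would expect. It does not attempt the bifurcation mentioned above.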
[Bug debug/110308] [14 Regression] ICE on audiofile-0.3.6: RTL: vartrack: Segmentation fault in mode_to_precision(machine_mode)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110308 --- Comment #9 from Jeffrey A. Law --- Right. It's fairly common with fold-mem-offsets to end up rewriting the address arithmetic such that we'll have an sp->gpr copy of some sort in the IL. We'd really like to be able to cprop that copy away. After Manolis's fixes to that code it seemed independently commit-able so I acked it while we iterate on the fold-mem-offsets work. It's tickled a few problems, but nothing that seems unmanageable right now.
[Bug target/110201] RISC-V: __builtin_riscv_sm4ks and __builtin_riscv_sm4ed produce invalid assembly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110201 --- Comment #4 from Jeffrey A. Law --- Yea, the tests aren't great. They'll be better shortly. They'll test non-constant arguments and out-of-range constants, expecting a suitable diagnostic. They'll also test the extrema of valid constants.
[Bug target/110201] RISC-V: __builtin_riscv_sm4ks and __builtin_riscv_sm4ed produce invalid assembly
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110201 Jeffrey A. Law changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-06-19 Status|UNCONFIRMED |NEW --- Comment #1 from Jeffrey A. Law --- It looks like some of the aes patterns have the same problem. It may just have been Liao not understanding the difference between an operand constraint and an operand predicate.
[Bug target/110264] internal compiler error: riscv_vector::vector_insn_info::get_avl_reg_rtx
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110264 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-06-17 Ever confirmed|0 |1 CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- Note that Pan can cherry pick it into gcc-13. Typically folks wait a week or so after the patch is on the trunk to see if there's any fallout. Given that I don't expect gcc-13.2 until late summer, we've certainly got time.
[Bug middle-end/79173] add-with-carry and subtract-with-borrow support (x86_64 and others)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79173 --- Comment #23 from Jeffrey A. Law --- risc-v doesn't have any special instructions to implement add-with-carry or subtract-with-borrow. Depending on who you talk to, it's either a feature or a mis-design.
[Bug tree-optimization/110218] sink pass heuristic not working in practice
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110218 --- Comment #2 from Jeffrey A. Law --- So what I think was happening was that we would sink past a bunch of conditionals that were never going to be true thinking that we were moving to a deeper control nest. So the idea was to use the frequency information to avoid movements that weren't likely to improve anything. I don't remember how I selected the param's value though. I've got no objection to adjusting how this works.
[Bug rtl-optimization/110163] [14 Regression] Comparing against a constant string is inefficient on some targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110163 --- Comment #2 from Jeffrey A. Law --- It is a regression for rv64. So probably P4 would be most appropriate.
[Bug rtl-optimization/110163] New: [14 Regression] Comparing against a constant string is inefficient on some targets
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110163 Bug ID: 110163 Summary: [14 Regression] Comparing against a constant string is inefficient on some targets Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- Comparing against a constant string is expanded by inline_string_cmp and on some targets the generated code can be inefficient. This can be seen in spec2017's omnetpp benchmark, particularly when the inline string comparison limits are increased. The problem is the expansion code arranges to do all the arithmetic and tests in SImode. On RV64 this introduces a sign extension for each test due to how RV64 expresses 32bit ops. It would be better to do all the computations in word_mode, then convert the final result to SImode, at least for RV64 and likely for other targets. I experimented with starting to build out cost checks to determine what mode to use for the internal computations. That ran afoul of x86 where the cost of a byte load is different than the cost of an extended byte load, even though they use the exact same instruction. There's also a need to cost out the computations, test & branch in the different modes as well once the x86 hurdle is behind us. I've set work on this aside for now. But the discussion can be found in these two threads: https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620601.html https://gcc.gnu.org/pipermail/gcc-patches/2023-June/620577.html #include <string.h> int foo (char *x) { return strcmp (x, "lowerLayout"); } Compiled with -O2 --param builtin-string-cmp-inline-length=100 on rv64 should show the issue.
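As a rough source-level model of the two strategies described in the report, the sketch below compares bytes once with every intermediate value held in an `int` and once with the arithmetic done in `long`, narrowing only the final result. This is a hand-written illustration and not what inline_string_cmp emits; `cmp_narrow` and `cmp_wide` are hypothetical names, `long` is only a stand-in for word_mode on an LP64 target, and whether a compiler really emits extra sign extensions for the first form depends on the target and optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Every per-byte difference is computed and tested as a 32-bit int,
   loosely mirroring an expansion that works in SImode throughout.
   LEN is the length of LIT without its terminating NUL.  */
static int
cmp_narrow (const char *x, const char *lit, size_t len)
{
  /* i <= len so that the terminating NUL of LIT is compared as well.  */
  for (size_t i = 0; i <= len; i++)
    {
      int diff = (unsigned char) x[i] - (unsigned char) lit[i];
      if (diff != 0)
        return diff;
    }
  return 0;
}

/* The same comparison with the arithmetic done in long and a single
   narrowing conversion at the very end.  */
static int
cmp_wide (const char *x, const char *lit, size_t len)
{
  long diff = 0;
  for (size_t i = 0; i <= len; i++)
    {
      diff = (long) (unsigned char) x[i] - (long) (unsigned char) lit[i];
      if (diff != 0)
        break;
    }
  /* The difference of two bytes always fits in an int.  */
  return (int) diff;
}
```

Both functions return the same value for any input. The only intended difference is the width in which the intermediate differences are computed.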
[Bug target/110109] RISC-V: ICE when build the Intrinsic code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110109 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED CC||law at gcc dot gnu.org Resolution|--- |FIXED --- Comment #4 from Jeffrey A. Law --- Should be fixed on the trunk.
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #10 from Jeffrey A. Law --- Created attachment 55218 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55218&action=edit (Incomplete) Patch
[Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041 --- Comment #4 from Jeffrey A. Law --- Patch was for a different problem. Sorry.
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #9 from Jeffrey A. Law --- Weird, I don't see the attachment either. I'll extract & upload it again. WRT costing. fwprop and combine will both query the target rtx costs and will reject when the target costing model indicates the change isn't actually profitable. As you'd noted before, combine will internally transform a sign/zero extension into a pair of shifts. The whole point of that internal canonicalization is to expose cases where the shifts can combine with other nearby operations. So there's no significant risk to detecting and creating the extension form earlier.
[Bug tree-optimization/108041] ivopts results in extra instruction in simple loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108041 --- Comment #3 from Jeffrey A. Law --- Created attachment 55185 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55185&action=edit (Incomplete) Patch
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #7 from Jeffrey A. Law --- Attached is what I cobbled together. It doesn't use magic numbers. But it doesn't yet handle zero extensions in the simplify-rtx code. But I think it shows the overall direction fairly well.
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED |RESOLVED --- Comment #12 from Jeffrey A. Law --- Should be fixed with Raphael's patch on the trunk.
[Bug tree-optimization/109848] New: [14 Regression] Recent change causing testsuite ICE on csky port
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109848 Bug ID: 109848 Summary: [14 Regression] Recent change causing testsuite ICE on csky port Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This patch: commit cc0e22b3f25d4b2a326322bce711179c02377e6c Author: Richard Biener Date: Fri May 12 13:43:27 2023 +0200 tree-optimization/64731 - extend store-from CTOR lowering to TARGET_MEM_REF The following also covers TARGET_MEM_REF when decomposing stores from CTORs to supported elementwise operations. This avoids spilling and cleans up after vector lowering which doesn't touch loads or stores. It also mimics what we already do for loads. PR tree-optimization/64731 * tree-ssa-forwprop.cc (pass_forwprop::execute): Also handle TARGET_MEM_REF destinations of stores from vector CTORs. * gcc.target/i386/pr64731.c: New testcase. Is causing the csky port to abort in forwprop with a verify_ssa failure FAIL: gcc.dg/torture/pr52407.c -O2 (internal compiler error: verify_ssa failed) FAIL: gcc.dg/torture/pr52407.c -O2 (test for excess errors) Excess errors: /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/torture/pr52407.c:22:1: error: definition in block 3 follows the use for SSA_NAME: _38 in statement: _24 = [(vl_t *)_38]; during GIMPLE pass: forwprop /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/torture/pr52407.c:22:1: internal compiler error: verify_ssa failed 0x11a93bf verify_ssa(bool, bool) /home/jlaw/test/gcc/gcc/tree-ssa.cc:1203 0xe5f8a5 execute_function_todo /home/jlaw/test/gcc/gcc/passes.cc:2105 0xe5e4de do_per_function /home/jlaw/test/gcc/gcc/passes.cc:1694 0xe5fa4e execute_todo /home/jlaw/test/gcc/gcc/passes.cc:2152 The test is gcc.dg/torture/pr52407 and the failure can be seen with just a cross compiler.
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #6 from Jeffrey A. Law --- I would still rather not introduce special cases for SUBREGs if we can avoid it. I think the question remains whether or not patching simplify-rtx's canonicalize_shift is sufficient to fix this problem (perhaps with the adjustment to fwprop as well). If they are, then they would be much preferred over the original patch which special cased SUBREGs.
[Bug target/109777] [14 regression] Compare-debug failure after recent changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109777 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P4 --- Comment #4 from Jeffrey A. Law --- If it's inside the bfin bundling code, let's just mark it as a p4 and we can chase it down whenever it's convenient. My primary motivation is to catch generic issues. A target specific issue on a barely used target just isn't that interesting IMHO.
[Bug testsuite/109776] [14 Regression] pr81192 fails on some targets after recent propagator changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109776 --- Comment #7 from Jeffrey A. Law --- Thanks. That took care of the xstormy16 issues.
[Bug tree-optimization/109777] New: [14 regression] Compare-debug failure after recent changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109777 Bug ID: 109777 Summary: [14 regression] Compare-debug failure after recent changes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This change: commit 21e2ef2dc25de318de29ec32d5390350c6717c6a (refs/bisect/bad) Author: Andrew Pinski Date: Tue May 2 00:10:46 2023 -0700 Move substitute_and_fold over to use simple_dce_from_worklist While looking into a different issue, I noticed that it would take until the second forwprop pass to do some forward proping and it was because the ssa name was used more than once but the second statement was "dead" and we don't remove that until much later. So this uses simple_dce_from_worklist instead of manually removing of the known unused statements instead. Propagate engine does not do a cleanupcfg afterwards either but manually cleans up possible EH edges so simple_dce_from_worklist needs to communicate that back to the propagate engine. Some testcases needed to be updated/changed even because of better optimization. gcc.dg/pr81192.c even had to be changed to be using the gimple FE so it would be less fragile in the future too. gcc.dg/tree-ssa/pr98737-1.c was failing because __atomic_fetch_ was being matched but in those cases, the result was not being used so both __atomic_fetch_ and __atomic_x_and_fetch_ are valid choices and would not make a code generation difference. evrp7.c, evrp8.c, vrp35.c, vrp36.c: just needed a slightly change as the removal message is different slightly. kernels-alias-8.c: ccp1 is able to remove an unused load which causes ealias to have one less load to analysis so update the expected scan #. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. gcc/ChangeLog: PR tree-optimization/109691 * tree-ssa-dce.cc (simple_dce_from_worklist): Add need_eh_cleanup argument. 
If the removed statement can throw, have need_eh_cleanup include the bb of that statement. * tree-ssa-dce.h (simple_dce_from_worklist): Update declaration. * tree-ssa-propagate.cc (struct prop_stats_d): Remove num_dce. (substitute_and_fold_dom_walker::substitute_and_fold_dom_walker): Initialize dceworklist instead of stmts_to_remove. (substitute_and_fold_dom_walker::~substitute_and_fold_dom_walker): Destore dceworklist instead of stmts_to_remove. (substitute_and_fold_dom_walker::before_dom_children): Set dceworklist instead of adding to stmts_to_remove. (substitute_and_fold_engine::substitute_and_fold): Call simple_dce_from_worklist instead of poping from the list. Don't update the stat on removal statements. gcc/testsuite/ChangeLog: * gcc.dg/tree-ssa/evrp7.c: Update for output change. * gcc.dg/tree-ssa/evrp8.c: Likewise. * gcc.dg/tree-ssa/vrp35.c: Likewise. * gcc.dg/tree-ssa/vrp36.c: Likewise. * gcc.dg/tree-ssa/pr98737-1.c: Update scan-tree-dump-not to check for assignment too instead of just a call. * c-c++-common/goacc/kernels-alias-8.c: Update test for removal of load. * gcc.dg/pr81192.c: Rewrite testcase in gimple based test. Is triggering a compare-debug failure on the bfin-elf port: bfin-sim: gcc.dg/pr44023.c (test for excess errors) If you dig into the log file: xgcc: error: /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/pr44023.c: '-fcompare-debug' failure (length)
[Bug testsuite/109776] New: [14 Regression] pr81192 fails on some targets after recent propagator changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109776 Bug ID: 109776 Summary: [14 Regression] pr81192 fails on some targets after recent propagator changes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: testsuite Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- pr81192 is failing on some targets (xstormy16-elf for example) after this change: commit 21e2ef2dc25de318de29ec32d5390350c6717c6a Author: Andrew Pinski Date: Tue May 2 00:10:46 2023 -0700 Move substitute_and_fold over to use simple_dce_from_worklist While looking into a different issue, I noticed that it would take until the second forwprop pass to do some forward proping and it was because the ssa name was used more than once but the second statement was "dead" and we don't remove that until much later. So this uses simple_dce_from_worklist instead of manually removing of the known unused statements instead. Propagate engine does not do a cleanupcfg afterwards either but manually cleans up possible EH edges so simple_dce_from_worklist needs to communicate that back to the propagate engine. Some testcases needed to be updated/changed even because of better optimization. gcc.dg/pr81192.c even had to be changed to be using the gimple FE so it would be less fragile in the future too. gcc.dg/tree-ssa/pr98737-1.c was failing because __atomic_fetch_ was being matched but in those cases, the result was not being used so both __atomic_fetch_ and __atomic_x_and_fetch_ are valid choices and would not make a code generation difference. evrp7.c, evrp8.c, vrp35.c, vrp36.c: just needed a slightly change as the removal message is different slightly. kernels-alias-8.c: ccp1 is able to remove an unused load which causes ealias to have one less load to analysis so update the expected scan #. OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions. 
gcc/ChangeLog: PR tree-optimization/109691 * tree-ssa-dce.cc (simple_dce_from_worklist): Add need_eh_cleanup argument. If the removed statement can throw, have need_eh_cleanup include the bb of that statement. * tree-ssa-dce.h (simple_dce_from_worklist): Update declaration. * tree-ssa-propagate.cc (struct prop_stats_d): Remove num_dce. (substitute_and_fold_dom_walker::substitute_and_fold_dom_walker): Initialize dceworklist instead of stmts_to_remove. (substitute_and_fold_dom_walker::~substitute_and_fold_dom_walker): Destore dceworklist instead of stmts_to_remove. (substitute_and_fold_dom_walker::before_dom_children): Set dceworklist instead of adding to stmts_to_remove. (substitute_and_fold_engine::substitute_and_fold): Call simple_dce_from_worklist instead of poping from the list. Don't update the stat on removal statements. [ ... ] The compiler is complaining with this message: /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/pr81192.c: In function 'fn2': /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/pr81192.c:50:1: error: type mismatch in binary expression long int long int int iftmp2_8_14 = j_6(D) + 1; /home/jlaw/test/gcc/gcc/testsuite/gcc.dg/pr81192.c:50:1: error: mismatching comparison operand types long int int if (c0_1_13 != 0) compiler exited with status 1 I suspect the testsuite needs further twiddling to work on 16bit int targets.
[Bug tree-optimization/109721] New: [14 Regression] predcom-2 fails after recent changes
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109721 Bug ID: 109721 Summary: [14 Regression] predcom-2 fails after recent changes Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- arc-elf target. FAIL: gcc.dg/tree-ssa/predcom-2.c scan-tree-dump-times pcom "Unrolling 2 times." 2 Bisection points to: f385252b2336a4a57a30fddf82e558c73bcc85cc is the first bad commit commit f385252b2336a4a57a30fddf82e558c73bcc85cc Author: Richard Biener Date: Tue May 2 10:34:48 2023 +0200 tree-optimization/109672 - properly check emulated plus during vect The following refactors the check for emulated vector support for the cases of plus, minus and negate. In the PR we end up with a SImode plus, supported by the target but emulated and in this context fail to verify we are dealing with exactly word_mode. PR tree-optimization/109672 * tree-vect-stmts.cc (vectorizable_operation): For plus, minus and negate always check the vector mode is word mode. It should be visible with a cross-compiler. No need for a full toolchain stack.
[Bug tree-optimization/109672] [14 regression] many ICEs after r14-323-g977a43f5ba778b
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109672 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2023-04-29 CC||law at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jeffrey A. Law --- Similar failures on arc-elf: arc-sim: gcc.c-torture/execute/pr36691.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) arc-sim: gcc.c-torture/execute/pr36691.c -O3 -g (test for excess errors) arc-sim: gcc.dg/pr53749.c (test for excess errors) arc-sim: gcc.dg/pr83480.c (test for excess errors) arc-sim: gcc.dg/torture/pr98117.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) arc-sim: gcc.dg/torture/pr98117.c -O3 -fomit-frame-pointer -funroll-loops -fpeel-loops -ftracer -finline-functions (test for excess errors) arc-sim: gcc.dg/torture/pr98117.c -O3 -g (test for excess errors) arc-sim: gcc.dg/torture/pr98117.c -O3 -g (test for excess errors) If you dig inside, they're tripping the new checking for conversions too.
[Bug testsuite/109549] [14 Regression] Conditional move regressions after r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 Jeffrey A. Law changed: What|Removed |Added Target|x86_64-*-* |s390 Summary|[14 Regression] cmov6.c test fail after commit r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a |[14 Regression] Conditional move regressions after r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a --- Comment #9 from Jeffrey A. Law --- WRT the s390 failures: gcc.target/s390/arch13/sel-1.c scan-assembler-times \tselgr(?:h|le)\t 1 gcc.target/s390/arch13/sel-1.c scan-assembler-times \tselr(?:h|le)\t 1 gcc.target/s390/ifcvt-one-insn-bool.c scan-assembler lochinh\t%r.?,1 gcc.target/s390/ifcvt-one-insn-char.c scan-assembler locrnh\t%r.?,%r.? gcc.target/s390/loc-1.c scan-assembler \tlochine\t%r2,-1 gcc.target/s390/loc-1.c scan-assembler \tlocrne\t%r2,%r4 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler eq:\n[^:]*\twfcdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochie\t%r2,1 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler ge:\n[^:]*\twfkdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochihe\t%r2,1 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler gt:\n[^:]*\twfkdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochih\t%r2,1 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler le:\n[^:]*\twfkdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochile\t%r2,1 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler lt:\n[^:]*\twfkdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochil\t%r2,1 gcc.target/s390/vector/vec-scalar-cmp-1.c scan-assembler ne:\n[^:]*\twfcdb\t%v[0-9]*,%v[0-9]*\n\t[^:]+\tlochine\t%r2,1 These are also cases where the s390 cost model says these particular if-conversion opportunities aren't profitable. Basically the backend has no costing model for (set (if_then_else ...)) so it recursively computes the cost of all the sub-rtxs which ultimately turns out to be higher than the cost of the branchy code. I'm not qualified to address this problem as I have no sense of s390 costing. 
I'm going to have the tester regenerate new baselines for s390, but I'm not going to actively try to fix this problem. Also note my testing was s390, not s390x which may be behaving differently.
[Bug target/106585] RISC-V: Mis-optimized code gen for zbs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106585 --- Comment #11 from Jeffrey A. Law --- Coming back to this. WRT extension elimination. I've been pondering if we want a late pass to do a bit of this that can't be handled by REE. So let's take the case of a Zbs instruction operating on a variable bit in RV64. I think we can probably agree that in the absence of additional information we can't do those kinds of bit manipulations because we could potentially change bit 31 and have the result escape as a parameter to a function call, return value or get used in a compare type instruction. So to make use of the Zbs instructions that manipulate a variable bit we could emit a suitable sign extension after each such operation. That, of course, has the potential to be expensive. But if we chase down the uses we can probably eliminate a lot of these extensions. Essentially we need to know if the extension reaches a comparison, one of the ABI escape points or a real 64bit operation. If not, then the extension is unnecessary and can be dropped. Ideally we'd find that a significant number of extensions could be dropped. We're not actively working on this, but it is something rattling around in the empty space between my ears.
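The "chase down the uses" idea can be modeled with a toy use list. Everything in this sketch is hypothetical and heavily simplified: `enum use_kind`, `struct use` and `extension_needed` are invented for illustration, GCC's RTL def-use chains look nothing like this, and treating 32-bit operations as safe while following copies transitively is an assumption of the sketch, not something the comment spells out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical classification of how a value is used.  */
enum use_kind
{
  USE_COMPARE,     /* feeds a comparison */
  USE_ABI_ESCAPE,  /* call argument or return value */
  USE_OP64,        /* a real 64-bit operation */
  USE_OP32,        /* assumed here to ignore the upper 32 bits */
  USE_COPY         /* plain copy; its own uses must be chased */
};

struct use
{
  enum use_kind kind;
  const struct use *copy_uses;  /* for USE_COPY: uses of the copied value */
  size_t n_copy_uses;           /* for USE_COPY: number of such uses */
};

/* Return true if a sign extension placed before USES[0..N-1] can reach
   a comparison, an ABI escape point or a real 64-bit operation.  */
static bool
extension_needed (const struct use *uses, size_t n)
{
  for (size_t i = 0; i < n; i++)
    switch (uses[i].kind)
      {
      case USE_COMPARE:
      case USE_ABI_ESCAPE:
      case USE_OP64:
        return true;
      case USE_COPY:
        /* Follow the copy and look at where the copied value goes.  */
        if (extension_needed (uses[i].copy_uses, uses[i].n_copy_uses))
          return true;
        break;
      case USE_OP32:
        break;
      }
  return false;
}
```

If `extension_needed` returns false for every use of an extended value, the extension could be dropped in this model.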
[Bug rtl-optimization/109592] Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 --- Comment #4 from Jeffrey A. Law --- If we need to handle subregs here, I would suggest something like this: if (SUBREG_P (XEXP (op0, 0)) && subreg_lowpart_p (op0) ... other tests ... That way we know we're extracting the low word of the subreg. But I'm not sure at all why we need to handle them in this code. I would expect generic optimizers to strip away the subregs in the result if they are extraneous. It's not clear why you check the size of the subreg modes. It seems like this optimization should work even for a paradoxical subreg (bitsize of inner will be smaller than bitsize of outer). In general if you only have one statement in an arm of an IF-THEN-ELSE, then it need not be inside a { } block. Rather than using magic numbers like INTVAL (op1) + 8 == 32, use mode information instead: INTVAL (op1) + GET_MODE_BITSIZE (QImode) == GET_MODE_BITSIZE (SImode) // code for QI->SI expansion. Then repeat for the other mode combinations. Note that we probably should go ahead and support QI->HI. While it doesn't happen for RISC-V, it could likely happen on other architectures. So you end up wanting to support QI->HI, QI->SI, QI->DI, HI->SI, HI->DI and SI->DI. I don't know if it happens in practice, so check first to see what we do for a zero extension variant of your original test. If we need to handle that too, it can be easily done by changing the shifts we recognize. Anyway, it definitely looks like you're on the right track. I would suggest further discussions happen on gcc-patches.
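The "use mode information instead of magic numbers" suggestion can be illustrated outside of GCC with a small standalone model. The `enum mode`, `mode_bitsize` and `shift_pair_inner_mode` names below are hypothetical stand-ins written for this sketch. In GCC itself the check would use GET_MODE_BITSIZE on real machine modes, as the comment shows.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for QImode, HImode, SImode and DImode.  */
enum mode { MODE_QI, MODE_HI, MODE_SI, MODE_DI, MODE_COUNT };

/* Width in bits of each model mode.  */
static int
mode_bitsize (enum mode m)
{
  static const int bits[MODE_COUNT] = { 8, 16, 32, 64 };
  return bits[m];
}

/* If a left shift followed by a right shift, both by SHIFT bits in mode
   OUTER, is an extension from some narrower mode, store that mode in
   *INNER and return true.  One loop covers QI->HI, QI->SI, QI->DI,
   HI->SI, HI->DI and SI->DI without hard-coding any shift count.  */
static bool
shift_pair_inner_mode (int shift, enum mode outer, enum mode *inner)
{
  for (int m = MODE_QI; m < (int) outer; m++)
    if (shift + mode_bitsize ((enum mode) m) == mode_bitsize (outer))
      {
        *inner = (enum mode) m;
        return true;
      }
  return false;
}
```

For example, a shift count of 24 in the 32-bit mode maps to the 8-bit mode, which is the QI->SI case from the comment.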
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 --- Comment #10 from Jeffrey A. Law --- The sign_extend later gets turned into zero_extend. Presumably because we know the value is never negative. That in and of itself wouldn't be a big deal as it should be easily recognizable using any_extend. But combine steps in and scrambles the RTL in various unhelpful ways.
[Bug rtl-optimization/109592] New: Failure to recognize shifts as sign/zero extension
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109592 Bug ID: 109592 Summary: Failure to recognize shifts as sign/zero extension Product: gcc Version: 13.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: rtl-optimization Assignee: unassigned at gcc dot gnu.org Reporter: law at gcc dot gnu.org Target Milestone: --- This is a trivial sign extension: int sextb32(int x) { return (x << 24) >> 24; } Yet on RV64 with ZBB enabled we get: sextb32: slliw a0,a0,24# 6 [c=4 l=4] ashlsi3 sraiw a0,a0,24# 13[c=8 l=4] *ashrsi3_extend ret # 21[c=0 l=4] simple_return We actually get a good form to optimize in simplify_binary_operation_1: > #0 simplify_context::simplify_binary_operation (this=0x7fffda68, > code=ASHIFTRT, mode=E_SImode, op0=0x7fffea11eb40, op1=0x7fffea009610) at > /home/jlaw/riscv-persist/ventana/gcc/gcc/simplify-rtx.cc:2558 > 2558 gcc_assert (GET_RTX_CLASS (code) != RTX_COMPARE); > (gdb) p code > $24 = ASHIFTRT > (gdb) p mode > $25 = E_SImode > (gdb) p debug_rtx (op0) > (ashift:SI (subreg/s/u:SI (reg/v:DI 74 [ x ]) 0) > (const_int 24 [0x18])) > $26 = void > (gdb) p debug_rtx (op1) > (const_int 24 [0x18]) > $27 = void So that's (ashiftrt (ashift (object) 24) 24), ie sign extension. I suspect if we fix simplify_binary_operation_1 then we'll see this get simplified by fwprop. I also suspect we could construct a zero extension variant.
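To make the equivalence concrete, here is a small sketch comparing the shift pair with a plain narrowing cast, plus the zero extension variant the report suspects could be constructed. The function names are made up for illustration. The left shift is done on an unsigned value to avoid undefined behavior, and the sketch relies on behavior that is implementation-defined in ISO C but which GCC documents: out-of-range conversions to signed types wrap modulo 2^N, and right shifts of negative values are arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Shift-pair form of an 8-bit to 32-bit sign extension.  */
static int32_t
sextb32_shifts (int32_t x)
{
  /* Shift left as unsigned, convert back, then shift right as signed.  */
  return (int32_t) ((uint32_t) x << 24) >> 24;
}

/* The same operation written as a narrowing conversion.  */
static int32_t
sextb32_cast (int32_t x)
{
  return (int32_t) (int8_t) x;
}

/* Zero extension variant: with logical shifts the pair keeps only the
   low 8 bits, the same as masking with 0xff.  */
static uint32_t
zextb32_shifts (uint32_t x)
{
  return (x << 24) >> 24;
}
```

On a GCC target both `sextb32_*` functions should agree for every input, which is why recognizing the shift pair as an extension loses nothing.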
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 --- Comment #8 from Jeffrey A. Law --- So coming back to this after a couple months, I'm confident the match.pd change is unnecessary and in fact wrong. So we definitely want to set that aside.
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 --- Comment #6 from Jeffrey A. Law --- Comment on attachment 54905 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54905 proposed patch So that's a subset of what we've done. We initially thought that was going to be enough to solve this class of problems. But it's actually deeper than just having a zero_extension variant of this pattern. I'll officially submit the zero_extension pattern and the match.pd bits. The other pattern we wrote is fugly and I'd like to look at it one more time.
[Bug tree-optimization/106888] [RISCV] Negative optimization that excess andi instructions are generated in gcc.dg/pr90838.c
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106888 --- Comment #4 from Jeffrey A. Law --- Vineet, we've got some bits here you might want to play with. I'm about to leave for the evening, but I'll put you in touch with Raphael tomorrow afternoon.
[Bug target/108247] Missed opportunity to generate shNadd on risc-v
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108247 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #3 from Jeffrey A. Law --- Per c#1 and c#2.
[Bug target/108248] Some insns in the risc-v backend do not have mappings to functional units
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108248 Jeffrey A. Law changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #8 from Jeffrey A. Law --- Fixed on the trunk.
[Bug target/109549] [14 Regression] cmov6.c test fail after commit r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 --- Comment #6 from Jeffrey A. Law --- And just an FYI, the tester is flagging conditional move failures for mips64-*, rx-elf and s390-linux-gnu. Most likely these are additional cases where the hook is indicating the transformation isn't profitable, but the target tests are expecting the transformation to happen. I'll debug those other targets and take appropriate action. At this point I don't see anything that would strongly suggest reversion of the patch, just that we need a bit of testsuite adjustment.
[Bug target/109549] [14 Regression] cmov6.c test fail after commit r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 Jeffrey A. Law changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2023-04-19 --- Comment #5 from Jeffrey A. Law --- Yea, that's exactly what's kicking in here. The converted sequence looks like this: (insn 29 0 28 (set (reg:SI 86) (const_int 10 [0xa])) 83 {*movsi_internal} (nil)) (insn 28 29 30 (set (reg:CCZ 17 flags) (compare:CCZ (reg/v:SI 83 [ c ]) (const_int 0 [0]))) 7 {*cmpsi_ccno_1} (nil)) (insn 30 28 32 (set (reg/v:SI 85 [ e ]) (if_then_else:SI (eq (reg:CCZ 17 flags) (const_int 0 [0])) (reg/v:SI 85 [ e ]) (reg:SI 86))) 1318 {*movsicc_noc} (nil)) (insn 32 30 31 (set (reg:SI 87) (const_int 20 [0x14])) 83 {*movsi_internal} (nil)) (insn 31 32 33 (set (reg:CCZ 17 flags) (compare:CCZ (reg/v:SI 83 [ c ]) (const_int 0 [0]))) 7 {*cmpsi_ccno_1} (nil)) (insn 33 31 0 (set (reg/v:SI 84 [ d ]) (if_then_else:SI (ne (reg:CCZ 17 flags) (const_int 0 [0])) (reg/v:SI 84 [ d ]) (reg:SI 87))) 1318 {*movsicc_noc} (nil)) Note the two movsicc_* patterns. So the question now is what to do about it. It looks like things are behaving as expected, so my first inclination would be to adjust the test. Actually splitting it into two would likely be even better. One would verify that by default we do not generate a pair of cmovs for this code, the other would turn the tuning bit off and verify that we do generate the pair of cmovs. Happy to do whatever the x86 maintainers want here.
[Bug target/109549] [14 Regression] cmov6.c test fail after commit r14-53-g675b1a7f113adb1d737adaf78b4fd90be7a0ed1a
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109549 --- Comment #4 from Jeffrey A. Law --- x86's tuning does have some support for avoiding multiple cmovs in a single if-converted sequence. I'll double check if that's kicking in here.
[Bug target/109508] [13 Regression] ICE: in extract_insn, at recog.cc:2791 with -mcpu=sifive-s76 on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109508 Jeffrey A. Law changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #3 from Jeffrey A. Law --- Fixed on the trunk.
[Bug target/108807] [11/12 regression] gcc.target/powerpc/vsx-builtin-10d.c fails after r11-6857-gb29225597584b6 on power 9 BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108807 --- Comment #7 from Jeffrey A. Law --- Once you've committed to the active release branches where this bug is active (11/12 in this case), you can just close the bug as resolved/fixed. No need to update the summary/title in that case. Thanks, Jeff
[Bug target/108807] [11/12/13 regression] gcc.target/powerpc/vsx-builtin-10d.c fails after r11-6857-gb29225597584b6 on power 9 BE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108807 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org --- Comment #5 from Jeffrey A. Law --- Kewen, is this BZ fixed on the trunk? If so we should update the title by dropping the "/13" so that it's not flagged as a gcc-13 regression.
[Bug target/109508] [13 Regression] ICE: in extract_insn, at recog.cc:2791 with -mcpu=sifive-s76 on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109508 Jeffrey A. Law changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |law at gcc dot gnu.org Priority|P3 |P4 --- Comment #1 from Jeffrey A. Law --- Trivial issue in the riscv backend. We just need to fix the operand on the movXXcc pattern.
[Bug target/109508] [13 Regression] ICE: in extract_insn, at recog.cc:2791 with -mcpu=sifive-s76 on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109508 Jeffrey A. Law changed: What|Removed |Added Ever confirmed|0 |1 Last reconfirmed||2023-04-14 Status|UNCONFIRMED |NEW
[Bug analyzer/103602] [11/12/13 regression] Analyzer takes excessive amount of memory and time linking GNU grep with LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103602 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug middle-end/103637] [12/13 Regression] missing warning writing past the end of one of multiple elements of the same array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103637 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug rtl-optimization/103829] [10/11/12/13 Regression] missing shrink wrapping for simple/obvious code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103829 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug analyzer/107943] [11/12/13 Regression] gcc -fanalyzer hangs in openssl curve25519.c since r11-3840-gaf66094d03779377
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107943 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug analyzer/109027] [13 Regression] ICE: SIGSEGV (infinite recursion in ana::constraint_manager::eval_condition / ana::constraint_manager::impossible_derived_conditions_p) with -fanalyzer since r13-
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109027 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug middle-end/109478] FAIL: g++.dg/other/pr104989.C -std=gnu++14 (internal compiler error: Segmentation fault)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109478 --- Comment #2 from Jeffrey A. Law --- The pa.cc bits look reasonable. It's been forever since I looked at this code, but clearly using a HOST_WIDE_INT is the right thing to be doing. While it may not fix this bug completely, consider it pre-approved. My PA-fu isn't what it used to be, but I strongly suspect we can't add that constant directly. It'd need to be broken down into a multi-instruction sequence. Not sure if ldil+ldo is sufficient there or not.
[Bug c/105628] [12/13 Regression] False positive with -Waddress
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105628 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug rtl-optimization/105715] [13 Regression] missed RTL if-conversion with COND_EXPR change
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105715 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug tree-optimization/105832] [13 Regression] Dead Code Elimination Regression at -O3 (trunk vs. 12.1.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105832 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/105834] [13 Regression] Dead Code Elimination Regression at -O2 (trunk vs. 12.1.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105834 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug target/106240] [13 Regression] missed vectorization opportunity (cond move) on mips since r13-707-g68e0063397ba82
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106240 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2
[Bug tree-optimization/106511] [13 Regression] New -Werror=maybe-uninitialized since r13-1268-g8c99e307b20c502e
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106511 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug target/107270] [10/11/12/13 Regression] return for structure is not as good as before
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107270 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/107823] [13 Regression] Dead Code Elimination Regression at -Os (trunk vs. 12.2.0)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107823 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/108197] [12/13 Regression] -Wstringop-overread emitted on simple boost small_vector code
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108197 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/108351] [13 Regression] Dead Code Elimination Regression at -O3 since r13-4240-gfeeb0d68f1c708
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108351 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug tree-optimization/108355] [13 Regression] Dead Code Elimination Regression at -O2 since r13-2772-g9baee6181b4e42
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108355 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/108358] [13 Regression] Dead Code Elimination Regression at -Os since r13-3378-gf6c168f8c06047
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108358 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P2
[Bug tree-optimization/108360] [13 Regression] Dead Code Elimination Regression at -Os since r13-2048-g418b71c0d535bf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108360 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P2 CC||law at gcc dot gnu.org
[Bug libgomp/108895] [13.0.1 (exp)] Fortran + gfx90a !$acc update device produces a segfault.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108895 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P4 CC||law at gcc dot gnu.org
[Bug target/108947] [13 Regression] wrong code with -O2 -fno-forward-propagate and vector compare on riscv64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108947 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P1 --- Comment #4 from Jeffrey A. Law --- P1 as this looks like a latent issue in combine or the simplification routines.
[Bug target/109104] [13 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1171 with -fzero-call-used-regs=all -march=rv64gv
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109104 Jeffrey A. Law changed: What|Removed |Added CC||law at gcc dot gnu.org Priority|P3 |P4