[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 rsandifo at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0 |1 Last reconfirmed||2020-10-28 Assignee|unassigned at gcc dot gnu.org |rsandifo at gcc dot gnu.org --- Comment #9 from rsandifo at gcc dot gnu.org --- Thanks for the s390 fix. On second thoughts, I guess it would be easier to keep this bug open for the general problem, since it has all the info. Hope to get to this early in stage 3.
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #8 from CVS Commits --- The master branch has been updated by Andreas Krebbel : https://gcc.gnu.org/g:2b3e722a3ca1b9dcfff1c016e651d0d681de1af0 commit r11-4460-g2b3e722a3ca1b9dcfff1c016e651d0d681de1af0 Author: Andreas Krebbel Date: Tue Oct 27 20:57:39 2020 +0100 Fix PR97497 This works around a limitation of gcse with handling of partially clobbered registers. With this patch our GOT pointer register r12 is not marked as partially clobbered anymore for the -m31 -mzarch -fpic combination. This is correct since all the bits in r12 we actually care about are in fact preserved. gcc/ChangeLog: PR rtl-optimization/97497 * config/s390/s390.c (s390_hard_regno_call_part_clobbered): Do not return true for r12 when -fpic is used. gcc/testsuite/ChangeLog: * gcc.target/s390/pr97497.c: New test.
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #7 from rsandifo at gcc dot gnu.org --- (In reply to Andreas Krebbel from comment #6) > Alternatively I could also mark r12 as preserved across function calls for > -fpic in the backend. In fact all the bits we care about are preserved. > Since the register is fixed all the accesses do come from the backend itself. Ah, yeah, that sounds good to me FWIW. > That's similar to what I was trying with the fixed_regs hack. But I agree > that this might not be correct in general. > > The full fix is probably to track the exact parts of partially clobbered > regs which stay live but this would be a major change. Yeah. If you do the above, I'll open a new PR for the underlying gcse.c issue, since I think the problem is latent on targets that have “real” partially-clobbered registers.
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #6 from Andreas Krebbel --- Alternatively I could also mark r12 as preserved across function calls for -fpic in the backend. In fact all the bits we care about are preserved. Since the register is fixed all the accesses do come from the backend itself. That's similar to what I was trying with the fixed_regs hack. But I agree that this might not be correct in general. The full fix is probably to track the exact parts of partially clobbered regs which stay live but this would be a major change.
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #5 from rsandifo at gcc dot gnu.org --- I think the problem is a disconnect between compute_transp and the code in gcse.c itself. compute_transp considers %r12 to be transparent in all blocks despite the partial clobbers. But whether that's true is context-dependent. I think the fix is to make transp also handle partial clobbers in a conservative way. I don't have a specific suggestion how to do that yet.
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #4 from Andreas Krebbel --- Reading from symbol t uses the GOT pointer in r12. The call then partially clobbers r12 but does not affect the lower 32 bits where the GOT pointer resides. So the GOT pointer stays in fact live across the call. The way this is currently handled in gcse the second access of t reads a different value than the first and this is wrong I think. This leads to a disagreement between the pre_delete_map and the insert locations. The later read of t is removed because it is an anticipated use but no copy of the value is inserted after the first read of t because the expression is not considered to be available anymore at the second location. The availability of the entire expression is broken by the set of r12 at the call insns. I didn't know how to solve this without being able to keep track of what parts of hard regs are clobbered. The idea behind the fixed_reg check is to trust the backend that it does not emit uninitialized uses of hard regs in the first place. t.c.250r.cprop1: (insn 6 3 7 2 (set (reg/f:SI 65) (mem/u/c:SI (plus:SI (reg:SI 12 %r12) (const:SI (unspec:SI [ (symbol_ref:SI ("t") [flags 0x6c0] ) ] UNSPEC_GOT))) [0 S4 A8])) "t.c":8:3 1387 {*movsi_zarch} (nil)) (insn 7 6 8 2 (set (reg:SI 3 %r3) (mem/f/c:SI (reg/f:SI 65) [1 t+0 S4 A32])) "t.c":8:3 1387 {*movsi_zarch} (expr_list:REG_DEAD (reg/f:SI 65) (nil))) (insn 8 7 9 2 (set (reg:SI 2 %r2) (const_int 1 [0x1])) "t.c":8:3 1387 {*movsi_zarch} (nil)) (call_insn 9 8 10 2 (parallel [ (call (mem:QI (const:SI (unspec:SI [ (symbol_ref:SI ("bar") [flags 0x41] ) ] UNSPEC_PLT)) [0 bar S1 A8]) (const_int 0 [0])) (clobber (reg:SI 14 %r14)) ]) "t.c":8:3 2053 {*brasl} (expr_list:REG_DEAD (reg:SI 3 %r3) (expr_list:REG_DEAD (reg:SI 2 %r2) (expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41] ) (nil (expr_list (use (reg:SI 12 %r12)) (expr_list:SI (use (reg:SI 2 %r2)) (expr_list:SI (use (reg:SI 3 %r3)) (nil) ... (insn 13 12 14 4 (set (reg/f:SI 66) (mem/u/c:SI (plus:SI (reg:SI 12 %r12) (const:SI (unspec:SI [ (symbol_ref:SI ("t") [flags 0x6c0] ) ] UNSPEC_GOT))) [0 S4 A8])) "t.c":10:5 1387 {*movsi_zarch} (nil))
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #3 from Andreas Krebbel --- Created attachment 49405 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49405&action=edit testcase
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 rsandifo at gcc dot gnu.org changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #2 from rsandifo at gcc dot gnu.org --- (In reply to Andreas Krebbel from comment #1) > Created attachment 49402 [details] > Proposed fix > > With the patch only regs are considered which aren't "fixed" assuming that > for fixed_regs the backend takes care of only actually using the > well-defined part of the hard regs. I don't think this is right. The principle is the same for all types of register. The idea is that a partial clobber can conservatively be treated as a read of the old value and a set of the new value. That might be suboptimal, but it should never lead to wrong code. It sounds like there's a deeper issue here. (BTW, the testcase attachment seems to be missing.)
[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497 --- Comment #1 from Andreas Krebbel --- Created attachment 49402 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49402&action=edit Proposed fix With the patch only regs are considered which aren't "fixed" assuming that for fixed_regs the backend takes care of only actually using the well-defined part of the hard regs.