[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-28 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1
   Last reconfirmed||2020-10-28
   Assignee|unassigned at gcc dot gnu.org  |rsandifo at gcc dot 
gnu.org

--- Comment #9 from rsandifo at gcc dot gnu.org  
---
Thanks for the s390 fix.  On second thoughts, I guess it would
be easier to keep this bug open for the general problem, since
it has all the info.

Hope to get to this early in stage 3.

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-27 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #8 from CVS Commits  ---
The master branch has been updated by Andreas Krebbel :

https://gcc.gnu.org/g:2b3e722a3ca1b9dcfff1c016e651d0d681de1af0

commit r11-4460-g2b3e722a3ca1b9dcfff1c016e651d0d681de1af0
Author: Andreas Krebbel 
Date:   Tue Oct 27 20:57:39 2020 +0100

Fix PR97497

This works around a limitation of gcse with handling of partially
clobbered registers.  With this patch our GOT pointer register r12 is
not marked as partially clobbered anymore for the -m31 -mzarch -fpic
combination. This is correct since all the bits in r12 we actually
care about are in fact preserved.

gcc/ChangeLog:

PR rtl-optimization/97497
* config/s390/s390.c (s390_hard_regno_call_part_clobbered): Do not
return true for r12 when -fpic is used.

gcc/testsuite/ChangeLog:

* gcc.target/s390/pr97497.c: New test.

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-26 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #7 from rsandifo at gcc dot gnu.org  
---
(In reply to Andreas Krebbel from comment #6)
> Alternatively I could also mark r12 as preserved across function calls for
> -fpic in the backend. In fact all the bits we care about are preserved.
> Since the register is fixed all the accesses do come from the backend itself.

Ah, yeah, that sounds good to me FWIW.

> That's similar to what I was trying with the fixed_regs hack. But I agree
> that this might not be correct in general.
> 
> The full fix is probably to track the exact parts of partially clobbered
> regs which stay live but this would be a major change.

Yeah.  If you do the above, I'll open a new PR for the underlying
gcse.c issue, since I think the problem is latent on targets that
have “real” partially-clobbered registers.

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-20 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #6 from Andreas Krebbel  ---
Alternatively I could also mark r12 as preserved across function calls for
-fpic in the backend. In fact all the bits we care about are preserved. Since
the register is fixed all the accesses do come from the backend itself.

That's similar to what I was trying with the fixed_regs hack. But I agree that
this might not be correct in general.

The full fix is probably to track the exact parts of partially clobbered regs
which stay live but this would be a major change.

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-20 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #5 from rsandifo at gcc dot gnu.org  
---
I think the problem is a disconnect between compute_transp
and the code in gcse.c itself.  compute_transp considers %r12
to be transparent in all blocks despite the partial clobbers.
But whether that's true is context-dependent.

I think the fix is to make transp also handle partial clobbers
in a conservative way.  I don't have a specific suggestion how
to do that yet.

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-19 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #4 from Andreas Krebbel  ---
Reading from symbol t uses the GOT pointer in r12. The call then partially
clobbers r12 but does not affect the lower 32 bits where the GOT pointer
resides. So the GOT pointer stays in fact live across the call.

The way this is currently handled in gcse the second access of t reads a
different value than the first and this is wrong I think. This leads to a
disagreement between the pre_delete_map and the insert locations. The later
read of t is removed because it is an anticipated use but no copy of the value
is inserted after the first read of t because the expression is not considered
to be available anymore at the second location. The availability of the entire
expression is broken by the set of r12 at the call insns.

I didn't know how to solve this without being able to keep track of what parts
of hard regs are clobbered.

The idea behind the fixed_reg check is to trust the backend that it does not
emit uninitialized uses of hard regs in the first place.

t.c.250r.cprop1:

(insn 6 3 7 2 (set (reg/f:SI 65)
(mem/u/c:SI (plus:SI (reg:SI 12 %r12)
(const:SI (unspec:SI [
(symbol_ref:SI ("t") [flags 0x6c0]  )
] UNSPEC_GOT))) [0  S4 A8])) "t.c":8:3 1387
{*movsi_zarch}
 (nil))
(insn 7 6 8 2 (set (reg:SI 3 %r3)
(mem/f/c:SI (reg/f:SI 65) [1 t+0 S4 A32])) "t.c":8:3 1387
{*movsi_zarch}
 (expr_list:REG_DEAD (reg/f:SI 65)
(nil)))
(insn 8 7 9 2 (set (reg:SI 2 %r2)
(const_int 1 [0x1])) "t.c":8:3 1387 {*movsi_zarch}
 (nil))
(call_insn 9 8 10 2 (parallel [
(call (mem:QI (const:SI (unspec:SI [
(symbol_ref:SI ("bar") [flags 0x41] 
)
] UNSPEC_PLT)) [0 bar S1 A8])
(const_int 0 [0]))
(clobber (reg:SI 14 %r14))
]) "t.c":8:3 2053 {*brasl}
 (expr_list:REG_DEAD (reg:SI 3 %r3)
(expr_list:REG_DEAD (reg:SI 2 %r2)
(expr_list:REG_CALL_DECL (symbol_ref:SI ("bar") [flags 0x41] 
)
(nil
(expr_list (use (reg:SI 12 %r12))
(expr_list:SI (use (reg:SI 2 %r2))
(expr_list:SI (use (reg:SI 3 %r3))
(nil)


...

(insn 13 12 14 4 (set (reg/f:SI 66)
(mem/u/c:SI (plus:SI (reg:SI 12 %r12)
(const:SI (unspec:SI [
(symbol_ref:SI ("t") [flags 0x6c0]  )
] UNSPEC_GOT))) [0  S4 A8])) "t.c":10:5 1387
{*movsi_zarch}
 (nil))

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-19 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #3 from Andreas Krebbel  ---
Created attachment 49405
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49405&action=edit
testcase

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-19 Thread rsandifo at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

rsandifo at gcc dot gnu.org  changed:

   What|Removed |Added

 CC||rsandifo at gcc dot gnu.org

--- Comment #2 from rsandifo at gcc dot gnu.org  
---
(In reply to Andreas Krebbel from comment #1)
> Created attachment 49402 [details]
> Proposed fix
> 
> With the patch only regs are considered which aren't "fixed" assuming that
> for fixed_regs the backend takes care of only actually using the
> well-defined part of the hard regs.
I don't think this is right.  The principle is the same for
all types of register.

The idea is that a partial clobber can conservatively be treated
as a read of the old value and a set of the new value.  That might
be suboptimal, but it should never lead to wrong code.  It sounds
like there's a deeper issue here.

(BTW, the testcase attachment seems to be missing.)

[Bug rtl-optimization/97497] gcse wrong code generation with partial register clobbers

2020-10-19 Thread krebbel at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97497

--- Comment #1 from Andreas Krebbel  ---
Created attachment 49402
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49402&action=edit
Proposed fix

With the patch only regs are considered which aren't "fixed" assuming that for
fixed_regs the backend takes care of only actually using the well-defined part
of the hard regs.