Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Segher Boessenkool Thu, 06 Aug 2020 16:38:14 -0700

Hi!

On Thu, Aug 06, 2020 at 10:31:27AM +0200, Richard Biener wrote:
> Jeff might be, but with the intended purpose (ROP mitigation AFAIU)
> it would be nice for other target maintainers to chime in (Segher for
> power maybe) for the question below...


It would be nice if this described anywhere what the benefit of this is,
including actual hard numbers.  I only see it is very costly, and I see
no benefit whatsoever.

> > >>>>> [PATCH] Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]
> > >>>>> command-line option and
> > >>>>> zero_call_used_regs("skip|used-gpr|all-gpr||used|all") function 
> > >>>>> attribue:

"call-used" is such a bad name.  "call-clobbered" is better already, but
"volatile" (over calls) is most obvious I think.

There are at least four different kinds of volatile registers:

1) Argument registers are volatile, on most ABIs.
2) The *linker* (or dynamic linker!) may insert code that needs some
   registers for itself;
3) Registers only used for scratch space;
4) Registers used for returning the function value.

And these can overlap, and differ per function.

> > > Again - what's the intended use (and how does it fulful anything useful
> > > for that case)?

Yes, exactly.

> > >>>>> +      if (!targetm.calls.zero_call_used_regno_p (regno, gpr_only))
> > >>>>> +     continue;
> > > 
> > > Why does the target need some extra say here?
> > 
> > Only target can decide which hard regs should be zeroed, and which hard 
> > regs are general purpose register. 
> 
> I'm mostly questioning the plethora of target hooks added and whether
> this details are a good granularity applying to more than just x86.
> Did I suggest to compute a hardreg set that the middle-end says was
> used and is not live and leave the rest to the target?

It probably would be much easier to just have the target do *all* of
this, in one hook, or maybe even in the existing epilogue stuff.  The
resulting binary code will be very slow no matter what, so this should
not matter much at all.

> > >>>>> +      machine_mode mode
> > >>>>> +     = targetm.calls.zero_call_used_regno_mode (regno,
> > >>>>> +                                                reg_raw_mode[regno]);
> > > 
> > > In what case does the target ever need to adjust this (we're dealing
> > > with hard-regs only?)?
> > 
> > For x86, for example, even though the GPR registers are 64-bit, we only 
> > need to zero the lower 32-bit. etc.
> 
> That's an optimization, yes.

I gues what is meant here is that the usual x86-64 insns to clear the
low 32 bits of a register actually clear the whole register?  It is a
huge security leak otherwise.  And, the generic code has nothing to do
with this, define hooks that ask the target to clear stuff, instead?

> > >>>>> +      reg = gen_rtx_REG (mode, regno);
> > >>>>> +      if (zero_rtx[(int)mode] == NULL_RTX)
> > >>>>> +     {
> > >>>>> +       zero_rtx[(int)mode] = reg;
> > >>>>> +       tmp = gen_rtx_SET (reg, const0_rtx);
> > >>>>> +       emit_insn (tmp);
> > >>>>> +     }
> > >>>>> +      else
> > >>>>> +     emit_move_insn (reg, zero_rtx[(int)mode]);
> > > 
> > > Not sure but I think the canonical zero to use is CONST0_RTX (mode)
> > > but I may be wrong.  
> > 
> > You mean “const0_rtx” should be “CONST0_RTX(mode)”? 
> > I will check on this.

If it is a CONST_INT, you should use const0_rtx; otherwise,
CONST0_RTX (mode) .  I have no idea what zero_rtx is, but there is
const_tiny_rtx already, and you shouldn't use that directly either.

> But why not simplify it all to a single hook
> 
>   targetm.calls.zero_regs (used-not-live-hardregset, gpr_only);
> 
> ?

Yeah.  With a much better name though (it should say what it is for, or
describe at a *high level* what it does).

> > >>>>> start_sequence ();
> > >>>>> emit_note (NOTE_INSN_EPILOGUE_BEG);
> > >>>>> +
> > >>>>> +  gen_call_used_regs_seq ();
> > >>>>> +
> > > 
> > > The caller eventually performs shrink-wrapping - are you sure that
> > > doesn't mess up things?
> > 
> > My understanding is, in the standard epilogue, there is no handling of 
> > “call-used” registers.  Therefore, shrink-wrapping will not impact
> > “call-used” registers as well. 
> > Our patch only handles call-used registers, so, there should be no any 
> > interaction between this patch and shrink-wrapping.
> 
> I don't know (CCed Segher, he should eventually).

Shrink-wrapping often deals with the non-volatile registers, so that
doesn't matter much for this patch series.  But the epilogue can use
some volatile registers as well, including to hold sensitive info.  And
of course everything is different if you use separate shrink-wrapping,
but that work is done already when you get here (so it is too late?)


Anyway.  This all needs a good description in the user manual (is there?
I couldn't find any), explaining what exactly it does (user-visible),
and when you would want to use it, etc.  We need that before we can
review anything else in this patch sanely.


Segher

Re: PING [Patch][Middle-end]Add -fzero-call-used-regs=[skip|used-gpr|all-gpr|used|all]

Reply via email to