Tamar Christina <tamar.christ...@arm.com> writes:
> Hi Richard,
>
>> -----Original Message-----
>> From: Richard Sandiford <richard.sandif...@arm.com>
>> Sent: Tuesday, September 11, 2018 15:49
>> To: Tamar Christina <tamar.christ...@arm.com>
>> Cc: Jeff Law <l...@redhat.com>; gcc-patches@gcc.gnu.org; nd
>> <n...@arm.com>; James Greenhalgh <james.greenha...@arm.com>;
>> Richard Earnshaw <richard.earns...@arm.com>; Marcus Shawcroft
>> <marcus.shawcr...@arm.com>
>> Subject: Re: [PATCH][GCC][AArch64] Updated stack-clash implementation
>> supporting 64k probes. [patch (1/7)]
>> 
>> Tamar Christina <tamar.christ...@arm.com> writes:
>> > Hi Richard,
>> >
>> > The 08/28/2018 21:58, Richard Sandiford wrote:
>> >> Tamar Christina <tamar.christ...@arm.com> writes:
>> >> > +  HOST_WIDE_INT guard_used_by_caller =
>> STACK_CLASH_CALLER_GUARD;
>> >> > + /* When doing the final adjustment for the outgoing argument size
>> >> > we can't
>> >> > + assume that LR was saved at position 0.  So subtract it's offset
>> >> > from the
>> >> > +     ABI safe buffer so that we don't accidentally allow an adjustment
>> that
>> >> > +     would result in an allocation larger than the ABI buffer without
>> >> > +     probing.  */
>> >> > +  HOST_WIDE_INT min_probe_threshold
>> >> > +    = final_adjustment_p
>> >> > +      ? guard_used_by_caller - cfun->machine-
>> >frame.reg_offset[LR_REGNUM]
>> >> > +      : guard_size - guard_used_by_caller;
>> >> [...]
>> >> > +  if (residual)
>> >> > +    {
>> >> > +      aarch64_sub_sp (temp1, temp2, residual, frame_related_p);
>> >> > +      if (residual >= min_probe_threshold)
>> >> > +       {
>> >> > +         if (dump_file)
>> >> > +           fprintf (dump_file,
>> >> > +                    "Stack clash AArch64 prologue residuals: "
>> >> > +                    HOST_WIDE_INT_PRINT_DEC " bytes, probing will be
>> required."
>> >> > +                    "\n", residual);
>> >> > +         emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx,
>> >> > +                                          STACK_CLASH_CALLER_GUARD));
>> >>
>> >> reg_offsets are nonnegative, so if LR_REGNUM isn't saved at position
>> >> 0, min_probe_threshold will be less than STACK_CLASH_CALLER_GUARD.
>> >> It looks like the probe would then write above the region.
>> >>
>> >> Using >= rather than > means that the same thing could happen when
>> >> LR_REGNUM is at position 0, if the residual is exactly
>> >> STACK_CLASH_CALLER_GUARD.
>> >
>> > That's true. While addressing this we changed how the residuals are
>> probed.
>> >
>> > To address a comment you raised offline about the saving of LR when
>> > calling a no-return function using a tail call and
>> > -fomit-frame-pointer, I think this should be safe as the code in
>> > frame_layout (line 4131-4136) would ensure that R30 is saved.
>> 
>> That line number range doesn't seem to match up with current sources.
>> But my point was that "X is a non-leaf function" does not directly imply "X
>> saves R30_REGNUM".  It might happen to imply that indirectly for all cases at
>> the moment, but it's not something that the AArch64 code enforces itself.
>> AFAICT the only time we force R30 to be saved explicitly is when emitting a
>> chain:
>> 
>>   if (cfun->machine->frame.emit_frame_chain)
>>     {
>>       /* FP and LR are placed in the linkage record.  */
>>       cfun->machine->frame.reg_offset[R29_REGNUM] = 0;
>>       cfun->machine->frame.wb_candidate1 = R29_REGNUM;
>>       cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD;
>>       cfun->machine->frame.wb_candidate2 = R30_REGNUM;
>>       offset = 2 * UNITS_PER_WORD;
>>     }
>> 
>> Otherwise we only save R30_REGNUM if the df machinery says R30 is live.
>> And in principle, it should be correct to use a sibcall pattern that doesn't
>> clobber R30 for a noreturn call even in functions that require a frame.  We
>> don't do that yet, and it probably doesn't make sense from a QoI perspective
>> on AArch64, but I don't think it's invalid in terms of rtl semantics.  
>> There's no
>> real reason why a noreturn function has to save the address of its caller
>> unless the target forces it to, such as for frame chains above.
>> 
>> In this patch we're relying on the link between the two concepts for a
>> security feature, so I think we should either enforce it explicitly or add an
>> assert like:
>> 
>>   gcc_assert (crtl->is_leaf
>>            || (cfun->machine->frame.reg_offset[R30_REGNUM]
>>                != SLOT_NOT_REQUIRED));
>> 
>> to aarch64_expand_prologue.  (I don't think we should make the assert
>> dependent on stack-clash, since the point is that we're assuming this
>> happens regardless of stack-clash.)
>
> I agree that the assert would be a good idea, though I'm not sure
> enabling it always is
> a good idea. I'm not sure what other languages that don't use
> stack-clash-protection and do their
> own probing do. Like Ada or Go?

I think the argument was that R30 will always be saved in non-leaf
frames without any specific help from the target (which I have my
doubts about, but might be true in practice).  The mechanism by which
that happens doesn't include a test for stack-clash, so I think we
should have the courage of our convictions and not test it in the
assert either.

If we're not comfortable asserting without the stack-clash check
as things stand, I think we should instead make the layout code
force R30 to be saved for stack-clash.  Then including stack-clash
in the assert makes sense, since it ties to a specific piece of
the layout code.

Thanks,
Richard

Reply via email to