Tamar Christina <tamar.christ...@arm.com> writes: > Hi Richard, > >> -----Original Message----- >> From: Richard Sandiford <richard.sandif...@arm.com> >> Sent: Tuesday, September 11, 2018 15:49 >> To: Tamar Christina <tamar.christ...@arm.com> >> Cc: Jeff Law <l...@redhat.com>; gcc-patches@gcc.gnu.org; nd >> <n...@arm.com>; James Greenhalgh <james.greenha...@arm.com>; >> Richard Earnshaw <richard.earns...@arm.com>; Marcus Shawcroft >> <marcus.shawcr...@arm.com> >> Subject: Re: [PATCH][GCC][AArch64] Updated stack-clash implementation >> supporting 64k probes. [patch (1/7)] >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> > Hi Richard, >> > >> > The 08/28/2018 21:58, Richard Sandiford wrote: >> >> Tamar Christina <tamar.christ...@arm.com> writes: >> >> > + HOST_WIDE_INT guard_used_by_caller = >> STACK_CLASH_CALLER_GUARD; >> >> > + /* When doing the final adjustment for the outgoing argument size >> >> > we can't >> >> > + assume that LR was saved at position 0. So subtract it's offset >> >> > from the >> >> > + ABI safe buffer so that we don't accidentally allow an adjustment >> that >> >> > + would result in an allocation larger than the ABI buffer without >> >> > + probing. */ >> >> > + HOST_WIDE_INT min_probe_threshold >> >> > + = final_adjustment_p >> >> > + ? guard_used_by_caller - cfun->machine- >> >frame.reg_offset[LR_REGNUM] >> >> > + : guard_size - guard_used_by_caller; >> >> [...] >> >> > + if (residual) >> >> > + { >> >> > + aarch64_sub_sp (temp1, temp2, residual, frame_related_p); >> >> > + if (residual >= min_probe_threshold) >> >> > + { >> >> > + if (dump_file) >> >> > + fprintf (dump_file, >> >> > + "Stack clash AArch64 prologue residuals: " >> >> > + HOST_WIDE_INT_PRINT_DEC " bytes, probing will be >> required." >> >> > + "\n", residual); >> >> > + emit_stack_probe (plus_constant (Pmode, stack_pointer_rtx, >> >> > + STACK_CLASH_CALLER_GUARD)); >> >> >> >> reg_offsets are nonnegative, so if LR_REGNUM isn't saved at position >> >> 0, min_probe_threshold will be less than STACK_CLASH_CALLER_GUARD. >> >> It looks like the probe would then write above the region. >> >> >> >> Using >= rather than > means that the same thing could happen when >> >> LR_REGNUM is at position 0, if the residual is exactly >> >> STACK_CLASH_CALLER_GUARD. >> > >> > That's true. While addressing this we changed how the residuals are >> probed. >> > >> > To address a comment you raised offline about the saving of LR when >> > calling a no-return function using a tail call and >> > -fomit-frame-pointer, I think this should be safe as the code in >> > frame_layout (line 4131-4136) would ensure that R30 is saved. >> >> That line number range doesn't seem to match up with current sources. >> But my point was that "X is a non-leaf function" does not directly imply "X >> saves R30_REGNUM". It might happen to imply that indirectly for all cases at >> the moment, but it's not something that the AArch64 code enforces itself. >> AFAICT the only time we force R30 to be saved explicitly is when emitting a >> chain: >> >> if (cfun->machine->frame.emit_frame_chain) >> { >> /* FP and LR are placed in the linkage record. */ >> cfun->machine->frame.reg_offset[R29_REGNUM] = 0; >> cfun->machine->frame.wb_candidate1 = R29_REGNUM; >> cfun->machine->frame.reg_offset[R30_REGNUM] = UNITS_PER_WORD; >> cfun->machine->frame.wb_candidate2 = R30_REGNUM; >> offset = 2 * UNITS_PER_WORD; >> } >> >> Otherwise we only save R30_REGNUM if the df machinery says R30 is live. >> And in principle, it should be correct to use a sibcall pattern that doesn't >> clobber R30 for a noreturn call even in functions that require a frame. We >> don't do that yet, and it probably doesn't make sense from a QoI perspective >> on AArch64, but I don't think it's invalid in terms of rtl semantics. >> There's no >> real reason why a noreturn function has to save the address of its caller >> unless the target forces it to, such as for frame chains above. >> >> In this patch we're relying on the link between the two concepts for a >> security feature, so I think we should either enforce it explicitly or add an >> assert like: >> >> gcc_assert (crtl->is_leaf >> || (cfun->machine->frame.reg_offset[R30_REGNUM] >> != SLOT_NOT_REQUIRED)); >> >> to aarch64_expand_prologue. (I don't think we should make the assert >> dependent on stack-clash, since the point is that we're assuming this >> happens regardless of stack-clash.) > > I agree that the assert would be a good idea, though I'm not sure > enabling it always is > a good idea. I'm not sure what other languages that don't use > stack-clash-protection and do their > own probing do. Like Ada or Go?
I think the argument was that R30 will always be saved in non-leaf frames without any specific help from the target (which I have my doubts about, but might be true in practice). The mechanism by which that happens doesn't include a test for stack-clash, so I think we should have the courage of our convictions and not test it in the assert either. If we're not comfortable asserting without the stack-clash check as things stand, I think we should instead make the layout code force R30 to be saved for stack-clash. Then including stack-clash in the assert makes sense, since it ties to a specific piece of the layout code. Thanks, Richard