On 07/17/2017 04:42 PM, Wilco Dijkstra wrote:
> Jeff Law wrote:    
>> On 07/17/2017 05:27 AM, Wilco Dijkstra wrote:
> 
>>> A minimum guard size of 64KB seems reasonable even on systems with
>>> 4KB pages. However whatever the chosen guard size, you cannot defend
>>> against hostile code. An OS can of course increase the guard size well 
>>> beyond the minimum required, but that's simply reducing the probability -
>>> it is never going to block a big unchecked alloca.
>> That's a kernel issue and I'm not in a position to change that.  On
>> systems with a 64bit address space, I'm keen to see the kernel team
>> increase the guard size, but it's not something I can unilaterally make
>> happen.
> 
> Well if we want the stack guard to catch cases with high probability even if
> some or most of the code is unchecked, it must be made much larger. And the
> fact you can set the stack guard to zero in GLIBC is worrying as that would
> allow an attacker to trivially bypass the stack guard...
Well, the attacker would have to arrange to emit a call to change the
guard size -- if the program doesn't already have that code sequence,
then that's going to be reasonably hard (they'd have to control the
argument to that pthread call to set the guard or conjure up a ROP
sequence -- and if they've got a ROP chain started there's probably more
direct attacks).




>>> In 99% of the frames only one stack allocation is made. There are a few
>>> cases where the stack can be adjusted twice.
>> But we still have to deal with the cases where there are multiple
>> adjustments.  Punting in the case of multiple adjustments isn't the
>> right thing to do.  Some level of tracking is needed.
> 
> I didn't say we should punt, just that no tracking is required. The AArch64
> prolog code is extremely simple. The only adjustments that need to be checked
> are initial_adjust and final_adjust. Callee_adjust doesn't need any checks
> since it is limited by the range of STP (ie. < 512) and if the locals are
> large, it is zero.
Hmm, so we can't ever get into a case where INITIAL_ADJUST and
CALLEE_ADJUST are both nonzero?  If so, then yes, that does simplify
things -- dealing with that case is a serious annoyance and I'd be
happy to add an assert to ensure it never happens.   That's where
knowing the architecture details (which I don't) really turns out to be
useful :-)
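
A rough sketch of the assert I have in mind, in C. It assumes the answer
to the question above is yes. The struct, the helper names and the
CALLEE_ADJUST_LIMIT macro are hypothetical and are not GCC's actual
aarch64 frame code; the 512 bound is simply the STP range quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the three prologue adjustments discussed above.
   This is NOT GCC's real aarch64 frame structure.  */
struct frame_adjusts {
    size_t initial_adjust;  /* first SP decrement */
    size_t callee_adjust;   /* SP writeback done by the register-saving STP */
    size_t final_adjust;    /* last SP decrement */
};

/* Assumed bound taken from the thread: callee_adjust is limited by the
   range of STP, i.e. it is below 512 bytes.  */
#define CALLEE_ADJUST_LIMIT 512

/* Return true if the frame satisfies the invariants claimed in the
   discussion: callee_adjust is small, and initial_adjust and
   callee_adjust are never both nonzero.  */
static bool frame_adjusts_valid(const struct frame_adjusts *f)
{
    if (f->callee_adjust >= CALLEE_ADJUST_LIMIT)
        return false;
    return f->initial_adjust == 0 || f->callee_adjust == 0;
}

/* Number of bytes whose allocation needs a probe decision.  Only
   initial_adjust and final_adjust count; callee_adjust is skipped
   because of the bound above.  */
static size_t checked_bytes(const struct frame_adjusts *f)
{
    assert(frame_adjusts_valid(f));
    return f->initial_adjust + f->final_adjust;
}
```

With that invariant in place, a frame is either "small" (only
callee_adjust is nonzero and nothing needs checking) or "large"
(callee_adjust is zero and only the two explicit adjustments matter).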

> 
> Anyway the only complex case is shrinkwrapping. We know that at least LR must
> be saved before a call, but with -fomit-frame-pointer it doesn't always end up
> at the bottom of the callee-saves. We could take its offset into account or
> force it at offset 0.
Right.  I'd roughly figured out that the LR save is not separately
wrapped, which is good to know in terms of cases that have to be supported.


> 
>>> To be safe I think we first need to probe and then allocate. Or are there
>>> going to be extra checks in asynchronous interrupt handlers that check
>>> whether SP is above the stack guard?
>> That hits the red zone, which is unfortunate.  But it's acceptable IMHO.
> 
> How do you mean? What will happen if SP points into the heap? Will the trap
> handler just write to it without any check?
In general writing below SP tends to be discouraged.  But there are cases
where, IMHO, it's called for.  I think this is one of them.

I'm not aware of any checks in the signal handler frame setup done by
the kernel, which is a significant issue.  I'd focused on making sure we
don't have a window where the stack pointer has been pushed past the
guard before probing.  However, as Richard E has pointed out, if the
kernel doesn't have checks, then the signal frame setup itself in the
kernel could be used to push through the guard depending on the size of
the signal frame and the state of the stack pointer relative to the end
of the guard.
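
To make that dependence on the frame size and the SP position concrete,
here is a small C model. Everything in it is an assumption made for
illustration: the stack grows down, the guard is the half-open range
[guard_low, guard_high), and the kernel writes one contiguous frame of
frame_size bytes ending skip bytes below SP. The skip parameter stands in
for any area below SP that is left untouched. It does not describe what
any particular kernel actually does.

```c
#include <stdint.h>

enum frame_outcome {
    FRAME_ON_STACK,   /* frame lies entirely above the guard */
    FRAME_FAULTS,     /* frame overlaps the guard, so the write traps */
    FRAME_PAST_GUARD  /* frame lies entirely below the guard: no trap */
};

/* Classify where a signal frame would land under the assumptions in the
   lead-in.  Addresses are assumed large enough that the subtractions
   do not wrap, and frame_size is assumed to be nonzero.  */
static enum frame_outcome
classify_signal_frame(uintptr_t sp, uintptr_t skip, uintptr_t frame_size,
                      uintptr_t guard_low, uintptr_t guard_high)
{
    uintptr_t frame_high = sp - skip;              /* one past the top byte */
    uintptr_t frame_low = frame_high - frame_size; /* lowest byte written */

    if (frame_low >= guard_high)
        return FRAME_ON_STACK;
    if (frame_high <= guard_low)
        return FRAME_PAST_GUARD;
    return FRAME_FAULTS;
}
```

In this model the bad outcome is FRAME_PAST_GUARD: if SP (minus any
skipped area) has already reached the low end of the guard without a
probe, the frame write never touches a guard page.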

Jeff
