> As some of you are likely aware, Qualys has just published fairly
> detailed information on using stack/heap clashes as an attack vector.
> Eric B, Michael M -- sorry I couldn't say more when I contacted you
> about -fstack-check and some PPC specific stuff. This has been under
> embargo for the last month.
No problem and thanks for putting together this message.

> Unfortunately, -fstack-check is actually not well suited for our purposes.
>
> Some background. -fstack-check was designed primarily for Ada's needs.
> It assumes the whole program is compiled with -fstack-check and it is
> designed to ensure there is enough stack space left so that if the
> program hits the guard (say via infinite recursion) the program can
> safely call into a signal handler and raise an exception.
>
> To ensure there's always enough space to meet that design requirement,
> -fstack-check probes stack space ahead of the actual need of the code.
>
> The assumption that all code was compiled with -fstack-check allows for
> elision of some stack probes as they are assumed to have been probed by
> earlier callers in the call chain. This elision is safe in an
> environment where all callers use -fstack-check, but fatally flawed in a
> mixed environment.
>
> Most ports first probe by pages for whatever space is requested, then
> after all probing is done, they actually allocate space. This runs
> afoul of valgrind in various unpleasant ways (including crashing
> valgrind on two targets).
>
> Only x86-linux currently uses a "moving sp" allocation and probing
> strategy. i.e., it actually allocates space, then probes the space.

Right, because the Linux kernel for x86/x86-64 is the only OS flavor that
doesn't let you probe the stack ahead of the stack pointer. All other
combinations of OS and architecture we tried (and that's quite a few) do.

> After much poking around I concluded that we really need to implement
> allocation and probing via a "moving sp" strategy. Probing into
> unallocated areas runs afoul of valgrind, so that's a non-starter.

The reason why you cannot use this strategy on a global basis for stack
checking is that some ABIs specify that you cannot update the stack pointer
more than once to establish a frame; others don't explicitly care but...

> Allocating stack space, then probing the pages within the space is
> vulnerable to async signal delivery between the allocation point and the
> probe point. If that occurs the signal handler could end up running on
> a stack that has collided with the heap.

...yes, there are difficulties with the "moving sp" strategy.

> Finally, we need not ensure the ability to handle a signal at stack
> overflow. It is fine for the kernel to halt the process immediately if
> it detects a reference to the guard page.

In Ada it's the opposite and we use an alternate signal stack in this case.

> Dynamic (alloca) space is handled fairly generically with simple code to
> allocate a page and probe the just allocated page.

Right, it's not the most difficult part.

> Michael Matz has suggested some generic support so that we don't have to
> write target specific code for each and every target we support. The
> idea is to have a helper function which allocates and probes stack
> space. The port can then call that helper function from within its
> prologue generator. I think this is wise -- I wouldn't want to go
> through this exercise on every port.

Interesting. We never convinced ourselves that this was worthwhile.

--
Eric Botcazou
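
For readers following the thread, here is a rough toy model in C of the
allocation strategies discussed above. It is only a sketch under stated
assumptions: the stack is modeled as page indices growing downward with a
single guard page, and the names (toy_stack, toy_probe, alloc_unprobed,
alloc_moving_sp, alloc_probe_ahead) are hypothetical and do not correspond
to anything in GCC or in any port's prologue code.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical): the stack grows toward lower page indices.
   One guard page sits below it; anything lower stands in for the heap. */
typedef struct {
    long sp_page;     /* page currently holding the stack pointer */
    long guard_page;  /* index of the single guard page */
    bool faulted;     /* set once a probe touches the guard page */
} toy_stack;

/* Touch one page; touching the guard page models the kernel fault. */
static void toy_probe(toy_stack *s, long page)
{
    if (page == s->guard_page)
        s->faulted = true;
}

/* No per-page probing: move sp in one step and touch only the final
   page.  A large enough request jumps over the guard unnoticed. */
bool alloc_unprobed(toy_stack *s, long n_pages)
{
    assert(n_pages >= 0);
    s->sp_page -= n_pages;
    toy_probe(s, s->sp_page);
    return s->faulted;
}

/* "Moving sp": allocate one page, probe it, repeat.  Every probe
   lands in space that has already been allocated. */
bool alloc_moving_sp(toy_stack *s, long n_pages)
{
    assert(n_pages >= 0);
    for (long i = 0; i < n_pages && !s->faulted; i++) {
        s->sp_page -= 1;
        toy_probe(s, s->sp_page);
    }
    return s->faulted;
}

/* Probe ahead: touch every page below sp first, then move sp once.
   The probes land in space that is not yet allocated. */
bool alloc_probe_ahead(toy_stack *s, long n_pages)
{
    assert(n_pages >= 0);
    for (long i = 1; i <= n_pages && !s->faulted; i++)
        toy_probe(s, s->sp_page - i);
    if (!s->faulted)
        s->sp_page -= n_pages;
    return s->faulted;
}
```

With sp on page 10 and the guard on page 5, an 8-page request through
alloc_unprobed lands on page 2 without a fault, while both probing
variants report the fault on the guard page. The model leaves out the
async signal window and the ABI constraints mentioned above; it only
shows the order of "move sp" and "probe" in each strategy.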
