Re: 3 quick questions about stack alignment for powerpc (32-bit) signal handlers [the change that caused misaligned]
Good catch! I'll commit the change tonight. - Justin On Tue, Feb 2, 2016 at 3:48 AM, Mark Millardwrote: > I tried the change to -32 and 32 (from -20 and 20) on/for the powerpc > (32-bit) PowerMac that I use and the results were: > > A) "info frame" in gdb shows signal handlers are now started with 16-byte > aligned stack frames. (Applies to gcc 4.2.1 based contexts too, not just to > the clang 3.8.0 ones with the __vfprintf-tied segmentation faults during > signals.) > > and. . . > > B) The "clang 3.8.0 compiled __vfprintf" segmentation faults in libc/stdio > library code during signal handlers that use such code no longer happen > because the alignment matches the code requirements. > > I've added this information to Bug 206810. > > > (Note: There are a couple of segmentation fault contexts that I've never tied > down to any specific property: no discovered evidence of signal handler > involvement or of __vfprintf involvement, for example. These are still a > problem. But where I had tied the faults to signal handlers using __vfprintf > now instead work fine in my experimental clang 3.8.0 based builds.) > > > === > Mark Millard > markmi at dsl-only.net > > On 2016-Feb-1, at 12:11 AM, Mark Millard wrote: > > The -16/16 code below produced correct alignment but too little space. > > The -20/20 code below produces enough space but misalignment. > > To maintain 16-byte alignment while increasing the space would have required > going from -16/16 to -32/32. At least that is how I understand this code. > > >> Index: sys/powerpc/powerpc/sigcode32.S >> === >> --- sys/powerpc/powerpc/sigcode32.S >> (.../head/sys/powerpc/powerpc/sigcode32.S) (revision 209975) >> +++ sys/powerpc/powerpc/sigcode32.S >> (.../projects/clang380-import/sys/powerpc/powerpc/sigcode32.S) (working >> copy) >> @@ -45,9 +45,9 @@ >> */ >>.globl CNAME(sigcode32),CNAME(szsigcode32) >> CNAME(sigcode32): >> - addi1,1,-16 /* reserved space for callee */ >> + addi1,1,-20 /* reserved space for callee */ >>blrl >> - addi3,1,16+SF_UC/* restore sp, and get >sf_uc >> */ >> + addi3,1,20+SF_UC/* restore sp, and get >sf_uc >> */ >>li 0,SYS_sigreturn >>sc /* sigreturn(scp) */ >>li 0,SYS_exit > > > > The "working copy" is -r266778 from 2014-May-27. > > -r209975 is from 2010-Jul-13. > > > === > Mark Millard > markmi at dsl-only.net > > On 2016-Jan-31, at 10:58 PM, Mark Millard wrote: > > Just a correction to a sentence that I wrote. I had written: > >> Frame at:0x...90 vs. 0x...1c >> call by frame: 0x...b0 vs. 0x...1c >> Arglist at: 0x...70 vs. 0x...dc >> Locals at: 0x...70 vs. 0x...dc >> Previous frame's sp: 0x...90 vs. 0x...1c >> >> It looks like 4 additional pad bytes on the user/process stack are needed to >> get back to alignment. > > Of course the figures on the right need to get smaller, not larger: The stack > grows towards smaller addresses. So to get to 0x...0 on the right I should > have said: > > It looks like 12 additional pad bytes on the user/process stack are needed to > get back to alignment. > > That would produce: > > Frame at:0x...90 vs. 0x...10 > call by frame: 0x...b0 vs. 0x...10 > Arglist at: 0x...70 vs. 0x...d0 > Locals at: 0x...70 vs. 0x...d0 > Previous frame's sp: 0x...90 vs. 0x...10 > > === > Mark Millard > markmi at dsl-only.net > > On 2016-Jan-31, at 10:47 PM, Mark Millard wrote: > > More evidence: By adding "break raise" and then using "info frame" to show > the alignment at that point I can show that the later signal delivery changes > the alignment on the user process stack compared to when raise was called. > (Later I show the same for thr_kill.) > >> Breakpoint 2, __raise (s=29) at /usr/src/lib/libc/gen/raise.c:50 >> warning: Source file is more recent than executable. >> 50if (__sys_thr_self() == -1) >> (gdb) info frame >> Stack level 0, frame at 0xdc90: >> pc = 0x41904630 in __raise (/usr/src/lib/libc/gen/raise.c:50); saved pc = >> 0x1800774 >> called by frame at 0xdcb0 >> source language c. >> Arglist at 0xdc70, args: s=29 >> Locals at 0xdc70, Previous frame's sp is 0xdc90 >> Saved registers: >> r29 at 0xdc84, r30 at 0xdc88, r31 at 0xdc8c, pc at 0xdc94, >> lr at 0xdc94 >> (gdb) cont >> Continuing. >> >> Program received signal SIGINFO, Information request. >> >> Breakpoint 1, 0x018006d0 in handler () >> (gdb) info frame >> Stack level 0, frame at 0xd71c: >> pc = 0x18006d0 in handler; saved pc = 0xe008 >> called by frame at 0xd71c >> Arglist at 0xd6dc, args: >> Locals at 0xd6dc, Previous frame's sp is 0xd71c >> Saved registers: >> r31 at 0xd718, pc at 0xd720, lr at
Re: 3 quick questions about stack alignment for powerpc (32-bit) signal handlers [the change that caused misaligned]
On Tue, Feb 02, 2016 at 10:05:16AM -0600, Justin Hibbits wrote: > Good catch! I'll commit the change tonight. I looked once at the powerpc sigsend(), and I think that it has an issue. The usfp is calculated by taking the stack pointer at the time of signal delivery and substracting the sigframe size. This means that a transient misalignment during some code (e.g. leaf function) is transferred to the signal handler execution. Other arches explicitely realign stack pointer for the signal frame before the frame is formed. I am not sure if the problem reported in the thread is caused by this or not, but forced realignment in sendsig() is required for ABI compliance. > > - Justin > > On Tue, Feb 2, 2016 at 3:48 AM, Mark Millardwrote: > > I tried the change to -32 and 32 (from -20 and 20) on/for the powerpc > > (32-bit) PowerMac that I use and the results were: > > > > A) "info frame" in gdb shows signal handlers are now started with 16-byte > > aligned stack frames. (Applies to gcc 4.2.1 based contexts too, not just to > > the clang 3.8.0 ones with the __vfprintf-tied segmentation faults during > > signals.) > > > > and. . . > > > > B) The "clang 3.8.0 compiled __vfprintf" segmentation faults in libc/stdio > > library code during signal handlers that use such code no longer happen > > because the alignment matches the code requirements. > > > > I've added this information to Bug 206810. > > > > > > (Note: There are a couple of segmentation fault contexts that I've never > > tied down to any specific property: no discovered evidence of signal > > handler involvement or of __vfprintf involvement, for example. These are > > still a problem. But where I had tied the faults to signal handlers using > > __vfprintf now instead work fine in my experimental clang 3.8.0 based > > builds.) > > > > > > === > > Mark Millard > > markmi at dsl-only.net > > > > On 2016-Feb-1, at 12:11 AM, Mark Millard wrote: > > > > The -16/16 code below produced correct alignment but too little space. > > > > The -20/20 code below produces enough space but misalignment. > > > > To maintain 16-byte alignment while increasing the space would have > > required going from -16/16 to -32/32. At least that is how I understand > > this code. > > > > > >> Index: sys/powerpc/powerpc/sigcode32.S > >> === > >> --- sys/powerpc/powerpc/sigcode32.S > >> (.../head/sys/powerpc/powerpc/sigcode32.S) (revision 209975) > >> +++ sys/powerpc/powerpc/sigcode32.S > >> (.../projects/clang380-import/sys/powerpc/powerpc/sigcode32.S) (working > >> copy) > >> @@ -45,9 +45,9 @@ > >> */ > >>.globl CNAME(sigcode32),CNAME(szsigcode32) > >> CNAME(sigcode32): > >> - addi1,1,-16 /* reserved space for callee */ > >> + addi1,1,-20 /* reserved space for callee */ > >>blrl > >> - addi3,1,16+SF_UC/* restore sp, and get > >> >sf_uc */ > >> + addi3,1,20+SF_UC/* restore sp, and get > >> >sf_uc */ > >>li 0,SYS_sigreturn > >>sc /* sigreturn(scp) */ > >>li 0,SYS_exit > > > > > > > > The "working copy" is -r266778 from 2014-May-27. > > > > -r209975 is from 2010-Jul-13. > > > > > > === > > Mark Millard > > markmi at dsl-only.net > > > > On 2016-Jan-31, at 10:58 PM, Mark Millard wrote: > > > > Just a correction to a sentence that I wrote. I had written: > > > >> Frame at:0x...90 vs. 0x...1c > >> call by frame: 0x...b0 vs. 0x...1c > >> Arglist at: 0x...70 vs. 0x...dc > >> Locals at: 0x...70 vs. 0x...dc > >> Previous frame's sp: 0x...90 vs. 0x...1c > >> > >> It looks like 4 additional pad bytes on the user/process stack are needed > >> to get back to alignment. > > > > Of course the figures on the right need to get smaller, not larger: The > > stack grows towards smaller addresses. So to get to 0x...0 on the right I > > should have said: > > > > It looks like 12 additional pad bytes on the user/process stack are needed > > to get back to alignment. > > > > That would produce: > > > > Frame at:0x...90 vs. 0x...10 > > call by frame: 0x...b0 vs. 0x...10 > > Arglist at: 0x...70 vs. 0x...d0 > > Locals at: 0x...70 vs. 0x...d0 > > Previous frame's sp: 0x...90 vs. 0x...10 > > > > === > > Mark Millard > > markmi at dsl-only.net > > > > On 2016-Jan-31, at 10:47 PM, Mark Millard wrote: > > > > More evidence: By adding "break raise" and then using "info frame" to show > > the alignment at that point I can show that the later signal delivery > > changes the alignment on the user process stack compared to when raise was > > called. (Later I show the same for thr_kill.) > > > >> Breakpoint 2, __raise (s=29) at /usr/src/lib/libc/gen/raise.c:50 > >> warning: Source file is more recent than executable. > >> 50if
Re: 3 quick questions about stack alignment for powerpc (32-bit) signal handlers [the change that caused misaligned]
On Tue, Feb 2, 2016 at 10:13 AM, Konstantin Belousovwrote: > On Tue, Feb 02, 2016 at 10:05:16AM -0600, Justin Hibbits wrote: >> Good catch! I'll commit the change tonight. > I looked once at the powerpc sigsend(), and I think that it has an > issue. The usfp is calculated by taking the stack pointer at the time > of signal delivery and substracting the sigframe size. This means that > a transient misalignment during some code (e.g. leaf function) is > transferred to the signal handler execution. > > Other arches explicitely realign stack pointer for the signal > frame before the frame is formed. > > I am not sure if the problem reported in the thread is caused by this > or not, but forced realignment in sendsig() is required for ABI compliance. Good point. Currently the assumption is that the stack will always be 16-byte aligned, which is required per ABI. Since there's no push/pop, only full frame creation/destruction, it hasn't bitten us yet, but it should be fixed. It's not the cause of this bug, though. This bug is caused after sendsig(), in the sigcode trampoline in user space. - Justin ___ freebsd-toolchain@freebsd.org mailing list https://lists.freebsd.org/mailman/listinfo/freebsd-toolchain To unsubscribe, send any mail to "freebsd-toolchain-unsubscr...@freebsd.org"
Re: 3 quick questions about stack alignment for powerpc (32-bit) signal handlers [the change that caused misaligned]
The -16/16 code below produced correct alignment but too little space. The -20/20 code below produces enough space but misalignment. To maintain 16-byte alignment while increasing the space would have required going from -16/16 to -32/32. At least that is how I understand this code. Index: sys/powerpc/powerpc/sigcode32.S === --- sys/powerpc/powerpc/sigcode32.S (.../head/sys/powerpc/powerpc/sigcode32.S) (revision 209975) +++ sys/powerpc/powerpc/sigcode32.S (.../projects/clang380-import/sys/powerpc/powerpc/sigcode32.S) (working copy) @@ -45,9 +45,9 @@ */ .globl CNAME(sigcode32),CNAME(szsigcode32) CNAME(sigcode32): - addi1,1,-16 /* reserved space for callee */ + addi1,1,-20 /* reserved space for callee */ blrl - addi3,1,16+SF_UC/* restore sp, and get >sf_uc */ + addi3,1,20+SF_UC/* restore sp, and get >sf_uc */ li 0,SYS_sigreturn sc /* sigreturn(scp) */ li 0,SYS_exit The "working copy" is -r266778 from 2014-May-27. -r209975 is from 2010-Jul-13. === Mark Millard markmi at dsl-only.net On 2016-Jan-31, at 10:58 PM, Mark Millard wrote: Just a correction to a sentence that I wrote. I had written: > Frame at:0x...90 vs. 0x...1c > call by frame: 0x...b0 vs. 0x...1c > Arglist at: 0x...70 vs. 0x...dc > Locals at: 0x...70 vs. 0x...dc > Previous frame's sp: 0x...90 vs. 0x...1c > > It looks like 4 additional pad bytes on the user/process stack are needed to > get back to alignment. Of course the figures on the right need to get smaller, not larger: The stack grows towards smaller addresses. So to get to 0x...0 on the right I should have said: It looks like 12 additional pad bytes on the user/process stack are needed to get back to alignment. That would produce: Frame at:0x...90 vs. 0x...10 call by frame: 0x...b0 vs. 0x...10 Arglist at: 0x...70 vs. 0x...d0 Locals at: 0x...70 vs. 0x...d0 Previous frame's sp: 0x...90 vs. 0x...10 === Mark Millard markmi at dsl-only.net On 2016-Jan-31, at 10:47 PM, Mark Millard wrote: More evidence: By adding "break raise" and then using "info frame" to show the alignment at that point I can show that the later signal delivery changes the alignment on the user process stack compared to when raise was called. (Later I show the same for thr_kill.) > Breakpoint 2, __raise (s=29) at /usr/src/lib/libc/gen/raise.c:50 > warning: Source file is more recent than executable. > 50if (__sys_thr_self() == -1) > (gdb) info frame > Stack level 0, frame at 0xdc90: > pc = 0x41904630 in __raise (/usr/src/lib/libc/gen/raise.c:50); saved pc = > 0x1800774 > called by frame at 0xdcb0 > source language c. > Arglist at 0xdc70, args: s=29 > Locals at 0xdc70, Previous frame's sp is 0xdc90 > Saved registers: > r29 at 0xdc84, r30 at 0xdc88, r31 at 0xdc8c, pc at 0xdc94, lr > at 0xdc94 > (gdb) cont > Continuing. > > Program received signal SIGINFO, Information request. > > Breakpoint 1, 0x018006d0 in handler () > (gdb) info frame > Stack level 0, frame at 0xd71c: > pc = 0x18006d0 in handler; saved pc = 0xe008 > called by frame at 0xd71c > Arglist at 0xd6dc, args: > Locals at 0xd6dc, Previous frame's sp is 0xd71c > Saved registers: > r31 at 0xd718, pc at 0xd720, lr at 0xd720 Note the difference (raise before delivery vs. handler via delivery): Frame at:0x...90 vs. 0x...1c call by frame: 0x...b0 vs. 0x...1c Arglist at: 0x...70 vs. 0x...dc Locals at: 0x...70 vs. 0x...dc Previous frame's sp: 0x...90 vs. 0x...1c It looks like 4 additional pad bytes on the user/process stack are needed to get back to alignment. [The span of addresses seems to be about: 0xdc90-0xd6dc==0x5B4==1460 (raise's "frame at" minus handler's "Locals at").] If I look at the frame for "break thr_kill" it also still shows an aligned user/process stack before the delivery: > Breakpoint 3, 0x419046a0 in thr_kill () from /lib/libc.so.7 > (gdb) info frame > Stack level 0, frame at 0xdc70: > pc = 0x419046a0 in thr_kill; saved pc = 0x41904650 > called by frame at 0xdc90 > Arglist at 0xdc70, args: > Locals at 0xdc70, Previous frame's sp is 0xdc70 (The relevant addresses are the same as raise showed.) Reminder of the source program structure that uses the potentially frame/stack alignment sensitive libc/stdio library code: > # more sig_snprintf_use_test.c > #include // for signal, SIGINFO, SIG_ERR, raise. > #include // for snprintf > > void handler(int sig) > { > char buf[32]; > snprintf(buf, sizeof buf, "%d", sig); // FreeBSD's world does such > // things in some of its handlers. > } > >