Re: [RS6000] asynch exceptions and unwind info
On Fri, Jul 29, 2011 at 10:28:28PM +0930, Alan Modra wrote:
> libgcc/
>   * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
>   Restore for indirect call bctrl from correct stack slot, and only
>   if cfa+40 isn't valid.
> gcc/
>   * config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
>   * config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
>   (rs6000_emit_prologue): Don't prematurely return when
>   TARGET_SINGLE_PIC_BASE.  Don't emit eh_frame info in
>   save_toc_in_prologue case.
>   (rs6000_call_indirect_aix): Only disallow save_toc_in_prologue for
>   calls_alloca.

Approved offline and applied with a comment change.

--
Alan Modra
Australia Development Lab, IBM
Re: [RS6000] asynch exceptions and unwind info
On Fri, Jul 29, 2011 at 10:57:48AM +0930, Alan Modra wrote:
> Except that any info about r2 in an indirect call sequence really belongs to the *called* function frame, not the caller. I woke up this morning with the realization that what I'd done in frob_update_context for indirect call sequences was wrong. Ditto for the r2 store that Michael moved into the prologue. The only time we want the unwinder to restore from that particular save is if r2 isn't saved in the current frame.
>
> Untested patch follows.

Here's a tested patch that fixes an issue with TARGET_SINGLE_PIC_BASE and enables Michael's save_toc_in_prologue optimization for all functions except those that make dynamic stack adjustments.

Incidentally, the rs6000_emit_prologue comment I added below suggests another solution. Since all we need is the toc pointer for the frame, it would be possible to tell the unwinder to simply load r2 from the .opd entry. I think..

libgcc/
  * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
  Restore for indirect call bctrl from correct stack slot, and only
  if cfa+40 isn't valid.
gcc/
  * config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
  * config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
  (rs6000_emit_prologue): Don't prematurely return when
  TARGET_SINGLE_PIC_BASE.  Don't emit eh_frame info in
  save_toc_in_prologue case.
  (rs6000_call_indirect_aix): Only disallow save_toc_in_prologue for
  calls_alloca.

Index: libgcc/config/rs6000/linux-unwind.h
===================================================================
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
           /* We are in a plt call stub or r2 adjusting long branch stub,
              before r2 has been saved.  Keep REG_UNSAVED.  */
         }
-      else if (pc[0] == 0x4E800421
-               && pc[1] == 0xE8410028)
-        {
-          /* We are at the bctrl instruction in a call via function
-             pointer.  gcc always emits the load of the new r2 just
-             before the bctrl.  */
-          _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-        }
       else
         {
           unsigned int *insn
             = (unsigned int *) _Unwind_GetGR (context, R_LR);
           if (insn && *insn == 0xE8410028)
             _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+          else if (pc[0] == 0x4E800421
+                   && pc[1] == 0xE8410028)
+            {
+              /* We are at the bctrl instruction in a call via function
+                 pointer.  gcc always emits the load of the new R2 just
+                 before the bctrl so this is the first and only place
+                 we need to use the stored R2.  */
+              _Unwind_Word sp = _Unwind_GetGR (context, 1);
+              _Unwind_SetGRPtr (context, 2, sp + 40);
+            }
         }
     }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 176905)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20478,14 +20504,12 @@ rs6000_emit_prologue (void)
       insn = emit_insn (generate_set_vrsave (reg, info, 0));
     }
 
-  if (TARGET_SINGLE_PIC_BASE)
-    return; /* Do not set PIC register */
-
   /* If we are using RS6000_PIC_OFFSET_TABLE_REGNUM, we need to set it up.  */
-  if ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
-      || (DEFAULT_ABI == ABI_V4
-          && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
-          && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM)))
+  if (!TARGET_SINGLE_PIC_BASE
+      && ((TARGET_TOC && TARGET_MINIMAL_TOC && get_pool_size () != 0)
+          || (DEFAULT_ABI == ABI_V4
+              && (flag_pic == 1 || (flag_pic && TARGET_SECURE_PLT))
+              && df_regs_ever_live_p (RS6000_PIC_OFFSET_TABLE_REGNUM))))
     {
       /* If
Re: [RS6000] asynch exceptions and unwind info
On Thu, Jul 28, 2011 at 9:27 PM, Alan Modra amo...@gmail.com wrote:
> Right, but I was talking about the normal case, where the unwinder won't even look at .glink unwind info.
>
>> The whole problem is that toc pointer copy in 40(1) is only valid during indirect call sequences, and iff ld inserted a stub? I.e. direct calls between functions that share toc pointers never save the copy?
>
> Yes.
>
>> Would it make sense, if a function has any indirect call, to move the toc pointer save into the prologue? You'd get to avoid that store all the time. Of course you'd not be able to sink the load after the call, but it might still be a win. And in that special case you can annotate the r2 save slot just once, correctly.
>
> Except that any info about r2 in an indirect call sequence really belongs to the *called* function frame, not the caller. I woke up this morning with the realization that what I'd done in frob_update_context for indirect call sequences was wrong. Ditto for the r2 store that Michael moved into the prologue. The only time we want the unwinder to restore from that particular save is if r2 isn't saved in the current frame.

This discussion seems to be referencing both PLT stubs and pointer glue. Indirect calls through a function pointer create a frame, save R2, and the unwinder can visit that frame. PLT stub calls are tail calls, save R2, and the unwinder would only visit the frame if an exception occurs in the middle of a call. One can also add lazy resolution using the glink code, which performs additional work in the dynamic linker on the first call.

Which has the problem? Which are you trying to solve? And how is your change solving it?

Thanks, David
Re: [RS6000] asynch exceptions and unwind info
On Fri, Jul 29, 2011 at 09:16:09AM -0400, David Edelsohn wrote:
> Which has the problem? Which are you trying to solve? And how is your change solving it?

Michael's save_toc_in_prologue emit_frame_save writes unwind info for the wrong frame. That r2 save is the current r2. What we need is info about the previous r2, so we can restore when unwinding.

I made a similar mistake in frob_update_context in that the value saved by an indirect function call sequence is the r2 for the current function. I also restored from the wrong location.

--
Alan Modra
Australia Development Lab, IBM
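The decision order in the revised frob_update_context hunk posted in this thread can be modelled outside libgcc. The following is only a sketch: `enum r2_source` and `classify_r2` are hypothetical names, the function returns a label instead of calling `_Unwind_SetGRPtr`, and the `0xFFFF0000` mask is inferred from the encoding of `addis r12,2,...` rather than read from the posted patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define STD_R2_40_R1 0xF8410028u  /* std 2,40(1) */
#define LD_R2_40_R1  0xE8410028u  /* ld 2,40(1) */
#define BCTRL        0x4E800421u  /* bctrl */
#define ADDIS_R12_R2 0x3D820000u  /* addis r12,2,imm with imm masked off */

/* Hypothetical labels for where the unwinder should find r2.  */
enum r2_source { R2_UNSAVED, R2_AT_CFA_PLUS_40, R2_AT_SP_PLUS_40 };

/* pc points at the instruction where the frame is stopped (at least two
   readable words); lr points at the instruction at the return address,
   or is NULL when that is unknown.  */
static enum r2_source classify_r2(const uint32_t *pc, const uint32_t *lr)
{
    assert(pc != NULL);

    /* In a plt call stub or r2 adjusting long branch stub, before the
       save of r2: leave r2 alone.  */
    if (pc[0] == STD_R2_40_R1
        || ((pc[0] & 0xFFFF0000u) == ADDIS_R12_R2 && pc[1] == STD_R2_40_R1))
        return R2_UNSAVED;

    /* The caller reloads r2 after the call, so a stub saved it.  */
    if (lr != NULL && *lr == LD_R2_40_R1)
        return R2_AT_CFA_PLUS_40;

    /* Stopped on the bctrl of an indirect call, new r2 already loaded.  */
    if (pc[0] == BCTRL && pc[1] == LD_R2_40_R1)
        return R2_AT_SP_PLUS_40;

    return R2_UNSAVED;
}
```

The point of the ordering is that the stack slot addressed from the current stack pointer is consulted only when the cfa+40 test has not already matched.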
Re: [RS6000] asynch exceptions and unwind info
On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
> Ideally what I'd like to do is have ld and gcc emit accurate r2 tracking unwind info and dispense with hacks like frob_update_context. If ld did emit accurate unwind info for .glink, then the justification for frob_update_context disappears.

For the record, this statement of mine doesn't make sense. A .glink stub doesn't make a frame, so a backtrace won't normally pass through a stub, thus having accurate unwind info for .glink doesn't help at all. ld would need to insert unwind info for r2 on the call, but that involves editing .eh_frame and in any case isn't accurate since the r2 save doesn't happen until one or two instructions after the call, in the stub. I think we are stuck with frob_update_context.

--
Alan Modra
Australia Development Lab, IBM
Re: [RS6000] asynch exceptions and unwind info
On 07/28/2011 12:27 AM, Alan Modra wrote:
> On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
>> Ideally what I'd like to do is have ld and gcc emit accurate r2 tracking unwind info and dispense with hacks like frob_update_context. If ld did emit accurate unwind info for .glink, then the justification for frob_update_context disappears.
>
> For the record, this statement of mine doesn't make sense. A .glink stub doesn't make a frame, so a backtrace won't normally pass through a stub, thus having accurate unwind info for .glink doesn't help at all.

It does, for the duration of the stub.

The whole problem is that toc pointer copy in 40(1) is only valid during indirect call sequences, and iff ld inserted a stub? I.e. direct calls between functions that share toc pointers never save the copy?

Would it make sense, if a function has any indirect call, to move the toc pointer save into the prologue? You'd get to avoid that store all the time. Of course you'd not be able to sink the load after the call, but it might still be a win. And in that special case you can annotate the r2 save slot just once, correctly.

For functions that do not contain an indirect function call, I don't believe that there's any way to use DW_CFA_offset that is always correct. One could, however, move the code in frob_update_context into a (series of) DW_CFA_val_expression's.

        DW_CFA_val_expression
          DW_OP_reg2                 // Default to the value currently in R2
          DW_OP_regx LR              // Test the insn following the call, as per frob_update_context
          DW_OP_deref_size 4
          DW_OP_const4u 0xE8410028
          DW_OP_ne
          DW_OP_bra L1
          DW_OP_drop                 // Could be omitted, given that we only examine top-of-stack at the end
          DW_OP_breg1 40             // Pull the value from *(R1+40)
          DW_OP_deref
        L1:

This version could appear in the CIE. You'd have to adjust it once LR gets saved to the stack, and R2 isn't itself being saved as per above.

There isn't currently a hook in dwarf2cfi to add extra stuff to the CIE program, but that wouldn't be hard to add. The version that gets emitted after LR is saved would need a new note as well. But it all seems fairly tractable to actually implement, if we think it'll actually solve the problem.

r~
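For readers less used to DWARF stack programs, the intent of the expression above can be restated in C. This is only a sketch of the intended logic, not a DWARF evaluator: `caller_r2` and its parameters are hypothetical, and memory is passed in as host pointers so the example runs anywhere.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LD_R2_40_R1 0xE8410028u  /* ld 2,40(1), the TOC restore after a call */

/* r2: current value of R2 (the DW_OP_reg2 default).
   lr: address held in LR, i.e. the instruction following the call.
   r1: address held in R1, the stack pointer.
   Returns the value the unwinder should use for R2.  */
static uint64_t caller_r2(uint64_t r2, const uint32_t *lr,
                          const unsigned char *r1)
{
    uint64_t saved;

    assert(lr != NULL && r1 != NULL);

    /* DW_OP_regx LR; DW_OP_deref_size 4; DW_OP_const4u; DW_OP_ne; DW_OP_bra L1:
       if the insn after the call is not the TOC restore, keep R2.  */
    if (*lr != LD_R2_40_R1)
        return r2;

    /* DW_OP_drop; DW_OP_breg1 40; DW_OP_deref: load the copy at R1+40.  */
    memcpy(&saved, r1 + 40, sizeof saved);
    return saved;
}
```

The same test appears in frob_update_context as `*insn == 0xE8410028`; the expression form only moves it from the unwinder's C code into the unwind data.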
Re: [RS6000] asynch exceptions and unwind info
On Thu, Jul 28, 2011 at 2:49 PM, Richard Henderson r...@redhat.com wrote:
> The whole problem is that toc pointer copy in 40(1) is only valid during indirect call sequences, and iff ld inserted a stub? I.e. direct calls between functions that share toc pointers never save the copy?
>
> Would it make sense, if a function has any indirect call, to move the toc pointer save into the prologue? You'd get to avoid that store all the time. Of course you'd not be able to sink the load after the call, but it might still be a win. And in that special case you can annotate the r2 save slot just once, correctly.

Michael Meissner recently did move the R2 save into the prologue, under certain circumstances. See TARGET_SAVE_TOC_INDIRECT. Limitations include alloca (unless one re-copies the R2). Mike also encountered some problems with EH, which may be related to this discussion.

The other problem is that hoisting the store into the prologue is not always profitable for performance. It should be better once shrink wrapping is implemented. Currently the PPC ABI may perform a lot of stores in the prologue if the function *may* make a call. R2 adds yet another store to the common path.

- David
Re: [RS6000] asynch exceptions and unwind info
On 07/28/2011 12:02 PM, David Edelsohn wrote:
> The other problem is hoisting the store into the prologue is not always profitable for performance. It should be better once shrink wrapping is implemented. Currently the PPC ABI may perform a lot of stores in the prologue if the function *may* make a call. R2 adds yet another store to the common path.

Well, even if we're not able to hoist the R2 store, we may be able to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns in the stream.

r~
Re: [RS6000] asynch exceptions and unwind info
On Thu, Jul 28, 2011 at 11:49:16AM -0700, Richard Henderson wrote:
> On 07/28/2011 12:27 AM, Alan Modra wrote:
>> On Wed, Jul 27, 2011 at 03:00:45PM +0930, Alan Modra wrote:
>>> Ideally what I'd like to do is have ld and gcc emit accurate r2 tracking unwind info and dispense with hacks like frob_update_context. If ld did emit accurate unwind info for .glink, then the justification for frob_update_context disappears.
>>
>> For the record, this statement of mine doesn't make sense. A .glink stub doesn't make a frame, so a backtrace won't normally pass through a stub, thus having accurate unwind info for .glink doesn't help at all.
>
> It does, for the duration of the stub.

Right, but I was talking about the normal case, where the unwinder won't even look at .glink unwind info.

> The whole problem is that toc pointer copy in 40(1) is only valid during indirect call sequences, and iff ld inserted a stub? I.e. direct calls between functions that share toc pointers never save the copy?

Yes.

> Would it make sense, if a function has any indirect call, to move the toc pointer save into the prologue? You'd get to avoid that store all the time. Of course you'd not be able to sink the load after the call, but it might still be a win. And in that special case you can annotate the r2 save slot just once, correctly.

Except that any info about r2 in an indirect call sequence really belongs to the *called* function frame, not the caller. I woke up this morning with the realization that what I'd done in frob_update_context for indirect call sequences was wrong. Ditto for the r2 store that Michael moved into the prologue. The only time we want the unwinder to restore from that particular save is if r2 isn't saved in the current frame.

Untested patch follows.

libgcc/
  * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
  Restore for indirect call bctrl from correct stack slot, and only
  if cfa+40 isn't valid.
gcc/
  * config/rs6000/rs6000-protos.h (rs6000_save_toc_in_prologue_p): Delete.
  * config/rs6000/rs6000.c (rs6000_save_toc_in_prologue_p): Make static.
  (rs6000_emit_prologue): Don't emit eh_frame info in
  save_toc_in_prologue case.
  (rs6000_call_indirect_aix): Formatting.

Index: libgcc/config/rs6000/linux-unwind.h
===================================================================
--- libgcc/config/rs6000/linux-unwind.h (revision 176905)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -354,20 +354,22 @@ frob_update_context (struct _Unwind_Cont
           /* We are in a plt call stub or r2 adjusting long branch stub,
              before r2 has been saved.  Keep REG_UNSAVED.  */
         }
-      else if (pc[0] == 0x4E800421
-               && pc[1] == 0xE8410028)
-        {
-          /* We are at the bctrl instruction in a call via function
-             pointer.  gcc always emits the load of the new r2 just
-             before the bctrl.  */
-          _Unwind_SetGRPtr (context, 2, context->cfa + 40);
-        }
       else
         {
           unsigned int *insn
             = (unsigned int *) _Unwind_GetGR (context, R_LR);
           if (insn && *insn == 0xE8410028)
             _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+          else if (pc[0] == 0x4E800421
+                   && pc[1] == 0xE8410028)
+            {
+              /* We are at the bctrl instruction in a call via function
+                 pointer.  gcc always emits the load of the new R2 just
+                 before the bctrl so this is the first and only place
+                 we need to use the stored R2.  */
+              _Unwind_Word sp = _Unwind_GetGR (context, 1);
+              _Unwind_SetGRPtr (context, 2, sp + 40);
+            }
         }
     }
 #endif
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h (revision 176905)
+++ gcc/config/rs6000/rs6000-protos.h (working copy)
@@ -172,8 +172,6 @@ extern void rs6000_emit_epilogue (int);
 extern void rs6000_emit_eh_reg_restore (rtx, rtx);
 extern const char * output_isel (rtx *);
 extern void rs6000_call_indirect_aix (rtx, rtx, rtx);
-extern bool rs6000_save_toc_in_prologue_p (void);
-
 extern void rs6000_aix_asm_output_dwarf_table_ref (char *);
 
 /* Declare functions in rs6000-c.c */
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c (revision 176905)
+++ gcc/config/rs6000/rs6000.c (working copy)
@@ -1178,6 +1178,7 @@ static void rs6000_conditional_register_
 static void rs6000_trampoline_init (rtx, tree, rtx);
 static bool rs6000_cannot_force_const_mem (enum machine_mode, rtx);
 static bool rs6000_legitimate_constant_p (enum machine_mode, rtx);
+static bool rs6000_save_toc_in_prologue_p (void);
 
 /* Hash table stuff for keeping track of TOC entries.  */
 
@@ -20536,8 +20562,11 @@ rs6000_emit_prologue (void)
 /* If we
Re: [RS6000] asynch exceptions and unwind info
On Thu, Jul 28, 2011 at 12:09:51PM -0700, Richard Henderson wrote:
> Well, even if we're not able to hoist the R2 store, we may be able to simply add REG_CFA_OFFSET and REG_CFA_RESTORE notes to the insns in the stream.

You'd need to mark every non-local call with something that says R2 may be saved, effectively duplicating md_frob_update in dwarf. I guess that is possible even without extending our eh encoding, but each call would have at least 6 bytes added to eh_frame:

        DW_CFA_expression, 2, 3, DW_OP_skip, offset_to_r2_prog

and you'd need to emit multiple copies of r2_prog for functions that have a lot of calls, since the offset is limited to +/-32k. I think that would inflate the size of .eh_frame too much, and slow down handling of exceptions dramatically.

--
Alan Modra
Australia Development Lab, IBM
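The 6-byte figure and the +/-32k limit both follow from how that sequence is encoded: one opcode byte, two single-byte ULEB128 operands, one DW_OP_skip byte, and a signed 2-byte operand. A minimal sketch of the byte layout, assuming big-endian output as on powerpc64-linux at the time; `emit_r2_cfi` is a hypothetical helper, not something from gcc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcode values from the DWARF specification.  */
enum { DW_CFA_expression = 0x10, DW_OP_skip = 0x2f };

/* Write the per-call CFI sequence into out and return its length.
   offset_to_r2_prog is the signed 16-bit DW_OP_skip operand, which is
   why a shared r2_prog must lie within 32k of every call that uses it.  */
static size_t emit_r2_cfi(uint8_t *out, size_t cap, int16_t offset_to_r2_prog)
{
    uint16_t off = (uint16_t) offset_to_r2_prog;

    assert(out != NULL && cap >= 6);
    out[0] = DW_CFA_expression;
    out[1] = 2;                    /* ULEB128 register number: r2 */
    out[2] = 3;                    /* ULEB128 length of the expression block */
    out[3] = DW_OP_skip;
    out[4] = (uint8_t) (off >> 8); /* big-endian operand (assumption) */
    out[5] = (uint8_t) (off & 0xFF);
    return 6;
}
```

Any DW_CFA_advance_loc needed to attach the sequence to a particular call site would come on top of these 6 bytes, which fits the "at least" in the estimate.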
Re: [RS6000] asynch exceptions and unwind info
On Wed, Jul 27, 2011 at 1:30 AM, Alan Modra amo...@gmail.com wrote:
>   * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
>   Leave r2 REG_UNSAVED if stopped on the instruction that saves r2
>   in a plt call stub.  Do restore r2 if stopped on bctrl.

Okay.

Thanks, David
[RS6000] asynch exceptions and unwind info
Hi David,

I've been looking into what we need to do to support unwinding from async signal handlers. I've implemented unwind info generation for .glink in the linker, but to keep the ppc64 .glink unwind info simple I've assumed that frob_update_context is still used.

We still have some difficulties related to r2 tracking on ppc64. frob_update_context doesn't quite do the right thing for async unwinding. A typical (no-r11) plt call stub looks like

        addis r12,2,off@ha
        std 2,40(1)
        ld 11,off@l(12)
        mtctr 11
        ld 2,off+8@l(12)
        bctr

or, when the offset from r2 to the function descriptor is small

        std 2,40(1)
        ld 11,off(2)
        mtctr 11
        ld 2,off+8(2)
        bctr

Now if we're stopped before the save of r2 we obviously don't want the unwinder to restore r2 from 40(1), but that's exactly what the current unwinder does.

Also, there is a one insn window where frob_update_context may do the wrong thing for gcc generated calls via function pointer, which typically looks like

        ld 0,0(r)
        std 2,40(1)
        mtctr 0
        ld 2,8(r)
        bctrl
        ld 2,40(1)

Here, if we are stopped after the ld 2,8(r) then r2 needs to be restored from 40(1).

The following patch fixes these two issues. Ideally what I'd like to do is have ld and gcc emit accurate r2 tracking unwind info and dispense with hacks like frob_update_context. If ld did emit accurate unwind info for .glink, then the justification for frob_update_context disappears.

The difficulty then is backwards compatibility. You'd need a way for the gcc unwinder to handle a mix of old code (that needs frob_update_context) with new code (that doesn't). One way to accomplish this would be to set a dummy reg with initial CIE dwarf instructions, then test this reg in frob_update_context.

Bootstrapped and regression tested powerpc64-linux.

  * config/rs6000/linux-unwind.h (frob_update_context <__powerpc64__>):
  Leave r2 REG_UNSAVED if stopped on the instruction that saves r2
  in a plt call stub.  Do restore r2 if stopped on bctrl.

Index: libgcc/config/rs6000/linux-unwind.h
===================================================================
--- libgcc/config/rs6000/linux-unwind.h (revision 176780)
+++ libgcc/config/rs6000/linux-unwind.h (working copy)
@@ -346,10 +346,28 @@ frob_update_context (struct _Unwind_Cont
          figure out if it was saved.  The big problem here is that the
          code that does the save/restore is generated by the linker, so
          we have no good way to determine at compile time what to do.  */
-      unsigned int *insn
-        = (unsigned int *) _Unwind_GetGR (context, R_LR);
-      if (insn && *insn == 0xE8410028)
-        _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+      if (pc[0] == 0xF8410028
+          || ((pc[0] & 0xFFFF0000) == 0x3D820000
+              && pc[1] == 0xF8410028))
+        {
+          /* We are in a plt call stub or r2 adjusting long branch stub,
+             before r2 has been saved.  Keep REG_UNSAVED.  */
+        }
+      else if (pc[0] == 0x4E800421
+               && pc[1] == 0xE8410028)
+        {
+          /* We are at the bctrl instruction in a call via function
+             pointer.  gcc always emits the load of the new r2 just
+             before the bctrl.  */
+          _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+        }
+      else
+        {
+          unsigned int *insn
+            = (unsigned int *) _Unwind_GetGR (context, R_LR);
+          if (insn && *insn == 0xE8410028)
+            _Unwind_SetGRPtr (context, 2, context->cfa + 40);
+        }
     }
 #endif
 }

--
Alan Modra
Australia Development Lab, IBM
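The hexadecimal constants tested in the patch are simply the encodings of instructions from the listings in this message. A small sketch that rebuilds them from their instruction fields may make the pattern matching easier to follow; the helper names are made up for illustration, and the field layouts are the standard PowerPC D, DS and XL forms.

```c
#include <assert.h>
#include <stdint.h>

/* Primary opcode plus the two 5-bit fields that follow it.  */
static uint32_t insn_fields(uint32_t opcode, uint32_t f1, uint32_t f2)
{
    return (opcode << 26) | (f1 << 21) | (f2 << 16);
}

/* DS-form std rs,disp(ra): primary opcode 62, low two bits 0.  */
static uint32_t encode_std(uint32_t rs, int32_t disp, uint32_t ra)
{
    assert((disp & 3) == 0);  /* DS-form displacements are multiples of 4 */
    return insn_fields(62, rs, ra) | ((uint32_t) disp & 0xFFFCu);
}

/* DS-form ld rt,disp(ra): primary opcode 58, low two bits 0.  */
static uint32_t encode_ld(uint32_t rt, int32_t disp, uint32_t ra)
{
    assert((disp & 3) == 0);
    return insn_fields(58, rt, ra) | ((uint32_t) disp & 0xFFFCu);
}

/* D-form addis rt,ra,imm: primary opcode 15.  */
static uint32_t encode_addis(uint32_t rt, uint32_t ra, uint16_t imm)
{
    return insn_fields(15, rt, ra) | imm;
}

/* XL-form bcctrl with BO=20 (branch always), BI=0: primary opcode 19,
   extended opcode 528, LK=1.  */
static uint32_t encode_bctrl(void)
{
    return insn_fields(19, 20, 0) | (528u << 1) | 1u;
}
```

With these, `encode_std(2, 40, 1)` is 0xF8410028, `encode_ld(2, 40, 1)` is 0xE8410028, `encode_bctrl()` is 0x4E800421, and any `encode_addis(12, 2, imm)` has 0x3D82 in its upper half, which is why the patch masks off the low 16 bits before comparing.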