Re: [patch] Fix LRA/reload issue with -fnon-call-exceptions
> I have backported it onto the 8 branch, where it fixes the failure (timeout) > of gnat.dg/opt73.adb for PowerPC/Linux, after testing it on this platform. And now onto the 7 branch, after testing on x86-64/Linux and PowerPC/Linux. -- Eric Botcazou
Re: [patch] Fix LRA/reload issue with -fnon-call-exceptions
> * rtlanal.c (get_initial_register_offset): Fall back to the raw estimate > as long as the epilogue isn't completed. I have backported it onto the 8 branch, where it fixes the failure (timeout) of gnat.dg/opt73.adb for PowerPC/Linux, after testing it on this platform. -- Eric Botcazou
Re: [patch] Fix LRA/reload issue with -fnon-call-exceptions
> Doesn't the frame have to be laid out correctly during the final reload > subpass anyway, in order to get the right elimination choices and > elimination offsets? If so, I'm not sure what the " && > info->reload_completed" check in rs6000_stack_info achieves. It seems like > that's at least partly responsible for the problem. Clearly a comment would be in order. I'm not sure it's responsible for the problem, unless you mean that it's papering over the underlying issue, in which case you're probably right. This underlying issue is probably that: /* Calculate which registers need to be saved & save area size. */ info->first_gp_reg_save = first_reg_to_save (); overlooks the specific status of the frame pointer, i.e. it doesn't test frame_pointer_needed explicitly, which I think is pretty much mandatory. The rest of the code apparently works correctly despite this though. > I realise you've already applied it and that you saw it as a hack too, > but this seems like a bit too much. IMO a cleaner fix would be to define > TARGET_COMPUTE_FRAME_LAYOUT for rs6000. Well, it's a hack on top of a big kludge (the get_initial_register_offset stuff in rtlanal.c) so it can be viewed as a safety net too. :-) > I guess that approach means that TARGET_COMPUTE_FRAME_LAYOUT isn't really > optional though. IMO another workaround for the underlying issue. -- Eric Botcazou
Re: [patch] Fix LRA/reload issue with -fnon-call-exceptions
Eric Botcazou writes: > Hi, > > this is a regression present on all active branches since the controversial > get_initial_register_offset stuff was added to rtlanal.c some time ago, and > visible in the testsuite on PowerPC/Linux under the form of gnat.dg/opt73.adb > timing out at run time. > > The problem is that the compiler generates code that doesn't save the frame > pointer before clobbering it, because rs6000_stack_info computes a wrong > final > (post-reload) stack layout. The scenario is as follows: LRA decides to use > the frame pointer, sets reload_completed to 1 at the end and then does: > > /* We've possibly turned single trapping insn into multiple ones. */ > if (cfun->can_throw_non_call_exceptions) > { > auto_sbitmap blocks (last_basic_block_for_fn (cfun)); > bitmap_ones (blocks); > find_many_sub_basic_blocks (blocks); > } > > But find_many_sub_basic_blocks calls control_flow_insn_p, which in turn can > call rtx_addr_can_trap_p_1, which can call get_initial_register_offset, which > uses INITIAL_ELIMINATION_OFFSET, IOW rs6000_initial_elimination_offset, which > calls rs6000_stack_info. But at this point the DF information hasn't been > updated so the frame pointer isn't detected as live by df_regs_ever_live_p. Doesn't the frame have to be laid out correctly during the final reload subpass anyway, in order to get the right elimination choices and elimination offsets? If so, I'm not sure what the " && info->reload_completed" check in rs6000_stack_info achieves. It seems like that's at least partly responsible for the problem. > You may think that the fix is just to set reload_completed to 1 after the > above code in lra, but that's not sufficient because the same issue can arise > from the do_reload function: > > if (optimize) > cleanup_cfg (CLEANUP_EXPENSIVE); > > when checking is enabled, because cleanup_cfg can calls control_flow_insn_p > and then eventually rtx_addr_can_trap_p_1. In other words, we would need > to set reload_completed to 1 only after the above code, which is very late. > As a matter of fact, that's not possible for old reload itself because of: > > /* We must set reload_completed now since the cleanup_subreg_operands call > below will re-recognize each insn and reload may have generated insns > which are only valid during and after reload. */ > reload_completed = 1; > > So, barring the removal of the get_initial_register_offset stuff, the only > simple fix is probably to prevent it from calling into the back-end too > early, > for example with the attached fixlet. Tested on x86-64 and PowerPC/Linux. > > Thoughts? Where do we want to fix this? > > > * rtlanal.c (get_initial_register_offset): Fall back to the raw estimate > as long as the epilogue isn't completed. I realise you've already applied it and that you saw it as a hack too, but this seems like a bit too much. IMO a cleaner fix would be to define TARGET_COMPUTE_FRAME_LAYOUT for rs6000. I guess that approach means that TARGET_COMPUTE_FRAME_LAYOUT isn't really optional though. Thanks, Richard
Re: [patch] Fix LRA/reload issue with -fnon-call-exceptions
> So, barring the removal of the get_initial_register_offset stuff, the only > simple fix is probably to prevent it from calling into the back-end too > early, for example with the attached fixlet. Tested on x86-64 and > PowerPC/Linux. > > Thoughts? Where do we want to fix this? > > > * rtlanal.c (get_initial_register_offset): Fall back to the raw estimate > as long as the epilogue isn't completed. I have installed it on mainline only for now. -- Eric Botcazou
[patch] Fix LRA/reload issue with -fnon-call-exceptions
Hi, this is a regression present on all active branches since the controversial get_initial_register_offset stuff was added to rtlanal.c some time ago, and visible in the testsuite on PowerPC/Linux under the form of gnat.dg/opt73.adb timing out at run time. The problem is that the compiler generates code that doesn't save the frame pointer before clobbering it, because rs6000_stack_info computes a wrong final (post-reload) stack layout. The scenario is as follows: LRA decides to use the frame pointer, sets reload_completed to 1 at the end and then does: /* We've possibly turned single trapping insn into multiple ones. */ if (cfun->can_throw_non_call_exceptions) { auto_sbitmap blocks (last_basic_block_for_fn (cfun)); bitmap_ones (blocks); find_many_sub_basic_blocks (blocks); } But find_many_sub_basic_blocks calls control_flow_insn_p, which in turn can call rtx_addr_can_trap_p_1, which can call get_initial_register_offset, which uses INITIAL_ELIMINATION_OFFSET, IOW rs6000_initial_elimination_offset, which calls rs6000_stack_info. But at this point the DF information hasn't been updated so the frame pointer isn't detected as live by df_regs_ever_live_p. You may think that the fix is just to set reload_completed to 1 after the above code in lra, but that's not sufficient because the same issue can arise from the do_reload function: if (optimize) cleanup_cfg (CLEANUP_EXPENSIVE); when checking is enabled, because cleanup_cfg can calls control_flow_insn_p and then eventually rtx_addr_can_trap_p_1. In other words, we would need to set reload_completed to 1 only after the above code, which is very late. As a matter of fact, that's not possible for old reload itself because of: /* We must set reload_completed now since the cleanup_subreg_operands call below will re-recognize each insn and reload may have generated insns which are only valid during and after reload. */ reload_completed = 1; So, barring the removal of the get_initial_register_offset stuff, the only simple fix is probably to prevent it from calling into the back-end too early, for example with the attached fixlet. Tested on x86-64 and PowerPC/Linux. Thoughts? Where do we want to fix this? * rtlanal.c (get_initial_register_offset): Fall back to the raw estimate as long as the epilogue isn't completed. -- Eric BotcazouIndex: rtlanal.c === --- rtlanal.c (revision 268849) +++ rtlanal.c (working copy) @@ -359,10 +359,10 @@ get_initial_register_offset (int from, i if (to == from) return 0; - /* It is not safe to call INITIAL_ELIMINATION_OFFSET - before the reload pass. We need to give at least - an estimation for the resulting frame size. */ - if (! reload_completed) + /* It is not safe to call INITIAL_ELIMINATION_OFFSET before the epilogue + is completed, but we need to give at least an estimate for the stack + pointer based on the frame size. */ + if (!epilogue_completed) { offset1 = crtl->outgoing_args_size + get_frame_size (); #if !STACK_GROWS_DOWNWARD