Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-12-05 Thread David Edelsohn
On Fri, Nov 4, 2011 at 9:07 AM, Alan Modra  wrote:

> OK to apply?  gcc-4.6 branch too?
>
>        PR target/50906
>        * config/rs6000/rs6000.c (rs6000_emit_prologue ):
>        Do not mark r11 setup as frame-related.  Pass correct offset to
>        rs6000_emit_savres_rtx.  Correct out-of-line rs6000_frame_related
>        arguments.  Correct sp_offset.  Remove "offset" fudge from
>        in-line rs6000_frame_related call.  Rename misleading variable.
>        Fix comments and whitespace.  Tidy some expressions.
>        (rs6000_emit_epilogue ): Always set frame_reg_rtx
>        to r11 in out-of-line case.  Correct sp_offset.  Pass correct
>        offset to rs6000_emit_savres_rtx.  Rename misleading variable.
>        Fix comments and whitespace.  Tidy some expressions.
>        (rs6000_emit_epilogue ): Add sp_offset
>        adjustment when !saving_GPRs_inline.  Correct register mode
>        used in address calcs.
>        (rs6000_emit_epilogue ): Similarly when
>        !restoring_GPRs_inline.

Okay for trunk and 4.6.

Thanks, David


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-12-05 Thread Alan Modra
Ping http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00543.html

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-18 Thread Olivier Hainque

On Nov 17, 2011, at 20:46 , David Edelsohn wrote:

> Coincidentally, IBM's XL compiler is encountering a similar issue and
> proposing a similar solution of having the stack pointer update
> conflict with all memory accesses.

 Interesting timing :-)

> I think all these corner cases
> confirm that it is impractical for the compiler to track all accesses
> against SP and the only expedient solution is a hammer like
> conflicting with all memory.

 OK, I'll come up with a patch. I'll compare the generated assembly for
 the Ada runtime lib with and without the change. That should be easy to
 do and will provide a datapoint.

 Thanks for your feedback.

 Olivier



Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-17 Thread David Edelsohn
Coincidentally, IBM's XL compiler is encountering a similar issue and
proposing a similar solution of having the stack pointer update
conflict with all memory accesses.  I think all these corner cases
confirm that it is impractical for the compiler to track all accesses
against SP and the only expedient solution is a hammer like
conflicting with all memory.

Thanks, David


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-16 Thread David Edelsohn
On Wed, Nov 16, 2011 at 6:54 AM, Olivier Hainque  wrote:
>
> On Nov 9, 2011, at 18:15 , Olivier Hainque wrote:
>> I'm not convinced that the potential gain is worth the extra
>> complexity and potential risk of running into another subtle
>> subcase, with hard to track sporadic runtime failures for
>> starters. I don't have numbers though.
>>
>> That's a port maintainer call, I guess ?
>
>  David, opinion on this point ?
>
>  My understanding is that we have two options
>
>  1) try to preserve the current attempt at maximizing
>    optimization opportunities with precise stack/frame
>    tie insns,
>
>  2) simplify for a slightly more brutal option, with
>    a strong (mem:blk scratch) barrier instead
>
>  My feeling is that 2 would be a sensible option.
>
>  Trying to get the precise insns right has caused multiple
>  issues (several PRs about hard to track silent wrong code
>  generated, trickier implementation), and I'm not convinced
>  that the legitimate code efficiency gains are worth the
>  trouble.
>
>  As I wrote, I don't have numbers to backup the latter point
>  though.

We can try (2), but we will have to benchmark it to determine the
impact on performance.

Thanks, David


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-16 Thread Olivier Hainque

On Nov 9, 2011, at 18:15 , Olivier Hainque wrote:
> I'm not convinced that the potential gain is worth the extra
> complexity and potential risk of running into another subtle
> subcase, with hard to track sporadic runtime failures for
> starters. I don't have numbers though.
> 
> That's a port maintainer call, I guess ?

 David, opinion on this point ?

 My understanding is that we have two options

 1) try to preserve the current attempt at maximizing
optimization opportunities with precise stack/frame
tie insns, 

 2) simplify for a slightly more brutal option, with
a strong (mem:blk scratch) barrier instead

 My feeling is that 2 would be a sensible option.
 
 Trying to get the precise insns right has caused multiple
 issues (several PRs about hard to track silent wrong code
 generated, trickier implementation), and I'm not convinced
 that the legitimate code efficiency gains are worth the 
 trouble.

 As I wrote, I don't have numbers to backup the latter point
 though.

 Thanks for your feedback,

 Olivier

 
 




Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-09 Thread Olivier Hainque

On Nov 8, 2011, at 2:40 PM, Alan Modra wrote:

> On Tue, Nov 08, 2011 at 11:37:57AM +0100, Olivier Hainque wrote:
>> Joseph resorted to mem:scratch to impose a strong barrier. That's certainly
>> safe and I don't think the performance impact can be significant, so this
>> looks like a good way out.
> 
> I agree.  Our fancy powerpc stack deallocation barriers rely too much
> on alias analysis, which as you've shown (thanks!) isn't reliable in
> this instance.  Other targets just use the mem:scratch barrier.

 Right. While the instance we have at hand here involves a bug
 I think (r11 perceived as a base reg to symbol), I don't see that
 we have strong guarantees that the current assumptions hold for
 sure in absence of such a bug.
  
>> I tried an approach in between here: replace the stack_tie insn by a
>> frame_tie insn with frame_base_rtx as the second base. In the case above,
>> with frame_base_rtx==r11, this is something like
>> 
>>  (set (mem/c:BLK (reg:SI 11 11) [0 A8])
>>(unspec:BLK [
>>(mem/c:BLK (reg:SI 11 11) [0 A8])
>>(mem/c:BLK (reg/f:SI 1 1) [0 A8])
> 
> Looks similar to
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30282#c15

 Exactly. The idea here is to use the same mechanism in
 the other places where emit_epilogue emits a tie, making
 sure that the tie mem access will conflict with the frame
 accesses by issuing a write from the same base register.

 Now, if we can indeed have frame accesses using different
 base registers and a tie is needed after that (needs double
 checking), well, we'll need several of them.

 We could probably abstract this away and "just" keep track
 of a set of base registers used before a tie gets emitted.

 I'm not convinced that the potential gain is worth the extra
 complexity and potential risk of running into another subtle
 subcase, with hard to track sporadic runtime failures for
 starters. I don't have numbers though.

 That's a port maintainer call, I guess ?

 Olivier


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-08 Thread Alan Modra
On Tue, Nov 08, 2011 at 11:37:57AM +0100, Olivier Hainque wrote:
>  Joseph resorted to mem:scratch to impose a strong barrier. That's certainly
>  safe and I don't think the performance impact can be significant, so this
>  looks like a good way out.

I agree.  Our fancy powerpc stack deallocation barriers rely too much
on alias analysis, which as you've shown (thanks!) isn't reliable in
this instance.  Other targets just use the mem:scratch barrier.

>  I tried an approach in between here: replace the stack_tie insn by a
>  frame_tie insn with frame_base_rtx as the second base. In the case above,
>  with frame_base_rtx==r11, this is something like
> 
>   (set (mem/c:BLK (reg:SI 11 11) [0 A8])
> (unspec:BLK [
> (mem/c:BLK (reg:SI 11 11) [0 A8])
> (mem/c:BLK (reg/f:SI 1 1) [0 A8])

Looks similar to
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=30282#c15

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-08 Thread Olivier Hainque

On Nov 8, 2011, at 6:13 AM, Alan Modra wrote:

> It's been a while since I looked at what was happening with this
> testcase, but from memory what stops sheduling over the stack_tie is
> alias.c:base_alias_check.  Which relies on tracking registers to see
> whether two memory accesses using different regs could alias.  Quite
> possibly gcc-4.4 is deficient in this area.

 I re-investigated our testcase yesterday and this is indeed what 
 makes the difference. I couldn't reproduce the misbehavior on mainline,
 not because of the recent change you mentioned upthread, but because
 of an earlier one, rev 180522.

 The problem is visible on 180521, and here is what is happening (roughly):
 The testcase (sources below), compiled with -O1 -fschedule-insns2 by a
 powerpc-wrs-vxworks compiler produces in .pro_and_epilogue:

   (insn 29 22 28 2 (set (reg/f:SI 11 11 [133])
  (high:SI (symbol_ref:SI ("mysym11") ...))) ./t.c:19 408 {elf_high}
 (expr_list:REG_EQUIV (high:SI (symbol_ref:SI ("mysym11") ...)

   (insn 28 29 24 2 (set (reg/f:SI 0 0 [orig:123 p11$3 ] [123])
(lo_sum:SI (reg/f:SI 11 11 [133]) (symbol_ref:SI ("mysym11") ) 
 (expr_list:REG_EQUIV (symbol_ref:SI ("mysym11") ...)
   ...

   (note 39 24 40 2 NOTE_INSN_EPILOGUE_BEG)

   (insn 40 39 41 2 (set (reg:SI 11 11)
(plus:SI (reg/f:SI 31 31)
(const_int 48 [0x30]))) ./t.c:22 -1
   ...
   (insn 43 42 44 2 (set (reg:SI 25 25)
(mem/c:SI (plus:SI (reg:SI 11 11)
(const_int -28 [0xffe4])) [0 S4 A8])) ./t.c:22 -1

   (insn/f 44 43 45 2 (set (reg/f:SI 31 31)
(mem/c:SI (plus:SI (reg:SI 11 11)
(const_int -4 [0xfffc])) [0 S4 A8])) ./t.c:22 -1
 
   (insn 45 44 46 2 (set (mem/c:BLK (reg/f:SI 1 1) [0 A8])
(unspec:BLK [
(mem/c:BLK (reg/f:SI 1 1) [0 A8])
] UNSPEC_TIE)) ./t.c:22 -1
 
(insn/f 46 45 47 2 (set (reg/f:SI 1 1)
(reg:SI 11 11)) ./t.c:22 -1


 insn 45 is the "stack tie" intended to prevent a move of the sp restore
 (insn 46) prior to the register restores accessing the frame (insn 43 here) 

 Most of the time, as you say, this works because the mem accesses are
 considered conflicting. In this very particular case,  we get into 
 base_alias_check with 

   x = (reg/f:SI 1 1)
   y = (plus:SI (reg:SI 11 11)
 (const_int -4 [0xfffc]))

 from which base_alias_check infers

   x_base = (address:SI (reg/f:SI 1 1))
   y_base = (symbol_ref:SI ("mysym11") ...)

 (Erm, _that_ doesn't sound right: r11 certainly is not a base access
  to mysym11 past insn 40)

 x_base and y_base are both != 0, and we get (still within base_alias_check)
 into
 
/* If one address is a stack reference there can be no alias:
   stack references using different base registers do not alias,
   a stack reference can not alias a parameter, and a stack reference
   can not alias a global.  */

 Eventually, the tie and the sp restore move up together prior to the
 register restore. 

 rev 180522 changes the registers allocation so that r11 is not used in
 the function body any more.

 That r11 is considered a base to mysym11 certainly looks incorrect. Now,
 looking at the comment quoted above suggests that the current stack tie
 mechanism used in emit_epilogue (relying on mem r11 to always conflict with
 mem:blk sp) is optimistic, regardless, which Joesph's experience seem to
 confirm.

 Joseph resorted to mem:scratch to impose a strong barrier. That's certainly
 safe and I don't think the performance impact can be significant, so this
 looks like a good way out.

 I tried an approach in between here: replace the stack_tie insn by a
 frame_tie insn with frame_base_rtx as the second base. In the case above,
 with frame_base_rtx==r11, this is something like

  (set (mem/c:BLK (reg:SI 11 11) [0 A8])
(unspec:BLK [
(mem/c:BLK (reg:SI 11 11) [0 A8])
(mem/c:BLK (reg/f:SI 1 1) [0 A8])

 As I mentioned upthread, I'm still unclear on a couple of aspects though.
 For example, ISTM that there are paths where we might need a tie and different
 frame base regs were used before. The conditions controlling the different
 paths are intricate so I'm not quite sure. Joseph's approach would for sure
 void the need to seek answers :)

 Olivier

--

char mysym11;
char * volatile g11;

void foo (long x)
{
  /* Force frame_pointer_needed & use of r11 as a frame_reg_rtx
 from emit_epilogue.  */
  char volatile s [x];
  s[0] = 12;

  /* Force a register save/restore.  */
  asm volatile ("" : : : "25");

  /* Try to force something like (set r11 (symref:mysym))
 in the function body.  */
  {
register char * volatile p11 asm("11");

p11 = &mysym11;
g11 = p11;
  }  
}




Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Alan Modra
On Tue, Nov 08, 2011 at 12:30:36AM +, Joseph S. Myers wrote:
> On Tue, 8 Nov 2011, Alan Modra wrote:
> 
> > OK, so you made the stack ties conflict with all memory.  I wondered
> > about that too, but didn't find a testcase where it mattered.  It
> 
> The test I had was one of those from the various PRs for this issue.
> 
> int find_num(int i)
> {
>   const int arr[5] = {0, 1, 2, 3, 4};
>   return arr[i];
> }
> 
> (-O2 -mcpu=603).  But I don't know if it reproduces with FSF 4.4 or trunk; 
> there were plenty of other local changes in the 4.4-based tree where I 
> found this to be necessary.

The testcase generates good code on current gcc-4.7, gcc-4.6, and
gcc-4.5, but fails on current gcc-4.4.  I thought I had this PR
fixed on all active branches.  :-(

It's been a while since I looked at what was happening with this
testcase, but from memory what stops sheduling over the stack_tie is
alias.c:base_alias_check.  Which relies on tracking registers to see
whether two memory accesses using different regs could alias.  Quite
possibly gcc-4.4 is deficient in this area.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Joseph S. Myers
On Tue, 8 Nov 2011, Alan Modra wrote:

> OK, so you made the stack ties conflict with all memory.  I wondered
> about that too, but didn't find a testcase where it mattered.  It

The test I had was one of those from the various PRs for this issue.

int find_num(int i)
{
  const int arr[5] = {0, 1, 2, 3, 4};
  return arr[i];
}

(-O2 -mcpu=603).  But I don't know if it reproduces with FSF 4.4 or trunk; 
there were plenty of other local changes in the 4.4-based tree where I 
found this to be necessary.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Alan Modra
On Mon, Nov 07, 2011 at 04:10:38PM +, Joseph S. Myers wrote:
> FWIW, when I encountered such a problem in 4.4-based tools I found I 
> needed the following change to stack_tie patterns to fix it (applied to 
> and rediffed against trunk, but not tested there), in addition to a 
> DEFAULT_ABI == ABI_V4 check such as you added.  Given all the confusion 
> about this class of bugs and whether they should be addressed in each back 
> end or in a target-independent way, I didn't try to send this upstream at 
> the time.
> 
> 2011-04-11  Joseph Myers  
> 
>   * config/rs6000/rs6000.md (stack_tie): Use (mem:BLK (scratch)) as
>   output of SET.
>   (restore_stack_block, restore_stack_nonlocal): Update UNSPEC_TIE
>   patterns.

OK, so you made the stack ties conflict with all memory.  I wondered
about that too, but didn't find a testcase where it mattered.  It
would certainly be safer to use the big hammer, but on the other hand
it can restrict scheduling unnecessarily.  For example

void byref (int *a)
{
  int x;

  asm ("#%0 set here" : "=r" (x) : : "r29", "r30");
  *a = x;
}

with the existing stack ties, the write to *a happens after stack
deallocation.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Joseph S. Myers
On Mon, 7 Nov 2011, Alan Modra wrote:

> On Mon, Nov 07, 2011 at 10:53:51AM +0100, Olivier Hainque wrote:
> > Another bug we're running into here is an unwarranted move of the sp
> > restore prior to register fetches despite an attempt at preventing that
> > with a stack_tie instruction (VxWorks target).
> > 
> >   http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html
> > 
> > The failure can still be exposed on mainline with a minor adjustment
> > to the C testcase quoted in the msg.
> 
> Even after revision 181056?

FWIW, when I encountered such a problem in 4.4-based tools I found I 
needed the following change to stack_tie patterns to fix it (applied to 
and rediffed against trunk, but not tested there), in addition to a 
DEFAULT_ABI == ABI_V4 check such as you added.  Given all the confusion 
about this class of bugs and whether they should be addressed in each back 
end or in a target-independent way, I didn't try to send this upstream at 
the time.

2011-04-11  Joseph Myers  

* config/rs6000/rs6000.md (stack_tie): Use (mem:BLK (scratch)) as
output of SET.
(restore_stack_block, restore_stack_nonlocal): Update UNSPEC_TIE
patterns.

Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 181085)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -11989,7 +11989,7 @@
 (define_expand "restore_stack_block"
   [(set (match_dup 2) (match_dup 3))
(set (match_dup 4) (match_dup 2))
-   (set (match_dup 5) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
+   (set (mem:BLK (scratch)) (unspec:BLK [(match_dup 5)] UNSPEC_TIE))
(set (match_operand 0 "register_operand" "")
(match_operand 1 "register_operand" ""))]
   ""
@@ -12022,7 +12022,7 @@
   [(set (match_dup 2) (match_operand 1 "memory_operand" ""))
(set (match_dup 3) (match_dup 4))
(set (match_dup 5) (match_dup 2))
-   (set (match_dup 6) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
+   (set (mem:BLK (scratch)) (unspec:BLK [(match_dup 6)] UNSPEC_TIE))
(set (match_operand 0 "register_operand" "") (match_dup 3))]
   ""
   "
@@ -15902,8 +15902,8 @@
 ; These are to explain that changes to the stack pointer should
 ; not be moved over stores to stack memory.
 (define_insn "stack_tie"
-  [(set (match_operand:BLK 0 "memory_operand" "+m")
-(unspec:BLK [(match_dup 0)] UNSPEC_TIE))]
+  [(set (mem:BLK (scratch))
+(unspec:BLK [(match_operand:BLK 0 "memory_operand" "+m")] UNSPEC_TIE))]
   ""
   ""
   [(set_attr "length" "0")])

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread David Edelsohn
On Mon, Nov 7, 2011 at 4:53 AM, Olivier Hainque  wrote:

> There are lots of subtle inter-section dependencies and redundancies
> in emit_epilogue, which has grown pretty difficult to understand IMHO.
>
> I can see two tracks to improve things in this area:
>
> - Concentrate on the sp-move problem at hand and get to an agreement
>  on the proper correction for it. I'll be happy to send my current
>  patch with comments and questions for that.
>
> - Start working on reducing emit_epilogue's complexity, e.g. by first
>  extracting portions of it out into separate functions, then see what
>  concepts we can identify and materialize to simplify the sequence
>  organization as a whole.
>
>  I did an extraction attempt as an exercise to better understand the
>  various sections and dependencies. It did help me and I'd be happy
>  to provide patches gather feedback etc if you believe it might be
>  useful to others.

Refactoring the prologue and epilogue code in rs6000.c would be great
and welcomed.

Thanks, David


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Olivier Hainque

On Nov 7, 2011, at 12:02 PM, Alan Modra wrote:

>> The failure can still be exposed on mainline with a minor adjustment
>> to the C testcase quoted in the msg.
> 
> Even after revision 181056?

 I'll double check but I'm pretty sure that this change cannot
 help the specific problem at hand. VxWorks is V4 and we do have
 a tie insn emitted. It's just that the tie fails to prevent the
 move. I'll re-check with the current sources and followup.

 Olivier




Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Alan Modra
On Mon, Nov 07, 2011 at 10:53:51AM +0100, Olivier Hainque wrote:
> Another bug we're running into here is an unwarranted move of the sp
> restore prior to register fetches despite an attempt at preventing that
> with a stack_tie instruction (VxWorks target).
> 
>   http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html
> 
> The failure can still be exposed on mainline with a minor adjustment
> to the C testcase quoted in the msg.

Even after revision 181056?

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Iain Sandoe


On 7 Nov 2011, at 09:53, Olivier Hainque wrote:


On Nov 4, 2011, at 2:07 PM, Alan Modra wrote:

Lots of bugs here.  Most of them in TARGET_SPE_ABI code, but some  
also

for other ABIs.


...

Another bug we're running into here is an unwarranted move of the sp
restore prior to register fetches despite an attempt at preventing  
that

with a stack_tie instruction (VxWorks target).

 http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html

The failure can still be exposed on mainline with a minor adjustment
to the C testcase quoted in the msg.

I have been looking into this issue again recently and I'm convinced
that the the current stack_tie insn just is not quite adequate.

I have a local patch which fixes the case at hand but I'm
still unclear about a couple of aspects.

There are lots of subtle inter-section dependencies and redundancies
in emit_epilogue, which has grown pretty difficult to understand IMHO.

I can see two tracks to improve things in this area:

- Concentrate on the sp-move problem at hand and get to an agreement
 on the proper correction for it. I'll be happy to send my current
 patch with comments and questions for that.

- Start working on reducing emit_epilogue's complexity, e.g. by first
 extracting portions of it out into separate functions, then see what
 concepts we can identify and materialize to simplify the sequence
 organization as a whole.


I'd welcome this too - I was considering partitioning out the  
save_world () stuff - since it's only used by Darwin AFAICT.


Iain


 I did an extraction attempt as an exercise to better understand the
 various sections and dependencies. It did help me and I'd be happy
 to provide patches gather feedback etc if you believe it might be
 useful to others.

Olivier








Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-07 Thread Olivier Hainque
On Nov 4, 2011, at 2:07 PM, Alan Modra wrote:

> Lots of bugs here.  Most of them in TARGET_SPE_ABI code, but some also
> for other ABIs.

...

Another bug we're running into here is an unwarranted move of the sp
restore prior to register fetches despite an attempt at preventing that
with a stack_tie instruction (VxWorks target).

  http://gcc.gnu.org/ml/gcc/2011-03/msg00123.html

The failure can still be exposed on mainline with a minor adjustment
to the C testcase quoted in the msg.

I have been looking into this issue again recently and I'm convinced
that the the current stack_tie insn just is not quite adequate.

I have a local patch which fixes the case at hand but I'm
still unclear about a couple of aspects.

There are lots of subtle inter-section dependencies and redundancies
in emit_epilogue, which has grown pretty difficult to understand IMHO.

I can see two tracks to improve things in this area:

- Concentrate on the sp-move problem at hand and get to an agreement
  on the proper correction for it. I'll be happy to send my current
  patch with comments and questions for that.

- Start working on reducing emit_epilogue's complexity, e.g. by first
  extracting portions of it out into separate functions, then see what
  concepts we can identify and materialize to simplify the sequence
  organization as a whole.
 
  I did an extraction attempt as an exercise to better understand the
  various sections and dependencies. It did help me and I'd be happy
  to provide patches gather feedback etc if you believe it might be
  useful to others.

Olivier


  
  


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-04 Thread Alan Modra
On Fri, Nov 04, 2011 at 06:19:07PM +, Iain Sandoe wrote:
> I did wonder, when implementing the out-of-line saves for Darwin,
> whether the setting in sysv4.h is a bit optimistic in firing for r31
> when -Os is on.
> 
> whether it saves space, in reality, would presumably depend on
> whether branch islands were inserted to reach the common code.

Yes, but that doesn't happen on powerpc-linux until code size goes
above 33M.  We get a (small) gain otherwise when able to use the
exit versions of the out-of-line restores.

> Bootstrapped this,

Thanks!

-- 
Alan Modra
Australia Development Lab, IBM


Re: [rs6000] Fix PR 50906, eh_frame and other woes

2011-11-04 Thread Iain Sandoe

Hello Alan,

On 4 Nov 2011, at 13:07, Alan Modra wrote:


Lots of bugs here.  Most of them in TARGET_SPE_ABI code, but some also
for other ABIs.

1) Marking an instruction setting up r11 for use by _save64gpr_* as
frame-related results in r11 being set as the cfa for the rest of the
function.  That's bad when r11 gets used for other things later.
Even worse, the offset given was wrong.  This bug was likely the
primary cause for the segfault reported in the PR.

2) RTL emitted to describe the _save64gpr_* used wrong offsets.
Harmless at the moment, I think, but still incorrect.  Ditto for the
epilogue _rest64gpr_* RTL.  eg. _save64gpr_24 rtl said the store for
r24 happened at 0(11), but see e500crtsav64gpr.S.

3) The .cfi_offset values given for the SPE _save64gpr_* reg saves
were wrong.

4) sp_offset was not set correctly in a number of places where
out-of-line save and restore functions were used, possibly leading to
incorrect code later in the prologue/epilogue.  For instance in this
code:

+ emit_insn (gen_add3_insn (src_reg, frame_reg_rtx,
GEN_INT (sp_offset - info->fp_size)));
- if (REGNO (frame_reg_rtx) == 11)
-   sp_offset += info->fp_size;
+ if (REGNO (frame_reg_rtx) == REGNO (src_reg))
+   sp_offset = info->fp_size;

src_reg can be r12, so testing for r11 as frame_reg_rtx is wrong.  You
want to update sp_offset when frame_reg_rtx changes.  Also, adding
info->fp_size to sp_offset is obviously wrong when you've just added
(sp_offset - info->fp_size) to frame_reg_rtx.  Clearly you want to
subtract (sp_offset - info->fp_size) from sp_offset, to keep the
invariant that frame_reg_rtx + sp_offset points to top of frame.
Really, sp_offset should be renamed to frame_offset.

5) The wrong register mode was used in a number of places for address
calculations.  A nitpick, but reg_mode is for data, Pmode for  
addresses.


6) The epilogue _rest64gpr_* r11 wasn't always set up.

7) Not fixed by this patch, but many instances of rs6000_frame_related
and emit_frame_save pass frame_ptr_rtx for the reg to replace with
sp+offset in the cfi expression, when frame_ptr_rtx is not the reg
being used in the memory access.  That means we're relying on the
generic dwarf code tracking registers for us.  Which does seem to
work.  However, it's confusing to have calls to rs6000_frame_related
that really don't do anything.

Bootstrapped and regression tested powerpc-linux, with standard
BOOT_CFLAGS, with BOOT_CFLAGS="-g -Os" (tends to test just out-of-line
restore), and with BOOT_CFLAGS="-g -Os -mno-multiple" (test
out-of-line save and restore).  The PR reporter is running SPE
bootstrap and regression tests.  Incidentally, I had to use
--disable-werror as we run into some bogus warnings when trying to
bootstrap with -Os.


I did wonder, when implementing the out-of-line saves for Darwin,  
whether the setting in sysv4.h is a bit optimistic in firing for r31  
when -Os is on.


whether it saves space, in reality, would presumably depend on whether  
branch islands were inserted to reach the common code.




Bootstrapped this,  [together with http://gcc.gnu.org/ml/gcc-patches/2011-11/msg00550.html 
]
all+ada+java on powerpc-darwin9.  Tests started (slow machine .. will  
take a while)


Iain


gcc/calls.c: In function 'rtx_def* expand_call(tree, rtx, int)':
gcc/calls.c:1098:52: warning: 'base' may be used uninitialized in  
this function [-Wmaybe-uninitialized]

gcc/calls.c:1079:9: note: 'base' was declared here

OK to apply?  gcc-4.6 branch too?

PR target/50906
* config/rs6000/rs6000.c (rs6000_emit_prologue ):
Do not mark r11 setup as frame-related.  Pass correct offset to
rs6000_emit_savres_rtx.  Correct out-of-line rs6000_frame_related
arguments.  Correct sp_offset.  Remove "offset" fudge from
in-line rs6000_frame_related call.  Rename misleading variable.
Fix comments and whitespace.  Tidy some expressions.
(rs6000_emit_epilogue ): Always set frame_reg_rtx
to r11 in out-of-line case.  Correct sp_offset.  Pass correct
offset to rs6000_emit_savres_rtx.  Rename misleading variable.
Fix comments and whitespace.  Tidy some expressions.
(rs6000_emit_epilogue ): Add sp_offset
adjustment when !saving_GPRs_inline.  Correct register mode
used in address calcs.
(rs6000_emit_epilogue ): Similarly when
!restoring_GPRs_inline.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 180867)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -20089,56 +20115,52 @@ rs6000_emit_prologue (void)
{
  int i;
  rtx spe_save_area_ptr;
-
+  int save_ptr_to_sp;
+  int ool_adjust = 0;
+
  /* Determine whether we can address all of the registers that  
need

-to be saved with an offset from the stack pointer that fits in
+  

[rs6000] Fix PR 50906, eh_frame and other woes

2011-11-04 Thread Alan Modra
Lots of bugs here.  Most of them in TARGET_SPE_ABI code, but some also
for other ABIs.

1) Marking an instruction setting up r11 for use by _save64gpr_* as
frame-related results in r11 being set as the cfa for the rest of the
function.  That's bad when r11 gets used for other things later.
Even worse, the offset given was wrong.  This bug was likely the
primary cause for the segfault reported in the PR.

2) RTL emitted to describe the _save64gpr_* used wrong offsets.
Harmless at the moment, I think, but still incorrect.  Ditto for the
epilogue _rest64gpr_* RTL.  eg. _save64gpr_24 rtl said the store for
r24 happened at 0(11), but see e500crtsav64gpr.S.

3) The .cfi_offset values given for the SPE _save64gpr_* reg saves
were wrong.

4) sp_offset was not set correctly in a number of places where
out-of-line save and restore functions were used, possibly leading to
incorrect code later in the prologue/epilogue.  For instance in this
code:

+ emit_insn (gen_add3_insn (src_reg, frame_reg_rtx,
GEN_INT (sp_offset - info->fp_size)));
- if (REGNO (frame_reg_rtx) == 11)
-   sp_offset += info->fp_size;
+ if (REGNO (frame_reg_rtx) == REGNO (src_reg))
+   sp_offset = info->fp_size;

src_reg can be r12, so testing for r11 as frame_reg_rtx is wrong.  You
want to update sp_offset when frame_reg_rtx changes.  Also, adding
info->fp_size to sp_offset is obviously wrong when you've just added
(sp_offset - info->fp_size) to frame_reg_rtx.  Clearly you want to
subtract (sp_offset - info->fp_size) from sp_offset, to keep the
invariant that frame_reg_rtx + sp_offset points to top of frame.
Really, sp_offset should be renamed to frame_offset.

5) The wrong register mode was used in a number of places for address
calculations.  A nitpick, but reg_mode is for data, Pmode for addresses.

6) The epilogue _rest64gpr_* r11 wasn't always set up.

7) Not fixed by this patch, but many instances of rs6000_frame_related
and emit_frame_save pass frame_ptr_rtx for the reg to replace with
sp+offset in the cfi expression, when frame_ptr_rtx is not the reg
being used in the memory access.  That means we're relying on the
generic dwarf code tracking registers for us.  Which does seem to
work.  However, it's confusing to have calls to rs6000_frame_related
that really don't do anything.

Bootstrapped and regression tested powerpc-linux, with standard
BOOT_CFLAGS, with BOOT_CFLAGS="-g -Os" (tends to test just out-of-line
restore), and with BOOT_CFLAGS="-g -Os -mno-multiple" (test
out-of-line save and restore).  The PR reporter is running SPE
bootstrap and regression tests.  Incidentally, I had to use
--disable-werror as we run into some bogus warnings when trying to
bootstrap with -Os.

gcc/calls.c: In function 'rtx_def* expand_call(tree, rtx, int)':
gcc/calls.c:1098:52: warning: 'base' may be used uninitialized in this function 
[-Wmaybe-uninitialized]
gcc/calls.c:1079:9: note: 'base' was declared here

OK to apply?  gcc-4.6 branch too?

PR target/50906
* config/rs6000/rs6000.c (rs6000_emit_prologue ):
Do not mark r11 setup as frame-related.  Pass correct offset to
rs6000_emit_savres_rtx.  Correct out-of-line rs6000_frame_related
arguments.  Correct sp_offset.  Remove "offset" fudge from
in-line rs6000_frame_related call.  Rename misleading variable.
Fix comments and whitespace.  Tidy some expressions.
(rs6000_emit_epilogue ): Always set frame_reg_rtx
to r11 in out-of-line case.  Correct sp_offset.  Pass correct
offset to rs6000_emit_savres_rtx.  Rename misleading variable.
Fix comments and whitespace.  Tidy some expressions.  
(rs6000_emit_epilogue ): Add sp_offset
adjustment when !saving_GPRs_inline.  Correct register mode
used in address calcs.
(rs6000_emit_epilogue ): Similarly when
!restoring_GPRs_inline.

Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 180867)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -20089,56 +20115,52 @@ rs6000_emit_prologue (void)
 {
   int i;
   rtx spe_save_area_ptr;
- 
+  int save_ptr_to_sp;
+  int ool_adjust = 0;
+
   /* Determine whether we can address all of the registers that need
-to be saved with an offset from the stack pointer that fits in
+to be saved with an offset from frame_reg_rtx that fits in
 the small const field for SPE memory instructions.  */
-  int spe_regs_addressable_via_sp
-   = (SPE_CONST_OFFSET_OK(info->spe_gp_save_offset + sp_offset
-  + (32 - info->first_gp_reg_save - 1) * reg_size)
+  int spe_regs_addressable
+   = (SPE_CONST_OFFSET_OK (info->spe_gp_save_offset + sp_offset
+   + reg_size * (32 - info->first_gp_reg_save - 1))
   && saving_GPRs_inline);
   int spe