> My apologies for taking so long to look at this.

No problem, all the more so that...

> > 2013-09-05  Eric Botcazou  <ebotca...@adacore.com>
> > 
> >     * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
> >     arguments to push onto the stack and no varargs, save ip into a stack
> >     slot if r3 isn't available on entry.
> 
> Sorry, but this is not quite right either, as shown by the attached
> testcase (in C this time, so we can commit it to gcc.target/arm :-)
> 
> The problem is that if we have some alignment padding we end up storing
> ip in one location but restoring it from another.
> 
>       str     ip, [sp, #4]
>       add     ip, sp, #8
>       stmfd   sp!, {fp, ip, lr, pc}
>       sub     fp, ip, #12
>       ldr     ip, [fp, #4]    // <==== Should be fp + 8
>       @ ip needed
>       str     r3, [fp, #8]
>       ldr     ip, [ip]

...ugh, right, the computation of the slot address was supposed to be clever 
but is totally broken instead. :-(  Thanks for spotting it, I don't understand 
how we haven't seen this on VxWorks (although your testcase does not fail on 
VxWorks, since args_to_push is 4 instead of 8 as on arm-eabi).

The computation of the slot address is:

+               {
+                 rtx x = plus_constant (Pmode, stack_pointer_rtx,
+                                        args_to_push - 4);
+                 insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
+                                               stack_pointer_rtx,
+                                               GEN_INT (- args_to_push)));
+                 emit_set_insn (gen_frame_mem (SImode, x), ip_rtx);
+               }

meaning that IP will be saved into the first slot on the stack.  But the 
restore code is:

              /* Recover the static chain register.  */
              if (!arm_r3_live_at_start_p () || saved_pretend_args)
                insn = gen_rtx_REG (SImode, 3);
              else
                {
                  insn = plus_constant (Pmode, hard_frame_pointer_rtx, 4);
                  insn = gen_frame_mem (SImode, insn);
                }
              emit_set_insn (ip_rtx, insn);

meaning that IP is restored from the last slot on the stack, so this can work 
only if args_to_push == 4.

Revised patch attached, your testcase passes on arm-eabi with it.  Does it 
look OK to you?  If so, I'll run a testing cycle on arm-vxworks and arm-eabi.


        * config/arm/arm.c (arm_expand_prologue): In a nested APCS frame with
        arguments to push onto the stack and no varargs, save ip into the last
        stack slot if r3 isn't available on entry.


-- 
Eric Botcazou
Index: config/arm/arm.c
===================================================================
--- config/arm/arm.c	(revision 205579)
+++ config/arm/arm.c	(working copy)
@@ -18549,8 +18549,7 @@ arm_r3_live_at_start_p (void)
   /* Just look at cfg info, which is still close enough to correct at this
      point.  This gives false positives for broken functions that might use
      uninitialized data that happens to be allocated in r3, but who cares?  */
-  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
-			  3);
+  return REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)), 3);
 }
 
 /* Compute the number of bytes used to store the static chain register on the
@@ -20576,11 +20575,13 @@ arm_expand_prologue (void)
 	     whilst the frame is being created.  We try the following
 	     places in order:
 
-	       1. The last argument register r3.
-	       2. A slot on the stack above the frame.  (This only
-	          works if the function is not a varargs function).
+	       1. The last argument register r3 if it is available.
+	       2. A slot on the stack above the frame if there are no
+		  arguments to push onto the stack.
 	       3. Register r3 again, after pushing the argument registers
-	          onto the stack.
+	          onto the stack, if this is a varargs function.
+	       4. The last slot on the stack created for the arguments to
+		  push, if this isn't a varargs function.
 
 	     Note - we only need to tell the dwarf2 backend about the SP
 	     adjustment in the second variant; the static chain register
@@ -20611,21 +20612,24 @@ arm_expand_prologue (void)
 	    {
 	      /* Store the args on the stack.  */
 	      if (cfun->machine->uses_anonymous_args)
-		insn = emit_multi_reg_push
-		  ((0xf0 >> (args_to_push / 4)) & 0xf);
+		{
+		  insn
+		    = emit_multi_reg_push ((0xf0 >> (args_to_push / 4)) & 0xf);
+		  emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
+		  saved_pretend_args = 1;
+		}
 	      else
-		insn = emit_insn
-		  (gen_addsi3 (stack_pointer_rtx, stack_pointer_rtx,
-			       GEN_INT (- args_to_push)));
+		{
+		  insn = emit_insn (gen_addsi3 (stack_pointer_rtx,
+						stack_pointer_rtx,
+						GEN_INT (- args_to_push)));
+		  emit_set_insn (gen_frame_mem (SImode, stack_pointer_rtx),
+				 ip_rtx);
+		}
 
 	      RTX_FRAME_RELATED_P (insn) = 1;
-
-	      saved_pretend_args = 1;
 	      fp_offset = args_to_push;
 	      args_to_push = 0;
-
-	      /* Now reuse r3 to preserve IP.  */
-	      emit_set_insn (gen_rtx_REG (SImode, 3), ip_rtx);
 	    }
 	}
 
@@ -20731,7 +20735,7 @@ arm_expand_prologue (void)
 	      /* Recover the static chain register.  */
 	      if (!arm_r3_live_at_start_p () || saved_pretend_args)
 		insn = gen_rtx_REG (SImode, 3);
-	      else /* if (crtl->args.pretend_args_size == 0) */
+	      else
 		{
 		  insn = plus_constant (Pmode, hard_frame_pointer_rtx, 4);
 		  insn = gen_frame_mem (SImode, insn);

Reply via email to