------- Additional Comments From andrewhutchinson at cox dot net  2005-03-13 
01:19 -------
Subject: Re:  unable to find a register to spill in class
 `POINTER_REGS'

The concerns have merit but can be discounted:.

The reload problem occurs because the original pattern demands two 
pointers in "parrallel" existence.(Actually two pointers and a 
counter!)  The current register allocation is imperfect and with this 
constraint (and only 3 pointers incl frame)  fails to find a solution.

The new RTL expansion does not demand both pointer registers at the same 
time - indeed they could be the same register in extreme circumstances. 
Breaking up the RTL reveals this to GCC and allows the  register 
allocator to find a solution. So that is why it works.

The SAME result can also be realised by deleting the offending pattern - 
in this situation GCC generates it's own solution which happens to be 
identical RTL to the proposed solution (with a 16 bit counter). And 
indeed there is no reload failure.

Since the proposed pattern "FAILS", the variable case, we will still end 
up with GCC's solution and we can conclude there will be no hidden 
reload issue. (It should also be noted that a variable count is also 
less retrictive on hard register use than a constant).

Now here is the neat bit!. Since GCC middle end generates the detailed 
RTL loop, for a variable count,  we can and should rely on it to 
consider any restriction on the variable (ie variable count=0). If not, 
its very very broken.

I was very tempted to submit a patch that just deletes the pattern, 
however, that would produce worse code for the very common case where 
"fixed" count<255.

I hope this clarifies things.



marekm at amelek dot gda dot pl wrote:

>------- Additional Comments From marekm at amelek dot gda dot pl  2005-03-13 
>00:30 -------
>Subject: Re:  unable to find a register to spill in class `POINTER_REGS'
>
>On Sat, Mar 12, 2005 at 09:20:18PM -0000, andrewhutchinson at cox dot net 
>wrote:
>
>  
>
>>The pattern only helps if it is a constant.  I also thought it should handle
>>variable block size. However,  I found gcc already produces optimal code for
>>that case without any help.
>>    
>>
>
>See below for revised patch (currently for mainline):
> - FAIL if count is not a CONST_INT
> - handle count == 0 (nothing to do)
> - handle count > 32767 (negative in RTL, mask with 0xffff)
> - minor formatting fixes
>
>But, I'm still concerned a little about the variable block size:
> - __tmp_reg__ will not be used (some other register will)
> - more importantly, can the problem from this PR (unable to find a register
>   to spill in class POINTER_REGS) still occur in the variable size case?
>   (only with a different, not yet known test case - this means we are
>   perhaps trying to hide the real bug instead of fixing it...)
>
>If we have to handle the variable count case too, one more insn will
>be needed (initially jump to decrementing the counter; test for carry
>instead of zero).  Some other targets handle this by calling a subroutine
>in libgcc.S - smaller (but slower) than generating the loop inline.
>
>Marek
>
>
>2005-03-12  Andy Hutchinson  <[EMAIL PROTECTED]>
>
>       PR target/18251
>       * config/avr/avr.md (movmemhi): Rewrite as RTL loop.
>       (*movmemqi_insn): Delete.
>       (*movmemhi): Delete.
>
>Index: avr.md
>===================================================================
>RCS file: /cvs/gcc/gcc/gcc/config/avr/avr.md,v
>retrieving revision 1.50
>diff -c -3 -p -r1.50 avr.md
>*** avr.md     6 Mar 2005 21:50:36 -0000       1.50
>--- avr.md     12 Mar 2005 23:51:57 -0000
>***************
>*** 346,421 ****
>  
>  ;;=========================================================================
>  ;; move string (like memcpy)
>  
>  (define_expand "movmemhi"
>    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
>!                 (match_operand:BLK 1 "memory_operand" ""))
>!            (use (match_operand:HI 2 "const_int_operand" ""))
>!            (use (match_operand:HI 3 "const_int_operand" ""))
>!            (clobber (match_scratch:HI 4 ""))
>!            (clobber (match_scratch:HI 5 ""))
>!            (clobber (match_dup 6))])]
>    ""
>    "{
>!   rtx addr0, addr1;
>!   int cnt8;
>    enum machine_mode mode;
>  
>    if (GET_CODE (operands[2]) != CONST_INT)
>      FAIL;
>-   cnt8 = byte_immediate_operand (operands[2], GET_MODE (operands[2]));
>-   mode = cnt8 ? QImode : HImode;
>-   operands[6] = gen_rtx_SCRATCH (mode);
>-   operands[2] = copy_to_mode_reg (mode,
>-                                   gen_int_mode (INTVAL (operands[2]), mode));
>-   addr0 = copy_to_mode_reg (Pmode, XEXP (operands[0], 0));
>-   addr1 = copy_to_mode_reg (Pmode, XEXP (operands[1], 0));
>  
>!   operands[0] = gen_rtx_MEM (BLKmode, addr0);
>!   operands[1] = gen_rtx_MEM (BLKmode, addr1);
>  }")
>  
>- (define_insn "*movmemqi_insn"
>-   [(set (mem:BLK (match_operand:HI 0 "register_operand" "e"))
>-      (mem:BLK (match_operand:HI 1 "register_operand" "e")))
>-    (use (match_operand:QI 2 "register_operand" "r"))
>-    (use (match_operand:QI 3 "const_int_operand" "i"))
>-    (clobber (match_scratch:HI 4 "=0"))
>-    (clobber (match_scratch:HI 5 "=1"))
>-    (clobber (match_scratch:QI 6 "=2"))]
>-   ""
>-   "ld __tmp_reg__,%a1+
>-      st %a0+,__tmp_reg__
>-      dec %2
>-      brne .-8"
>-   [(set_attr "length" "4")
>-    (set_attr "cc" "clobber")])
>- 
>- (define_insn "*movmemhi"
>-   [(set (mem:BLK (match_operand:HI 0 "register_operand" "e,e"))
>-      (mem:BLK (match_operand:HI 1 "register_operand" "e,e")))
>-    (use (match_operand:HI 2 "register_operand" "!w,d"))
>-    (use (match_operand:HI 3 "const_int_operand" ""))
>-    (clobber (match_scratch:HI 4 "=0,0"))
>-    (clobber (match_scratch:HI 5 "=1,1"))
>-    (clobber (match_scratch:HI 6 "=2,2"))]
>-   ""
>-   "*{
>-      if (which_alternative==0)
>-        return (AS2 (ld,__tmp_reg__,%a1+) CR_TAB
>-             AS2 (st,%a0+,__tmp_reg__)  CR_TAB
>-             AS2 (sbiw,%A2,1) CR_TAB
>-             AS1 (brne,.-8));
>-      else
>-        return (AS2 (ld,__tmp_reg__,%a1+) CR_TAB
>-             AS2 (st,%a0+,__tmp_reg__)  CR_TAB
>-             AS2 (subi,%A2,1) CR_TAB
>-             AS2 (sbci,%B2,0) CR_TAB
>-             AS1 (brne,.-10));
>- }"
>-   [(set_attr "length" "4,5")
>-    (set_attr "cc" "clobber,clobber")])
>- 
>  ;; =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0
>  ;; memset (%0, 0, %1)
>  
>--- 346,414 ----
>  
>  ;;=========================================================================
>  ;; move string (like memcpy)
>+ ;; implement as RTL loop
>  
>  (define_expand "movmemhi"
>    [(parallel [(set (match_operand:BLK 0 "memory_operand" "")
>!           (match_operand:BLK 1 "memory_operand" ""))
>!           (use (match_operand:HI 2 "const_int_operand" ""))
>!           (use (match_operand:HI 3 "const_int_operand" ""))])]
>    ""
>    "{
>!   int prob, count;
>    enum machine_mode mode;
>+   rtx label = gen_label_rtx ();
>+   rtx loop_reg;
>+   rtx jump;
>+ 
>+   /* Copy pointers into new psuedos - they will be changed.  */
>+   rtx addr0 = copy_to_mode_reg (Pmode, XEXP (operands[0], 0));
>+   rtx addr1 = copy_to_mode_reg (Pmode, XEXP (operands[1], 0));
>+ 
>+   /* Create rtx for tmp register - we use this as scratch.  */
>+   rtx tmp_reg_rtx  = gen_rtx_REG (QImode, TMP_REGNO);
>  
>    if (GET_CODE (operands[2]) != CONST_INT)
>      FAIL;
>  
>!   count = INTVAL (operands[2]) & 0xffff;
>!   if (count == 0)
>!     DONE;
>! 
>!   /* Work out branch probability for latter use.  */
>!   prob = REG_BR_PROB_BASE - REG_BR_PROB_BASE / count;
>! 
>!   /* See if constant fit 8 bits.  */
>!   mode = (count < 0x100) ? QImode : HImode;
>!   /* Create loop counter register.  */
>!   loop_reg = copy_to_mode_reg (mode, gen_int_mode (count, mode));
>! 
>!   /* Now create RTL code for move loop.  */
>!   /* Label at top of loop.  */
>!   emit_label (label);
>! 
>!   /* Move one byte into scratch and inc pointer.  */
>!   emit_move_insn (tmp_reg_rtx, gen_rtx_MEM (QImode, addr1));
>!   emit_move_insn (addr1, gen_rtx_PLUS (Pmode, addr1, const1_rtx));
>! 
>!   /* Move to mem and inc pointer.  */
>!   emit_move_insn (gen_rtx_MEM (QImode, addr0), tmp_reg_rtx);
>!   emit_move_insn (addr0, gen_rtx_PLUS (Pmode, addr0, const1_rtx));
>! 
>!   /* Decrement count.  */
>!   emit_move_insn (loop_reg, gen_rtx_PLUS (mode, loop_reg, constm1_rtx));
>! 
>!   /* Compare with zero and jump if not equal. */
>!   emit_cmp_and_jump_insns (loop_reg, const0_rtx, NE, NULL_RTX, mode, 1,
>!                            label);
>!   /* Set jump probability based on loop count.  */
>!   jump = get_last_insn ();
>!   REG_NOTES (jump) = gen_rtx_EXPR_LIST (REG_BR_PROB,
>!                     GEN_INT (prob),
>!                     REG_NOTES (jump));
>!   DONE;
>  }")
>  
>  ;; =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0 =0
>  ;; memset (%0, 0, %1)
>  
>
>
>
>  
>




-- 


http://gcc.gnu.org/bugzilla/show_bug.cgi?id=18251

Reply via email to