On Wed, Jan 06, 2010 at 10:15:58AM +0000, Andrew Haley wrote:
> On 01/06/2010 09:59 AM, Mark Colby wrote:
> >>>> Yabbut, how come RTL cse can handle it in x86_64, but PPC not?
> >>>
> >>> Probably because the RTL on x86_64 uses and's and ior's, but PPC uses
> >>> set's of zero_extract's (insvsi).
> >>
> >> Aha!  Yes, that'll probably be it.  It should be easy to fix cse to
> >> recognize those too.
> 
> > I'm not familiar with the gcc source yet, but just in case I get the
> > time to look at this, could anyone give me a file/line ref to dive
> > into and examine?
> 
> Would you believe cse.c?  :-)
> 
> I can't find the line without investigating further.
> 
> Andrew.
> 
> P.S.  This is a nontrivial task if you don't know gcc, but might be a
> good place for a beginner to start.  OTOH, might be hard: no way to
> know without digging.

I've dug in a little bit, and the patch below optimizes the testcase on 32-bit
PowerPC.  The patch is completely untested though.

On 64-bit PowerPC, which apparently doesn't use ZERO_EXTRACT in this case, I
see a different issue.  It generates
        li 3,0
        ori 3,3,32820
        sldi 3,3,16
while IMHO 2 insns to load the constant would be completely sufficient;
apparently rs6000_emit_set_long_const needs work.
        lis 3,0x8034
        extsw 3,3
or
        li 3,0x401a
        sldi 3,3,17
etc. do IMHO the same.

2010-01-06  Jakub Jelinek  <ja...@redhat.com>

        * cse.c (cse_insn): Optimize lhs ZERO_EXTRACT if only CONST_INTs are
        involved.

--- gcc/cse.c.jj        2009-11-25 16:47:36.000000000 +0100
+++ gcc/cse.c   2010-01-06 16:00:41.000000000 +0100
@@ -4436,6 +4436,7 @@ cse_insn (rtx insn)
 
   for (i = 0; i < n_sets; i++)
     {
+      bool repeat = false;
       rtx src, dest;
       rtx src_folded;
       struct table_elt *elt = 0, *p;
@@ -5029,6 +5030,72 @@ cse_insn (rtx insn)
                break;
            }
 
+         /* Try to optimize
+            (set (reg:M N) (const_int A))
+            (set (reg:M2 O) (const_int B))
+            (set (zero_extract:M2 (reg:M N) (const_int C) (const_int D))
+                 (reg:M2 O)).  */
+         if (GET_CODE (SET_DEST (sets[i].rtl)) == ZERO_EXTRACT
+             && CONST_INT_P (trial)
+             && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 1))
+             && CONST_INT_P (XEXP (SET_DEST (sets[i].rtl), 2))
+             && REG_P (XEXP (SET_DEST (sets[i].rtl), 0))
+             && (GET_MODE_BITSIZE (GET_MODE (SET_DEST (sets[i].rtl)))
+                 >= INTVAL (XEXP (SET_DEST (sets[i].rtl), 1)))
+             && ((unsigned) INTVAL (XEXP (SET_DEST (sets[i].rtl), 1))
+                 + (unsigned) INTVAL (XEXP (SET_DEST (sets[i].rtl), 2))
+                 <= HOST_BITS_PER_WIDE_INT))
+           {
+             rtx dest_reg = XEXP (SET_DEST (sets[i].rtl), 0);
+             rtx width = XEXP (SET_DEST (sets[i].rtl), 1);
+             rtx pos = XEXP (SET_DEST (sets[i].rtl), 2);
+             unsigned int dest_hash = HASH (dest_reg, GET_MODE (dest_reg));
+             struct table_elt *dest_elt
+               = lookup (dest_reg, dest_hash, GET_MODE (dest_reg));
+             rtx dest_cst = NULL;
+
+             if (dest_elt)
+               for (p = dest_elt->first_same_value; p; p = p->next_same_value)
+                 if (p->is_const && CONST_INT_P (p->exp))
+                   {
+                     dest_cst = p->exp;
+                     break;
+                   }
+             if (dest_cst)
+               {
+                 HOST_WIDE_INT val = INTVAL (dest_cst);
+                 HOST_WIDE_INT mask;
+                 unsigned int shift;
+                 if (BITS_BIG_ENDIAN)
+                   shift = GET_MODE_BITSIZE (GET_MODE (dest_reg))
+                           - INTVAL (pos) - INTVAL (width);
+                 else
+                   shift = INTVAL (pos);
+                 if (INTVAL (width) == HOST_BITS_PER_WIDE_INT)
+                   mask = ~(HOST_WIDE_INT) 0;
+                 else
+                   mask = ((HOST_WIDE_INT) 1 << INTVAL (width)) - 1;
+                 val &= ~(mask << shift);
+                 val |= (INTVAL (trial) & mask) << shift;
+                 val = trunc_int_for_mode (val, GET_MODE (dest_reg));
+                 validate_unshare_change (insn, &SET_DEST (sets[i].rtl),
+                                          dest_reg, 1);
+                 validate_unshare_change (insn, &SET_SRC (sets[i].rtl),
+                                          GEN_INT (val), 1);
+                 if (apply_change_group ())
+                   {
+                     rtx note = find_reg_note (insn, REG_EQUAL, NULL_RTX);
+                     if (note)
+                       {
+                         remove_note (insn, note);
+                         df_notes_rescan (insn);
+                       }
+                     repeat = true;
+                     break;
+                   }
+               }
+           }
+
          /* We don't normally have an insn matching (set (pc) (pc)), so
             check for this separately here.  We will delete such an
             insn below.
@@ -5104,6 +5171,13 @@ cse_insn (rtx insn)
            }
        }
 
+      /* If we changed the insn too much, handle this set from scratch.  */
+      if (repeat)
+       {
+         i--;
+         continue;
+       }
+
       src = SET_SRC (sets[i].rtl);
 
       /* In general, it is good to have a SET with SET_SRC == SET_DEST.


        Jakub