Hi Jeff,

Many thanks for the pointers. I will make the changes and attach the
patch to the bugzilla soon.

Cheers,
Rahul

-----Original Message-----
From: Jeff Law [mailto:l...@redhat.com] 
Sent: 09 February 2010 00:45
To: Rahul Kharche
Cc: gcc@gcc.gnu.org; sdkteam-gnu
Subject: Re: Failure to combine SHIFT with ZERO_EXTEND

On 02/04/10 08:39, Rahul Kharche wrote:
> Hi All,
>
> On our private port of GCC 4.4.1 we fail to combine successive SHIFT
> operations like in the following case
>
> #include<stdlib.h>
> #include<stdio.h>
>
> void f1 ()
> {
>    unsigned short t1;
>    unsigned short t2;
>
>    t1 = rand();
>    t2 = rand();
>
>    t1<<= 1; t2<<= 1;
>    t1<<= 1; t2<<= 1;
>    t1<<= 1; t2<<= 1;
>    t1<<= 1; t2<<= 1;
>    t1<<= 1; t2<<= 1;
>    t1<<= 1; t2<<= 1;
>
>    printf("%d\n", (t1+t2));
> }
>
> This is a ZERO_EXTEND problem, because combining SHIFTs with whole
> integers works correctly, so do signed values. The problem seems to
> arise in the RTL combiner which combines the ZERO_EXTEND with the
> SHIFT to generate a SHIFT and an AND. Our architecture does not
> support AND with large constants and hence do not have a matching
> insn pattern (we prefer not doing this, because of large constants
> remain hanging at the end of all RTL optimisations and cause needless
> reloads).
>
> Fixing the combiner to convert masking AND operations to ZERO_EXTRACT
> fixes this issue without any obvious regressions. I'm adding the
> patch here against GCC 4.4.1 for any comments and/or suggestions.
>    
Good catch.    However, note we are a regression bugfix only phase of 
development right now in preparation for branching for GCC 4.5.  As a 
result the patch can't be checked in at this time; I would recommend you

update the patch to the current sources & attach it to bug #41998 which 
contains queued patches for after GCC 4.5 branches.


> Cheers,
> Rahul
>
>
> --- combine.c 2009-04-01 21:47:37.000000000 +0100
> +++ combine.c 2010-02-04 15:04:41.000000000 +0000
> @@ -446,6 +446,7 @@
>   static void record_truncated_values (rtx *, void *);
>   static bool reg_truncated_to_mode (enum machine_mode, const_rtx);
>   static rtx gen_lowpart_or_truncate (enum machine_mode, rtx);
> +static bool can_zero_extract_p (rtx, rtx, enum machine_mode);
>
>
>
>   /* It is not safe to use ordinary gen_lowpart in combine.
> @@ -6973,6 +6974,16 @@
>                              make_compound_operation (XEXP (x, 0),
>                                                       next_code),
>                              i, NULL_RTX, 1, 1, 0, 1);
> +      else if (can_zero_extract_p (XEXP (x, 0), XEXP (x, 1), mode))
> +        {
> +       unsigned HOST_WIDE_INT len =  HOST_BITS_PER_WIDE_INT
> +                                     - CLZ_HWI (UINTVAL (XEXP (x,
> 1)));
> +       new_rtx = make_extraction (mode,
> +                             make_compound_operation (XEXP (x, 0),
> +                                                      next_code),
> +                                  0, NULL_RTX, len, 1, 0,
> +                                  in_code == COMPARE);
>    
There should be a comment prior to this code fragment describing the 
transformation being performed.   Something like:

/* Convert (and (shift X Y) MASK) into ... when ... */

That will make it clear in the future when your transformation applies 
rather than forcing someone scanning the code to read it in detail.


> +        }
>
>         break;
>
> @@ -7245,6 +7256,25 @@
>       return simplify_gen_unary (TRUNCATE, mode, x, GET_MODE (x));
>   }
>
> +static bool
> +can_zero_extract_p (rtx x, rtx mask_rtx, enum machine_mode mode)
>    
There should be a comment prior to this function which briefly describes

what the function does, the parameters & return value.  Use comments 
prior to other functions to guide you.
> @@ -8957,7 +8987,6 @@
>       op0 = UNKNOWN;
>
>     *pop0 = op0;
> -
>     /* ??? Slightly redundant with the above mask, but not entirely.
>        Moving this above means we'd have to sign-extend the mode mask
>        for the final test.  */
>    
Useless diff fragment.  Remove this change as it's completely unrelated 
and useless.

You should also write a ChangeLog entry for your patch.  ChangeLogs 
describe what changed, not why something changed.  So a suitable entry 
might look something like:

<date>  Your Name <yourem...@somewhere>

     * combine.c (make_compound_operation): Convert shifts and masks
     into zero_extract in certain cases.
     (can_zero_extract_p): New function.


If you could make those changes and attach the result to PR 41998 they 
should be able to go in once 4.5 branches from the mainline.

Jeff

Reply via email to