Hi Jeff, Many thanks for the pointers. I will make the changes and attach the patch to the bugzilla soon.
Cheers, Rahul -----Original Message----- From: Jeff Law [mailto:[email protected]] Sent: 09 February 2010 00:45 To: Rahul Kharche Cc: [email protected]; sdkteam-gnu Subject: Re: Failure to combine SHIFT with ZERO_EXTEND On 02/04/10 08:39, Rahul Kharche wrote: > Hi All, > > On our private port of GCC 4.4.1 we fail to combine successive SHIFT > operations like in the following case > > #include<stdlib.h> > #include<stdio.h> > > void f1 () > { > unsigned short t1; > unsigned short t2; > > t1 = rand(); > t2 = rand(); > > t1<<= 1; t2<<= 1; > t1<<= 1; t2<<= 1; > t1<<= 1; t2<<= 1; > t1<<= 1; t2<<= 1; > t1<<= 1; t2<<= 1; > t1<<= 1; t2<<= 1; > > printf("%d\n", (t1+t2)); > } > > This is a ZERO_EXTEND problem, because combining SHIFTs with whole > integers works correctly, so do signed values. The problem seems to > arise in the RTL combiner which combines the ZERO_EXTEND with the > SHIFT to generate a SHIFT and an AND. Our architecture does not > support AND with large constants and hence do not have a matching > insn pattern (we prefer not doing this, because of large constants > remain hanging at the end of all RTL optimisations and cause needless > reloads). > > Fixing the combiner to convert masking AND operations to ZERO_EXTRACT > fixes this issue without any obvious regressions. I'm adding the > patch here against GCC 4.4.1 for any comments and/or suggestions. > Good catch. However, note we are a regression bugfix only phase of development right now in preparation for branching for GCC 4.5. As a result the patch can't be checked in at this time; I would recommend you update the patch to the current sources & attach it to bug #41998 which contains queued patches for after GCC 4.5 branches. > Cheers, > Rahul > > > --- combine.c 2009-04-01 21:47:37.000000000 +0100 > +++ combine.c 2010-02-04 15:04:41.000000000 +0000 > @@ -446,6 +446,7 @@ > static void record_truncated_values (rtx *, void *); > static bool reg_truncated_to_mode (enum machine_mode, const_rtx); > static rtx gen_lowpart_or_truncate (enum machine_mode, rtx); > +static bool can_zero_extract_p (rtx, rtx, enum machine_mode); > > > > /* It is not safe to use ordinary gen_lowpart in combine. > @@ -6973,6 +6974,16 @@ > make_compound_operation (XEXP (x, 0), > next_code), > i, NULL_RTX, 1, 1, 0, 1); > + else if (can_zero_extract_p (XEXP (x, 0), XEXP (x, 1), mode)) > + { > + unsigned HOST_WIDE_INT len = HOST_BITS_PER_WIDE_INT > + - CLZ_HWI (UINTVAL (XEXP (x, > 1))); > + new_rtx = make_extraction (mode, > + make_compound_operation (XEXP (x, 0), > + next_code), > + 0, NULL_RTX, len, 1, 0, > + in_code == COMPARE); > There should be a comment prior to this code fragment describing the transformation being performed. Something like: /* Convert (and (shift X Y) MASK) into ... when ... */ That will make it clear in the future when your transformation applies rather than forcing someone scanning the code to read it in detail. > + } > > break; > > @@ -7245,6 +7256,25 @@ > return simplify_gen_unary (TRUNCATE, mode, x, GET_MODE (x)); > } > > +static bool > +can_zero_extract_p (rtx x, rtx mask_rtx, enum machine_mode mode) > There should be a comment prior to this function which briefly describes what the function does, the parameters & return value. Use comments prior to other functions to guide you. > @@ -8957,7 +8987,6 @@ > op0 = UNKNOWN; > > *pop0 = op0; > - > /* ??? Slightly redundant with the above mask, but not entirely. > Moving this above means we'd have to sign-extend the mode mask > for the final test. */ > Useless diff fragment. Remove this change as it's completely unrelated and useless. You should also write a ChangeLog entry for your patch. ChangeLogs describe what changed, not why something changed. So a suitable entry might look something like: <date> Your Name <yourem...@somewhere> * combine.c (make_compound_operation): Convert shifts and masks into zero_extract in certain cases. (can_zero_extract_p): New function. If you could make those changes and attach the result to PR 41998 they should be able to go in once 4.5 branches from the mainline. Jeff
