Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
On 23 November 2016 at 17:30, Michael Collison wrote:
> Hi Christophe,
>
> This is not a regression per se; the patch causes the test case to generate
> one less instruction overall, but one additional 'and'.  Trunk before the
> patch (-O2):
>
> foo:
>         and     w0, w0, 255
>         lsl     w1, w0, 20
>         orr     w0, w1, w0, lsl 8
>         mov     w1, 65024
>         movk    w1, 0xfe0, lsl 16
>         and     w0, w0, w1
>         ret
>
> The new trunk aarch64 compiler with the patch generates fewer instructions,
> but one additional 'and':
>
> foo:
>         and     w0, w0, 255
>         lsl     w1, w0, 20
>         orr     w0, w1, w0, lsl 8
>         and     x0, x0, 268434944
>         and     x0, x0, -2031617
>         ret

OK, by "regression" I meant that a test used to pass and now fails.  It looks
like you have to update it to take into account the new, better code.

Christophe

> -----Original Message-----
> From: Christophe Lyon [mailto:christophe.l...@linaro.org]
> Sent: Wednesday, November 23, 2016 6:42 AM
> To: James Greenhalgh
> Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd
> Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
>
> Hi Michael,
>
> On 21 November 2016 at 10:52, James Greenhalgh wrote:
>> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
>>> James,
>>>
>>> I incorporated all your suggestions, and successfully bootstrapped
>>> and re-ran the testsuite.
>>>
>>> Okay for trunk?
>>>
>>> 2016-11-18  Michael Collison
>>>
>>> * config/aarch64/aarch64-protos.h
>>> (aarch64_and_split_imm1, aarch64_and_split_imm2)
>>> (aarch64_and_bitmask_imm): New prototypes.
>>> * config/aarch64/aarch64.c (aarch64_and_split_imm1):
>>> New overloaded function to create bit mask covering the
>>> lowest to highest bits set.
>>> (aarch64_and_split_imm2): New overloaded function to create bit
>>> mask of zeros between first and last bit set.
>>> (aarch64_and_bitmask_imm): New function to determine if an integer
>>> is a valid two-instruction "and" operation.
>>> * config/aarch64/aarch64.md (and<mode>3): New define_insn and _split
>>> allowing wider range of constants with "and" operations.
>>> (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
>>> "and" operator from matching restricted constant range used for
>>> ior and xor operators.
>>> * config/aarch64/constraints.md (UsO constraint): New SImode
>>> constraint for constants in "and" operations.
>>> (UsP constraint): New DImode constraint for constants in "and"
>>> operations.
>>> * config/aarch64/iterators.md (lconst2): New mode iterator.
>>> (LOGICAL2): New code iterator.
>>> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
>>> predicate.
>>> (aarch64_logical_and_operand): New predicate allowing extended
>>> constants for "and" operations.
>>> * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>>> additional constants are recognized and fewer instructions generated.
>>> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>>> additional constants are recognized and fewer instructions generated.
>>>
>>
>>> diff --git a/gcc/config/aarch64/aarch64-protos.h
>>> b/gcc/config/aarch64/aarch64-protos.h
>>> index 3cdd69b..7ef8cdf 100644
>>> --- a/gcc/config/aarch64/aarch64-protos.h
>>> +++ b/gcc/config/aarch64/aarch64-protos.h
>>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>>>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>>>  int aarch64_get_condition_code (rtx);
>>>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
>>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>>> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>>> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>>>  int aarch64_branch_cost (bool, bool);
>>>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>>>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
>>> diff --git a/gcc/config/aarch64/aarch
RE: [Aarch64][PATCH] Improve Logical And Immediate Expressions
Hi Christophe,

This is not a regression per se; the patch causes the test case to generate
one less instruction overall, but one additional 'and'.  Trunk before the
patch (-O2):

foo:
        and     w0, w0, 255
        lsl     w1, w0, 20
        orr     w0, w1, w0, lsl 8
        mov     w1, 65024
        movk    w1, 0xfe0, lsl 16
        and     w0, w0, w1
        ret

The new trunk aarch64 compiler with the patch generates fewer instructions,
but one additional 'and':

foo:
        and     w0, w0, 255
        lsl     w1, w0, 20
        orr     w0, w1, w0, lsl 8
        and     x0, x0, 268434944
        and     x0, x0, -2031617
        ret

-----Original Message-----
From: Christophe Lyon [mailto:christophe.l...@linaro.org]
Sent: Wednesday, November 23, 2016 6:42 AM
To: James Greenhalgh
Cc: Michael Collison; gcc-patches@gcc.gnu.org; nd
Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions

Hi Michael,

On 21 November 2016 at 10:52, James Greenhalgh wrote:
> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
>> James,
>>
>> I incorporated all your suggestions, and successfully bootstrapped
>> and re-ran the testsuite.
>>
>> Okay for trunk?
>>
>> 2016-11-18  Michael Collison
>>
>> * config/aarch64/aarch64-protos.h
>> (aarch64_and_split_imm1, aarch64_and_split_imm2)
>> (aarch64_and_bitmask_imm): New prototypes.
>> * config/aarch64/aarch64.c (aarch64_and_split_imm1):
>> New overloaded function to create bit mask covering the
>> lowest to highest bits set.
>> (aarch64_and_split_imm2): New overloaded function to create bit
>> mask of zeros between first and last bit set.
>> (aarch64_and_bitmask_imm): New function to determine if an integer
>> is a valid two-instruction "and" operation.
>> * config/aarch64/aarch64.md (and<mode>3): New define_insn and _split
>> allowing wider range of constants with "and" operations.
>> (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
>> "and" operator from matching restricted constant range used for
>> ior and xor operators.
>> * config/aarch64/constraints.md (UsO constraint): New SImode
>> constraint for constants in "and" operations.
>> (UsP constraint): New DImode constraint for constants in "and"
>> operations.
>> * config/aarch64/iterators.md (lconst2): New mode iterator.
>> (LOGICAL2): New code iterator.
>> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
>> predicate.
>> (aarch64_logical_and_operand): New predicate allowing extended
>> constants for "and" operations.
>> * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>>
>
>> diff --git a/gcc/config/aarch64/aarch64-protos.h
>> b/gcc/config/aarch64/aarch64-protos.h
>> index 3cdd69b..7ef8cdf 100644
>> --- a/gcc/config/aarch64/aarch64-protos.h
>> +++ b/gcc/config/aarch64/aarch64-protos.h
>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>>  int aarch64_get_condition_code (rtx);
>>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>>  int aarch64_branch_cost (bool, bool);
>>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 3e663eb..8e33c40 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>>  }
>>
>> +/* Create mask of ones, covering the lowest to highest bits set in
>> +   VAL_IN.  */
>> +
>> +unsigned HOST_WIDE_INT
>> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
>> +{
>> +  int lowest_bit_set = ctz_hwi (val_in);
>> +  int highest_bit_set = floor_log2 (val_in);
>> +  gcc_assert (val_in != 0);
>
> The comment above the function should make this precondition clear.  Or
> you could pick a well defined behaviour for when val_in == 0 (return
> 0?), if that fits the rest of the algorithm.
>
> Otherwise, this patch looks OK to me.
>
> Thanks,
> James
>

This patch (r242739) causes a regression on aarch64:
gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails.

Christophe.
Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
Hi Michael,

On 21 November 2016 at 10:52, James Greenhalgh wrote:
> On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
>> James,
>>
>> I incorporated all your suggestions, and successfully bootstrapped and
>> re-ran the testsuite.
>>
>> Okay for trunk?
>>
>> 2016-11-18  Michael Collison
>>
>> * config/aarch64/aarch64-protos.h
>> (aarch64_and_split_imm1, aarch64_and_split_imm2)
>> (aarch64_and_bitmask_imm): New prototypes.
>> * config/aarch64/aarch64.c (aarch64_and_split_imm1):
>> New overloaded function to create bit mask covering the
>> lowest to highest bits set.
>> (aarch64_and_split_imm2): New overloaded function to create bit
>> mask of zeros between first and last bit set.
>> (aarch64_and_bitmask_imm): New function to determine if an integer
>> is a valid two-instruction "and" operation.
>> * config/aarch64/aarch64.md (and<mode>3): New define_insn and _split
>> allowing wider range of constants with "and" operations.
>> (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
>> "and" operator from matching restricted constant range used for
>> ior and xor operators.
>> * config/aarch64/constraints.md (UsO constraint): New SImode
>> constraint for constants in "and" operations.
>> (UsP constraint): New DImode constraint for constants in "and"
>> operations.
>> * config/aarch64/iterators.md (lconst2): New mode iterator.
>> (LOGICAL2): New code iterator.
>> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
>> predicate.
>> (aarch64_logical_and_operand): New predicate allowing extended
>> constants for "and" operations.
>> * testsuite/gcc.target/aarch64/and_const.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
>> additional constants are recognized and fewer instructions generated.
>>
>
>> diff --git a/gcc/config/aarch64/aarch64-protos.h
>> b/gcc/config/aarch64/aarch64-protos.h
>> index 3cdd69b..7ef8cdf 100644
>> --- a/gcc/config/aarch64/aarch64-protos.h
>> +++ b/gcc/config/aarch64/aarch64-protos.h
>> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>>  int aarch64_get_condition_code (rtx);
>>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
>> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>>  int aarch64_branch_cost (bool, bool);
>>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 3e663eb..8e33c40 100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>>  }
>>
>> +/* Create mask of ones, covering the lowest to highest bits set in
>> +   VAL_IN.  */
>> +
>> +unsigned HOST_WIDE_INT
>> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
>> +{
>> +  int lowest_bit_set = ctz_hwi (val_in);
>> +  int highest_bit_set = floor_log2 (val_in);
>> +  gcc_assert (val_in != 0);
>
> The comment above the function should make this precondition clear.  Or you
> could pick a well defined behaviour for when val_in == 0 (return 0?), if
> that fits the rest of the algorithm.
>
> Otherwise, this patch looks OK to me.
>
> Thanks,
> James
>

This patch (r242739) causes a regression on aarch64:
gcc.dg/zero_bits_compound-2.c scan-assembler-times \\(and: 2 now fails.

Christophe.
Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
On Fri, Nov 18, 2016 at 07:41:58AM +0000, Michael Collison wrote:
> James,
>
> I incorporated all your suggestions, and successfully bootstrapped and
> re-ran the testsuite.
>
> Okay for trunk?
>
> 2016-11-18  Michael Collison
>
> * config/aarch64/aarch64-protos.h
> (aarch64_and_split_imm1, aarch64_and_split_imm2)
> (aarch64_and_bitmask_imm): New prototypes.
> * config/aarch64/aarch64.c (aarch64_and_split_imm1):
> New overloaded function to create bit mask covering the
> lowest to highest bits set.
> (aarch64_and_split_imm2): New overloaded function to create bit
> mask of zeros between first and last bit set.
> (aarch64_and_bitmask_imm): New function to determine if an integer
> is a valid two-instruction "and" operation.
> * config/aarch64/aarch64.md (and<mode>3): New define_insn and _split
> allowing wider range of constants with "and" operations.
> (ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
> "and" operator from matching restricted constant range used for
> ior and xor operators.
> * config/aarch64/constraints.md (UsO constraint): New SImode
> constraint for constants in "and" operations.
> (UsP constraint): New DImode constraint for constants in "and"
> operations.
> * config/aarch64/iterators.md (lconst2): New mode iterator.
> (LOGICAL2): New code iterator.
> * config/aarch64/predicates.md (aarch64_logical_and_immediate): New
> predicate.
> (aarch64_logical_and_operand): New predicate allowing extended
> constants for "and" operations.
> * testsuite/gcc.target/aarch64/and_const.c: New test to verify
> additional constants are recognized and fewer instructions generated.
> * testsuite/gcc.target/aarch64/and_const2.c: New test to verify
> additional constants are recognized and fewer instructions generated.
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h
> b/gcc/config/aarch64/aarch64-protos.h
> index 3cdd69b..7ef8cdf 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>  int aarch64_branch_cost (bool, bool);
>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3e663eb..8e33c40 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3600,6 +3600,43 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>  }
>
> +/* Create mask of ones, covering the lowest to highest bits set in VAL_IN.  */
> +
> +unsigned HOST_WIDE_INT
> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
> +{
> +  int lowest_bit_set = ctz_hwi (val_in);
> +  int highest_bit_set = floor_log2 (val_in);
> +  gcc_assert (val_in != 0);

The comment above the function should make this precondition clear.  Or you
could pick a well defined behaviour for when val_in == 0 (return 0?), if
that fits the rest of the algorithm.

Otherwise, this patch looks OK to me.

Thanks,
James
RE: [Aarch64][PATCH] Improve Logical And Immediate Expressions
James,

I incorporated all your suggestions, and successfully bootstrapped and
re-ran the testsuite.

Okay for trunk?

2016-11-18  Michael Collison

* config/aarch64/aarch64-protos.h
(aarch64_and_split_imm1, aarch64_and_split_imm2)
(aarch64_and_bitmask_imm): New prototypes.
* config/aarch64/aarch64.c (aarch64_and_split_imm1):
New overloaded function to create bit mask covering the
lowest to highest bits set.
(aarch64_and_split_imm2): New overloaded function to create bit
mask of zeros between first and last bit set.
(aarch64_and_bitmask_imm): New function to determine if an integer
is a valid two-instruction "and" operation.
* config/aarch64/aarch64.md (and<mode>3): New define_insn and _split
allowing wider range of constants with "and" operations.
(ior<mode>3, xor<mode>3): Use new LOGICAL2 iterator to prevent
"and" operator from matching restricted constant range used for
ior and xor operators.
* config/aarch64/constraints.md (UsO constraint): New SImode
constraint for constants in "and" operations.
(UsP constraint): New DImode constraint for constants in "and"
operations.
* config/aarch64/iterators.md (lconst2): New mode iterator.
(LOGICAL2): New code iterator.
* config/aarch64/predicates.md (aarch64_logical_and_immediate): New
predicate.
(aarch64_logical_and_operand): New predicate allowing extended
constants for "and" operations.
* testsuite/gcc.target/aarch64/and_const.c: New test to verify
additional constants are recognized and fewer instructions generated.
* testsuite/gcc.target/aarch64/and_const2.c: New test to verify
additional constants are recognized and fewer instructions generated.

-----Original Message-----
From: James Greenhalgh [mailto:james.greenha...@arm.com]
Sent: Thursday, November 17, 2016 9:26 AM
To: Michael Collison
Cc: gcc-patches@gcc.gnu.org; nd
Subject: Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions

On Thu, Oct 27, 2016 at 08:44:02PM +0000, Michael Collison wrote:
> This patch is designed to improve code generation for "and"
> instructions with certain immediate operands.
>
> For the following test case:
>
> int f2(int x)
> {
>   x &= 0x0ffffff8;
>
>   x &= 0xff001fff;
>
>   return x;
> }
>
> the trunk aarch64 compiler generates:
>
> mov     w1, 8184
> movk    w1, 0xf00, lsl 16
> and     w0, w0, w1
>
> We can generate one fewer instruction if we recognize certain
> constants.  With the attached patch the current trunk compiler generates:
>
> and     w0, w0, 268435448
> and     w0, w0, -16769025
>
> Bootstrapped and make check successfully completed with no regressions
> on aarch64-linux-gnu.
>
> Okay for trunk?

Hi Michael,

I have some minor comments in line.

> diff --git a/gcc/config/aarch64/aarch64-protos.h
> b/gcc/config/aarch64/aarch64-protos.h
> index 3cdd69b..7ef8cdf 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>  int aarch64_branch_cost (bool, bool);
>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3e663eb..db82c5c 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3600,6 +3600,46 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>  }
>
> +/* Create mask of ones, covering the lowest to highest bits set in
> +   VAL_IN.  */
> +
> +unsigned HOST_WIDE_INT
> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
> +{
> +  int first_bit_set = ctz_hwi (val_in);
> +  int last_bit_set = floor_log2 (val_in);
> +  gcc_assert (first_bit_set < last_bit_set);

This assert can never trigger (by definition) unless VAL_IN == 0 (in which
case floor_log2 = -1, ctz_hwi = HOST_BITS_PER_WORD_SIZE).  If that's the
case you're testing for, then this assert would be more explicit as

  gcc_assert (val_in != 0)

I'd suggest renaming these to lowest_bit_set and highest_bit_set, as first
and last are ambiguous (you have them the opposite way round from what I'd
consider first [highest, thus leading bits] and last [lowest/trailing
bits]).
Re: [Aarch64][PATCH] Improve Logical And Immediate Expressions
On Thu, Oct 27, 2016 at 08:44:02PM +0000, Michael Collison wrote:
> This patch is designed to improve code generation for "and" instructions
> with certain immediate operands.
>
> For the following test case:
>
> int f2(int x)
> {
>   x &= 0x0ffffff8;
>
>   x &= 0xff001fff;
>
>   return x;
> }
>
> the trunk aarch64 compiler generates:
>
> mov     w1, 8184
> movk    w1, 0xf00, lsl 16
> and     w0, w0, w1
>
> We can generate one fewer instruction if we recognize certain constants.
> With the attached patch the current trunk compiler generates:
>
> and     w0, w0, 268435448
> and     w0, w0, -16769025
>
> Bootstrapped and make check successfully completed with no regressions on
> aarch64-linux-gnu.
>
> Okay for trunk?

Hi Michael,

I have some minor comments in line.

> diff --git a/gcc/config/aarch64/aarch64-protos.h
> b/gcc/config/aarch64/aarch64-protos.h
> index 3cdd69b..7ef8cdf 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -300,6 +300,9 @@ extern struct tune_params aarch64_tune_params;
>  HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
> +unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
> +bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode);
>  int aarch64_branch_cost (bool, bool);
>  enum aarch64_symbol_type aarch64_classify_symbolic_expression (rtx);
>  bool aarch64_const_vec_all_same_int_p (rtx, HOST_WIDE_INT);
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 3e663eb..db82c5c 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3600,6 +3600,46 @@ aarch64_bitmask_imm (HOST_WIDE_INT val_in, machine_mode mode)
>    return val == mask * bitmask_imm_mul[__builtin_clz (bits) - 26];
>  }
>
> +/* Create mask of ones, covering the lowest to highest bits set in
> +   VAL_IN.  */
> +
> +unsigned HOST_WIDE_INT
> +aarch64_and_split_imm1 (HOST_WIDE_INT val_in)
> +{
> +  int first_bit_set = ctz_hwi (val_in);
> +  int last_bit_set = floor_log2 (val_in);
> +  gcc_assert (first_bit_set < last_bit_set);

This assert can never trigger (by definition) unless VAL_IN == 0 (in which
case floor_log2 = -1, ctz_hwi = HOST_BITS_PER_WORD_SIZE).  If that's the
case you're testing for, then this assert would be more explicit as

  gcc_assert (val_in != 0)

I'd suggest renaming these to lowest_bit_set and highest_bit_set, as first
and last are ambiguous (you have them the opposite way round from what I'd
consider first [highest, thus leading bits] and last [lowest/trailing
bits]).

> +
> +  return ((HOST_WIDE_INT_UC (2) << last_bit_set) -
> +          (HOST_WIDE_INT_1U << first_bit_set));
> +}
> +
> +/* Create constant with zero bits between first and last bit set and one
> +   bits elsewhere.  */

I can't parse this comment in relation to what the code below does.
Looking at the code, you're building a new constant where the bits between
the first and last bits are unmodified, and all other bits are set to one.

> +unsigned HOST_WIDE_INT
> +aarch64_and_split_imm2 (HOST_WIDE_INT val_in)
> +{
> +  return val_in | ~aarch64_and_split_imm1 (val_in);
> +}
> +
> +/* Return true if VAL_IN is a valid 'and' bitmask immediate.  */
> +
> +bool
> +aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode mode)
> +{
> +  if (aarch64_bitmask_imm (val_in, mode))
> +    return false;
> +
> +  if (aarch64_move_imm (val_in, mode))
> +    return false;
> +
> +  unsigned HOST_WIDE_INT imm2 = aarch64_and_split_imm2 (val_in);
> +
> +  if (!aarch64_bitmask_imm (imm2, mode))
> +    return false;
> +
> +  return true;

You can replace these four lines with:

  return aarch64_bitmask_imm (imm2, mode);

> +}

> /* Return true if val is an immediate that can be loaded into a
>    register in a single instruction.  */

> diff --git a/gcc/testsuite/gcc.target/aarch64/and_const.c
> b/gcc/testsuite/gcc.target/aarch64/and_const.c
> new file mode 100644
> index 000..9c377d7
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/and_const.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +int f2 (int x)
> +{
> +  x &= 0x0ffffff8;
> +
> +  x &= 0xff001fff;
> +
> +  return x;
> +}

It would be good to have a test for the DImode cases too (using long long).

Thanks,
James