> -----Original Message----- > From: Richard Sandiford <richard.sandif...@arm.com> > Sent: Tuesday, November 15, 2022 10:51 AM > To: Tamar Christina <tamar.christ...@arm.com> > Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft > <marcus.shawcr...@arm.com> > Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > > Tamar Christina <tamar.christ...@arm.com> writes: > >> -----Original Message----- > >> From: Richard Sandiford <richard.sandif...@arm.com> > >> Sent: Tuesday, November 15, 2022 10:36 AM > >> To: Tamar Christina <tamar.christ...@arm.com> > >> Cc: gcc-patches@gcc.gnu.org; Richard Earnshaw > >> <richard.earns...@arm.com>; nd <n...@arm.com>; Marcus Shawcroft > >> <marcus.shawcr...@arm.com> > >> Subject: Re: [PATCH 2/2]AArch64 Support new tbranch optab. > >> > >> Tamar Christina <tamar.christ...@arm.com> writes: > >> > Hello, > >> > > >> > Ping and updated patch. > >> > > >> > Bootstrapped Regtested on aarch64-none-linux-gnu and no issues. > >> > > >> > Ok for master? > >> > > >> > Thanks, > >> > Tamar > >> > > >> > gcc/ChangeLog: > >> > > >> > * config/aarch64/aarch64.md (*tb<optab><mode>1): Rename to... > >> > (*tb<optab><ALLI:mode><GPI:mode>1): ... this. > >> > (tbranch<mode>4): New. > >> > > >> > gcc/testsuite/ChangeLog: > >> > > >> > * gcc.target/aarch64/tbz_1.c: New test. > >> > > >> > --- inline copy of patch --- > >> > > >> > diff --git a/gcc/config/aarch64/aarch64.md > >> > b/gcc/config/aarch64/aarch64.md index > >> > > >> > 2bc2684b82c35a44e0a2cea6e3aaf32d939f8cdf..d7684c93fba5b717d568e1a4fd > >> 71 > >> > 2bde55c7c72e 100644 > >> > --- a/gcc/config/aarch64/aarch64.md > >> > +++ b/gcc/config/aarch64/aarch64.md > >> > @@ -943,12 +943,29 @@ (define_insn "*cb<optab><mode>1" > >> > (const_int 1)))] > >> > ) > >> > > >> > -(define_insn "*tb<optab><mode>1" > >> > +(define_expand "tbranch<mode>4" > >> > [(set (pc) (if_then_else > >> > - (EQL (zero_extract:DI (match_operand:GPI 0 > >> > "register_operand" > >> "r") > >> > - (const_int 1) > >> > - (match_operand 1 > >> > - "aarch64_simd_shift_imm_<mode>" > >> > "n")) > >> > + (match_operator 0 "aarch64_comparison_operator" > >> > + [(match_operand:ALLI 1 "register_operand") > >> > + (match_operand:ALLI 2 > >> "aarch64_simd_shift_imm_<ALLI:mode>")]) > >> > + (label_ref (match_operand 3 "" "")) > >> > + (pc)))] > >> > + "optimize > 0" > >> > >> Why's the pattern conditional on optimize? Seems a valid choice at -O0 > too. > >> > > > > Hi, > > > > I had explained the reason why in the original patch, just didn't repeat it > > in > the ping: > > > > Instead of emitting the instruction directly I've chosen to expand the > > pattern using a zero extract and generating the existing pattern for > > comparisons for two > > reasons: > > > > 1. Allows for CSE of the actual comparison. > > 2. It looks like the code in expand makes the label as unused and removed > it > > if it doesn't see a separate reference to it. > > > > Because of this expansion though I disable the pattern at -O0 since we > have no combine in that case so we'd end up with worse code. I did try > emitting the pattern directly, but as mentioned in no#2 expand would then > kill the label. > > > > Basically I emit the pattern directly, immediately during expand the label > > is > marked as dead for some weird reason. > > Isn't #2 a bug though? It seems like something we should fix rather than > work around.
Yes it's a bug ☹ ok if I'm going to fix that bug then do I need to split the optabs still? Isn't the problem atm that I need the split? If I'm emitting the instruction directly then the recog pattern for it can just be (eq (vec_extract x 1) 0) which is the correct semantics? Thanks, Tamar > > Thanks, > Richard > > > > > > Tamar. > > > >> I think the split here shows the difficulty with having a single > >> optab and a comparison operator though. operand 0 can be something > like: > >> > >> (eq x 1) > >> > >> but we're not comparing x for equality with 1. We're testing whether > >> bit 1 is zero. This means that operand 0 can't be taken literally > >> and can't be used directly in insn patterns. > >> > >> In an earlier review, I'd said: > >> > >> For the TB instructions (and for other similar instructions that I've > >> seen on other architectures) it would be more useful to have a single-bit > >> test, with operand 4 specifying the bit position. Arguably it might then > >> be better to have separate eq and ne optabs, to avoid the awkward > >> doubling > >> of the operands (operand 1 contains operands 2 and 3). > >> > >> I think we should do that eq/ne split (sorry for not pushing harder > >> for it before). > >> > >> Thanks, > >> Richard > >> > >> > >> > >> > +{ > >> > + rtx bitvalue = gen_reg_rtx (DImode); > >> > + rtx tmp = simplify_gen_subreg (DImode, operands[1], GET_MODE > >> > +(operands[1]), 0); > >> > + emit_insn (gen_extzv (bitvalue, tmp, const1_rtx, operands[2])); > >> > + operands[2] = const0_rtx; > >> > + operands[1] = aarch64_gen_compare_reg (GET_CODE (operands[0]), > >> bitvalue, > >> > + operands[2]); > >> > +}) > >> > + > >> > +(define_insn "*tb<optab><ALLI:mode><GPI:mode>1" > >> > + [(set (pc) (if_then_else > >> > + (EQL (zero_extract:GPI (match_operand:ALLI 0 > "register_operand" > >> "r") > >> > + (const_int 1) > >> > + (match_operand 1 > >> > + > >> > +"aarch64_simd_shift_imm_<ALLI:mode>" "n")) > >> > (const_int 0)) > >> > (label_ref (match_operand 2 "" "")) > >> > (pc))) > >> > @@ -959,15 +976,15 @@ (define_insn "*tb<optab><mode>1" > >> > { > >> > if (get_attr_far_branch (insn) == 1) > >> > return aarch64_gen_far_branch (operands, 2, "Ltb", > >> > - "<inv_tb>\\t%<w>0, %1, "); > >> > + "<inv_tb>\\t%<ALLI:w>0, > >> > + %1, "); > >> > else > >> > { > >> > operands[1] = GEN_INT (HOST_WIDE_INT_1U << UINTVAL > >> (operands[1])); > >> > - return "tst\t%<w>0, %1\;<bcond>\t%l2"; > >> > + return "tst\t%<ALLI:w>0, %1\;<bcond>\t%l2"; > >> > } > >> > } > >> > else > >> > - return "<tbz>\t%<w>0, %1, %l2"; > >> > + return "<tbz>\t%<ALLI:w>0, %1, %l2"; > >> > } > >> > [(set_attr "type" "branch") > >> > (set (attr "length") > >> > diff --git a/gcc/testsuite/gcc.target/aarch64/tbz_1.c > >> > b/gcc/testsuite/gcc.target/aarch64/tbz_1.c > >> > new file mode 100644 > >> > index > >> > > >> > 0000000000000000000000000000000000000000..86f5d3e23cf7f1ea6f3596549c > >> e1 > >> > a0cff6774463 > >> > --- /dev/null > >> > +++ b/gcc/testsuite/gcc.target/aarch64/tbz_1.c > >> > @@ -0,0 +1,95 @@ > >> > +/* { dg-do compile } */ > >> > +/* { dg-additional-options "-O2 -std=c99 -fno-unwind-tables > >> > +-fno-asynchronous-unwind-tables" } */ > >> > +/* { dg-final { check-function-bodies "**" "" "" { target { le } } > >> > +} } */ > >> > + > >> > +#include <stdbool.h> > >> > + > >> > +void h(void); > >> > + > >> > +/* > >> > +** g1: > >> > +** tbnz x[0-9]+, #?0, .L([0-9]+) > >> > +** ret > >> > +** ... > >> > +*/ > >> > +void g1(bool x) > >> > +{ > >> > + if (__builtin_expect (x, 0)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g2: > >> > +** tbz x[0-9]+, #?0, .L([0-9]+) > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g2(bool x) > >> > +{ > >> > + if (__builtin_expect (x, 1)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g3_ge: > >> > +** tbnz w[0-9]+, #?31, .L[0-9]+ > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g3_ge(int x) > >> > +{ > >> > + if (__builtin_expect (x >= 0, 1)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g3_gt: > >> > +** cmp w[0-9]+, 0 > >> > +** ble .L[0-9]+ > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g3_gt(int x) > >> > +{ > >> > + if (__builtin_expect (x > 0, 1)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g3_lt: > >> > +** tbz w[0-9]+, #?31, .L[0-9]+ > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g3_lt(int x) > >> > +{ > >> > + if (__builtin_expect (x < 0, 1)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g3_le: > >> > +** cmp w[0-9]+, 0 > >> > +** bgt .L[0-9]+ > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g3_le(int x) > >> > +{ > >> > + if (__builtin_expect (x <= 0, 1)) > >> > + h (); > >> > +} > >> > + > >> > +/* > >> > +** g5: > >> > +** mov w[0-9]+, 65279 > >> > +** tst w[0-9]+, w[0-9]+ > >> > +** beq .L[0-9]+ > >> > +** b h > >> > +** ... > >> > +*/ > >> > +void g5(int x) > >> > +{ > >> > + if (__builtin_expect (x & 0xfeff, 1)) > >> > + h (); > >> > +}