Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
On Wed, Jun 08, 2016 at 03:57:42PM +0100, Bin Cheng wrote:
> > From: James Greenhalgh
> > Sent: 31 May 2016 16:24
> > To: Bin Cheng
> > Cc: gcc-patches@gcc.gnu.org; nd
> > Subject: Re: [PATCH AArch64]Support missing vcond pattern by adding/using
> > vec_cmp/vcond_mask patterns.
> >
> > On Tue, May 17, 2016 at 09:02:22AM +, Bin Cheng wrote:
> > > Hi,
> > > Alan and Renlin noticed that some vcond patterns are not supported in the
> > > AArch64 (or AArch32?) backend, and they both had patches fixing this.
> > > After investigation, I agree with them that vcond/vcondu in AArch64's
> > > backend should be re-implemented using vec_cmp/vcond_mask patterns, so
> > > here comes this patch, which is based on Alan's.  This patch supports all
> > > vcond/vcondu patterns by implementing/using vec_cmp and vcond_mask
> > > patterns.  Unlike the original patch, it doesn't change GCC's expanding
> > > process, and it keeps the vcond patterns.  The patch also introduces
> > > vec_cmp*_internal to support the special-case optimizations for
> > > vcond/vcondu that the current implementation does.  Apart from Alan's
> > > patch, I also learned ideas from Renlin's, and it is my change that
> > > should be blamed if any bug is introduced.
> > >
> > > With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
> > > AArch64 (in a following patch).  Bootstrapped and tested on AArch64.
> > > Is it OK?  BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on
> > > AArch64), which was added earlier by the tree if-conversion patch.
> >
> > Splitting this patch would have been very helpful.  One patch each for the
> > new standard pattern names, and one patch for the refactor of vcond.  As
> > it is, this patch is rather difficult to read.
> Done, patch split into two, with one implementing the new vcond_mask and
> vec_cmp patterns, and another rewriting the vcond patterns.

Hi Bin,

Thanks for the changes.  Just to make keeping track of review easier for me,
would you mind resending these in series as two separate emails, i.e.

  [Patch AArch64 1/2] Implement vcond_mask and vec_cmp patterns
  [Patch AArch64 2/2] Rewrite vcond patterns

Thanks,
James
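For readers outside the thread: "vect_cond_mixed" is the testsuite's
effective-target keyword for loops whose comparison operands have a different
type from the values being selected, which is the vcond shape the AArch64
backend had been missing.  A minimal sketch of such a loop (illustrative
only, not taken from the patch or the testsuite):

#define N 1024

/* The condition is computed on floats, but the selected values are ints:
   vectorizing this needs a vcond pattern whose comparison mode differs
   from its data mode (a "mixed" conditional).  */
void
select_mixed (int *restrict out, const float *restrict a,
              const float *restrict b, const int *restrict x,
              const int *restrict y)
{
  for (int i = 0; i < N; i++)
    out[i] = a[i] < b[i] ? x[i] : y[i];
}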
Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
> From: James Greenhalgh
> Sent: 31 May 2016 16:24
> To: Bin Cheng
> Cc: gcc-patches@gcc.gnu.org; nd
> Subject: Re: [PATCH AArch64]Support missing vcond pattern by adding/using
> vec_cmp/vcond_mask patterns.
>
> On Tue, May 17, 2016 at 09:02:22AM +, Bin Cheng wrote:
> > Hi,
> > Alan and Renlin noticed that some vcond patterns are not supported in the
> > AArch64 (or AArch32?) backend, and they both had patches fixing this.
> > After investigation, I agree with them that vcond/vcondu in AArch64's
> > backend should be re-implemented using vec_cmp/vcond_mask patterns, so
> > here comes this patch, which is based on Alan's.  This patch supports all
> > vcond/vcondu patterns by implementing/using vec_cmp and vcond_mask
> > patterns.  Unlike the original patch, it doesn't change GCC's expanding
> > process, and it keeps the vcond patterns.  The patch also introduces
> > vec_cmp*_internal to support the special-case optimizations for
> > vcond/vcondu that the current implementation does.  Apart from Alan's
> > patch, I also learned ideas from Renlin's, and it is my change that
> > should be blamed if any bug is introduced.
> >
> > With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
> > AArch64 (in a following patch).  Bootstrapped and tested on AArch64.
> > Is it OK?  BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on
> > AArch64), which was added earlier by the tree if-conversion patch.
>
> Splitting this patch would have been very helpful.  One patch each for the
> new standard pattern names, and one patch for the refactor of vcond.  As
> it is, this patch is rather difficult to read.

Done, patch split into two, with one implementing the new vcond_mask and
vec_cmp patterns, and another rewriting the vcond patterns.

> > diff --git a/gcc/config/aarch64/aarch64-simd.md
> > b/gcc/config/aarch64/aarch64-simd.md
> > index bd73bce..f51473a 100644
> > --- a/gcc/config/aarch64/aarch64-simd.md
> > +++ b/gcc/config/aarch64/aarch64-simd.md
> > @@ -1053,7 +1053,7 @@
> >      }
> >
> >    cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1],
> >                              operands[2]);
> > -  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
> > +  emit_insn (gen_vcondv2div2di (operands[0], operands[1],
> >              operands[2], cmp_fmt, operands[1], operands[2]));
> >    DONE;
> > })
> > @@ -2225,204 +2225,215 @@
> >    DONE;
> > })
> >
> > -(define_expand "aarch64_vcond_internal"
> > +(define_expand "vcond_mask_"
> > +  [(match_operand:VALLDI 0 "register_operand")
> > +   (match_operand:VALLDI 1 "nonmemory_operand")
> > +   (match_operand:VALLDI 2 "nonmemory_operand")
> > +   (match_operand: 3 "register_operand")]
> > +  "TARGET_SIMD"
> > +{
> > +  /* If we have (a = (P) ? -1 : 0);
> > +     Then we can simply move the generated mask (result must be int).  */
> > +  if (operands[1] == CONSTM1_RTX (mode)
> > +      && operands[2] == CONST0_RTX (mode))
> > +    emit_move_insn (operands[0], operands[3]);
> > +  /* Similarly, (a = (P) ? 0 : -1) is just inverting the generated mask.  */
> > +  else if (operands[1] == CONST0_RTX (mode)
> > +           && operands[2] == CONSTM1_RTX (mode))
> > +    emit_insn (gen_one_cmpl2 (operands[0], operands[3]));
> > +  else
> > +    {
> > +      if (!REG_P (operands[1]))
> > +        operands[1] = force_reg (mode, operands[1]);
> > +      if (!REG_P (operands[2]))
> > +        operands[2] = force_reg (mode, operands[2]);
> > +      emit_insn (gen_aarch64_simd_bsl (operands[0], operands[3],
> > +                                       operands[1], operands[2]));
> > +    }
> > +
> > +  DONE;
> > +})
> > +
>
> This pattern is fine.
>
> > +;; Patterns comparing two vectors to produce a mask.
>
> This comment is insufficient.  The logic in vec_cmp_internal
> does not always return the expected mask (in particular for NE), but this
> is not made clear in the comment.

Comments added in various places to make the code easier to understand
and maintain.

> > +
> > +(define_expand "vec_cmp_internal"
> >    [(set (match_operand:VSDQ_I_DI 0 "register_operand")
> > -        (if_then_else:VSDQ_I_DI
> > -          (match_operator 3 "comparison_operator"
> > -            [(match_operand:VSDQ_I_DI 4 "register_operand")
> > -             (match_operand:VSDQ
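To make the three cases in the quoted vcond_mask expander concrete, here is a
scalar, per-lane model in C (an illustrative sketch, not code from the
patch): when the data operands are -1 and 0 the comparison mask itself is the
result, when they are 0 and -1 its inverse is, and otherwise a bitwise select
is needed, which is what the AArch64 BSL instruction provides.

#include <stdint.h>

/* Per-lane model of the vcond_mask expansion strategy: MASK is all-ones
   where the comparison held and all-zeros where it did not.  */
uint32_t
vcond_mask_lane (uint32_t mask, uint32_t then_val, uint32_t else_val)
{
  if (then_val == UINT32_MAX && else_val == 0)
    return mask;                /* (P) ? -1 : 0  ->  the mask itself.  */
  if (then_val == 0 && else_val == UINT32_MAX)
    return ~mask;               /* (P) ? 0 : -1  ->  the inverted mask.  */
  /* General case: bitwise select, as the BSL instruction does.  */
  return (mask & then_val) | (~mask & else_val);
}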
Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
On Tue, May 17, 2016 at 09:02:22AM +, Bin Cheng wrote:
> Hi,
> Alan and Renlin noticed that some vcond patterns are not supported in the
> AArch64 (or AArch32?) backend, and they both had patches fixing this.
> After investigation, I agree with them that vcond/vcondu in AArch64's
> backend should be re-implemented using vec_cmp/vcond_mask patterns, so
> here comes this patch, which is based on Alan's.  This patch supports all
> vcond/vcondu patterns by implementing/using vec_cmp and vcond_mask
> patterns.  Unlike the original patch, it doesn't change GCC's expanding
> process, and it keeps the vcond patterns.  The patch also introduces
> vec_cmp*_internal to support the special-case optimizations for
> vcond/vcondu that the current implementation does.  Apart from Alan's
> patch, I also learned ideas from Renlin's, and it is my change that
> should be blamed if any bug is introduced.
>
> With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
> AArch64 (in a following patch).  Bootstrapped and tested on AArch64.
> Is it OK?  BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on
> AArch64), which was added earlier by the tree if-conversion patch.

Splitting this patch would have been very helpful.  One patch each for the
new standard pattern names, and one patch for the refactor of vcond.  As
it is, this patch is rather difficult to read.

> diff --git a/gcc/config/aarch64/aarch64-simd.md
> b/gcc/config/aarch64/aarch64-simd.md
> index bd73bce..f51473a 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -1053,7 +1053,7 @@
>      }
>
>    cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1],
>                              operands[2]);
> -  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
> +  emit_insn (gen_vcondv2div2di (operands[0], operands[1],
>              operands[2], cmp_fmt, operands[1], operands[2]));
>    DONE;
> })
> @@ -2225,204 +2225,215 @@
>    DONE;
> })
>
> -(define_expand "aarch64_vcond_internal"
> +(define_expand "vcond_mask_"
> +  [(match_operand:VALLDI 0 "register_operand")
> +   (match_operand:VALLDI 1 "nonmemory_operand")
> +   (match_operand:VALLDI 2 "nonmemory_operand")
> +   (match_operand: 3 "register_operand")]
> +  "TARGET_SIMD"
> +{
> +  /* If we have (a = (P) ? -1 : 0);
> +     Then we can simply move the generated mask (result must be int).  */
> +  if (operands[1] == CONSTM1_RTX (mode)
> +      && operands[2] == CONST0_RTX (mode))
> +    emit_move_insn (operands[0], operands[3]);
> +  /* Similarly, (a = (P) ? 0 : -1) is just inverting the generated mask.  */
> +  else if (operands[1] == CONST0_RTX (mode)
> +           && operands[2] == CONSTM1_RTX (mode))
> +    emit_insn (gen_one_cmpl2 (operands[0], operands[3]));
> +  else
> +    {
> +      if (!REG_P (operands[1]))
> +        operands[1] = force_reg (mode, operands[1]);
> +      if (!REG_P (operands[2]))
> +        operands[2] = force_reg (mode, operands[2]);
> +      emit_insn (gen_aarch64_simd_bsl (operands[0], operands[3],
> +                                       operands[1], operands[2]));
> +    }
> +
> +  DONE;
> +})
> +

This pattern is fine.

> +;; Patterns comparing two vectors to produce a mask.

This comment is insufficient.  The logic in vec_cmp_internal
does not always return the expected mask (in particular for NE), but this
is not made clear in the comment.

> +
> +(define_expand "vec_cmp_internal"
>    [(set (match_operand:VSDQ_I_DI 0 "register_operand")
> -        (if_then_else:VSDQ_I_DI
> -          (match_operator 3 "comparison_operator"
> -            [(match_operand:VSDQ_I_DI 4 "register_operand")
> -             (match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
> -          (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
> -          (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
> +        (match_operator 1 "comparison_operator"
> +          [(match_operand:VSDQ_I_DI 2 "register_operand")
> +           (match_operand:VSDQ_I_DI 3 "nonmemory_operand")]))]
>    "TARGET_SIMD"
>
> +(define_expand "vec_cmp"
> +  [(set (match_operand:VSDQ_I_DI 0 "register_operand")
> +        (match_operator 1 "comparison_operator"
> +          [(match_operand:VSDQ_I_DI 2 "register_operand")
> +           (match_operand:VSDQ_I_DI 3 "nonmemory_operand")]))]
> +  "TARGET_SIMD"
> +{
> +  enum rtx_code code = GET_CODE (operands[1]);
> +
> +  emit_insn (gen_vec_cmp_internal (operands[0],
> +                                   operands[1], operands[2],
> +                                   operands[3]));
> +  /* See comments in vec_cmp_internal, below
> +     operators are computed using other operators, and we need to
> +     revert the result.  */

s/revert/invert

There are no comments in vec_cmp_internal for this to refer to.  But,
there should be - more comments would definitely help this patch!

These comments apply equally to the other copies of this comment in the
patch.

> +  if (code == NE)
> +    emit_insn (gen_one_cmpl2 (operands[0], operands[0]));
>    DONE;
> })
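The review point about NE is easiest to see per lane: AdvSIMD provides a
compare-equal instruction (CMEQ) but no compare-not-equal, so the expander
builds the EQ mask and then takes its one's complement, which is what the
gen_one_cmpl call above does for code == NE.  A scalar sketch (illustrative
only, not code from the patch):

#include <stdint.h>

/* One lane of an integer vector compare: a != comparison is expanded as
   the one's complement of the == mask.  */
uint32_t
lane_cmp_ne (uint32_t a, uint32_t b)
{
  uint32_t eq_mask = (a == b) ? UINT32_MAX : 0;   /* what CMEQ produces.  */
  return ~eq_mask;                                /* invert to get NE.  */
}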
Re: [PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
Ping.

Thanks,
bin

On Tue, May 17, 2016 at 10:02 AM, Bin Cheng wrote:
> Hi,
> Alan and Renlin noticed that some vcond patterns are not supported in the
> AArch64 (or AArch32?) backend, and they both had patches fixing this.
> After investigation, I agree with them that vcond/vcondu in AArch64's
> backend should be re-implemented using vec_cmp/vcond_mask patterns, so
> here comes this patch, which is based on Alan's.  This patch supports all
> vcond/vcondu patterns by implementing/using vec_cmp and vcond_mask
> patterns.  Unlike the original patch, it doesn't change GCC's expanding
> process, and it keeps the vcond patterns.  The patch also introduces
> vec_cmp*_internal to support the special-case optimizations for
> vcond/vcondu that the current implementation does.  Apart from Alan's
> patch, I also learned ideas from Renlin's, and it is my change that
> should be blamed if any bug is introduced.
>
> With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
> AArch64 (in a following patch).  Bootstrapped and tested on AArch64.
> Is it OK?  BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on
> AArch64), which was added earlier by the tree if-conversion patch.
>
> Thanks,
> bin
>
> 2016-05-11  Alan Lawrence
>             Renlin Li
>             Bin Cheng
>
>         * config/aarch64/iterators.md (V_cmp_mixed, v_cmp_mixed): New.
>         * config/aarch64/aarch64-simd.md (v2di3): Call
>         gen_vcondv2div2di instead of gen_aarch64_vcond_internalv2div2di.
>         (aarch64_vcond_internal): Delete pattern.
>         (aarch64_vcond_internal): Ditto.
>         (vcond_mask_): New pattern.
>         (vec_cmp_internal, vec_cmp): New pattern.
>         (vec_cmp_internal): New pattern.
>         (vec_cmp, vec_cmpu): New pattern.
>         (vcond): Re-implement using vec_cmp and vcond_mask.
>         (vcondu): Ditto.
>         (vcond): Delete.
>         (vcond): New pattern.
>         (vcondu): New pattern.
>         (aarch64_cmtst): Revise comment using aarch64_vcond instead
>         of aarch64_vcond_internal.
>
> gcc/testsuite/ChangeLog
> 2016-05-11  Bin Cheng
>
>         * gcc.target/aarch64/vect-vcond.c: New test.
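As background on the standard names being pinged here: vcond and vcondu are
the patterns the vectorizer uses for a vectorized ternary whose condition is
a vector comparison, with vcond performing a signed and vcondu an unsigned
comparison.  A per-lane scalar model (illustrative, not from the patch):

#include <stdint.h>

/* Per-lane model of the two standard names: the same selection is driven
   by a signed comparison for vcond and an unsigned one for vcondu.  */
int32_t
lane_vcond (int32_t a, int32_t b, int32_t x, int32_t y)
{
  return a < b ? x : y;         /* signed compare   -> vcond.  */
}

int32_t
lane_vcondu (uint32_t a, uint32_t b, int32_t x, int32_t y)
{
  return a < b ? x : y;         /* unsigned compare -> vcondu.  */
}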
[PATCH AArch64]Support missing vcond pattern by adding/using vec_cmp/vcond_mask patterns.
Hi,
Alan and Renlin noticed that some vcond patterns are not supported in the
AArch64 (or AArch32?) backend, and they both had patches fixing this.
After investigation, I agree with them that vcond/vcondu in AArch64's
backend should be re-implemented using vec_cmp/vcond_mask patterns, so
here comes this patch, which is based on Alan's.  This patch supports all
vcond/vcondu patterns by implementing/using vec_cmp and vcond_mask
patterns.  Unlike the original patch, it doesn't change GCC's expanding
process, and it keeps the vcond patterns.  The patch also introduces
vec_cmp*_internal to support the special-case optimizations for
vcond/vcondu that the current implementation does.  Apart from Alan's
patch, I also learned ideas from Renlin's, and it is my change that
should be blamed if any bug is introduced.

With this patch, GCC's test condition "vect_cond_mixed" can be enabled on
AArch64 (in a following patch).  Bootstrapped and tested on AArch64.
Is it OK?  BTW, this patch is necessary for gcc.dg/vect/PR56541.c (on
AArch64), which was added earlier by the tree if-conversion patch.

Thanks,
bin

2016-05-11  Alan Lawrence
            Renlin Li
            Bin Cheng

        * config/aarch64/iterators.md (V_cmp_mixed, v_cmp_mixed): New.
        * config/aarch64/aarch64-simd.md (v2di3): Call
        gen_vcondv2div2di instead of gen_aarch64_vcond_internalv2div2di.
        (aarch64_vcond_internal): Delete pattern.
        (aarch64_vcond_internal): Ditto.
        (vcond_mask_): New pattern.
        (vec_cmp_internal, vec_cmp): New pattern.
        (vec_cmp_internal): New pattern.
        (vec_cmp, vec_cmpu): New pattern.
        (vcond): Re-implement using vec_cmp and vcond_mask.
        (vcondu): Ditto.
        (vcond): Delete.
        (vcond): New pattern.
        (vcondu): New pattern.
        (aarch64_cmtst): Revise comment using aarch64_vcond instead
        of aarch64_vcond_internal.

gcc/testsuite/ChangeLog
2016-05-11  Bin Cheng

        * gcc.target/aarch64/vect-vcond.c: New test.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bd73bce..f51473a 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1053,7 +1053,7 @@
     }
 
   cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1],
                             operands[2]);
-  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
+  emit_insn (gen_vcondv2div2di (operands[0], operands[1],
             operands[2], cmp_fmt, operands[1], operands[2]));
   DONE;
 })
@@ -2225,204 +2225,215 @@
   DONE;
 })
 
-(define_expand "aarch64_vcond_internal"
+(define_expand "vcond_mask_"
+  [(match_operand:VALLDI 0 "register_operand")
+   (match_operand:VALLDI 1 "nonmemory_operand")
+   (match_operand:VALLDI 2 "nonmemory_operand")
+   (match_operand: 3 "register_operand")]
+  "TARGET_SIMD"
+{
+  /* If we have (a = (P) ? -1 : 0);
+     Then we can simply move the generated mask (result must be int).  */
+  if (operands[1] == CONSTM1_RTX (mode)
+      && operands[2] == CONST0_RTX (mode))
+    emit_move_insn (operands[0], operands[3]);
+  /* Similarly, (a = (P) ? 0 : -1) is just inverting the generated mask.  */
+  else if (operands[1] == CONST0_RTX (mode)
+           && operands[2] == CONSTM1_RTX (mode))
+    emit_insn (gen_one_cmpl2 (operands[0], operands[3]));
+  else
+    {
+      if (!REG_P (operands[1]))
+        operands[1] = force_reg (mode, operands[1]);
+      if (!REG_P (operands[2]))
+        operands[2] = force_reg (mode, operands[2]);
+      emit_insn (gen_aarch64_simd_bsl (operands[0], operands[3],
+                                       operands[1], operands[2]));
+    }
+
+  DONE;
+})
+
+;; Patterns comparing two vectors to produce a mask.
+
+(define_expand "vec_cmp_internal"
   [(set (match_operand:VSDQ_I_DI 0 "register_operand")
-        (if_then_else:VSDQ_I_DI
-          (match_operator 3 "comparison_operator"
-            [(match_operand:VSDQ_I_DI 4 "register_operand")
-             (match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
-          (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
-          (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
+        (match_operator 1 "comparison_operator"
+          [(match_operand:VSDQ_I_DI 2 "register_operand")
+           (match_operand:VSDQ_I_DI 3 "nonmemory_operand")]))]
   "TARGET_SIMD"
 {
-  rtx op1 = operands[1];
-  rtx op2 = operands[2];
-  rtx mask = gen_reg_rtx (mode);
-  enum rtx_code code = GET_CODE (operands[3]);
+  rtx mask = operands[0];
+  enum rtx_code code = GET_CODE (operands[1]);
 
-  /* Switching OP1 and OP2 is necessary for NE (to output a cmeq insn),
-     and desirable for other comparisons if it results in FOO ? -1 : 0
-     (this allows direct use of the comparison result without a bsl).  */
-  if (code == NE
-      || (code != EQ
-          && op1 == CONST0_RTX (mode)
-          && op2 == CONSTM1_RTX (mode)))
+  if (operands[3] == CONST0_RTX (mode) && (code == NE || code == LE || code == L
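The deleted comment above ("Switching OP1 and OP2 is necessary for NE ...")
and the new check against CONST0_RTX come down to the same hardware fact:
AdvSIMD has dedicated compare-against-zero forms (CMEQ, CMGT, CMGE, CMLE and
CMLT against #0), so a comparison whose second operand is zero can produce
its mask in a single instruction without materialising a zero vector first.
A per-lane sketch of the idea (my reading of the truncated hunk, offered as
an assumption rather than a description of the final code):

#include <stdint.h>

/* Per-lane motivation for special-casing comparisons against zero.  */
uint32_t
lane_cmp_gt_zero (int32_t a)
{
  return a > 0 ? UINT32_MAX : 0;   /* one CMGT Vd.4S, Vn.4S, #0 per vector.  */
}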