Re: [Patch AArch64 2/2] Add missing vcond by rewriting it with vcond_mask/vec_cmp patterns.

2016-06-21  James Greenhalgh
On Wed, Jun 15, 2016 at 09:22:20AM +, Bin Cheng wrote:
> +  rtx mask = gen_reg_rtx (mode);
> +  enum rtx_code code = GET_CODE (operands[3]);
> +
> +  emit_insn (gen_vec_cmp_internal (mask, operands[3],
> +                                   operands[4], operands[5]));
> +  /* See comments of vec_cmp_internal, the opposite
> +     result masks are computed for below operators, we need to invert
> +     the mask here.  In this case we can save an inverting instruction
> +     by simply swapping the two operands to bsl.  */
> +  if (code == NE || code == UNEQ || code == UNLT || code == UNLE
> +      || code == UNGT || code == UNGE || code == UNORDERED)
> +    std::swap (operands[1], operands[2]);

With regard to my comments on patch 1/2: would you not get the same code
generation if you changed those functions to always generate the correct mask,
and then added a second invert here before swapping the operands?  The two mask
inverts should simplify to nothing, and you would clean up the design.
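
For concreteness, a rough sketch of the suggested shape (illustrative only, not
code from either patch; iterator suffixes are omitted and the explicit second
invert is written out by hand):

  rtx mask = gen_reg_rtx (mode);   /* "mode" here stands for the mask mode.  */
  enum rtx_code code = GET_CODE (operands[3]);

  /* Assume vec_cmp now always computes the non-inverted mask, emitting its
     own NOT internally for NE/UNEQ/UNLT/UNLE/UNGT/UNGE/UNORDERED.  */
  emit_insn (gen_vec_cmp_internal (mask, operands[3],
                                   operands[4], operands[5]));

  if (code == NE || code == UNEQ || code == UNLT || code == UNLE
      || code == UNGT || code == UNGE || code == UNORDERED)
    {
      /* Invert the mask a second time and swap the bsl data operands; the
         two back-to-back NOTs should simplify away, so the final code-gen
         matches the swap-only version quoted above.  */
      emit_insn (gen_rtx_SET (mask, gen_rtx_NOT (GET_MODE (mask), mask)));
      std::swap (operands[1], operands[2]);
    }

Whether the double invert really folds would of course need checking, but that
is the idea.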

Thanks,
James



[Patch AArch64 2/2] Add missing vcond by rewriting it with vcond_mask/vec_cmp patterns.

2016-06-15  Bin Cheng
Hi,
This is the second patch.  It rewrites the vcond patterns using the vcond_mask/vec_cmp
patterns introduced in the first patch.  It also implements vcond patterns that were
missing from the current AArch64 backend.  After this patch, I have a simple
follow-up change that enables the testing requirement "vect_cond_mixed" on AArch64,
which will enable various tests.
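
Conceptually, each vcond expander now just emits a vector comparison that
materialises an all-ones/all-zeros mask, followed by a per-lane bit select on
that mask.  As a rough illustration only (iterator suffixes omitted; mask_mode
stands for the matching integer vector mode):

  /* op0 = (op4 <cmp> op5) ? op1 : op2  */
  rtx mask = gen_reg_rtx (mask_mode);
  emit_insn (gen_vec_cmp (mask, operands[3], operands[4], operands[5]));
  emit_insn (gen_vcond_mask (operands[0], operands[1], operands[2], mask));
  DONE;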

Bootstrapped and tested along with the first patch on AArch64.  Is it OK?

Thanks,
bin

2016-06-07  Alan Lawrence
            Renlin Li
            Bin Cheng

        * config/aarch64/iterators.md (V_cmp_mixed, v_cmp_mixed): New.
        * config/aarch64/aarch64-simd.md (v2di3): Call
        gen_vcondv2div2di instead of gen_aarch64_vcond_internalv2div2di.
        (aarch64_vcond_internal): Delete pattern.
        (aarch64_vcond_internal): Ditto.
        (vcond): Re-implement using vec_cmp and vcond_mask.
        (vcondu): Ditto.
        (vcond): Delete.
        (vcond): New pattern.
        (vcondu): New pattern.
        (aarch64_cmtst): Revise comment using aarch64_vcond instead
        of aarch64_vcond_internal.

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 6ea35bf..e080b71 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1053,7 +1053,7 @@
 }
 
   cmp_fmt = gen_rtx_fmt_ee (cmp_operator, V2DImode, operands[1], operands[2]);
-  emit_insn (gen_aarch64_vcond_internalv2div2di (operands[0], operands[1],
+  emit_insn (gen_vcondv2div2di (operands[0], operands[1],
   operands[2], cmp_fmt, operands[1], operands[2]));
   DONE;
 })
@@ -2202,314 +2202,6 @@
   DONE;
 })
 
-(define_expand "aarch64_vcond_internal"
-  [(set (match_operand:VSDQ_I_DI 0 "register_operand")
-   (if_then_else:VSDQ_I_DI
- (match_operator 3 "comparison_operator"
-   [(match_operand:VSDQ_I_DI 4 "register_operand")
-(match_operand:VSDQ_I_DI 5 "nonmemory_operand")])
- (match_operand:VSDQ_I_DI 1 "nonmemory_operand")
- (match_operand:VSDQ_I_DI 2 "nonmemory_operand")))]
-  "TARGET_SIMD"
-{
-  rtx op1 = operands[1];
-  rtx op2 = operands[2];
-  rtx mask = gen_reg_rtx (mode);
-  enum rtx_code code = GET_CODE (operands[3]);
-
-  /* Switching OP1 and OP2 is necessary for NE (to output a cmeq insn),
- and desirable for other comparisons if it results in FOO ? -1 : 0
- (this allows direct use of the comparison result without a bsl).  */
-  if (code == NE
-  || (code != EQ
- && op1 == CONST0_RTX (mode)
- && op2 == CONSTM1_RTX (mode)))
-{
-  op1 = operands[2];
-  op2 = operands[1];
-  switch (code)
-{
-case LE: code = GT; break;
-case LT: code = GE; break;
-case GE: code = LT; break;
-case GT: code = LE; break;
-/* No case EQ.  */
-case NE: code = EQ; break;
-case LTU: code = GEU; break;
-case LEU: code = GTU; break;
-case GTU: code = LEU; break;
-case GEU: code = LTU; break;
-default: gcc_unreachable ();
-}
-}
-
-  /* Make sure we can handle the last operand.  */
-  switch (code)
-{
-case NE:
-  /* Normalized to EQ above.  */
-  gcc_unreachable ();
-
-case LE:
-case LT:
-case GE:
-case GT:
-case EQ:
-  /* These instructions have a form taking an immediate zero.  */
-  if (operands[5] == CONST0_RTX (mode))
-break;
-  /* Fall through, as may need to load into register.  */
-default:
-  if (!REG_P (operands[5]))
-operands[5] = force_reg (mode, operands[5]);
-  break;
-}
-
-  switch (code)
-{
-case LT:
-  emit_insn (gen_aarch64_cmlt (mask, operands[4], operands[5]));
-  break;
-
-case GE:
-  emit_insn (gen_aarch64_cmge (mask, operands[4], operands[5]));
-  break;
-
-case LE:
-  emit_insn (gen_aarch64_cmle (mask, operands[4], operands[5]));
-  break;
-
-case GT:
-  emit_insn (gen_aarch64_cmgt (mask, operands[4], operands[5]));
-  break;
-
-case LTU:
-  emit_insn (gen_aarch64_cmgtu (mask, operands[5], operands[4]));
-  break;
-
-case GEU:
-  emit_insn (gen_aarch64_cmgeu (mask, operands[4], operands[5]));
-  break;
-
-case LEU:
-  emit_insn (gen_aarch64_cmgeu (mask, operands[5], operands[4]));
-  break;
-
-case GTU:
-  emit_insn (gen_aarch64_cmgtu (mask, operands[4], operands[5]));
-  break;
-
-/* NE has been normalized to EQ above.  */
-case EQ:
-  emit_insn (gen_aarch64_cmeq (mask, operands[4], operands[5]));
-  break;
-
-default:
-  gcc_unreachable ();
-}
-
-/* If we have (a = (b CMP c) ? -1 : 0);
-   Then we can simply move the generated mask.  */
-
-if (op1 == CONSTM1_RTX (mode)
-   && op2 == CONST0_RTX (mode))
-  emit_move_insn (operands[0], mask);
-else
-  {
-   if (!REG_P (op1))
- op1 = force_reg (mode, op1);
-   if (!REG_P (op2))
- op2 = force_reg (mode,