On 08 Sep 15:37, Ilya Enkovich wrote:
> 2015-09-04 23:42 GMT+03:00 Jeff Law <l...@redhat.com>:
> >
> > So do we have enough confidence in this representation that we want to go
> > ahead and commit to it?
> 
> I think the new representation fits nicely for the most part. There are
> some places where I have to make exceptions for vectors of bools to
> make it work. This is mostly to avoid target modifications. I'd like to
> avoid the need to change all targets currently supporting vec_cond. It
> makes me add some special handling of vec<bool> in GIMPLE, e.g. I add
> special code in vect_init_vector to build vec<bool> invariants with
> proper casting to int. Otherwise I'd need to do it on the target side.
> 
> I made several fixes, and the current patch (still allowing an integer
> vector result for vector comparisons and applying bool patterns) passes
> bootstrap and regression testing on x86_64. Now I'll try to fully
> switch to vec<bool> and see how it goes.
> 
> Thanks,
> Ilya
> 

Hi,

I made a step forward by forcing vector comparisons to have a mask (vec<bool>)
result and by disabling bool patterns when the vector comparison is supported
by the target.  Several issues came up.

 - The C/C++ front ends generate vector comparisons with an integer vector
result.  I had to make some modifications to use vec_cond instead (see the
sketch after this list).  I don't know whether other front ends produce
vector comparisons.
 - Vector lowering fails to expand vector masks due to a mismatch between
type and mode sizes.  I fixed the vector type size computation to match the
mode size and added special handling for mask expansion.
 - I disabled canonical type creation for vector masks because we can't lay
them out with VOIDmode.  I don't know why we would need a canonical type
here, but the get_mask_mode call may be moved into type layout to get one.
 - Expansion of vec<bool> constants/constructors requires special handling.
The common case should require target hooks/optabs to expand a vector into
the required mode.  But I suppose we want generic code to handle the integer
vector mode case, to avoid modifying the multiple targets which use the
default vec<bool> modes.
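
To illustrate the front-end change, here is a minimal sketch using the GNU
vector extension (function and type names are just for illustration):

  typedef int v4si __attribute__ ((vector_size (16)));

  v4si
  cmp_gt (v4si a, v4si b)
  {
    return a > b;
  }

Previously the comparison itself had the integer vector type v4si.  With the
patch the front end instead builds roughly

  mask = a > b;                                /* vec<bool> comparison  */
  result = VEC_COND_EXPR <mask, { -1, ... }, { 0, ... }>;

which is what the new build_vec_cmp helper produces.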

Currently 'make check' shows two types of regressions.
  - Missed vector expression pattern recognition (MIN, MAX, ABS, VEC_COND).
This must be due to my front-end changes.  I hope it will be easy to fix.
  - Missed vectorization.  All of these appear because bool patterns are
disabled.  I didn't look into all of them, but it seems the main problem is
mixed type sizes (see the example after this list).  With bool patterns and
integer vector masks we just insert an int->(other sized int) conversion for
the masks, and that gives us the required mask transformation.  With boolean
masks we don't have proper scalar statements to do that.  I think mask
widening/narrowing may be supported directly in the vectorization of masked
statements.  I'm going to look into it.
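
To illustrate the mixed size problem, consider a loop like the following
(a hypothetical example, not one of the failing tests):

  void
  foo (int *c, double *a, double *b, int n)
  {
    for (int i = 0; i < n; i++)
      if (c[i])
        a[i] = b[i];
  }

Here the mask is computed by comparing 32-bit elements, but the masked store
operates on 64-bit elements, so the mask has to be widened.  With integer
masks that widening was just a scalar int->int conversion statement to
vectorize; with vec<bool> masks no such scalar statement exists.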

I attach what I currently have as a prototype.  It has grown bigger, so I
split it into several parts.

Thanks,
Ilya
--
* avx512-vec-bool-01-add-truth-vector.ChangeLog

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * doc/tm.texi: Regenerated.
        * doc/tm.texi.in (TARGET_VECTORIZE_GET_MASK_MODE): New.
        * stor-layout.c (layout_type): Use mode to get vector mask size.
        (vector_type_mode): Likewise.
        * target.def (get_mask_mode): New.
        * targhooks.c (default_vector_alignment): Use mode alignment
        for vector masks.
        (default_get_mask_mode): New.
        * targhooks.h (default_get_mask_mode): New.
        * tree.c (make_vector_type): Vector mask has no canonical type.
        (build_truth_vector_type): New.
        (build_same_sized_truth_vector_type): New.
        (truth_type_for): Support vector masks.
        * tree.h (VECTOR_MASK_TYPE_P): New.
        (build_truth_vector_type): New.
        (build_same_sized_truth_vector_type): New.

* avx512-vec-bool-02-no-int-vec-cmp.ChangeLog

gcc/

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * tree-cfg.c (verify_gimple_comparison): Require vector mask
        type for vector comparison.
        (verify_gimple_assign_ternary): Likewise.

gcc/c

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * c-typeck.c (build_conditional_expr): Use vector mask
        type for vector comparison.
        (build_vec_cmp): New.
        (build_binary_op): Use build_vec_cmp for comparison.

gcc/cp

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * call.c (build_conditional_expr_1): Use vector mask
        type for vector comparison.
        * typeck.c (build_vec_cmp): New.
        (cp_build_binary_op): Use build_vec_cmp for comparison.

* avx512-vec-bool-03-vec-lower.ChangeLog

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * tree-vect-generic.c (tree_vec_extract): Use additional
        comparison when extracting boolean value.
        (do_bool_compare): New.
        (expand_vector_comparison): Add casts for vector mask.
        (expand_vector_divmod): Use vector mask type for vector
        comparison.
        (expand_vector_operations_1): Skip scalar mode mask statements.

* avx512-vec-bool-04-vectorize.ChangeLog

gcc/

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * expr.c (do_store_flag): Use expand_vec_cmp_expr for mask results.
        (const_vector_mask_from_tree): New.
        (const_vector_from_tree): Use const_vector_mask_from_tree for vector
        masks.
        * internal-fn.c (expand_MASK_LOAD): Adjust to optab changes.
        (expand_MASK_STORE): Likewise.
        * optabs.c (vector_compare_rtx): Add OPNO arg.
        (expand_vec_cond_expr): Adjust to vector_compare_rtx change.
        (get_vec_cmp_icode): New.
        (expand_vec_cmp_expr_p): New.
        (expand_vec_cmp_expr): New.
        (can_vec_mask_load_store_p): Add MASK_MODE arg.
        * optabs.def (vec_cmp_optab): New.
        (vec_cmpu_optab): New.
        (maskload_optab): Transform into convert optab.
        (maskstore_optab): Likewise.
        * optabs.h (expand_vec_cmp_expr_p): New.
        (expand_vec_cmp_expr): New.
        (can_vec_mask_load_store_p): Add MASK_MODE arg.
        * tree-if-conv.c (ifcvt_can_use_mask_load_store): Adjust to
        can_vec_mask_load_store_p signature change.
        (predicate_mem_writes): Use boolean mask.
        * tree-vect-data-refs.c (vect_get_new_vect_var): Support vect_mask_var.
        (vect_create_destination_var): Likewise.
        * tree-vect-loop.c (vect_determine_vectorization_factor): Ignore mask
        operations for VF.  Add mask type computation.
        * tree-vect-stmts.c (vect_init_vector): Support mask invariants.
        (vect_get_vec_def_for_operand): Support mask constant.
        (vectorizable_mask_load_store): Adjust to can_vec_mask_load_store_p
        signature change.
        (vectorizable_condition): Use vector mask type for vector comparison.
        (vectorizable_comparison): New.
        (vect_analyze_stmt): Add vectorizable_comparison.
        (vect_transform_stmt): Likewise.
        (get_mask_type_for_scalar_type): New.
        * tree-vectorizer.h (enum vect_var_kind): Add vect_mask_var.
        (enum stmt_vec_info_type): Add comparison_vec_info_type.
        (get_mask_type_for_scalar_type): New.

* avx512-vec-bool-05-bool-patterns.ChangeLog

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * tree-vect-patterns.c (check_bool_pattern): Fail the check if
        we can vectorize the comparison directly.
        (search_type_for_mask): New.
        (vect_recog_bool_pattern): Support cases when bool pattern
        check fails.

* avx512-vec-bool-06-i386.ChangeLog

2015-09-15  Ilya Enkovich  <enkovich....@gmail.com>

        * config/i386/i386-protos.h (ix86_expand_mask_vec_cmp): New.
        (ix86_expand_int_vec_cmp): New.
        (ix86_expand_fp_vec_cmp): New.
        * config/i386/i386.c (ix86_expand_sse_cmp): Allow NULL for
        op_true and op_false.
        (ix86_int_cmp_code_to_pcmp_immediate): New.
        (ix86_fp_cmp_code_to_pcmp_immediate): New.
        (ix86_cmp_code_to_pcmp_immediate): New.
        (ix86_expand_mask_vec_cmp): New.
        (ix86_expand_fp_vec_cmp): New.
        (ix86_expand_int_sse_cmp): New.
        (ix86_expand_int_vcond): Use ix86_expand_int_sse_cmp.
        (ix86_expand_int_vec_cmp): New.
        (ix86_get_mask_mode): New.
        (TARGET_VECTORIZE_GET_MASK_MODE): New.
        * config/i386/sse.md (avx512fmaskmodelower): New.
        (vec_cmp<mode><avx512fmaskmodelower>): New.
        (vec_cmp<mode><sseintvecmodelower>): New.
        (vec_cmpv2div2di): New.
        (vec_cmpu<mode><avx512fmaskmodelower>): New.
        (vec_cmpu<mode><sseintvecmodelower>): New.
        (vec_cmpuv2div2di): New.
        (maskload<mode>): Rename to ...
        (maskload<mode><sseintvecmodelower>): ... this.
        (maskstore<mode>): Rename to ...
        (maskstore<mode><sseintvecmodelower>): ... this.
        (maskload<mode><avx512fmaskmodelower>): New.
        (maskstore<mode><avx512fmaskmodelower>): New.
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f5a1f84..acdfcd5 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -5688,6 +5688,11 @@ mode returned by @code{TARGET_VECTORIZE_PREFERRED_SIMD_MODE}.
 The default is zero which means to not iterate over other vector sizes.
 @end deftypefn
 
+@deftypefn {Target Hook} machine_mode TARGET_VECTORIZE_GET_MASK_MODE (unsigned @var{nunits}, unsigned @var{length})
+This hook returns the mode to be used for a mask for a vector
+of the specified @var{length} (in bytes), with @var{nunits} elements.
+@end deftypefn
+
 @deftypefn {Target Hook} {void *} TARGET_VECTORIZE_INIT_COST (struct loop *@var{loop_info})
 This hook should initialize target-specific data structures in preparation for 
modeling the costs of vectorizing a loop or basic block.  The default allocates 
three unsigned integers for accumulating costs for the prologue, body, and 
epilogue of the loop or basic block.  If @var{loop_info} is non-NULL, it 
identifies the loop being vectorized; otherwise a single block is being 
vectorized.
 @end deftypefn
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 9d5ac0a..52e912a 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -4225,6 +4225,8 @@ address;  but often a machine-dependent strategy can generate better code.
 
 @hook TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
 
+@hook TARGET_VECTORIZE_GET_MASK_MODE
+
 @hook TARGET_VECTORIZE_INIT_COST
 
 @hook TARGET_VECTORIZE_ADD_STMT_COST
diff --git a/gcc/stor-layout.c b/gcc/stor-layout.c
index 938e54b..f24a0c4 100644
--- a/gcc/stor-layout.c
+++ b/gcc/stor-layout.c
@@ -2184,11 +2184,22 @@ layout_type (tree type)
 
        TYPE_SATURATING (type) = TYPE_SATURATING (TREE_TYPE (type));
         TYPE_UNSIGNED (type) = TYPE_UNSIGNED (TREE_TYPE (type));
-       TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
-                                                TYPE_SIZE_UNIT (innertype),
-                                                size_int (nunits));
-       TYPE_SIZE (type) = int_const_binop (MULT_EXPR, TYPE_SIZE (innertype),
-                                           bitsize_int (nunits));
+       if (VECTOR_MASK_TYPE_P (type))
+         {
+           TYPE_SIZE_UNIT (type)
+             = size_int (GET_MODE_SIZE (type->type_common.mode));
+           TYPE_SIZE (type)
+             = bitsize_int (GET_MODE_BITSIZE (type->type_common.mode));
+         }
+       else
+         {
+           TYPE_SIZE_UNIT (type) = int_const_binop (MULT_EXPR,
+                                                    TYPE_SIZE_UNIT (innertype),
+                                                    size_int (nunits));
+           TYPE_SIZE (type) = int_const_binop (MULT_EXPR,
+                                               TYPE_SIZE (innertype),
+                                               bitsize_int (nunits));
+         }
 
        /* For vector types, we do not default to the mode's alignment.
           Instead, query a target hook, defaulting to natural alignment.
@@ -2455,7 +2466,14 @@ vector_type_mode (const_tree t)
       machine_mode innermode = TREE_TYPE (t)->type_common.mode;
 
       /* For integers, try mapping it to a same-sized scalar mode.  */
-      if (GET_MODE_CLASS (innermode) == MODE_INT)
+      if (VECTOR_MASK_TYPE_P (t))
+       {
+         mode = mode_for_size (GET_MODE_BITSIZE (mode), MODE_INT, 0);
+
+         if (mode != VOIDmode && have_regs_of_mode[mode])
+           return mode;
+       }
+      else if (GET_MODE_CLASS (innermode) == MODE_INT)
        {
          mode = mode_for_size (TYPE_VECTOR_SUBPARTS (t)
                                * GET_MODE_BITSIZE (innermode), MODE_INT, 0);
diff --git a/gcc/target.def b/gcc/target.def
index 4edc209..c5b8ed9 100644
--- a/gcc/target.def
+++ b/gcc/target.def
@@ -1789,6 +1789,15 @@ The default is zero which means to not iterate over other vector sizes.",
  (void),
  default_autovectorize_vector_sizes)
 
+/* Function to get a target mode for a vector mask.  */
+DEFHOOK
+(get_mask_mode,
+ "This hook returns mode to be used for a mask to be used for a vector\n\
+of specified @var{length} with @var{nunits} elements.",
+ machine_mode,
+ (unsigned nunits, unsigned length),
+ default_get_mask_mode)
+
 /* Target builtin that implements vector gather operation.  */
 DEFHOOK
 (builtin_gather,
diff --git a/gcc/targhooks.c b/gcc/targhooks.c
index 7238c8f..ac01d57 100644
--- a/gcc/targhooks.c
+++ b/gcc/targhooks.c
@@ -1087,6 +1087,20 @@ default_autovectorize_vector_sizes (void)
   return 0;
 }
 
+/* By default a vector of integers is used as a mask.  */
+
+machine_mode
+default_get_mask_mode (unsigned nunits, unsigned vector_size)
+{
+  unsigned elem_size = vector_size / nunits;
+  machine_mode elem_mode
+    = smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
+
+  gcc_assert (elem_size * nunits == vector_size);
+
+  return mode_for_vector (elem_mode, nunits);
+}
+
 /* By default, the cost model accumulates three separate costs (prologue,
    loop body, and epilogue) for a vectorized loop or block.  So allocate an
    array of three unsigned ints, set it to zero, and return its address.  */
diff --git a/gcc/targhooks.h b/gcc/targhooks.h
index 5ae991d..cc7263f 100644
--- a/gcc/targhooks.h
+++ b/gcc/targhooks.h
@@ -100,6 +100,7 @@ default_builtin_support_vector_misalignment (machine_mode mode,
                                             int, bool);
 extern machine_mode default_preferred_simd_mode (machine_mode mode);
 extern unsigned int default_autovectorize_vector_sizes (void);
+extern machine_mode default_get_mask_mode (unsigned, unsigned);
 extern void *default_init_cost (struct loop *);
 extern unsigned default_add_stmt_cost (void *, int, enum vect_cost_for_stmt,
                                       struct _stmt_vec_info *, int,
diff --git a/gcc/tree.c b/gcc/tree.c
index af3a6a3..946d2ad 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -9742,8 +9742,9 @@ make_vector_type (tree innertype, int nunits, machine_mode mode)
 
   if (TYPE_STRUCTURAL_EQUALITY_P (innertype))
     SET_TYPE_STRUCTURAL_EQUALITY (t);
-  else if (TYPE_CANONICAL (innertype) != innertype
-          || mode != VOIDmode)
+  else if ((TYPE_CANONICAL (innertype) != innertype
+           || mode != VOIDmode)
+          && !VECTOR_MASK_TYPE_P (t))
     TYPE_CANONICAL (t)
       = make_vector_type (TYPE_CANONICAL (innertype), nunits, VOIDmode);
 
@@ -10568,6 +10569,36 @@ build_vector_type (tree innertype, int nunits)
   return make_vector_type (innertype, nunits, VOIDmode);
 }
 
+/* Build a truth vector with the specified NUNITS and VECTOR_SIZE.  */
+
+tree
+build_truth_vector_type (unsigned nunits, unsigned vector_size)
+{
+  machine_mode mask_mode = targetm.vectorize.get_mask_mode (nunits,
+                                                           vector_size);
+
+  if (mask_mode == VOIDmode)
+    return NULL;
+
+  return make_vector_type (boolean_type_node, nunits, mask_mode);
+}
+
+/* Returns a vector type corresponding to a comparison of VECTYPE.  */
+
+tree
+build_same_sized_truth_vector_type (tree vectype)
+{
+  if (VECTOR_MASK_TYPE_P (vectype))
+    return vectype;
+
+  unsigned HOST_WIDE_INT size = GET_MODE_SIZE (TYPE_MODE (vectype));
+
+  if (!size)
+    size = tree_to_uhwi (TYPE_SIZE_UNIT (vectype));
+
+  return build_truth_vector_type (TYPE_VECTOR_SUBPARTS (vectype), size);
+}
+
 /* Similarly, but builds a variant type with TYPE_VECTOR_OPAQUE set.  */
 
 tree
@@ -11054,9 +11085,10 @@ truth_type_for (tree type)
 {
   if (TREE_CODE (type) == VECTOR_TYPE)
     {
-      tree elem = lang_hooks.types.type_for_size
-        (GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (type))), 0);
-      return build_opaque_vector_type (elem, TYPE_VECTOR_SUBPARTS (type));
+      if (VECTOR_MASK_TYPE_P (type))
+       return type;
+      return build_truth_vector_type (TYPE_VECTOR_SUBPARTS (type),
+                                     GET_MODE_SIZE (TYPE_MODE (type)));
     }
   else
     return boolean_type_node;
diff --git a/gcc/tree.h b/gcc/tree.h
index 2cd6ec4..09fb26d 100644
--- a/gcc/tree.h
+++ b/gcc/tree.h
@@ -469,6 +469,12 @@ extern void omp_clause_range_check_failed (const_tree, const char *, int,
 
 #define VECTOR_TYPE_P(TYPE) (TREE_CODE (TYPE) == VECTOR_TYPE)
 
+/* Nonzero if TYPE represents a vector of booleans.  */
+
+#define VECTOR_MASK_TYPE_P(TYPE)                               \
+  (TREE_CODE (TYPE) == VECTOR_TYPE                     \
+   && TREE_CODE (TREE_TYPE (TYPE)) == BOOLEAN_TYPE)
+
 /* Nonzero if TYPE represents an integral type.  Note that we do not
    include COMPLEX types here.  Keep these checks in ascending code
    order.  */
@@ -3820,6 +3826,8 @@ extern tree build_reference_type_for_mode (tree, machine_mode, bool);
 extern tree build_reference_type (tree);
 extern tree build_vector_type_for_mode (tree, machine_mode);
 extern tree build_vector_type (tree innertype, int nunits);
+extern tree build_truth_vector_type (unsigned, unsigned);
+extern tree build_same_sized_truth_vector_type (tree vectype);
 extern tree build_opaque_vector_type (tree innertype, int nunits);
 extern tree build_index_type (tree);
 extern tree build_array_type (tree, tree);
diff --git a/gcc/c/c-typeck.c b/gcc/c/c-typeck.c
index e8c8189..6ea4f19 100644
--- a/gcc/c/c-typeck.c
+++ b/gcc/c/c-typeck.c
@@ -4753,6 +4753,18 @@ build_conditional_expr (location_t colon_loc, tree ifexp, bool ifexp_bcp,
                       && TREE_CODE (orig_op2) == INTEGER_CST
                       && !TREE_OVERFLOW (orig_op2)));
     }
+
+  /* Need to convert condition operand into a vector mask.  */
+  if (VECTOR_TYPE_P (TREE_TYPE (ifexp)))
+    {
+      tree vectype = TREE_TYPE (ifexp);
+      tree elem_type = TREE_TYPE (vectype);
+      tree zero = build_int_cst (elem_type, 0);
+      tree zero_vec = build_vector_from_val (vectype, zero);
+      tree cmp_type = build_same_sized_truth_vector_type (vectype);
+      ifexp = build2 (NE_EXPR, cmp_type, ifexp, zero_vec);
+    }
+
   if (int_const || (ifexp_bcp && TREE_CODE (ifexp) == INTEGER_CST))
     ret = fold_build3_loc (colon_loc, COND_EXPR, result_type, ifexp, op1, op2);
   else
@@ -10195,6 +10207,19 @@ push_cleanup (tree decl, tree cleanup, bool eh_only)
   STATEMENT_LIST_STMT_EXPR (list) = stmt_expr;
 }
 
+/* Build a vector comparison using VEC_COND_EXPR.  */
+
+static tree
+build_vec_cmp (tree_code code, tree type,
+              tree arg0, tree arg1)
+{
+  tree zero_vec = build_zero_cst (type);
+  tree minus_one_vec = build_minus_one_cst (type);
+  tree cmp_type = build_same_sized_truth_vector_type (type);
+  tree cmp = build2 (code, cmp_type, arg0, arg1);
+  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
+}
+
 /* Build a binary-operation expression without default conversions.
    CODE is the kind of expression to build.
    LOCATION is the operator's location.
@@ -10753,7 +10778,8 @@ build_binary_op (location_t location, enum tree_code code,
           result_type = build_opaque_vector_type (intt,
                                                  TYPE_VECTOR_SUBPARTS (type0));
           converted = 1;
-          break;
+         ret = build_vec_cmp (resultcode, result_type, op0, op1);
+          goto return_build_binary_op;
         }
       if (FLOAT_TYPE_P (type0) || FLOAT_TYPE_P (type1))
        warning_at (location,
@@ -10895,7 +10921,8 @@ build_binary_op (location_t location, enum tree_code code,
           result_type = build_opaque_vector_type (intt,
                                                  TYPE_VECTOR_SUBPARTS (type0));
           converted = 1;
-          break;
+         ret = build_vec_cmp (resultcode, result_type, op0, op1);
+          goto return_build_binary_op;
         }
       build_type = integer_type_node;
       if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 8d4a9e2..7f16e84 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -4727,8 +4727,10 @@ build_conditional_expr_1 (location_t loc, tree arg1, tree arg2, tree arg3,
        }
 
       if (!COMPARISON_CLASS_P (arg1))
-       arg1 = cp_build_binary_op (loc, NE_EXPR, arg1,
-                                  build_zero_cst (arg1_type), complain);
+       {
+         tree cmp_type = build_same_sized_truth_vector_type (arg1_type);
+         arg1 = build2 (NE_EXPR, cmp_type, arg1, build_zero_cst (arg1_type));
+       }
       return fold_build3 (VEC_COND_EXPR, arg2_type, arg1, arg2, arg3);
     }
 
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index 83fd34c..89bacc2 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -3898,6 +3898,18 @@ build_binary_op (location_t location, enum tree_code code, tree op0, tree op1,
   return cp_build_binary_op (location, code, op0, op1, tf_warning_or_error);
 }
 
+/* Build a vector comparison using VEC_COND_EXPR.  */
+
+static tree
+build_vec_cmp (tree_code code, tree type,
+              tree arg0, tree arg1)
+{
+  tree zero_vec = build_zero_cst (type);
+  tree minus_one_vec = build_minus_one_cst (type);
+  tree cmp_type = build_same_sized_truth_vector_type (type);
+  tree cmp = build2 (code, cmp_type, arg0, arg1);
+  return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
+}
 
 /* Build a binary-operation expression without default conversions.
    CODE is the kind of expression to build.
@@ -4757,7 +4769,7 @@ cp_build_binary_op (location_t location,
          result_type = build_opaque_vector_type (intt,
                                                  TYPE_VECTOR_SUBPARTS (type0));
          converted = 1;
-         break;
+         return build_vec_cmp (resultcode, result_type, op0, op1);
        }
       build_type = boolean_type_node;
       if ((code0 == INTEGER_TYPE || code0 == REAL_TYPE
diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index 5ac73b3..2ce5a84 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -3464,10 +3464,10 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
           return true;
         }
     }
-  /* Or an integer vector type with the same size and element count
+  /* Or a boolean vector type with the same element count
      as the comparison operand types.  */
   else if (TREE_CODE (type) == VECTOR_TYPE
-          && TREE_CODE (TREE_TYPE (type)) == INTEGER_TYPE)
+          && TREE_CODE (TREE_TYPE (type)) == BOOLEAN_TYPE)
     {
       if (TREE_CODE (op0_type) != VECTOR_TYPE
          || TREE_CODE (op1_type) != VECTOR_TYPE)
@@ -3478,12 +3478,7 @@ verify_gimple_comparison (tree type, tree op0, tree op1)
           return true;
         }
 
-      if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type)
-         || (GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (type)))
-             != GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (op0_type))))
-         /* The result of a vector comparison is of signed
-            integral type.  */
-         || TYPE_UNSIGNED (TREE_TYPE (type)))
+      if (TYPE_VECTOR_SUBPARTS (type) != TYPE_VECTOR_SUBPARTS (op0_type))
         {
           error ("invalid vector comparison resulting type");
           debug_generic_expr (type);
@@ -3970,15 +3965,13 @@ verify_gimple_assign_ternary (gassign *stmt)
       break;
 
     case VEC_COND_EXPR:
-      if (!VECTOR_INTEGER_TYPE_P (rhs1_type)
-         || TYPE_SIGN (rhs1_type) != SIGNED
-         || TYPE_SIZE (rhs1_type) != TYPE_SIZE (lhs_type)
+      if (!VECTOR_MASK_TYPE_P (rhs1_type)
          || TYPE_VECTOR_SUBPARTS (rhs1_type)
             != TYPE_VECTOR_SUBPARTS (lhs_type))
        {
-         error ("the first argument of a VEC_COND_EXPR must be of a signed "
-                "integral vector type of the same size and number of "
-                "elements as the result");
+         error ("the first argument of a VEC_COND_EXPR must be of a "
+                "boolean vector type of the same number of elements "
+                "as the result");
          debug_generic_expr (lhs_type);
          debug_generic_expr (rhs1_type);
          return true;
diff --git a/gcc/tree-vect-generic.c b/gcc/tree-vect-generic.c
index be3d27f..a89b08c 100644
--- a/gcc/tree-vect-generic.c
+++ b/gcc/tree-vect-generic.c
@@ -122,7 +122,19 @@ tree_vec_extract (gimple_stmt_iterator *gsi, tree type,
                  tree t, tree bitsize, tree bitpos)
 {
   if (bitpos)
-    return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+    {
+      if (TREE_CODE (type) == BOOLEAN_TYPE)
+       {
+         tree itype
+           = build_nonstandard_integer_type (tree_to_uhwi (bitsize), 0);
+         tree field = gimplify_build3 (gsi, BIT_FIELD_REF, itype, t,
+                                       bitsize, bitpos);
+         return gimplify_build2 (gsi, NE_EXPR, type, field,
+                                 build_zero_cst (itype));
+       }
+      else
+       return gimplify_build3 (gsi, BIT_FIELD_REF, type, t, bitsize, bitpos);
+    }
   else
     return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, t);
 }
@@ -171,6 +183,21 @@ do_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
                          build_int_cst (comp_type, 0));
 }
 
+/* Construct the expression (A[BITPOS] code B[BITPOS]).
+
+   INNER_TYPE is the type of A's and B's elements.
+
+   The returned expression is of boolean type.  */
+static tree
+do_bool_compare (gimple_stmt_iterator *gsi, tree inner_type, tree a, tree b,
+                tree bitpos, tree bitsize, enum tree_code code)
+{
+  a = tree_vec_extract (gsi, inner_type, a, bitsize, bitpos);
+  b = tree_vec_extract (gsi, inner_type, b, bitsize, bitpos);
+
+  return gimplify_build2 (gsi, code, boolean_type_node, a, b);
+}
+
 /* Expand vector addition to scalars.  This does bit twiddling
    in order to increase parallelism:
 
@@ -350,9 +377,27 @@ expand_vector_comparison (gimple_stmt_iterator *gsi, tree type, tree op0,
                           tree op1, enum tree_code code)
 {
   tree t;
-  if (! expand_vec_cond_expr_p (type, TREE_TYPE (op0)))
-    t = expand_vector_piecewise (gsi, do_compare, type,
-                                TREE_TYPE (TREE_TYPE (op0)), op0, op1, code);
+  if (!expand_vec_cmp_expr_p (TREE_TYPE (op0), type)
+      && !expand_vec_cond_expr_p (type, TREE_TYPE (op0)))
+    {
+      if (VECTOR_MODE_P (TYPE_MODE (type)))
+       {
+         tree inner_type = TREE_TYPE (TREE_TYPE (op0));
+         tree elem_type = build_nonstandard_integer_type
+           (GET_MODE_BITSIZE (TYPE_MODE (inner_type)), 0);
+         tree int_vec_type = build_vector_type (elem_type,
+                                                TYPE_VECTOR_SUBPARTS (type));
+         tree vec = expand_vector_piecewise (gsi, do_compare, int_vec_type,
+                                             TREE_TYPE (TREE_TYPE (op0)),
+                                             op0, op1, code);
+
+         return gimplify_build1 (gsi, VIEW_CONVERT_EXPR, type, vec);
+       }
+      else
+       t = expand_vector_piecewise (gsi, do_bool_compare, type,
+                                    TREE_TYPE (TREE_TYPE (op0)),
+                                    op0, op1, code);
+    }
   else
     t = NULL_TREE;
 
@@ -625,11 +674,12 @@ expand_vector_divmod (gimple_stmt_iterator *gsi, tree type, tree op0,
          if (addend == NULL_TREE
              && expand_vec_cond_expr_p (type, type))
            {
-             tree zero, cst, cond;
+             tree zero, cst, cond, mask_type;
              gimple stmt;
 
+             mask_type = build_same_sized_truth_vector_type (type);
              zero = build_zero_cst (type);
-             cond = build2 (LT_EXPR, type, op0, zero);
+             cond = build2 (LT_EXPR, mask_type, op0, zero);
              for (i = 0; i < nunits; i++)
                vec[i] = build_int_cst (TREE_TYPE (type),
                                        ((unsigned HOST_WIDE_INT) 1
@@ -1506,6 +1556,12 @@ expand_vector_operations_1 (gimple_stmt_iterator *gsi)
   if (TREE_CODE (type) != VECTOR_TYPE)
     return;
 
+  /* A scalar operation pretending to be a vector one.  */
+  if (VECTOR_MASK_TYPE_P (type)
+      && !VECTOR_MODE_P (TYPE_MODE (type))
+      && TYPE_MODE (type) != BLKmode)
+    return;
+
   if (CONVERT_EXPR_CODE_P (code)
       || code == FLOAT_EXPR
       || code == FIX_TRUNC_EXPR
diff --git a/gcc/expr.c b/gcc/expr.c
index 1e820b4..6ae0c4d 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -11000,9 +11000,15 @@ do_store_flag (sepops ops, rtx target, machine_mode mode)
   if (TREE_CODE (ops->type) == VECTOR_TYPE)
     {
       tree ifexp = build2 (ops->code, ops->type, arg0, arg1);
-      tree if_true = constant_boolean_node (true, ops->type);
-      tree if_false = constant_boolean_node (false, ops->type);
-      return expand_vec_cond_expr (ops->type, ifexp, if_true, if_false, target);
+      if (VECTOR_MASK_TYPE_P (ops->type))
+       return expand_vec_cmp_expr (ops->type, ifexp, target);
+      else
+       {
+         tree if_true = constant_boolean_node (true, ops->type);
+         tree if_false = constant_boolean_node (false, ops->type);
+         return expand_vec_cond_expr (ops->type, ifexp, if_true,
+                                      if_false, target);
+       }
     }
 
   /* Get the rtx comparison code to use.  We know that EXP is a comparison
@@ -11289,6 +11295,39 @@ try_tablejump (tree index_type, tree index_expr, tree minval, tree range,
   return 1;
 }
 
+/* Return a CONST_VECTOR rtx representing vector mask for
+   a VECTOR_CST of booleans.  */
+static rtx
+const_vector_mask_from_tree (tree exp)
+{
+  rtvec v;
+  unsigned i;
+  int units;
+  tree elt;
+  machine_mode inner, mode;
+
+  mode = TYPE_MODE (TREE_TYPE (exp));
+  units = GET_MODE_NUNITS (mode);
+  inner = GET_MODE_INNER (mode);
+
+  v = rtvec_alloc (units);
+
+  for (i = 0; i < VECTOR_CST_NELTS (exp); ++i)
+    {
+      elt = VECTOR_CST_ELT (exp, i);
+
+      gcc_assert (TREE_CODE (elt) == INTEGER_CST);
+      if (integer_zerop (elt))
+       RTVEC_ELT (v, i) = CONST0_RTX (inner);
+      else if (integer_onep (elt))
+       RTVEC_ELT (v, i) = CONSTM1_RTX (inner);
+      else
+       gcc_unreachable ();
+    }
+
+  return gen_rtx_CONST_VECTOR (mode, v);
+}
+
 /* Return a CONST_VECTOR rtx for a VECTOR_CST tree.  */
 static rtx
 const_vector_from_tree (tree exp)
@@ -11304,6 +11343,9 @@ const_vector_from_tree (tree exp)
   if (initializer_zerop (exp))
     return CONST0_RTX (mode);
 
+  if (VECTOR_MASK_TYPE_P (TREE_TYPE (exp)))
+    return const_vector_mask_from_tree (exp);
+
   units = GET_MODE_NUNITS (mode);
   inner = GET_MODE_INNER (mode);
 
diff --git a/gcc/internal-fn.c b/gcc/internal-fn.c
index e785946..4ca0a40 100644
--- a/gcc/internal-fn.c
+++ b/gcc/internal-fn.c
@@ -1885,7 +1885,9 @@ expand_MASK_LOAD (gcall *stmt)
   create_output_operand (&ops[0], target, TYPE_MODE (type));
   create_fixed_operand (&ops[1], mem);
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskload_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskload_optab, TYPE_MODE (type),
+                                     TYPE_MODE (TREE_TYPE (maskt))),
+              3, ops);
 }
 
 static void
@@ -1908,7 +1910,9 @@ expand_MASK_STORE (gcall *stmt)
   create_fixed_operand (&ops[0], mem);
   create_input_operand (&ops[1], reg, TYPE_MODE (type));
   create_input_operand (&ops[2], mask, TYPE_MODE (TREE_TYPE (maskt)));
-  expand_insn (optab_handler (maskstore_optab, TYPE_MODE (type)), 3, ops);
+  expand_insn (convert_optab_handler (maskstore_optab, TYPE_MODE (type),
+                                     TYPE_MODE (TREE_TYPE (maskt))),
+              3, ops);
 }
 
 static void
diff --git a/gcc/optabs.c b/gcc/optabs.c
index e533e6e..fd9932f 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -6490,11 +6490,13 @@ get_rtx_code (enum tree_code tcode, bool unsignedp)
 }
 
 /* Return comparison rtx for COND. Use UNSIGNEDP to select signed or
-   unsigned operators. Do not generate compare instruction.  */
+   unsigned operators.  OPNO holds the index of the first comparison
+   operand in the insn with code ICODE.  Do not generate a compare insn.  */
 
 static rtx
 vector_compare_rtx (enum tree_code tcode, tree t_op0, tree t_op1,
-                   bool unsignedp, enum insn_code icode)
+                   bool unsignedp, enum insn_code icode,
+                   unsigned int opno)
 {
   struct expand_operand ops[2];
   rtx rtx_op0, rtx_op1;
@@ -6520,7 +6522,7 @@ vector_compare_rtx (enum tree_code tcode, tree t_op0, tree t_op1,
 
   create_input_operand (&ops[0], rtx_op0, m0);
   create_input_operand (&ops[1], rtx_op1, m1);
-  if (!maybe_legitimize_operands (icode, 4, 2, ops))
+  if (!maybe_legitimize_operands (icode, opno, 2, ops))
     gcc_unreachable ();
   return gen_rtx_fmt_ee (rcode, VOIDmode, ops[0].value, ops[1].value);
 }
@@ -6843,16 +6845,25 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree op1, tree op2,
       op0a = TREE_OPERAND (op0, 0);
       op0b = TREE_OPERAND (op0, 1);
       tcode = TREE_CODE (op0);
+      unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
     }
   else
     {
+      gcc_assert (VECTOR_MASK_TYPE_P (TREE_TYPE (op0)));
+      if (GET_MODE_CLASS (TYPE_MODE (TREE_TYPE (op0))) != MODE_VECTOR_INT)
+       {
+         /* This is a vcond with mask.  To be supported soon...  */
+         gcc_unreachable ();
+       }
       /* Fake op0 < 0.  */
-      gcc_assert (!TYPE_UNSIGNED (TREE_TYPE (op0)));
-      op0a = op0;
-      op0b = build_zero_cst (TREE_TYPE (op0));
-      tcode = LT_EXPR;
+      else
+       {
+         op0a = op0;
+         op0b = build_zero_cst (TREE_TYPE (op0));
+         tcode = LT_EXPR;
+         unsignedp = false;
+       }
     }
-  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
   cmp_op_mode = TYPE_MODE (TREE_TYPE (op0a));
 
 
@@ -6863,7 +6874,7 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree op1, tree op2,
   if (icode == CODE_FOR_nothing)
     return 0;
 
-  comparison = vector_compare_rtx (tcode, op0a, op0b, unsignedp, icode);
+  comparison = vector_compare_rtx (tcode, op0a, op0b, unsignedp, icode, 4);
   rtx_op1 = expand_normal (op1);
   rtx_op2 = expand_normal (op2);
 
@@ -6877,6 +6888,63 @@ expand_vec_cond_expr (tree vec_cond_type, tree op0, tree op1, tree op2,
   return ops[0].value;
 }
 
+/* Return the insn code for a comparison operator with VMODE
+   resulting in MASK_MODE, unsigned if UNS is true.  */
+
+static inline enum insn_code
+get_vec_cmp_icode (machine_mode vmode, machine_mode mask_mode, bool uns)
+{
+  optab tab = uns ? vec_cmpu_optab : vec_cmp_optab;
+  return convert_optab_handler (tab, vmode, mask_mode);
+}
+
+/* Return TRUE if an appropriate vector insn is available
+   for a vector comparison expr with vector type VALUE_TYPE
+   and resulting mask of MASK_TYPE.  */
+
+bool
+expand_vec_cmp_expr_p (tree value_type, tree mask_type)
+{
+  enum insn_code icode = get_vec_cmp_icode (TYPE_MODE (value_type),
+                                           TYPE_MODE (mask_type),
+                                           TYPE_UNSIGNED (value_type));
+  return (icode != CODE_FOR_nothing);
+}
+
+/* Generate insns for a vector comparison into a mask.  */
+
+rtx
+expand_vec_cmp_expr (tree type, tree exp, rtx target)
+{
+  struct expand_operand ops[4];
+  enum insn_code icode;
+  rtx comparison;
+  machine_mode mask_mode = TYPE_MODE (type);
+  machine_mode vmode;
+  bool unsignedp;
+  tree op0a, op0b;
+  enum tree_code tcode;
+
+  op0a = TREE_OPERAND (exp, 0);
+  op0b = TREE_OPERAND (exp, 1);
+  tcode = TREE_CODE (exp);
+
+  unsignedp = TYPE_UNSIGNED (TREE_TYPE (op0a));
+  vmode = TYPE_MODE (TREE_TYPE (op0a));
+
+  icode = get_vec_cmp_icode (vmode, mask_mode, unsignedp);
+  if (icode == CODE_FOR_nothing)
+    return 0;
+
+  comparison = vector_compare_rtx (tcode, op0a, op0b, unsignedp, icode, 2);
+  create_output_operand (&ops[0], target, mask_mode);
+  create_fixed_operand (&ops[1], comparison);
+  create_fixed_operand (&ops[2], XEXP (comparison, 0));
+  create_fixed_operand (&ops[3], XEXP (comparison, 1));
+  expand_insn (icode, 4, ops);
+  return ops[0].value;
+}
+
 /* Return non-zero if a highpart multiply is supported of can be synthisized.
    For the benefit of expand_mult_highpart, the return value is 1 for direct,
    2 for even/odd widening, and 3 for hi/lo widening.  */
@@ -7002,26 +7070,32 @@ expand_mult_highpart (machine_mode mode, rtx op0, rtx op1,
 
 /* Return true if target supports vector masked load/store for mode.  */
 bool
-can_vec_mask_load_store_p (machine_mode mode, bool is_load)
+can_vec_mask_load_store_p (machine_mode mode,
+                          machine_mode mask_mode,
+                          bool is_load)
 {
   optab op = is_load ? maskload_optab : maskstore_optab;
-  machine_mode vmode;
   unsigned int vector_sizes;
 
   /* If mode is vector mode, check it directly.  */
   if (VECTOR_MODE_P (mode))
-    return optab_handler (op, mode) != CODE_FOR_nothing;
+    return convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing;
 
   /* Otherwise, return true if there is some vector mode with
      the mask load/store supported.  */
 
   /* See if there is any chance the mask load or store might be
      vectorized.  If not, punt.  */
-  vmode = targetm.vectorize.preferred_simd_mode (mode);
-  if (!VECTOR_MODE_P (vmode))
+  mode = targetm.vectorize.preferred_simd_mode (mode);
+  if (!VECTOR_MODE_P (mode))
+    return false;
+
+  mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (mode),
+                                              GET_MODE_SIZE (mode));
+  if (mask_mode == VOIDmode)
     return false;
 
-  if (optab_handler (op, vmode) != CODE_FOR_nothing)
+  if (convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing)
     return true;
 
   vector_sizes = targetm.vectorize.autovectorize_vector_sizes ();
@@ -7031,9 +7105,12 @@ can_vec_mask_load_store_p (machine_mode mode, bool is_load)
       vector_sizes &= ~cur;
       if (cur <= GET_MODE_SIZE (mode))
        continue;
-      vmode = mode_for_vector (mode, cur / GET_MODE_SIZE (mode));
-      if (VECTOR_MODE_P (vmode)
-         && optab_handler (op, vmode) != CODE_FOR_nothing)
+      mode = mode_for_vector (mode, cur / GET_MODE_SIZE (mode));
+      mask_mode = targetm.vectorize.get_mask_mode (GET_MODE_NUNITS (mode),
+                                                  cur);
+      if (VECTOR_MODE_P (mode)
+         && mask_mode != VOIDmode
+         && convert_optab_handler (op, mode, mask_mode) != CODE_FOR_nothing)
        return true;
     }
   return false;
diff --git a/gcc/optabs.def b/gcc/optabs.def
index 888b21c..9804378 100644
--- a/gcc/optabs.def
+++ b/gcc/optabs.def
@@ -61,6 +61,10 @@ OPTAB_CD(vec_load_lanes_optab, "vec_load_lanes$a$b")
 OPTAB_CD(vec_store_lanes_optab, "vec_store_lanes$a$b")
 OPTAB_CD(vcond_optab, "vcond$a$b")
 OPTAB_CD(vcondu_optab, "vcondu$a$b")
+OPTAB_CD(vec_cmp_optab, "vec_cmp$a$b")
+OPTAB_CD(vec_cmpu_optab, "vec_cmpu$a$b")
+OPTAB_CD(maskload_optab, "maskload$a$b")
+OPTAB_CD(maskstore_optab, "maskstore$a$b")
 
 OPTAB_NL(add_optab, "add$P$a3", PLUS, "add", '3', gen_int_fp_fixed_libfunc)
 OPTAB_NX(add_optab, "add$F$a3")
@@ -264,8 +268,6 @@ OPTAB_D (udot_prod_optab, "udot_prod$I$a")
 OPTAB_D (usum_widen_optab, "widen_usum$I$a3")
 OPTAB_D (usad_optab, "usad$I$a")
 OPTAB_D (ssad_optab, "ssad$I$a")
-OPTAB_D (maskload_optab, "maskload$a")
-OPTAB_D (maskstore_optab, "maskstore$a")
 OPTAB_D (vec_extract_optab, "vec_extract$a")
 OPTAB_D (vec_init_optab, "vec_init$a")
 OPTAB_D (vec_pack_sfix_trunc_optab, "vec_pack_sfix_trunc_$a")
diff --git a/gcc/optabs.h b/gcc/optabs.h
index 95f5cbc..dfe9ebf 100644
--- a/gcc/optabs.h
+++ b/gcc/optabs.h
@@ -496,6 +496,12 @@ extern bool can_vec_perm_p (machine_mode, bool, const unsigned char *);
 extern rtx expand_vec_perm (machine_mode, rtx, rtx, rtx, rtx);
 
+/* Return true if target supports vector comparison.  */
+bool expand_vec_cmp_expr_p (tree, tree);
+
+/* Generate code for vector comparison.  */
+extern rtx expand_vec_cmp_expr (tree, tree, rtx);
+
 /* Return tree if target supports vector operations for COND_EXPR.  */
 bool expand_vec_cond_expr_p (tree, tree);
 
 /* Generate code for VEC_COND_EXPR.  */
@@ -508,7 +514,7 @@ extern int can_mult_highpart_p (machine_mode, bool);
 extern rtx expand_mult_highpart (machine_mode, rtx, rtx, rtx, bool);
 
 /* Return true if target supports vector masked load/store for mode.  */
-extern bool can_vec_mask_load_store_p (machine_mode, bool);
+extern bool can_vec_mask_load_store_p (machine_mode, machine_mode, bool);
 
 /* Return true if there is an inline compare and swap pattern.  */
 extern bool can_compare_and_swap_p (machine_mode, bool);
diff --git a/gcc/tree-if-conv.c b/gcc/tree-if-conv.c
index 291e602..d66517d 100644
--- a/gcc/tree-if-conv.c
+++ b/gcc/tree-if-conv.c
@@ -811,7 +811,7 @@ ifcvt_can_use_mask_load_store (gimple stmt)
       || VECTOR_MODE_P (mode))
     return false;
 
-  if (can_vec_mask_load_store_p (mode, is_load))
+  if (can_vec_mask_load_store_p (mode, VOIDmode, is_load))
     return true;
 
   return false;
@@ -2068,7 +2068,7 @@ predicate_mem_writes (loop_p loop)
          {
            tree lhs = gimple_assign_lhs (stmt);
            tree rhs = gimple_assign_rhs1 (stmt);
-           tree ref, addr, ptr, masktype, mask_op0, mask_op1, mask;
+           tree ref, addr, ptr, masktype, mask;
            gimple new_stmt;
            int bitsize = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (lhs)));
            ref = TREE_CODE (lhs) == SSA_NAME ? rhs : lhs;
@@ -2082,15 +2082,47 @@ predicate_mem_writes (loop_p loop)
              mask = vect_masks[index];
            else
              {
-               masktype = build_nonstandard_integer_type (bitsize, 1);
-               mask_op0 = build_int_cst (masktype, swap ? 0 : -1);
-               mask_op1 = build_int_cst (masktype, swap ? -1 : 0);
-               cond = force_gimple_operand_gsi_1 (&gsi, unshare_expr (cond),
-                                                  is_gimple_condexpr,
-                                                  NULL_TREE,
-                                                  true, GSI_SAME_STMT);
-               mask = fold_build_cond_expr (masktype, unshare_expr (cond),
-                                            mask_op0, mask_op1);
+               masktype = boolean_type_node;
+               if ((TREE_CODE (cond) == NE_EXPR
+                    || TREE_CODE (cond) == EQ_EXPR)
+                   && (integer_zerop (TREE_OPERAND (cond, 1))
+                       || integer_onep (TREE_OPERAND (cond, 1)))
+                   && TREE_CODE (TREE_TYPE (TREE_OPERAND (cond, 0)))
+                      == BOOLEAN_TYPE)
+                 {
+                   bool negate = (TREE_CODE (cond) == EQ_EXPR);
+                   if (integer_onep (TREE_OPERAND (cond, 1)))
+                     negate = !negate;
+                   if (swap)
+                     negate = !negate;
+                   mask = TREE_OPERAND (cond, 0);
+                   if (negate)
+                     {
+                       mask = ifc_temp_var (masktype, unshare_expr (cond),
+                                            &gsi);
+                       mask = build1 (TRUTH_NOT_EXPR, masktype, mask);
+                     }
+                 }
+               else if (swap
+                        && TREE_CODE_CLASS (TREE_CODE (cond)) == tcc_comparison)
+                 {
+                   tree op_type = TREE_TYPE (TREE_OPERAND (cond, 0));
+                   tree_code code
+                     = invert_tree_comparison (TREE_CODE (cond),
+                                               HONOR_NANS (op_type));
+                   if (code != ERROR_MARK)
+                     mask = build2 (code, TREE_TYPE (cond),
+                                    TREE_OPERAND (cond, 0),
+                                    TREE_OPERAND (cond, 1));
+                   else
+                     {
+                       mask = ifc_temp_var (masktype, unshare_expr (cond),
+                                            &gsi);
+                       mask = build1 (TRUTH_NOT_EXPR, masktype, mask);
+                     }
+                 }
+               else
+                 mask = unshare_expr (cond);
                mask = ifc_temp_var (masktype, mask, &gsi);
                /* Save mask and its size for further use.  */
                vect_sizes.safe_push (bitsize);
diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index f1eaef4..0a39825 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -3849,6 +3849,9 @@ vect_get_new_vect_var (tree type, enum vect_var_kind var_kind, const char *name)
   case vect_scalar_var:
     prefix = "stmp";
     break;
+  case vect_mask_var:
+    prefix = "mask";
+    break;
   case vect_pointer_var:
     prefix = "vectp";
     break;
@@ -4403,7 +4406,11 @@ vect_create_destination_var (tree scalar_dest, tree vectype)
   tree type;
   enum vect_var_kind kind;
 
-  kind = vectype ? vect_simple_var : vect_scalar_var;
+  kind = vectype
+    ? VECTOR_MASK_TYPE_P (vectype)
+    ? vect_mask_var
+    : vect_simple_var
+    : vect_scalar_var;
   type = vectype ? vectype : TREE_TYPE (scalar_dest);
 
   gcc_assert (TREE_CODE (scalar_dest) == SSA_NAME);
diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index 59c75af..1810f78 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -193,19 +193,21 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
 {
   struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
   basic_block *bbs = LOOP_VINFO_BBS (loop_vinfo);
-  int nbbs = loop->num_nodes;
+  unsigned nbbs = loop->num_nodes;
   unsigned int vectorization_factor = 0;
   tree scalar_type;
   gphi *phi;
   tree vectype;
   unsigned int nunits;
   stmt_vec_info stmt_info;
-  int i;
+  unsigned i;
   HOST_WIDE_INT dummy;
   gimple stmt, pattern_stmt = NULL;
   gimple_seq pattern_def_seq = NULL;
   gimple_stmt_iterator pattern_def_si = gsi_none ();
   bool analyze_pattern_stmt = false;
+  bool bool_result;
+  auto_vec<stmt_vec_info> mask_producers;
 
   if (dump_enabled_p ())
     dump_printf_loc (MSG_NOTE, vect_location,
@@ -424,6 +426,8 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
              return false;
            }
 
+         bool_result = false;
+
          if (STMT_VINFO_VECTYPE (stmt_info))
            {
              /* The only case when a vectype had been already set is for stmts
@@ -444,6 +448,32 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
                scalar_type = TREE_TYPE (gimple_call_arg (stmt, 3));
              else
                scalar_type = TREE_TYPE (gimple_get_lhs (stmt));
+
+             /* Bool ops don't participate in the vectorization factor
+                computation.  For comparisons, use the compared types
+                to compute the factor.  */
+             if (TREE_CODE (scalar_type) == BOOLEAN_TYPE)
+               {
+                 mask_producers.safe_push (stmt_info);
+                 bool_result = true;
+
+                 if (gimple_code (stmt) == GIMPLE_ASSIGN
+                     && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
+                        == tcc_comparison
+                     && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt)))
+                        != BOOLEAN_TYPE)
+                   scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
+                 else
+                   {
+                     if (!analyze_pattern_stmt && gsi_end_p (pattern_def_si))
+                       {
+                         pattern_def_seq = NULL;
+                         gsi_next (&si);
+                       }
+                     continue;
+                   }
+               }
+
              if (dump_enabled_p ())
                {
                  dump_printf_loc (MSG_NOTE, vect_location,
@@ -466,7 +496,8 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
                  return false;
                }
 
-             STMT_VINFO_VECTYPE (stmt_info) = vectype;
+             if (!bool_result)
+               STMT_VINFO_VECTYPE (stmt_info) = vectype;
 
              if (dump_enabled_p ())
                {
@@ -479,8 +510,9 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
          /* The vectorization factor is according to the smallest
             scalar type (or the largest vector size, but we only
             support one vector size per loop).  */
-         scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
-                                                      &dummy);
+         if (!bool_result)
+           scalar_type = vect_get_smallest_scalar_type (stmt, &dummy,
+                                                        &dummy);
          if (dump_enabled_p ())
            {
              dump_printf_loc (MSG_NOTE, vect_location,
@@ -555,6 +587,100 @@ vect_determine_vectorization_factor (loop_vec_info loop_vinfo)
     }
   LOOP_VINFO_VECT_FACTOR (loop_vinfo) = vectorization_factor;
 
+  for (i = 0; i < mask_producers.length (); i++)
+    {
+      tree mask_type = NULL;
+      bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (mask_producers[i]);
+
+      stmt = STMT_VINFO_STMT (mask_producers[i]);
+
+      if (gimple_code (stmt) == GIMPLE_ASSIGN
+         && TREE_CODE_CLASS (gimple_assign_rhs_code (stmt)) == tcc_comparison
+         && TREE_CODE (TREE_TYPE (gimple_assign_rhs1 (stmt))) != BOOLEAN_TYPE)
+       {
+         scalar_type = TREE_TYPE (gimple_assign_rhs1 (stmt));
+         mask_type = get_mask_type_for_scalar_type (scalar_type);
+
+         if (!mask_type)
+           {
+             if (dump_enabled_p ())
+               dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                "not vectorized: unsupported mask\n");
+             return false;
+           }
+       }
+      else
+       {
+         tree rhs, def;
+         ssa_op_iter iter;
+         gimple def_stmt;
+         enum vect_def_type dt;
+
+         FOR_EACH_SSA_TREE_OPERAND (rhs, stmt, iter, SSA_OP_USE)
+           {
+             if (!vect_is_simple_use_1 (rhs, stmt, loop_vinfo, bb_vinfo,
+                                        &def_stmt, &def, &dt, &vectype))
+               {
+                 if (dump_enabled_p ())
+                   {
+                     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                      "not vectorized: can't compute mask type "
+                                      "for statement, ");
+                     dump_gimple_stmt (MSG_MISSED_OPTIMIZATION, TDF_SLIM, stmt,
+                                       0);
+                     dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+                   }
+                 return false;
+               }
+
+             /* No vectype probably means an external definition.
+                Allow it in case there is another operand which
+                lets us determine the mask type.  */
+             if (!vectype)
+               continue;
+
+             if (!mask_type)
+               mask_type = vectype;
+             else if (TYPE_VECTOR_SUBPARTS (mask_type)
+                      != TYPE_VECTOR_SUBPARTS (vectype))
+               {
+                 if (dump_enabled_p ())
+                   {
+                     dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                                      "not vectorized: different sized mask "
+                                      "types in statement, ");
+                     dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+                                        mask_type);
+                     dump_printf (MSG_MISSED_OPTIMIZATION, " and ");
+                     dump_generic_expr (MSG_MISSED_OPTIMIZATION, TDF_SLIM,
+                                        vectype);
+                     dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+                   }
+                 return false;
+               }
+           }
+       }
+
+      /* No mask_type should mean a loop-invariant predicate.
+        This is probably a subject for optimization in
+        if-conversion.  */
+      if (!mask_type)
+       {
+         if (dump_enabled_p ())
+           {
+             dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                              "not vectorized: can't compute mask type "
+                              "for statement, ");
+             dump_gimple_stmt (MSG_MISSED_OPTIMIZATION,  TDF_SLIM, stmt,
+                               0);
+             dump_printf (MSG_MISSED_OPTIMIZATION, "\n");
+           }
+         return false;
+       }
+
+      STMT_VINFO_VECTYPE (mask_producers[i]) = mask_type;
+    }
+
   return true;
 }
 
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index f87c066..f3887be 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -1316,27 +1316,61 @@ vect_init_vector_1 (gimple stmt, gimple new_stmt, gimple_stmt_iterator *gsi)
 tree
 vect_init_vector (gimple stmt, tree val, tree type, gimple_stmt_iterator *gsi)
 {
+  tree val_type = TREE_TYPE (val);
+  machine_mode mode = TYPE_MODE (type);
+  machine_mode val_mode = TYPE_MODE (val_type);
   tree new_var;
   gimple init_stmt;
   tree vec_oprnd;
   tree new_temp;
 
   if (TREE_CODE (type) == VECTOR_TYPE
-      && TREE_CODE (TREE_TYPE (val)) != VECTOR_TYPE)
-    {
-      if (!types_compatible_p (TREE_TYPE (type), TREE_TYPE (val)))
+      && TREE_CODE (val_type) != VECTOR_TYPE)
+    {
+      /* Handle a vector of bool represented as a vector of
+        integers here rather than at expand time because it is
+        the default mask type for targets.  The vector mask is
+        built in the following way:
+
+        tmp = (int) val
+        vec_tmp = {tmp, ..., tmp}
+        vec_cst = VIEW_CONVERT_EXPR<vector(N) _Bool>(vec_tmp);  */
+      if (TREE_CODE (val_type) == BOOLEAN_TYPE
+         && VECTOR_MODE_P (mode)
+         && SCALAR_INT_MODE_P (GET_MODE_INNER (mode))
+         && GET_MODE_INNER (mode) != val_mode)
        {
-         if (CONSTANT_CLASS_P (val))
-           val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
-         else
+         unsigned size = GET_MODE_BITSIZE (GET_MODE_INNER (mode));
+         tree stype = build_nonstandard_integer_type (size, 1);
+         tree vectype = get_vectype_for_scalar_type (stype);
+
+         new_temp = make_ssa_name (stype);
+         init_stmt = gimple_build_assign (new_temp, NOP_EXPR, val);
+         vect_init_vector_1 (stmt, init_stmt, gsi);
+
+         val = make_ssa_name (vectype);
+         new_temp = build_vector_from_val (vectype, new_temp);
+         init_stmt = gimple_build_assign (val, new_temp);
+         vect_init_vector_1 (stmt, init_stmt, gsi);
+
+         val = build1 (VIEW_CONVERT_EXPR, type, val);
+       }
+      else
+       {
+         if (!types_compatible_p (TREE_TYPE (type), val_type))
            {
-             new_temp = make_ssa_name (TREE_TYPE (type));
-             init_stmt = gimple_build_assign (new_temp, NOP_EXPR, val);
-             vect_init_vector_1 (stmt, init_stmt, gsi);
-             val = new_temp;
+             if (CONSTANT_CLASS_P (val))
+               val = fold_unary (VIEW_CONVERT_EXPR, TREE_TYPE (type), val);
+             else
+               {
+                 new_temp = make_ssa_name (TREE_TYPE (type));
+                 init_stmt = gimple_build_assign (new_temp, NOP_EXPR, val);
+                 vect_init_vector_1 (stmt, init_stmt, gsi);
+                 val = new_temp;
+               }
            }
+         val = build_vector_from_val (type, val);
        }
-      val = build_vector_from_val (type, val);
     }
 
   new_var = vect_get_new_vect_var (type, vect_simple_var, "cst_");
@@ -1368,6 +1402,7 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
   gimple def_stmt;
   stmt_vec_info def_stmt_info = NULL;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (stmt);
+  tree stmt_vectype = STMT_VINFO_VECTYPE (stmt_vinfo);
   unsigned int nunits;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   tree def;
@@ -1411,7 +1446,12 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
     /* Case 1: operand is a constant.  */
     case vect_constant_def:
       {
-       vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
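+       /* For a boolean invariant feeding a masked statement, use
+          the statement's mask vector type rather than one derived
+          from the scalar bool type.  */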
+       if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE
+           && VECTOR_MASK_TYPE_P (stmt_vectype))
+         vector_type = stmt_vectype;
+       else
+         vector_type = get_vectype_for_scalar_type (TREE_TYPE (op));
+
        gcc_assert (vector_type);
        nunits = TYPE_VECTOR_SUBPARTS (vector_type);
 
@@ -1429,7 +1469,11 @@ vect_get_vec_def_for_operand (tree op, gimple stmt, tree *scalar_def)
     /* Case 2: operand is defined outside the loop - loop invariant.  */
     case vect_external_def:
       {
-       vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
+       if (TREE_CODE (TREE_TYPE (op)) == BOOLEAN_TYPE
+           && VECTOR_MASK_TYPE_P (stmt_vectype))
+         vector_type = stmt_vectype;
+       else
+         vector_type = get_vectype_for_scalar_type (TREE_TYPE (def));
        gcc_assert (vector_type);
 
        if (scalar_def)
@@ -1758,6 +1802,7 @@ vectorizable_mask_load_store (gimple stmt, gimple_stmt_iterator *gsi,
   bool nested_in_vect_loop = nested_in_vect_loop_p (loop, stmt);
   struct data_reference *dr = STMT_VINFO_DATA_REF (stmt_info);
   tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree mask_vectype;
   tree elem_type;
   gimple new_stmt;
   tree dummy;
@@ -1785,8 +1830,8 @@ vectorizable_mask_load_store (gimple stmt, gimple_stmt_iterator *gsi,
 
   is_store = gimple_call_internal_fn (stmt) == IFN_MASK_STORE;
   mask = gimple_call_arg (stmt, 2);
-  if (TYPE_PRECISION (TREE_TYPE (mask))
-      != GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (vectype))))
+
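+  /* The mask argument is now required to have a boolean type.  */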
+  if (TREE_CODE (TREE_TYPE (mask)) != BOOLEAN_TYPE)
     return false;
 
   /* FORNOW. This restriction should be relaxed.  */
@@ -1815,6 +1860,19 @@ vectorizable_mask_load_store (gimple stmt, gimple_stmt_iterator *gsi,
   if (STMT_VINFO_STRIDED_P (stmt_info))
     return false;
 
+  if (TREE_CODE (mask) != SSA_NAME)
+    return false;
+
+  if (!vect_is_simple_use_1 (mask, stmt, loop_vinfo, NULL,
+                            &def_stmt, &def, &dt, &mask_vectype))
+    return false;
+
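+  /* If the mask def provides no vector type, derive the mask type
+     from the element type of the data vector.  */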
+  if (!mask_vectype)
+    mask_vectype = get_mask_type_for_scalar_type (TREE_TYPE (vectype));
+
+  if (!mask_vectype)
+    return false;
+
   if (STMT_VINFO_GATHER_P (stmt_info))
     {
       gimple def_stmt;
@@ -1848,14 +1906,9 @@ vectorizable_mask_load_store (gimple stmt, gimple_stmt_iterator *gsi,
                                 : DR_STEP (dr), size_zero_node) <= 0)
     return false;
   else if (!VECTOR_MODE_P (TYPE_MODE (vectype))
-          || !can_vec_mask_load_store_p (TYPE_MODE (vectype), !is_store))
-    return false;
-
-  if (TREE_CODE (mask) != SSA_NAME)
-    return false;
-
-  if (!vect_is_simple_use (mask, stmt, loop_vinfo, NULL,
-                          &def_stmt, &def, &dt))
+          || !can_vec_mask_load_store_p (TYPE_MODE (vectype),
+                                         TYPE_MODE (mask_vectype),
+                                         !is_store))
     return false;
 
   if (is_store)
@@ -7229,10 +7282,7 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
           && TREE_CODE (else_clause) != FIXED_CST)
     return false;
 
-  unsigned int prec = GET_MODE_BITSIZE (TYPE_MODE (TREE_TYPE (vectype)));
-  /* The result of a vector comparison should be signed type.  */
-  tree cmp_type = build_nonstandard_integer_type (prec, 0);
-  vec_cmp_type = get_same_sized_vectype (cmp_type, vectype);
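+  /* The result of a vector comparison is now a boolean vector
+     (mask) type matching the width of the compared vectors.  */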
+  vec_cmp_type = build_same_sized_truth_vector_type (comp_vectype);
   if (vec_cmp_type == NULL_TREE)
     return false;
 
@@ -7373,6 +7423,201 @@ vectorizable_condition (gimple stmt, gimple_stmt_iterator *gsi,
   return true;
 }
 
+/* vectorizable_comparison.
+
+   Check if STMT is a comparison expression that can be vectorized.
+   If VEC_STMT is also passed, vectorize the STMT: create a vectorized
+   comparison, put it in VEC_STMT, and insert it at GSI.
+
+   Return FALSE if not a vectorizable STMT, TRUE otherwise.  */
+
+bool
+vectorizable_comparison (gimple stmt, gimple_stmt_iterator *gsi,
+                        gimple *vec_stmt, tree reduc_def,
+                        slp_tree slp_node)
+{
+  tree lhs, rhs1, rhs2;
+  stmt_vec_info stmt_info = vinfo_for_stmt (stmt);
+  tree vectype1, vectype2;
+  tree vectype = STMT_VINFO_VECTYPE (stmt_info);
+  tree vec_rhs1 = NULL_TREE, vec_rhs2 = NULL_TREE;
+  tree vec_compare;
+  tree new_temp;
+  loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_info);
+  tree def;
+  enum vect_def_type dt, dts[4];
+  unsigned nunits;
+  int ncopies;
+  enum tree_code code;
+  stmt_vec_info prev_stmt_info = NULL;
+  int i, j;
+  bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_info);
+  vec<tree> vec_oprnds0 = vNULL;
+  vec<tree> vec_oprnds1 = vNULL;
+  tree mask_type;
+  tree mask;
+
+  if (!VECTOR_MASK_TYPE_P (vectype))
+    return false;
+
+  mask_type = vectype;
+  nunits = TYPE_VECTOR_SUBPARTS (vectype);
+
+  if (slp_node || PURE_SLP_STMT (stmt_info))
+    ncopies = 1;
+  else
+    ncopies = LOOP_VINFO_VECT_FACTOR (loop_vinfo) / nunits;
+
+  gcc_assert (ncopies >= 1);
+  if (!STMT_VINFO_RELEVANT_P (stmt_info) && !bb_vinfo)
+    return false;
+
+  if (STMT_VINFO_DEF_TYPE (stmt_info) != vect_internal_def
+      && !(STMT_VINFO_DEF_TYPE (stmt_info) == vect_nested_cycle
+          && reduc_def))
+    return false;
+
+  if (STMT_VINFO_LIVE_P (stmt_info))
+    {
+      if (dump_enabled_p ())
+       dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+                        "value used after loop.\n");
+      return false;
+    }
+
+  if (!is_gimple_assign (stmt))
+    return false;
+
+  code = gimple_assign_rhs_code (stmt);
+
+  if (TREE_CODE_CLASS (code) != tcc_comparison)
+    return false;
+
+  rhs1 = gimple_assign_rhs1 (stmt);
+  rhs2 = gimple_assign_rhs2 (stmt);
+
+  if (TREE_CODE (rhs1) == SSA_NAME)
+    {
+      gimple rhs1_def_stmt = SSA_NAME_DEF_STMT (rhs1);
+      if (!vect_is_simple_use_1 (rhs1, stmt, loop_vinfo, bb_vinfo,
+                                &rhs1_def_stmt, &def, &dt, &vectype1))
+       return false;
+    }
+  else if (TREE_CODE (rhs1) != INTEGER_CST && TREE_CODE (rhs1) != REAL_CST
+          && TREE_CODE (rhs1) != FIXED_CST)
+    return false;
+
+  if (TREE_CODE (rhs2) == SSA_NAME)
+    {
+      gimple rhs2_def_stmt = SSA_NAME_DEF_STMT (rhs2);
+      if (!vect_is_simple_use_1 (rhs2, stmt, loop_vinfo, bb_vinfo,
+                                &rhs2_def_stmt, &def, &dt, &vectype2))
+       return false;
+    }
+  else if (TREE_CODE (rhs2) != INTEGER_CST && TREE_CODE (rhs2) != REAL_CST
+          && TREE_CODE (rhs2) != FIXED_CST)
+    return false;
+
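+  /* Take the vector type from whichever operand provided one and
+     check that it matches the mask's element count.  */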
+  vectype = vectype1 ? vectype1 : vectype2;
+
+  if (!vectype
+      || nunits != TYPE_VECTOR_SUBPARTS (vectype))
+    return false;
+
+  if (!vec_stmt)
+    {
+      STMT_VINFO_TYPE (stmt_info) = comparison_vec_info_type;
+      return expand_vec_cmp_expr_p (vectype, mask_type);
+    }
+
+  /* Transform.  */
+  if (!slp_node)
+    {
+      vec_oprnds0.create (1);
+      vec_oprnds1.create (1);
+    }
+
+  /* Handle def.  */
+  lhs = gimple_assign_lhs (stmt);
+  mask = vect_create_destination_var (lhs, mask_type);
+
+  /* Handle cmp expr.  */
+  for (j = 0; j < ncopies; j++)
+    {
+      gassign *new_stmt = NULL;
+      if (j == 0)
+       {
+         if (slp_node)
+           {
+             auto_vec<tree, 2> ops;
+             auto_vec<vec<tree>, 2> vec_defs;
+
+             ops.safe_push (rhs1);
+             ops.safe_push (rhs2);
+             vect_get_slp_defs (ops, slp_node, &vec_defs, -1);
+             vec_oprnds1 = vec_defs.pop ();
+             vec_oprnds0 = vec_defs.pop ();
+
+             ops.release ();
+             vec_defs.release ();
+           }
+         else
+           {
+             gimple gtemp;
+             vec_rhs1
+               = vect_get_vec_def_for_operand (rhs1, stmt, NULL);
+             vect_is_simple_use (rhs1, stmt, loop_vinfo, NULL,
+                                 &gtemp, &def, &dts[0]);
+             vec_rhs2
+               = vect_get_vec_def_for_operand (rhs2, stmt, NULL);
+             vect_is_simple_use (rhs2, stmt, loop_vinfo, NULL,
+                                 &gtemp, &def, &dts[1]);
+           }
+       }
+      else
+       {
+         vec_rhs1 = vect_get_vec_def_for_stmt_copy (dts[0],
+                                                    vec_oprnds0.pop ());
+         vec_rhs2 = vect_get_vec_def_for_stmt_copy (dts[1],
+                                                    vec_oprnds1.pop ());
+       }
+
+      if (!slp_node)
+       {
+         vec_oprnds0.quick_push (vec_rhs1);
+         vec_oprnds1.quick_push (vec_rhs2);
+       }
+
+      /* Arguments are ready.  Create the new vector stmt.  */
+      FOR_EACH_VEC_ELT (vec_oprnds0, i, vec_rhs1)
+       {
+         vec_rhs2 = vec_oprnds1[i];
+
+         vec_compare = build2 (code, mask_type, vec_rhs1, vec_rhs2);
+         new_stmt = gimple_build_assign (mask, vec_compare);
+         new_temp = make_ssa_name (mask, new_stmt);
+         gimple_assign_set_lhs (new_stmt, new_temp);
+         vect_finish_stmt_generation (stmt, new_stmt, gsi);
+         if (slp_node)
+           SLP_TREE_VEC_STMTS (slp_node).quick_push (new_stmt);
+       }
+
+      if (slp_node)
+       continue;
+
+      if (j == 0)
+       STMT_VINFO_VEC_STMT (stmt_info) = *vec_stmt = new_stmt;
+      else
+       STMT_VINFO_RELATED_STMT (prev_stmt_info) = new_stmt;
+
+      prev_stmt_info = vinfo_for_stmt (new_stmt);
+    }
+
+  vec_oprnds0.release ();
+  vec_oprnds1.release ();
+
+  return true;
+}
 
 /* Make sure the statement is vectorizable.  */
 
@@ -7576,7 +7821,8 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize, slp_tree node)
          || vectorizable_call (stmt, NULL, NULL, node)
          || vectorizable_store (stmt, NULL, NULL, node)
          || vectorizable_reduction (stmt, NULL, NULL, node)
-         || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node));
+         || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node)
+         || vectorizable_comparison (stmt, NULL, NULL, NULL, node));
   else
     {
       if (bb_vinfo)
@@ -7588,7 +7834,8 @@ vect_analyze_stmt (gimple stmt, bool *need_to_vectorize, slp_tree node)
              || vectorizable_load (stmt, NULL, NULL, node, NULL)
              || vectorizable_call (stmt, NULL, NULL, node)
              || vectorizable_store (stmt, NULL, NULL, node)
-             || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node));
+             || vectorizable_condition (stmt, NULL, NULL, NULL, 0, node)
+             || vectorizable_comparison (stmt, NULL, NULL, NULL, node));
     }
 
   if (!ok)
@@ -7704,6 +7951,11 @@ vect_transform_stmt (gimple stmt, gimple_stmt_iterator *gsi,
       gcc_assert (done);
       break;
 
+    case comparison_vec_info_type:
+      done = vectorizable_comparison (stmt, gsi, &vec_stmt, NULL, slp_node);
+      gcc_assert (done);
+      break;
+
     case call_vec_info_type:
       done = vectorizable_call (stmt, gsi, &vec_stmt, slp_node);
       stmt = gsi_stmt (*gsi);
@@ -8038,6 +8290,23 @@ get_vectype_for_scalar_type (tree scalar_type)
   return vectype;
 }
 
+/* Function get_mask_type_for_scalar_type.
+
+   Returns the mask type corresponding to the result of a comparison
+   of vectors of the specified SCALAR_TYPE, as supported by the target.  */
+
+tree
+get_mask_type_for_scalar_type (tree scalar_type)
+{
+  tree vectype = get_vectype_for_scalar_type (scalar_type);
+
+  if (!vectype)
+    return NULL;
+
+  return build_truth_vector_type (TYPE_VECTOR_SUBPARTS (vectype),
+                                 current_vector_size);
+}
+
 /* Function get_same_sized_vectype
 
    Returns a vector type corresponding to SCALAR_TYPE of size
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 58e8f10..94aea1a 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -28,7 +28,8 @@ along with GCC; see the file COPYING3.  If not see
 enum vect_var_kind {
   vect_simple_var,
   vect_pointer_var,
-  vect_scalar_var
+  vect_scalar_var,
+  vect_mask_var
 };
 
 /* Defines type of operation.  */
@@ -482,6 +483,7 @@ enum stmt_vec_info_type {
   call_simd_clone_vec_info_type,
   assignment_vec_info_type,
   condition_vec_info_type,
+  comparison_vec_info_type,
   reduc_vec_info_type,
   induc_vec_info_type,
   type_promotion_vec_info_type,
@@ -995,6 +997,7 @@ extern bool vect_can_advance_ivs_p (loop_vec_info);
 /* In tree-vect-stmts.c.  */
 extern unsigned int current_vector_size;
 extern tree get_vectype_for_scalar_type (tree);
+extern tree get_mask_type_for_scalar_type (tree);
 extern tree get_same_sized_vectype (tree, tree);
 extern bool vect_is_simple_use (tree, gimple, loop_vec_info,
                                bb_vec_info, gimple *,
diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
index 758ca38..cffacaa 100644
--- a/gcc/tree-vect-patterns.c
+++ b/gcc/tree-vect-patterns.c
@@ -2957,7 +2957,7 @@ check_bool_pattern (tree var, loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
     default:
       if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
        {
-         tree vecitype, comp_vectype;
+         tree vecitype, comp_vectype, mask_type;
 
          /* If the comparison can throw, then is_gimple_condexpr will be
             false and we can't make a COND_EXPR/VEC_COND_EXPR out of it.  */
@@ -2968,6 +2968,11 @@ check_bool_pattern (tree var, loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
          if (comp_vectype == NULL_TREE)
            return false;
 
+         mask_type = get_mask_type_for_scalar_type (TREE_TYPE (rhs1));
+         if (mask_type
+             && expand_vec_cmp_expr_p (comp_vectype, mask_type))
+           return false;
+
          if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE)
            {
              machine_mode mode = TYPE_MODE (TREE_TYPE (rhs1));
@@ -3192,6 +3197,75 @@ adjust_bool_pattern (tree var, tree out_type, tree trueval,
 }
 
 
+/* Try to determine a proper type for converting bool VAR
+   into an integer value.  The type is chosen so that the
+   converted vector has the same number of elements as the
+   mask producer.  */
+
+static tree
+search_type_for_mask (tree var, loop_vec_info loop_vinfo, bb_vec_info bb_vinfo)
+{
+  gimple def_stmt;
+  enum vect_def_type dt;
+  tree def, rhs1;
+  enum tree_code rhs_code;
+  tree res = NULL;
+
+  if (TREE_CODE (var) != SSA_NAME)
+    return NULL;
+
+  if ((TYPE_PRECISION (TREE_TYPE (var)) != 1
+       || !TYPE_UNSIGNED (TREE_TYPE (var)))
+      && TREE_CODE (TREE_TYPE (var)) != BOOLEAN_TYPE)
+    return NULL;
+
+  if (!vect_is_simple_use (var, NULL, loop_vinfo, bb_vinfo, &def_stmt, &def,
+                          &dt))
+    return NULL;
+
+  if (dt != vect_internal_def)
+    return NULL;
+
+  if (!is_gimple_assign (def_stmt))
+    return NULL;
+
+  rhs_code = gimple_assign_rhs_code (def_stmt);
+  rhs1 = gimple_assign_rhs1 (def_stmt);
+
+  switch (rhs_code)
+    {
+    case SSA_NAME:
+    case BIT_NOT_EXPR:
+    CASE_CONVERT:
+      res = search_type_for_mask (rhs1, loop_vinfo, bb_vinfo);
+      break;
+
+    case BIT_AND_EXPR:
+    case BIT_IOR_EXPR:
+    case BIT_XOR_EXPR:
+      if (!(res = search_type_for_mask (rhs1, loop_vinfo, bb_vinfo)))
+       res = search_type_for_mask (gimple_assign_rhs2 (def_stmt),
+                                   loop_vinfo, bb_vinfo);
+      break;
+
+    default:
+      if (TREE_CODE_CLASS (rhs_code) == tcc_comparison)
+       {
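+         /* For comparisons use an unsigned type with the same
+            width as the comparison operands.  */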
+         if (TREE_CODE (TREE_TYPE (rhs1)) != INTEGER_TYPE
+             || !TYPE_UNSIGNED (TREE_TYPE (rhs1)))
+           {
+             machine_mode mode = TYPE_MODE (TREE_TYPE (rhs1));
+             res = build_nonstandard_integer_type (GET_MODE_BITSIZE (mode), 1);
+           }
+         else
+           res = TREE_TYPE (rhs1);
+       }
+    }
+
+  return res;
+}
+
+
 /* Function vect_recog_bool_pattern
 
    Try to find pattern like following:
@@ -3249,6 +3323,7 @@ vect_recog_bool_pattern (vec<gimple> *stmts, tree *type_in,
   enum tree_code rhs_code;
   tree var, lhs, rhs, vectype;
   stmt_vec_info stmt_vinfo = vinfo_for_stmt (last_stmt);
+  stmt_vec_info new_stmt_info;
   loop_vec_info loop_vinfo = STMT_VINFO_LOOP_VINFO (stmt_vinfo);
   bb_vec_info bb_vinfo = STMT_VINFO_BB_VINFO (stmt_vinfo);
   gimple pattern_stmt;
@@ -3274,16 +3349,43 @@ vect_recog_bool_pattern (vec<gimple> *stmts, tree *type_in,
       if (vectype == NULL_TREE)
        return NULL;
 
-      if (!check_bool_pattern (var, loop_vinfo, bb_vinfo))
-       return NULL;
-
-      rhs = adjust_bool_pattern (var, TREE_TYPE (lhs), NULL_TREE, stmts);
-      lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
-      if (useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
-       pattern_stmt = gimple_build_assign (lhs, SSA_NAME, rhs);
+      if (check_bool_pattern (var, loop_vinfo, bb_vinfo))
+       {
+         rhs = adjust_bool_pattern (var, TREE_TYPE (lhs), NULL_TREE, stmts);
+         lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+         if (useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
+           pattern_stmt = gimple_build_assign (lhs, SSA_NAME, rhs);
+         else
+           pattern_stmt
+             = gimple_build_assign (lhs, NOP_EXPR, rhs);
+       }
       else
-       pattern_stmt
-         = gimple_build_assign (lhs, NOP_EXPR, rhs);
+       {
+         tree type = search_type_for_mask (var, loop_vinfo, bb_vinfo);
+         tree cst0, cst1, tmp;
+
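+         /* If no type was found, or its mode already matches the
+            mode of LHS's type, use LHS's type directly; otherwise
+            build the COND_EXPR in TYPE and convert the result.  */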
+         if (!type || TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (lhs)))
+           type = TREE_TYPE (lhs);
+         cst0 = build_int_cst (type, 0);
+         cst1 = build_int_cst (type, 1);
+         tmp = vect_recog_temp_ssa_var (type, NULL);
+         pattern_stmt = gimple_build_assign (tmp, COND_EXPR, var, cst1, cst0);
+
+         if (!useless_type_conversion_p (type, TREE_TYPE (lhs)))
+           {
+             tree new_vectype = get_vectype_for_scalar_type (type);
+             new_stmt_info = new_stmt_vec_info (pattern_stmt, loop_vinfo,
+                                                bb_vinfo);
+             set_vinfo_for_stmt (pattern_stmt, new_stmt_info);
+             STMT_VINFO_VECTYPE (new_stmt_info) = new_vectype;
+             new_pattern_def_seq (stmt_vinfo, pattern_stmt);
+
+             lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
+             pattern_stmt = gimple_build_assign (lhs, CONVERT_EXPR, tmp);
+           }
+       }
+
       *type_out = vectype;
       *type_in = vectype;
       stmts->safe_push (last_stmt);
@@ -3312,10 +3414,11 @@ vect_recog_bool_pattern (vec<gimple> *stmts, tree *type_in,
       if (get_vectype_for_scalar_type (type) == NULL_TREE)
        return NULL;
 
-      if (!check_bool_pattern (var, loop_vinfo, bb_vinfo))
-       return NULL;
+      if (check_bool_pattern (var, loop_vinfo, bb_vinfo))
+       rhs = adjust_bool_pattern (var, type, NULL_TREE, stmts);
+      else
+       rhs = var;
 
-      rhs = adjust_bool_pattern (var, type, NULL_TREE, stmts);
       lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
       pattern_stmt 
          = gimple_build_assign (lhs, COND_EXPR,
@@ -3340,16 +3443,38 @@ vect_recog_bool_pattern (vec<gimple> *stmts, tree *type_in,
       gcc_assert (vectype != NULL_TREE);
       if (!VECTOR_MODE_P (TYPE_MODE (vectype)))
        return NULL;
-      if (!check_bool_pattern (var, loop_vinfo, bb_vinfo))
-       return NULL;
 
-      rhs = adjust_bool_pattern (var, TREE_TYPE (vectype), NULL_TREE, stmts);
+      if (check_bool_pattern (var, loop_vinfo, bb_vinfo))
+       rhs = adjust_bool_pattern (var, TREE_TYPE (vectype),
+                                  NULL_TREE, stmts);
+      else
+       {
+         tree type = search_type_for_mask (var, loop_vinfo, bb_vinfo);
+         tree cst0, cst1, new_vectype;
+
+         if (!type || TYPE_MODE (type) == TYPE_MODE (TREE_TYPE (vectype)))
+           type = TREE_TYPE (vectype);
+
+         cst0 = build_int_cst (type, 0);
+         cst1 = build_int_cst (type, 1);
+         new_vectype = get_vectype_for_scalar_type (type);
+
+         rhs = vect_recog_temp_ssa_var (type, NULL);
+         pattern_stmt = gimple_build_assign (rhs, COND_EXPR, var, cst1, cst0);
+
+         pattern_stmt_info = new_stmt_vec_info (pattern_stmt, loop_vinfo,
+                                                bb_vinfo);
+         set_vinfo_for_stmt (pattern_stmt, pattern_stmt_info);
+         STMT_VINFO_VECTYPE (pattern_stmt_info) = new_vectype;
+         append_pattern_def_seq (stmt_vinfo, pattern_stmt);
+       }
+
       lhs = build1 (VIEW_CONVERT_EXPR, TREE_TYPE (vectype), lhs);
       if (!useless_type_conversion_p (TREE_TYPE (lhs), TREE_TYPE (rhs)))
        {
          tree rhs2 = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
          gimple cast_stmt = gimple_build_assign (rhs2, NOP_EXPR, rhs);
-         new_pattern_def_seq (stmt_vinfo, cast_stmt);
+         append_pattern_def_seq (stmt_vinfo, cast_stmt);
          rhs = rhs2;
        }
       pattern_stmt = gimple_build_assign (lhs, SSA_NAME, rhs);
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 6a17ef4..e22aa57 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -129,6 +129,9 @@ extern bool ix86_expand_fp_vcond (rtx[]);
 extern bool ix86_expand_int_vcond (rtx[]);
 extern void ix86_expand_vec_perm (rtx[]);
 extern bool ix86_expand_vec_perm_const (rtx[]);
+extern bool ix86_expand_mask_vec_cmp (rtx[]);
+extern bool ix86_expand_int_vec_cmp (rtx[]);
+extern bool ix86_expand_fp_vec_cmp (rtx[]);
 extern void ix86_expand_sse_unpack (rtx, rtx, bool, bool);
 extern bool ix86_expand_int_addcc (rtx[]);
 extern rtx ix86_expand_call (rtx, rtx, rtx, rtx, rtx, bool);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 070605f..d17c350 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21440,8 +21440,8 @@ ix86_expand_sse_cmp (rtx dest, enum rtx_code code, rtx cmp_op0, rtx cmp_op1,
     cmp_op1 = force_reg (cmp_ops_mode, cmp_op1);
 
   if (optimize
-      || reg_overlap_mentioned_p (dest, op_true)
-      || reg_overlap_mentioned_p (dest, op_false))
+      || (op_true && reg_overlap_mentioned_p (dest, op_true))
+      || (op_false && reg_overlap_mentioned_p (dest, op_false)))
     dest = gen_reg_rtx (maskcmp ? cmp_mode : mode);
 
   /* Compare patterns for int modes are unspec in AVX512F only.  */
@@ -21713,34 +21713,127 @@ ix86_expand_fp_movcc (rtx operands[])
   return true;
 }
 
-/* Expand a floating-point vector conditional move; a vcond operation
-   rather than a movcc operation.  */
+/* Helper for ix86_cmp_code_to_pcmp_immediate for int modes.  */
+
+static int
+ix86_int_cmp_code_to_pcmp_immediate (enum rtx_code code)
+{
+  switch (code)
+    {
+    case EQ:
+      return 0;
+    case LT:
+    case LTU:
+      return 1;
+    case LE:
+    case LEU:
+      return 2;
+    case NE:
+      return 4;
+    case GE:
+    case GEU:
+      return 5;
+    case GT:
+    case GTU:
+      return 6;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Helper for ix86_cmp_code_to_pcmp_immediate for fp modes.  */
+
+static int
+ix86_fp_cmp_code_to_pcmp_immediate (enum rtx_code code)
+{
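+  /* The returned values are immediates for the vcmpps/vcmppd
+     predicate operand.  */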
+  switch (code)
+    {
+    case EQ:
+      return 0x08;
+    case NE:
+      return 0x04;
+    case GT:
+      return 0x16;
+    case LE:
+      return 0x1a;
+    case GE:
+      return 0x15;
+    case LT:
+      return 0x19;
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Return immediate value to be used in UNSPEC_PCMP
+   for comparison CODE in MODE.  */
+
+static int
+ix86_cmp_code_to_pcmp_immediate (enum rtx_code code, machine_mode mode)
+{
+  if (FLOAT_MODE_P (mode))
+    return ix86_fp_cmp_code_to_pcmp_immediate (code);
+  return ix86_int_cmp_code_to_pcmp_immediate (code);
+}
+
+/* Expand an AVX-512 vector comparison.  */
 
 bool
-ix86_expand_fp_vcond (rtx operands[])
+ix86_expand_mask_vec_cmp (rtx operands[])
 {
-  enum rtx_code code = GET_CODE (operands[3]);
+  machine_mode mask_mode = GET_MODE (operands[0]);
+  machine_mode cmp_mode = GET_MODE (operands[2]);
+  enum rtx_code code = GET_CODE (operands[1]);
+  rtx imm = GEN_INT (ix86_cmp_code_to_pcmp_immediate (code, cmp_mode));
+  int unspec_code;
+  rtx unspec;
+
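+  /* Unsigned integer comparisons need the unsigned variant of the
+     PCMP unspec.  */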
+  switch (code)
+    {
+    case LEU:
+    case GTU:
+    case GEU:
+    case LTU:
+      unspec_code = UNSPEC_UNSIGNED_PCMP;
+      break;
+
+    default:
+      unspec_code = UNSPEC_PCMP;
+    }
+
+  unspec = gen_rtx_UNSPEC (mask_mode, gen_rtvec (3, operands[2],
+                                                operands[3], imm),
+                          unspec_code);
+  emit_insn (gen_rtx_SET (operands[0], unspec));
+
+  return true;
+}
+
+/* Expand a floating-point vector comparison.  */
+
+bool
+ix86_expand_fp_vec_cmp (rtx operands[])
+{
+  enum rtx_code code = GET_CODE (operands[1]);
   rtx cmp;
 
   code = ix86_prepare_sse_fp_compare_args (operands[0], code,
-                                          &operands[4], &operands[5]);
+                                          &operands[2], &operands[3]);
   if (code == UNKNOWN)
     {
       rtx temp;
-      switch (GET_CODE (operands[3]))
+      switch (GET_CODE (operands[1]))
        {
        case LTGT:
-         temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[4],
-                                     operands[5], operands[0], operands[0]);
-         cmp = ix86_expand_sse_cmp (operands[0], NE, operands[4],
-                                    operands[5], operands[1], operands[2]);
+         temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[2],
+                                     operands[3], NULL, NULL);
+         cmp = ix86_expand_sse_cmp (operands[0], NE, operands[2],
+                                    operands[3], NULL, NULL);
          code = AND;
          break;
        case UNEQ:
-         temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[4],
-                                     operands[5], operands[0], operands[0]);
-         cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[4],
-                                    operands[5], operands[1], operands[2]);
+         temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[2],
+                                     operands[3], NULL, NULL);
+         cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[2],
+                                    operands[3], NULL, NULL);
          code = IOR;
          break;
        default:
@@ -21748,72 +21841,26 @@ ix86_expand_fp_vcond (rtx operands[])
        }
       cmp = expand_simple_binop (GET_MODE (cmp), code, temp, cmp, cmp, 1,
                                 OPTAB_DIRECT);
-      ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
-      return true;
     }
+  else
+    cmp = ix86_expand_sse_cmp (operands[0], code, operands[2], operands[3],
+                               NULL, NULL);
 
-  if (ix86_expand_sse_fp_minmax (operands[0], code, operands[4],
-                                operands[5], operands[1], operands[2]))
-    return true;
+  if (operands[0] != cmp)
+    emit_move_insn (operands[0], cmp);
 
-  cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
-                            operands[1], operands[2]);
-  ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
   return true;
 }
 
-/* Expand a signed/unsigned integral vector conditional move.  */
-
-bool
-ix86_expand_int_vcond (rtx operands[])
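+/* Helper to expand an integer vector comparison.  Returns the
+   comparison rtx, or NULL_RTX if the comparison isn't supported.
+   *NEGATE is set when the produced result must be inverted.  */
+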
+static rtx
+ix86_expand_int_sse_cmp (rtx dest, enum rtx_code code, rtx cop0, rtx cop1,
+                        rtx op_true, rtx op_false, bool *negate)
 {
-  machine_mode data_mode = GET_MODE (operands[0]);
-  machine_mode mode = GET_MODE (operands[4]);
-  enum rtx_code code = GET_CODE (operands[3]);
-  bool negate = false;
-  rtx x, cop0, cop1;
-
-  cop0 = operands[4];
-  cop1 = operands[5];
+  machine_mode data_mode = GET_MODE (dest);
+  machine_mode mode = GET_MODE (cop0);
+  rtx x;
 
-  /* Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
-     and x < 0 ? 1 : 0 into (unsigned) x >> 31.  */
-  if ((code == LT || code == GE)
-      && data_mode == mode
-      && cop1 == CONST0_RTX (mode)
-      && operands[1 + (code == LT)] == CONST0_RTX (data_mode)
-      && GET_MODE_UNIT_SIZE (data_mode) > 1
-      && GET_MODE_UNIT_SIZE (data_mode) <= 8
-      && (GET_MODE_SIZE (data_mode) == 16
-         || (TARGET_AVX2 && GET_MODE_SIZE (data_mode) == 32)))
-    {
-      rtx negop = operands[2 - (code == LT)];
-      int shift = GET_MODE_UNIT_BITSIZE (data_mode) - 1;
-      if (negop == CONST1_RTX (data_mode))
-       {
-         rtx res = expand_simple_binop (mode, LSHIFTRT, cop0, GEN_INT (shift),
-                                        operands[0], 1, OPTAB_DIRECT);
-         if (res != operands[0])
-           emit_move_insn (operands[0], res);
-         return true;
-       }
-      else if (GET_MODE_INNER (data_mode) != DImode
-              && vector_all_ones_operand (negop, data_mode))
-       {
-         rtx res = expand_simple_binop (mode, ASHIFTRT, cop0, GEN_INT (shift),
-                                        operands[0], 0, OPTAB_DIRECT);
-         if (res != operands[0])
-           emit_move_insn (operands[0], res);
-         return true;
-       }
-    }
-
-  if (!nonimmediate_operand (cop1, mode))
-    cop1 = force_reg (mode, cop1);
-  if (!general_operand (operands[1], data_mode))
-    operands[1] = force_reg (data_mode, operands[1]);
-  if (!general_operand (operands[2], data_mode))
-    operands[2] = force_reg (data_mode, operands[2]);
+  *negate = false;
 
   /* XOP supports all of the comparisons on all 128-bit vector int types.  */
   if (TARGET_XOP
@@ -21834,13 +21881,13 @@ ix86_expand_int_vcond (rtx operands[])
        case LE:
        case LEU:
          code = reverse_condition (code);
-         negate = true;
+         *negate = true;
          break;
 
        case GE:
        case GEU:
          code = reverse_condition (code);
-         negate = true;
+         *negate = true;
          /* FALLTHRU */
 
        case LT:
@@ -21861,14 +21908,14 @@ ix86_expand_int_vcond (rtx operands[])
            case EQ:
              /* SSE4.1 supports EQ.  */
              if (!TARGET_SSE4_1)
-               return false;
+               return NULL;
              break;
 
            case GT:
            case GTU:
              /* SSE4.2 supports GT/GTU.  */
              if (!TARGET_SSE4_2)
-               return false;
+               return NULL;
              break;
 
            default:
@@ -21929,12 +21976,13 @@ ix86_expand_int_vcond (rtx operands[])
            case V8HImode:
              /* Perform a parallel unsigned saturating subtraction.  */
              x = gen_reg_rtx (mode);
-             emit_insn (gen_rtx_SET (x, gen_rtx_US_MINUS (mode, cop0, cop1)));
+             emit_insn (gen_rtx_SET (x, gen_rtx_US_MINUS (mode, cop0,
+                                                          cop1)));
 
              cop0 = x;
              cop1 = CONST0_RTX (mode);
              code = EQ;
-             negate = !negate;
+             *negate = !*negate;
              break;
 
            default:
@@ -21943,22 +21991,162 @@ ix86_expand_int_vcond (rtx operands[])
        }
     }
 
+  if (*negate)
+    std::swap (op_true, op_false);
+
   /* Allow the comparison to be done in one mode, but the movcc to
      happen in another mode.  */
   if (data_mode == mode)
     {
-      x = ix86_expand_sse_cmp (operands[0], code, cop0, cop1,
-                              operands[1+negate], operands[2-negate]);
+      x = ix86_expand_sse_cmp (dest, code, cop0, cop1,
+                              op_true, op_false);
     }
   else
     {
       gcc_assert (GET_MODE_SIZE (data_mode) == GET_MODE_SIZE (mode));
       x = ix86_expand_sse_cmp (gen_reg_rtx (mode), code, cop0, cop1,
-                              operands[1+negate], operands[2-negate]);
+                              op_true, op_false);
       if (GET_MODE (x) == mode)
        x = gen_lowpart (data_mode, x);
     }
 
+  return x;
+}
+
+/* Expand an integer vector comparison.  */
+
+bool
+ix86_expand_int_vec_cmp (rtx operands[])
+{
+  rtx_code code = GET_CODE (operands[1]);
+  bool negate = false;
+  rtx cmp = ix86_expand_int_sse_cmp (operands[0], code, operands[2],
+                                    operands[3], NULL, NULL, &negate);
+
+  if (!cmp)
+    return false;
+
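+  /* An inverted result is computed as CMP == 0.  EQ is always
+     supported, so the second expansion cannot ask to negate.  */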
+  if (negate)
+    cmp = ix86_expand_int_sse_cmp (operands[0], EQ, cmp,
+                                  CONST0_RTX (GET_MODE (cmp)),
+                                  NULL, NULL, &negate);
+
+  gcc_assert (!negate);
+
+  if (operands[0] != cmp)
+    emit_move_insn (operands[0], cmp);
+
+  return true;
+}
+
+/* Expand a floating-point vector conditional move; a vcond operation
+   rather than a movcc operation.  */
+
+bool
+ix86_expand_fp_vcond (rtx operands[])
+{
+  enum rtx_code code = GET_CODE (operands[3]);
+  rtx cmp;
+
+  code = ix86_prepare_sse_fp_compare_args (operands[0], code,
+                                          &operands[4], &operands[5]);
+  if (code == UNKNOWN)
+    {
+      rtx temp;
+      switch (GET_CODE (operands[3]))
+       {
+       case LTGT:
+         temp = ix86_expand_sse_cmp (operands[0], ORDERED, operands[4],
+                                     operands[5], operands[0], operands[0]);
+         cmp = ix86_expand_sse_cmp (operands[0], NE, operands[4],
+                                    operands[5], operands[1], operands[2]);
+         code = AND;
+         break;
+       case UNEQ:
+         temp = ix86_expand_sse_cmp (operands[0], UNORDERED, operands[4],
+                                     operands[5], operands[0], operands[0]);
+         cmp = ix86_expand_sse_cmp (operands[0], EQ, operands[4],
+                                    operands[5], operands[1], operands[2]);
+         code = IOR;
+         break;
+       default:
+         gcc_unreachable ();
+       }
+      cmp = expand_simple_binop (GET_MODE (cmp), code, temp, cmp, cmp, 1,
+                                OPTAB_DIRECT);
+      ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
+      return true;
+    }
+
+  if (ix86_expand_sse_fp_minmax (operands[0], code, operands[4],
+                                operands[5], operands[1], operands[2]))
+    return true;
+
+  cmp = ix86_expand_sse_cmp (operands[0], code, operands[4], operands[5],
+                            operands[1], operands[2]);
+  ix86_expand_sse_movcc (operands[0], cmp, operands[1], operands[2]);
+  return true;
+}
+
+/* Expand a signed/unsigned integral vector conditional move.  */
+
+bool
+ix86_expand_int_vcond (rtx operands[])
+{
+  machine_mode data_mode = GET_MODE (operands[0]);
+  machine_mode mode = GET_MODE (operands[4]);
+  enum rtx_code code = GET_CODE (operands[3]);
+  bool negate = false;
+  rtx x, cop0, cop1;
+
+  cop0 = operands[4];
+  cop1 = operands[5];
+
+  /* Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31
+     and x < 0 ? 1 : 0 into (unsigned) x >> 31.  */
+  if ((code == LT || code == GE)
+      && data_mode == mode
+      && cop1 == CONST0_RTX (mode)
+      && operands[1 + (code == LT)] == CONST0_RTX (data_mode)
+      && GET_MODE_UNIT_SIZE (data_mode) > 1
+      && GET_MODE_UNIT_SIZE (data_mode) <= 8
+      && (GET_MODE_SIZE (data_mode) == 16
+         || (TARGET_AVX2 && GET_MODE_SIZE (data_mode) == 32)))
+    {
+      rtx negop = operands[2 - (code == LT)];
+      int shift = GET_MODE_UNIT_BITSIZE (data_mode) - 1;
+      if (negop == CONST1_RTX (data_mode))
+       {
+         rtx res = expand_simple_binop (mode, LSHIFTRT, cop0, GEN_INT (shift),
+                                        operands[0], 1, OPTAB_DIRECT);
+         if (res != operands[0])
+           emit_move_insn (operands[0], res);
+         return true;
+       }
+      else if (GET_MODE_INNER (data_mode) != DImode
+              && vector_all_ones_operand (negop, data_mode))
+       {
+         rtx res = expand_simple_binop (mode, ASHIFTRT, cop0, GEN_INT (shift),
+                                        operands[0], 0, OPTAB_DIRECT);
+         if (res != operands[0])
+           emit_move_insn (operands[0], res);
+         return true;
+       }
+    }
+
+  if (!nonimmediate_operand (cop1, mode))
+    cop1 = force_reg (mode, cop1);
+  if (!general_operand (operands[1], data_mode))
+    operands[1] = force_reg (data_mode, operands[1]);
+  if (!general_operand (operands[2], data_mode))
+    operands[2] = force_reg (data_mode, operands[2]);
+
+  x = ix86_expand_int_sse_cmp (operands[0], code, cop0, cop1,
+                              operands[1], operands[2], &negate);
+
+  if (!x)
+    return false;
+
   ix86_expand_sse_movcc (operands[0], x, operands[1+negate],
                         operands[2-negate]);
   return true;
@@ -51678,6 +51866,30 @@ ix86_autovectorize_vector_sizes (void)
     (TARGET_AVX && !TARGET_PREFER_AVX128) ? 32 | 16 : 0;
 }
 
+/* Implementation of targetm.vectorize.get_mask_mode.  */
+
+static machine_mode
+ix86_get_mask_mode (unsigned nunits, unsigned vector_size)
+{
+  /* Scalar mask case.  */
+  if (TARGET_AVX512F && vector_size == 64)
+    {
+      unsigned elem_size = vector_size / nunits;
+      if ((vector_size == 64 || TARGET_AVX512VL)
+         && ((elem_size == 4 || elem_size == 8)
+             || TARGET_AVX512BW))
+       return smallest_mode_for_size (nunits, MODE_INT);
+    }
+
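+  /* Otherwise fall back to an integer vector mode of the same
+     size as the data vector, one integer element per lane.  */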
+  unsigned elem_size = vector_size / nunits;
+  machine_mode elem_mode
+    = smallest_mode_for_size (elem_size * BITS_PER_UNIT, MODE_INT);
+
+  gcc_assert (elem_size * nunits == vector_size);
+
+  return mode_for_vector (elem_mode, nunits);
+}
+
 
 
 /* Return class of registers which could be used for pseudo of MODE
@@ -52612,6 +52824,8 @@ ix86_operands_ok_for_move_multiple (rtx *operands, bool load,
 #undef TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES
 #define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_SIZES \
   ix86_autovectorize_vector_sizes
+#undef TARGET_VECTORIZE_GET_MASK_MODE
+#define TARGET_VECTORIZE_GET_MASK_MODE ix86_get_mask_mode
 #undef TARGET_VECTORIZE_INIT_COST
 #define TARGET_VECTORIZE_INIT_COST ix86_init_cost
 #undef TARGET_VECTORIZE_ADD_STMT_COST
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 4535570..a8d55cc 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -605,6 +605,15 @@
    (V16SF "HI") (V8SF  "QI") (V4SF  "QI")
    (V8DF  "QI") (V4DF  "QI") (V2DF  "QI")])
 
+;; Mapping of vector modes to the corresponding mask mode, in lower case
+(define_mode_attr avx512fmaskmodelower
+  [(V64QI "di") (V32QI "si") (V16QI "hi")
+   (V32HI "si") (V16HI "hi") (V8HI  "qi") (V4HI "qi")
+   (V16SI "hi") (V8SI  "qi") (V4SI  "qi")
+   (V8DI  "qi") (V4DI  "qi") (V2DI  "qi")
+   (V16SF "hi") (V8SF  "qi") (V4SF  "qi")
+   (V8DF  "qi") (V4DF  "qi") (V2DF  "qi")])
+
 ;; Mapping of vector float modes to an integer mode of the same size
 (define_mode_attr sseintvecmode
   [(V16SF "V16SI") (V8DF  "V8DI")
@@ -2803,6 +2812,150 @@
                      (const_string "0")))
    (set_attr "mode" "<MODE>")])
 
+(define_expand "vec_cmp<mode><avx512fmaskmodelower>"
+  [(set (match_operand:<avx512fmaskmode> 0 "register_operand")
+       (match_operator:<avx512fmaskmode> 1 ""
+         [(match_operand:V48_AVX512VL 2 "register_operand")
+          (match_operand:V48_AVX512VL 3 "nonimmediate_operand")]))]
+  "TARGET_AVX512F"
+{
+  bool ok = ix86_expand_mask_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><avx512fmaskmodelower>"
+  [(set (match_operand:<avx512fmaskmode> 0 "register_operand")
+       (match_operator:<avx512fmaskmode> 1 ""
+         [(match_operand:VI12_AVX512VL 2 "register_operand")
+          (match_operand:VI12_AVX512VL 3 "nonimmediate_operand")]))]
+  "TARGET_AVX512BW"
+{
+  bool ok = ix86_expand_mask_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VI_256 2 "register_operand")
+          (match_operand:VI_256 3 "nonimmediate_operand")]))]
+  "TARGET_AVX2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VI124_128 2 "register_operand")
+          (match_operand:VI124_128 3 "nonimmediate_operand")]))]
+  "TARGET_SSE2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpv2div2di"
+  [(set (match_operand:V2DI 0 "register_operand")
+       (match_operator:V2DI 1 ""
+         [(match_operand:V2DI 2 "register_operand")
+          (match_operand:V2DI 3 "nonimmediate_operand")]))]
+  "TARGET_SSE4_2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VF_256 2 "register_operand")
+          (match_operand:VF_256 3 "nonimmediate_operand")]))]
+  "TARGET_AVX"
+{
+  bool ok = ix86_expand_fp_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmp<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VF_128 2 "register_operand")
+          (match_operand:VF_128 3 "nonimmediate_operand")]))]
+  "TARGET_SSE"
+{
+  bool ok = ix86_expand_fp_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<mode><avx512fmaskmodelower>"
+  [(set (match_operand:<avx512fmaskmode> 0 "register_operand")
+       (match_operator:<avx512fmaskmode> 1 ""
+         [(match_operand:VI48_AVX512VL 2 "register_operand")
+          (match_operand:VI48_AVX512VL 3 "nonimmediate_operand")]))]
+  "TARGET_AVX512F"
+{
+  bool ok = ix86_expand_mask_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<mode><avx512fmaskmodelower>"
+  [(set (match_operand:<avx512fmaskmode> 0 "register_operand")
+       (match_operator:<avx512fmaskmode> 1 ""
+         [(match_operand:VI12_AVX512VL 2 "register_operand")
+          (match_operand:VI12_AVX512VL 3 "nonimmediate_operand")]))]
+  "TARGET_AVX512BW"
+{
+  bool ok = ix86_expand_mask_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VI_256 2 "register_operand")
+          (match_operand:VI_256 3 "nonimmediate_operand")]))]
+  "TARGET_AVX2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpu<mode><sseintvecmodelower>"
+  [(set (match_operand:<sseintvecmode> 0 "register_operand")
+       (match_operator:<sseintvecmode> 1 ""
+         [(match_operand:VI124_128 2 "register_operand")
+          (match_operand:VI124_128 3 "nonimmediate_operand")]))]
+  "TARGET_SSE2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
+(define_expand "vec_cmpuv2div2di"
+  [(set (match_operand:V2DI 0 "register_operand")
+       (match_operator:V2DI 1 ""
+         [(match_operand:V2DI 2 "register_operand")
+          (match_operand:V2DI 3 "nonimmediate_operand")]))]
+  "TARGET_SSE4_2"
+{
+  bool ok = ix86_expand_int_vec_cmp (operands);
+  gcc_assert (ok);
+  DONE;
+})
+
 (define_expand "vcond<V_512:mode><VF_512:mode>"
   [(set (match_operand:V_512 0 "register_operand")
        (if_then_else:V_512
@@ -17895,7 +18048,7 @@
    (set_attr "btver2_decode" "vector") 
    (set_attr "mode" "<sseinsnmode>")])
 
-(define_expand "maskload<mode>"
+(define_expand "maskload<mode><sseintvecmodelower>"
   [(set (match_operand:V48_AVX2 0 "register_operand")
        (unspec:V48_AVX2
          [(match_operand:<sseintvecmode> 2 "register_operand")
@@ -17903,7 +18056,23 @@
          UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
-(define_expand "maskstore<mode>"
+(define_expand "maskload<mode><avx512fmaskmodelower>"
+  [(set (match_operand:V48_AVX512VL 0 "register_operand")
+       (vec_merge:V48_AVX512VL
+         (match_operand:V48_AVX512VL 1 "memory_operand")
+         (match_dup 0)
+         (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "maskload<mode><avx512fmaskmodelower>"
+  [(set (match_operand:VI12_AVX512VL 0 "register_operand")
+       (vec_merge:VI12_AVX512VL
+         (match_operand:VI12_AVX512VL 1 "memory_operand")
+         (match_dup 0)
+         (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  "TARGET_AVX512BW")
+
+(define_expand "maskstore<mode><sseintvecmodelower>"
   [(set (match_operand:V48_AVX2 0 "memory_operand")
        (unspec:V48_AVX2
          [(match_operand:<sseintvecmode> 2 "register_operand")
@@ -17912,6 +18081,22 @@
          UNSPEC_MASKMOV))]
   "TARGET_AVX")
 
+(define_expand "maskstore<mode><avx512fmaskmodelower>"
+  [(set (match_operand:V48_AVX512VL 0 "memory_operand")
+       (vec_merge:V48_AVX512VL
+         (match_operand:V48_AVX512VL 1 "register_operand")
+         (match_dup 0)
+         (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  "TARGET_AVX512F")
+
+(define_expand "maskstore<mode><avx512fmaskmodelower>"
+  [(set (match_operand:VI12_AVX512VL 0 "memory_operand")
+       (vec_merge:VI12_AVX512VL
+         (match_operand:VI12_AVX512VL 1 "register_operand")
+         (match_dup 0)
+         (match_operand:<avx512fmaskmode> 2 "register_operand")))]
+  "TARGET_AVX512BW")
+
 (define_insn_and_split "avx_<castmode><avxsizesuffix>_<castmode>"
   [(set (match_operand:AVX256MODE2P 0 "nonimmediate_operand" "=x,m")
        (unspec:AVX256MODE2P
