For loop masking we need to mask a mask AND operation with the loop mask. The following makes sure we have a corresponding mask available. There's no good way to distinguish loop masking from len masking here, so assume we have recorded a mask for the operands mask producers.
Bootstrap and regtest running on x86_64-unknown-linux-gnu. PR tree-optimization/121059 * tree-vect-stmts.cc (vectorizable_operation): Record a loop mask for mask AND operations. * gcc.dg/vect/pr121059.c: New testcase. --- gcc/testsuite/gcc.dg/vect/pr121059.c | 24 ++++++++++++++++++++++++ gcc/tree-vect-stmts.cc | 10 ++++++++++ 2 files changed, 34 insertions(+) create mode 100644 gcc/testsuite/gcc.dg/vect/pr121059.c diff --git a/gcc/testsuite/gcc.dg/vect/pr121059.c b/gcc/testsuite/gcc.dg/vect/pr121059.c new file mode 100644 index 00000000000..d7f69b4f1f5 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/pr121059.c @@ -0,0 +1,24 @@ +/* { dg-do compile } */ +/* { dg-additional-options "-O3 --param vect-partial-vector-usage=1" } */ +/* { dg-additional-options "-march=x86-64-v4" { target avx512f } } */ + +typedef struct { + long left, right, top, bottom; +} MngBox; +typedef struct { + MngBox object_clip[6]; + char exists[256], frozen[]; +} MngReadInfo; +MngReadInfo mng_info; + +long ReadMNGImage_i; + +void ReadMNGImage(int ReadMNGImage_i) +{ + for (; ReadMNGImage_i < 256; ReadMNGImage_i++) + if (mng_info.exists[ReadMNGImage_i] && mng_info.frozen[ReadMNGImage_i]) + mng_info.object_clip[ReadMNGImage_i].left = + mng_info.object_clip[ReadMNGImage_i].right = + mng_info.object_clip[ReadMNGImage_i].top = + mng_info.object_clip[ReadMNGImage_i].bottom = 0; +} diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc index 4aa69da2218..f0dc4843ca7 100644 --- a/gcc/tree-vect-stmts.cc +++ b/gcc/tree-vect-stmts.cc @@ -6978,6 +6978,16 @@ vectorizable_operation (vec_info *vinfo, LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) = false; } } + else if (loop_vinfo + && LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P (loop_vinfo) + && code == BIT_AND_EXPR + && VECTOR_BOOLEAN_TYPE_P (vectype) + /* We cannot always record a mask since that will disable + len-based partial vectors, but there should be already + one mask producer stmt which should require loop + masking. */ + && !masks->is_empty ()) + vect_record_loop_mask (loop_vinfo, masks, vec_num, vectype, NULL); /* Put types on constant and invariant SLP children. */ if (!vect_maybe_update_slp_op_vectype (slp_op0, vectype) -- 2.43.0