On Wed, Jan 7, 2026 at 5:43 AM Richard Biener <[email protected]> wrote:
>
> We fold (v >> CST) == { 0, 0.. } into v < { 0, 0.. } but fail to
> validate that's valid for the target. The following adds such a check,
> making sure to apply after IPA (due to offloading) and only when
> the original form wasn't valid for the target (like before vector
> lowering) or when the new form is. In particular in this case
> we have an equality compare resulting in a non-vector which we
> can handle, but a similar LT/GT is never handled.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, will push unless
> Andrew has some comments.
Looks good to me. Note that Pengxuan is working on moving this
transformation to match.pd. He is close to posting it, but I think your
patch should go in anyway. I pointed him to this patch so he can base
his expand_vec_cmp_expr_p checks on it.
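
For reference, here is a minimal scalar sketch of the identity the fold
relies on. It assumes 16-bit lanes as in the testcase and an arithmetic
right shift of negative values (implementation-defined in ISO C; GCC
sign-extends). The helper names are hypothetical and not part of the
patch:

```c
#include <stdint.h>

/* Hypothetical helpers, one lane at a time.  The shift count 15 is the
   lane width minus one, the only shift count the fold applies to.  */
int shift_ne_zero (int16_t x) { return (x >> 15) != 0; }
int shift_eq_zero (int16_t x) { return (x >> 15) == 0; }

/* Count the lane values for which the folded compare disagrees with the
   shifted one: NE should match x < 0, EQ should match x >= 0.  */
int
count_fold_mismatches (void)
{
  int mismatches = 0;
  for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
    {
      int16_t x = (int16_t) v;
      if (shift_ne_zero (x) != (x < 0))
        mismatches++;
      if (shift_eq_zero (x) != (x >= 0))
        mismatches++;
    }
  return mismatches;
}
```

With an arithmetic shift, count_fold_mismatches () returns 0. This only
shows that the two forms compute the same value. Whether the target can
expand the vector GE/LT compare is the separate question the patch
guards with expand_vec_cmp_expr_p.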
Thanks,
Andrew
>
> Richard.
>
> PR middle-end/123107
> * fold-const.cc (fold_binary_loc): Guard (v >> CST) == { 0, 0.. }
> to v < { 0, 0.. } folding.
>
> * gcc.dg/torture/pr123107.c: New testcase.
> ---
> gcc/fold-const.cc | 17 +++++++++++++++--
> gcc/testsuite/gcc.dg/torture/pr123107.c | 18 ++++++++++++++++++
> 2 files changed, 33 insertions(+), 2 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/torture/pr123107.c
>
> diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> index ec6757cdf05..52c92ad66b5 100644
> --- a/gcc/fold-const.cc
> +++ b/gcc/fold-const.cc
> @@ -86,6 +86,7 @@ along with GCC; see the file COPYING3. If not see
> #include "vec-perm-indices.h"
> #include "asan.h"
> #include "gimple-range.h"
> +#include "optabs-tree.h"
>
> /* Nonzero if we are folding constants inside an initializer or a C++
> manifestly-constant-evaluated context; zero otherwise.
> @@ -12409,8 +12410,20 @@ fold_binary_loc (location_t loc, enum tree_code code, tree type,
> itype = signed_type_for (itype);
> arg00 = fold_convert_loc (loc, itype, arg00);
> }
> - return fold_build2_loc (loc, code == EQ_EXPR ? GE_EXPR : LT_EXPR,
> - type, arg00, build_zero_cst (itype));
> + enum tree_code code2 = code == EQ_EXPR ? GE_EXPR : LT_EXPR;
> + /* Make sure to transform vector compares only to supported
> + ones or from unsupported ones and check that only after
> + IPA so offloaded code is handled correctly in this regard. */
> + if (!VECTOR_TYPE_P (itype)
> + || (cfun
> + && cfun->after_inlining
> + /* We can jump on EQ/NE but not GE/LT. */
> + && VECTOR_BOOLEAN_TYPE_P (type)
> + && (expand_vec_cmp_expr_p (itype, type, code2)
> + || !expand_vec_cmp_expr_p (TREE_TYPE (op0),
> + type, code))))
> + return fold_build2_loc (loc, code2,
> + type, arg00, build_zero_cst (itype));
> }
> }
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr123107.c b/gcc/testsuite/gcc.dg/torture/pr123107.c
> new file mode 100644
> index 00000000000..b4982fa4ab4
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr123107.c
> @@ -0,0 +1,18 @@
> +/* { dg-do compile } */
> +/* { dg-additional-options "-Wno-psabi" } */
> +/* { dg-additional-options "-mavx2" { target { x86_64-*-* i?86-*-* } } } */
> +
> +#define BS_VEC(type, num) type __attribute__((vector_size(num * sizeof(type))))
> +
> +int f( BS_VEC(short, 16)
> + BS_TEMP_206)
> +{
> + BS_TEMP_206 = BS_TEMP_206 < 0;
> + if (BS_TEMP_206[0] | BS_TEMP_206[1] | BS_TEMP_206[2] | BS_TEMP_206[3]
> + | BS_TEMP_206[4] | BS_TEMP_206[5] | BS_TEMP_206[6] | BS_TEMP_206[7]
> + | BS_TEMP_206[8] | BS_TEMP_206[9] | BS_TEMP_206[10]
> + | BS_TEMP_206[11] | BS_TEMP_206[12] | BS_TEMP_206[13]
> + | BS_TEMP_206[14] | BS_TEMP_206[15])
> + return 1;
> + return 0;
> +}
> --
> 2.51.0