On 1/9/2026 12:26 PM, Pengxuan Zheng wrote:
Implement (X >> C) NE/EQ 0 -> X LT/GE 0 in match.pd instead of fold-const.cc.

Bootstrapped and tested on x86_64 and aarch64.

	PR tree-optimization/123109

gcc/ChangeLog:

	* fold-const.cc (fold_binary_loc): Remove (X >> C) NE/EQ 0 -> X LT/GE 0
	folding.
	* match.pd (`(X >> C) NE/EQ 0 -> X LT/GE 0`): New pattern.

gcc/testsuite/ChangeLog:

	* gcc.dg/tree-ssa/vrp99.c: Update test.
	* gcc.dg/pr123109.c: New test.

This was submitted prior to stage3->stage4 transition, so if we can get
resolution on implementation details, it can still go in now.
+/* Fold (X >> C) != 0 into X < 0 if C is one less than the width
+ of X. Similarly fold (X >> C) == 0 into X >= 0. */
+(for neeq (ne eq)
+ ltge (lt ge)
+ (simplify
+ (neeq
+ (rshift@2 @0 INTEGER_CST@1)
+ integer_zerop)
+ (with { tree itype = signed_type_for (TREE_TYPE (@0)); }
+ /* Make sure to transform vector compares only to supported
+ ones or from unsupported ones and check that only after
+ IPA so offloaded code is handled correctly in this regard. */
+ (if (wi::to_wide (@1) == element_precision (itype) - 1
+ && (!VECTOR_TYPE_P (itype)
+ || (cfun
+ && cfun->after_inlining
+ && VECTOR_BOOLEAN_TYPE_P (type)
+ && (expand_vec_cmp_expr_p (itype, type, ltge)
+ || !expand_vec_cmp_expr_p (TREE_TYPE (@2),
+ type, neeq)))))
+ (ltge (convert:itype @0) { build_zero_cst (itype); })))))
+
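For reference, the scalar identity behind the pattern can be sketched in plain C. This is only an illustration, not part of the patch: the function names are hypothetical, the operand is assumed to be a 32-bit unsigned value (so C is 31), and the unsigned-to-signed conversion relies on GCC's documented modulo behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Unfolded form: shift the top bit down to bit 0 and compare with zero.
   The shift count 31 is the precision of uint32_t minus one.  */
static int
shift_ne0 (uint32_t x)
{
  return (x >> 31) != 0;
}

/* Folded form: view X as signed and test the sign.  Converting an
   out-of-range value to int32_t is implementation-defined before C23;
   this sketch assumes GCC's modulo 2^32 behaviour.  */
static int
signed_lt0 (uint32_t x)
{
  return (int32_t) x < 0;
}

/* Hypothetical helper: count the sample values on which the NE/LT or
   the EQ/GE forms disagree.  */
static int
count_mismatches (void)
{
  static const uint32_t samples[] =
    { 0u, 1u, 0x7fffffffu, 0x80000000u, 0x80000001u, 0xffffffffu };
  int bad = 0;

  for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; i++)
    {
      uint32_t x = samples[i];

      /* (X >> 31) != 0  versus  (signed) X < 0.  */
      if (shift_ne0 (x) != signed_lt0 (x))
	bad++;
      /* (X >> 31) == 0  versus  (signed) X >= 0.  */
      if (((x >> 31) == 0) != ((int32_t) x >= 0))
	bad++;
    }
  return bad;
}

/* Hypothetical entry point a test driver could call.  */
static void
check_fold (void)
{
  assert (count_mismatches () == 0);
}
```

Calling check_fold () from a test driver exercises both the NE -> LT and EQ -> GE directions on a few boundary values. The vector case guarded by expand_vec_cmp_expr_p in the pattern is not covered by this sketch.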
So in the PR Richi said the condition ought to be
canonicalize_math_after_vectorization_p, but the patch tests
cfun->after_inlining, which checks something meaningfully different.
Can you explain why you ended up using the after_inlining test rather
than the after_vectorization test?
Jeff