https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115073
Bug ID: 115073 Summary: RISC-V: Gimple fold not honor C[LT]Z_DEFINED_VALUE_AT_ZERO Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: kito at gcc dot gnu.org Target Milestone: --- Target: riscv64-unknown-linux-gnu # What's up? A loop induction variable initialized from __builtin_ctz (x), and the loop bound is 32, and increment is one, and GCC turn it into infinite loops when x is 0. However RISC-V has defined CTZ_DEFINED_VALUE_AT_ZERO as 32 for SImode, so it's not UB IMO, but seems like gimple-range-op.cc and match.pd are not handle that. # Command to reproduce ``` $ riscv64-unknown-elf-gcc -O3 -march=rv64gc_zba_zbb_zbc ``` # Testcase ```c void f(); void foo(unsigned int id, unsigned int x) { for (unsigned int idx = __builtin_ctz(x); idx < 32; idx++) { f(); } } ``` # Asm output with comment: ``` foo: addi sp,sp,-32 sd s0,16(sp) sd s1,8(sp) sd ra,24(sp) ctzw s0,a1 # s0 is 32 if a1 is 0 li s1,32 .L2: addiw s0,s0,1 # thne s0 become 33 here call f bne s0,s1,.L2 # compare with 32, which never terminate ld ra,24(sp) ld s0,16(sp) ld s1,8(sp) addi sp,sp,32 jr ra ``` # What I tried? I try to call CTZ_DEFINED_VALUE_AT_ZERO gimple-range-op.cc but it seems not help for this test case, and then I found it was screw up at match.pd when ccp pass. It applied a CTZ simplifications at match.pd: ``` (for op (eq ne) (simplify /* __builtin_ctz (x) == C -> (x & ((1 << (C + 1)) - 1)) == (1 << C). */ (op (ctz:s @0) INTEGER_CST@1) (with { tree type0 = TREE_TYPE (@0); int prec = TYPE_PRECISION (type0); } (if (prec <= MAX_FIXED_MODE_SIZE) (if (tree_int_cst_sgn (@1) < 0 || wi::to_widest (@1) >= prec) { constant_boolean_node (op == EQ_EXPR ? false : true, type); } (op (bit_and @0 { wide_int_to_tree (type0, wi::mask (tree_to_uhwi (@1) + 1, false, prec)); }) { wide_int_to_tree (type0, wi::shifted_mask (tree_to_uhwi (@1), 1, false, prec)); }))))))) ``` Then I found it has checked with CTZ_DEFINED_VALUE_AT_ZERO (g:75f8900159133ce069ef1d2edf3b67c7bc82e305) untill g:7383cb56e1170789929201b0dadc156888928fdd, but I realized it because is not really work well here CLZ_DEFINED_VALUE_AT_ZERO. So I did some aggressive experiment here: convert __builtin_ctz to IFN_CTZ with second operand (from C[LT]Z_DEFINED_VALUE_AT_ZERO, ideally), it can work *IF* backend provide patterns for ctz, but NOT work when backend is not provided, it could be a problem to RISC-V since ctz is not included in baseline ISA for RISC-V. It might be arguable if target didn't have ctz/clz pattern but C[LT]Z_DEFINED_VALUE_AT_ZERO is provided to backend, so I think middle-end optimization should still honor with that? Or another thought is convert that into target macro to resolve the issue describe in g:75f8900159133ce069ef1d2edf3b67c7bc82e305? # Aggressive experiment: ``` diff --git a/gcc/c-family/c-gimplify.cc b/gcc/c-family/c-gimplify.cc index 494da49791d..d84469a6dca 100644 --- a/gcc/c-family/c-gimplify.cc +++ b/gcc/c-family/c-gimplify.cc @@ -858,7 +858,16 @@ c_gimplify_expr (tree *expr_p, gimple_seq *pre_p ATTRIBUTE_UNUSED, c, CALL_EXPR_ARG (*expr_p, 1)); return GS_OK; } - break; + if (fndecl && fndecl_built_in_p(fndecl, BUILT_IN_CTZ) && + call_expr_nargs(*expr_p) == 1) { + tree a = save_expr(CALL_EXPR_ARG(*expr_p, 0)); + *expr_p = build_call_expr_internal_loc( + EXPR_LOCATION(*expr_p), IFN_CTZ, TREE_TYPE(a), 2, a, + build_int_cst(TREE_TYPE(a), 32)); + return GS_OK; + } + + break; } default:; ```