Hi, As suggested by Martin in PR78153 strlen's return value cannot exceed PTRDIFF_MAX. So I set it's range to [0, PTRDIFF_MAX - 1] in extract_range_basic() in the attached patch.
However it regressed strlenopt-3.c: Consider fn1() from strlenopt-3.c: __attribute__((noinline, noclone)) size_t fn1 (char *p, char *q) { size_t s = strlen (q); strcpy (p, q); return s - strlen (p); } The optimized dump shows the following: __attribute__((noclone, noinline)) fn1 (char * p, char * q) { size_t s; size_t _7; long unsigned int _9; <bb 2>: s_4 = strlen (q_3(D)); _9 = s_4 + 1; __builtin_memcpy (p_5(D), q_3(D), _9); _7 = 0; return _7; } which introduces the regression, because the test expects "return 0;" in fn1(). The issue seems to be in vrp2: Before the patch: Visiting statement: s_4 = strlen (q_3(D)); Found new range for s_4: VARYING Visiting statement: _1 = s_4; Found new range for _1: [s_4, s_4] marking stmt to be not simulated again Visiting statement: _7 = s_4 - _1; Applying pattern match.pd:111, gimple-match.c:27997 Match-and-simplified s_4 - _1 to 0 Intersecting [0, 0] and [0, +INF] to [0, 0] Found new range for _7: [0, 0] __attribute__((noclone, noinline)) fn1 (char * p, char * q) { size_t s; long unsigned int _1; long unsigned int _9; <bb 2>: s_4 = strlen (q_3(D)); _9 = s_4 + 1; __builtin_memcpy (p_5(D), q_3(D), _9); _1 = s_4; return 0; } After the patch: Visiting statement: s_4 = strlen (q_3(D)); Intersecting [0, 9223372036854775806] and [0, 9223372036854775806] to [0, 9223372036854775806] Found new range for s_4: [0, 9223372036854775806] marking stmt to be not simulated again Visiting statement: _1 = s_4; Intersecting [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) and [0, 9223372036854775806] to [0, 9223372036854775806] EQUIVALENCES: { s_4 } (1 elements) Found new range for _1: [0, 9223372036854775806] marking stmt to be not simulated again Visiting statement: _7 = s_4 - _1; Intersecting ~[9223372036854775807, 9223372036854775809] and ~[9223372036854775807, 9223372036854775809] to ~[9223372036854775807, 9223372036854775809] Found new range for _7: ~[9223372036854775807, 9223372036854775809] marking stmt to be not simulated again __attribute__((noclone, noinline)) fn1 (char * p, char * q) { size_t s; long unsigned int _1; size_t _7; long unsigned int _9; <bb 2>: s_4 = strlen (q_3(D)); _9 = s_4 + 1; __builtin_memcpy (p_5(D), q_3(D), _9); _1 = s_4; _7 = s_4 - _1; return _7; } Then forwprop4 turns _1 = s_4 _7 = s_4 - _1 into _7 = 0 and we end up with: _7 = 0 return _7 in optimized dump. Running ccp again after forwprop4 trivially solves the issue, however I am not sure if we want to run ccp again ? The issue is probably with extract_range_from_ssa_name(): For _1 = s_4 Before patch: VR for s_4 is set to varying. So VR for _1 is set to [s_4, s_4] by extract_range_from_ssa_name. Since VR for _1 is [s_4, s_4] it implicitly implies that _1 is equal to s_4, and vrp is able to transform _7 = s_4 - _1 to _7 = 0 (by using match.pd pattern x - x -> 0). After patch: VR for s_4 is set to [0, PTRDIFF_MAX - 1] And correspondingly VR for _1 is set to [0, PTRDIFF_MAX - 1] so IIUC, we then lose the information that _1 is equal to s_4, and vrp doesn't transform _7 = s_4 - _1 to _7 = 0. forwprop4 does that because it sees that s_4 and _1 are equivalent. Does this sound correct ? I am not sure how to proceed with the patch, and would be grateful for suggestions. Thanks, Prathamesh
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c new file mode 100644 index 0000000..2530ba0 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-1.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-evrp-slim" } */ + +void f(const char *s) +{ + if (__PTRDIFF_MAX__ <= __builtin_strlen (s)) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not "__builtin_abort" "evrp" } } */ diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c new file mode 100644 index 0000000..de70450 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr78153-2.c @@ -0,0 +1,11 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -fdump-tree-evrp-slim" } */ + +void f(const char *s) +{ + __PTRDIFF_TYPE__ n = __builtin_strlen (s); + if (n < 0) + __builtin_abort (); +} + +/* { dg-final { scan-tree-dump-not "__builtin_abort" "evrp" } } */ diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c index c2a4133..d17b413 100644 --- a/gcc/tree-vrp.c +++ b/gcc/tree-vrp.c @@ -4013,6 +4013,16 @@ extract_range_basic (value_range *vr, gimple *stmt) : vrp_val_max (type), NULL); } return; + case CFN_BUILT_IN_STRLEN: + { + tree type = TREE_TYPE (gimple_call_lhs (stmt)); + unsigned HOST_WIDE_INT max = + TREE_INT_CST_LOW (vrp_val_max (ptrdiff_type_node)) - 1; + + set_value_range (vr, VR_RANGE, build_int_cst (type, 0), + build_int_cst (type, max), NULL); + } + return; default: break; }
after.vrp2
Description: Binary data
before.vrp2
Description: Binary data