[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #15 from GCC Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:c4f2c84e8fa369856aee76679590eb613724bfb0 commit r14-9668-gc4f2c84e8fa369856aee76679590eb613724bfb0 Author: Jakub Jelinek Date: Tue Mar 26 11:21:38 2024 +0100 fold-const: Punt on MULT_EXPR in extract_muldiv MIN/MAX_EXPR case [PR51] As I've tried to explain in the comments, the extract_muldiv_1 MIN/MAX_EXPR optimization is wrong for code == MULT_EXPR. If the multiplication is done in unsigned type or in signed type with -fwrapv, it is fairly obvious that max (a, b) * c in many cases isn't equivalent to max (a * c, b * c) (or min if c is negative) due to overflows, but even for signed with undefined overflow, the optimization could turn something without UB in it (where say a * c invokes UB, but max (or min) picks the other operand where b * c doesn't). As for division/modulo, I think it is in most cases safe, except if the problematic INT_MIN / -1 case could be triggered, but we can just punt for MAX_EXPR because for MIN_EXPR if one operand is INT_MIN, we'd pick that operand already. It is just for completeness, match.pd already has an optimization which turns x / -1 into -x, so the division by zero is mostly theoretical. That is also why in the testcase the i case isn't actually miscompiled without the patch, while the c and f cases are. 2024-03-26 Jakub Jelinek PR middle-end/51 * fold-const.cc (extract_muldiv_1) : Punt for MULT_EXPR altogether, or for MAX_EXPR if c is -1. * gcc.c-torture/execute/pr51.c: New test.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Richard Biener changed: What|Removed |Added Priority|P1 |P2
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #14 from Jakub Jelinek --- (In reply to Jakub Jelinek from comment #12) > The following testcase at least reproduces the unsigned multiplication > issue, but doesn't reproduce the signed multiplication nor division by -1. > int > main () > { > unsigned a = (1U + __INT_MAX__) / 2U; > unsigned b = 1U; > unsigned c = (a * 2U > b * 2U ? a * 2U : b * 2U) * 2U; > if (c != 0U) > __builtin_abort (); > int d = (-__INT_MAX__ - 1) / 2; > int e = 10; > int f = (d * 2 > e * 2 ? d * 2 : e * 2) * 6; > if (f != 120) > __builtin_abort (); > int g = (-__INT_MAX__ - 1) / 2; > int h = 0; > int i = (g * 2 > h * 2 ? g * 2 : h * 2) / -1; > if (i != 0) > __builtin_abort (); > } Ah, the reason it doesn't fail for the f and i cases is that for the signed type cases, we actually don't create a MIN_EXPR or MAX_EXPR but COND_EXPR which just compares the vars and performs multiplication only in the COND_EXPR second/third arguments. So it is kind of hard trying to make it trigger for the problematic cases where the recursive calls would extract something. Will see in full bootstrap/regtest with logging how often does the patch trigger.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #13 from Richard Biener --- (In reply to Jakub Jelinek from comment #11) > Perhaps > --- fold-const.cc.jj8 2024-03-11 22:37:29.0 +0100 > +++ fold-const.cc 2024-03-22 19:31:44.189686120 +0100 > @@ -7104,6 +7104,27 @@ extract_muldiv_1 (tree t, tree c, enum t >if (TYPE_UNSIGNED (ctype) != TYPE_UNSIGNED (type)) > break; > > + /* Punt for multiplication altogether. > + MAX (1U + INT_MAX, 1U) * 2U is not equivalent to > + MAX ((1U + INT_MAX) * 2U, 1U * 2U), the former is > + 0U, the latter is 2U. > + MAX (INT_MIN / 2, 0) * -2 is not equivalent to > + MIN (INT_MIN / 2 * -2, 0 * -2), the former is > + well defined 0, the latter invokes UB. > + MAX (INT_MIN / 2, 5) * 5 is not equivalent to > + MAX (INT_MIN / 2 * 5, 5 * 5), the former is > + well defined 25, the latter invokes UB. */ > + if (code == MULT_EXPR) > + break; > + /* For division/modulo, punt on c being -1 for MAX, as > + MAX (INT_MIN, 0) / -1 is not equivalent to > + MIN (INT_MIN / -1, 0 / -1), the former is well defined > + 0, the latter invokes UB (or for -fwrapv is INT_MIN). > + MIN (INT_MIN, 0) / -1 already invokes UB, so the > + transformation won't make it worse. */ > + else if (tcode == MAX_EXPR && integer_minus_onep (c)) > + break; > + >/* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ >sub_strict_overflow_p = false; >if ((t1 = extract_muldiv (op0, c, code, wide_type, > ? I think that looks reasonable.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #12 from Jakub Jelinek --- The following testcase at least reproduces the unsigned multiplication issue, but doesn't reproduce the signed multiplication nor division by -1. int main () { unsigned a = (1U + __INT_MAX__) / 2U; unsigned b = 1U; unsigned c = (a * 2U > b * 2U ? a * 2U : b * 2U) * 2U; if (c != 0U) __builtin_abort (); int d = (-__INT_MAX__ - 1) / 2; int e = 10; int f = (d * 2 > e * 2 ? d * 2 : e * 2) * 6; if (f != 120) __builtin_abort (); int g = (-__INT_MAX__ - 1) / 2; int h = 0; int i = (g * 2 > h * 2 ? g * 2 : h * 2) / -1; if (i != 0) __builtin_abort (); }
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #11 from Jakub Jelinek --- Perhaps --- fold-const.cc.jj8 2024-03-11 22:37:29.0 +0100 +++ fold-const.cc 2024-03-22 19:31:44.189686120 +0100 @@ -7104,6 +7104,27 @@ extract_muldiv_1 (tree t, tree c, enum t if (TYPE_UNSIGNED (ctype) != TYPE_UNSIGNED (type)) break; + /* Punt for multiplication altogether. +MAX (1U + INT_MAX, 1U) * 2U is not equivalent to +MAX ((1U + INT_MAX) * 2U, 1U * 2U), the former is +0U, the latter is 2U. +MAX (INT_MIN / 2, 0) * -2 is not equivalent to +MIN (INT_MIN / 2 * -2, 0 * -2), the former is +well defined 0, the latter invokes UB. +MAX (INT_MIN / 2, 5) * 5 is not equivalent to +MAX (INT_MIN / 2 * 5, 5 * 5), the former is +well defined 25, the latter invokes UB. */ + if (code == MULT_EXPR) + break; + /* For division/modulo, punt on c being -1 for MAX, as +MAX (INT_MIN, 0) / -1 is not equivalent to +MIN (INT_MIN / -1, 0 / -1), the former is well defined +0, the latter invokes UB (or for -fwrapv is INT_MIN). +MIN (INT_MIN, 0) / -1 already invokes UB, so the +transformation won't make it worse. */ + else if (tcode == MAX_EXPR && integer_minus_onep (c)) + break; + /* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ sub_strict_overflow_p = false; if ((t1 = extract_muldiv (op0, c, code, wide_type, ? Though int main () { unsigned a = 1U + __INT_MAX__; unsigned b = 1U; unsigned c = (a > b ? a : b) * 2U; if (c != 0U) __builtin_abort (); int d = (-__INT_MAX__ - 1) / 2; int e = 5; int f = (d > e ? d : e) * 5; if (f != 25) __builtin_abort (); int g = -__INT_MAX__ - 1; int h = 0; int i = (g > h ? g : h) / -1; if (i != 0) __builtin_abort (); } doesn't seem to be miscompiled, we just don't do that transformation at all even without the patch. Need to tweak such that the min/max arguments are both something on which extract_muldiv returns non-NULL.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #10 from Jakub Jelinek --- It isn't just those 2 values though. MAX (INT_MIN / 2, 0) * -2 etc. would be a problem too. So maybe play safe and only do it for MULT_EXPR when TYPE_OVERFLOW_UNDEFINED and c is non-negative? Maybe non-power of two -c could be ok too.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #9 from Jakub Jelinek --- (In reply to Jakub Jelinek from comment #8) > (In reply to Richard Biener from comment #5) > > but even when overflow is undefined we don't know whether we introduce > > additional overflow then. Consider MAX (INT_MIN, 0) * -1 where we compute > > 0 * -1 (fine) but after the transform we'd do MIN (INT_MIN * -1, 0) > > which isn't valid. > > > > And when overflow wraps consider MAX (UINT_MAX, 1) * 2 which > > will compute UINT_MAX * 2 == 0 while MAX (UINT_MAX * 2, 1 * 2) will compute > > 2. > > > > Unless I'm missing something. > > You're right. So perhaps punt on this optimization for code == MULT_EXPR > altogether. Although, even for MULT_EXPR when TYPE_OVERFLOW_UNDEFINED, if the only problematic cases are when one of the multiplication operand is -1 and the other is signed minimum, because we know one of those operands (c), we could just punt if integer_minus_onep (c) or if c is TYPE_MIN_VALUE.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #8 from Jakub Jelinek --- (In reply to Richard Biener from comment #5) > but even when overflow is undefined we don't know whether we introduce > additional overflow then. Consider MAX (INT_MIN, 0) * -1 where we compute > 0 * -1 (fine) but after the transform we'd do MIN (INT_MIN * -1, 0) > which isn't valid. > > And when overflow wraps consider MAX (UINT_MAX, 1) * 2 which > will compute UINT_MAX * 2 == 0 while MAX (UINT_MAX * 2, 1 * 2) will compute > 2. > > Unless I'm missing something. You're right. So perhaps punt on this optimization for code == MULT_EXPR altogether. For the division/modulo, the problematic case is signed division by -1 (unless we can prove that neither operand is signed type minimum), but c is constant here, so we could as well just punt for code == MULT_EXPR || integer_minus_onep (c)?
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #7 from Jakub Jelinek --- (In reply to Richard Biener from comment #3) > @@ -6970,8 +6972,11 @@ extract_muldiv_1 (tree t, tree c, enum tree_code > code, tree wide_type, > >/* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ >sub_strict_overflow_p = false; > - if ((t1 = extract_muldiv (op0, c, code, wide_type, > - _strict_overflow_p)) != 0 > + if ((wide_type > + ? TYPE_OVERFLOW_UNDEFINED (wide_type) > + : TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))) > + && (t1 = extract_muldiv (op0, c, code, wide_type, > + _strict_overflow_p)) != 0 > && (t2 = extract_muldiv (op1, c, code, wide_type, >_strict_overflow_p)) != 0) > { Isn't it fine as is for unsigned divisions and modulos? I'd think the only problem is TYPE_OVERFLOW_WRAPS code == MULT_EXPR or !TYPE_UNSIGNED && TYPE_OVERFLOW_WRAPS division/modulo (say -fwrapv MIN (a, b) / -1 -> MAX (a / -1, b / -1) transformation wouldn't be correct for a INT_MIN b 0, because the first one is INT_MIN / -1 aka. INT_MIN, while the latter is 0 or perhaps TYPE_OVERFLOW_SANITIZED should be punted on as well.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Jeffrey A. Law changed: What|Removed |Added Priority|P3 |P1 CC||law at gcc dot gnu.org
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Andrew Pinski changed: What|Removed |Added Known to work||2.95.3 Keywords|needs-bisection | --- Comment #6 from Andrew Pinski --- extract_muldiv seems to be the cause of so many issues. It was added in r0-25100-g1baa375feaa2 which was only included in GCC 3.0+ even. The bad MIN/MAX code was added with that revision even so I am going to remove the needs-bisect since I think it is obvious that revision broke it.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #5 from Richard Biener --- (In reply to Richard Biener from comment #3) > I think when overflow wraps we cannot do this transform at all, independent > on the "sign" of 'c'. > > Maybe > > @@ -6970,8 +6972,11 @@ extract_muldiv_1 (tree t, tree c, enum tree_code > code, tree wide_type, > >/* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ >sub_strict_overflow_p = false; > - if ((t1 = extract_muldiv (op0, c, code, wide_type, > - _strict_overflow_p)) != 0 > + if ((wide_type > + ? TYPE_OVERFLOW_UNDEFINED (wide_type) > + : TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))) > + && (t1 = extract_muldiv (op0, c, code, wide_type, > + _strict_overflow_p)) != 0 > && (t2 = extract_muldiv (op1, c, code, wide_type, >_strict_overflow_p)) != 0) > { but even when overflow is undefined we don't know whether we introduce additional overflow then. Consider MAX (INT_MIN, 0) * -1 where we compute 0 * -1 (fine) but after the transform we'd do MIN (INT_MIN * -1, 0) which isn't valid. And when overflow wraps consider MAX (UINT_MAX, 1) * 2 which will compute UINT_MAX * 2 == 0 while MAX (UINT_MAX * 2, 1 * 2) will compute 2. Unless I'm missing something. What we'd need to know is whether the inner operations are known to not overflow/wrap (or whether they change sign consistently). But without range info we can't know this unless op0 and op1 are constants. So - scrap that whole sub-rule?
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 --- Comment #4 from Richard Biener --- Created attachment 55794 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=55794=edit fold "dumping" This adds DUMP_FOLD, but I get /home/rguenther/src/trunk/gcc/fold-const.cc:6808:22: warning: ISO C++ forbids braced-groups within expressions [-Wpedantic] #define DUMP_FOLD(X) ({ auto x = (X); if (x && dump_file && (dump_flags & TDF_FOLDING)) fprintf (dump_file, "Applying fold-const.c:%d\n", __LINE__); x; }) ^ /home/rguenther/src/trunk/gcc/fold-const.cc:7193:15: note: in expansion of macro 'DUMP_FOLD' return DUMP_FOLD (fold_build2 (code, ctype, fold_convert (ctype, op0), ^ with #define DUMP_FOLD(X) ({ auto x = (X); if (x && dump_file && (dump_flags & TDF_FOLDING)) fprintf (dump_file, "Applying fold-const.c:%d\n", __LINE__); x; }) so maybe need to use __extension__ and #if __GNUC for this?
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Richard Biener changed: What|Removed |Added CC||pinskia at gcc dot gnu.org, ||rguenth at gcc dot gnu.org --- Comment #3 from Richard Biener --- instrumenting extract_muldiv shows ... Applying pattern match.pd:5113, generic-match-4.cc:2339 Applying fold-const.c:6892 Applying fold-const.c:7101 Applying fold-const.c:6892 Applying fold-const.c:6985 Applying pattern match.pd:4392, generic-match-8.cc:3091 and commenting case MIN_EXPR: case MAX_EXPR: /* If widening the type changes the signedness, then we can't perform this optimization as that changes the result. */ if (TYPE_UNSIGNED (ctype) != TYPE_UNSIGNED (type)) break; /* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ sub_strict_overflow_p = false; if ((t1 = extract_muldiv (op0, c, code, wide_type, _strict_overflow_p)) != 0 && (t2 = extract_muldiv (op1, c, code, wide_type, _strict_overflow_p)) != 0) { if (tree_int_cst_sgn (c) < 0) tcode = (tcode == MIN_EXPR ? MAX_EXPR : MIN_EXPR); if (sub_strict_overflow_p) *strict_overflow_p = true; return DUMP_FOLD (fold_build2 (tcode, ctype, fold_convert (ctype, t1), fold_convert (ctype, t2))); } break; fixes the testcase. We turn MAX ((long long unsigned int) t + 4503599, 32739) * 18446744073709551606 to MIN ((long long unsigned int) t * 18446744073709551606 + 18446744073664515626, 18446744073709224226) I think when overflow wraps we cannot do this transform at all, independent on the "sign" of 'c'. Maybe @@ -6970,8 +6972,11 @@ extract_muldiv_1 (tree t, tree c, enum tree_code code, tree wide_type, /* MIN (a, b) / 5 -> MIN (a / 5, b / 5) */ sub_strict_overflow_p = false; - if ((t1 = extract_muldiv (op0, c, code, wide_type, - _strict_overflow_p)) != 0 + if ((wide_type + ? TYPE_OVERFLOW_UNDEFINED (wide_type) + : TYPE_OVERFLOW_UNDEFINED (TREE_TYPE (op0))) + && (t1 = extract_muldiv (op0, c, code, wide_type, + _strict_overflow_p)) != 0 && (t2 = extract_muldiv (op1, c, code, wide_type, _strict_overflow_p)) != 0) {
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Mikael Pettersson changed: What|Removed |Added CC||mikpelinux at gmail dot com --- Comment #2 from Mikael Pettersson --- On x86_64-linux-gnu I can reproduce this all the way back to gcc-3.2.3 with both -m64 and -m32. gcc-2.95.3 compiling for i686-linux-gnu gets it right.
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Richard Biener changed: What|Removed |Added Keywords||needs-bisection, wrong-code Target Milestone|--- |12.4
[Bug middle-end/111151] [12/13/14 Regression] Wrong code at -O0 on x86_64-pc-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51 Richard Biener changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED |NEW Last reconfirmed||2023-08-25 --- Comment #1 from Richard Biener --- a = MAX_EXPR <(long long unsigned int) t + 4503599, 32739>; b = a * 18446744073709551606; printf ((const char *) "%llu (Split calculation result)\n", b); c = MAX_EXPR <(long long unsigned int) t * 18446744073709551606 + 18446744073664515626, 18446744073709224226>; printf ((const char *) "%llu (Combine calculation result)\n", c); smells like extract_muldiv, but Applying pattern match.pd:4359, generic-match-8.cc:3047 Applying pattern match.pd:4359, generic-match-8.cc:3047 Applying pattern match.pd:5475, generic-match-8.cc:1606 Applying pattern match.pd:4256, generic-match-8.cc:2977 Applying pattern match.pd:5113, generic-match-4.cc:2339 Applying pattern match.pd:4392, generic-match-8.cc:3091 Applying pattern match.pd:4359, generic-match-8.cc:3047 Applying pattern match.pd:4359, generic-match-8.cc:3047 Applying pattern match.pd:5475, generic-match-8.cc:1606 Applying pattern match.pd:4256, generic-match-8.cc:2977 Applying pattern match.pd:5113, generic-match-4.cc:2339 Applying pattern match.pd:4392, generic-match-8.cc:3091 it would be nice to have fold-const.cc report "matches" as well. Maybe replace all return ; with return report (); with a report macro doing dumping like above if is non-NULL.