[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 Jakub Jelinek changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED|RESOLVED --- Comment #8 from Jakub Jelinek --- Fixed now.
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 --- Comment #7 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:fe78528c05fdd562f21e12675781473b0fbe892e commit r11-5957-gfe78528c05fdd562f21e12675781473b0fbe892e Author: Jakub Jelinek Date: Sat Dec 12 14:48:47 2020 +0100 widening_mul: Recognize another form of ADD_OVERFLOW [PR96272] The following patch recognizes another form of hand written __builtin_add_overflow (this time _p), in particular when the code does unsigned if (x > ~0U - y) or if (x <= ~0U - y) it can be optimized (if the subtraction turned into ~y is single use) into if (__builtin_add_overflow_p (x, y, 0U)) or if (!__builtin_add_overflow_p (x, y, 0U)) and generate better code, e.g. for the first function in the testcase: - movl%esi, %eax addl%edi, %esi - notl%eax - cmpl%edi, %eax - movl$-1, %eax - cmovnb %esi, %eax + jc .L3 + movl%esi, %eax + ret +.L3: + orl $-1, %eax ret on x86_64. As for the jumps vs. conditional move case, that is some CE issue with complex branch patterns we should fix up no matter what, but in this case I'm actually not sure if branchy code isn't better, overflow is something that isn't that common. 2020-12-12 Jakub Jelinek PR tree-optimization/96272 * tree-ssa-math-opts.c (uaddsub_overflow_check_p): Add OTHER argument. Handle BIT_NOT_EXPR. (match_uaddsub_overflow): Optimize unsigned a > ~b into __imag__ .ADD_OVERFLOW (a, b). (math_opts_dom_walker::after_dom_children): Call match_uaddsub_overflow even for BIT_NOT_EXPR. * gcc.dg/tree-ssa/pr96272.c: New test.
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org Status|NEW |ASSIGNED --- Comment #6 from Jakub Jelinek --- Created attachment 49744 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=49744&action=edit gcc11-pr96272.patch Untested fix. The addition is then CSEd by RTL PRE, and the reason why we don't emit a conditional move is something in the CE. Though it is a question if we want to use a conditional move, generally predicting overflows don't happen is a good idea and so the overflow case should be pretty cold.
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 --- Comment #5 from Jakub Jelinek --- It shouldn't be added to match.pd, the check as written in the source is certainly better for other optimization passes. For PR95853 I've added recently a widening_mul (i.e. very late pass, almost before expansion) pattern matching of another form (very lame in the source code) of the overflow check but in that case I've put a constraint that the overflow check and addition must be in the same bb (and not too far from each other). In this case it isn't in the same bb and again would be very undesirable if it is really too far (would increase register pressure).
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 --- Comment #4 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #2) > Well, it needs the addition too, so I think this can't be done in match.pd, > but would need to be done in some other pass (not sure which, perhaps > phiopt?). Maybe I was not too clear. Please consider this testcase: --cut here-- #include #include int foo (unsigned a, unsigned b) { return a > UINT_MAX - b; } int bar (unsigned a, unsigned b) { int dummy; return __builtin_uadd_overflow (a, b, &dummy); } --cut here-- This results in (-O2): foo: notl%esi xorl%eax, %eax cmpl%edi, %esi setb%al ret bar: xorl%eax, %eax addl%esi, %edi setc%al ret So, if it is possible to transform the comparison via the following equivalence: a > UINT_MAX - b == (a + b) > UINT_MAX to a __builtin_uadd_overflow, then at least on x86 it is possible to produce much better code.
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 --- Comment #3 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #2) > Well, it needs the addition too, so I think this can't be done in match.pd, > but would need to be done in some other pass (not sure which, perhaps > phiopt?). No, I was referring to the first step of the optimization. The converted source would read something like: unsigned bar (unsigned a, unsigned b) { int dummy; if (__builtin_uadd_overflow (a, b, &dummy)) return UINT_MAX; return a + b; } The RTL CSE pass is able to eliminate one addition, resulting in: bar: addl%esi, %edi jc .L5 movl%edi, %eax ret .L5: orl $-1, %eax ret Eventually, some tree pass could convert the above source to: unsigned bar (unsigned a, unsigned b) { unsigned res; if (__builtin_uadd_overflow (a, b, &res)) return UINT_MAX; return res; } which results in: bar: addl%esi, %edi movl$-1, %eax cmovnc %edi, %eax ret
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 --- Comment #2 from Jakub Jelinek --- Well, it needs the addition too, so I think this can't be done in match.pd, but would need to be done in some other pass (not sure which, perhaps phiopt?).
[Bug tree-optimization/96272] Failure to optimize overflow check
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96272 Uroš Bizjak changed: What|Removed |Added CC||jakub at gcc dot gnu.org Ever confirmed|0 |1 Last reconfirmed||2020-07-22 Status|UNCONFIRMED |NEW --- Comment #1 from Uroš Bizjak --- Confirmed, pattern to convert: a > UINT_MAX - b; to __builtin_uadd_overflow should be added to match.pd.