https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94083
Bug ID: 94083
Summary: inefficient soft-float x!=Inf code
Product: gcc
Version: 9.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: middle-end
Assignee: unassigned at gcc dot gnu.org
Reporter: wilson at gcc dot gnu.org
Target Milestone: ---

Given a testcase like this

int foo(void)
{
  volatile float f;
  int n;
  f = __builtin_huge_valf();
  n += 1 - (f != __builtin_huge_valf());
  return n;
}

and compiling for soft-float, we end up with a call to __unordsf2 followed by a call to __lesf2. This means the floats have to be unpacked twice and checked for NaN twice. This gives both poor performance and poor code size. I've confirmed this for x86, arm, and riscv.

Folding in the C front end is creating an unordered less than or equal comparison against FLT_MAX. From the 004.original file

n = SAVE_EXPR <!(f u<= 3.4028234663852885981170418348451692544e+38)> + n;

This optimization is coming from a rule in the match.pd file.

/* x != +Inf is always equal to !(x > DBL_MAX), but this introduces an exception for x a NaN so use an unordered comparison. */

When we generate RTL, we call do_compare_rtx_and_jump, which notices that we don't have an operation for UNLE_EXPR, but decides we can't reverse it because that is unsafe. It tries swapping arguments, but we don't have UNGE_EXPR either. So it emits two libcalls.

Converting a NE compare to a UNLE compare looks like an odd optimization. If we want to consider unordered operations as canonical operations, then maybe we should add libgcc support for the unordered operations. Or maybe we should check to see if unordered operations are handled by the target before converting a simple NE into a UNLE.

The match.pd rule was changed to use UNLE in the patch for PR 64811, which fixed a problem with handling NaNs. This happened 2018-01-09. The optimization dates back to 2003-05-22 but was originally using LE, which is OK for soft-float. It wasn't until the NaN bug was fixed by using UNLE instead of LE that this became an optimization problem.
Maybe we just shouldn't perform this optimization when honoring NaNs? That would avoid generating the problematic unordered operation early in the optimizer.
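
For reference, here is a small sketch of what a single-pass unordered less-or-equal helper could look like, and a check that it agrees with x != +Inf when compared against FLT_MAX. This is only an illustration: soft_unle_sf, float_bits, is_nan_bits, order_key, ne_inf_matches and check_examples are hypothetical names, not existing libgcc entry points, and the code assumes float is IEEE-754 binary32. The point is that each operand is unpacked once and checked for NaN once, instead of once in __unordsf2 and again in __lesf2.

```c
#include <assert.h>
#include <float.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its bit pattern (assumes IEEE-754 binary32). */
static uint32_t float_bits(float x)
{
    uint32_t u;
    memcpy(&u, &x, sizeof u);
    return u;
}

/* NaN: exponent all ones and a non-zero mantissa. */
static int is_nan_bits(uint32_t u)
{
    return (u & 0x7fffffffu) > 0x7f800000u;
}

/* Map a non-NaN bit pattern to an unsigned key whose integer order
   matches the floating-point order.  -0 and +0 get the same key. */
static uint32_t order_key(uint32_t u)
{
    if ((u & 0x7fffffffu) == 0)
        u = 0;                              /* treat -0 as +0 */
    return (u & 0x80000000u) ? ~u : (u | 0x80000000u);
}

/* Hypothetical helper: returns 1 if either operand is NaN or a <= b,
   i.e. the UNLE comparison, with a single NaN check per operand. */
static int soft_unle_sf(float a, float b)
{
    uint32_t ua = float_bits(a), ub = float_bits(b);
    if (is_nan_bits(ua) || is_nan_bits(ub))
        return 1;                           /* unordered */
    return order_key(ua) <= order_key(ub);
}

/* The match.pd rewrite: x != +Inf  <=>  x u<= FLT_MAX.  Compare the
   sketch against both the plain != and the C99 quiet comparison. */
static int ne_inf_matches(float x)
{
    int ne = (x != INFINITY);
    return ne == soft_unle_sf(x, FLT_MAX)
        && ne == !isgreater(x, FLT_MAX);
}

/* Run the equivalence over a few interesting inputs. */
static void check_examples(void)
{
    const float samples[] = { INFINITY, -INFINITY, NAN, FLT_MAX,
                              -FLT_MAX, 0.0f, -0.0f, 1.0f };
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        assert(ne_inf_matches(samples[i]));
}
```

Calling check_examples() exercises the equivalence for infinities, NaN, zeros of both signs and FLT_MAX. The real libgcc soft-float routines are structured differently; this only shows that one unpack and one NaN test are enough for UNLE.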