https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122103
--- Comment #15 from GCC Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Tamar Christina <[email protected]>:

https://gcc.gnu.org/g:974c04dc2cb7f44705a9fd62b3b9592d7c6faca3

commit r16-6511-g974c04dc2cb7f44705a9fd62b3b9592d7c6faca3
Author: Tamar Christina <[email protected]>
Date:   Mon Jan 5 20:56:03 2026 +0000

    vect: teach vectorizable_call to predicate calls when they can trap [PR122103]

    The following example

    void f (float *__restrict c, int *__restrict d, int n)
    {
      for (int i = 0; i < n; i++)
        {
          c[i] = __builtin_sqrtf (c[i]);
        }
    }

    compiled with -O3 -march=armv9-a -fno-math-errno -ftrapping-math needs to be
    predicated on the conditional.  It's invalid to execute the branch and use a
    select to extract it later unless using -fno-trapping-math.

    We currently generate:

    f:
            cmp     w2, 0
            ble     .L1
            mov     x1, 0
            whilelo p7.s, wzr, w2
            ptrue   p6.b, all
    .L3:
            ld1w    z31.s, p7/z, [x0, x1, lsl 2]
            fsqrt   z31.s, p6/m, z31.s
            st1w    z31.s, p7, [x0, x1, lsl 2]
            incw    x1
            whilelo p7.s, w1, w2
            b.any   .L3
    .L1:
            ret

    Which means the inactive lanes of the operation can raise an FE.

    With this change we now generate

    f:
            cmp     w2, 0
            ble     .L1
            mov     x1, 0
            whilelo p7.s, wzr, w2
            .p2align 5,,15
    .L3:
            ld1w    z31.s, p7/z, [x0, x1, lsl 2]
            fsqrt   z31.s, p7/m, z31.s
            st1w    z31.s, p7, [x0, x1, lsl 2]
            incw    x1
            whilelo p7.s, w1, w2
            b.any   .L3
    .L1:
            ret

    However as discussed in PR96373 while we probably shouldn't vectorize for the
    cases where we can trap but don't support conditional operation there doesn't
    seem to be a clear consensus on how GCC should handle trapping math.

    As such similar to PR96373 I don't stop vectorization if trapping math and
    the conditional operation isn't supported.

    gcc/ChangeLog:

            PR tree-optimization/122103
            * tree-vect-stmts.cc (vectorizable_call): Handle trapping math.

    gcc/testsuite/ChangeLog:

            PR tree-optimization/122103
            * gcc.target/aarch64/sve/pr122103_4.c: New test.
            * gcc.target/aarch64/sve/pr122103_5.c: New test.
            * gcc.target/aarch64/sve/pr122103_6.c: New test.
