[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

Josef Melcr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|---                         |FIXED

--- Comment #7 from Josef Melcr ---
The fix is now upstreamed, so the bug should be resolved. Closing.
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

--- Comment #6 from GCC Commits ---
The master branch has been updated by Josef Melcr:

https://gcc.gnu.org/g:fe22bdabd4ca51c036e5e2ab89f56ffe32f4cac9

commit r16-5959-gfe22bdabd4ca51c036e5e2ab89f56ffe32f4cac9
Author: Josef Melcr
Date:   Mon Dec 8 09:21:38 2025 +0100

    ipa/122798: Adjust local and address_taken flags for callback clones.

    Hi,

    previously, clones of callback functions had their local flag set.
    Because callback edges are direct rather than indirect, GCC falsely
    assumes that their callsites are available and that it can change
    their ABI, leading to segfaults.  This patch fixes that.  Additionally,
    this patch fixes a check in redirect_callee for clearing the
    address_taken flag.

            PR ipa/122798

    gcc/ChangeLog:

            * cgraph.cc (cgraph_edge::redirect_callee): Use
            iterate_referring instead of referred_to_p.
            * cgraphclones.cc (set_new_clone_decl_and_node_flags): Set local
            to true iff the node does not have its address taken.

    Signed-off-by: Josef Melcr
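The rule stated in the ChangeLog entry above ("set local to true iff the node does not have its address taken") can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_node`, `toy_set_clone_flags` and `toy_may_change_abi` are hypothetical names invented here, and the code is not taken from cgraphclones.cc.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for a call-graph node.
   GCC's real cgraph_node carries many more fields.  */
struct toy_node {
    bool address_taken;  /* something refers to the function's address */
    bool local;          /* all callers are known, so the ABI may change */
};

/* Rule from the ChangeLog entry: a clone is local iff its address is
   not taken.  */
static void toy_set_clone_flags(struct toy_node *clone, bool address_taken)
{
    clone->address_taken = address_taken;
    clone->local = !address_taken;
}

/* Models a backend asking whether it may pick a non-default calling
   convention for this function.  */
static bool toy_may_change_abi(const struct toy_node *node)
{
    return node->local;
}
```

In this model a callback clone, whose address is handed to a runtime, is created with `address_taken` set and is therefore never reported as eligible for an ABI change.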
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

--- Comment #5 from Josef Melcr ---
That does seem to be the problem: properly setting target->local fixes the regressions for me. I'll post the patch in the coming days, once I have tested it properly.
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

--- Comment #4 from Jakub Jelinek ---
On x86 with -m32, there are various ABIs: normal, regparm(1), regparm(2), regparm(3), sseregparm, fastcall, thiscall etc. See config/i386/i386.cc (ix86_function_regparm), where in some cases it picks a different ABI for local functions if it can arrange for all the callers to be changed. That isn't the case for the OpenMP/OpenACC functions though, and generally not for other callbacks either, unless all the indirect calls can be adjusted as well.

Guess before your changes cgraph was seeing a static function having its address escape to a non-local function, so most likely target->local wasn't set:

  /* Set when function is visible in current compilation unit only and
     its address is never taken.  */
  unsigned local : 1;

Now, if the cgraph optimizations of the gnu_callback attribute cause node->local not to be set because while the address escapes, we have a guarantee that it will be only called through, then either we should arrange for node->cannot_change_signature to be set (although, frankly, the optimizations actually could change the signature if they can change one struct XYZ * argument to struct YZW * with different layout (or do they just keep the same struct and just omit some loads or stores?)), or add a new flag, so that we at least don't try to change the ABI of such functions.
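The situation described in the comment above can be sketched in plain C: a static function whose address escapes to a separate runtime entry point. All names here (`toy_omp_data`, `toy_kernel`, `toy_runtime_parallel`, `toy_run`) are hypothetical and only loosely modelled on an outlined OpenMP region; the real libgomp interfaces are not reproduced.

```c
/* Hypothetical payload, loosely modelled on the .omp_data_i struct
   passed to an outlined parallel region.  */
struct toy_omp_data {
    int ubound;
    int sum;
};

/* Outlined "kernel": static, yet its address escapes to the runtime
   stand-in below, so the compiler does not control every call site.  */
static void toy_kernel(void *arg)
{
    struct toy_omp_data *data = arg;
    for (int i = 1; i <= data->ubound; i++)
        data->sum += i;
}

/* Stand-in for a runtime entry point.  It only sees a function pointer
   and therefore calls it with the platform's default calling
   convention.  */
void toy_runtime_parallel(void (*fn)(void *), void *data)
{
    fn(data);
}

/* Caller side: passes the kernel's address and its data to the runtime.  */
int toy_run(int ubound)
{
    struct toy_omp_data data = { ubound, 0 };
    toy_runtime_parallel(toy_kernel, &data);
    return data.sum;
}
```

If a compiler treated `toy_kernel` as local and switched it to a register-based convention while the runtime kept passing `arg` on the stack, the kernel would read its argument from the wrong place, which is the kind of mismatch discussed in this thread.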
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

--- Comment #3 from Josef Melcr ---
Hi,
I reduced simd7.f90 to:

subroutine foo (d)
  integer :: i, d(:)
  d = 9;
  !$omp parallel do
  do i = 0, 0
    d = d + 3;
  end do
end subroutine

  interface
    subroutine foo (d)
      integer :: d(:)
    end subroutine
  end interface
  integer :: d(7:9)
  call foo (d)
end

The array's stride gets propagated and the kernel gets cloned. The segfault happens when the function tries to read the upper bound of the array:

  [local count: 25984552]:
  # q.12_9 = PHI
  # tt.13_11 = PHI
  _10 = _6 * q.12_9;
  _12 = _10 + tt.13_11;
  _13 = q.12_9 + _12;
  if (_12 >= _13)
    goto ; [50.00%]
  else
    goto ; [50.00%]

  [local count: 12992276]:
  ubound.0_4 = *.omp_data_i_1(D).ubound.0; <=== SEGFAULT
  if (ubound.0_4 <= 0)
    goto ; [11.00%]
  else
    goto ; [89.00%]

The gimple dumps of the optimized and original kernel are identical leading up to (and including) the problematic load. Examining the assembly yields the following cause:

Optimized:
080493e0 :
 80493ed: e8 3e fc ff ff        call   8049030
 <=== MISSING ARGUMENT POINTER LOAD ==>
 80493f2: 89 c3                 mov    %eax,%ebx
 80493f4: e8 47 fc ff ff        call   8049040
 80493f9: 89 c1                 mov    %eax,%ecx
 .
 804941f: 39 f9                 cmp    %edi,%ecx
 8049421: 7d 7d                 jge    80494a0
 8049423: 8b 56 04              mov    0x4(%esi),%edx  <== SEGFAULT
 8049426: 85 d2                 test   %edx,%edx
 8049428: 7e 76                 jle    80494a0
 804942a: 8b 5e 0c              mov    0xc(%esi),%ebx

Unoptimized:
 .
 804920b: e8 20 fe ff ff        call   8049030
 8049210: 8b 74 24 30           mov    0x30(%esp),%esi  < ARGUMENT POINTER LOAD
 8049214: 89 c7                 mov    %eax,%edi
 8049216: e8 25 fe ff ff        call   8049040
 804921b: 89 c1                 mov    %eax,%ecx
 .
 8049244: 39 da                 cmp    %ebx,%edx
 8049246: 0f 8d da 00 00 00     jge    8049326
 804924c: 8b 46 04              mov    0x4(%esi),%eax  <= NO SEGFAULT
 804924f: 85 c0                 test   %eax,%eax
 8049251: 0f 8e cf 00 00 00     jle    8049326
 8049257: 8b 6e 0c              mov    0xc(%esi),%ebp

The functions do the same work, except the unoptimized version loads the argument pointer from the stack. The optimized version lacks this load, which leads to the invalid memory access when it tries to load from the struct.
It appears that the optimization changes the function's calling convention. I will investigate further.
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org

--- Comment #2 from Andrew Pinski ---
*** Bug 122813 has been marked as a duplicate of this bug. ***
[Bug ipa/122798] [16 Regression] libgomp fortran regressions since r16-5466
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=122798

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |16.0
           Keywords|                            |wrong-code
            Summary|libgomp fortran regressions |[16 Regression] libgomp
                   |since                       |fortran regressions since
                   |f9c01c47f8e5451e41a5df78f04 |r16-5466
                   |4d7b1e7b3d5d6               |
          Component|regression                  |ipa

--- Comment #1 from Andrew Pinski ---
r16-5466-gf9c01c47f8e5451e41a5df78f044d7b1e7b3d5d6
