[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #33 from Marc Poulhiès --- No worries, we've applied it locally. Thanks!
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #32 from Martin Jambor --- (In reply to Marc Poulhiès from comment #31) > Hello Martin, > > Any chance the fix that fixes the new test for 32bits can be also backported? > > 4923ed49b93352bcf9e43cafac38345e4a54c3f8 > https://gcc.gnu.org/g:4923ed49b93352bcf9e43cafac38345e4a54c3f8 > > Not sure why it's not tagged so that it would appear here. My apologies for not including this commit, I completely forgot about it. Unfortunately I'm afraid it will have to wait until after the 13.3 release, but I will backport it quickly afterwards. Sorry again.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Marc Poulhiès changed: What|Removed |Added CC||dkm at gcc dot gnu.org --- Comment #31 from Marc Poulhiès --- Hello Martin, Any chance the fix that fixes the new test for 32bits can be also backported? 4923ed49b93352bcf9e43cafac38345e4a54c3f8 https://gcc.gnu.org/g:4923ed49b93352bcf9e43cafac38345e4a54c3f8 Not sure why it's not tagged so that it would appear here.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Martin Jambor changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #30 from Martin Jambor --- ...so set to fixed as well.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #29 from Martin Jambor --- Fixed
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #28 from GCC Commits --- The releases/gcc-13 branch has been updated by Martin Jambor : https://gcc.gnu.org/g:10bf53a80eefa46500bffb442719777e2640e7d7 commit r13-8773-g10bf53a80eefa46500bffb442719777e2640e7d7 Author: Martin Jambor Date: Mon Apr 8 18:53:23 2024 +0200 ICF: Make ICF and SRA agree on padding PR 113359 shows that (at least with -fno-strict-aliasing) ICF can unify two functions which copy an aggregate type of the same size but then SRA, through its total scalarization, can copy the aggregate by pieces, skipping paddding, but the padding was not the same in the two original functions that ICF unified. This patch enhances SRA with the ability to collect padding information which then can be compared from within ICF. Unfortunately SRA uses OPTION_SET_P when determining its limits, so ICF needs to switch cfuns at least once to figure it out too. gcc/ChangeLog: 2024-03-27 Martin Jambor PR ipa/113359 * ipa-icf-gimple.h (func_checker): New members safe_for_total_scalarization_p, m_total_scalarization_limit_known_p and m_total_scalarization_limit. (func_checker::func_checker): Initialize new member variables. * ipa-icf-gimple.cc: Include tree-sra.h. (func_checker::func_checker): Initialize new member variables. (func_checker::safe_for_total_scalarization_p): New function. (func_checker::compare_operand): Use the new function. * tree-sra.h (sra_get_max_scalarization_size): Declare. (sra_total_scalarization_would_copy_same_data_p): Likewise. * tree-sra.cc (prepare_iteration_over_array_elts): New function. (class sra_padding_collecting): New. (sra_padding_collecting::record_padding): Likewise. (scalarizable_type_p): Rename to totally_scalarizable_type_p. Add ability to record padding when requested. (totally_scalarize_subtree): Split out gathering information necessary to iterate over array elements to prepare_iteration_over_array_elts. Fix errornous early exit. (analyze_all_variable_accesses): Adjust the call to totally_scalarizable_type_p. Move determining of total scalariation size limit... (sra_get_max_scalarization_size): ...here. (check_ts_and_push_padding_to_vec): New function. (sra_total_scalarization_would_copy_same_data_p): Likewise. gcc/testsuite/ChangeLog: 2024-03-27 Martin Jambor PR ipa/113359 * gcc.dg/lto/pr113359-1_0.c: New. * gcc.dg/lto/pr113359-1_1.c: Likewise. * gcc.dg/lto/pr113359-2_0.c: Likewise. * gcc.dg/lto/pr113359-2_1.c: Likewise. * gcc.dg/lto/pr113359-3_0.c: Likewise. * gcc.dg/lto/pr113359-3_1.c: Likewise. * gcc.dg/lto/pr113359-4_0.c: Likewise. * gcc.dg/lto/pr113359-4_1.c: Likewise. * gcc.dg/lto/pr113359-5_0.c: Likewise. * gcc.dg/lto/pr113359-5_1.c: Likewise. (cherry picked from commit 1e3312a25a7b34d6e3f549273e1674c7114e4408)
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Richard Biener changed: What|Removed |Added Known to work||14.0 Priority|P1 |P2 Known to fail|14.0|
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64 and x86_64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Jakub Jelinek changed: What|Removed |Added Summary|[13/14 Regression] LTO |[13 Regression] LTO |miscompilation of ceph on |miscompilation of ceph on |aarch64 and x86_64 |aarch64 and x86_64 --- Comment #27 from Jakub Jelinek --- .
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #17 from Sam James --- There's another small testcase in PR114263.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #16 from Andrew Pinski --- *** Bug 114263 has been marked as a duplicate of this bug. ***
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #15 from Martin Jambor --- Created attachment 57462 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57462=edit Simple testcase (needs disabling early - and only early - SRA) This is a simpler testcase which exhibits the problem on x86_64-linux and current master. Steps to reproduce: $ ~/gcc/trunk/inst/bin/gcc -O2 -fno-strict-aliasing -fno-ipa-cp --disable-tree-esra -flto pr113359.c -c -o 1.o cc1: note: disable pass tree-esra for functions in the range of [0, 4294967295] $ ~/gcc/trunk/inst/bin/gcc -O2 -fno-strict-aliasing -fno-ipa-cp --disable-tree-esra -flto -DFILE2 pr113359.c -c -o 2.o cc1: note: disable pass tree-esra for functions in the range of [0, 4294967295] $ ~/gcc/trunk/inst/bin/gcc -flto 1.o 2.o -o test.exe $ ./test.exe Aborted (core dumped) If you add -fno-ipa-icf to the "compilation" commands, the test will pass. Late (post ICF) intra-procedural SRA is necessary to exhibit the problem. On the other hand, early SRA must be suppressed or it will scalarize the aggregate assignment too early and the results will look different to IPA-ICF. Instead of using --disable-tree-esra we could pass the address of tmp in both geta() and getb() to an empty function coming from a third compilation unit. Disabling strict aliasing is also necessary to show the problem, with strict aliasing IPA-ICF takes the alias class of types into acount when hashing and considers geta() and getb() different from the start.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #14 from Martin Jambor --- (In reply to rguent...@suse.de from comment #13) > Might be also an interaction with IPA ICF in case there's a pointer to > the pair involved? Yes, this is exactly what seems to be happening. The problem goes away with -fno-icf. (Possibly because the testcase uses -fno-strict-aliasing,) IPA-ICF merges two functions which copy a structure and that access type it what IPA-SRA saves, but loads only the one of the merged functions. SRA then uses the (wrong) type to split aggregate copies into copies by individual fields. I have talked to Honza about this. It seems that IPA-ICF needs to be careful about aggreage with holes in different places. The ideal next step would be to create a testcase not dependent on IPA-SRA.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #13 from rguenther at suse dot de --- On Tue, 6 Feb 2024, jakub at gcc dot gnu.org wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 > > --- Comment #12 from Jakub Jelinek --- > Just going from the demangled name of > std::pair std::chrono::duration > > const, > Context*> > it would surprise me if it was ODR violation in the testcase, because class > Context > is only defined in Timer.ii, not in the other preprocessed source, > ceph::mono_clock is defined in both but looks the same (and it is empty class > anyway), and the pair uses Context* as second type anyway, so it is unclear > how > it could become something smaller than pointer. > But, admittedly I haven't looked up at this under the debugger and not even > sure where to look at that. Might be also an interaction with IPA ICF in case there's a pointer to the pair involved?
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #12 from Jakub Jelinek --- Just going from the demangled name of std::pair > > const, Context*> it would surprise me if it was ODR violation in the testcase, because class Context is only defined in Timer.ii, not in the other preprocessed source, ceph::mono_clock is defined in both but looks the same (and it is empty class anyway), and the pair uses Context* as second type anyway, so it is unclear how it could become something smaller than pointer. But, admittedly I haven't looked up at this under the debugger and not even sure where to look at that.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #11 from Jan Hubicka --- If there are two ODR types with same ODR name one with integer and other with pointer types third field, then indeed we should get ODR warning and give up on handling them as ODR types for type merging. So dumping their assembler names would be useful starting point. Of course if you have two ODR types of different names but you mix them up in COMDAT function of same name, then the warning will not trigger, so this might be some missing type compatibility check in ipa-sra or ipa-prop summary, too.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #10 from Richard Biener --- I see the 'pair' type is marked TYPE_CXX_ODR_P, I'd say you should see a ODR type violation diagnostic, and if you don't, this means we force different alias sets for both? Not sure - Honza added this stuff. It only affects TYPE_CANONICAL though, regular type merging shouldn't merge them but it's likely that you get to see another type because of COMDATs and symbol merging chosing a different prevailing function which has that other type? Btw, can you dump the mangled name of the type? It should be type_with_linkage_p () I think, of course 'pair' itself is a template so only a specific instantiation should be subject to ODR. (of course there might be ODR functions that use different instantiated pair in the signature ..)
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #9 from Martin Jambor --- SRA creates the replacements (in GCC 13) during total scalarization, i.e. the bit that is not driven by pre-existing accesses to aggregates, but because it sees an aggregate that is small and regular and so it is split according to its type in the hope it will go away. Unfortunately in the LTO and non-LTO case, they see a different type. I have added a dumping of types and fields of totally scalarized records and got the following. In the non-LTO case, the type of the aggregate is: constant 128> unit-size constant 16> align:64 warn_if_not_align:0 symtab:1430035184 alias-set -1 canonical-type 0x553cabd0 ... and specifically its third field is a pointer: pointer_to_this > unsigned DI size unit-size align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x562729d8 reference_to_this > used unsigned nonlocal decl_3 DI /usr/include/c++/13/bits/stl_pair.h:194:11 size constant 64> unit-size constant 8> align:64 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 0 offset constant 0> bit-offset constant 64> context > However, in the LTO case the type of the aggregate is: constant 128> unit-size constant 16> align:64 warn_if_not_align:0 symtab:0 alias-set 98 canonical-type 0x61cc1498 ... which however has an unsigned int as its third field: unit-size align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 0x62410690 precision:32 min max pointer_to_this reference_to_this > unsigned nonlocal SI /usr/include/c++/13/bits/stl_pair.h:194:11 size constant 32> unit-size constant 4> align:32 warn_if_not_align:0 offset_align 128 decl_not_flexarray: 0 offset constant 0> bit-offset constant 64> context > An so only an unsigned int replacement is created. The name of the aggregate indicates it has been created by IPA-SRA and so that is where I am looking right now, but IPA-SRA simply takes (and streams) the type of the access in the original function body for these. Can't this perhaps be some type-merging issue?
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Richard Biener changed: What|Removed |Added CC||hubicka at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Target||aarch64 --- Comment #8 from Richard Biener --- All these might also point to IPA mod-ref issues.
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 Andrew Pinski changed: What|Removed |Added CC||pinskia at gcc dot gnu.org --- Comment #7 from Andrew Pinski --- I am going to double check something but it might be related to PR 112616 .
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #6 from Jakub Jelinek --- Created attachment 57061 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57061=edit SloppyCRCMap.ii.xz When using current GCC trunk instead of current GCC 13 branch (CrtStuff.i.xz is the same between both), the problematic optimization doesn't happen, both before SRA and after it the code has MEM[(struct pair *)_5 + 32B] = MEM[(const struct pair &)__args#0_4(D)];
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #5 from Jakub Jelinek --- Created attachment 57060 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=57060=edit Timer.ii.xz from GCC 14
[Bug ipa/113359] [13 Regression] LTO miscompilation of ceph on aarch64
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113359 --- Comment #4 from Jakub Jelinek --- Will try to reproduce with trunk next (unfortunately due to the builtin trait changes the GCC 13 preprocessed sources can't be compiled with GCC 14).