[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #13 from andysem at mail dot ru ---
Ok. For the record, opened bug 77845.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #12 from Andrew Pinski ---
(In reply to andysem from comment #10)
> (In reply to Andrew Pinski from comment #9)
> >
> > I think this testcase is violating C++ ODR. In that
> > INSTRUCTION_SET::my_simd_func_impl is the same between the TUs. If you had
> > used an anonymous namespace, it should have worked correctly. If anonymous
> > namespace does not work, please file a separate bug.
>
> INSTRUCTION_SET is defined differently for the two translation units, so we
> essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This
> does not violate ODR.

Oh, I did not notice that INSTRUCTION_SET was defined on the command line. As I said, please file a different bug.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski changed:

           What    |Removed      |Added
         Status    |UNCONFIRMED  |RESOLVED
         Resolution|---          |FIXED

--- Comment #11 from Andrew Pinski ---
(In reply to andysem from comment #10)
> (In reply to Andrew Pinski from comment #9)
> >
> > I think this testcase is violating C++ ODR. In that
> > INSTRUCTION_SET::my_simd_func_impl is the same between the TUs. If you had
> > used an anonymous namespace, it should have worked correctly. If anonymous
> > namespace does not work, please file a separate bug.
>
> INSTRUCTION_SET is defined differently for the two translation units, so we
> essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This
> does not violate ODR.

Open a new bug.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

andysem at mail dot ru changed:

           What    |Removed   |Added
         Status    |RESOLVED  |UNCONFIRMED
         Resolution|FIXED     |---

--- Comment #10 from andysem at mail dot ru ---
(In reply to Andrew Pinski from comment #9)
>
> I think this testcase is violating C++ ODR. In that
> INSTRUCTION_SET::my_simd_func_impl is the same between the TUs. If you had
> used an anonymous namespace, it should have worked correctly. If anonymous
> namespace does not work, please file a separate bug.

INSTRUCTION_SET is defined differently for the two translation units, so we essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This does not violate ODR.
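
To illustrate the point made in comment #10, here is a minimal sketch of how a namespace named by a command-line macro gives each translation unit its own function. It is not the attached testcase: the function body, the length parameter `n`, the wrapper `my_simd_func_entry`, and the fallback `#define` are all assumptions made so that the sketch compiles on its own. Only the names INSTRUCTION_SET and my_simd_func_impl come from the thread.

```cpp
#include <cstddef>

// Assumption: the build passes -DINSTRUCTION_SET=sse2 for one translation
// unit and -DINSTRUCTION_SET=avx for the other. The fallback below exists
// only so that this sketch compiles without extra flags.
#ifndef INSTRUCTION_SET
#define INSTRUCTION_SET sse2
#endif

// The macro is expanded before the namespace is declared, so this opens
// namespace sse2 or namespace avx depending on the command line.
namespace INSTRUCTION_SET {

// Hypothetical body: a plain loop standing in for the SIMD code.
inline void my_simd_func_impl(const unsigned char* in, unsigned char* out,
                              std::size_t n) {
    for (std::size_t i = 0; i < n; ++i)
        out[i] = static_cast<unsigned char>(in[i] + 1);
}

}  // namespace INSTRUCTION_SET

// Hypothetical wrapper. The call resolves to sse2::my_simd_func_impl in one
// translation unit and to avx::my_simd_func_impl in the other, which are two
// distinct entities with distinct mangled names.
void my_simd_func_entry(const unsigned char* in, unsigned char* out,
                        std::size_t n) {
    INSTRUCTION_SET::my_simd_func_impl(in, out, n);
}
```

Because the two translation units define differently named functions, there is no single entity with two definitions, which is the argument that the testcase does not violate the ODR.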
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski changed:

           What    |Removed      |Added
         Status    |UNCONFIRMED  |RESOLVED
         Resolution|---          |FIXED

--- Comment #9 from Andrew Pinski ---
(In reply to andysem from comment #8)
> Created attachment 39751 [details]
> A new testcase which produces invalid code with gcc 5.4

I think this testcase is violating C++ ODR. In that INSTRUCTION_SET::my_simd_func_impl is the same between the TUs. If you had used an anonymous namespace, it should have worked correctly. If anonymous namespace does not work, please file a separate bug.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #8 from andysem at mail dot ru ---
Created attachment 39751
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39751&action=edit
A new testcase which produces invalid code with gcc 5.4
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

andysem at mail dot ru changed:

           What    |Removed   |Added
         Status    |RESOLVED  |UNCONFIRMED
         Resolution|FIXED     |---

--- Comment #7 from andysem at mail dot ru ---
I believe this bug is not yet fixed in gcc 5.4 on Kubuntu 16.04 x86_64.

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu 5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs --enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr --program-suffix=-5 --enable-shared --enable-linker-build-id --libexecdir=/usr/lib --without-included-gettext --enable-threads=posix --libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu --enable-libstdcxx-debug --enable-libstdcxx-time=yes --with-default-libstdcxx-abi=new --enable-gnu-unique-object --disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib --disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo --with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home --with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64 --with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64 --with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar --enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686 --with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib --with-tune=generic --enable-checking=release --build=x86_64-linux-gnu --host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2)

I attached a new testcase. The compiler still produces AVX instructions in both the _Z17my_simd_func_sse2PKhPh and _Z16my_simd_func_avxPKhPh functions.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski changed:

           What          |Removed      |Added
         Status          |UNCONFIRMED  |RESOLVED
         Resolution      |---          |FIXED
         Target Milestone|---          |4.9.0

--- Comment #6 from Andrew Pinski ---
Fixed in 4.9
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #5 from Andrew Pinski ---
I think this has been fixed already in GCC 5 (maybe even in 4.9). Can you try GCC 5 and see if it has been fixed?
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed  |Added
         Keywords  |         |lto

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Known issue (there are duplicate bugs for this AFAIK).
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #2 from andysem at mail dot ru ---
(In reply to Andi Kleen from comment #1)
> Yes, LTO doesn't support different options for different files; it combines
> some of them (which happens in your case) and ignores some others.
>
> You could tag the functions in the different file with
> __attribute__((target(...))). This will also allow automatic switching.
>
> Arguably gcc should do this automatically for LTO, but unfortunately it
> doesn't.

Unfortunately, gcc does not allow using SIMD intrinsics if they are not enabled by compiler switches, so leaving the compiler options for a generic target CPU wouldn't work. At least that is the case with gcc 4.8.

> Or alternatively, don't compile the file that needs the changed options with
> LTO.

Yes, I'm currently not using LTO in my real-world project that exhibits this problem. But users of my project would like to enable LTO, and currently this silently produces incorrect binaries. The purpose of this ticket is to indicate the problem and suggest a possible solution (automatically marking each function in every translation unit with the target options).
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #3 from Andi Kleen <andi-gcc at firstfloor dot org> ---
> Unfortunately, gcc does not allow using SIMD intrinsics if not enabled by
> compiler switches, so leaving the compiler options for a generic target CPU
> wouldn't work. At least that is the case with gcc 4.8.

This has been fixed in 4.9.
[Bug lto/61043] LTO accumulates CPU requirements from all input objects
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andi Kleen <andi-gcc at firstfloor dot org> changed:

           What    |Removed  |Added
         CC        |         |andi-gcc at firstfloor dot org

--- Comment #1 from Andi Kleen <andi-gcc at firstfloor dot org> ---
Yes, LTO doesn't support different options for different files; it combines some of them (which happens in your case) and ignores some others.

You could tag the functions in the different file with __attribute__((target(...))). This will also allow automatic switching.

Arguably gcc should do this automatically for LTO, but unfortunately it doesn't.

Or alternatively, don't compile the file that needs the changed options with LTO.
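
The workaround suggested in comment #1 could look roughly like the sketch below. It assumes an x86 or x86-64 target and a compiler that accepts intrinsics inside target-attributed functions (GCC 4.9 or later, per comment #3). The function names follow the mangled symbols quoted in comment #7; the bodies, the dispatcher `my_simd_func`, and the 16-byte block size are hypothetical and not taken from the attached testcase.

```cpp
#include <cstdint>
#include <immintrin.h>

// Code generation for this function is limited to SSE2, whatever -m options
// the translation unit (or the LTO link step) ends up with.
__attribute__((target("sse2")))
void my_simd_func_sse2(const std::uint8_t* in, std::uint8_t* out) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out),
                     _mm_add_epi8(v, _mm_set1_epi8(1)));
}

// Same operation, but AVX is enabled for this function only, so the compiler
// is free to use VEX-encoded instructions here.
__attribute__((target("avx")))
void my_simd_func_avx(const std::uint8_t* in, std::uint8_t* out) {
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(in));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(out),
                     _mm_add_epi8(v, _mm_set1_epi8(1)));
}

// Hypothetical dispatcher: picks an implementation at run time so that the
// AVX variant is never executed on a CPU without AVX support.
void my_simd_func(const std::uint8_t* in, std::uint8_t* out) {
    if (__builtin_cpu_supports("avx"))
        my_simd_func_avx(in, out);
    else
        my_simd_func_sse2(in, out);
}
```

With the requirement attached to each function rather than to the command line, the instruction set travels with the function into the link-time optimization step, which is what the reporter proposes the compiler should do automatically.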