[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread andysem at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #13 from andysem at mail dot ru ---
OK. For the record, I opened bug 77845.

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #12 from Andrew Pinski  ---
(In reply to andysem from comment #10)
> (In reply to Andrew Pinski from comment #9)
> > 
> > I think this testcase is violating C++ ODR.  In that
> > INSTRUCTION_SET::my_simd_func_impl is the same between the TUs.  If you had
> > used an anonymous namespace, it should have worked correctly.  If anonymous
> > namespace does not work, please file a separate bug.
> 
> INSTRUCTION_SET is defined differently for the two translation units, so we
> essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This
> does not violate ODR.

Oh, I did not notice that INSTRUCTION_SET was defined on the command line. As I
said, please file a different bug.

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #11 from Andrew Pinski  ---
(In reply to andysem from comment #10)
> (In reply to Andrew Pinski from comment #9)
> > 
> > I think this testcase is violating C++ ODR.  In that
> > INSTRUCTION_SET::my_simd_func_impl is the same between the TUs.  If you had
> > used an anonymous namespace, it should have worked correctly.  If anonymous
> > namespace does not work, please file a separate bug.
> 
> INSTRUCTION_SET is defined differently for the two translation units, so we
> essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This
> does not violate ODR.

Open a new bug.

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread andysem at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

andysem at mail dot ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |---

--- Comment #10 from andysem at mail dot ru ---
(In reply to Andrew Pinski from comment #9)
> 
> I think this testcase is violating C++ ODR.  In that
> INSTRUCTION_SET::my_simd_func_impl is the same between the TUs.  If you had
> used an anonymous namespace, it should have worked correctly.  If anonymous
> namespace does not work, please file a separate bug.

INSTRUCTION_SET is defined differently for the two translation units, so we
essentially have sse2::my_simd_func_impl and avx::my_simd_func_impl. This does
not violate ODR.
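
The attachment itself is not reproduced in this thread, so the following is only a guess at the pattern being described: a shared implementation file whose namespace name comes from a macro, compiled once per instruction set. The function body, the CONCAT helper, the build commands in the comments, and the fallback value of INSTRUCTION_SET are all assumptions for illustration. The entry-point signature is inferred from the mangled names quoted in comment #7.

```cpp
// Hypothetical reconstruction, NOT the actual attachment 39751.
// Assumed build (one object per instruction set):
//   g++ -c -flto -msse2 -DINSTRUCTION_SET=sse2 impl.cpp -o impl_sse2.o
//   g++ -c -flto -mavx  -DINSTRUCTION_SET=avx  impl.cpp -o impl_avx.o
#include <cassert>
#include <cstddef>

#ifndef INSTRUCTION_SET
#define INSTRUCTION_SET sse2  // fallback so the sketch compiles standalone
#endif

// Two-step concatenation so INSTRUCTION_SET is expanded before pasting.
#define CONCAT_IMPL(a, b) a##b
#define CONCAT(a, b) CONCAT_IMPL(a, b)

namespace INSTRUCTION_SET {

// Becomes sse2::my_simd_func_impl in one TU and avx::my_simd_func_impl in
// the other: two differently named entities rather than one entity with
// two definitions.
inline void my_simd_func_impl(const unsigned char* in, unsigned char* out)
{
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < 16; ++i)
        out[i] = static_cast<unsigned char>(in[i] + 1);  // placeholder work
}

}  // namespace INSTRUCTION_SET

// Expands to my_simd_func_sse2 or my_simd_func_avx.
void CONCAT(my_simd_func_, INSTRUCTION_SET)(const unsigned char* in,
                                            unsigned char* out)
{
    INSTRUCTION_SET::my_simd_func_impl(in, out);
}
```

With this layout the two objects export my_simd_func_sse2 and my_simd_func_avx, and each helper lives in its own named namespace, which is the basis of the claim that the ODR is not violated.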

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED

--- Comment #9 from Andrew Pinski  ---
(In reply to andysem from comment #8)
> Created attachment 39751 [details]
> A new testcase which produces invalid code with gcc 5.4

I think this testcase is violating C++ ODR.  In that
INSTRUCTION_SET::my_simd_func_impl is the same between the TUs.  If you had
used an anonymous namespace, it should have worked correctly.  If anonymous
namespace does not work, please file a separate bug.
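
As a minimal sketch of the anonymous-namespace suggestion: a helper placed in an unnamed namespace has internal linkage, so each translation unit gets its own private copy and there are no same-named external symbols to merge across objects. The function names and bodies below are hypothetical.

```cpp
// Hypothetical contents of one translation unit, e.g. built with -mavx.
#include <cassert>
#include <cstddef>

namespace {

// Internal linkage: this helper cannot clash with a helper of the same
// name defined in another translation unit.
void my_simd_func_impl(const unsigned char* in, unsigned char* out)
{
    assert(in != nullptr && out != nullptr);
    for (std::size_t i = 0; i < 16; ++i)
        out[i] = static_cast<unsigned char>(in[i] ^ 0xFF);  // placeholder work
}

}  // namespace

// Only this entry point has external linkage.
void my_simd_func_avx(const unsigned char* in, unsigned char* out)
{
    my_simd_func_impl(in, out);
}
```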

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread andysem at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #8 from andysem at mail dot ru ---
Created attachment 39751
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39751&action=edit
A new testcase which produces invalid code with gcc 5.4

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-10-04 Thread andysem at mail dot ru
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

andysem at mail dot ru changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |UNCONFIRMED
         Resolution|FIXED                       |---

--- Comment #7 from andysem at mail dot ru ---
I believe this bug is not yet fixed in gcc 5.4 on Kubuntu 16.04 x86_64.

$ g++ -v
Using built-in specs.
COLLECT_GCC=g++
COLLECT_LTO_WRAPPER=/usr/lib/gcc/x86_64-linux-gnu/5/lto-wrapper
Target: x86_64-linux-gnu
Configured with: ../src/configure -v --with-pkgversion='Ubuntu
5.4.0-6ubuntu1~16.04.2' --with-bugurl=file:///usr/share/doc/gcc-5/README.Bugs
--enable-languages=c,ada,c++,java,go,d,fortran,objc,obj-c++ --prefix=/usr
--program-suffix=-5 --enable-shared --enable-linker-build-id
--libexecdir=/usr/lib --without-included-gettext --enable-threads=posix
--libdir=/usr/lib --enable-nls --with-sysroot=/ --enable-clocale=gnu
--enable-libstdcxx-debug --enable-libstdcxx-time=yes
--with-default-libstdcxx-abi=new --enable-gnu-unique-object
--disable-vtable-verify --enable-libmpx --enable-plugin --with-system-zlib
--disable-browser-plugin --enable-java-awt=gtk --enable-gtk-cairo
--with-java-home=/usr/lib/jvm/java-1.5.0-gcj-5-amd64/jre --enable-java-home
--with-jvm-root-dir=/usr/lib/jvm/java-1.5.0-gcj-5-amd64
--with-jvm-jar-dir=/usr/lib/jvm-exports/java-1.5.0-gcj-5-amd64
--with-arch-directory=amd64 --with-ecj-jar=/usr/share/java/eclipse-ecj.jar
--enable-objc-gc --enable-multiarch --disable-werror --with-arch-32=i686
--with-abi=m64 --with-multilib-list=m32,m64,mx32 --enable-multilib
--with-tune=generic --enable-checking=release --build=x86_64-linux-gnu
--host=x86_64-linux-gnu --target=x86_64-linux-gnu
Thread model: posix
gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.2)

I attached a new testcase. The compiler still produces AVX instructions in both
_Z17my_simd_func_sse2PKhPh and _Z16my_simd_func_avxPKhPh functions.

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-08-13 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|---                         |FIXED
   Target Milestone|---                         |4.9.0

--- Comment #6 from Andrew Pinski  ---
Fixed in 4.9

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2016-08-13 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #5 from Andrew Pinski  ---
I think this has been fixed already in GCC 5 (maybe even in 4.9).  Can you try
GCC 5 and see if it has been fixed?

[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2014-05-05 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |lto

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Known issue (there are duplicate bugs for this AFAIK).


[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2014-05-04 Thread andysem at mail dot ru
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #2 from andysem at mail dot ru ---
(In reply to Andi Kleen from comment #1)
> Yes LTO doesn't support different options for different files, and combines
> some of them (which happens in your case) and ignores some others.
> 
> You could tag the functions in the different file with
> 
> __attribute__((target(...)))
> 
> This will also allow automatic switching.
> 
> Arguably gcc should do this automatically for LTO, but unfortunately it
> doesn't

Unfortunately, gcc does not allow using SIMD intrinsics if not enabled by
compiler switches, so leaving the compiler options for a generic target CPU
wouldn't work. At least that is the case with gcc 4.8.

> Or alternatively don't compile the file that needs the changed options with
> LTO

Yes, I'm currently not using LTO in my real world project that exhibits this
problem. But users of my project would like to enable LTO, and currently this
silently produces incorrect binaries. The purpose of this ticket is to indicate
the problem and suggest a possible solution (automatically marking each
function in every translation unit with the target options).
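
What the ticket asks the compiler to do automatically can be approximated by hand. As a rough sketch, GCC's `#pragma GCC target` applies target options to every function defined after it, and `#pragma GCC push_options` / `#pragma GCC pop_options` limit the scope. The function names are hypothetical, and GCC on an x86 target is assumed.

```cpp
// Sketch, assuming GCC on x86/x86-64; all function names are made up.
#include <cassert>
#include <cstddef>

#pragma GCC push_options
#pragma GCC target("avx")
// Every function defined in this region is compiled as if -mavx were given.
void scale_avx(const float* in, float* out, std::size_t n, float k)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] * k;
}
#pragma GCC pop_options

// Compiled with whatever options were given on the command line.
void scale_generic(const float* in, float* out, std::size_t n, float k)
{
    for (std::size_t i = 0; i < n; ++i)
        out[i] = in[i] * k;
}

// Runtime dispatch so the AVX version never runs on a CPU without AVX.
void scale(const float* in, float* out, std::size_t n, float k)
{
    assert(in != nullptr && out != nullptr);
    if (__builtin_cpu_supports("avx"))
        scale_avx(in, out, n, k);
    else
        scale_generic(in, out, n, k);
}
```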


[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2014-05-04 Thread andi-gcc at firstfloor dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

--- Comment #3 from Andi Kleen andi-gcc at firstfloor dot org ---
> Unfortunately, gcc does not allow using SIMD intrinsics if not enabled by
> compiler switches, so leaving the compiler options for a generic target CPU
> wouldn't work. At least that is the case with gcc 4.8.

This has been fixed in 4.9


[Bug lto/61043] LTO accumulates CPU requirements from all input objects

2014-05-03 Thread andi-gcc at firstfloor dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=61043

Andi Kleen andi-gcc at firstfloor dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |andi-gcc at firstfloor dot org

--- Comment #1 from Andi Kleen andi-gcc at firstfloor dot org ---
Yes LTO doesn't support different options for different files, and combines
some of them (which happens in your case) and ignores some others.

You could tag the functions in the different file with

__attribute__((target(...)))

This will also allow automatic switching.

Arguably gcc should do this automatically for LTO, but unfortunately it doesn't.

Alternatively, don't compile the file that needs the changed options with LTO.
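
A minimal sketch of the attribute-based approach might look like the following. Note the double parentheses that the `__attribute__` syntax requires. The function names are hypothetical and an x86 target is assumed. Using intrinsics inside a `target` function without the matching command-line switch relies on GCC 4.9 or later, as noted in comment #3.

```cpp
// Sketch, assuming GCC >= 4.9 on x86/x86-64; function names are made up.
#include <cassert>
#include <cstddef>
#include <immintrin.h>

// Baseline version: 128-bit SSE arithmetic, four floats per step.
__attribute__((target("sse2")))
void add_sse2(const float* a, const float* b, float* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4)
        _mm_storeu_ps(out + i,
                      _mm_add_ps(_mm_loadu_ps(a + i), _mm_loadu_ps(b + i)));
    for (; i < n; ++i)  // scalar tail
        out[i] = a[i] + b[i];
}

// AVX version: 256-bit arithmetic, eight floats per step.
__attribute__((target("avx")))
void add_avx(const float* a, const float* b, float* out, std::size_t n)
{
    std::size_t i = 0;
    for (; i + 8 <= n; i += 8)
        _mm256_storeu_ps(out + i,
                         _mm256_add_ps(_mm256_loadu_ps(a + i),
                                       _mm256_loadu_ps(b + i)));
    for (; i < n; ++i)  // scalar tail
        out[i] = a[i] + b[i];
}

// Picks an implementation at run time based on what the CPU supports.
void add(const float* a, const float* b, float* out, std::size_t n)
{
    assert(a != nullptr && b != nullptr && out != nullptr);
    if (__builtin_cpu_supports("avx"))
        add_avx(a, b, out, n);
    else
        add_sse2(a, b, out, n);
}
```

Because the target is attached to each function rather than to the whole object file, both versions can live in a single translation unit built with generic options.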