[Bug other/59648] -O2 compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

--- Comment #2 from David Kredba nheghathivhistha at gmail dot com ---
I am sorry, but the Xorg developers say this is a GCC problem:
https://bugs.freedesktop.org/show_bug.cgi?id=71127

-O2 compilation is widely used in distributions, so in my opinion this issue needs to be resolved on one side or the other. Could you please re-open this, or say what is wrong with the xorg-server sources?

Thank you in advance.
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

Andrew Pinski pinskia at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |WAITING
   Last reconfirmed|                            |2014-01-01
         Resolution|INVALID                     |---
            Summary|-O2 compilation of          |-O2 -flto compilation of
                   |xorg-server-1.15.0 fails    |xorg-server-1.15.0 fails
     Ever confirmed|0                           |1

--- Comment #3 from Andrew Pinski pinskia at gcc dot gnu.org ---
This might still not be a GCC bug. Note the first warning is definitely not a GCC bug and should be fixed in the sources:

1.15.0/xkb/xkbInit.c:690:22: warning: type of 'XkbDfltAccessXOptions' does not match original declaration [enabled by default]
 extern unsigned char XkbDfltAccessXOptions;
                      ^
/var/tmp/portage/x11-base/xorg-server-1.15.0/work/xorg-server-1.15.0/xkb/xkbAccessX.c:58:16: note: previously declared here
 unsigned short XkbDfltAccessXOptions =
                ^
--- CUT ---

You are going to have to try to reduce the testcase. Also, this one file alone is not enough to reproduce the issue, as it seems to happen only with -flto.
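
The mismatch in that first warning (an `unsigned char` declaration in one file, an `unsigned short` definition in another) is the kind of thing a single shared declaration prevents. The following is only a minimal single-file sketch of that idea, not the actual xorg-server fix; the names `dflt_access_options`, `set_access_options` and `get_access_options` are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* In a real project this declaration would live in one shared header
   (hypothetical "xkb_options.h") included by every file that uses the
   variable, including the file that defines it. */
extern unsigned short dflt_access_options;

/* Because the definition sees the declaration above, a type mismatch
   (for example 'unsigned char' vs 'unsigned short') becomes a compile
   error here instead of a silent disagreement between two files. */
unsigned short dflt_access_options = 0;

/* Store option bits; they must fit the declared 16-bit type. */
void set_access_options(unsigned bits)
{
    assert(bits <= USHRT_MAX);
    dflt_access_options = (unsigned short)bits;
}

/* Read the option bits back through the same declaration. */
unsigned get_access_options(void)
{
    return dflt_access_options;
}
```

A value such as 0x1ff does not fit in an `unsigned char`, which is why a reader using the narrower declaration could see different bits than the writer stored.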
[Bug c++/59633] [4.7/4.8/4.9 Regression] ICE with __attribute((vector_size(...))) for enum
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59633

--- Comment #3 from Marc Glisse glisse at gcc dot gnu.org ---
(In reply to Volker Reichelt from comment #2)
> Well, because the C-frontend compiles it, the C++-frontend used to compile
> it and even clang (3.2) compiles it, I was under the impression that this
> should compile (using the underlying type of the enum).

Ok, I'll let someone else decide what behavior is wanted.

> And of course, the docs are at least incomplete, if not inaccurate.
> E.g. the vector extension of the ternary operator ?: is missing in this
> chapter.

The doc for ?: is under review. If other parts are incomplete or inaccurate, don't hesitate to file bugs (or even post doc patches).
[Bug fortran/59654] [OOP] Broken function table with complex OO use case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59654

janus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |wrong-code
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2014-01-01
                 CC|                            |janus at gcc dot gnu.org
            Summary|Broken function table with  |[OOP] Broken function table
                   |complex OO use case         |with complex OO use case
     Ever confirmed|0                           |1
      Known to fail|                            |4.8.1, 4.9.0

--- Comment #7 from janus at gcc dot gnu.org ---
I can confirm the (supposedly wrong) runtime behavior with 4.8 and trunk. 4.7 does not compile the test case.

Uncommenting the private statement in line 144 only changes the behavior with 4.8, but my trunk build still yields the 'wrong' output.

I tried to use -fdump-tree-original to see what changes in the generated code when flipping the private statement with 4.8, but that does not show *any* difference.
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

--- Comment #4 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
Xext/panoramiX.c sets:

int PanoramiXNumScreens = 0;

and events.i has:

extern __attribute__((visibility("default"))) int PanoramiXNumScreens;
...
typedef struct {
  int screens[16];
  int numScreens;
} ScreenInfo;
ScreenInfo screenInfo;
int fn1() {
  if (noPanoramiXExtension) {
    int i;
    i = PanoramiXNumScreens - 1;
    while (i--)
      CheckVirtualMotion_x = (long)screenInfo.screens[i];
  }
  return 0;
}

Which is clearly invalid.
Setting PanoramiXNumScreens = 1 in Xext/panoramiX.c fixes the issue.
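
To spell out why the loop above is invalid when the screen count is 0: `i` starts at -1, the `while (i--)` test sees a non-zero value, and the body then indexes `screens[-2]`. A minimal sketch of a guarded variant follows. It is an illustration only, not the xorg-server fix, and `MAX_SCREENS`, `last_visited_screen` and `fallback` are hypothetical names.

```c
#include <assert.h>

#define MAX_SCREENS 16

typedef struct {
    int screens[MAX_SCREENS];
    int numScreens;
} ScreenInfo;

/* Visits screens[num_screens - 2] down to screens[0], the same elements
   the original 'i = n - 1; while (i--)' loop visits when n >= 1, and
   returns the last value read. Returns 'fallback' when there is nothing
   to visit, including n == 0, where the original loop would start
   reading at index -2. */
long last_visited_screen(const ScreenInfo *info, int num_screens, long fallback)
{
    long last = fallback;
    int i = num_screens - 1;

    assert(num_screens <= MAX_SCREENS);
    while (i > 0) {          /* never entered for num_screens <= 1 */
        --i;
        last = (long)info->screens[i];
    }
    return last;
}
```

With `num_screens` equal to 3 the loop reads indices 1 and 0; with 0 or 1 it reads nothing, so no negative subscript can occur.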
[Bug fortran/59654] [OOP] Broken function table with complex OO use case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59654

tlcclt Thomas.L.Clune at nasa dot gov changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
  Attachment #31554|0                           |1
        is obsolete|                            |

--- Comment #8 from tlcclt Thomas.L.Clune at nasa dot gov ---
Created attachment 31556
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31556&action=edit
Updated UML diagram

I've updated/corrected the UML. Previous version omitted the ConcreteSurrogate class and had some of the associations off. New version also reflects all has-a relationships.
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

--- Comment #5 from David Kredba nheghathivhistha at gmail dot com ---
I tried to write a script for c-reduce. The compiler writes its output in two steps, but grep does not wait, and c-reduce would not accept the case as valid for reduction because the test's exit status was not OK.

When I modified it this way:

#!/bin/bash
TESTCASE=${1:-testcase.i}
x86_64-pc-linux-gnu-gcc -std=gnu99 -Wall -Wpointer-arith \
  -Wmissing-declarations -Wformat=2 -Wstrict-prototypes \
  -Wmissing-prototypes -Wnested-externs -Wbad-function-cast \
  -Wold-style-definition -Wdeclaration-after-statement -Wunused \
  -Wuninitialized -Wshadow -Wmissing-noreturn \
  -Wmissing-format-attribute -Wredundant-decls -Wlogical-op \
  -Werror=implicit -Werror=nonnull -Werror=init-self -Werror=main \
  -Werror=missing-braces -Werror=sequence-point -Werror=return-type \
  -Werror=trigraphs -Werror=array-bounds -Werror=write-strings \
  -Werror=address -Werror=int-to-pointer-cast \
  -Werror=pointer-to-int-cast -fno-strict-aliasing -fvisibility=hidden \
  -O2 -ggdb -pipe -march=native -mtune=native -flto=4 -o /dev/null \
  /home/dave2/$TESTCASE /home/dave2/libxservertest.a > /home/dave2/test.txt 2>&1
cat /home/dave2/test.txt | grep -q 'error: array subscript is below array bounds'
if ! test $? = 0; then
  exit 1
fi
exit 0

then grep returned the expected exit code, but c-reduce was not able to remove any line.
I think I am missing some very basic thing here :-(.

PS: I know that grep can be called with a file name, without cat and a pipe, but that did not work either.
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

--- Comment #6 from David Kredba nheghathivhistha at gmail dot com ---
Maybe I should reduce a single ii file containing events.i together with all the ii files that make up libxservertest.a?
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648

--- Comment #7 from Markus Trippelsdorf trippels at gcc dot gnu.org ---
You cannot use full paths for your test output, because creduce will run each iteration in a new directory.
So try changing it from:

-flto=4 -o /dev/null /home/dave2/$TESTCASE /home/dave2/libxservertest.a > /home/dave2/test.txt 2>&1
cat /home/dave2/test.txt | grep ...

to:

-flto=4 -o /dev/null $TESTCASE /home/dave2/libxservertest.a > test.txt 2>&1
cat test.txt | grep ...

To reduce /home/dave2/libxservertest.a, first extract its contents:

ar x /home/dave2/libxservertest.a

and then produce a list of the object files:

ls -al *.o > list

Edit list and prepend the full path to all object files.
Then run delta on list as described here:
http://gcc.gnu.org/wiki/A_guide_to_testcase_reduction#Reducing_LTO_bugs
and generate preprocessed source for the files.
Now reduce the preprocessed files one by one.

After a few iterations you'll end up with something like:

x4 test # cat test.i
long a;
extern int noPanoramiXExtension, PanoramiXNumScreens, CheckVirtualMotion_x;
typedef struct {
  int screens[16];
  int numScreens;
} ScreenInfo;
ScreenInfo screenInfo;
int fn1() {
  if (noPanoramiXExtension) {
    int i;
    i = PanoramiXNumScreens - 1;
    while (i--)
      CheckVirtualMotion_x = (long)screenInfo.screens[i];
  }
  return 0;
}

int main() {
  screenInfo.numScreens = 0;
  a = (long)*fn1;
  return 0;
}
x4 test # cat panoramiX.i
int PanoramiXNumScreens;
int noPanoramiXExtension=1;
int CheckVirtualMotion_x;

x4 test # gcc -O2 -Werror=array-bounds test.i panoramiX.i
x4 test # gcc -O2 -flto -Werror=array-bounds test.i panoramiX.i
test.i: In function ‘fn1’:
test.i:13:54: warning: iteration 16 invokes undefined behavior [-Waggressive-loop-optimizations]
       CheckVirtualMotion_x = (long)screenInfo.screens[i];
                                                      ^
test.i:12:11: note: containing loop
     while (i--)
           ^
test.i:13:54: error: array subscript is below array bounds [-Werror=array-bounds]
       CheckVirtualMotion_x = (long)screenInfo.screens[i];
                                                      ^
lto1: some warnings being treated as errors
lto-wrapper: /usr/x86_64-pc-linux-gnu/gcc-bin/4.9.0/gcc returned 1 exit status
/usr/lib/gcc/x86_64-pc-linux-gnu/4.9.0/../../../../x86_64-pc-linux-gnu/bin/ld: fatal error: lto-wrapper failed
collect2: error: ld returned 1 exit status
x4 test #
[Bug other/59648] -O2 -flto compilation of xorg-server-1.15.0 fails
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59648 --- Comment #8 from David Kredba nheghathivhistha at gmail dot com --- Thank you!
[Bug tree-optimization/59651] [4.9 Regression] Vectorizer failing to spot dependence causes incorrect code generation.
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59651 --- Comment #4 from belagod at gcc dot gnu.org --- Thanks for looking at this. Just to clarify, do you mean loop versioning happens in the up-counting loop? Because in the down-counting loop, a partition seems to be happening with 2 iterations of the loop getting vectorized and the remaining 2 are left scalar.
[Bug tree-optimization/59642] Performance regression (4.7/4.8) with -ftree-loop-distribute-patterns
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59642 --- Comment #2 from Marc Glisse glisse at gcc dot gnu.org --- (In reply to Marc Glisse from comment #1) I've noticed the same in other PRs, normally we manage to track the actual value of *p, but we don't manage that when *p was written by __builtin_mem*, which should still be doable: PR 58483 has an example with memcpy.
[Bug libstdc++/59656] New: weak_ptr::lock function crashes when compiling with -fno-exceptions flag
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59656

            Bug ID: 59656
           Summary: weak_ptr::lock function crashes when compiling with
                    -fno-exceptions flag
           Product: gcc
           Version: 4.8.1
            Status: UNCONFIRMED
          Severity: major
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: chus_flores at hotmail dot com

weak_ptr::lock crashes when the code is compiled with the -fno-exceptions flag and the data pointed to by the weak_ptr expires during the execution of the lock function itself:

(from http://gcc.gnu.org/onlinedocs/gcc-4.8.2/libstdc++/api/a01518_source.html)

      shared_ptr<_Tp>
494   lock() const noexcept
495   {
496 #ifdef __GTHREADS
497     if (this->expired())
498       return shared_ptr<_Tp>();
499
500     __try
501       {
502         return shared_ptr<_Tp>(*this);
503       }
504     __catch(const bad_weak_ptr&)
505       {
506         return shared_ptr<_Tp>();
507       }
508 #else
509     return this->expired() ? shared_ptr<_Tp>() : shared_ptr<_Tp>(*this);
510 #endif
511   }

If the data is valid when line 497 is executed and the data is released in a different thread just before executing line 502, the program will crash because it will try to throw an exception (exceptions are disabled because of the flag -fno-exceptions).

This code only works when exceptions are enabled, because the try/catch resolves the problem; otherwise it does not.

The standard says that this function must return safely and must not throw any exception. I presume this must apply even if exceptions are not enabled.
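
One common way to close the window between the expiry check and taking the reference, without relying on exceptions, is to increment the owner count only if it is still non-zero, in a single compare-and-swap loop. The sketch below shows that idea with C11 atomics. It is not libstdc++'s implementation, and `control_block`, `use_count` and `try_lock_ref` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    atomic_long use_count;   /* number of strong owners; 0 means expired */
} control_block;

/* Try to take a strong reference. Returns true and increments the count
   if the object is still alive; returns false, without signalling any
   error, if it has already expired. The check and the increment are one
   atomic step, so another thread cannot expire the object in between. */
bool try_lock_ref(control_block *cb)
{
    long n;

    assert(cb != NULL);
    n = atomic_load(&cb->use_count);
    while (n != 0) {
        /* On failure n is updated to the current count and we retry. */
        if (atomic_compare_exchange_weak(&cb->use_count, &n, n + 1))
            return true;
    }
    return false;
}
```

A caller would build an empty owner when `try_lock_ref` returns false, which matches the "return safely" behavior the report asks for.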
[Bug rtl-optimization/41171] register allocator undoing optimal schedule
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=41171

Steven Bosscher steven at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |steven at gcc dot gnu.org

--- Comment #9 from Steven Bosscher steven at gcc dot gnu.org ---
(In reply to Peter Bergner from comment #5)
> Looking at update_equiv_regs(), if I disable the replacement for regs that
> are local to one basic block (patch below) like it existed before John
> Wehle's patch way back in Oct 2000:
>
> http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00782.html
>
> then we get the ordering we want. Does anyone know why John removed that
> part of the test in his patch? Thoughts anyone?

To allow things to be moved around in, or out of loops.
[Bug fortran/59654] [OOP] Broken function table with complex OO use case
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59654

--- Comment #9 from janus at gcc dot gnu.org ---
Created attachment 31557
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31557&action=edit
reduce test case

Reduced test case. Should print '1' and does so with 4.7.4, but prints '0' with 4.8 and trunk. ICEs with 4.6.
[Bug c/59657] New: SSE intrinsics translates to AVX instructions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59657

            Bug ID: 59657
           Summary: SSE intrinsics translates to AVX instructions
           Product: gcc
           Version: unknown
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: oystein at gnubg dot org

Created attachment 31558
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=31558&action=edit
Example source code file

Happy new year!

I am writing code which should run on both SSE and AVX machines. I have manually vectorized the code for SSE and AVX in two different functions, and I use a function pointer to select the right function for the CPU at startup. The two functions are in the same translation unit. (See attached code)

Compiled with:
gcc -Wall -O3 -g -mavx sse_test.c -o sse_test

The problem is that the SSE intrinsics in the SSE function get translated to AVX instructions. This will of course give an illegal instruction on all non-AVX machines.

My gdb session:

Program received signal SIGILL, Illegal instruction.
0x08048452 in calculate_sse (data=data@entry=0xb5e0, scale=scale@entry=0.5, size=size@entry=256) at sse_test.c:33
33          for ( ; count-- ; p += 4 ){
(gdb) list
28
29      static void calculate_sse(float *data, float scale, int size )
30      {
31          int count = size >> 2;
32          float *p = data;
33          for ( ; count-- ; p += 4 ){
34              __m128 d = _mm_load_ps( p );
35              __m128 s = _mm_set1_ps( scale );
36              _mm_store_ps( p, _mm_mul_ps( d, s ));
37          }
(gdb) disassemble
Dump of assembler code for function calculate_sse:
   0x08048440 <+0>:     mov    0xc(%esp),%ecx
   0x08048444 <+4>:     mov    0x4(%esp),%eax
   0x08048448 <+8>:     sar    $0x2,%ecx
   0x0804844b <+11>:    test   %ecx,%ecx
   0x0804844d <+13>:    lea    -0x1(%ecx),%edx
   0x08048450 <+16>:    je     0x8048474 <calculate_sse+52>
=> 0x08048452 <+18>:    vbroadcastss 0x8(%esp),%xmm1
   0x08048459 <+25>:    lea    0x0(%esi,%eiz,1),%esi
   0x08048460 <+32>:    vmulps (%eax),%xmm1,%xmm0
   0x08048464 <+36>:    sub    $0x1,%edx
   0x08048467 <+39>:    add    $0x10,%eax
   0x0804846a <+42>:    vmovaps %xmm0,-0x10(%eax)
   0x0804846f <+47>:    cmp    $0x,%edx
   0x08048472 <+50>:    jne    0x8048460 <calculate_sse+32>
   0x08048474 <+52>:    repz ret
End of assembler dump.

(Arch linux)
[oystein@oysteins-laptop ~]$ gcc --version
gcc (GCC) 4.8.2 20131219 (prerelease)

Bug or feature? I'm not sure if this is the expected way the intrinsics should translate, but it was not what I expected. If it is supposed to be like this, can I get out of my problem without splitting the two functions into two translation units and using two different sets of compile options?

Thanks,
Øystein
[Bug c/59657] SSE intrinsics translates to AVX instructions
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59657

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |jakub at gcc dot gnu.org
         Resolution|---                         |INVALID

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Feature. By compiling with -mavx, any function can use AVX instructions.
You can either define the functions in different files and use -mavx to compile one and -msse2 or whatever to compile the other one, or you can use the target attribute or #pragma GCC target.
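
A minimal sketch of the target-attribute route mentioned above, assuming the file is compiled without -mavx so that only the marked function may use AVX. The function names (`scale_avx`, `scale_baseline`, `scale_dispatch`) and the `HAVE_X86_DISPATCH` macro are hypothetical. Plain loops stand in for the reporter's intrinsics, since whether AVX intrinsics are usable inside a target("avx") function without -mavx depends on the GCC version.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
/* Only this function may be compiled to AVX instructions. */
__attribute__((target("avx")))
static void scale_avx(float *data, float scale, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] *= scale;
}
#endif

/* Built with the file's baseline flags, so no AVX appears here as long
   as -mavx is not given on the command line. */
static void scale_baseline(float *data, float scale, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        data[i] *= scale;
}

/* Pick an implementation at run time based on what the CPU supports. */
void scale_dispatch(float *data, float scale, size_t n)
{
    assert(data != NULL || n == 0);
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();
    if (__builtin_cpu_supports("avx")) {
        scale_avx(data, scale, n);
        return;
    }
#endif
    scale_baseline(data, scale, n);
}
```

The design point is that the command line sets the baseline for the whole file and the attribute raises it for individual functions. Compiling everything with -mavx and trying to lower one function back to SSE goes the other way and is what caused the SIGILL in the report.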
[Bug libstdc++/54448] many failures with /sbin/loader: Error: libstdc++.so.6: symbol __pthread_mutex_init unresolved
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54448

--- Comment #6 from Hin-Tak Leung htl10 at users dot sourceforge.net ---
The latest results with 4.6.4 and 4.7.3:

http://gcc.gnu.org/ml/gcc-testresults/2014-01/msg00048.html
http://gcc.gnu.org/ml/gcc-testresults/2014-01/msg00049.html

seem to be a lot healthier.

During the course of the latest round, I realised that GNU strip from GNU binutils seems to confuse the configure system (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=44959 "bootstrap failed at Comparing stages 2 and 3") on -gtoggle, so I am putting /usr/local/bin *last*, instead of first as previously. GCC these days requires GNU tar to extract, and GNU make and GNU bash to configure and build, so it is almost a habit to put /usr/local/bin first, but GNU strip certainly seems to behave differently from DEC strip.

Could any of the GNU binutils cause "/sbin/loader: Error: libstdc++.so.6: symbol __pthread_mutex_init unresolved"? If there is a simple test, I can try it.
[Bug libitm/52695] libitm/config/x86/cacheline.h: '__m64' does not name a type
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52695

--- Comment #7 from Ryan Hill dirtyepic at gentoo dot org ---
(In reply to Jakub Jelinek from comment #5)
> No idea what brokeness the above talks about, it works just fine for me in
> C++, so IMHO it just should always include x86intrin.h, but certainly if
> __MMX__ is defined, but no __SSE__, the above won't include in C++ any
> header which would define __m64.

For 4.8 it just directly includes x86intrin.h.

http://gcc.gnu.org/ml/gcc-patches/2012-11/msg00467.html [1]

However after patching 4.7.3 [2] we're seeing a different error on some systems.

---8<---
In file included from /var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/x86intrin.h:27:0,
                 from /var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/gcc-4.7.3/libitm/config/x86/target.h:72,
                 from /var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/gcc-4.7.3/libitm/libitm_i.h:82,
                 from /var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/gcc-4.7.3/libitm/aatree.cc:28:
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h: In function ‘int __bsrd(int)’:
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h:41:35: error: ‘__builtin_ia32_bsrsi’ was not declared in this scope
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h: In function ‘long long unsigned int __rdpmc(int)’:
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h:89:35: error: ‘__builtin_ia32_rdpmc’ was not declared in this scope
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h: In function ‘long long unsigned int __rdtsc()’:
/var/tmp/portage/sys-devel/gcc-4.7.3-r1/work/build/./gcc/include/ia32intrin.h:97:32: error: ‘__builtin_ia32_rdtsc’ was not declared in this scope
---8<---

Both the reporters have AMD K8 processors. They only hit the bug when using -march=native; -march=k8 is successful.

$ echo | gcc -march=native -v -E - 2>&1 | grep cc1
 /usr/libexec/gcc/x86_64-pc-linux-gnu/4.6.3/cc1 -E -quiet -v - -march=k8 -mno-cx16 -mno-sahf -mno-movbe -mno-aes -mno-pclmul -mno-popcnt -mno-abm -mno-lwp -mno-fma -mno-fma4 -mno-xop -mno-bmi -mno-tbm -mno-avx -mno-sse4.2 -mno-sse4.1 --param l1-cache-size=64 --param l1-cache-line-size=64 --param l2-cache-size=512 -mtune=k8

So it seems there's still a piece missing.

[1] http://gcc.gnu.org/viewcvs/gcc?view=revision&revision=193369
[2] http://sources.gentoo.org/cgi-bin/viewvc.cgi/gentoo/src/patchsets/gcc/4.7.3/gentoo/49_all_x86_pr52695_libitm-m64.patch?revision=1.1&view=markup
[Bug tree-optimization/59644] [4.9 Regression] r206243 miscompiles Linux kernel
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=59644

--- Comment #9 from Jakub Jelinek jakub at gcc dot gnu.org ---
The reason why some changes appear in stdarg functions is:
ix86_update_stack_boundary:
  /* x86_64 vararg needs 16byte stack alignment for register save
     area.  */
  if (TARGET_64BIT
      && cfun->stdarg
      && crtl->stack_alignment_estimated < 128)
    crtl->stack_alignment_estimated = 128;
and kernel uses -mno-sse -mpreferred-stack-boundary=3

But because of -mno-sse, that is completely unnecessary, as setup_incoming_varargs_64 does:
  /* FPR size of varargs save area.  We don't need it if we don't pass
     anything in SSE registers.  */
  if (TARGET_SSE && cfun->va_list_fpr_size)
    ix86_varargs_fpr_size = X86_64_SSE_REGPARM_MAX * 16;
  else
    ix86_varargs_fpr_size = 0;
thus for !TARGET_SSE it never even allocates the fpr save area that would need the bigger alignment.

So IMHO we might as well change ix86_update_stack_boundary to add !TARGET_SSE into the condition. Still, that doesn't explain why the kernel doesn't like it. As I said earlier, the explanation could be that something doesn't expect r10 to be clobbered across some of these calls, or perhaps something assumes 64bit stack alignment in the called function (but with -mpreferred-stack-boundary=3 that would be broken assumption).