Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion
On Fri, Aug 10, 2012 at 5:44 PM, Elmar Krieger el...@cmbi.ru.nl wrote:

Hi Ian, hi Richard, hi Andi!

Many thanks for your comments. The slowdown is not the same with other files, so I'm essentially sure that this specific source file has some 'feature' that catches GCC at the wrong leg. This raises my hopes that one of the GCC experts wants to take a look at it.

The code is confidential,

You could file a bug report with just a profile output of the compiler (e.g. from oprofile or perf). But please use a pristine FSF compiler. You can also run the source through some obfuscation tool. Or get a first hint with using -ftime-report. In the end, without a testcase there is nothing to do for us ...

I downloaded the latest official GCC 4.7.1, but unfortunately configure stopped with "Building GCC requires GMP 4.2+, MPFR 2.3.1+ and MPC 0.8.0+.", and for my CentOS Linux, only older versions of these libs are available as RPMs. I saw many hours of manual fiddling ahead, so I suggest a more efficient solution: I now sent the confidential source file by private message to Richard, please spend 5 minutes to run these two commands with it:

time gcc -m32 -g -O0 -fno-strict-aliasing -x c -Wall -Werror -c model.i

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -g -w
3.30user 0.03system 0:03.34elapsed 99%CPU (0avgtext+0avgdata 277072maxresident)k
0inputs+0outputs (0major+20416minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -g -m32 -w
3.28user 0.08system 0:03.38elapsed 99%CPU (0avgtext+0avgdata 985760maxresident)k
0inputs+0outputs (0major+64353minor)pagefaults 0swaps

Same time. I am positively surprised.
time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -g -w -O
8.09user 0.13system 0:08.29elapsed 99%CPU (0avgtext+0avgdata 381376maxresident)k
248inputs+0outputs (1major+38855minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -g -m32 -w -O
15.33user 0.16system 0:15.55elapsed 99%CPU (0avgtext+0avgdata 1844272maxresident)k
24inputs+0outputs (1major+125893minor)pagefaults 0swaps

That's within reasonable bounds as well, IMHO (you can't really compare -O1 from 3.2.3 with -O1 from 4.6.3).

One more data point (-O2 tends to be more focused on, no debuginfo generation turns off improvements and its costs there):

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -w -O2
17.31user 0.43system 0:17.82elapsed 99%CPU (0avgtext+0avgdata 427392maxresident)k
72inputs+0outputs (2major+69895minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -m32 -w -O2
18.12user 0.21system 0:18.43elapsed 99%CPU (0avgtext+0avgdata 1752784maxresident)k
0inputs+0outputs (0major+124029minor)pagefaults 0swaps

same time, I am surprised again ;) (with improvements in CPU speed the compilation with 4.6.3 is actually _faster_ comparing commodity platforms from the date of the compiler releases).

If you don't find an enormous slowdown with the second command (please post your timings) and conclude that this problem has been introduced by Google in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.

You might want to try -ftime-report, if it says you have extra checkings enabled for one compiler but not the other that will explain the different outcome at your side:

Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
Disclaimer: I had to delete an include statement on top of the file I sent you to make it compile. Richard. To Ian: Not at all high. See Type-Based Alias Analysis http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273 for one reason. Thanks, I read the article, but didn't really see how forbidding a function with argument void** to accept a pointer to any pointer helps with aliasing. If it's perfectly normal that a function with argument void* accepts any pointer, then a function with argument void** should accept a pointer to any pointer by analogy, without having additional aliasing problems, no? The C and C++ languages could work that way, yes. But they don't. GCC attempts to implement the standard language. Yep, that's why I mentioned how GCC's smart extensions to the standard language saved the day many times in the past ;-) Aliasing issues arise when a function has two pointers, and determine whether an assignment to *p1 might change the value at *p2. There are no aliasing issues with a void* pointer, because if p1 is void* then *p1 is invalid. That is not true for a void** pointer, so aliasing issues do arise. If p1 is void** and p2 is int**, then GCC will assume that
Re: Hopelessly broken loop_father, loop_depth
On Sun, Aug 12, 2012 at 2:02 PM, Steven Bosscher stevenb@gmail.com wrote:
On Sat, Aug 11, 2012 at 11:16 PM, Steven Bosscher stevenb@gmail.com wrote:

Lots of test cases fail with the attached patch.

Lots still fail after correcting the verifier :-)

920723-1.c: In function 'f':
920723-1.c:14:1: error: bb 13 has loop depth 2, should be 1
 f (int count, vector_t * pos, double r, double *rho)
 ^
920723-1.c:14:1: error: bb 14 has loop depth 2, should be 1
920723-1.c:14:1: internal compiler error: in verify_loop_structure, at cfgloop.c:1598

That's a pre-existing bug in unswitching. When unswitching simplifies the condition it unswitches on using simplify_using_entry_checks, it may turn an inner loop into an exit to an endless loop. But it does not modify the loop structure according to this change.

void foo (int x, int r)
{
loop4:
  if (r = x)
    {
      goto loop4_latch;
    }
  else
    {
loop5:
      if (r = x)
        goto loop4_latch;
      goto loop5;
    }
loop4_latch:
  goto loop4;
}

simplified testcase that even fails at -O1. We mostly rely on cfg-cleanup to fixup loops for us, so this is one case it does not handle properly.

The quest of keeping loops up-to-date is hard ... but thanks for the checking code ;)

Richard.

Ciao!
Steven
Re: Hopelessly broken loop_father, loop_depth
On Mon, Aug 13, 2012 at 12:21 PM, Richard Guenther richard.guent...@gmail.com wrote:

[...]

simplified testcase that even fails at -O1. We mostly rely on cfg-cleanup to fixup loops for us, so this is one case it does not handle properly.

Actually that testcase fails verification right after a full loop discovery which DOM1 performs ...

The quest of keeping loops up-to-date is hard ... but thanks for the checking code ;)

Richard.
Re: Hopelessly broken loop_father, loop_depth
On Mon, Aug 13, 2012 at 12:22 PM, Richard Guenther richard.guent...@gmail.com wrote:

[...]

Actually that testcase fails verification right after a full loop discovery which DOM1 performs ...

Fixed by attached patch.

The quest of keeping loops up-to-date is hard ... but thanks for the checking code ;)

Which probably still makes things fail elsewhere ;)

Richard.

fix-loops-1
Description: Binary data
Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion
Hi Richard,

many thanks for saving my time.

time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i

That's within reasonable bounds as well, IMHO (you can't really compare -O1 from 3.2.3 with -O1 from 4.6.3).

One more data point (-O2 tends to be more focused on, no debuginfo generation turns off improvements and its costs there):

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -w -O2
17.31user 0.43system 0:17.82elapsed 99%CPU (0avgtext+0avgdata 427392maxresident)k
72inputs+0outputs (2major+69895minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386 -fno-strict-aliasing -m32 -w -O2
18.12user 0.21system 0:18.43elapsed 99%CPU (0avgtext+0avgdata 1752784maxresident)k
0inputs+0outputs (0major+124029minor)pagefaults 0swaps

same time, I am surprised again ;) (with improvements in CPU speed the compilation with 4.6.3 is actually _faster_ comparing commodity platforms from the date of the compiler releases).

You might want to try -ftime-report, if it says you have extra checkings enabled for one compiler but not the other that will explain the different outcome at your side:

Good news, and especially the -ftime-report trick was highly useful. For example, I got a huge slowdown also with this compiler:

gcc44 (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
Copyright (C) 2010 Free Software Foundation, Inc.

which spends all its time in 'variable tracking':

 variable tracking : 126.07 (89%) usr   0.26 ( 7%) sys 126.50 (87%) wall  20647 kB ( 6%) ggc
 TOTAL             : 141.94             3.66           145.61            336368 kB

real 2m26.703s

And the Google Android compiler I reported originally...

i686-linux-android-gcc (GCC) 4.6.x-google 20120106 (prerelease)
Copyright (C) 2011 Free Software Foundation, Inc.
...which takes more than twice as long spends its time here:

 phase cgraph          : 347.75 (100%) usr  10.73 (76%) sys 358.51 (99%) wall 130837 kB (84%) ggc
 phase generate        : 347.85 (100%) usr  10.77 (76%) sys 358.64 (99%) wall 132490 kB (85%) ggc
 var-tracking dataflow : 284.34 (82%) usr    0.00 ( 0%) sys 284.21 (78%) wall      0 kB ( 0%) ggc
 TOTAL                 : 350.04             12.53           362.60            155292 kB

real 6m3.567s

I really didn't expect that RedHat and Google both mess up GCC with their modifications, so I'll report it to them instead ;-)

Anyway, please send by private email your favorite way of receiving the promised 100 USD. Could be PayPal, a list of Amazon.com items which are sent to your address, a direct bank transfer etc.

Best regards,
Elmar

If you don't find an enormous slowdown with the second command (please post your timings) and conclude that this problem has been introduced by Google in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.

Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

Disclaimer: I had to delete an include statement on top of the file I sent you to make it compile.

Richard.

Aliasing issues arise when a function has two pointers, and determine whether an assignment to *p1 might change the value at *p2. There are no aliasing issues with a void* pointer, because if p1 is void* then *p1 is invalid. That is not true for a void** pointer, so aliasing issues do arise. If p1 is void** and p2 is int**, then GCC will assume that an assignment to *p1 does not change the value at *p2, as the language standard states. It's easy to imagine that that could break a program after inlining.

Many thanks for the clarification, and it also points to a simple solution: GCC could simply permit to pass a pointer to any pointer to a function, if the function argument is of type 'void **restrict myptr'.
If adding a 'restrict' to a function declaration was the only thing required to get rid of countless nasty explicit type casts, the day would already be saved. There really seem to be lots of problem classes that cannot be solved without explicit type casts otherwise. The example for loading a binary file from disk and allocating the required memory to store the file contents being just one of them...

Best regards,
Elmar

Just one more complicated example: A function that loads a binary file from disk and allocates the required memory to store the file contents, returning the number of bytes read. dstadd is the address where the newly allocated pointer is stored:

int dsc_loadfilealloc(void *dstadd,char *filename)
{
  int read,size;
  FILE *fb;

  if ((fb=fopen(filename,"rb")))
  {
    size=dsc_filesize(filename);
    *(void**)dstadd=mem_alloc(size);
    read=dsc_readbytes(*(void**)dstadd,fb,size);
    *(void**)dstadd=mem_realloc(*(void**)dstadd,read);
    fclose(fb);
    return(read);
  }
  *(void**)dstadd=NULL;
  return(0);
}

Again, nasty casts all over the place, which would all disappear if GCC allowed me to write
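The casts in dsc_loadfilealloc can also be avoided in standard C, without any extension, by giving the function a real void** parameter and letting each caller pass the address of a void* temporary. What follows is only a minimal sketch under stated assumptions: load_file_alloc and load_text are hypothetical names, and malloc, fread and ftell stand in for the mem_alloc, dsc_readbytes and dsc_filesize helpers, whose definitions are not shown in the thread.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cast-free variant of dsc_loadfilealloc: dst must point to a
   real void* object, so no pointer-to-pointer conversion is involved. */
long load_file_alloc(void **dst, const char *filename)
{
    FILE *fb;
    long size;
    void *buf;
    size_t got;

    assert(dst != NULL && filename != NULL);
    *dst = NULL;                       /* result on every failure path */

    fb = fopen(filename, "rb");
    if (!fb)
        return 0;
    if (fseek(fb, 0, SEEK_END) != 0 || (size = ftell(fb)) <= 0) {
        fclose(fb);
        return 0;
    }
    rewind(fb);

    buf = malloc((size_t)size);
    if (!buf) {
        fclose(fb);
        return 0;
    }
    got = fread(buf, 1, (size_t)size, fb);
    fclose(fb);

    *dst = buf;
    return (long)got;
}

/* Caller side: load into a void* temporary, then convert once.
   The void* -> char* conversion is implicit in C. */
long load_text(char **text, const char *filename)
{
    void *tmp = NULL;
    long n;

    assert(text != NULL);
    n = load_file_alloc(&tmp, filename);
    *text = tmp;
    return n;
}
```

The cost is one temporary and one assignment per call site; in exchange the code stays within the aliasing rules Ian describes, whatever the optimization level.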
Re: Hopelessly broken loop_father, loop_depth
On Mon, Aug 13, 2012 at 12:57 PM, Richard Guenther richard.guent...@gmail.com wrote:

[...]

Fixed by attached patch.

The quest of keeping loops up-to-date is hard ... but thanks for the checking code ;)

Which probably still makes things fail elsewhere ;)

Same issue in fix_loop_structure:

  /* Now fix the loop nesting.  */
  FOR_EACH_LOOP (li, loop, 0)
    {
      ploop = superloop[loop->num];
      if (ploop != loop_outer (loop))
        {
          flow_loop_tree_node_remove (loop);
          flow_loop_tree_node_add (ploop, loop);
        }
    }

I wonder why we cache loop-depth at all ... given that it is a simple dereference bb->loop_father->superloops->base.prefix.num. For all the hassle to keep that cache up-to-date, that is.

Would anybody mind removing basic_block->loop_depth? With C++ we can even have an overloaded loop_depth that works on both basic-blocks and loops ...

Richard.
Re: Hopelessly broken loop_father, loop_depth
On Mon, Aug 13, 2012 at 1:27 PM, Richard Guenther richard.guent...@gmail.com wrote:

I wonder why we cache loop-depth at all ... given that it is a simple dereference bb->loop_father->superloops->base.prefix.num. For all the hassle to keep that cache up-to-date, that is.

The cached bb->loop_depth saves two indirect references. But if it's hard to maintain, I'd be happy to see it go away. Just so long as bb->loop_father is correct (to be verified by a patch for the loop verification code).

Ciao!
Steven
Re: Hopelessly broken loop_father, loop_depth
On Mon, Aug 13, 2012 at 3:15 PM, Steven Bosscher stevenb@gmail.com wrote:
On Mon, Aug 13, 2012 at 1:27 PM, Richard Guenther richard.guent...@gmail.com wrote:

I wonder why we cache loop-depth at all ... given that it is a simple dereference bb->loop_father->superloops->base.prefix.num. For all the hassle to keep that cache up-to-date, that is.

The cached bb->loop_depth saves two indirect references. But if it's hard to maintain, I'd be happy to see it go away. Just so long as bb->loop_father is correct (to be verified by a patch for the loop verification code).

loop_father is easier to keep up-to-date at least, and possibly should just work. A patch removing loop_depth just finished testing and I'll commit it in a sec.

Richard.

Ciao!
Steven
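The trade-off discussed in this thread can be illustrated with a toy loop tree: when the depth is always derived from loop_father, re-parenting a loop cannot leave a stale depth behind, which is the kind of mismatch the verifier reported ("has loop depth 2, should be 1"). This is a minimal sketch only. struct loop_node, struct block and both functions are hypothetical simplifications, not the types from GCC's cfgloop.h, and the sketch walks the chain of outer loops where GCC reads the length of the superloops vector.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a node of the loop tree. */
struct loop_node {
    struct loop_node *outer;        /* enclosing loop, NULL for the root */
};

/* Hypothetical, simplified stand-in for a basic block. Note that there is
   no cached depth field that could go out of date. */
struct block {
    struct loop_node *loop_father;  /* innermost loop containing the block */
};

/* Depth of a loop = number of loops enclosing it; the root has depth 0. */
unsigned loop_node_depth(const struct loop_node *loop)
{
    unsigned depth = 0;

    assert(loop != NULL);
    while (loop->outer != NULL) {
        depth++;
        loop = loop->outer;
    }
    return depth;
}

/* Depth of a block, derived on demand from its loop_father. */
unsigned block_loop_depth(const struct block *bb)
{
    assert(bb != NULL && bb->loop_father != NULL);
    return loop_node_depth(bb->loop_father);
}
```

With a root loop, a loop4 inside it and a loop5 inside loop4, a block in loop5 has depth 2. After loop5 is re-parented directly under the root, the same call yields 1 with no separate cache to repair.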
Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion
Elmar Krieger el...@cmbi.ru.nl writes: [...] I really didn't expect that RedHat and Google both mess up GCC with their modifications, so I'll report it to them instead ;-) That's not a fair characterization of the features' costs/benefits. - FChE
50% slowdown with LTO
I'm not sure what LTO is supposed to do -- the documentation is not exactly clear. But I assumed it should make things faster and/or smaller. So I tried using it on an application -- a processor emulator, CPU intensive code, a lot of 64 bit integer arithmetic. Using a compile/assembler run on the emulated system as a benchmark, I compared the code on x86_64-linux, gcc 4.7.0, -O2 plain, -O2 -fprofile-use (after having done -fprofile-generate), and -O2 -fprofile-use -flto (using a separate set of profile data files from -fprofile-generate -flto). Results: profiling speeds things up about 8%, but LTO is 50% (!) slower than without. Any suggestions of what to look at for this? paul
Slides and video for Cauldron 2012 presentations
I just uploaded all the slides I received and linked all the talks for which we had video.

Jan, if there are any more videos you have other than http://www.youtube.com/playlist?list=PL5D02780BAF2B55CF&feature=plcp, please send them my way.

To all the presenters, please check that the links I've created in http://gcc.gnu.org/wiki/cauldron2012 are correct. If you do not see your slides linked, please attach them to the wiki page and modify your entry in the Presentations section. If you are not sure how to do this, please send me the slides and I'll upload them (only PDF files, please).

Thanks. Diego.
Re: Excluding dejagnu testcases for subtargets
On Sat, Aug 11, 2012 at 09:40:52AM -0700, Janis Johnson wrote:
On 08/11/2012 09:18 AM, Senthil Kumar Selvaraj wrote:
On Fri, Aug 10, 2012 at 09:54:17AM -0700, Janis Johnson wrote:
On 08/09/2012 10:52 PM, Senthil Kumar Selvaraj wrote:

Hi,

What is the recommended way to skip specific (non target specific) testcases for subtargets? There are a bunch of tests in the gcc testsuite that are too big (in terms of code size or memory) for a subtarget of the avr target. The subtarget is specified in the dejagnu board configuration file (set_board_info cflags -mmcu subtarget name).

Using dg-skip-if with -mmcu subtarget name for the include part did not work. Looking at the source (target-supports-dg.exp) showed that it doesn't consider board_info cflags. Only board_info multilib_flags, flags specified in dg-options, $TOOL_OPTIONS and $TEST_ALWAYS_FLAGS appear to be considered.

Should we set the -mmcu option to multilib_flags instead of cflags in the board config? Should we use --tool_opt in RUNTESTFLAGS? How do other targets handle this?

Regards
Senthil

Probably check-flags in target-supports-dg.exp should check cflags in the board_info along with the other flags. Can you try that to see if it does what you need?

Yes it does. The patch below did the job.

Please submit it, with a ChangeLog entry, to gcc-patc...@gcc.gnu.org.

Sent. http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00689.html

Is there a reason why cflags wasn't included before?

Because I didn't know about it. It wasn't intentional.

Janis

Regards
Senthil

diff --git a/gcc/testsuite/lib/target-supports-dg.exp b/gcc/testsuite/lib/target-supports-dg.exp
index 2f6c4c2..bdf7476 100644
--- a/gcc/testsuite/lib/target-supports-dg.exp
+++ b/gcc/testsuite/lib/target-supports-dg.exp
@@ -304,6 +304,9 @@ proc check-flags { args } {
     # If running a subset of the test suite, $TEST_ALWAYS_FLAGS may not exist.
     catch {append compiler_flags $TEST_ALWAYS_FLAGS }
     set dest [target_info name]
+    if [board_info $dest exists cflags] {
+        append compiler_flags [board_info $dest cflags]
+    }
     if [board_info $dest exists multilib_flags] {
         append compiler_flags [board_info $dest multilib_flags]
     }

Regards
Senthil
gcc trunk fails to build without isl/cloog
The installation instructions seem to imply that GCC can be built without having ISL and/or CLOOG installed, and the configure script accepts --without-isl and --without-cloog. But I can't build that. Reading the installation instructions makes me expect that such a configuration would skip the building of the graphite loop optimization machinery. What happens instead is that it's built anyway, but the makefile aborts at the point where it tries to compile gcc/graphite.c (because cloog/cloog.h does not exist). Is this supposed to work? paul
ISL install troubles
Where does one go to report issues with ISL? Since GCC doesn't build without it, I'm trying to install ISL from sources. That doesn't work. It accepts --with-gmp but there is nothing in the Makefile to pay attention to that -- the compiles are done without any switches so it fails unless gmp.h is in /usr/include. Since I installed gmp from source in the usual way, it's in /usr/local/. paul
Re: 50% slowdown with LTO
On Mon, Aug 13, 2012 at 8:27 AM, paul_kon...@dell.com wrote: I'm not sure what LTO is supposed to do -- the documentation is not exactly clear. But I assumed it should make things faster and/or smaller. So I tried using it on an application -- a processor emulator, CPU intensive code, a lot of 64 bit integer arithmetic. Using a compile/assembler run on the emulated system as a benchmark, I compared the code on x86_64-linux, gcc 4.7.0, -O2 plain, -O2 -fprofile-use (after having done -fprofile-generate), and -O2 -fprofile-use -flto (using a separate set of profile data files from -fprofile-generate -flto). Results: profiling speeds things up about 8%, but LTO is 50% (!) slower than without. Any suggestions of what to look at for this? LTO lets the compiler see all the code at once, enabling optimizations like inlining function calls across different source files. Like any optimization, there are cases where it will cause code to slow down rather than speed up. A 50% slowdown is certainly unusual, and suggests some systematic error. Figuring out what has gone wrong is like optimizing any program. Get a profile for your program, e.g., using -pg. Build the program with and without -flto, run it, and look at the resulting profiles. A 50% slowdown should be fairly obvious. I would guess that GCC has made a poor inlining decision, but the profile should show the problem for sure. Ian
Re: gcc trunk fails to build without isl/cloog
On Mon, Aug 13, 2012 at 9:01 AM, paul_kon...@dell.com wrote:

The installation instructions seem to imply that GCC can be built without having ISL and/or CLOOG installed, and the configure script accepts --without-isl and --without-cloog. But I can't build that. Reading the installation instructions makes me expect that such a configuration would skip the building of the graphite loop optimization machinery. What happens instead is that it's built anyway, but the makefile aborts at the point where it tries to compile gcc/graphite.c (because cloog/cloog.h does not exist). Is this supposed to work?

Trunk builds fine without ppl when GCC is configured with --without-ppl:

auto-host.h:/* #undef HAVE_cloog */

[hjl@gnu-mic-2 build-x86_64-linux]$ ldd gcc/cc1
        linux-vdso.so.1 => (0xff98)
        libmpc.so.2 => /libx32/libmpc.so.2 (0xf73ad000)
        libmpfr.so.4 => /libx32/libmpfr.so.4 (0xf7157000)
        libgmp.so.10 => /libx32/libgmp.so.10 (0xf6eeb000)
        libdl.so.2 => /libx32/libdl.so.2 (0xf6ce8000)
        libz.so.1 => /libx32/libz.so.1 (0xf6ad4000)
        libm.so.6 => /libx32/libm.so.6 (0xf67db000)
        libc.so.6 => /libx32/libc.so.6 (0xf642c000)
        /libx32/ld-linux-x32.so.2 (0xf75c1000)
[hjl@gnu-mic-2 build-x86_64-linux]$

I do have mpc, mpfr and gmp.

-- 
H.J.
Re: 50% slowdown with LTO
Ian Lance Taylor i...@google.com writes: Figuring out what has gone wrong is like optimizing any program. Get a profile for your program, e.g., using -pg. Build the program with and without -flto, run it, and look at the resulting profiles. A 50% slowdown should be fairly obvious. I would guess that GCC has made a poor inlining decision, but the profile should show the problem for sure. On modern profiling tools like perf or oprofile you can also diff profiles for this. -Andi -- a...@linux.intel.com -- Speaking for myself only
Re: gcc trunk fails to build without isl/cloog
On Aug 13, 2012, at 12:42 PM, H.J. Lu wrote:
On Mon, Aug 13, 2012 at 9:01 AM, paul_kon...@dell.com wrote:

The installation instructions seem to imply that GCC can be built without having ISL and/or CLOOG installed, and the configure script accepts --without-isl and --without-cloog. But I can't build that. Reading the installation instructions makes me expect that such a configuration would skip the building of the graphite loop optimization machinery. What happens instead is that it's built anyway, but the makefile aborts at the point where it tries to compile gcc/graphite.c (because cloog/cloog.h does not exist). Is this supposed to work?

Trunk builds fine without ppl when GCC is configured with --without-ppl:

auto-host.h:/* #undef HAVE_cloog */

[hjl@gnu-mic-2 build-x86_64-linux]$ ldd gcc/cc1
        linux-vdso.so.1 => (0xff98)
        libmpc.so.2 => /libx32/libmpc.so.2 (0xf73ad000)
        libmpfr.so.4 => /libx32/libmpfr.so.4 (0xf7157000)
        libgmp.so.10 => /libx32/libgmp.so.10 (0xf6eeb000)
        libdl.so.2 => /libx32/libdl.so.2 (0xf6ce8000)
        libz.so.1 => /libx32/libz.so.1 (0xf6ad4000)
        libm.so.6 => /libx32/libm.so.6 (0xf67db000)
        libc.so.6 => /libx32/libc.so.6 (0xf642c000)
        /libx32/ld-linux-x32.so.2 (0xf75c1000)
[hjl@gnu-mic-2 build-x86_64-linux]$

I do have mpc, mpfr and gmp.

Is ppl another name for cloog? If I don't have cloog, should I say --without-ppl? That doesn't make much sense, and it isn't documented.

paul
Re: gcc trunk fails to build without isl/cloog
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54138. Andreas. -- Andreas Schwab, sch...@linux-m68k.org GPG Key fingerprint = 58CA 54C7 6D53 942B 1756 01D3 44D5 214B 8276 4ED5 And now for something completely different.
Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion
[...] I really didn't expect that RedHat and Google both mess up GCC with their modifications, so I'll report it to them instead That's not a fair characterization of the features' costs/benefits. We just are trying to mess up (?) binutils, aren't we? gcc just receives the benefit by adapting to it. The benefit is What is necessary is to re-compile only the files you touched. Is it messing up, do you(pl) think? - Isoyaf
[Bug fortran/54238] New: If possible, TRANSFER should use assignment instead of MEMCPY
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54238

             Bug #: 54238
           Summary: If possible, TRANSFER should use assignment instead of MEMCPY
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: bur...@gcc.gnu.org

In some cases TRANSFER can be replaced by a normal assignment (with cast), possibly also an ARRAY_RANGE_REF with cast?

Example: the following code, which matches the currently used scalarizer for FINAL:

use iso_c_binding, only: c_intptr_t, c_loc, c_ptr, c_int, c_f_pointer
integer(c_int), target :: array(4)
integer(c_int), pointer :: ptr
integer(c_intptr_t) :: addr
type(c_ptr) :: cptr
array = [11,22,33,44]
do i = 0, 3
  cptr = c_loc (array)
  addr = transfer (cptr, addr) + i * storage_size (array)/8
  call c_f_pointer (transfer (addr, cptr), ptr)
  print *, i,': ', ptr
end do
end

Dump of:
  addr = transfer (cptr, addr) + i * storage_size (array)/8

  {
    struct array1_integer(kind=4) parm.2;
    integer(kind=8) transfer.1;
    integer(kind=8) D.1876;
    integer(kind=8) D.1875;
    integer(kind=8) D.1874;

    D.1874 = 8;
    D.1875 = 8;
    __builtin_memcpy ((void *) &transfer.1, (void *) &cptr, MAX_EXPR <MIN_EXPR <D.1875, D.1874>, 0>);
    parm.2.dtype = 265;
    parm.2.dim[0].lbound = 1;
    parm.2.dim[0].ubound = 4;
    parm.2.dim[0].stride = 1;
    parm.2.data = (void *) &array[0];
    parm.2.offset = -1;
    addr = (integer(kind=8)) ((i * 32) / 8) + transfer.1;
  }

While a simple
  addr = (intptr_t) cptr;
should be sufficient.
[Bug fortran/54238] If possible, TRANSFER should use assignment instead of MEMCPY
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54238 --- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org 2012-08-13 06:15:43 UTC --- Though the memcpy does get optimized to a VCE:

  addr.9_4 = (integer(kind=8)) ivtmp.29_28;
  D.1913_24 = VIEW_CONVERT_EXPR<void *>(addr.9_4);

So it might not be important enough to do at the front-end level.
[Bug bootstrap/50167] gmp memory functions are extern C (graphite)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50167 Marc Glisse glisse at gcc dot gnu.org changed: What|Removed |Added CC||glisse at gcc dot gnu.org --- Comment #2 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 06:17:30 UTC --- Note that this could also be solved by using gmp_fprintf. (Or by using mpz_class::get_str, since we seem to be moving to C++ anyway)
[Bug middle-end/52173] internal compiler error: verify_ssa failed possibly caused by itm
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52173 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2012-08-13 CC||aldyh at gcc dot gnu.org, ||jakub at gcc dot gnu.org Ever Confirmed|0 |1
[Bug rtl-optimization/53942] [4.6/4.7/4.8 Regression] unable to find a register to spill in class 'CREG'
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53942 --- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 07:35:11 UTC --- Author: jakub Date: Mon Aug 13 07:35:03 2012 New Revision: 190338 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190338 Log: Backported from trunk 2012-07-19 Jakub Jelinek ja...@redhat.com PR rtl-optimization/53942 * function.c (assign_parm_setup_reg): Avoid zero/sign extension directly from likely spilled non-fixed hard registers, move them to pseudo first. * gcc.dg/pr53942.c: New test. Added: branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/pr53942.c Modified: branches/gcc-4_7-branch/gcc/ChangeLog branches/gcc-4_7-branch/gcc/function.c branches/gcc-4_7-branch/gcc/testsuite/ChangeLog
[Bug libstdc++/54237] [C++11] Make more tuple-related functions constexpr
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54237 Jonathan Wakely redi at gcc dot gnu.org changed: What|Removed |Added CC||bkoz at gcc dot gnu.org Severity|normal |enhancement --- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 08:14:00 UTC --- That does seem possible. Benjamin wrote http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3231.html so let's CC him.
[Bug tree-optimization/21485] [4.6/4.7/4.8 Regression] missed load PRE, PRE makes i?86 suck
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21485 --- Comment #53 from wbrana wbrana at gmail dot com 2012-08-13 08:26:13 UTC --- It seems it was improved. 4.8 20120806 NUMERIC SORT: 1543.7 : 39.59 : 13.00 4.8 20120813 NUMERIC SORT: 2007.8 : 51.49 : 16.91
[Bug debug/51358] incorrect/missing location for function arg, -O0, without VTA
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51358 --- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 08:55:05 UTC --- (In reply to comment #4) It would not be helpful, systemtap would then see no data (just not wrong data). Also at that time location list will need to be used and currently GDB when it sees any location list it thinks it no longer needs to skip the prologue. OTOH GDB could look at -grecord-gcc-switches first which it currently does not so I should just finally implement -grecord-gcc-switches in GDB in such case. I think seeing wrong data, thus, wrong-debug is never superior over no debug info / no data.
[Bug target/54232] For x86 PIC code, ebx should be spillable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Target||x86_64-*-*, i?86-*-* Status|UNCONFIRMED |NEW Last reconfirmed||2012-08-13 Version|unknown |4.8.0 Ever Confirmed|0 |1 --- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 08:57:15 UTC --- I think the GOT is introduced too late to do any fancy analysis on whether we need it or not. I also think that for outgoing function calls the ABI relies on a properly set up GOT, even for those that bind locally and thus do not go through the PLT.
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 08:59:18 UTC --- If you do something like

  gcc -c t1.c -mavx -flto
  gcc -c t2.c -msse2 -flto
  gcc t1.o t2.o -flto

then the link step will use -mavx -msse2, that is, target options are concatenated.
[Bug tree-optimization/54200] copyrename generates wrong debuginfo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200 --- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 09:29:33 UTC --- Author: rguenth Date: Mon Aug 13 09:29:28 2012 New Revision: 190339 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190339 Log: 2012-08-13 Richard Guenther rguent...@suse.de PR tree-optimization/54200 * tree-ssa-copyrename.c (rename_ssa_copies): Do not add PHI results to another partition if not all PHI arguments have the same partition. * gcc.dg/guality/pr54200.c: New testcase. * gcc.dg/tree-ssa/slsr-8.c: Adjust. Added: trunk/gcc/testsuite/gcc.dg/guality/pr54200.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c trunk/gcc/tree-ssa-copyrename.c
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #9 from Thiago Macieira thiago at kde dot org 2012-08-13 09:44:51 UTC --- (In reply to comment #8)
> If you do something like
>
>   gcc -c t1.c -mavx -flto
>   gcc -c t2.c -msse2 -flto
>   gcc t1.o t2.o -flto
>
> then the link step will use -mavx -msse2, that is, target options are concatenated.

Indeed. What I'm asking for is that each source file be compiled with its own target options. I realise this is a request for enhancement, though.
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #10 from Thiago Macieira thiago at kde dot org 2012-08-13 09:53:32 UTC --- Another test:

  $ cat main_avx.c
  #define BZERO bzero_avx
  #pragma GCC target ("avx")
  #include "main.c"

  $ cat main_sse2.c
  #define BZERO bzero_sse2
  #pragma GCC target ("sse2")
  #include "main.c"

  $ cat main.c
  #include <immintrin.h>
  void BZERO(char *ptr, size_t count)
  {
      __m128i zero = _mm_set1_epi8(0);
      while (count--) {
          _mm_stream_si128((__m128i*)ptr, zero);
          ptr += 16;
      }
  }

  $ gcc -flto -O2 -shared -o libtest.so main_avx.c main_sse2.c
  $ objdump -Cdr --no-show-raw-insn libtest.so
  [...]
  0650 <bzero_sse2>:
   650: test    %rsi,%rsi
   653: pxor    %xmm0,%xmm0
   657: je      66e <bzero_sse2+0x1e>
   659: nopl    0x0(%rax)
   660: movntdq %xmm0,(%rdi)
   664: add     $0x10,%rdi
   668: sub     $0x1,%rsi
   66c: jne     660 <bzero_sse2+0x10>
   66e: repz retq

  0670 <bzero_avx>:
   670: test    %rsi,%rsi
   673: pxor    %xmm0,%xmm0
   677: je      68e <bzero_avx+0x1e>
   679: nopl    0x0(%rax)
   680: movntdq %xmm0,(%rdi)
   684: add     $0x10,%rdi
   688: sub     $0x1,%rsi
   68c: jne     680 <bzero_avx+0x10>
   68e: repz retq
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #11 from Thiago Macieira thiago at kde dot org 2012-08-13 10:12:48 UTC --- Attaching __attribute__((target(xxx))) to the function does help. It generates the following with the my_bzero function from comment 2:

  02e0 <bzero_avx.2362>:
   2e0: test    %rsi,%rsi
   2e3: vpxor   %xmm0,%xmm0,%xmm0
   2e7: je      2fe <bzero_avx.2362+0x1e>
   2e9: nopl    0x0(%rax)
   2f0: vmovntdq %xmm0,(%rdi)
   2f4: add     $0x10,%rdi
   2f8: sub     $0x1,%rsi
   2fc: jne     2f0 <bzero_avx.2362+0x10>
   2fe: repz retq

  0300 <my_bzero>:
   300: mov     0x200171(%rip),%rax   # 200478 <my_bzero+0x200178>
   307: mov     (%rax),%eax
   309: test    %eax,%eax
   30b: jne     330 <my_bzero+0x30>
   30d: test    %rsi,%rsi
   310: pxor    %xmm0,%xmm0
   314: je      332 <my_bzero+0x32>
   316: nopw    %cs:0x0(%rax,%rax,1)
   320: movntdq %xmm0,(%rdi)
   324: add     $0x10,%rdi
   328: sub     $0x1,%rsi
   32c: jne     320 <my_bzero+0x20>
   32e: repz retq
   330: jmp     2e0 <bzero_avx.2362>
   332: repz retq

This workaround might be useful for me in a few places where the code inlining provided by LTO was desired (even though, in this example, the AVX variant is exactly what it would be if no LTO had been used). But it won't work without major changes to the code if I have 400+ functions in a file, plus possibly inlines from headers, to be compiled.
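The per-function workaround described in this comment might look roughly like the sketch below. It is not the my_bzero from comment 2 (which is not reproduced in this archive): the function names, the memset bodies and the runtime check are illustrative assumptions, and the fallback macros exist only so the sketch also compiles on non-x86 targets.

```c
#include <stddef.h>
#include <string.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Ask the compiler to use AVX for one function only. */
#define SKETCH_TARGET_AVX __attribute__((target("avx")))
#define SKETCH_HAVE_AVX() __builtin_cpu_supports("avx")
#else
/* Fallback so the sketch still builds elsewhere. */
#define SKETCH_TARGET_AVX
#define SKETCH_HAVE_AVX() 0
#endif

/* Baseline variant, built with the command-line target options. */
void sketch_zero_baseline(char *ptr, size_t count)
{
    memset(ptr, 0, count);
}

/* AVX variant: the attribute, not the command line, selects the ISA. */
SKETCH_TARGET_AVX void sketch_zero_avx(char *ptr, size_t count)
{
    memset(ptr, 0, count);
}

/* Dispatcher: pick the variant at run time. */
void sketch_my_zero(char *ptr, size_t count)
{
    if (SKETCH_HAVE_AVX())
        sketch_zero_avx(ptr, count);
    else
        sketch_zero_baseline(ptr, count);
}
```

As the comment notes, this scales poorly: every function that needs a different target has to carry the attribute individually.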
[Bug target/54239] New: Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 Bug #: 54239 Summary: Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target AssignedTo: unassig...@gcc.gnu.org ReportedBy: venkataramanan.ku...@amd.com

Hi all,

The prefetch instructions can be classified as follows.

1) Prefetch instructions included by 3DNOW ISA / new PRFCHW ISA (checked against cpuid function 0x80000001 bit 8 of ecx register)
   prefetch MEM
   prefetchw MEM

2) Prefetch instructions included by SSE ISA
   prefetcht0 MEM
   prefetcht1 MEM
   prefetcht2 MEM
   prefetchnta MEM

I am trying to generate 3DNOW/PRFCHW prefetch instructions:

  #include <x86intrin.h>
  void *p;
  void prefetchw__test (void)
  {
    __builtin_prefetch (p, 0, 0); // <== expecting prefetch p
    __builtin_prefetch (p, 1, 0); // <== expecting prefetchw p
  }

For the following set of options (enabled with -m3dnow and -mprfchw) the expected instruction for prefetch read (__builtin_prefetch (p, 0, 0)) is not generated.

1. gcc test.c -m3dnow -S
2. gcc test.c -m3dnow -mno-sse -mno-mmx -S
3. gcc test.c -S -mprfchw
4. gcc test.c -S -mprfchw -mno-sse -mno-mmx

At least for k6-2 architecture, I am not expecting the instruction prefetchnt2 to be listed with -mprfchw. (-march=k6-2 -m32 -mprfchw)

Am I missing something?
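As background for the report, here is a small sketch of how the second argument of __builtin_prefetch separates read prefetches (0) from write prefetches (1). The function names and the prefetch distance of 8 elements are arbitrary choices for illustration; which instruction is emitted, if any, depends on the target options discussed in this bug.

```c
#include <stddef.h>

/* Read loop: hint that data a few elements ahead will be read (rw = 0). */
long sketch_sum(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 8 < n)
            __builtin_prefetch(&data[i + 8], 0, 0);
        total += data[i];
    }
    return total;
}

/* Write loop: hint that data ahead will be written (rw = 1). */
void sketch_fill(long *data, size_t n, long value)
{
    for (size_t i = 0; i < n; i++) {
        if (i + 8 < n)
            __builtin_prefetch(&data[i + 8], 1, 0);
        data[i] = value;
    }
}
```

The prefetch is only a hint, so both loops compute the same results with or without it.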
[Bug rtl-optimization/53495] [4.8 Regression] segmentation fault
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53495 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|NEW |ASSIGNED AssignedTo|unassigned at gcc dot |jakub at gcc dot gnu.org |gnu.org | --- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 10:36:32 UTC --- Created attachment 28003 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28003 gcc48-pr53495.patch The problem is that find_moveable_pseudos creates some extra pseudos/def_insns, but then trivially_dead_insns is called by ira and deletes them (because they were feeding trivially dead insns only). Then move_unallocated_pseudos is called and expects to be able to tweak all the insns find_moveable_pseudos created. The attached untested patch fixes that.
[Bug target/54049] cr16: ICE: in gen_rtx_SUBREG with -O1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54049 Stefan Sørensen stefan at astylos dot dk changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED --- Comment #1 from Stefan Sørensen stefan at astylos dot dk 2012-08-13 10:51:19 UTC --- Works in 4.8-20120812 snapshot, closing.
[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|NEW |RESOLVED Resolution||DUPLICATE --- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 10:55:39 UTC --- If we want to rely on no dead insns before IRA, there would be no point in calling delete_trivially_dead_insns in it. *** This bug has been marked as a duplicate of bug 53495 ***
[Bug rtl-optimization/53495] [4.8 Regression] segmentation fault
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53495 --- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 10:55:39 UTC --- *** Bug 53411 has been marked as a duplicate of this bug. ***
[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411 --- Comment #6 from Bernd Schmidt bernds at gcc dot gnu.org 2012-08-13 11:07:27 UTC --- If the call to delete_trivially_dead_insns is supposed to eliminate only pre-existing dead insns, then just moving it to the beginning of IRA fixes this bug.
[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||vmakarov at gcc dot gnu.org --- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 11:24:31 UTC --- ira itself also removes something, e.g. in

  rebuild_jump_labels (get_insns ());
  if (purge_all_dead_edges ())
    delete_unreachable_blocks ();

so I wouldn't move that

  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
    df_analyze ();

too early in the function. But perhaps it could be moved before the

  /* It is not worth to do such improvement when we use a simple
     allocation because of -O0 usage or because the function is too
     big.  */
  if (ira_conflicts_p)
    find_moveable_pseudos ();

hunk. Vlad, what do you think? There is still ira_flattening that tweaks the RTL in between, dunno if it could create trivially dead insns or not. Moving d_t_d_i call before f_m_p call certainly fixes both of the testcases too, haven't bootstrapped/regtested either of the patches yet.
[Bug libstdc++/54112] including complex.h and complex fails in C++03
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54112 --- Comment #4 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 11:55:04 UTC --- Author: glisse Date: Mon Aug 13 11:55:00 2012 New Revision: 190340 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190340 Log: 2012-08-13 Marc Glisse marc.gli...@inria.fr PR libstdc++/54112 * include/c_compatibility/complex.h: Undefine complex, always include system's complex.h if present. * testsuite/26_numerics/complex/c99.cc: New testcase. * testsuite/17_intro/headers/c++1998/complex.cc: Likewise. * doc/xml/manual/numerics.xml: Document it. Added: trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc (with props) trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc (with props) Modified: trunk/libstdc++-v3/ChangeLog trunk/libstdc++-v3/doc/xml/manual/numerics.xml trunk/libstdc++-v3/include/c_compatibility/complex.h Propchange: trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc ('svn:eol-style' added) Propchange: trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc ('svn:keywords' added) Propchange: trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc ('svn:eol-style' added) Propchange: trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc ('svn:keywords' added)
[Bug tree-optimization/54200] copyrename generates wrong debuginfo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution||FIXED --- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 11:55:26 UTC --- Fixed as far as I am concerned.
[Bug libstdc++/54112] including complex.h and complex fails in C++03
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54112 Marc Glisse glisse at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||FIXED --- Comment #5 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 11:58:29 UTC --- Fixed.
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #12 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 11:58:33 UTC --- (In reply to comment #9)
> (In reply to comment #8)
> > If you do something like
> >
> >   gcc -c t1.c -mavx -flto
> >   gcc -c t2.c -msse2 -flto
> >   gcc t1.o t2.o -flto
> >
> > then the link step will use -mavx -msse2, that is, target options are concatenated.
>
> Indeed. What I'm asking for is that each source file be compiled with its own target options. I realise this is a request for enhancement, though.

Yes, there are similar option-related bugs for this. Note somebody needs to sit down and document the desired semantics of combining translation units T1 and T2, compiled with different options OP1 and OP2, at link-time with options OP3. Desired semantics including which cross-file optimizations (inlining?) are possible.
[Bug lto/54231] LTO generates code for the wrong CPU if different options used
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231 --- Comment #13 from Thiago Macieira thiago at kde dot org 2012-08-13 12:13:40 UTC --- (In reply to comment #12)
> Yes, there are similar option-related bugs for this. Note somebody needs to sit down and document the desired semantics of combining translation units T1 and T2, compiled with different options OP1 and OP2, at link-time with options OP3. Desired semantics including which cross-file optimizations (inlining?) are possible.

From my (admittedly restricted) point of view, inlining should be possible, provided the following conditions hold:

- when inlining a function with a lower optimisation / target setting, apply the outer scope's setting to the inlined code
- when inlining a function with a higher target requirement, inlining should be done only in the sense of partial function splitting, prologue, epilogues, constant propagation, etc.

In the case that I pasted, for example, I'd like GCC to realise that it has already tested if the counter variable is 0, then forego that test in the inlined, inner function.

Worst case scenario, simply forego inlining completely. Then the code would simply be no worse than the non-LTO case.
[Bug tree-optimization/54200] copyrename generates wrong debuginfo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200 Igor Zamyatin izamyatin at gmail dot com changed: What|Removed |Added CC||izamyatin at gmail dot com --- Comment #9 from Igor Zamyatin izamyatin at gmail dot com 2012-08-13 12:13:54 UTC --- I see following in report for x86: FAIL: gcc.dg/guality/pr54200.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 20 z == 3
[Bug tree-optimization/54240] New: Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 Bug #: 54240 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: ysrum...@gmail.com This regression can be seen in the attached simple test-case: the cmov conversion does not happen. The fix is evident: --- tree-ssa-phiopt.c (revision 190151) +++ tree-ssa-phiopt.c (working copy) @@ -1864,7 +1864,7 @@ /* Check the mode of the arguments to be sure a conditional move can be generated for it. */ - if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1 + if (optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1)) == CODE_FOR_nothing)) continue; /* Both statements must be assignments whose RHS is a COMPONENT_REF. */ You can see this regression on any platform supporting conditional moves (I tested it on x86).
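The diff above is damaged in this archive, so the exact original condition cannot be recovered from it. One plausible reading is that a logical negation was applied to the handler's result before the comparison with the "no instruction" sentinel. The sketch below shows how such a check inverts; every name in it is a hypothetical stand-in, and the sentinel value of 0 is an assumption, not a statement about GCC's sources.

```c
/* Hypothetical stand-ins for an instruction-code enum and a handler
   lookup; the sentinel being 0 is assumed for illustration. */
enum sketch_insn_code {
    SKETCH_CODE_FOR_nothing = 0,
    SKETCH_CODE_FOR_movsicc = 1
};

enum sketch_insn_code sketch_optab_handler(int target_has_cmov)
{
    return target_has_cmov ? SKETCH_CODE_FOR_movsicc
                           : SKETCH_CODE_FOR_nothing;
}

/* Suspect form: `!` binds tighter than `==`, so the negated result is
   what gets compared with the sentinel. Returns 1 when the candidate
   would be skipped. */
int sketch_skip_old(int target_has_cmov)
{
    return (!sketch_optab_handler(target_has_cmov))
           == SKETCH_CODE_FOR_nothing;
}

/* Explicit comparison, in the spirit of the proposed fix. */
int sketch_skip_new(int target_has_cmov)
{
    return sketch_optab_handler(target_has_cmov)
           == SKETCH_CODE_FOR_nothing;
}
```

Under these assumptions the old form skips exactly the targets that do have a conditional move, which would be consistent with the reported symptom.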
[Bug tree-optimization/54241] New: Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241 Bug #: 54241 Summary: Routine hoist_adjacent_loads does not work properly after r189366 Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization AssignedTo: unassig...@gcc.gnu.org ReportedBy: ysrum...@gmail.com This regression can be seen in the attached simple test-case: the cmov conversion does not happen. The fix is evident: --- tree-ssa-phiopt.c (revision 190151) +++ tree-ssa-phiopt.c (working copy) @@ -1864,7 +1864,7 @@ /* Check the mode of the arguments to be sure a conditional move can be generated for it. */ - if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1 + if (optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1)) == CODE_FOR_nothing)) continue; /* Both statements must be assignments whose RHS is a COMPONENT_REF. */ You can see this regression on any platform supporting conditional moves (I tested it on x86).
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 12:31:52 UTC --- Both in 4.7 (which is before the prfchw changes) and 4.8 with -m32 -m3dnow and -m32 -m3dnow -mno-sse I get prefetch + prefetchw insn, which looks ok to me. -mno-mmx I think disables 3dnow too, so you get no prefetch insns in that case (which is also fine). -mprfchw implies the SSE prefetches and PRFCHW CPUID 0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is correct that with -m32 -mprfchw prefetchnta + prefetchw is generated. So, where exactly do you see a bug?
[Bug tree-optimization/54200] copyrename generates wrong debuginfo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200 --- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 12:35:32 UTC --- (In reply to comment #9) I see following in report for x86: FAIL: gcc.dg/guality/pr54200.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 20 z == 3 That's what I said in the commit mail.
[Bug tree-optimization/54241] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution||DUPLICATE --- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 12:39:10 UTC --- . *** This bug has been marked as a duplicate of bug 54240 ***
[Bug tree-optimization/54241] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241 --- Comment #2 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 2012-08-13 12:39:23 UTC --- Created attachment 28004 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28004 test-case confirming the issue
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 --- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 12:39:10 UTC --- *** Bug 54241 has been marked as a duplicate of this bug. ***
[Bug c/53968] integer undefined behaviors in GCC
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53968 --- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 12:40:04 UTC --- Author: jakub Date: Mon Aug 13 12:39:54 2012 New Revision: 190342 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190342 Log: PR c/53968 * tree.c (integer_pow2p): Avoid undefined signed overflows. * simplify-rtx.c (neg_const_int): Likewise. * expr.c (fixup_args_size_notes): Likewise. * stor-layout.c (set_min_and_max_values_for_integral_type): Likewise. * double-int.c (mul_double_wide_with_sign): Likewise. (double_int_mask): Likewise. * tree-ssa-loop-ivopts.c (get_address_cost): Likewise. Modified: trunk/gcc/ChangeLog trunk/gcc/double-int.c trunk/gcc/expr.c trunk/gcc/simplify-rtx.c trunk/gcc/stor-layout.c trunk/gcc/tree-ssa-loop-ivopts.c trunk/gcc/tree.c
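The commit log does not show the individual changes, but the general technique for avoiding undefined signed overflow is to do the arithmetic on an unsigned copy, where wraparound is well defined. The two helpers below are a sketch of that idea only; they are not the code from the commit, and their names are made up.

```c
#include <limits.h>

/* Power-of-two test on the bit pattern. Working on an unsigned copy
   means `u - 1` can never overflow, even for the most negative input. */
int sketch_is_pow2(long long x)
{
    unsigned long long u = (unsigned long long) x;
    return u != 0 && (u & (u - 1)) == 0;
}

/* Two's-complement negation of the bit pattern. Writing `-x` directly
   would be undefined for LLONG_MIN; unsigned subtraction wraps. */
unsigned long long sketch_neg_bits(long long x)
{
    return 0ULL - (unsigned long long) x;
}
```

Note that sketch_is_pow2 looks only at the bit pattern, so it also reports LLONG_MIN (a single set bit) as a power of two.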
[Bug c/53968] integer undefined behaviors in GCC
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53968 Jakub Jelinek jakub at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2012-08-13 CC||hubicka at gcc dot gnu.org, ||jakub at gcc dot gnu.org Ever Confirmed|0 |1 --- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 12:41:06 UTC --- Haven't reproduced the diagnostic.c failure, and leaving the ipa hunk to Honza.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added CC||wschmidt at gcc dot gnu.org --- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 12:41:10 UTC --- Confirmed. William? Why don't we see any failed testcases?
[Bug middle-end/54201] XMM constant duplicated
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54201 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|ASSIGNED|NEW AssignedTo|rguenth at gcc dot gnu.org |unassigned at gcc dot ||gnu.org --- Comment #6 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 12:42:47 UTC --- Not working on it.
[Bug tree-optimization/54200] copyrename generates wrong debuginfo
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200 --- Comment #11 from Igor Zamyatin izamyatin at gmail dot com 2012-08-13 12:46:48 UTC --- Right! Sorry for the noise...
[Bug middle-end/54242] New: [4.8 Regression] Testsuite failures
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54242 Bug #: 54242 Summary: [4.8 Regression] Testsuite failures Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: middle-end AssignedTo: unassig...@gcc.gnu.org ReportedBy: hjl.to...@gmail.com CC: rgue...@gcc.gnu.org On Linux/x86-64, revision 190339: http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00316.html caused: FAIL: gcc.dg/guality/pr54200.c -O2 -flto -fuse-linker-plugin -fno-fat-lto-objects line 20 z == 3 FAIL: gcc.dg/guality/pr54200.c -Os line 20 z == 3 FAIL: gcc.target/i386/pad-10.c scan-assembler-not nop
[Bug driver/54210] gcc unable to detect -mprfchw flag in bulldozer machines
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54210 --- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 13:21:52 UTC --- Author: jakub Date: Mon Aug 13 13:21:41 2012 New Revision: 190345 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190345 Log: PR driver/54210 * config/i386/driver-i386.c (host_detect_local_cpu): Test bit_PRFCHW bit of CPUID 0x80000001 %ecx instead of CPUID 7 %ecx. * config/i386/cpuid.h (bits_PRFCHW): Move definition to CPUID 0x80000001 %ecx flags. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/cpuid.h trunk/gcc/config/i386/driver-i386.c
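For illustration, a host check along the lines this commit describes could be sketched as below. It assumes the leaf printed as "0x8001" in this archive is the extended leaf 0x80000001 with digits lost in archiving, uses GCC's <cpuid.h> helper __get_cpuid, and uses made-up names for everything else. It is not the code from driver-i386.c.

```c
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define SKETCH_HAVE_CPUID 1
#else
#define SKETCH_HAVE_CPUID 0
#endif

/* Bit 8 of %ecx in the extended leaf, as discussed in this thread. */
#define SKETCH_BIT_PRFCHW (1u << 8)

/* Pure bit test, separated out so it can be checked without CPUID. */
int sketch_ecx_has_prfchw(unsigned int ecx)
{
    return (ecx & SKETCH_BIT_PRFCHW) != 0;
}

/* Query the host; returns 0 when the leaf is unavailable or the
   target is not x86. */
int sketch_host_has_prfchw(void)
{
#if SKETCH_HAVE_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x80000001u, &eax, &ebx, &ecx, &edx))
        return 0;
    return sketch_ecx_has_prfchw(ecx);
#else
    return 0;
#endif
}
```

Reading the bit from leaf 7 instead, as the log says the old code did, would test an unrelated feature flag.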
[Bug middle-end/54242] [4.8 Regression] Testsuite failures
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54242 Richard Guenther rguenth at gcc dot gnu.org changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2012-08-13 Target Milestone|--- |4.8.0 Ever Confirmed|0 |1 --- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 13:27:45 UTC --- It caused only FAIL: gcc.target/i386/pad-10.c scan-assembler-not nop as said in the commit mail. The other FAILs are preferred over a dozen new XPASSes. pad-10.c is testing something that didn't really work before.
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 --- Comment #2 from Venkataramanan venkataramanan.kumar at amd dot com 2012-08-13 13:51:08 UTC --- (In reply to comment #1)
> Both in 4.7 (which is before the prfchw changes) and 4.8 with -m32 -m3dnow and -m32 -m3dnow -mno-sse I get prefetch + prefetchw insn, which looks ok to me. -mno-mmx I think disables 3dnow too, so you get no prefetch insns in that case (which is also fine). -mprfchw implies the SSE prefetches and PRFCHW CPUID 0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is correct that with -m32 -mprfchw prefetchnta + prefetchw is generated. So, where exactly do you see a bug?

Hi Jakub,

> -mprfchw implies the SSE prefetches and PRFCHW CPUID 0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is correct that with -m32 -mprfchw prefetchnta + prefetchw is generated. So, where exactly do you see a bug

As per the AMD CPUID manual, 0x80000001 ecx bit 8 implies both prefetch and prefetchw. http://blogs.amd.com/developer/2010/08/18/3dnow-deprecated/
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 --- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 13:58:40 UTC --- But the Intel manual AFAIK doesn't talk about prefetch insn. So, the -mprfchw switch needs to control solely the prefetchw instruction, and there might be a different one that controls the prefetch insn. In driver-i386.c you could enable -mprfchw vs. ?-mprfch -mprfchw? based on whether the CPU is Intel or AMD or something, but if there are CPUs that don't have both insns, it needs to be enabled independently. Areg?
[Bug target/54232] For x86 PIC code, ebx should be spillable
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232 --- Comment #3 from Rich Felker bugdal at aerifal dot cx 2012-08-13 13:59:17 UTC ---
> I think the GOT is introduced too late to do any fancy analysis on whether we need it or not.

This may be true, but if so, it's a highly suboptimal design that's hurting performance badly. 30% on the cryptographic code I looked at, and from working on FFmpeg in the past, I remember quite a few cases where PIC was hurting performance by significant measurable amounts like that too. If there's any way the changes I describe could be targeted even just in the long term, I think it would make a big difference for a lot of software.

> I also think that for outgoing function calls the ABI relies on a properly setup GOT, even for those that bind locally and thus do not go through the PLT.

The extern function call ABI on x86 does not allow the caller to depend on EBX containing the GOT address. This is because the callee has no way of knowing whether it was called by the same DSO it resides in. If not, the GOT address will be invalid for it.

For static functions whose addresses never leak out of the translation unit they're defined in, the calling convention is up to GCC. Ideally it would assume the GOT register is already loaded in such functions (as long as all the callees use the GOT), but in reality it rarely does. This is a separate code generation QoI issue that should perhaps be addressed as its own bug.
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239 --- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 14:00:55 UTC --- BTW, why do you care about the prefetch insn? Isn't it obsoleted by the SSE ISA prefetches anyway (unlike prefetchw)?
[Bug libstdc++/54185] condition_variable not properly destructed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185 --- Comment #8 from David Adler d.adler.s at gmail dot com 2012-08-13 14:09:16 UTC --- Created attachment 28005 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28005 proposed changelog I wasn't sure about the testcase file name, so I just guessed.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 --- Comment #3 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 14:14:59 UTC --- Odd, I don't know. I'll have to go back and look at the tests when I get a moment and investigate that. Peculiar.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 --- Comment #4 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 2012-08-13 14:15:08 UTC --- Created attachment 28006 -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28006 test-case confirming the issue
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 --- Comment #5 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 14:24:48 UTC --- Well, I'm embarrassed. The tests I wrote for this functionality never got into the test suite -- I apparently forgot to submit them with the patch -- and I can't find them anymore. I'll write some new ones soon. Apologies for the oversight. :(
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

--- Comment #5 from Venkataramanan venkataramanan.kumar at amd dot com 2012-08-13 14:33:14 UTC ---
(In reply to comment #4)
> BTW, why do you care about the prefetch insn? Isn't it obsoleted by the SSE ISA
> prefetches anyway (unlike prefetchw)?

Hi Jakub,

As far as I know, on fam15h processors they are exactly the same. Yes, I can use -mprfchw to generate the prefetchw instruction and use the prefetcht* instructions instead of the prefetch instruction.

But the SWOG guide for amdfam15 mentions that their functionality could change in the future.

(Snip)
AMD Family 15h processors implement the PREFETCHT0, PREFETCHT1, and PREFETCHT2 instructions in exactly the same way as the PREFETCH instruction. That is, the data is brought into the L1 data cache. This functionality could change in future implementations of the AMD Family 15h processor
(Snip)
[Bug libstdc++/54185] condition_variable not properly destructed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185 --- Comment #9 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 14:35:21 UTC --- Perfect - thanks. I'll get it committed tonight.
[Bug fortran/54243] New: f951: internal compiler error: Segmentation fault (trying to compile errorneous code)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

             Bug #: 54243
           Summary: f951: internal compiler error: Segmentation fault (trying to compile errorneous code)
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: sla...@staszic.waw.pl

With Debian's gcc-snapshot gfortran (4.8.0 20120714), trying to compile the code below:

module aqq_m
  type :: aqq_t
  contains
    procedure :: aqq_init
  end type
contains
  subroutine aqq_init(this)
    class(aqq_t) :: this
  end subroutine
end module

program bug2
  use aqq_m
  class(aqq_t) :: aqq
  call aqq%aqq_init
end program

I get:

$ /usr/lib/gcc-snapshot/bin/gfortran -std=f2008 -ffree-form bug2.f
bug2.f:24.21:

  class(aqq_t) :: aqq
                    1
Error: CLASS variable 'aqq' at (1) must be dummy, allocatable or pointer
f951: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-snapshot/README.Bugs for instructions.

HTH,
Sylwester
[Bug fortran/54244] New: f951: internal compiler error: in gfc_add_component_ref, at fortran/class.c:210
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54244

             Bug #: 54244
           Summary: f951: internal compiler error: in gfc_add_component_ref, at fortran/class.c:210
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: sla...@staszic.waw.pl

With Debian's gcc-snapshot gfortran (4.8.0 20120714), trying to compile the code below:

module aqq_m
  type :: arr_t
  end type
  type :: aqq_t
    class(arr_t), allocatable :: psi(:)
  contains
    procedure :: aqq_init
  end type
contains
  subroutine aqq_init(this)
    class(aqq_t) :: this
  end subroutine
end module

program bug1
  use aqq_m
  class(aqq_t) :: aqq
  call aqq%aqq_init
end program

I get:

$ /usr/lib/gcc-snapshot/bin/gfortran -std=f2008 -ffree-form bug1.f
bug1.f:32.21:

  class(aqq_t) :: aqq
                    1
Error: CLASS variable 'aqq' at (1) must be dummy, allocatable or pointer
bug1.f:33.10:

  call aqq%aqq_init
         1
Error: Type mismatch in argument 'this' at (1); passed CLASS(__class_aqq_m_Arr_t_1_0a) to CLASS(aqq_t)
f951: internal compiler error: in gfc_add_component_ref, at fortran/class.c:210
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-snapshot/README.Bugs for instructions.

HTH,
Sylwester
[Bug c++/53836] [4.7/4.8 Regression] ICE: unexpected expression of kind template_parm_index
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53836

Paolo Carlini paolo.carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-08-13
                 CC|                            |hjl.tools at gmail dot com
            Summary|ICE: unexpected expression  |[4.7/4.8 Regression] ICE:
                   |of kind template_parm_index |unexpected expression of
                   |                            |kind template_parm_index
     Ever Confirmed|0                           |1

--- Comment #3 from Paolo Carlini paolo.carlini at oracle dot com 2012-08-13 15:26:57 UTC ---
Mainline ICEs for me (190348), and this indeed looks like a regression. HJ, can you help figure out when we regressed?
[Bug fortran/54243] [OOP] ICE (segfault) in gfc_type_compatible for invalid BT_CLASS
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |error-recovery,
                   |                            |ice-on-invalid-code
   Last reconfirmed|                            |2012-08-13
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |janus at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|f951: internal compiler     |[OOP] ICE (segfault) in
                   |error: Segmentation fault   |gfc_type_compatible for
                   |(trying to compile          |invalid BT_CLASS
                   |errorneous code)            |

--- Comment #1 from Tobias Burnus burnus at gcc dot gnu.org 2012-08-13 15:35:11 UTC ---
Segfaults in

4837  gfc_type_compatible (gfc_typespec *ts1, gfc_typespec *ts2)
4838  {
4839    bool is_class1 = (ts1->type == BT_CLASS);
4840    bool is_class2 = (ts2->type == BT_CLASS);
...
4853    else if (is_class1 && is_class2)
4854      return gfc_type_is_extension_of (ts1->u.derived->components->ts.u.derived,
4855                                       ts2->u.derived->components->ts.u.derived);

The problem is that ts2->u.derived->components == NULL.
[Bug fortran/54244] [OOP] ICE in gfc_add_component_ref, at fortran/class.c:210
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54244

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |error-recovery,
                   |                            |ice-on-invalid-code
   Last reconfirmed|                            |2012-08-13
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |janus at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|f951: internal compiler     |[OOP] ICE in
                   |error: in                   |gfc_add_component_ref, at
                   |gfc_add_component_ref, at   |fortran/class.c:210
                   |fortran/class.c:210         |

--- Comment #1 from Tobias Burnus burnus at gcc dot gnu.org 2012-08-13 15:35:25 UTC ---
Fails in gfc_add_component_ref at

213    gcc_assert((*tail)->u.c.component);

Here, (*tail)->u.c.component == NULL and tail->u.c.sym->name == aqq_t.

Called via resolve_typebound_subroutine.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

William J. Schmidt wschmidt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-08-13
         AssignedTo|unassigned at gcc dot       |wschmidt at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #6 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 15:46:31 UTC ---
Mine.
[Bug middle-end/53823] [4.8 Regression] FAIL: gcc.c-torture/execute/930921-1.c execution at -O0 and -O1
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53823

--- Comment #23 from Richard Henderson rth at gcc dot gnu.org 2012-08-13 15:51:37 UTC ---
On 08/12/2012 07:30 AM, danglin at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53823
>
> --- Comment #22 from John David Anglin danglin at gcc dot gnu.org 2012-08-12 14:30:12 UTC ---
> Created attachment 27994
>   --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=27994

Ok.

r~
[Bug tree-optimization/54245] New: [4.8 regression] incorrect optimisation
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

             Bug #: 54245
           Summary: [4.8 regression] incorrect optimisation
    Classification: Unclassified
           Product: gcc
           Version: 4.8.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: m...@mansr.com

Created attachment 28007
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=28007
Test case

Since r190220 the attached test is compiled incorrectly at -O1 and higher.
[Bug target/54246] New: Bytemark FOURIER 54% slower in X32 chroot
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54246

             Bug #: 54246
           Summary: Bytemark FOURIER 54% slower in X32 chroot
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: target
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: wbr...@gmail.com

http://www.tux.org/~mayer/linux/nbench-byte-2.2.3.tar.gz

compiled on 64-bit system with glibc 2.14.1 and run in X32 chroot
FOURIER             :          36275  :      41.26  :      23.17

compiled in X32 chroot with glibc 2.16 and run in X32 chroot
FOURIER             :          16574  :      18.85  :      10.59

Both were compiled with the same CFLAGS:
-static -m64 -ggdb -Wall -O3 -funroll-loops -g0 -march=core2 -mfpmath=sse -fomit-frame-pointer -ffast-math -mssse3 -fno-PIE -fno-exceptions -fno-stack-protector
[Bug tree-optimization/54245] [4.8 regression] incorrect optimisation
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
   Target Milestone|---                         |4.8.0

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 17:19:40 UTC ---
Confirmed. slsr replaces:

   D.2219_3 = *row_2(D);
   D.2220_4 = (int) D.2219_3;
   a1_5 = D.2220_4 * 22725;
   D._6 = MEM[(short int *)row_2(D) + 4B];
   D.2223_7 = (int) D._6;
   D.2224_8 = D.2223_7 * 21407;
   a0_9 = D.2224_8 + a1_5;
   D.2225_10 = D.2223_7 * 8867;
-  a1_11 = a1_5 + D.2225_10;
+  slsr.4_25 = D._6 * 12540;
+  slsr.5_26 = (int) slsr.4_25;
+  a1_11 = a0_9 - slsr.5_26;

The multiplication is now performed in short int; presumably that is the problem here. Anyway, while the number of multiplications in the end is the same, with slsr the code sequence is also 3 insns/4 bytes longer on x86_64.
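The effect of doing the multiplication in short int can be sketched in plain C. The rewrite relies on 21407 - 8867 = 12540, so `a0 - x * 12540` equals `a1 + x * 8867` as long as the product is computed in int. The helper names below are invented for this illustration, and `wrap_to_int16` only models a 16-bit product by reducing modulo 2^16; it is not what GCC emits.

```c
#include <stdint.h>

/* Reduce a value modulo 2^16 and reinterpret it as a signed 16-bit
   number. This models a multiplication carried out in short int. */
static int32_t wrap_to_int16(int32_t v)
{
    uint32_t low = (uint32_t)v & 0xFFFFu;
    return (low & 0x8000u) ? (int32_t)low - 0x10000 : (int32_t)low;
}

/* Original sequence: a1 + row[2] * 8867, all in int. */
static int32_t a1_original(const int16_t *row)
{
    int32_t a1 = row[0] * 22725;
    return a1 + row[2] * 8867;
}

/* Rewritten sequence with the multiply kept in int: algebraically equal. */
static int32_t a1_wide(const int16_t *row)
{
    int32_t a1 = row[0] * 22725;
    int32_t a0 = row[2] * 21407 + a1;
    return a0 - row[2] * 12540;
}

/* Rewritten sequence with the multiply narrowed to 16 bits, as in the
   slsr output quoted above. */
static int32_t a1_narrowed(const int16_t *row)
{
    int32_t a1 = row[0] * 22725;
    int32_t a0 = row[2] * 21407 + a1;
    return a0 - wrap_to_int16(row[2] * 12540);
}
```

For small inputs such as `row[2] == 2` the product 25080 still fits in 16 bits and all three functions agree; with `row[2] == 100` the product 1254000 no longer fits, and only `a1_narrowed` diverges.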
[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #6 from Uros Bizjak ubizjak at gmail dot com 2012-08-13 17:32:21 UTC ---
(In reply to comment #5)
> > BTW, why do you care about the prefetch insn? Isn't it obsoleted by the SSE ISA
> > prefetches anyway (unlike prefetchw)?
>
> Hi Jakub,
>
> As far as I know, on fam15h processors they are exactly the same. Yes, I can
> use -mprfchw to generate the prefetchw instruction and use the prefetcht*
> instructions instead of the prefetch instruction.

The reason is described in the comment in i386.md:

  /* Use 3dNOW prefetch in case we are asking for write prefetch not
     supported by SSE counterpart or the SSE prefetch is not available
     (K6 machines).  Otherwise use SSE prefetch as it allows specifying
     of locality.  */

We are generating SSE prefetches, since they allow specification of locality.

> But the SWOG guide for amdfam15 mentions that their functionality could
> change in the future.
>
> (Snip)
> AMD Family 15h processors implement the PREFETCHT0, PREFETCHT1, and PREFETCHT2
> instructions in exactly the same way as the PREFETCH instruction. That is, the
> data is brought into the L1 data cache. This functionality could change in
> future implementations of the AMD Family 15h processor
> (Snip)

I see no problem here. For current implementations, SSE prefetches are treated in the same way as the 3dNOW prefetch. I read the quoted part as saying that, in the future, F15h SSE prefetches will implement the functionality described by the insn mnemonic (locality), not that they will overload the mnemonic with some other, different functionality. Some other functionality would need a different mnemonic, probably guarded by a cpuid flag.

So, INVALID.
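From the user's side, the read/write and locality choices discussed here are expressed through GCC's `__builtin_prefetch (addr, rw, locality)`. The sketch below is only an illustration: the function names and the prefetch distance `PREFETCH_AHEAD` are arbitrary choices for the example, and which instruction is actually emitted depends on the target flags in effect.

```c
#include <stddef.h>

/* Illustrative prefetch distance in elements; not a tuned value. */
enum { PREFETCH_AHEAD = 16 };

/* Sum an array while requesting read prefetches a fixed distance ahead. */
long sum_with_prefetch(const int *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* rw = 0: prefetch for read; locality = 3: high temporal locality */
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 0, 3);
        }
        total += data[i];
    }
    return total;
}

/* Scale an array in place while requesting write prefetches ahead. */
void scale_with_prefetch(int *data, size_t n, int factor)
{
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_AHEAD < n) {
            /* rw = 1: prefetch in anticipation of a write */
            __builtin_prefetch(&data[i + PREFETCH_AHEAD], 1, 3);
        }
        data[i] *= factor;
    }
}
```

Going by the i386.md comment quoted above, the read variant would be expected to use an SSE prefetch where available, while the write variant is the case where the 3dNOW form is chosen; compiling with `-S` under different `-m` options is the way to check what a particular GCC does.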
[Bug c++/54197] [4.7/4.8 regression] Lifetime of reference not properly extended
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54197

Ollie Wild aaw at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aaw at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |aaw at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #2 from Ollie Wild aaw at gcc dot gnu.org 2012-08-13 18:04:21 UTC ---
The issue is that these cause a COMPOUND_EXPR to be passed to extend_ref_init_temps_1. I have a patch which replaces the second operand of the COMPOUND_EXPR with another call to extend_ref_init_temps_1. Testing now. Will send out for review shortly.
[Bug c++/54197] [4.7/4.8 regression] Lifetime of reference not properly extended
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54197

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
[Bug fortran/54243] [OOP] ICE (segfault) in gfc_type_compatible for invalid BT_CLASS
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

janus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |janus at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #2 from janus at gcc dot gnu.org 2012-08-13 19:21:15 UTC ---
I think the proper fix for both this one and PR 54244 would be the following:

Index: gcc/fortran/resolve.c
===================================================================
--- gcc/fortran/resolve.c    (revision 190186)
+++ gcc/fortran/resolve.c    (working copy)
@@ -5793,6 +5795,9 @@ check_typebound_baseobject (gfc_expr* e)

   gcc_assert (base->ts.type == BT_DERIVED || base->ts.type == BT_CLASS);

+  if (base->ts.type == BT_CLASS && !gfc_expr_attr (base).class_ok)
+    return FAILURE;
+
   /* F08:C611.  */
   if (base->ts.type == BT_DERIVED && base->ts.u.derived->attr.abstract)
     {

This aborts the resolution of the type-bound call rather early (if the passed object was not properly declared), avoiding all problems that one could possibly run into later. It is also general enough that it should work for other similar cases.
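The idea behind the proposed patch, bailing out before any code dereferences data that a rejected declaration left unset, can be modelled in a few lines of C. Everything below is a hypothetical stand-in written for this illustration: `expr_model`, `check_base_object`, and `resolve_call` are not gfortran structures or functions, and only the shape of the `class_ok` test mirrors the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the gfortran data structures. */
enum base_type { TYPE_DERIVED, TYPE_CLASS };
enum status { SUCCESS, FAILURE };

struct component { int id; };

struct expr_model {
    enum base_type type;
    bool class_ok;                      /* false if the CLASS declaration was rejected */
    const struct component *components; /* may be NULL after a declaration error */
};

/* Early check, shaped like the test added to check_typebound_baseobject. */
static enum status check_base_object(const struct expr_model *base)
{
    if (base->type == TYPE_CLASS && !base->class_ok)
        return FAILURE;
    return SUCCESS;
}

/* Later resolution step that dereferences components unconditionally.
   Returns -1 when resolution was abandoned by the early check. */
static int resolve_call(const struct expr_model *base)
{
    if (check_base_object(base) == FAILURE)
        return -1;
    return base->components->id;
}
```

In this model an object with `class_ok == false` and a NULL `components` pointer is rejected up front, so the dereference that would otherwise fault is never reached.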
[Bug tree-optimization/54245] [4.8 regression] incorrect optimisation
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

William J. Schmidt wschmidt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-08-13
         AssignedTo|unassigned at gcc dot       |wschmidt at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #2 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 19:29:06 UTC ---
I'll take a look. Might be a day or two as my queue is kind of full.
[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #10 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 19:56:55 UTC ---
Author: redi
Date: Mon Aug 13 19:56:50 2012
New Revision: 190356

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190356
Log:
2012-08-13  David Adler  d.adle...@gmail.com

    PR libstdc++/54185
    * src/c++11/condition_variable.cc (condition_variable): Always
    destroy native type in destructor.
    * testsuite/30_threads/condition_variable/54185.cc: New.

Added:
    trunk/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc
Modified:
    trunk/libstdc++-v3/ChangeLog
    trunk/libstdc++-v3/src/c++11/condition_variable.cc
[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #11 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 19:57:36 UTC ---
Author: redi
Date: Mon Aug 13 19:57:31 2012
New Revision: 190357

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190357
Log:
2012-08-13  David Adler  d.adle...@gmail.com

    PR libstdc++/54185
    * src/c++11/condition_variable.cc (condition_variable): Always
    destroy native type in destructor.
    * testsuite/30_threads/condition_variable/54185.cc: New.

Added:
    branches/gcc-4_7-branch/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc
Modified:
    branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
    branches/gcc-4_7-branch/libstdc++-v3/src/c++11/condition_variable.cc
[Bug fortran/54247] New: OpenMP code fails at execution in AMD Interlagos
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54247

             Bug #: 54247
           Summary: OpenMP code fails at execution in AMD Interlagos
    Classification: Unclassified
           Product: gcc
           Version: 4.7.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: fortran
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: lo...@cray.com

cat test.f90
! derived from OpenMP test omp31f/F31_A_16_1.F90
!based on Example A.16.1f, p. 213 lines 1-19 in OpenMP API Ver 3.1.
program F31_A_16_1
  use omp_lib
  implicit none
  integer, parameter :: ITERATIONS = 2**17 ! Adjustable parameter
  integer(kind=omp_lock_kind) :: lock
  integer :: count_something_useful = 0, count_something_critical = 0

  call omp_set_num_threads(16)
  call omp_set_dynamic(.false.)
  call omp_init_lock(lock)
  !$omp parallel
  !$omp single
  call foo(lock, ITERATIONS)
  !$omp end single
  !$omp end parallel
  if(count_something_useful /= ITERATIONS .or. count_something_critical /= ITERATIONS) then
    write (6, '(*(G0))') ' FAIL - ', &
      '(count_something_useful,count_something_critical) == (', &
      count_something_useful, ',', count_something_critical, &
      '), expected (', ITERATIONS, ',', ITERATIONS, ')'
  end if
contains
  ! from OpenMP 3.1 Example A.16.1f
  subroutine foo ( lock, n )
    use omp_lib
    integer (kind=omp_lock_kind) :: lock
    integer n
    integer i
    do i = 1, n
      !$omp task
      call something_useful()
      do while ( .not. omp_test_lock(lock) )
        !$omp taskyield
      end do
      call something_critical()
      call omp_unset_lock(lock)
      !$omp end task
    end do
  end subroutine
  subroutine something_useful()
    !$omp atomic update
    count_something_useful = count_something_useful+1
  end subroutine something_useful
  subroutine something_critical
    ! isn't necessary to protect with atomic update, as invocations of this
    ! subroutine are protected by a lock
    count_something_critical = count_something_critical+1
  end subroutine something_critical
end program F31_A_16_1

ftn -fopenmp test.f90

ilrun -n1 -d16 ./a.out
 FAIL - (count_something_useful,count_something_critical) == (131072,131070), expected (131072,131072)
Application 8535547 resources: utime ~6s, stime ~1s

mcrun -n1 -d16 ./a.out
Application 8535554 resources: utime ~0s, stime ~1s

The code triggers a FAIL trap on Interlagos processors, but not on the previous-generation Magny-Cours AMD chips.

Command explanation:

ilrun -n1 -d16 -- Execute on a node with Interlagos processors, 1 node, 16 threads

mcrun -n1 -d16 -- Execute on a node with Magny-Cours processors, 1 node, 16 threads [2 sockets in SMP node]

ftn -- wrapper for Cray systems to get the right (we hope) set of libraries and default options for the current compilation environment. For the gcc environment, the options implied are:

COLLECT_GCC_OPTIONS='-u' 'pthread_mutex_trylock' '-fno-second-underscore' '-march=bdver1' '-static' '-v' '-fopenmp'
[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.7.2

--- Comment #12 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 20:00:40 UTC ---
fixed for 4.7.2
[Bug fortran/54247] OpenMP code fails at execution in AMD Interlagos
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54247

Bill Long longb at cray dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Bill Long longb at cray dot com 2012-08-13 20:38:33 UTC ---
Our internal OpenMP gurus spotted that in line 36 the

!$omp task

should be

!$omp task default(shared)

With that change, the code executes correctly on Interlagos nodes. The conclusion is that there is a bug in the OpenMP 3.1 examples, so this is still potentially useful information. But the initial complaint is not valid.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240 --- Comment #7 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 20:39:59 UTC --- Something else is broken, too, as the optab handlers for cmov on powerpc64 appear to have gone missing. I'll get one of our back-end specialists to help me understand that.
[Bug c++/53836] [4.7/4.8 Regression] ICE: unexpected expression of kind template_parm_index
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53836

H.J. Lu hjl.tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at redhat dot com

--- Comment #4 from H.J. Lu hjl.tools at gmail dot com 2012-08-13 21:07:04 UTC ---
It was fixed by revision 172942:

http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg01138.html

on the 4.6 branch. However, the same patch was never applied on trunk.
[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #8 from Andrew Pinski pinskia at gcc dot gnu.org 2012-08-13 21:59:33 UTC ---
(In reply to comment #7)
> Something else is broken, too, as the optab handlers for cmov on powerpc64
> appear to have gone missing. I'll get one of our back-end specialists to help
> me understand that.

They are only enabled for TARGET_ISEL<sel>, which is either TARGET_ISEL or TARGET_ISEL64. That is correct, as ppc64 does not have isel by default.
[Bug target/54142] ppc64 build failure - Unrecognized opcode: `sldi' (and `srdi`)
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54142

--- Comment #8 from Paul H. Hargrove PHHargrove at lbl dot gov 2012-08-13 22:04:40 UTC ---
The following is a transcript of a test I just tried on one of my systems where Gary and I have observed this bug. The test appears to show that the gcc provided by Fedora Core 6 does generate sldi instructions and that the system-provided assembler understands them.

So, whatever is causing the build failures that Gary and I see, it is *not* simply a matter of an assembler not supporting the instructions.

-Paul

{phargrov@fc6 ~}$ cat q.c
unsigned long long foo(void) { return 0x7FFFLLU; }
{phargrov@fc6 ~}$ gcc -m64 -O -S q.c
{phargrov@fc6 ~}$ cat q.s
        .file   "q.c"
        .section        ".toc","aw"
        .section        ".text"
        .align 2
        .globl foo
        .section        ".opd","aw"
        .align 3
foo:
        .quad   .L.foo,.TOC.@tocbase
        .previous
        .type   foo, @function
.L.foo:
        lis 3,0x7fff
        sldi 3,3,16
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .size   foo,.-.L.foo
        .ident  "GCC: (GNU) 4.1.2 20070626 (Red Hat 4.1.2-13)"
        .section        .note.GNU-stack,"",@progbits
{phargrov@fc6 ~}$ as -a64 -mppc64 q.s
[no errors]