Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-13 Thread Richard Guenther
On Fri, Aug 10, 2012 at 5:44 PM, Elmar Krieger el...@cmbi.ru.nl wrote:
 Hi Ian, hi Richard, hi Andi!

 Many thanks for your comments.


 The slowdown is not the same with other files, so I'm essentially sure
 that this specific source file has some 'feature' that catches GCC at
 the wrong leg. This raises my hopes that one of the GCC experts wants
 to take a look at it. The code is confidential,

 You could file a bug report with just a profile output of the compiler
 (e.g. from oprofile or perf)

 But please use a pristine FSF compiler.  You can also run the source
 through
 some obfuscation tool.  Or get a first hint with using -ftime-report.

 In the end, without a testcase there is nothing to do for us ...

 I downloaded the latest official GCC 4.7.1, but unfortunately configure
 stopped with Building GCC requires GMP 4.2+, MPFR 2.3.1+ and MPC 0.8.0+.,
 and for my CentOS Linux, only older versions of this libs are available as
 RPMs. I saw many hours of manual fiddling ahead, so I suggest a more
 efficient solution:

 I now sent the confidential source file by private message to Richard,
 please spend 5 minutes to run these two commands with it:

 time gcc -m32 -g -O0 -fno-strict-aliasing -x c -Wall -Werror -c model.i

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o
/dev/null model.i -march=i386 -fno-strict-aliasing -g -w
3.30user 0.03system 0:03.34elapsed 99%CPU (0avgtext+0avgdata 277072maxresident)k
0inputs+0outputs (0major+20416minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386
-fno-strict-aliasing -g -m32 -w
3.28user 0.08system 0:03.38elapsed 99%CPU (0avgtext+0avgdata 985760maxresident)k
0inputs+0outputs (0major+64353minor)pagefaults 0swaps

Same time.  I am positively surprised.

 time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o
/dev/null model.i -march=i386 -fno-strict-aliasing -g -w -O
8.09user 0.13system 0:08.29elapsed 99%CPU (0avgtext+0avgdata 381376maxresident)k
248inputs+0outputs (1major+38855minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386
-fno-strict-aliasing -g -m32 -w -O
15.33user 0.16system 0:15.55elapsed 99%CPU (0avgtext+0avgdata
1844272maxresident)k
24inputs+0outputs (1major+125893minor)pagefaults 0swaps

That's within reasonable bounds as well, IMHO (you can't really compare
-O1 from 3.2.3 with -O1 from 4.6.3).  One more data point (-O2 tends to
get more attention, and without debug info generation both its
improvements and its costs drop out):

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o
/dev/null model.i -march=i386 -fno-strict-aliasing -w -O2
17.31user 0.43system 0:17.82elapsed 99%CPU (0avgtext+0avgdata
427392maxresident)k
72inputs+0outputs (2major+69895minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386
-fno-strict-aliasing  -m32 -w -O2
18.12user 0.21system 0:18.43elapsed 99%CPU (0avgtext+0avgdata
1752784maxresident)k
0inputs+0outputs (0major+124029minor)pagefaults 0swaps

Same time, I am surprised again ;) (with improvements in CPU speed the
compilation with 4.6.3 is actually _faster_ when comparing commodity
platforms from the date of the compiler releases).

 If you don't find an enormous slowdown with the second command (please post
 your timings) and conclude that this problem has been introduced by Google
 in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.

You might want to try -ftime-report; if it says you have extra checking
enabled for one compiler but not the other, that would explain the
different outcome on your side:

Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

Disclaimer: I had to delete an include statement on top of the file I sent you
to make it compile.

Richard.

 To Ian:


 Not at all high.  See Type-Based Alias Analysis
 http://www.drdobbs.com/cpp/type-based-alias-analysis/184404273
 for one reason.


 Thanks, I read the article, but didn't really see how forbidding a
 function
 with argument void** to accept a pointer to any pointer helps with
 aliasing.

 If it's perfectly normal that a function with argument void* accepts any
 pointer, then a function with argument void** should accept a pointer to
 any
 pointer by analogy, without having additional aliasing problems, no?


 The C and C++ languages could work that way, yes.  But they don't.
 GCC attempts to implement the standard language.


 Yep, that's why I mentioned how GCC's smart extensions to the standard
 language saved the day many times in the past ;-)


 Aliasing issues arise when a function has two pointers, and determine
 whether an assignment to *p1 might change the value at *p2.  There are
 no aliasing issues with a void* pointer, because if p1 is void* then
 *p1 is invalid.  That is not true for a void** pointer, so aliasing
 issues do arise.  If p1 is void** and p2 is int**, then GCC will
 assume that 

Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Richard Guenther
On Sun, Aug 12, 2012 at 2:02 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Sat, Aug 11, 2012 at 11:16 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Lots of test cases fail with the attached patch.

 Lots still fail after correcting the verifier :-)

 920723-1.c: In function 'f':
 920723-1.c:14:1: error: bb 13 has loop depth 2, should be 1
  f (int count, vector_t * pos, double r, double *rho)
  ^
 920723-1.c:14:1: error: bb 14 has loop depth 2, should be 1
 920723-1.c:14:1: internal compiler error: in verify_loop_structure, at
 cfgloop.c:1598

That's a pre-existing bug in unswitching.  When unswitching simplifies
the condition it unswitches on using simplify_using_entry_checks, it may
turn an inner loop into an exit to an endless loop.  But it does not
update the loop structure to match this change.

void foo (int x, int r)
{
loop4:
  if (r = x)
    {
      goto loop4_latch;
    }
  else
    {
loop5:
      if (r = x)
        goto loop4_latch;
      goto loop5;
    }
loop4_latch:
  goto loop4;
}

simplified testcase that even fails at -O1.  We mostly rely on cfg-cleanup
to fixup loops for us, so this is one case it does not handle properly.

The quest of keeping loops up-to-date is hard ... but thanks for the checking
code ;)

Richard.
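
The kind of check that produces the "bb 13 has loop depth 2, should be 1"
errors quoted above can be sketched as follows. This is only an
illustration under stated assumptions: struct loop, struct block and both
function names are hypothetical, simplified stand-ins, not the real
definitions or the real verify_loop_structure from cfgloop.c.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified stand-ins for the compiler's loop and
   basic-block structures; the real ones carry far more state.  */
struct loop
{
  struct loop *outer;        /* enclosing loop, NULL for the root */
};

struct block
{
  int index;
  struct loop *loop_father;  /* innermost loop containing the block */
  int loop_depth;            /* cached copy of the nesting depth */
};

/* Depth of LOOP in the loop tree; the root (function body) is 0.  */
static int
computed_depth (const struct loop *loop)
{
  int depth = 0;

  assert (loop != NULL);
  for (; loop->outer != NULL; loop = loop->outer)
    depth++;
  return depth;
}

/* Report every block whose cached depth disagrees with the loop tree.
   Returns the number of mismatches found.  */
static int
verify_cached_depths (const struct block *blocks, size_t n)
{
  int errors = 0;

  for (size_t i = 0; i < n; i++)
    {
      int want = computed_depth (blocks[i].loop_father);

      if (blocks[i].loop_depth != want)
        {
          fprintf (stderr, "error: bb %d has loop depth %d, should be %d\n",
                   blocks[i].index, blocks[i].loop_depth, want);
          errors++;
        }
    }
  return errors;
}
```

A pass that re-nests a loop without touching the cached field leaves
exactly the stale value such a check is meant to catch.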

 Ciao!
 Steven


Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Richard Guenther
On Mon, Aug 13, 2012 at 12:21 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Sun, Aug 12, 2012 at 2:02 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 On Sat, Aug 11, 2012 at 11:16 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Lots of test cases fail with the attached patch.

 Lots still fail after correcting the verifier :-)

 920723-1.c: In function 'f':
 920723-1.c:14:1: error: bb 13 has loop depth 2, should be 1
  f (int count, vector_t * pos, double r, double *rho)
  ^
 920723-1.c:14:1: error: bb 14 has loop depth 2, should be 1
 920723-1.c:14:1: internal compiler error: in verify_loop_structure, at
 cfgloop.c:1598

 That's a pre-existing bug in unswitching.  When unswitching
 simplifies the condition it unswitches on using simplify_using_entry_checks
 it may turn an inner loop into an exit to an endless loop.  But it does
 not modify the loop stucture according to this change.

 void foo (int x, int r)
 {
 loop4:
   if (r = x)
 {
   goto loop4_latch;
 }
   else
 {
 loop5:
   if (r = x)
 goto loop4_latch;
   goto loop5;
 }
 loop4_latch:
   goto loop4;
 }

 simplified testcase that even fails at -O1.  We mostly rely on cfg-cleanup
 to fixup loops for us, so this is one case it does not handle properly.

Actually that testcase fails verification right after a full loop
discovery, which DOM1 performs ...

 The quest of keeping loops up-to-date is hard ... but thanks for the checking
 code ;)

 Richard.

 Ciao!
 Steven


Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Richard Guenther
On Mon, Aug 13, 2012 at 12:22 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Mon, Aug 13, 2012 at 12:21 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Sun, Aug 12, 2012 at 2:02 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 On Sat, Aug 11, 2012 at 11:16 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Lots of test cases fail with the attached patch.

 Lots still fail after correcting the verifier :-)

 920723-1.c: In function 'f':
 920723-1.c:14:1: error: bb 13 has loop depth 2, should be 1
  f (int count, vector_t * pos, double r, double *rho)
  ^
 920723-1.c:14:1: error: bb 14 has loop depth 2, should be 1
 920723-1.c:14:1: internal compiler error: in verify_loop_structure, at
 cfgloop.c:1598

 That's a pre-existing bug in unswitching.  When unswitching
 simplifies the condition it unswitches on using simplify_using_entry_checks
 it may turn an inner loop into an exit to an endless loop.  But it does
 not modify the loop stucture according to this change.

 void foo (int x, int r)
 {
 loop4:
   if (r = x)
 {
   goto loop4_latch;
 }
   else
 {
 loop5:
   if (r = x)
 goto loop4_latch;
   goto loop5;
 }
 loop4_latch:
   goto loop4;
 }

 simplified testcase that even fails at -O1.  We mostly rely on cfg-cleanup
 to fixup loops for us, so this is one case it does not handle properly.

 Actually that testcase fails verification right after a full loop
 discovery which
 DOM1 performs ...

Fixed by attached patch.

 The quest of keeping loops up-to-date is hard ... but thanks for the checking
 code ;)

Which probably still makes things fail elsewhere ;)

Richard.

 Richard.

 Ciao!
 Steven


fix-loops-1
Description: Binary data


Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-13 Thread Elmar Krieger

Hi Richard,

many thanks for saving my time.


time gcc -m32 -g -O -fno-strict-aliasing -x c -Wall -Werror -c model.i


That's within reasonable bounds as well, IMHO (you can't really compare
-O1 from 3.2.3 with -O1 from 4.6.3).  One more data point (-O2 tends to
be more focused on, no debuginfo generation turns off improvements
and its costs there):

/usr/bin/time /space/rguenther/install/gcc-3.2.3/bin/gcc -S -o
/dev/null model.i -march=i386 -fno-strict-aliasing -w -O2
17.31user 0.43system 0:17.82elapsed 99%CPU (0avgtext+0avgdata
427392maxresident)k
72inputs+0outputs (2major+69895minor)pagefaults 0swaps

/usr/bin/time gcc-4.6 -S -o /dev/null model.i -march=i386
-fno-strict-aliasing  -m32 -w -O2
18.12user 0.21system 0:18.43elapsed 99%CPU (0avgtext+0avgdata
1752784maxresident)k
0inputs+0outputs (0major+124029minor)pagefaults 0swaps

same time, I am surprised again ;) (with improvements in CPU speed the
compilation
with 4.6.3 is actually _faster_ comparing commodity platforms from the
date of the compiler releases).


 You might want to try -ftime-report, if it says you have extra checkings
 enabled for one compiler but not the other that will explain the
 different outcome at your side:


Good news, and especially the -ftime-report trick was highly useful.

For example, I got a huge slowdown also with this compiler:

gcc44 (GCC) 4.4.6 20110731 (Red Hat 4.4.6-3)
Copyright (C) 2010 Free Software Foundation, Inc.

which spends all its time in 'variable tracking':


 variable tracking     : 126.07 (89%) usr   0.26 ( 7%) sys 126.50 (87%) wall   20647 kB ( 6%) ggc
 TOTAL                 : 141.94             3.66           145.61          336368 kB


real    2m26.703s


And the Google Android compiler I reported originally...

i686-linux-android-gcc (GCC) 4.6.x-google 20120106 (prerelease)
Copyright (C) 2011 Free Software Foundation, Inc.

...which takes more than twice as long spends its time here:

 phase cgraph          : 347.75 (100%) usr  10.73 (76%) sys 358.51 (99%) wall  130837 kB (84%) ggc
 phase generate        : 347.85 (100%) usr  10.77 (76%) sys 358.64 (99%) wall  132490 kB (85%) ggc
 var-tracking dataflow : 284.34 (82%) usr   0.00 ( 0%) sys 284.21 (78%) wall       0 kB ( 0%) ggc
 TOTAL                 : 350.04            12.53           362.60          155292 kB


real    6m3.567s

I really didn't expect that RedHat and Google both mess up GCC with 
their modifications, so I'll report it to them instead ;-)


Anyway, please send by private email your favorite way of receiving the 
promised 100 USD. Could be PayPal, a list of Amazon.com items which are 
sent to your address, a direct bank transfer etc..


Best regards,
Elmar


If you don't find an enormous slowdown with the second command (please post
your timings) and conclude that this problem has been introduced by Google
in their custom GCC, I'll pay you 100 USD for the 5 minutes wasted.


Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.

Disclaimer: I had to delete an include statement on top of the file I sent you
to make it compile.

Richard.




Aliasing issues arise when a function has two pointers, and determine
whether an assignment to *p1 might change the value at *p2.  There are
no aliasing issues with a void* pointer, because if p1 is void* then
*p1 is invalid.  That is not true for a void** pointer, so aliasing
issues do arise.  If p1 is void** and p2 is int**, then GCC will
assume that an assignment to *p1 does not change the value at *p2, as
the language standard states.  It's easy to imagine that that could
break a program after inlining.
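
To make the point above concrete, here is a minimal sketch of the
conforming way to call a void ** function for an int * result. The names
alloc_buffer and make_ints are hypothetical and exist only for this
illustration. The call goes through a real void * object and converts
afterwards, so nothing of type int * is ever written through an lvalue of
type void *.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator-style helper: stores a fresh buffer in *OUT
   and returns nonzero on success.  */
static int
alloc_buffer (void **out, size_t size)
{
  assert (out != NULL);
  *out = malloc (size);
  return *out != NULL;
}

/* Conforming caller: hand over the address of a genuine void * object,
   then convert the result.  The shortcut
       int *p;  alloc_buffer ((void **) &p, n);
   would store into an int * object through a void * lvalue, which is
   the access the language standard does not allow.  */
static int *
make_ints (size_t count)
{
  void *tmp = NULL;
  int *result;

  if (!alloc_buffer (&tmp, count * sizeof (int)))
    return NULL;
  result = tmp;   /* void * converts to int * implicitly in C */
  return result;
}
```

The cost is one temporary per call site, which is the inconvenience the
void ** suggestion in this thread is trying to avoid.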



Many thanks for the clarification, and it also points to a simple solution:

GCC could simply permit to pass a pointer to any pointer to a function, if
the function argument is of type 'void **restrict myptr'.

If adding a 'restrict' to a function declaration was the only thing required
to get rid of countless nasty explicit type casts, the day would already be
saved. There really seem to be lots of problem classes that cannot be solved
with explicit type casts otherwise. The example for loading a binary file
from disk and allocating the required memory to store the file contents
being just one of them...

Best regards,
Elmar




Just one more complicated example:

A function that loads a binary file from disk and allocates the required
memory to store the file contents, returning the number of bytes read.
dstadd is the address where the newly allocated pointer is stored:

int dsc_loadfilealloc(void *dstadd,char *filename)
{ int read,size;
  FILE *fb;

  if ((fb=fopen(filename,"rb")))
  { size=dsc_filesize(filename);
    *(void**)dstadd=mem_alloc(size);
    read=dsc_readbytes(*(void**)dstadd,fb,size);
    *(void**)dstadd=mem_realloc(*(void**)dstadd,read);
    fclose(fb);
    return(read); }
  *(void**)dstadd=NULL;
  return(0); }

Again, nasty casts all over the place, which would all disappear if GCC
allowed me to write
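
For comparison, one way to get rid of the casts in standard C today is to
return the buffer and pass the byte count back through a size_t pointer.
This is only a sketch of that alternative interface, not the code from
the message above: load_file_alloc is a hypothetical name, and it uses
plain stdio and realloc instead of the dsc_* and mem_* helpers, whose
definitions are not shown.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read all of FILENAME into a freshly allocated buffer.  Returns the
   buffer (the caller frees it) and stores the byte count in *NREAD.
   Returns NULL with *NREAD == 0 if the file cannot be opened or memory
   runs out.  */
static void *
load_file_alloc (const char *filename, size_t *nread)
{
  FILE *fb;
  char *buf = NULL;
  size_t cap = 0, len = 0;

  assert (filename != NULL && nread != NULL);
  *nread = 0;
  fb = fopen (filename, "rb");
  if (fb == NULL)
    return NULL;

  for (;;)
    {
      size_t got;

      if (len == cap)
        {
          /* Buffer is full (or not yet allocated): double it.  */
          size_t newcap = cap ? cap * 2 : 4096;
          char *tmp = realloc (buf, newcap);

          if (tmp == NULL)
            {
              free (buf);
              fclose (fb);
              return NULL;
            }
          buf = tmp;
          cap = newcap;
        }
      got = fread (buf + len, 1, cap - len, fb);
      len += got;
      if (got == 0)   /* end of file or read error */
        break;
    }
  fclose (fb);
  *nread = len;
  return buf;
}
```

A caller can then write "int *table = load_file_alloc (name, &n);"
without any cast, because void * converts implicitly to any object
pointer type in C.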


Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Richard Guenther
On Mon, Aug 13, 2012 at 12:57 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 On Mon, Aug 13, 2012 at 12:22 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Mon, Aug 13, 2012 at 12:21 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 On Sun, Aug 12, 2012 at 2:02 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 On Sat, Aug 11, 2012 at 11:16 PM, Steven Bosscher stevenb@gmail.com 
 wrote:
 Lots of test cases fail with the attached patch.

 Lots still fail after correcting the verifier :-)

 920723-1.c: In function 'f':
 920723-1.c:14:1: error: bb 13 has loop depth 2, should be 1
  f (int count, vector_t * pos, double r, double *rho)
  ^
 920723-1.c:14:1: error: bb 14 has loop depth 2, should be 1
 920723-1.c:14:1: internal compiler error: in verify_loop_structure, at
 cfgloop.c:1598

 That's a pre-existing bug in unswitching.  When unswitching
 simplifies the condition it unswitches on using simplify_using_entry_checks
 it may turn an inner loop into an exit to an endless loop.  But it does
 not modify the loop stucture according to this change.

 void foo (int x, int r)
 {
 loop4:
   if (r = x)
 {
   goto loop4_latch;
 }
   else
 {
 loop5:
   if (r = x)
 goto loop4_latch;
   goto loop5;
 }
 loop4_latch:
   goto loop4;
 }

 simplified testcase that even fails at -O1.  We mostly rely on cfg-cleanup
 to fixup loops for us, so this is one case it does not handle properly.

 Actually that testcase fails verification right after a full loop
 discovery which
 DOM1 performs ...

 Fixed by attached patch.

 The quest of keeping loops up-to-date is hard ... but thanks for the 
 checking
 code ;)

 Which probably still makes things fail elsewhere ;)

Same issue in fix_loop_structure:

  /* Now fix the loop nesting.  */
  FOR_EACH_LOOP (li, loop, 0)
    {
      ploop = superloop[loop->num];
      if (ploop != loop_outer (loop))
        {
          flow_loop_tree_node_remove (loop);
          flow_loop_tree_node_add (ploop, loop);
        }
    }

I wonder why we cache loop-depth at all ... given that it is a simple
dereference bb->loop_father->superloops->base.prefix.num.  For all
the hassle to keep that cache up-to-date, that is.

Would anybody mind removing basic_block->loop_depth?  With C++
we can even have an overloaded loop_depth that works on both basic-blocks
and loops ...

Richard.
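
A small sketch of the idea of deriving the depth on demand rather than
caching it in each block. The structures and the names bb_loop_depth and
reparent_loop are hypothetical simplifications. In particular, this
sketch walks the outer pointers, whereas the message above describes
reading the length of a superloops vector, which gives the same number
without the walk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified loop-tree node and basic block.  */
struct loop
{
  struct loop *outer;        /* enclosing loop, NULL for the root */
};

struct block
{
  struct loop *loop_father;  /* innermost loop containing the block */
};

/* Nesting depth of LOOP; the root of the loop tree has depth 0.  */
static int
loop_tree_depth (const struct loop *loop)
{
  int depth = 0;

  assert (loop != NULL);
  while (loop->outer != NULL)
    {
      depth++;
      loop = loop->outer;
    }
  return depth;
}

/* Depth of a block, always derived from its loop_father, so there is
   no per-block copy that could go stale.  */
static int
bb_loop_depth (const struct block *bb)
{
  return loop_tree_depth (bb->loop_father);
}

/* Re-nest LOOP under NEW_OUTER, standing in for the remove/add pair
   in the snippet above.  */
static void
reparent_loop (struct loop *loop, struct loop *new_outer)
{
  loop->outer = new_outer;
}
```

After reparent_loop, every block in the moved loop reports the new depth
with no extra bookkeeping, which is the maintenance burden the cached
field imposes.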


 Richard.

 Richard.

 Ciao!
 Steven


Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Steven Bosscher
On Mon, Aug 13, 2012 at 1:27 PM, Richard Guenther
richard.guent...@gmail.com wrote:
 I wonder why we cache loop-depth at all ... given that it is a simple
 dereference bb->loop_father->superloops->base.prefix.num.  For all
 the hassle to keep that cache up-to-date, that is.

The cached bb->loop_depth saves two indirect references. But if it's
hard to maintain, I'd be happy to see it go away. Just so long as
bb->loop_father is correct (to be verified by a patch for the loop
verification code).

Ciao!
Steven


Re: Hopelessly broken loop_father, loop_depth

2012-08-13 Thread Richard Guenther
On Mon, Aug 13, 2012 at 3:15 PM, Steven Bosscher stevenb@gmail.com wrote:
 On Mon, Aug 13, 2012 at 1:27 PM, Richard Guenther
 richard.guent...@gmail.com wrote:
 I wonder why we cache loop-depth at all ... given that it is a simple
 dereference bb->loop_father->superloops->base.prefix.num.  For all
 the hassle to keep that cache up-to-date, that is.

 The cached bb->loop_depth saves two indirect references. But if it's
 hard to maintain, I'd be happy to see it go away. Just so long as
 bb->loop_father is correct (to be verified by a patch for the loop
 verification code).

loop_father is easier to keep up-to-date at least, and possibly should
just work.
A patch removing loop_depth just finished testing and I'll commit it in a sec.

Richard.

 Ciao!
 Steven


Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-13 Thread Frank Ch. Eigler

Elmar Krieger el...@cmbi.ru.nl writes:

 [...]  I really didn't expect that RedHat and Google both mess up
 GCC with their modifications, so I'll report it to them instead ;-)

That's not a fair characterization of the features' costs/benefits.

- FChE


50% slowdown with LTO

2012-08-13 Thread Paul_Koning
I'm not sure what LTO is supposed to do -- the documentation is not exactly 
clear.  But I assumed it should make things faster and/or smaller.

So I tried using it on an application -- a processor emulator, CPU intensive 
code, a lot of 64 bit integer arithmetic.

Using a compile/assembler run on the emulated system as a benchmark, I compared 
the code on x86_64-linux, gcc 4.7.0, -O2 plain, -O2 -fprofile-use (after having 
done -fprofile-generate), and -O2 -fprofile-use -flto (using a separate set of 
profile data files from -fprofile-generate -flto).  

Results: profiling speeds things up about 8%, but LTO is 50% (!) slower than 
without.

Any suggestions of what to look at for this?

paul



Slides and video for Cauldron 2012 presentations

2012-08-13 Thread Diego Novillo


I just uploaded all the slides I received and linked all the talks for 
which we had video.


Jan, if there are any more videos you have other than 
http://www.youtube.com/playlist?list=PL5D02780BAF2B55CF&feature=plcp, 
please send them my way.


To all the presenters, please check that the links I've created in 
http://gcc.gnu.org/wiki/cauldron2012 are correct.


If you do not see your slides linked, please attach them to the wiki 
page and modify your entry in the Presentations section.  If you are not 
sure how to do this, please send me the slides and I'll upload them 
(only PDF files, please).



Thanks.  Diego.


Re: Excluding dejagnu testcases for subtargets

2012-08-13 Thread Senthil Kumar Selvaraj
On Sat, Aug 11, 2012 at 09:40:52AM -0700, Janis Johnson wrote:
 On 08/11/2012 09:18 AM, Senthil Kumar Selvaraj wrote:
  On Fri, Aug 10, 2012 at 09:54:17AM -0700, Janis Johnson wrote:
  On 08/09/2012 10:52 PM, Senthil Kumar Selvaraj wrote:
  Hi,
 
   What is the recommended way to skip specific (non target specific) 
  testcases for a  subtargets?
 
   There are a bunch of tests in the gcc testsuite that are too big (in
   terms of code size or memory) for a subtarget of the avr target. The
   subtarget is specified in the dejagnu board configuration file
   (set_board_info cflags -mmcu subtarget name).
 
   Using dg-skip-if with -mmcu subtarget name for the include part did
   not work. Looking at the source (target-supports-dg.exp) showed that it 
   doesn't consider board_info cflags. Only board_info multilib_flags, 
   flags specified in dg-options, $TOOL_OPTIONS and $TEST_ALWAYS_FLAGS 
   appear to be considered.
 
   Should we set the -mmcu option to  multilib_flags instead of cflags in 
   the board config? Should we use --tool_opt in RUNTESTFLAGS? How do
   other targets handle this?
 
   Regards
   Senthil
 
  Probably check-flags in target-supports-dg.exp should check cflags
  in the board_info along with the other flags.  Can you try that to
  see if it does what you need?
 
  
  Yes it does. The patch below did the job.
 
 Please submit it, with a ChangeLog entry, to gcc-patc...@gcc.gnu.org.
 

Sent.
http://gcc.gnu.org/ml/gcc-patches/2012-08/msg00689.html

  Is there a reason why cflags wasn't included before?
 
 Because I didn't know about it.  It wasn't intentional.
 
 Janis
 
  Regards
  Senthil
  
  
  diff --git a/gcc/testsuite/lib/target-supports-dg.exp 
  b/gcc/testsuite/lib/target-supports-dg.exp
  index 2f6c4c2..bdf7476 100644
  --- a/gcc/testsuite/lib/target-supports-dg.exp
  +++ b/gcc/testsuite/lib/target-supports-dg.exp
  @@ -304,6 +304,9 @@ proc check-flags { args } {
   # If running a subset of the test suite, $TEST_ALWAYS_FLAGS may not 
  exist.
   catch {append compiler_flags  $TEST_ALWAYS_FLAGS }
   set dest [target_info name]
  +if [board_info $dest exists cflags] {
  +append compiler_flags [board_info $dest cflags] 
  +}
   if [board_info $dest exists multilib_flags] {
  append compiler_flags [board_info $dest multilib_flags] 
   }
 

Regards
Senthil


gcc trunk fails to build without isl/cloog

2012-08-13 Thread Paul_Koning
The installation instructions seem to imply that GCC can be built without 
having ISL and/or CLOOG installed, and the configure script accepts 
--without-isl and --without-cloog.

But I can't build that.  Reading the installation instructions makes me expect 
that such a configuration would skip the building of the graphite loop 
optimization machinery.  What happens instead is that it's built anyway, but 
the makefile aborts at the point where it tries to compile gcc/graphite.c 
(because cloog/cloog.h does not exist).

Is this supposed to work?  

paul



ISL install troubles

2012-08-13 Thread Paul_Koning
Where does one go to report issues with ISL?

Since GCC doesn't build without it, I'm trying to install ISL from sources.  
That doesn't work.  It accepts --with-gmp but there is nothing in the Makefile 
to pay attention to that -- the compiles are done without any switches so it 
fails unless gmp.h is in /usr/include.  Since I installed gmp from source in 
the usual way, it's in /usr/local/.

paul



Re: 50% slowdown with LTO

2012-08-13 Thread Ian Lance Taylor
On Mon, Aug 13, 2012 at 8:27 AM,  paul_kon...@dell.com wrote:
 I'm not sure what LTO is supposed to do -- the documentation is not exactly 
 clear.  But I assumed it should make things faster and/or smaller.

 So I tried using it on an application -- a processor emulator, CPU intensive 
 code, a lot of 64 bit integer arithmetic.

 Using a compile/assembler run on the emulated system as a benchmark, I 
 compared the code on x86_64-linux, gcc 4.7.0, -O2 plain, -O2 -fprofile-use 
 (after having done -fprofile-generate), and -O2 -fprofile-use -flto (using a 
 separate set of profile data files from -fprofile-generate -flto).

 Results: profiling speeds things up about 8%, but LTO is 50% (!) slower than 
 without.

 Any suggestions of what to look at for this?

LTO lets the compiler see all the code at once, enabling optimizations
like inlining function calls across different source files.  Like any
optimization, there are cases where it will cause code to slow down
rather than speed up.  A 50% slowdown is certainly unusual, and
suggests some systematic error.

Figuring out what has gone wrong is like optimizing any program.  Get
a profile for your program, e.g., using -pg.  Build the program with
and without -flto, run it, and look at the resulting profiles.  A 50%
slowdown should be fairly obvious.  I would guess that GCC has made a
poor inlining decision, but the profile should show the problem for
sure.

Ian


Re: gcc trunk fails to build without isl/cloog

2012-08-13 Thread H.J. Lu
On Mon, Aug 13, 2012 at 9:01 AM,  paul_kon...@dell.com wrote:
 The installation instructions seem to imply that GCC can be built without 
 having ISL and/or CLOOG installed, and the configure script accepts 
 --without-isl and --without-cloog.

 But I can't build that.  Reading the installation instructions makes me 
 expect that such a configuration would skip the building of the graphite 
 loop optimization machinery.  What happens instead is that it's built anyway, 
 but the makefile aborts at the point where it tries to compile gcc/graphite.c 
 (because cloog/cloog.h does not exist).

 Is this supposed to work?


Trunk builds fine without ppl when GCC is configured with --without-ppl:

auto-host.h:/* #undef HAVE_cloog */

[hjl@gnu-mic-2 build-x86_64-linux]$ ldd gcc/cc1
linux-vdso.so.1 =>  (0xff98)
libmpc.so.2 => /libx32/libmpc.so.2 (0xf73ad000)
libmpfr.so.4 => /libx32/libmpfr.so.4 (0xf7157000)
libgmp.so.10 => /libx32/libgmp.so.10 (0xf6eeb000)
libdl.so.2 => /libx32/libdl.so.2 (0xf6ce8000)
libz.so.1 => /libx32/libz.so.1 (0xf6ad4000)
libm.so.6 => /libx32/libm.so.6 (0xf67db000)
libc.so.6 => /libx32/libc.so.6 (0xf642c000)
/libx32/ld-linux-x32.so.2 (0xf75c1000)
[hjl@gnu-mic-2 build-x86_64-linux]$

I do have mpc, mpfr and gmp.

-- 
H.J.


Re: 50% slowdown with LTO

2012-08-13 Thread Andi Kleen
Ian Lance Taylor i...@google.com writes:

 Figuring out what has gone wrong is like optimizing any program.  Get
 a profile for your program, e.g., using -pg.  Build the program with
 and without -flto, run it, and look at the resulting profiles.  A 50%
 slowdown should be fairly obvious.  I would guess that GCC has made a
 poor inlining decision, but the profile should show the problem for
 sure.

On modern profiling tools like perf or oprofile you can also
diff profiles for this.

-Andi

-- 
a...@linux.intel.com -- Speaking for myself only


Re: gcc trunk fails to build without isl/cloog

2012-08-13 Thread Paul_Koning

On Aug 13, 2012, at 12:42 PM, H.J. Lu wrote:

 On Mon, Aug 13, 2012 at 9:01 AM,  paul_kon...@dell.com wrote:
 The installation instructions seem to imply that GCC can be built without 
 having ISL and/or CLOOG installed, and the configure script accepts 
 --without-isl and --without-cloog.
 
 But I can't build that.  Reading the installation instructions makes me 
 expect that such a configuration would skip the building of the graphite 
 loop optimization machinery.  What happens instead is that it's built 
 anyway, but the makefile aborts at the point where it tries to compile 
 gcc/graphite.c (because cloog/cloog.h does not exist).
 
 Is this supposed to work?
 
 
 Trunk builds fine without ppl when GCC is configured with --without-ppl:
 
 auto-host.h:/* #undef HAVE_cloog */
 
 [hjl@gnu-mic-2 build-x86_64-linux]$ ldd gcc/cc1
   linux-vdso.so.1 =>  (0xff98)
   libmpc.so.2 => /libx32/libmpc.so.2 (0xf73ad000)
   libmpfr.so.4 => /libx32/libmpfr.so.4 (0xf7157000)
   libgmp.so.10 => /libx32/libgmp.so.10 (0xf6eeb000)
   libdl.so.2 => /libx32/libdl.so.2 (0xf6ce8000)
   libz.so.1 => /libx32/libz.so.1 (0xf6ad4000)
   libm.so.6 => /libx32/libm.so.6 (0xf67db000)
   libc.so.6 => /libx32/libc.so.6 (0xf642c000)
   /libx32/ld-linux-x32.so.2 (0xf75c1000)
 [hjl@gnu-mic-2 build-x86_64-linux]$
 
 I do have mpc, mpfr and gmp.

Is ppl another name for cloog?  If I don't have cloog, should I say 
--without-ppl?  That doesn't make much sense, and it isn't documented.

paul



Re: gcc trunk fails to build without isl/cloog

2012-08-13 Thread Andreas Schwab
See http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54138.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
And now for something completely different.


Re: New GCC takes 19x as long to compile my program (compared to old GCC), plus void** patch suggestion

2012-08-13 Thread Fumiaki Isoya

  [...]  I really didn't expect that RedHat and Google both mess up
  GCC with their modifications, so I'll report it to them instead

 That's not a fair characterization of the features' costs/benefits.

We are just trying to mess up (?) binutils, aren't we?  gcc just
receives the benefit by adapting to it.  The benefit is that you only
need to re-compile the files you touched.  Is that messing up, do you
(plural) think?

- Isoyaf



[Bug fortran/54238] New: If possible, TRANSFER should use assignment instead of MEMCPY

2012-08-13 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54238

 Bug #: 54238
   Summary: If possible, TRANSFER should use assignment instead of
MEMCPY
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Keywords: missed-optimization
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: bur...@gcc.gnu.org


In some cases TRANSFER can be replaced by a normal assignment (with cast),
possibly also an ARRAY_RANGE_REF with cast?

Example, the following code – which matches the currently used scalarizer for
FINAL:


use iso_c_binding, only: c_intptr_t, c_loc, c_ptr, c_int, c_f_pointer
integer(c_int), target :: array(4)
integer(c_int), pointer :: ptr

integer(c_intptr_t) :: addr
type(c_ptr) :: cptr

array = [11,22,33,44]
do i = 0, 3
  cptr = c_loc (array)
  addr = transfer (cptr, addr) + i * storage_size (array)/8
  call c_f_pointer (transfer (addr, cptr), ptr)
  print *, i,': ', ptr
end do
end


Dump of: addr = transfer (cptr, addr) + i * storage_size (array)/8

  {
    struct array1_integer(kind=4) parm.2;
    integer(kind=8) transfer.1;
    integer(kind=8) D.1876;
    integer(kind=8) D.1875;
    integer(kind=8) D.1874;

    D.1874 = 8;
    D.1875 = 8;
    __builtin_memcpy ((void *) &transfer.1, (void *) &cptr,
                      MAX_EXPR <MIN_EXPR <D.1875, D.1874>, 0>);
    parm.2.dtype = 265;
    parm.2.dim[0].lbound = 1;
    parm.2.dim[0].ubound = 4;
    parm.2.dim[0].stride = 1;
    parm.2.data = (void *) &array[0];
    parm.2.offset = -1;
    addr = (integer(kind=8)) ((i * 32) / 8) + transfer.1;
  }

While a simple
  addr = (intptr_t) cptr;
should be sufficient.
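
For readers who think in C, the difference between the two forms can be sketched roughly as below. This is only an illustration, not gfortran's generated code: the function names are made up, and the claim that both variants yield the same integer is an assumption that holds on common flat-address platforms but is not guaranteed by the C standard.

```c
#include <stdint.h>
#include <string.h>

/* Copy the object representation of a pointer into an integer,
   which is roughly what the dumped __builtin_memcpy does. */
intptr_t addr_via_memcpy(const void *p)
{
    intptr_t addr = 0;
    /* Copy no more bytes than either object holds. */
    size_t n = sizeof addr < sizeof p ? sizeof addr : sizeof p;
    memcpy(&addr, &p, n);
    return addr;
}

/* The direct conversion the report asks for. */
intptr_t addr_via_cast(const void *p)
{
    return (intptr_t)p;
}
```

The cast version gives the optimizers a plain conversion to work with, while the memcpy version first has to be recognized and folded.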


[Bug fortran/54238] If possible, TRANSFER should use assignment instead of MEMCPY

2012-08-13 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54238

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org 2012-08-13 
06:15:43 UTC ---
Though the memcpy does get optimized to a VCE:
  addr.9_4 = (integer(kind=8)) ivtmp.29_28;
  D.1913_24 = VIEW_CONVERT_EXPR<void *>(addr.9_4);

So it might not be important enough to do at the front-end level.


[Bug bootstrap/50167] gmp memory functions are extern C (graphite)

2012-08-13 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50167

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

 CC||glisse at gcc dot gnu.org

--- Comment #2 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 06:17:30 
UTC ---
Note that this could also be solved by using gmp_fprintf. (Or by using
mpz_class::get_str, since we seem to be moving to C++ anyway)


[Bug middle-end/52173] internal compiler error: verify_ssa failed possibly caused by itm

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52173

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-08-13
 CC||aldyh at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Ever Confirmed|0   |1


[Bug rtl-optimization/53942] [4.6/4.7/4.8 Regression] unable to find a register to spill in class 'CREG'

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53942

--- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
07:35:11 UTC ---
Author: jakub
Date: Mon Aug 13 07:35:03 2012
New Revision: 190338

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190338
Log:
Backported from trunk
2012-07-19  Jakub Jelinek  ja...@redhat.com

PR rtl-optimization/53942
* function.c (assign_parm_setup_reg): Avoid zero/sign extension
directly from likely spilled non-fixed hard registers, move them
to pseudo first.

* gcc.dg/pr53942.c: New test.

Added:
branches/gcc-4_7-branch/gcc/testsuite/gcc.dg/pr53942.c
Modified:
branches/gcc-4_7-branch/gcc/ChangeLog
branches/gcc-4_7-branch/gcc/function.c
branches/gcc-4_7-branch/gcc/testsuite/ChangeLog


[Bug libstdc++/54237] [C++11] Make more tuple-related functions constexpr

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54237

Jonathan Wakely redi at gcc dot gnu.org changed:

   What|Removed |Added

 CC||bkoz at gcc dot gnu.org
   Severity|normal  |enhancement

--- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 
08:14:00 UTC ---
That does seem possible.  Benjamin wrote
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2010/n3231.html so let's CC
him.


[Bug tree-optimization/21485] [4.6/4.7/4.8 Regression] missed load PRE, PRE makes i?86 suck

2012-08-13 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=21485

--- Comment #53 from wbrana wbrana at gmail dot com 2012-08-13 08:26:13 UTC 
---
It seems it was improved.

4.8 20120806
NUMERIC SORT:  1543.7  :  39.59  :  13.00
4.8 20120813
NUMERIC SORT:  2007.8  :  51.49  :  16.91


[Bug debug/51358] incorrect/missing location for function arg, -O0, without VTA

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51358

--- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
08:55:05 UTC ---
(In reply to comment #4)
 It would not be helpful, systemtap would then see no data (just not wrong
 data).
 
 Also at that time location list will need to be used and currently GDB when it
 sees any location list it thinks it no longer needs to skip the prologue.
 OTOH GDB could look at -grecord-gcc-switches first which it currently does not
 so I should just finally implement -grecord-gcc-switches in GDB in such case.

I think seeing wrong data, thus, wrong-debug is never superior over no debug
info / no data.


[Bug target/54232] For x86 PIC code, ebx should be spillable

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Target||x86_64-*-*, i?86-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-08-13
Version|unknown |4.8.0
 Ever Confirmed|0   |1

--- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
08:57:15 UTC ---
I think the GOT is introduced too late to do any fancy analysis on whether
we need it or not.  I also think that for outgoing function calls the ABI
relies on a properly set up GOT, even for those that bind locally and thus
do not go through the PLT.


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
08:59:18 UTC ---
If you do something like

 gcc -c t1.c -mavx -flto
 gcc -c t2.c -msse2 -flto
 gcc t1.o t2.o -flto

then the link step will use -mavx -msse2, that is, target options are
concatenated.


[Bug tree-optimization/54200] copyrename generates wrong debuginfo

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200

--- Comment #7 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
09:29:33 UTC ---
Author: rguenth
Date: Mon Aug 13 09:29:28 2012
New Revision: 190339

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190339
Log:
2012-08-13  Richard Guenther  rguent...@suse.de

PR tree-optimization/54200
* tree-ssa-copyrename.c (rename_ssa_copies): Do not add
PHI results to another partition if not all PHI arguments
have the same partition.

* gcc.dg/guality/pr54200.c: New testcase.
* gcc.dg/tree-ssa/slsr-8.c: Adjust.

Added:
trunk/gcc/testsuite/gcc.dg/guality/pr54200.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/tree-ssa/slsr-8.c
trunk/gcc/tree-ssa-copyrename.c


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #9 from Thiago Macieira thiago at kde dot org 2012-08-13 09:44:51 
UTC ---
(In reply to comment #8)
 If you do something like
 
  gcc -c t1.c -mavx -flto
  gcc -c t2.c -msse2 -flto
  gcc t1.o t2.o -flto
 
 then the link step will use -mavx -msse2, that is, target options are
 concatenated.

Indeed.

What I'm asking for is that each source file be compiled with its own target
options. I realise this is a request for enhancement, though.


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #10 from Thiago Macieira thiago at kde dot org 2012-08-13 
09:53:32 UTC ---
Another test:

$ cat main_avx.c
#define BZERO bzero_avx
#pragma GCC target ("avx")
#include "main.c"

$ cat main_sse2.c
#define BZERO bzero_sse2
#pragma GCC target ("sse2")
#include "main.c"

$ cat main.c
#include <immintrin.h>

void BZERO(char *ptr, size_t count)
{
    __m128i zero = _mm_set1_epi8(0);
    while (count--) {
        _mm_stream_si128((__m128i*)ptr, zero);
        ptr += 16;
    }
}

$ gcc -flto -O2 -shared -o libtest.so main_avx.c main_sse2.c
$ objdump -Cdr --no-show-raw-insn libtest.so
[...]

0650 <bzero_sse2>:
 650:   test   %rsi,%rsi
 653:   pxor   %xmm0,%xmm0
 657:   je     66e <bzero_sse2+0x1e>
 659:   nopl   0x0(%rax)
 660:   movntdq %xmm0,(%rdi)
 664:   add    $0x10,%rdi
 668:   sub    $0x1,%rsi
 66c:   jne    660 <bzero_sse2+0x10>
 66e:   repz retq

0670 <bzero_avx>:
 670:   test   %rsi,%rsi
 673:   pxor   %xmm0,%xmm0
 677:   je     68e <bzero_avx+0x1e>
 679:   nopl   0x0(%rax)
 680:   movntdq %xmm0,(%rdi)
 684:   add    $0x10,%rdi
 688:   sub    $0x1,%rsi
 68c:   jne    680 <bzero_avx+0x10>
 68e:   repz retq


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #11 from Thiago Macieira thiago at kde dot org 2012-08-13 
10:12:48 UTC ---
Attaching __attribute__((target(xxx))) to the function does help.

It generates the following with the my_bzero function from comment 2:

02e0 <bzero_avx.2362>:
 2e0:   test   %rsi,%rsi
 2e3:   vpxor  %xmm0,%xmm0,%xmm0
 2e7:   je     2fe <bzero_avx.2362+0x1e>
 2e9:   nopl   0x0(%rax)
 2f0:   vmovntdq %xmm0,(%rdi)
 2f4:   add    $0x10,%rdi
 2f8:   sub    $0x1,%rsi
 2fc:   jne    2f0 <bzero_avx.2362+0x10>
 2fe:   repz retq

0300 <my_bzero>:
 300:   mov    0x200171(%rip),%rax        # 200478 <my_bzero+0x200178>
 307:   mov    (%rax),%eax
 309:   test   %eax,%eax
 30b:   jne    330 <my_bzero+0x30>
 30d:   test   %rsi,%rsi
 310:   pxor   %xmm0,%xmm0
 314:   je     332 <my_bzero+0x32>
 316:   nopw   %cs:0x0(%rax,%rax,1)
 320:   movntdq %xmm0,(%rdi)
 324:   add    $0x10,%rdi
 328:   sub    $0x1,%rsi
 32c:   jne    320 <my_bzero+0x20>
 32e:   repz retq
 330:   jmp    2e0 <bzero_avx.2362>
 332:   repz retq


This workaround might be useful for me in a few places where the code inlining
provided by LTO was desired (even though, in this example, the AVX variant is
exactly what it would be if no LTO had been used). But it won't work without
major changes to the code if I have 400+ functions in a file, plus possibly
inlines from headers, to be compiled.
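
A minimal sketch of the per-function workaround, for illustration only. The names zero_baseline, zero_avx, zero_dispatch and the two macros are hypothetical and are not the my_bzero code from comment 2. It assumes a GCC-compatible compiler that provides the target attribute and the __builtin_cpu_supports builtin on x86; on other targets the sketch falls back to the baseline loop.

```c
#include <stddef.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_TARGET_ATTR 1
#define TARGET_AVX __attribute__((target("avx")))
#else
#define HAVE_X86_TARGET_ATTR 0
#define TARGET_AVX
#endif

/* Baseline variant, built with the translation unit's default options. */
void zero_baseline(char *ptr, size_t count)
{
    for (size_t i = 0; i < count; i++)
        ptr[i] = 0;
}

/* Same loop, but the compiler may use AVX encodings in this function only. */
TARGET_AVX void zero_avx(char *ptr, size_t count)
{
    for (size_t i = 0; i < count; i++)
        ptr[i] = 0;
}

/* Dispatcher: call the AVX variant only when the running CPU reports AVX. */
void zero_dispatch(char *ptr, size_t count)
{
#if HAVE_X86_TARGET_ATTR
    if (__builtin_cpu_supports("avx")) {
        zero_avx(ptr, count);
        return;
    }
#endif
    zero_baseline(ptr, count);
}
```

The attribute has to be repeated on every function that needs the higher target, which is exactly why this does not scale to a file with hundreds of functions.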


[Bug target/54239] New: Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

 Bug #: 54239
   Summary: Not able to generate prefetch (prefetch read)
instruction using -m3dnow or -mprfchw
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: venkataramanan.ku...@amd.com


Hi all, 

The prefetch instructions fall into two classes.

1) Prefetch instructions included in the 3DNOW ISA / new PRFCHW ISA (checked
against cpuid function 0x80000001, bit 8 of the ecx register)

prefetch   MEM
prefetchw  MEM

2) Prefetch instructions included by SSE ISA

prefetcht0 MEM
prefetcht1 MEM
prefetcht2 MEM
prefetchnta MEM

I am trying to generate 3DNOW/PRFCHW prefetch instructions

#include <x86intrin.h>
void *p;

void
prefetchw__test (void)
{
  __builtin_prefetch (p, 0, 0); // <== expecting prefetch p
  __builtin_prefetch (p, 1, 0); // <== expecting prefetchw p
}

For the following set of options (enabled with -m3dnow and -mprfchw) the
expected instruction for prefetch read (__builtin_prefetch (p, 0, 0)) is not
generated.
1. gcc test.c -m3dnow -S
2. gcc test.c -m3dnow -mno-sse -mno-mmx -S
3. gcc test.c -S -mprfchw
4. gcc test.c -S -mprfchw -mno-sse -mno-mmx

At least for k6-2 architecture, I am not expecting the instruction prefetchnt2
to be listed with -mprfchw. (-march=k6-2 -m32 -mprfchw)

Am I missing something?


[Bug rtl-optimization/53495] [4.8 Regression] segmentation fault

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53495

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |jakub at gcc dot gnu.org
   |gnu.org |

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
10:36:32 UTC ---
Created attachment 28003
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28003
gcc48-pr53495.patch

The problem is that find_moveable_pseudos creates some extra pseudos/def_insns,
but then trivially_dead_insns is called by ira and deletes them (because they
were feeding trivially dead insns only).  Then move_unallocated_pseudos is
called and expects to be able to tweak all the insns find_moveable_pseudos
created.  The attached untested patch fixes that.


[Bug target/54049] cr16: ICE: in gen_rtx_SUBREG with -O1

2012-08-13 Thread stefan at astylos dot dk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54049

Stefan Sørensen stefan at astylos dot dk changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #1 from Stefan Sørensen stefan at astylos dot dk 2012-08-13 
10:51:19 UTC ---
Works in 4.8-20120812 snapshot, closing.


[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution||DUPLICATE

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
10:55:39 UTC ---
If we want to rely on there being no dead insns before IRA, there would be no
point in calling delete_trivially_dead_insns in it.

*** This bug has been marked as a duplicate of bug 53495 ***


[Bug rtl-optimization/53495] [4.8 Regression] segmentation fault

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53495

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
10:55:39 UTC ---
*** Bug 53411 has been marked as a duplicate of this bug. ***


[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos

2012-08-13 Thread bernds at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411

--- Comment #6 from Bernd Schmidt bernds at gcc dot gnu.org 2012-08-13 
11:07:27 UTC ---
If the call to delete_trivially_dead_insns is supposed to eliminate only
pre-existing dead insns, then just moving it to the beginning of IRA fixes this
bug.


[Bug middle-end/53411] [4.8 Regression] ICE in move_unallocated_pseudos

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53411

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||vmakarov at gcc dot gnu.org

--- Comment #7 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
11:24:31 UTC ---
ira itself also removes something, e.g. in
  rebuild_jump_labels (get_insns ());
  if (purge_all_dead_edges ())
delete_unreachable_blocks ();
so I wouldn't move that
  if (delete_trivially_dead_insns (get_insns (), max_reg_num ()))
df_analyze ();
too early in the function.  But perhaps it could be moved before the
  /* It is not worth to do such improvement when we use a simple
 allocation because of -O0 usage or because the function is too
 big.  */
  if (ira_conflicts_p)
find_moveable_pseudos ();
hunk.  Vlad, what do you think?  There is still ira_flattening that tweaks the
RTL in between, dunno if it could create trivially dead insns or not.  Moving
d_t_d_i call before f_m_p call certainly fixes both of the testcases too,
haven't bootstrapped/regtested either of the patches yet.


[Bug libstdc++/54112] including complex.h and complex fails in C++03

2012-08-13 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54112

--- Comment #4 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 11:55:04 
UTC ---
Author: glisse
Date: Mon Aug 13 11:55:00 2012
New Revision: 190340

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190340
Log:
2012-08-13  Marc Glisse  marc.gli...@inria.fr

PR libstdc++/54112
* include/c_compatibility/complex.h: Undefine complex, always
include system's complex.h if present.
* testsuite/26_numerics/complex/c99.cc: New testcase.
* testsuite/17_intro/headers/c++1998/complex.cc: Likewise.
* doc/xml/manual/numerics.xml: Document it.

Added:
trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc   (with
props)
trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc   (with props)
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/doc/xml/manual/numerics.xml
trunk/libstdc++-v3/include/c_compatibility/complex.h

Propchange: trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc
('svn:eol-style' added)

Propchange: trunk/libstdc++-v3/testsuite/17_intro/headers/c++1998/complex.cc
('svn:keywords' added)

Propchange: trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc
('svn:eol-style' added)

Propchange: trunk/libstdc++-v3/testsuite/26_numerics/complex/c99.cc
('svn:keywords' added)


[Bug tree-optimization/54200] copyrename generates wrong debuginfo

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution||FIXED

--- Comment #8 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
11:55:26 UTC ---
Fixed as far as I am concerned.


[Bug libstdc++/54112] including complex.h and complex fails in C++03

2012-08-13 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54112

Marc Glisse glisse at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||FIXED

--- Comment #5 from Marc Glisse glisse at gcc dot gnu.org 2012-08-13 11:58:29 
UTC ---
Fixed.


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #12 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
11:58:33 UTC ---
(In reply to comment #9)
 (In reply to comment #8)
  If you do something like
  
   gcc -c t1.c -mavx -flto
   gcc -c t2.c -msse2 -flto
   gcc t1.o t2.o -flto
  
  then the link step will use -mavx -msse2, that is, target options are
  concatenated.
 
 Indeed.
 
 What I'm asking for is that each source file be compiled with its own target
 options. I realise this is a request for enhancement, though.

Yes, there are similar option-related bugs for this.  Note somebody needs
to sit down and document the desired semantics of combining translation
units T1 and T2, compiled with different options OP1 and OP2, at link-time with
options OP3.  Desired semantics including which cross-file optimizations
(inlining?) are possible.


[Bug lto/54231] LTO generates code for the wrong CPU if different options used

2012-08-13 Thread thiago at kde dot org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54231

--- Comment #13 from Thiago Macieira thiago at kde dot org 2012-08-13 
12:13:40 UTC ---
(In reply to comment #12)
 Yes, there are similar option-related bugs for this.  Note somebody needs
 to sit down and document the desired semantics of combining translation
 units T1 and T2, compiled with different options OP1 and OP2, at link-time 
 with
 options OP3.  Desired semantics including which cross-file optimizations
 (inlining?) are possible.

From my (admittedly restricted) point of view, inlining should be possible,
provided the following conditions:
 - when inlining a function with a lower optimisation / target setting, apply
the outer scope's setting to the inlined code
 - when inlining a function with a higher target requirement, inlining should
be done only in the sense of partial function splitting, prologue, epilogues,
constant propagation, etc.

In the case that I pasted, for example, I'd like GCC to realise that it has
already tested if the counter variable is 0, then forego that test in the
inlined, inner function.

Worst case scenario, simply forego inlining completely. Then the code would
simply be no worse than the non-LTO case.
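
The first two conditions boil down to a subset test on the enabled ISA features. A rough sketch, purely hypothetical: the feature bits and the function name are invented for illustration and do not reflect how GCC represents target options internally.

```c
#include <stdbool.h>

/* Hypothetical feature bits, not GCC's internal representation. */
enum { FEAT_SSE2 = 1 << 0, FEAT_AVX = 1 << 1 };

/* Full inlining is safe only if every ISA feature the callee was compiled
   for is also enabled in the caller; otherwise the inlined body could use
   instructions the caller's target does not guarantee. */
bool can_inline_fully(unsigned caller_feats, unsigned callee_feats)
{
    return (callee_feats & ~caller_feats) == 0;
}
```

When the test fails, the options left are the partial forms listed above (function splitting, constant propagation) or no inlining at all.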


[Bug tree-optimization/54200] copyrename generates wrong debuginfo

2012-08-13 Thread izamyatin at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200

Igor Zamyatin izamyatin at gmail dot com changed:

   What|Removed |Added

 CC||izamyatin at gmail dot com

--- Comment #9 from Igor Zamyatin izamyatin at gmail dot com 2012-08-13 
12:13:54 UTC ---
I see the following in the report for x86:

FAIL: gcc.dg/guality/pr54200.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 20 z == 3


[Bug tree-optimization/54240] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

 Bug #: 54240
   Summary: Routine hoist_adjacent_loads does not work properly
after r189366
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ysrum...@gmail.com


This regression can be seen in the attached simple test-case - cmov conversion
does not happen. The fix is evident:

--- tree-ssa-phiopt.c   (revision 190151)
+++ tree-ssa-phiopt.c   (working copy)
@@ -1864,7 +1864,7 @@

   /* Check the mode of the arguments to be sure a conditional move
 can be generated for it.  */
-  if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1
+  if (optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1)) ==
CODE_FOR_nothing))
continue;

   /* Both statements must be assignments whose RHS is a COMPONENT_REF.  */

You can see this regression on any platform supporting conditional moves (I
tested it on x86).


[Bug tree-optimization/54241] New: Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread ysrumyan at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241

 Bug #: 54241
   Summary: Routine hoist_adjacent_loads does not work properly
after r189366
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: ysrum...@gmail.com


This regression can be seen in the attached simple test-case - cmov conversion
does not happen. The fix is evident:

--- tree-ssa-phiopt.c   (revision 190151)
+++ tree-ssa-phiopt.c   (working copy)
@@ -1864,7 +1864,7 @@

   /* Check the mode of the arguments to be sure a conditional move
 can be generated for it.  */
-  if (!optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1
+  if (optab_handler (cmov_optab, TYPE_MODE (TREE_TYPE (arg1)) ==
CODE_FOR_nothing))
continue;

   /* Both statements must be assignments whose RHS is a COMPONENT_REF.  */

You can see this regression on any platform supporting conditional moves (I
tested it on x86).


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
12:31:52 UTC ---
Both in 4.7 (which is before the prfchw changes) and 4.8 with -m32 -m3dnow and
-m32 -m3dnow -mno-sse I get prefetch + prefetchw insn, which looks ok to me.
-mno-mmx I think disables 3dnow too, so you get no prefetch insns in that case
(which is also fine).  -mprfchw implies the SSE prefetches and PRFCHW CPUID
0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is
correct that with -m32 -mprfchw prefetchnta + prefetchw is generated.
So, where exactly do you see a bug?
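
For reference, a small sketch of how the read and write hints are requested from C. The second argument of __builtin_prefetch selects read (0) or write (1) and the third is the temporal locality (0 to 3). The function names and the look-ahead distance of 16 elements are arbitrary choices for illustration; which instruction is emitted for each hint depends on the -m options discussed here.

```c
#include <stddef.h>

/* Sum an array while hinting that elements ahead will be read. */
long sum_with_prefetch(const int *a, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 0, 0);  /* rw = 0: read hint */
        total += a[i];
    }
    return total;
}

/* Fill an array while hinting that elements ahead will be written. */
void fill_with_prefetch(int *a, size_t n, int value)
{
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16], 1, 0);  /* rw = 1: write hint */
        a[i] = value;
    }
}
```

Compiling this with -S and the option sets from the report shows which prefetch variant each hint maps to.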


[Bug tree-optimization/54200] copyrename generates wrong debuginfo

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200

--- Comment #10 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
12:35:32 UTC ---
(In reply to comment #9)
 I see following in report for x86:
 
 FAIL: gcc.dg/guality/pr54200.c  -O2 -flto -fuse-linker-plugin
 -fno-fat-lto-objects  line 20 z == 3

That's what I said in the commit mail.


[Bug tree-optimization/54241] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution||DUPLICATE

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
12:39:10 UTC ---
.

*** This bug has been marked as a duplicate of bug 54240 ***


[Bug tree-optimization/54241] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread michael.v.zolotukhin at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54241

--- Comment #2 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2012-08-13 12:39:23 UTC ---
Created attachment 28004
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28004
test-case confirming the issue


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
12:39:10 UTC ---
*** Bug 54241 has been marked as a duplicate of this bug. ***


[Bug c/53968] integer undefined behaviors in GCC

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53968

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
12:40:04 UTC ---
Author: jakub
Date: Mon Aug 13 12:39:54 2012
New Revision: 190342

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190342
Log:
PR c/53968
* tree.c (integer_pow2p): Avoid undefined signed overflows.
* simplify-rtx.c (neg_const_int): Likewise.
* expr.c (fixup_args_size_notes): Likewise.
* stor-layout.c (set_min_and_max_values_for_integral_type): Likewise.
* double-int.c (mul_double_wide_with_sign): Likewise.
(double_int_mask): Likewise.
* tree-ssa-loop-ivopts.c (get_address_cost): Likewise.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/double-int.c
trunk/gcc/expr.c
trunk/gcc/simplify-rtx.c
trunk/gcc/stor-layout.c
trunk/gcc/tree-ssa-loop-ivopts.c
trunk/gcc/tree.c
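
As a general illustration of this kind of fix (not the code from this commit; both function names are made up), signed overflow can often be avoided by doing the arithmetic in the corresponding unsigned type, where wraparound is well defined:

```c
#include <stdbool.h>
#include <limits.h>

/* Power-of-two test on the bit pattern of VALUE.  Written on a signed
   type, "value - 1" overflows for LLONG_MIN; in unsigned arithmetic the
   subtraction simply wraps. */
bool is_pow2(long long value)
{
    unsigned long long x = (unsigned long long)value;
    return x != 0 && (x & (x - 1)) == 0;
}

/* Absolute value without evaluating "-value", which is undefined for
   LLONG_MIN.  The result is returned as unsigned so that it fits. */
unsigned long long magnitude(long long value)
{
    unsigned long long x = (unsigned long long)value;
    return value < 0 ? 0ULL - x : x;
}
```

Note that is_pow2 treats its argument as a bit pattern, so LLONG_MIN (only the top bit set) counts as a power of two.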


[Bug c/53968] integer undefined behaviors in GCC

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53968

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-08-13
 CC||hubicka at gcc dot gnu.org,
   ||jakub at gcc dot gnu.org
 Ever Confirmed|0   |1

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
12:41:06 UTC ---
Haven't reproduced the diagnostic.c failure, and leaving the ipa hunk to Honza.


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 CC||wschmidt at gcc dot gnu.org

--- Comment #2 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
12:41:10 UTC ---
Confirmed.  William?  Why don't we see any failed testcases?


[Bug middle-end/54201] XMM constant duplicated

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54201

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|NEW
 AssignedTo|rguenth at gcc dot gnu.org  |unassigned at gcc dot
   ||gnu.org

--- Comment #6 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
12:42:47 UTC ---
Not working on it.


[Bug tree-optimization/54200] copyrename generates wrong debuginfo

2012-08-13 Thread izamyatin at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54200

--- Comment #11 from Igor Zamyatin izamyatin at gmail dot com 2012-08-13 
12:46:48 UTC ---
Right! Sorry for the noise...


[Bug middle-end/54242] New: [4.8 Regression] Testsuite failures

2012-08-13 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54242

 Bug #: 54242
   Summary: [4.8 Regression] Testsuite failures
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: hjl.to...@gmail.com
CC: rgue...@gcc.gnu.org


On Linux/x86-64, revision 190339:

http://gcc.gnu.org/ml/gcc-cvs/2012-08/msg00316.html

caused:

FAIL: gcc.dg/guality/pr54200.c  -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects  line 20 z == 3
FAIL: gcc.dg/guality/pr54200.c  -Os  line 20 z == 3
FAIL: gcc.target/i386/pad-10.c scan-assembler-not nop


[Bug driver/54210] gcc unable to detect -mprfchw flag in bulldozer machines

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54210

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
13:21:52 UTC ---
Author: jakub
Date: Mon Aug 13 13:21:41 2012
New Revision: 190345

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190345
Log:
PR driver/54210
* config/i386/driver-i386.c (host_detect_local_cpu): Test bit_PRFCHW
bit of CPUID 0x80000001 %ecx instead of CPUID 7 %ecx.
* config/i386/cpuid.h (bits_PRFCHW): Move definition to CPUID
0x80000001 %ecx flags.

Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/i386/cpuid.h
trunk/gcc/config/i386/driver-i386.c


[Bug middle-end/54242] [4.8 Regression] Testsuite failures

2012-08-13 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54242

Richard Guenther rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2012-08-13
   Target Milestone|--- |4.8.0
 Ever Confirmed|0   |1

--- Comment #1 from Richard Guenther rguenth at gcc dot gnu.org 2012-08-13 
13:27:45 UTC ---
It caused only

FAIL: gcc.target/i386/pad-10.c scan-assembler-not nop

as said in the commit mail.  The other FAILs are preferred over a dozen
new XPASSes.  pad-10.c is testing something that didn't really work before.


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

--- Comment #2 from Venkataramanan venkataramanan.kumar at amd dot com 
2012-08-13 13:51:08 UTC ---
(In reply to comment #1)
> Both in 4.7 (which is before the prfchw changes) and 4.8 with -m32 -m3dnow and
> -m32 -m3dnow -mno-sse I get prefetch + prefetchw insn, which looks ok to me.
> -mno-mmx I think disables 3dnow too, so you get no prefetch insns in that case
> (which is also fine).  -mprfchw implies the SSE prefetches and PRFCHW CPUID
> 0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is
> correct that with -m32 -mprfchw prefetchnta + prefetchw is generated.
> So, where exactly do you see a bug?

Hi Jakub,

> -mprfchw implies the SSE prefetches and PRFCHW CPUID
> 0x80000001 ecx bit 8 doesn't imply the prefetch insn, just prefetchw, so it is
> correct that with -m32 -mprfchw prefetchnta + prefetchw is generated.
> So, where exactly do you see a bug?

As per the AMD CPUID manual, 0x80000001 ecx bit 8 implies both prefetch and
prefetchw.

http://blogs.amd.com/developer/2010/08/18/3dnow-deprecated/


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
13:58:40 UTC ---
But the Intel manual AFAIK doesn't talk about prefetch insn.
So, the -mprfchw switch needs to control solely the prefetchw instruction,
and there might be a different one that controls the prefetch insn.  In
driver-i386.c you could enable -mprfchw vs. "-mprfch -mprfchw" based on whether
the CPU is Intel or AMD or something, but if there are CPUs that don't have
both insns, it needs to be enabled independently.  Areg?


[Bug target/54232] For x86 PIC code, ebx should be spillable

2012-08-13 Thread bugdal at aerifal dot cx
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54232

--- Comment #3 from Rich Felker bugdal at aerifal dot cx 2012-08-13 13:59:17 
UTC ---
> I think the GOT is introduced too late to do any fancy analysis
> on whether we need it or not.

This may be true, but if so, it's a highly suboptimal design that's hurting
performance badly. 30% on the cryptographic code I looked at, and from working
on FFmpeg in the past, I remember quite a few cases where PIC was hurting
performance by significant measurable amounts like that too. If there's any way
the changes I describe could be targeted even just in the long term, I think it
would make a big difference for a lot of software.

> I also think that for outgoing function calls the ABI
> relies on a properly setup GOT, even for those that bind
> locally and thus do not go through the PLT.

The extern function call ABI on x86 does not allow the caller to depend on EBX
containing the GOT address. This is because the callee has no way of knowing
whether it was called by the same DSO it resides in. If not, the GOT address
will be invalid for it.

For static functions whose addresses never leak out of the translation unit
they're defined in, the calling convention is up to GCC. Ideally it would
assume the GOT register is already loaded in such functions (as long as all the
callees use the GOT), but in reality it rarely does. This is a separate code
generation QoI issue that should perhaps be addressed as its own bug.


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
14:00:55 UTC ---
BTW, why do you care about the prefetch insn?  Isn't it obsoleted by the SSE
ISA prefetches anyway (unlike prefetchw)?


[Bug libstdc++/54185] condition_variable not properly destructed

2012-08-13 Thread d.adler.s at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #8 from David Adler d.adler.s at gmail dot com 2012-08-13 
14:09:16 UTC ---
Created attachment 28005
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28005
proposed changelog

I wasn't sure about the testcase file name, so I just guessed.


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #3 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 
14:14:59 UTC ---
Odd, I don't know.  I'll have to go back and look at the tests when I get a
moment and investigate that.  Peculiar.


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread michael.v.zolotukhin at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #4 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2012-08-13 14:15:08 UTC ---
Created attachment 28006
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28006
test-case confirming the issue


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #5 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 
14:24:48 UTC ---
Well, I'm embarrassed.  The tests I wrote for this functionality never got into
the test suite -- I apparently forgot to submit them with the patch -- and I
can't find them anymore.  I'll write some new ones soon.  Apologies for the
oversight. :(


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread venkataramanan.kumar at amd dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

--- Comment #5 from Venkataramanan venkataramanan.kumar at amd dot com 
2012-08-13 14:33:14 UTC ---
(In reply to comment #4)
> BTW, why do you care about the prefetch insn?  Isn't it obsoleted by the SSE
> ISA prefetches anyway (unlike prefetchw)?


Hi Jakub, as far as I know, on fam15h processors they are exactly the same.
Yes, I can use -mprfchw to generate the prefetchw instruction and use the
prefetcht* instructions instead of the prefetch instruction.

But the SWOG (software optimization guide) for amdfam15 mentions that their
functionality could change in the future.

(Snip)
AMD Family 15h processors implement the PREFETCHT0, PREFETCHT1, and PREFETCHT2
instructions in exactly the same way as the PREFETCH instruction. That is, the
data is brought into the L1 data cache. This functionality could change in
future implementations of the AMD Family 15h
processor
(Snip)


[Bug libstdc++/54185] condition_variable not properly destructed

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #9 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 
14:35:21 UTC ---
Perfect - thanks. I'll get it committed tonight.


[Bug fortran/54243] New: f951: internal compiler error: Segmentation fault (trying to compile errorneous code)

2012-08-13 Thread slayoo at staszic dot waw.pl
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

 Bug #: 54243
   Summary: f951: internal compiler error: Segmentation fault
(trying to compile errorneous code)
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: sla...@staszic.waw.pl


With Debian's gcc-snapshot gfortran (4.8.0 20120714), trying to compile the
code below:



module aqq_m
  type :: aqq_t
contains
procedure :: aqq_init
  end type 
  contains
  subroutine aqq_init(this)
class(aqq_t) :: this
  end subroutine
end module
program bug2
  use aqq_m
  class(aqq_t) :: aqq
  call aqq%aqq_init
end program



I get:



$ /usr/lib/gcc-snapshot/bin/gfortran -std=f2008 -ffree-form  bug2.f
bug2.f:24.21:

  class(aqq_t) :: aqq
 1   
Error: CLASS variable 'aqq' at (1) must be dummy, allocatable or pointer
f951: internal compiler error: Segmentation fault
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-snapshot/README.Bugs for instructions.



HTH,
Sylwester


[Bug fortran/54244] New: f951: internal compiler error: in gfc_add_component_ref, at fortran/class.c:210

2012-08-13 Thread slayoo at staszic dot waw.pl
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54244

 Bug #: 54244
   Summary: f951: internal compiler error: in
gfc_add_component_ref, at fortran/class.c:210
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: sla...@staszic.waw.pl


With Debian's gcc-snapshot gfortran (4.8.0 20120714), trying to compile the
code below:



module aqq_m
  type :: arr_t
  end type
  type :: aqq_t
class(arr_t), allocatable :: psi(:)
contains
procedure :: aqq_init
  end type 
  contains
  subroutine aqq_init(this)
class(aqq_t) :: this
  end subroutine
end module
program bug1
  use aqq_m
  class(aqq_t) :: aqq
  call aqq%aqq_init
end program



I get:



$ /usr/lib/gcc-snapshot/bin/gfortran -std=f2008 -ffree-form  bug1.f 
bug1.f:32.21:

  class(aqq_t) :: aqq
 1   
Error: CLASS variable 'aqq' at (1) must be dummy, allocatable or pointer
bug1.f:33.10:

  call aqq%aqq_init
  1
Error: Type mismatch in argument 'this' at (1); passed
CLASS(__class_aqq_m_Arr_t_1_0a) to CLASS(aqq_t)
f951: internal compiler error: in gfc_add_component_ref, at fortran/class.c:210
Please submit a full bug report,
with preprocessed source if appropriate.
See file:///usr/share/doc/gcc-snapshot/README.Bugs for instructions.



HTH,
Sylwester


[Bug c++/53836] [4.7/4.8 Regression] ICE: unexpected expression of kind template_parm_index

2012-08-13 Thread paolo.carlini at oracle dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53836

Paolo Carlini paolo.carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2012-08-13
                 CC|                            |hjl.tools at gmail dot com
            Summary|ICE: unexpected expression  |[4.7/4.8 Regression] ICE:
                   |of kind template_parm_index |unexpected expression of
                   |                            |kind template_parm_index
     Ever Confirmed|0                           |1

--- Comment #3 from Paolo Carlini paolo.carlini at oracle dot com 2012-08-13 
15:26:57 UTC ---
Mainline ICEs for me (190348) and indeed looks like a regression.

HJ, can you help figuring out when we regressed?


[Bug fortran/54243] [OOP] ICE (segfault) in gfc_type_compatible for invalid BT_CLASS

2012-08-13 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |error-recovery,
                   |                            |ice-on-invalid-code
   Last reconfirmed|                            |2012-08-13
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |janus at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|f951: internal compiler     |[OOP] ICE (segfault) in
                   |error: Segmentation fault   |gfc_type_compatible for
                   |(trying to compile          |invalid BT_CLASS
                   |errorneous code)            |

--- Comment #1 from Tobias Burnus burnus at gcc dot gnu.org 2012-08-13 
15:35:11 UTC ---
Segfaults in

4837    gfc_type_compatible (gfc_typespec *ts1, gfc_typespec *ts2)
4838    {
4839      bool is_class1 = (ts1->type == BT_CLASS);
4840      bool is_class2 = (ts2->type == BT_CLASS);
...
4853      else if (is_class1 && is_class2)
4854        return gfc_type_is_extension_of (ts1->u.derived->components->ts.u.derived,
4855                                         ts2->u.derived->components->ts.u.derived);

The problem is that ts2->u.derived->components == NULL.


[Bug fortran/54244] [OOP] ICE in gfc_add_component_ref, at fortran/class.c:210

2012-08-13 Thread burnus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54244

Tobias Burnus burnus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
           Keywords|                            |error-recovery,
                   |                            |ice-on-invalid-code
   Last reconfirmed|                            |2012-08-13
                 CC|                            |burnus at gcc dot gnu.org,
                   |                            |janus at gcc dot gnu.org
     Ever Confirmed|0                           |1
            Summary|f951: internal compiler     |[OOP] ICE in
                   |error: in                   |gfc_add_component_ref, at
                   |gfc_add_component_ref, at   |fortran/class.c:210
                   |fortran/class.c:210         |

--- Comment #1 from Tobias Burnus burnus at gcc dot gnu.org 2012-08-13 
15:35:25 UTC ---
Fails in gfc_add_component_ref at
213       gcc_assert((*tail)->u.c.component);

Here, (*tail)->u.c.component == NULL and tail->u.c.sym->name == "aqq_t".

Called via resolve_typebound_subroutine.


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

William J. Schmidt wschmidt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-08-13
         AssignedTo|unassigned at gcc dot       |wschmidt at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #6 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 
15:46:31 UTC ---
Mine.


[Bug middle-end/53823] [4.8 Regression] FAIL: gcc.c-torture/execute/930921-1.c execution at -O0 and -O1

2012-08-13 Thread rth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53823

--- Comment #23 from Richard Henderson rth at gcc dot gnu.org 2012-08-13 
15:51:37 UTC ---
On 08/12/2012 07:30 AM, danglin at gcc dot gnu.org wrote:
> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53823
>
> --- Comment #22 from John David Anglin danglin at gcc dot gnu.org
> 2012-08-12 14:30:12 UTC ---
> Created attachment 27994
>   -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=27994

Ok.


r~


[Bug tree-optimization/54245] New: [4.8 regression] incorrect optimisation

2012-08-13 Thread mans at mansr dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

 Bug #: 54245
   Summary: [4.8 regression] incorrect optimisation
Classification: Unclassified
   Product: gcc
   Version: 4.8.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: m...@mansr.com


Created attachment 28007
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=28007
Test case

Since r190220 the attached test is compiled incorrectly at -O1 and higher.


[Bug target/54246] New: Bytemark FOURIER 54% slower in X32 chroot

2012-08-13 Thread wbrana at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54246

 Bug #: 54246
   Summary: Bytemark FOURIER 54% slower in X32 chroot
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: wbr...@gmail.com


http://www.tux.org/~mayer/linux/nbench-byte-2.2.3.tar.gz

compiled on 64-bit system with glibc 2.14.1 and run in X32 chroot
FOURIER :   36275  :  41.26  :  23.17
compiled in X32 chroot with glibc 2.16 and run in X32 chroot
FOURIER :   16574  :  18.85  :  10.59
both were compiled with same CFLAGS
-static -m64 -ggdb -Wall -O3 -funroll-loops -g0 -march=core2 -mfpmath=sse
-fomit-frame-pointer -ffast-math -mssse3 -fno-PIE -fno-exceptions
-fno-stack-protector
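
The "54% slower" figure in the summary can be checked from the two result
lines. The sketch below assumes the first numeric column is the raw FOURIER
score with higher meaning faster; that reading is an assumption, not something
the report states.

```python
baseline = 36275   # FOURIER, built on the 64-bit system with glibc 2.14.1
x32_build = 16574  # FOURIER, built inside the X32 chroot with glibc 2.16

# Relative drop of the X32-built binary against the 64-bit-built one.
slowdown = (baseline - x32_build) / baseline
print(f"{slowdown:.1%}")
```

Under that assumption the drop comes out at roughly 54%, matching the bug
title.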


[Bug tree-optimization/54245] [4.8 regression] incorrect optimisation

2012-08-13 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jakub at gcc dot gnu.org,
                   |                            |wschmidt at gcc dot gnu.org
   Target Milestone|---                         |4.8.0

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org 2012-08-13 
17:19:40 UTC ---
Confirmed.  slsr replaces:
  D.2219_3 = *row_2(D);
  D.2220_4 = (int) D.2219_3;
  a1_5 = D.2220_4 * 22725;
  D.2222_6 = MEM[(short int *)row_2(D) + 4B];
  D.2223_7 = (int) D.2222_6;
  D.2224_8 = D.2223_7 * 21407;
  a0_9 = D.2224_8 + a1_5;
  D.2225_10 = D.2223_7 * 8867;
- a1_11 = a1_5 + D.2225_10;
+ slsr.4_25 = D.2222_6 * 12540;
+ slsr.5_26 = (int) slsr.4_25;
+ a1_11 = a0_9 - slsr.5_26;

The multiplication is newly performed in short int, supposedly that is the
problem here.  Anyway, while the number of multiplications in the end is the
same, with slsr the code sequence is also 3 insns/4 bytes longer on x86_64.
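
To make the suspected problem concrete, here is a small Python model of the
two sequences. It assumes the short int multiplication wraps to 16 bits,
which is only one possible outcome of the overflow; the helper names are
invented for the example. The rewrite relies on 21407 - 12540 == 8867, which
holds only as long as the narrowed product does not overflow.

```python
def to_int16(value: int) -> int:
    """Wrap an integer to signed 16 bits (models a short int result)."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def a1_original(row0: int, row2: int) -> int:
    a1_5 = row0 * 22725
    return a1_5 + row2 * 8867          # multiplication done in int


def a1_slsr(row0: int, row2: int) -> int:
    a1_5 = row0 * 22725
    a0_9 = row2 * 21407 + a1_5
    slsr_4 = to_int16(row2 * 12540)    # multiplication narrowed to short
    return a0_9 - slsr_4


for row0, row2 in [(1, 2), (1, 100)]:
    print(row0, row2, a1_original(row0, row2), a1_slsr(row0, row2))
```

For row2 == 2 the product 25080 still fits in 16 bits and both versions agree;
for row2 == 100 the product 1254000 does not fit and the results diverge.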


[Bug target/54239] Not able to generate prefetch (prefetch read) instruction using -m3dnow or -mprfchw

2012-08-13 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54239

Uros Bizjak ubizjak at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #6 from Uros Bizjak ubizjak at gmail dot com 2012-08-13 17:32:21 
UTC ---
(In reply to comment #5)
>> BTW, why do you care about the prefetch insn?  Isn't it obsoleted by the SSE
>> ISA prefetches anyway (unlike prefetchw)?
>
> Hi Jakub, as far as fam15H processors what I know is they are exactly same.
> Yes I can use -mprfchw and generate prefetchw instruction and use prefetchts
> instead of prefetch instruction.

The reason is described in the comment in i386.md:

  /* Use 3dNOW prefetch in case we are asking for write prefetch not
 supported by SSE counterpart or the SSE prefetch is not available
 (K6 machines).  Otherwise use SSE prefetch as it allows specifying
 of locality.  */

We are generating SSE prefetches, since they allow specification of locality.

> But there is a mention in SWOG guide of amdfam15 that their functionalities
> could change in future.
>
> (Snip)
> AMD Family 15h processors implement the PREFETCHT0, PREFETCHT1, and PREFETCHT2
> instructions in exactly the same way as the PREFETCH instruction. That is, the
> data is brought into the L1 data cache. This functionality could change in
> future implementations of the AMD Family 15h
> processor
> (Snip)

I see no problem here. For current implementations, SSE prefetches are treated
in the same way as the 3dNOW prefetch. I read the quoted part as "... in the
future, F15h SSE prefetches will implement the functionality described by
the insn mnemonic (locality)", not that they will overload the mnemonic with
some other, different functionality.

Any other functionality would need a different mnemonic, probably
backed by a cpuid flag.

So, INVALID.


[Bug c++/54197] [4.7/4.8 regression] Lifetime of reference not properly extended

2012-08-13 Thread aaw at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54197

Ollie Wild aaw at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |aaw at gcc dot gnu.org
         AssignedTo|unassigned at gcc dot       |aaw at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #2 from Ollie Wild aaw at gcc dot gnu.org 2012-08-13 18:04:21 UTC 
---
The issue is that these cause a COMPOUND_EXPR to be passed to
extend_ref_init_temps_1.

I have a patch which replaces the second operand of the COMPOUND_EXPR with
another call to extend_ref_init_temps_1.  Testing now.  Will send out for
review shortly.


[Bug c++/54197] [4.7/4.8 regression] Lifetime of reference not properly extended

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54197

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED


[Bug fortran/54243] [OOP] ICE (segfault) in gfc_type_compatible for invalid BT_CLASS

2012-08-13 Thread janus at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54243

janus at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |janus at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #2 from janus at gcc dot gnu.org 2012-08-13 19:21:15 UTC ---
I think the proper fix for both this one and PR 54244 would be the following:

Index: gcc/fortran/resolve.c
===================================================================
--- gcc/fortran/resolve.c    (revision 190186)
+++ gcc/fortran/resolve.c    (working copy)
@@ -5793,6 +5795,9 @@ check_typebound_baseobject (gfc_expr* e)

   gcc_assert (base->ts.type == BT_DERIVED || base->ts.type == BT_CLASS);

+  if (base->ts.type == BT_CLASS && !gfc_expr_attr (base).class_ok)
+    return FAILURE;
+
   /* F08:C611.  */
   if (base->ts.type == BT_DERIVED && base->ts.u.derived->attr.abstract)
     {


This aborts the resolution of the type-bound call rather early (if the passed
object was not properly declared), avoiding all problems that one could
possibly run into later. It is also general enough that it should work for
other similar cases.
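
The idea of bailing out before anything dereferences the broken CLASS object
can be modelled in a few lines. This is a loose Python sketch, not gfortran
code: BaseObject and resolve_typebound_call are hypothetical, and class_ok is
assumed here to mean "the class container has its components", which
simplifies the real attribute.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BaseObject:
    """Hypothetical stand-in for the base expression of a type-bound call."""
    is_class: bool                      # models ts.type == BT_CLASS
    components: Optional[List[str]]     # None models the missing components

    @property
    def class_ok(self) -> bool:
        # Assumption: a CLASS object is usable only once its components exist.
        return self.components is not None


def check_typebound_baseobject(base: BaseObject) -> bool:
    """Return False (FAILURE) early for an ill-formed CLASS base object."""
    if base.is_class and not base.class_ok:
        return False
    return True


def resolve_typebound_call(base: BaseObject) -> str:
    if not check_typebound_baseobject(base):
        return "stopped: error already reported"
    # Past the guard, a CLASS object is known to have components.
    count = len(base.components) if base.is_class else 0
    return f"resolved ({count} class component(s))"


print(resolve_typebound_call(BaseObject(is_class=True, components=None)))
print(resolve_typebound_call(BaseObject(is_class=True, components=["_data", "_vptr"])))
```

With the guard in place, the later code never sees the NULL components that
triggered the segfault in PR 54243 and the assertion in PR 54244.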


[Bug tree-optimization/54245] [4.8 regression] incorrect optimisation

2012-08-13 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54245

William J. Schmidt wschmidt at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2012-08-13
         AssignedTo|unassigned at gcc dot       |wschmidt at gcc dot gnu.org
                   |gnu.org                     |
     Ever Confirmed|0                           |1

--- Comment #2 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 
19:29:06 UTC ---
I'll take a look.  Might be a day or two as my queue is kind of full.


[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #10 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 
19:56:55 UTC ---
Author: redi
Date: Mon Aug 13 19:56:50 2012
New Revision: 190356

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190356
Log:
2012-08-13  David Adler  d.adle...@gmail.com

PR libstdc++/54185
* src/c++11/condition_variable.cc (condition_variable): Always
destroy native type in destructor.
* testsuite/30_threads/condition_variable/54185.cc: New.

Added:
trunk/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/src/c++11/condition_variable.cc


[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

--- Comment #11 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 
19:57:36 UTC ---
Author: redi
Date: Mon Aug 13 19:57:31 2012
New Revision: 190357

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=190357
Log:
2012-08-13  David Adler  d.adle...@gmail.com

PR libstdc++/54185
* src/c++11/condition_variable.cc (condition_variable): Always
destroy native type in destructor.
* testsuite/30_threads/condition_variable/54185.cc: New.

Added:
   
branches/gcc-4_7-branch/libstdc++-v3/testsuite/30_threads/condition_variable/54185.cc
Modified:
branches/gcc-4_7-branch/libstdc++-v3/ChangeLog
branches/gcc-4_7-branch/libstdc++-v3/src/c++11/condition_variable.cc


[Bug fortran/54247] New: OpenMP code fails at execution in AMD Interlagos

2012-08-13 Thread longb at cray dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54247

 Bug #: 54247
   Summary: OpenMP code fails at execution in AMD Interlagos
Classification: Unclassified
   Product: gcc
   Version: 4.7.1
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: fortran
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: lo...@cray.com


> cat test.f90
!  derived from OpenMP test omp31f/F31_A_16_1.F90
!based on Example A.16.1f, p. 213 lines 1-19 in OpenMP API Ver 3.1.
program F31_A_16_1
   use omp_lib
   implicit none
   integer, parameter :: ITERATIONS = 2**17 ! Adjustable parameter
   integer(kind=omp_lock_kind) :: lock
   integer :: count_something_useful = 0, count_something_critical = 0

   call omp_set_num_threads(16)
   call omp_set_dynamic(.false.)
   call omp_init_lock(lock)

!$omp parallel
!$omp single
   call foo(lock, ITERATIONS)
!$omp end single
!$omp end parallel

   if(count_something_useful /= ITERATIONS .or. &
      count_something_critical /= ITERATIONS) then
      write (6, '(*(G0))') ' FAIL - ', &
         '(count_something_useful,count_something_critical) == (', &
         count_something_useful, ',', count_something_critical, &
         '), expected (', ITERATIONS, ',', ITERATIONS, ')'
   end if

contains
   ! from OpenMP 3.1 Example A.16.1f
   subroutine foo ( lock, n )
  use omp_lib
  integer (kind=omp_lock_kind) :: lock
  integer n
  integer i
  do i = 1, n
!$omp task
  call something_useful()
  do while ( .not. omp_test_lock(lock) ) 
!$omp taskyield
  end do
  call something_critical()
  call omp_unset_lock(lock)
!$omp end task
  end do
   end subroutine

   subroutine something_useful()
  !$omp atomic update
  count_something_useful = count_something_useful+1
   end subroutine something_useful

   subroutine something_critical
  ! isn't necessary to protect with atomic update, as invocations of this
  ! subroutine are protected by a lock
  count_something_critical = count_something_critical+1
   end subroutine something_critical

end program F31_A_16_1


> ftn -fopenmp test.f90
> ilrun -n1 -d16 ./a.out
 FAIL - (count_something_useful,count_something_critical) == (131072,131070),
expected (131072,131072)
Application 8535547 resources: utime ~6s, stime ~1s
> mcrun -n1 -d16 ./a.out
Application 8535554 resources: utime ~0s, stime ~1s

The code triggers a FAIL trap on Interlagos processors, but not on the previous
generation Magny-Cours AMD chips.

Command explanation:

ilrun -n1 -d16

-- Execute on a node with Interlagos processors, 1 node, 16 threads

mcrun -n1 -d16

-- Execute on a node with Magny-Cours processors, 1 node, 16 threads  [2
sockets in SMP node]

ftn

-- wrapper for Cray systems to get the right (we hope) set of libraries and
default options for the current compilation environment.  For the gcc
environment, the options implied here are COLLECT_GCC_OPTIONS='-u'
'pthread_mutex_trylock' '-fno-second-underscore' '-march=bdver1' '-static' '-v'
'-fopenmp'


[Bug libstdc++/54185] [4.7/4.8 Regression] condition_variable not properly destructed

2012-08-13 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54185

Jonathan Wakely redi at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.7.2

--- Comment #12 from Jonathan Wakely redi at gcc dot gnu.org 2012-08-13 
20:00:40 UTC ---
fixed for 4.7.2


[Bug fortran/54247] OpenMP code fails at execution in AMD Interlagos

2012-08-13 Thread longb at cray dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54247

Bill Long longb at cray dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |INVALID

--- Comment #1 from Bill Long longb at cray dot com 2012-08-13 20:38:33 UTC 
---
Our internal OpenMP gurus spotted that in line 36 the

!$omp task

should be 

!$omp task default(shared)


With that change, the code executes correctly on Interlagos nodes.  

Conclusion is that there is a bug in the OpenMP 3.1 examples, so still
potentially useful information.  But the initial complaint is not valid.
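
The comment does not say why default(shared) matters. One likely reading,
stated here as an assumption, is that without the clause the dummy argument
`lock` becomes firstprivate in the orphaned task, so every task tests its own
copy of the lock and nothing is excluded. The Python sketch below is a
deterministic toy model of that reading; ToyLock and both_tasks_enter are
invented names, and no real OpenMP runtime is involved.

```python
import copy


class ToyLock:
    """Deterministic stand-in for an OpenMP lock variable."""

    def __init__(self) -> None:
        self.held = False

    def test_lock(self) -> bool:
        # Like omp_test_lock: acquire and return True only if the lock is free.
        if self.held:
            return False
        self.held = True
        return True


def both_tasks_enter(shared: bool) -> bool:
    """Report whether two tasks can both hold "the" lock at the same time."""
    lock = ToyLock()
    # Each task sees either the same lock object or its own private copy.
    lock_a = lock if shared else copy.copy(lock)
    lock_b = lock if shared else copy.copy(lock)
    first = lock_a.test_lock()
    second = lock_b.test_lock()  # attempted before the first task unlocks
    return first and second


print(both_tasks_enter(shared=True))
print(both_tasks_enter(shared=False))
```

With a shared lock the second attempt fails and the critical update stays
serialized; with private copies both attempts succeed, which would be
consistent with the lost updates seen in count_something_critical.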


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread wschmidt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #7 from William J. Schmidt wschmidt at gcc dot gnu.org 2012-08-13 
20:39:59 UTC ---
Something else is broken, too, as the optab handlers for cmov on powerpc64
appear to have gone missing.  I'll get one of our back-end specialists to help
me understand that.


[Bug c++/53836] [4.7/4.8 Regression] ICE: unexpected expression of kind template_parm_index

2012-08-13 Thread hjl.tools at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53836

H.J. Lu hjl.tools at gmail dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jason at redhat dot com

--- Comment #4 from H.J. Lu hjl.tools at gmail dot com 2012-08-13 21:07:04 
UTC ---
It was fixed by revision 172942:

http://gcc.gnu.org/ml/gcc-cvs/2011-04/msg01138.html

on 4.6 branch.  However, the same patch was never applied
on trunk.


[Bug tree-optimization/54240] Routine hoist_adjacent_loads does not work properly after r189366

2012-08-13 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54240

--- Comment #8 from Andrew Pinski pinskia at gcc dot gnu.org 2012-08-13 
21:59:33 UTC ---
(In reply to comment #7)
> Something else is broken, too, as the optab handlers for cmov on powerpc64
> appear to have gone missing.  I'll get one of our back-end specialists to help
> me understand that.

They are only enabled for TARGET_ISELsel which is either TARGET_ISEL or 
TARGET_ISEL64 which is correct as ppc64 does not have isel by default.


[Bug target/54142] ppc64 build failure - Unrecognized opcode: `sldi' (and `srdi`)

2012-08-13 Thread PHHargrove at lbl dot gov
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54142

--- Comment #8 from Paul H. Hargrove PHHargrove at lbl dot gov 2012-08-13 
22:04:40 UTC ---
The following is a transcript of a test I just tried on one of my systems where
Gary and I have observed this bug.  The test appears to show that the gcc
provided by Fedora Core 6 does generate sldi instructions and the
system-provided assembler understands them.  So, whatever is causing the build
failures that Gary and I see, it is *not* simply a matter of an assembler not
supporting the instructions.

-Paul

{phargrov@fc6 ~}$ cat q.c
unsigned long long foo(void) { return 0x7FFF00000000LLU; }

{phargrov@fc6 ~}$ gcc -m64 -O -S q.c

{phargrov@fc6 ~}$ cat q.s
        .file   "q.c"
        .section        ".toc","aw"
        .section        ".text"
        .align 2
        .globl foo
        .section        ".opd","aw"
        .align 3
foo:
        .quad   .L.foo,.TOC.@tocbase
        .previous
        .type   foo, @function
.L.foo:
        lis 3,0x7fff
        sldi 3,3,16
        blr
        .long 0
        .byte 0,0,0,0,0,0,0,0
        .size   foo,.-.L.foo
        .ident  "GCC: (GNU) 4.1.2 20070626 (Red Hat 4.1.2-13)"
        .section        .note.GNU-stack,"",@progbits

{phargrov@fc6 ~}$ as -a64 -mppc64 q.s
[no errors]
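
For readers unfamiliar with the two instructions in the listing, the Python
sketch below models what `lis 3,0x7fff` followed by `sldi 3,3,16` leaves in
r3. The helper functions are simplified models written for this note, not an
emulator: lis loads the immediate shifted left by 16 and sign-extended, and
sldi shifts the 64-bit register left.

```python
MASK64 = (1 << 64) - 1


def lis(imm16: int) -> int:
    """Load immediate shifted: imm16 << 16, sign-extended to 64 bits."""
    value = (imm16 & 0xFFFF) << 16
    if value & 0x8000_0000:
        value |= 0xFFFF_FFFF_0000_0000
    return value


def sldi(reg: int, shift: int) -> int:
    """Shift left doubleword immediate, truncated to 64 bits."""
    return (reg << shift) & MASK64


r3 = sldi(lis(0x7FFF), 16)
print(hex(r3))
```

The pair builds a 48-bit constant in two instructions, which is why the
compiler needs the 64-bit shift form here.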

