date:20111228

RE: A case exposing code sink issue

2011-12-28 Thread Jiangning Liu

 -----Original Message-----
 From: gcc-ow...@gcc.gnu.org [mailto:gcc-ow...@gcc.gnu.org] On Behalf Of
 Jiangning Liu
 Sent: Tuesday, December 27, 2011 5:10 PM
 To: 'Richard Guenther'
 Cc: Michael Matz; gcc@gcc.gnu.org
 Subject: RE: A case exposing code sink issue

  The job to do this is final value replacement, not sinking (we do not
  sink non-invariant expressions - you'd have to translate them through
  the loop-closed SSA exit PHI node, certainly doable, patches
  welcome ;)).

 Richard,

 In final value replacement, expression a + D. can be figured out,
 while a[i_xxx] failed to be CHRECed, so I'm wondering if we should lower
 a[i_xxx] to a + unitsize(a) * i_xxx first? It seems GCC intends to keep
 a[i_xxx] until cfgexpand pass. Or we have to directly modify CHREC
 algorithm to get it calculated?

 Appreciate your kindly help in advance!

Richard,

Now I have a patch working for the case of step i++, by directly modifying
the scalar evolution algorithm. The following code would be generated after
SCCP:

  # i_13 = PHI <i_6(7), k_2(D)(4)>
  a_p.0_4 = &a[i_13];
  MEM[(int *)&a][i_13] = 100;
  i_6 = i_13 + 1;
  if (i_6 <= 999)
    goto <bb 7>;
  else
    goto <bb 6>;

<bb 6>:
  a_p_lsm.5_11 = &MEM[(void *)&a + 3996B];
  a_p = a_p_lsm.5_11;
  goto <bb 3>;
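
For readers following along, the dump above corresponds roughly to the loop sketched below. This is a reconstruction, not the original test case: the array size, the function names store_in_loop and store_after_loop, and the assumption that k is non-negative are all made up for illustration. It shows, at source level, what final value replacement achieves when the step is 1: the last address stored is always the address of a[999], so the store to a_p can move after the loop.

```c
int a[1000];
int *a_p;

/* Shape before the transformation: a_p is written on every iteration. */
void store_in_loop(int k)
{
    for (int i = k; i <= 999; i++) {
        a_p = &a[i];
        *a_p = 100;
    }
}

/* Shape after the transformation (assumes k >= 0).  With step 1 the
   last index is always 999, so the stored address is loop invariant. */
void store_after_loop(int k)
{
    if (k > 999)
        return;                 /* body never runs, a_p is left untouched */
    for (int i = k; i <= 999; i++)
        a[i] = 100;
    a_p = &a[999];              /* a + 3996 bytes when int is 4 bytes wide */
}
```

Both functions should leave a and a_p in the same state for any k in range, which is what makes the rewrite legal.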

It looks good, but I still have a problem when the step is i+=k.

For this case the value of variable i on loop exit isn't invariant, so the
algorithm below in scalar evolution doesn't work on it:

compute_overall_effect_of_inner_loop()
{
  ...
  tree nb_iter = number_of_latch_executions (inner_loop);

  if (nb_iter == chrec_dont_know)
    return chrec_dont_know;
  else
    {
      tree res;

      /* evolution_fn is the evolution function in LOOP.  Get
         its value in the nb_iter-th iteration.  */
      res = chrec_apply (inner_loop->num, evolution_fn, nb_iter);

      if (chrec_contains_symbols_defined_in_loop (res, loop->num))
        res = instantiate_parameters (loop, res);

      /* Continue the computation until ending on a parent of LOOP.  */
      return compute_overall_effect_of_inner_loop (loop, res);
    }
}
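
As a rough illustration of what the chrec_apply step computes, consider an affine evolution {base, +, step}: its value after n latch executions is base + step * n. The sketch below uses hypothetical names (affine_chrec, affine_apply, latch_executions). It is not GCC's data structure or API, and it ignores overflow and symbolic operands.

```c
/* Hypothetical stand-in for an affine evolution {base, +, step}. */
struct affine_chrec {
    long base;   /* value on entry to the loop */
    long step;   /* amount added on each trip around the loop */
};

/* Value of the evolution after n latch executions. */
long affine_apply(struct affine_chrec c, long n)
{
    return c.base + c.step * n;
}

/* Latch executions of
       i = k; do { ...; i += k; } while (i <= limit);
   assuming 0 < k <= limit.  The body runs limit / k times, and the
   latch is taken once less than that. */
long latch_executions(long k, long limit)
{
    return limit / k - 1;
}
```

With k = 5 and limit = 999 the index used in the last iteration comes out as 995. The difficulty raised in the message is that for a symbolic k this value is an expression in k rather than a constant.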

In theory, we can still have the transformation like below even if the step
is i+=k,

  # i_13 = PHI <i_6(7), k_2(D)(4)>
  i_14 = i_13;
  a_p.0_4 = &a[i_13];
  MEM[(int *)&a][i_13] = 100;
  i_6 = i_13 + k_2(D); // i+=k
  if (i_6 <= 999)
    goto <bb 7>;
  else
    goto <bb 6>;

<bb 6>:
  a_p_lsm.5_11 = &a[i_14];
  a_p = a_p_lsm.5_11;
  goto <bb 3>;

But I realize this is not loop-closed SSA form at all, because i_14 is
being used outside the loop. Where could we extend the live range of variable
i in the GCC infrastructure to finally solve this problem?
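
At source level, the transformation asked about here amounts to carrying a copy of the index out of the loop. A minimal sketch, assuming 0 < k <= 999 and using the hypothetical names store_stepped and last. How GCC would represent this in loop-closed SSA is exactly the open question, so the code only shows the intended semantics.

```c
int a[1000];
int *a_p;

void store_stepped(int k)
{
    int i = k;
    int last;                   /* plays the role of i_14 in the dump */

    do {
        last = i;               /* index used by this iteration */
        a[i] = 100;
        i += k;                 /* i+=k */
    } while (i <= 999);

    a_p = &a[last];             /* single store after the loop */
}
```

For inputs in the assumed range, last ends up equal to (999 / k) * k, which depends on k and therefore cannot be folded to a constant offset the way the i++ case can.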

 Thanks,
 -Jiangning

Re: FW: a nifty feature for c preprocessor

2011-12-28 Thread David Brown


On 28/12/2011 07:48, R A wrote:


i'm an amateur programmer that just started learning C. i like most
of the features, specially the c preprocessor that it comes packed
with. it's an extremely portable way of implementing metaprogramming
in C.

though i've always thought it lacked a single feature -- an
evaluation feature.




I think you have missed the point about the C pre-processor.  It is not 
a metaprogramming language - it is a simple text substitution macro 
processor.  It does not have any understanding of the symbols (except 
for #) in the code, nor does it support recursion - it's pure text 
substitution.  Your suggestion would therefore need a complete re-design 
of the C pre-processor.  And the result is not a feature that people 
would want.


Many uses of the C pre-processor are deprecated with modern use of C and 
C++.  Where possible, it is usually better programming practice to use a 
static const instead of a simple numeric #define, and a static 
inline function instead of a function-like macro.  With C++, even more 
pre-processor functionality can be replaced by language features - 
templates give you metaprogramming.  There are plenty of exceptions, of 
course, but in general it is better to use a feature that is part of the 
language itself (C or C++) rather than the preprocessor.
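
One concrete reason for that preference, as a small sketch: a function-like macro pastes its argument text wherever the parameter appears, so an argument with side effects may be evaluated more than once, while a static inline function evaluates it exactly once. The names below (next_value, SQUARE_MACRO, square_inline and the wrappers) are made up for illustration.

```c
static int calls;               /* counts how often next_value() runs */

static int next_value(void)
{
    return ++calls;
}

/* Function-like macro: the argument text appears twice in the expansion. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Language-level alternative: the argument is evaluated once. */
static inline int square_inline(int x)
{
    return x * x;
}

int via_macro(void)  { calls = 0; return SQUARE_MACRO(next_value()); }
int via_inline(void) { calls = 0; return square_inline(next_value()); }
int call_count(void) { return calls; }
```

Here via_macro() calls next_value() twice and multiplies 1 by 2, whereas via_inline() calls it once and squares 1.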


It looks like you are wanting to get the compiler to pre-calculate 
results rather than have them calculated at run-time.  That's a good 
idea - so the gcc developers have worked hard to make the compiler do 
that in many cases.  If your various expressions here boil down to 
constants that the compiler can see, and you have at least some 
optimisation enabled, then it will pre-calculate the results.
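
A minimal sketch of what that looks like in practice, with made-up values (6, 7, 4) and hypothetical names. The enumeration constant has to be an integer constant expression, so the compiler is required to evaluate it at compile time. The function version relies on the optimiser, so whether it is folded depends on the compiler and the flags used.

```c
static const int x = 6, y = 7, z = 4;

/* Enumeration constants must be integer constant expressions, so this
   arithmetic is done by the compiler: (14 * 6) / 7 = 12, 12 / 4 = 3. */
enum { X_VALUE = ((14 * 6) / 7) / 4 };

/* Compile-time check that works without C11: the array size is negative,
   and compilation fails, if the value is not 3. */
typedef char x_value_is_3[(X_VALUE == 3) ? 1 : -1];

static inline int scaled(int a, int b, int c)
{
    return ((14 * a) / b) / c;
}

/* With optimisation enabled a compiler can typically reduce this to a
   constant, since every operand is visible; that is not guaranteed. */
int precalculated(void)
{
    return scaled(x, y, z);
}
```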



If you have particular need of more complicated pre-processing, then 
what you want is generally some sort of code generator.  C has a simple 
enough syntax - write code in any language you want (C itself, or 
anything else) that outputs a C file.  I've done that a few times, such 
as for scripts to generate CRC tables.
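
As an illustration of the code-generator approach, here is a sketch of a generator for the usual reflected CRC-32 table (polynomial 0xEDB88320), written in C. It is not the script referred to above; the function names and the output layout are assumptions.

```c
#include <stdint.h>
#include <stdio.h>

/* One entry of the reflected CRC-32 table (polynomial 0xEDB88320). */
uint32_t crc32_entry(uint32_t index)
{
    uint32_t value = index;
    for (int bit = 0; bit < 8; bit++)
        value = (value & 1u) ? (value >> 1) ^ 0xEDB88320u : value >> 1;
    return value;
}

/* Write the whole table to the stream as C source, four entries per line. */
void emit_crc32_table(FILE *out)
{
    fprintf(out, "static const unsigned long crc32_table[256] = {\n");
    for (uint32_t i = 0; i < 256; i++)
        fprintf(out, "    0x%08lXUL,%s", (unsigned long)crc32_entry(i),
                (i % 4 == 3) ? "\n" : "");
    fprintf(out, "};\n");
}
```

A build step would call emit_crc32_table(stdout) from a small main and redirect the output into a header that the real program includes, so the table never has to be computed at run time or expressed in the preprocessor.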


And if you really want to use a pre-processing macro style, then there 
are more powerful languages suited to that.  You could use PHP, for 
example - while the output of a PHP script is usually HTML, there is no 
reason why it couldn't be used as a C pre-processor.






say i have these definitions:

#define MACRO_1   (x/y)*y
#define MACRO_2   sqrt(a)
#define MACRO_3   calc13()
#define MACRO_15  (a + b)/c


now, all throughout the codebase, whenever and whichever of MACRO_1,
or MACRO_2 (or so forth) needs to be called, they are conveniently
indexed by another macro expansion:

#define CONCAT(a, b)      a##b
#define CONCAT_VAR(a, b)  CONCAT(a, b)

#define MASTER_MACRO(N) CONCAT_VAR(MACRO_, N)

now, if we use MASTER_MACRO with a direct value:

MASTER_MACRO(10) or #define N  10 MASTER_MACRO(10) both
will work.
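
A small sketch of the two working cases just described. The values given to MACRO_1 and MACRO_10 and the name INDEX are made up for illustration. The extra CONCAT_VAR level matters because a macro argument that sits next to ## is pasted as written, without being expanded first.

```c
#define MACRO_1   100
#define MACRO_10  110

#define CONCAT(a, b)      a##b
#define CONCAT_VAR(a, b)  CONCAT(a, b)   /* expands its arguments, then pastes */

#define MASTER_MACRO(N)   CONCAT_VAR(MACRO_, N)

#define INDEX 10

/* Literal index: pastes to MACRO_10. */
int direct(void)   { return MASTER_MACRO(10); }

/* Index given through another macro: INDEX is expanded to 10 before the
   paste happens, so this also becomes MACRO_10. */
int indirect(void) { return MASTER_MACRO(INDEX); }
```

This only works while the index expands to a single token such as 10. An arithmetic expression expands to many tokens and is never reduced to a number.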


but substitute this with:

#define N ((5*a)/c + (10*b)/c + ((5*a) % c + (10*b) % c)/c)

and MASTER_MACRO expands to: MACRO_((5*a)/c + (10*b)/c + ((5*a) % c +
(10*b) % c)/c)

which, of course is wrong. there are other workarounds or many times
this scheme can be avoided altogether. but it can be made to work
(elegantly) by adding an eval preprocessor operation:

so we redefine MASTER_MACRO this way:

#define MASTER_MACRO(N)  CONCAT_VAR(MACRO_, eval(N))

which evaluates correctly.
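
For comparison, the standard preprocessor does evaluate integer arithmetic, but only inside #if and #elif. A minimal sketch of that workaround, assuming a, b and c are themselves macros that expand to integer constants. The values 4, 3 and 5, the names A, B, C, N_EXPR and SELECTED, and the bodies of MACRO_9 and MACRO_10 are all hypothetical.

```c
#define A 4
#define B 3
#define C 5

/* Same shape as the expression in the message, with macro operands. */
#define N_EXPR ((5*A)/C + (10*B)/C + ((5*A) % C + (10*B) % C)/C)

#define MACRO_9   90
#define MACRO_10  100

/* The preprocessor evaluates N_EXPR here: 4 + 6 + 0 = 10. */
#if N_EXPR == 9
#  define SELECTED MACRO_9
#elif N_EXPR == 10
#  define SELECTED MACRO_10
#else
#  error "no MACRO_<n> defined for this value of N_EXPR"
#endif

int selected_value(void)
{
    return SELECTED;
}
```

The cost is one branch per possible value, which is presumably the inelegance the proposed eval is meant to remove.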

this nifty trick (though a bit extended than what i elaborated above)
can also be used to *finally* have increments and decrements (among
others). since eval forces the evaluation of an *arithmetic*
expression (for now), it will force the evaluation of an expression,
then define it to itself. this will of course trigger a redefinition
flag from our beloved preprocessor, but the defined effect would be:

#define X (((14*x)/y)/z)       /* say this evaluates to simply 3 */

incrementing X, will simply be:

#define X eval(eval(X) + 1)    /* 1) will be evaluated as 4 before any token substitution */
#define X eval(eval(X) + 1)    /* 2) will be evaluated as 5 before any token substitution */

that easy.

to suppress the redef warnings, we can have another directive like
force_redef (which can only work in conjunction with eval)
#force_redef  X eval(eval(X) + 1)


i'm just confused :-S... why hasn't this been suggested? i would love
to have this incorporated (even just on test builds) to gcc. it would
make my code so, so much more manageable and virtually extensible to
more platforms.

i would love to have a go at it and probably modify the gcc
preprocessor, but since i know nothing of its implementation
details, i don't know where to begin. i was hoping that this being a
gnu implementation, it's been heavily modularized (the fact that gcc
was heavily revised back then to use abstract syntax trees, gimple,
etc, past version 2.95 -- ???). so i can easily interrupt the

return vs simple_return

2011-12-28 Thread Michael Eager


Hi --

I've run into a problem with the MicroBlaze backend
where it is not recognizing a return pattern.  I'm
trying to modify the back end to use the 'simple_return'
pattern, rather than 'return', since MicroBlaze has
exactly what the documentation describes:  a no-frills
return instruction which does nothing more than branch
back to the caller.

When I define only 'simple_return', there are undefined
references in function.c for emit_return_into_block()
and emit_use_return_register_into_block(), since these
are defined when HAVE_return is defined.

MIPS has a similar call/return model, with a trivial
return instruction.  mips.md defines expanders for both
'return' and 'simple_return' and identical insn's for both
which generate the return jump.

ARM also has a simple return, but the back end defines
'return' and does not define 'simple_return'.

My guess is that the #ifdef HAVE_return in function.c
which surrounds the undefined functions should be removed.

What is the correct model for the back end?  Define only
'return' like ARM, define both 'return' and 'simple_return'
like MIPS, or define only 'simple_return' like I tried to do?

--
Michael Eager    ea...@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077

RE: a nifty feature for c preprocessor

2011-12-28 Thread R A


yes, i do realize that c preprocessor is but a text substitution tool from days
past when programmers were only starting to develop the rudiments of
high-level programming. but the reason i'm sticking with the c preprocessor is
the fact that code that i write with it is extremely portable. copy the code
and you can use it in any IDE or stand-alone compiler, it's as simple as that.
i have considered using gnu make, writing scripts with m4 and other parsers or 
lexers, but sticking with the preprocessor's minimalism is still too attractive 
an idea.

about the built in features in c and C++ to alleviate the extensive use for the 
preprocessor, like inline functions, static consts. the fact is NOT ALL 
compilers out there would optimize a function so that it will not have to use a 
return stack. simply using a macro FORCES the compiler to do so. the same goes 
for static const, if you use a precompiled value, you are forcing an immediate 
addressing, something of a good optimization. so it's still mostly an issue of 
portability of optimization.

templates, i have no problem with, i wish there could be a C dialect that can
integrate it, so i wouldn't have to be forced to use C++ and all the bloat that
usually comes from a lot of its implementation (by that i mean a performance
close to C i think is very possible for C++'s library).

but, of course, one has to ask if you're making your code portable to any C 
compiler, why do you want gcc to change (or modify it for your own use)? you 
should be persuading the c committee.
well, that's the thing, it's harder to do the latter, so by doing this, i can 
demonstrate that it's a SIMPLE, but good idea.


 Date: Wed, 28 Dec 2011 10:57:28 +0100
 From: da...@westcontrol.com
 To: ren_zokuke...@hotmail.com
 CC: gcc@gcc.gnu.org
 Subject: Re: FW: a nifty feature for c preprocessor


Re: a nifty feature for c preprocessor

2011-12-28 Thread Jonathan Wakely

On 28 December 2011 20:57, R A wrote:

 templates, i have no problem with, i wish there could be a C dialect that can 
 integrate it, so i wouldn't have to be forced to use C++ and all the bloat 
 that usually come from a lot of it's implementation (by that i mean a 
 performance close to C i think is very possible for C++'s library).

What bloat?  If you only use the subset of C++ that is compatible
with C then you don't get any additional cost, you are not forced
to use anything, or to get any mythical bloat.

 but, of course, one has to ask if you're making your code portable to any C 
 compiler, why do you want gcc to change (or modify it for your own use)? you 
 should be persuading the c committee.
 well, that's the thing, it's harder to do the latter, so by doing this, i can 
 demonstrate that it's a SIMPLE, but good idea.

It's not simple, or IMHO a good idea.

Re: a nifty feature for c preprocessor

2011-12-28 Thread David Brown


On 28/12/11 21:57, R A wrote:


yes, i do realize that c preprocessor is but a text substitution tool
from days past when programmers were only starting to develop the
rudiments of high-level programming. but the reason i'm sticking
with the c preprocessor is the fact that code that i write with it is
extremely portable. copy the code and you can use it in any IDE or
stand-alone compiler, it's as simple as that. i have considered using
gnu make, writing scripts with m4 and other parsers or lexers, but
sticking with the preprocessor's minimalism is still too attractive
an idea.



If you want portable, use features that already exist.  Lots of people 
write lots of C code that is portable across huge ranges of compilers 
and target processors.


And if you want portable pre-processing or code generation, use 
something that generates the code rather than inventing tools and 
features that don't exist, nor will ever exist.  It is also quite common 
to use scripts in languages like perl or python to generate tables and 
other pre-calculated values for inclusion in C code.



about the built in features in c and C++ to alleviate the extensive
use for the preprocessor, like inline functions, static consts. the
fact is NOT ALL compilers out there would optimize a function so that
it will not have to use a return stack. simply using a macro FORCES
the compiler to do so. the same goes for static const, if you use a
precompiled value, you are forcing an immediate addressing, something
of a good optimization. so it's still mostly an issue of portability
of optimization.



Most modern compilers will do a pretty reasonable job of constant 
propagation and calculating expressions using constant values.  And most 
will apply inline as you would expect, unless you intentionally hamper 
the compiler by not enabling optimisations.  Using macros, incidentally, 
does not FORCE the compiler to do anything - I know at least one
compiler that will take common sections of code (from macros or normal
text) and refactor it into artificial functions, expending stack space and
run time speed to reduce code size.  And immediate addressing is not
necessarily a good optimisation - beware making generalisations like 
that.  Let the compiler do what it is good at doing - generating optimal 
code for the target in question - and don't try to second-guess it.  You 
will end up with bigger and slower code.



templates, i have no problem with, i wish there could be a C dialect
that can integrate it, so i wouldn't have to be forced to use C++ and
all the bloat that usually comes from a lot of its implementation (by
that i mean a performance close to C i think is very possible for
C++'s library).



C++ does not have bloat.  The only feature of C++ that can occasionally 
lead to larger or slower code, or fewer optimisations, than the same 
code in C is exceptions - if you don't need them, disable them with 
-fno-exceptions.  Other than that C++ is zero cost compared to C - you 
only pay for the features you use.



but, of course, one has to ask if you're making your code portable
to any C compiler, why do you want gcc to change (or modify it for
your own use)? you should be persuading the c committee. well,
that's the thing, it's harder to do the latter, so by doing this, i
can demonstrate that it's a SIMPLE, but good idea.



It's not a good idea, and it would not be simple to implement.

I really don't want to discourage someone from wanting to contribute to 
gcc development, but this is very much a dead-end idea.  I applaud your 
enthusiasm, but keep a check on reality - you are an amateur just 
starting C programming.  C has been used for the last forty years - with 
gcc coming up for its 25th birthday this spring.  If this idea were that 
simple, and that good, it would already be implemented.  As you gain 
experience and knowledge with C (and possibly C++), you will quickly 
find that a preprocessor like you describe is neither necessary nor 
desirable.


mvh.,

David







RE: a nifty feature for c preprocessor

2011-12-28 Thread R A


 And if you want portable pre-processing or code generation, use
 something that generates the code rather than inventing tools and
 features that don't exist, nor will ever exist. It is also quite common
 to use scripts in languages like perl or python to generate tables and
 other pre-calculated values for inclusion in C code.

though there are things that i will not disclose, i've never had to invent any
tools for the project i'm working on; everything is legit. this is the only
time that i've had to. so believe me when i say i've considered all
*conventional* solutions.


 Most modern compilers will do a pretty reasonable job of constant
 propagation and calculating expressions using constant values. And most
 will apply inline as you would expect, unless you intentionally hamper
 the compiler by not enabling optimisations. Using macros, incidentally,
 does not FORCE the compiler to do anything - I know at least one
 compiler that will take common sections of code (from macros or normal
 text) and refactor it into artificial functions, expending stack space and
 run time speed to reduce code size. And immediate addressing is not
 necessarily a good optimisation - beware making generalisations like
 that. Let the compiler do what it is good at doing - generating optimal
 code for the target in question - and don't try to second-guess it. You
 will end up with bigger and slower code.

i'm not one to share techniques/methodologies, 1) but if it's the case for more 
than, say 70%, of systems/processors and 2) it takes very little penalty;
then i'd write it that way. if it's not optimized, just let the compiler (if 
it's as good as you say it is) re-optimize it. if the compiler ain't good 
enough to do that, well it's not a good compiler anyway. but the code will 
still work.

 I really don't want to discourage someone from wanting to contribute to
 gcc development, but this is very much a dead-end idea. I applaud your
 enthusiasm, but keep a check on reality - you are an amateur just
 starting C programming. C has been used for the last forty years - with
 gcc coming up for its 25th birthday this spring. If this idea were that
 simple, and that good, it would already be implemented. As you gain
 experience and knowledge with C (and possibly C++), you will quickly
 find that a preprocessor like you describe is neither necessary nor
 desirable.

you know there's no way i can answer that without invoking the wrath of the
community.

FW: a nifty feature for c preprocessor

2011-12-28 Thread R A


sorry:
2) it takes very little penalty, otherwise.

Re: a nifty feature for c preprocessor

2011-12-28 Thread R A


that all being said, i really don't think it's a hard feature to implement.
like i said, just whenever there is 1) an evaluation in the conditional
directives or 2) a #define, look for eval; if it's there, evaluate the
expression, then substitute the token.

the rest needs no tampering at all. libcpp's implementation is great,
neatly divided. probably have to edit only half a dozen files, at most -- at
least from what i can tell from scanning the code.

it'll just take me a long time to know how to work with setting all the flags, 
attributes, and working with the structs, so it's hard for me to do by myself.

[Bug ada/51691] New: Cast of an array with type generates a please file bug message (See below)

2011-12-28 Thread alexis at m2osw dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51691

             Bug #: 51691
           Summary: Cast of an array with type generates a please file
                    bug message (See below)
    Classification: Unclassified
           Product: gcc
           Version: 4.4.5
            Status: UNCONFIRMED
          Severity: minor
          Priority: P3
         Component: ada
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ale...@m2osw.com


Created attachment 26193
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26193
Case Folding implementation for my own Ada compiler

---
prompt gnatmake case_folding

gcc-4.4 -c case_folding.adb
+===========================GNAT BUG DETECTED==============================+
| 4.4.5 (x86_64-pc-linux-gnu) Assert_Failure sinfo.adb:880                 |
| Error detected at case_folding.adb:401:32                                |
| Please submit a bug report; see http://gcc.gnu.org/bugs.html.            |
| Use a subject line meaningful to you and us to track the bug.            |
| Include the entire contents of this bug box in the report.               |
| Include the exact gcc-4.4 or gnatmake command that you entered.          |
| Also include sources listed below in gnatchop format                     |
| (concatenated together with no headers between files).                   |
+==========================================================================+

Please include these source files with error report
Note that list may not be accurate in some cases,
so please double check that the problem can still
be reproduced with the set of files listed.

case_folding.adb









case_folding.adb:401:53: missing )
compilation abandoned
gnatmake: case_folding.adb compilation error
---

As I type fast, the error came from this line:

  output_line(1 .. indent) := string(1 .. indent => ' ');

which includes an invalid cast, the proper line should be (without string):

  output_line(1 .. indent) := (1 .. indent => ' ');

There are still problems on line 403 which I left in case the bug would not be
reported without that other error (unlikely though.)

Just in case, I'm on Ubuntu 11.04. I use the stock version of Ada.

---
More info about my project can be found here:
http://aada.m2osw.com/compiler

[Bug tree-optimization/51684] [4.7 Regression]: ICE in gfortran.dg/maxloc_bounds_5 on ia64

2011-12-28 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51684

--- Comment #2 from Uros Bizjak <ubizjak at gmail dot com> 2011-12-28 09:06:45 UTC ---
(In reply to comment #1)
> Untested patch:

I have bootstrapped and regression tested the patch on ia64-unknown-linux-gnu
[1], where it fixes all mentioned failures.

[1] http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg02709.html

[Bug rtl-optimization/51667] [4.7 Regression] new FAIL: 27_io/basic_stream/ execution test with -m32

2011-12-28 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51667

--- Comment #19 from Uros Bizjak <ubizjak at gmail dot com> 2011-12-28 09:09:02 UTC ---
FYI, the patch also works correctly on alpha [1], a target with sign-extended
instructions.

[1] http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg02710.html

[Bug target/50038] redundant zero extensions

2011-12-28 Thread ubizjak at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50038

Uros Bizjak <ubizjak at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED
   Target Milestone|---                         |4.7.0

--- Comment #9 from Uros Bizjak <ubizjak at gmail dot com> 2011-12-28 09:13:03 UTC ---
Patch was committed to mainline.

[Bug testsuite/50722] FAIL: gcc.dg/pr49994-3.c (test for excess errors)

2011-12-28 Thread uros at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50722

--- Comment #7 from uros at gcc dot gnu.org 2011-12-28 09:16:28 UTC ---
Author: uros
Date: Wed Dec 28 09:16:24 2011
New Revision: 182704

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182704
Log:
PR testsuite/50722
* gcc.dg/pr49994-3.c: Skip on ia64-*-*-*, hppa*-*-* and *-*-hpux*.


Modified:
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/gcc.dg/pr49994-3.c

[Bug tree-optimization/51684] [4.7 Regression]: ICE in gfortran.dg/maxloc_bounds_5 on ia64

2011-12-28 Thread irar at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51684

--- Comment #3 from irar at gcc dot gnu.org 2011-12-28 09:20:20 UTC ---
Author: irar
Date: Wed Dec 28 09:20:16 2011
New Revision: 182705

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182705
Log:

PR tree-optimization/51684
* tree-vect-slp.c (vect_schedule_slp_instance): Get gsi of
original statement in case of a pattern.
(vect_schedule_slp): Likewise.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/tree-vect-slp.c

[Bug target/51685] FAIL: gcc.dg/tm/pr51472.c (internal compiler error) on ppc--, s390--, spu--

2011-12-28 Thread hp at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51685

Hans-Peter Nilsson <hp at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2011-12-28
                 CC|                            |hp at gcc dot gnu.org
     Ever Confirmed|0                           |1

--- Comment #1 from Hans-Peter Nilsson <hp at gcc dot gnu.org> 2011-12-28 09:38:27 UTC ---
I checked my logs for r182695 (latest at this time) and yes, cris-axi-elf too,
same message.  A quick peek in gcc-testresults@ shows the same error for
armv7l-unknown-linux-gnueabi
(http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg02689.html) and ia64-linux
(http://gcc.gnu.org/ml/gcc-testresults/2011-12/msg02709.html) so it looks
almost universal.

[Bug tree-optimization/51684] [4.7 Regression]: ICE in gfortran.dg/maxloc_bounds_5 on ia64

2011-12-28 Thread irar at il dot ibm.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51684

Ira Rosen <irar at il dot ibm.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #4 from Ira Rosen <irar at il dot ibm.com> 2011-12-28 10:22:07 UTC ---
Fixed.

[Bug tree-optimization/51692] New: [4.7 Regression] ICE on several valgrind tests

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51692

             Bug #: 51692
           Summary: [4.7 Regression] ICE on several valgrind tests
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
            Target: x86_64-linux


int
main ()
{
  volatile double d = 0.0;
  double *p = __builtin_calloc (1, sizeof (double));
  d += 1.0;
  *p += 2.0;
  __builtin_free (p);
  return 0;
}

ICEs at -O2, the free argument becomes a freed SSA_NAME for some reason.
Started with http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182009

[Bug tree-optimization/51692] [4.7 Regression] ICE on several valgrind tests

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51692

Jakub Jelinek <jakub at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.7.0

[Bug testsuite/51693] New: New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread irar at il dot ibm.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

             Bug #: 51693
           Summary: New XPASSes in vectorizer testsuite on
                    powerpc64-suse-linux
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: testsuite
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: i...@il.ibm.com
                CC: michael.v.zolotuk...@gmail.com
              Host: powerpc64-suse-linux
            Target: powerpc64-suse-linux
             Build: powerpc64-suse-linux


Revision 182583 http://gcc.gnu.org/viewcvs?view=revision&revision=182583 caused
several XPASSes on powerpc64-suse-linux:

XPASS: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect Alignment of
access forced using peeling 2
XPASS: gcc.dg/vect/vect-multitypes-1.c scan-tree-dump-times vect Vectorizing
an unaligned access 4
XPASS: gcc.dg/vect/vect-peel-3.c scan-tree-dump-times vect Vectorizing an
unaligned access 1
XPASS: gcc.dg/vect/vect-peel-3.c scan-tree-dump-times vect Alignment of access
forced using peeling 1
XPASS: gcc.dg/vect/vect-multitypes-1.c -flto scan-tree-dump-times vect
Alignment of access forced using peeling 2
XPASS: gcc.dg/vect/vect-multitypes-1.c -flto scan-tree-dump-times vect
Vectorizing an unaligned access 4
XPASS: gcc.dg/vect/vect-peel-3.c -flto scan-tree-dump-times vect Vectorizing
an unaligned access 1
XPASS: gcc.dg/vect/vect-peel-3.c -flto scan-tree-dump-times vect Alignment of
access forced using peeling 1
XPASS: gcc.dg/vect/no-section-anchors-vect-69.c scan-tree-dump-times vect
Alignment of access forced using peeling 2

The reason is that {!vect_aligned_arrays} was added to xfail of the above
checks, while vect_aligned_arrays is false for power.

Changing that, i.e.:
Index: ../../lib/target-supports.exp
===================================================================
--- ../../lib/target-supports.exp   (revision 182703)
+++ ../../lib/target-supports.exp   (working copy)
@@ -3222,7 +3222,8 @@ proc check_effective_target_vect_aligned_arrays {
                set et_vect_aligned_arrays_saved 1
            }
        }
-       if [istarget spu-*-*] {
+       if {[istarget spu-*-*]
+           || [istarget powerpc*-*-*] } {
            set et_vect_aligned_arrays_saved 1
        }
     }

fixes the XPASSes and doesn't cause any problems (on powerpc64-suse-linux), but
AFAIU arrays are not always vector aligned on power, so this is not a good
idea, unless we change the definition of
check_effective_target_vect_aligned_arrays.

What was the purpose of adding {!vect_aligned_arrays} to these tests? If
peeling is impossible on AVX because arrays are never vector aligned, maybe we
need a new target check instead of vect_aligned_arrays?

[Bug tree-optimization/51694] New: [4.7 Regression] ICE while compiling alliance package

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51694

             Bug #: 51694
           Summary: [4.7 Regression] ICE while compiling alliance package
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
                CC: mkuvyr...@gcc.gnu.org
            Target: x86_64-linux


void
foo (x, fn)
  void (*fn) ();
{
  int a = baz ((void *) 0, x);
  (*fn) (x, 0);
}

void
bar (void)
{
  void *x = 0;
  foo (x);
}

ICEs at -O2 starting with
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=181377

[Bug tree-optimization/51694] [4.7 Regression] ICE while compiling alliance package

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51694

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|---                         |4.7.0

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread michael.v.zolotukhin at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #1 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2011-12-28 11:08:36 UTC ---
I thought that if {vect_aligned_arrays} isn't true, then arrays could
be aligned even after peeling - that's why I added such a check.
Unfortunately, I can't reproduce these failures, as I have no PowerPC. By
the way, if arrays aren't aligned on Power, why does GCC produce such
messages - does it really try to peel something? Maybe we should just
refine the check?
Anyway, if everything is ok with the tests (in the original version) and
with gcc itself - we could check not for vect_aligned_arrays, but for
AVX. Please check
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600.html and the
patch attached to that message.

Thanks, Michael



[Bug tree-optimization/51694] [4.7 Regression] ICE while compiling alliance package

2011-12-28 Thread mkuvyrkov at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51694

Maxim Kuvyrkov mkuvyrkov at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011-12-28
     Ever Confirmed|0                           |1

--- Comment #1 from Maxim Kuvyrkov mkuvyrkov at gcc dot gnu.org 2011-12-28 
11:09:29 UTC ---
Will investigate.

Jakub, thanks for reporting this.

[Bug debug/51695] [4.7 Regression] ICE while compiling argyllcms package

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51695

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|tree-optimization           |debug
   Target Milestone|---                         |4.7.0

[Bug tree-optimization/51695] New: [4.7 Regression] ICE while compiling argyllcms package

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51695

             Bug #: 51695
           Summary: [4.7 Regression] ICE while compiling argyllcms package
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Keywords: ice-on-valid-code
          Severity: normal
          Priority: P3
         Component: tree-optimization
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: ja...@gcc.gnu.org
                CC: aol...@gcc.gnu.org
            Target: x86_64-linux


typedef struct
{
  struct { unsigned int t1, t2, t3, t4, t5, t6; } t;
  int p;
  struct { double X, Y, Z; } r;
} T;
typedef struct { T *h; } S;

static unsigned int v = 0x12345678;

int
foo (void)
{
  v = (v & 0x8000) ? ((v << 1) ^ 0xa398655d) : (v << 1);
  return 0;
}

double
bar (void)
{
  unsigned int o;
  v = (v & 0x8000) ? ((v << 1) ^ 0xa398655d) : (v << 1);
  o = v & 0x;
  return (double) o / 32768.0;
}

int
baz (void)
{
  foo ();
  return 0;
}

void
test (S *x)
{
  T *t = x->h;
  t->t.t1 = foo ();
  t->t.t2 = foo ();
  t->t.t3 = foo ();
  t->t.t4 = foo ();
  t->t.t5 = foo ();
  t->t.t6 = foo ();
  t->p = baz ();
  t->r.X = bar ();
  t->r.Y = bar ();
  t->r.Z = bar ();
}

ICEs at -O2 -g, starting with
http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=180194

[Bug debug/51695] [4.7 Regression] ICE while compiling argyllcms package

2011-12-28 Thread jakub at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51695

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org 2011-12-28 
11:35:23 UTC ---
The NOTE_INSN_VAR_LOCATION argument for variable o is extremely huge in this
case and we hit the 64KB limit on .debug_loc expressions.
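
The 64KB figure follows from the encoding. Assuming the location-list format of DWARF versions 2 through 4, each .debug_loc entry stores the byte length of its expression in a 2-byte unsigned field. A minimal sketch of that size check follows; the helper name is hypothetical and is not a GCC function.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC: returns 1 when a location
   expression of EXPR_LEN bytes can be described by the 2-byte length
   field of a .debug_loc entry, and 0 otherwise.  */
static int
loc_expr_fits_debug_loc (size_t expr_len)
{
  return expr_len <= UINT16_MAX;
}
```

An expression of 65535 bytes still fits; one of 65536 bytes does not.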

[Bug target/51345] [avr] Devices with 8-bit SP need their own multilib(s)

2011-12-28 Thread gjl at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51345

--- Comment #3 from Georg-Johann Lay gjl at gcc dot gnu.org 2011-12-28 
12:21:40 UTC ---
Created attachment 26194
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26194
tentative patch

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread irar at il dot ibm.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #2 from Ira Rosen irar at il dot ibm.com 2011-12-28 12:27:18 UTC 
---
(In reply to comment #1)
> I thought that if {vect_aligned_arrays} isn't true, then arrays could
> be aligned even after peeling - that's why I added such check.

Sorry, I don't understand this sentence. What do you mean by aligned after
peeling? Could you please explain what exactly happens on AVX (a dump file with
-fdump-tree-vect-details would be the best thing).

> Unfortunately, I can't reproduce these fails, as I have no PowerPC. By
> the way, if arrays aren't aligned on Power, why does GCC produce such
> messages - does it really try to peel something?

The arrays in the tests are aligned. I said that I think that we can't promise
that all the arrays are vector aligned on power. BTW, we can peel for unknown
misalignment as well.

> Maybe we should just
> refine the check?
> Anyway, if everything is ok with the tests (in original version) and
> with gcc itself - we could check not for vect_aligned_arrays, but for
> AVX. Please check
> http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600.html and the
> attached to that letter patch.

I think that everything was ok, but I don't think that using vect_sizes_32B_16B
is a good idea. I would really like to see an AVX vect dump for eg.
vect-peel-3.c.

Thanks,
Ira

 
> Thanks, Michael

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread michael.v.zolotukhin at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #3 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2011-12-28 12:59:24 UTC ---
Created attachment 26195
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26195
AVX2 vect dump

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread michael.v.zolotukhin at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #4 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2011-12-28 13:01:51 UTC ---
(In reply to comment #2)
> > I thought that if {vect_aligned_arrays} isn't true, then arrays could
> > be aligned even after peeling - that's why I added such check.
>
> Sorry, I don't understand this sentence. What do you mean by aligned after
> peeling? Could you please explain what exactly happens on AVX (a dump file
> with -fdump-tree-vect-details would be the best thing).

Sorry, I mistyped. I meant that arrays couldn't be aligned - at least
without some runtime checks. I.e. we can't peel some compile-time-known number
of iterations and be sure that the array becomes aligned.

E.g., if we have an array IA of ints aligned to 16 bytes, and we have the access
IA[i+3], then peeling one iteration will guarantee alignment to 16 bytes. But
we don't know how many iterations need to be peeled to reach alignment to
32 bytes (as needed for AVX operations).
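
The arithmetic behind this can be sketched as below. This is only an illustration, assuming 4-byte ints; peel_count and its parameters are hypothetical names, not vectorizer code.

```c
#include <assert.h>

/* Hypothetical helper: number of scalar iterations to peel so that the
   access A[i + first_index] becomes aligned to VEC_BYTES.  BASE_MOD is
   the address of A modulo VEC_BYTES and ELEM_SIZE is the element size
   in bytes.  */
static unsigned
peel_count (unsigned base_mod, unsigned first_index,
            unsigned elem_size, unsigned vec_bytes)
{
  unsigned misalign = (base_mod + first_index * elem_size) % vec_bytes;

  /* Peeling whole iterations can only fix a misalignment that is a
     multiple of the element size.  */
  assert (misalign % elem_size == 0);
  return ((vec_bytes - misalign) % vec_bytes) / elem_size;
}
```

For IA[i+3], peel_count (0, 3, 4, 16) is 1. For 32-byte vectors a 16-byte-aligned base may be either 0 or 16 modulo 32, giving peel_count (0, 3, 4, 32) = 5 or peel_count (16, 3, 4, 32) = 1, so the count is not known at compile time.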

> > Unfortunately, I can't reproduce these fails, as I have no PowerPC. By
> > the way, if arrays aren't aligned on Power, why does GCC produce such
> > messages - does it really try to peel something?
>
> The arrays in the tests are aligned. I said that I think that we can't promise
> that all the arrays are vector aligned on power. BTW, we can peel for unknown
> misalignment as well.

In this case we shouldn't add Power to vect_aligned_arrays, I guess.

> > Maybe we should just
> > refine the check?
> > Anyway, if everything is ok with the tests (in original version) and
> > with gcc itself - we could check not for vect_aligned_arrays, but for
> > AVX. Please check
> > http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600.html and the
> > attached to that letter patch.
>
> I think that everything was ok, but I don't think that using
> vect_sizes_32B_16B is a good idea. I would really like to see an AVX vect
> dump for eg. vect-peel-3.c.

In vect-peel-3.c we actually assume that the vector length is 16 bytes. Here is the
loop body:
  suma += ia[i];
  sumb += ib[i+5];
  sumc += ic[i+1];
When the vector size is 16, peeling can make two of the three accesses aligned,
but when the vector size is 32 that's impossible. That's why using
vect_sizes_32B_16B might be correct here.
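
A small sketch of that counting argument, assuming 4-byte ints (so 4 or 8 elements per vector) and assuming the arrays themselves start on a vector boundary. The helper names are made up for illustration.

```c
#include <assert.h>

/* Count how many of the N accesses a[i + offsets[k]] start on a vector
   boundary after peeling PEEL iterations, for vectors of VF elements.  */
static int
aligned_after_peel (const int *offsets, int n, int vf, int peel)
{
  int count = 0;
  for (int k = 0; k < n; k++)
    if ((offsets[k] + peel) % vf == 0)
      count++;
  return count;
}

/* Best count over every possible peel amount 0 .. VF-1.  */
static int
best_aligned (const int *offsets, int n, int vf)
{
  int best = 0;
  assert (vf > 0);
  for (int peel = 0; peel < vf; peel++)
    {
      int c = aligned_after_peel (offsets, n, vf, peel);
      if (c > best)
        best = c;
    }
  return best;
}
```

With offsets {0, 5, 1}, best_aligned gives 2 for vf = 4 (peel 3 iterations, aligning ib and ic) and 1 for vf = 8, where no single peel amount aligns more than one access.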

Also, I uploaded the dump you asked for.

Michael

> Thanks,
> Ira

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread irar at il dot ibm.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #5 from Ira Rosen irar at il dot ibm.com 2011-12-28 13:11:53 UTC 
---
(In reply to comment #4)

> In vect-peel-3.c we actually assume that vector length is 16 byte. Here is the
> loop body:
>   suma += ia[i];
>   sumb += ib[i+5];
>   sumc += ic[i+1];
> When vector-size is 16, then peeling can make two of three accesses aligned,
> but when vector size is 32 that's impossible. That's why using
> vect_sizes_32B_16B might be correct here.

Ah, now I understand. I was confused by vect_aligned_arrays, and it's
irrelevant here, right?

Yes, vect_sizes_32B_16B seems to be ok in that case.

Thanks,
Ira

[Bug c++/51680] g++ 4.7 fails to inline trivial template stuff

2011-12-28 Thread marc.glisse at normalesup dot org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51680

Marc Glisse marc.glisse at normalesup dot org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marc.glisse at normalesup
                   |                            |dot org

--- Comment #7 from Marc Glisse marc.glisse at normalesup dot org 2011-12-28 
13:44:16 UTC ---
With g++-4.6, -O1 -finline-small-functions already inlines everything, so maybe
the definition of "small" somehow changed a bit? g++-4.7 -fdump-ipa-all says
that it doesn't inline because "function not declared inline and code size
would grow". g++-4.6 only tells me that the code size was unchanged by inlining
the 2 calls.

[Bug c++/51547] auto, type deduction, reference collapsing and const: invalid initialization of reference of type 'const X' from expression of type 'const X'

2011-12-28 Thread paolo at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51547

--- Comment #4 from paolo at gcc dot gnu.org paolo at gcc dot gnu.org 
2011-12-28 15:53:01 UTC ---
Author: paolo
Date: Wed Dec 28 15:52:54 2011
New Revision: 182709

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182709
Log:
2011-12-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/51547
* g++.dg/cpp0x/pr51547.C: New.

Modified:
trunk/gcc/testsuite/ChangeLog

[Bug target/51244] SH Target: Inefficient conditional branch

2011-12-28 Thread oleg.e...@t-online.de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #6 from Oleg Endo oleg.e...@t-online.de 2011-12-28 15:59:35 UTC 
---
(In reply to comment #3)
> Created attachment 26191 [details]
> Proposed patch to improve some of the issues.
>
> The attached patch removes the useless sequence and still allows the -1
> constant to be CSE-ed for such cases as the example function above.
>
> I haven't run all tests on it yet, but CSiBE shows average code size reduction
> of approx. -0.1% for -m4* with some code size increases in some files.

Some of the code size increases are caused by the ifcvt.c pass which tries to
transform sequences like:

int test_func_6 (int a, int b, int c)
{
  if (a == 16)
    c = 0;
  return b + c;
}

into branch-free code like:
        mov     r4,r0       ! 45  movsi_ie/2         [length = 2]
        cmp/eq  #16,r0      ! 9   cmpeqsi_t/2        [length = 2]
        mov     #-1,r0      ! 34  movsi_ie/3         [length = 2]
        negc    r0,r0       ! 38  *negc              [length = 2]
        neg     r0,r0       ! 36  negsi2             [length = 2]
        and     r6,r0       ! 37  *andsi3_compact/2  [length = 2]
        rts                 ! 48  *return_i          [length = 2]
        add     r5,r0       ! 14  *addsi3_compact    [length = 2]

instead of the more compact (and on SH4 most likely better):
        mov     r4,r0       ! 41  movsi_ie/2         [length = 2]
        cmp/eq  #16,r0      ! 9   cmpeqsi_t/2        [length = 2]
        bf      0f          ! 34  *movsicc_t_true/2  [length = 4]
        mov     #0,r6
0:
        add     r5,r6       ! 14  *addsi3_compact    [length = 2]
        rts                 ! 44  *return_i          [length = 2]
        mov     r6,r0       ! 19  movsi_ie/2         [length = 2]

This particular case is handled in noce_try_store_flag_mask, which does the
transformation if BRANCH_COST >= 2, which is true for -m4.  I guess before the
patch ifcvt didn't realize that this transformation can be applied.

I've tried setting BRANCH_COST to 1, which avoids this transformation but
increases overall code size a bit.
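
At the source level, the store-flag-mask idea corresponds roughly to the following. This is an illustrative C rendering rather than what ifcvt emits; the function names are made up, and two's complement integers are assumed.

```c
/* The branchy source form from the example above.  */
static int
test_func_6_branch (int a, int b, int c)
{
  if (a == 16)
    c = 0;
  return b + c;
}

/* Branch-free form: MASK is all ones when a != 16 and zero otherwise,
   so (c & mask) is either c or 0.  */
static int
test_func_6_mask (int a, int b, int c)
{
  int mask = -(a != 16);
  return b + (c & mask);
}
```

Both functions return the same value for every input. The masked form swaps the branch for extra arithmetic, which is the trade-off behind the longer SH sequence shown above.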

[Bug libstdc++/51673] undefined references / libstdc++-7.dll

2011-12-28 Thread pluto at agmk dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51673

--- Comment #6 from Pawel Sikora pluto at agmk dot net 2011-12-28 16:06:47 
UTC ---
btw, i've tested the default allocator with std::__7 and the i686-pc-mingw32
toolchain works fine while the x86_64-pc-mingw32 reports undefined reference to

.text$_ZN9__gnu_cxx3__713new_allocatorIiE8allocateEyPKv[__gnu_cxx::__7::new_allocator<int>::allocate(unsigned
long long, void const*)]

so, there's a bug with symbol exporting not directly related to mt_allocator.
_Znwj vs. _Znwy issue?

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread michael.v.zolotukhin at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #6 from Michael Zolotukhin michael.v.zolotukhin at gmail dot com 
2011-12-28 16:19:54 UTC ---
(In reply to comment #5)
> > In vect-peel-3.c we actually assume that vector length is 16 byte. Here is
> > the loop body:
> >   suma += ia[i];
> >   sumb += ib[i+5];
> >   sumc += ic[i+1];
> > When vector-size is 16, then peeling can make two of three accesses aligned,
> > but when vector size is 32 that's impossible. That's why using
> > vect_sizes_32B_16B might be correct here.
>
> Ah, now I understand. I was confused by vect_aligned_arrays, and it's
> irrelevant here, right?

Actually yes, you're right. I think, ideally, vect_aligned_arrays should be
somehow checked in such tests, as in them we assume that the array's beginning is
aligned - but that's not the root cause of the XPASSes.

> Yes, vect_sizes_32B_16B seems to be ok in that case.

The other two tests (vect-multitypes-1.c and no-section-anchors-vect-69.c) look
like they have the same problem - are you OK with a similar fix for them too,
i.e. is the patch
http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600/vec-tests-avx2_fixes-7.patch
OK for trunk?

Thanks, Michael

[Bug rtl-optimization/51623] PowerPC section type conflict

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51623

--- Comment #2 from Michael Meissner meissner at gcc dot gnu.org 2011-12-28 
18:02:56 UTC ---
Author: meissner
Date: Wed Dec 28 18:02:49 2011
New Revision: 182710

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182710
Log:
Fix PR 51623

Added:
trunk/gcc/testsuite/gcc.target/powerpc/pr51623.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/config/rs6000/rs6000.c
trunk/gcc/testsuite/ChangeLog

[Bug rtl-optimization/51623] PowerPC section type conflict

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51623

Michael Meissner meissner at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
                 CC|                            |meissner at gcc dot gnu.org
         Resolution|                            |FIXED
         AssignedTo|unassigned at gcc dot       |meissner at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #3 from Michael Meissner meissner at gcc dot gnu.org 2011-12-28 
18:04:03 UTC ---
Fixed in subversion revision 182710.

[Bug c++/51556] Bizarre member template access control errors

2011-12-28 Thread paolo.carlini at oracle dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51556

--- Comment #5 from Paolo Carlini paolo.carlini at oracle dot com 2011-12-28 
18:12:17 UTC ---
This works with current (Rev 182710) mainline.

[Bug rtl-optimization/49710] [4.7 Regression] segfault

2011-12-28 Thread hubicka at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49710

Jan Hubicka hubicka at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
         AssignedTo|unassigned at gcc dot       |hubicka at gcc dot gnu.org
                   |gnu.org                     |

--- Comment #4 from Jan Hubicka hubicka at gcc dot gnu.org 2011-12-28 
18:41:12 UTC ---
Looking into it now. I am by no means expert on this code ;))

[Bug rtl-optimization/49710] [4.7 Regression] segfault

2011-12-28 Thread hubicka at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49710

--- Comment #5 from Jan Hubicka hubicka at gcc dot gnu.org 2011-12-28 
19:37:38 UTC ---
OK, the loop hierarchy looks as follows:

loop_0 (header = 0, latch = 1, niter = )
{
  bb_2 (preds = {bb_0 }, succs = {bb_3 })
  bb_6 (preds = {bb_5 }, succs = {bb_13 })
  bb_12 (preds = {bb_4 }, succs = {bb_1 })
  loop_4 (header = 13, latch = 14, niter = )
  {
    bb_13 (preds = {bb_6 bb_14 }, succs = {bb_14 })
    bb_14 (preds = {bb_13 }, succs = {bb_13 })
  }
  loop_1 (header = 3, latch = 9, niter = )
  {
    bb_3 (preds = {bb_2 bb_9 }, succs = {bb_4 })
    bb_9 (preds = {bb_8 }, succs = {bb_3 })
    loop_2 (header = 4, latch = 11, niter = )
    {
      bb_4 (preds = {bb_3 bb_11 }, succs = {bb_12 bb_5 })
      bb_5 (preds = {bb_4 }, succs = {bb_6 bb_7 })
      bb_7 (preds = {bb_5 }, succs = {bb_10 })
      bb_11 (preds = {bb_10 }, succs = {bb_4 })
      loop_3 (header = 10, latch = 15, niter = )
      {
        bb_8 (preds = {bb_10 }, succs = {bb_9 bb_15 })
        bb_15 (preds = {bb_8 }, succs = {bb_10 })
        bb_10 (preds = {bb_7 bb_15 }, succs = {bb_8 bb_11 })
      }
    }
  }
}

We remove path from 10 to 8, that is closing the loop of loop_3.
Basic blocks removed are 8 9 and 15.

Finally we fail on BB 3 that is believed to be in loop 1, but header is null at
this point because of code in delete_basic_block:

  /* If we remove the header or the latch of a loop, mark the loop for
     removal by setting its header and latch to NULL.  */
  if (loop->latch == bb
      || loop->header == bb)
    {
      loop->header = NULL;
      loop->latch = NULL;
    }

OK, so it seems that fix_bb_placements is not ready to see loops marked for
removal. I guess the catch is that loop peeling renders bb 3 unreachable.
I however do not understand how loop peeling can make this happen, perhaps
folding of the header condition is done?
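
To make the failure mode concrete, here is a stripped-down model of the marking logic, with simplified stand-in structures rather than GCC's real basic_block and struct loop. The guard loop_marked_for_removal_p is a hypothetical name for the kind of check a consumer such as fix_bb_placements would seem to need.

```c
#include <stddef.h>

/* Simplified stand-ins for GCC's basic_block and struct loop.  */
struct bb_sketch { int index; };
struct loop_sketch
{
  int num;
  struct bb_sketch *header;
  struct bb_sketch *latch;
};

/* Mirrors the quoted logic: deleting the header or the latch marks the
   loop for removal by clearing both pointers.  */
static void
note_bb_deleted (struct loop_sketch *loop, struct bb_sketch *bb)
{
  if (loop->latch == bb || loop->header == bb)
    {
      loop->header = NULL;
      loop->latch = NULL;
    }
}

/* Hypothetical guard to call before dereferencing loop->header.  */
static int
loop_marked_for_removal_p (const struct loop_sketch *loop)
{
  return loop->header == NULL;
}
```

In the dump above bb_9 is the latch of loop_1, so under this model deleting bb 9 clears loop_1's header even though bb 3 still claims to belong to loop 1.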

Honza

[Bug libstdc++/51673] undefined references / libstdc++-7.dll

2011-12-28 Thread pluto at agmk dot net

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51673

--- Comment #7 from Pawel Sikora pluto at agmk dot net 2011-12-28 19:51:55 
UTC ---
please apply following obvious patch:

--- gcc-4.6.0/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver.orig 2011-12-28 12:43:50.0 +0100
+++ gcc-4.6.0/libstdc++-v3/config/abi/pre/gnu-versioned-namespace.ver 2011-12-28 20:25:36.603040153 +0100
@@ -42,9 +42,9 @@
 __once_proxy;

 # operator new(size_t)
-_Znw[jm];
+_Znw[jmy];
 # operator new(size_t, std::nothrow_t const&)
-_Znw[jm]RKSt9nothrow_t;
+_Znw[jmy]RKSt9nothrow_t;

 # operator delete(void*)
 _ZdlPv;
@@ -52,9 +52,9 @@
 _ZdlPvRKSt9nothrow_t;

 # operator new[](size_t)
-_Zna[jm];
+_Zna[jmy];
 # operator new[](size_t, std::nothrow_t const&)
-_Zna[jm]RKSt9nothrow_t;
+_Zna[jmy]RKSt9nothrow_t;

 # operator delete[](void*)
 _ZdaPv;


it fixes new/delete exports for x86_64-pc-mingw32.
mt-allocator needs more exports...
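
For reference, the extra letter comes from the Itanium C++ ABI builtin-type codes: j is unsigned int, m is unsigned long and y is unsigned long long, and size_t is unsigned long long on 64-bit Windows targets. The C11 sketch below shows which code a given target would use; the macro and function names are invented for illustration, and compilation fails if size_t is none of the three types.

```c
#include <stddef.h>

/* Itanium C++ ABI builtin-type codes: 'j' = unsigned int,
   'm' = unsigned long, 'y' = unsigned long long.  */
#define SIZE_T_MANGLING_CODE            \
  _Generic ((size_t) 0,                 \
            unsigned int: 'j',          \
            unsigned long: 'm',         \
            unsigned long long: 'y')

/* Writes the mangled name of operator new(size_t) for this target into
   BUF, which must hold at least 6 bytes.  */
static void
mangled_operator_new (char buf[6])
{
  buf[0] = '_';
  buf[1] = 'Z';
  buf[2] = 'n';
  buf[3] = 'w';
  buf[4] = SIZE_T_MANGLING_CODE;
  buf[5] = '\0';
}
```

A version-script pattern of _Znw[jm] therefore misses the _Znwy symbol produced where size_t is unsigned long long.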

[Bug c++/23211] using dec in nested class doesn't import name

2011-12-28 Thread fabien at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23211

--- Comment #15 from fabien at gcc dot gnu.org 2011-12-28 19:53:19 UTC ---
Author: fabien
Date: Wed Dec 28 19:53:14 2011
New Revision: 182711

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182711
Log:
gcc/testsuite/ChangeLog

2011-12-28  Fabien Chene  fab...@gcc.gnu.org

PR c++/23211
* g++.dg/template/using18.C: New.
* g++.dg/template/using19.C: New.
* g++.dg/template/nested3.C: Remove dg-message at instantiation.
* g++.dg/template/crash13.C: Likewise.

gcc/cp/ChangeLog

2011-12-28  Fabien Chene  fab...@gcc.gnu.org

PR c++/23211
* name-lookup.c (do_class_using_decl): Use dependent_scope_p
instead of dependent_type_p, to check that a non-dependent
nested-name-specifier of a class-scope using declaration refers to
a base, even if the current scope is dependent.
* parser.c (cp_parser_using_declaration): Set
USING_DECL_TYPENAME_P to 1 if the DECL is not null. Re-indent a
'else' close to the prior modification.


Added:
trunk/gcc/testsuite/g++.dg/template/using18.C
trunk/gcc/testsuite/g++.dg/template/using19.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/name-lookup.c
trunk/gcc/cp/parser.c
trunk/gcc/testsuite/ChangeLog
trunk/gcc/testsuite/g++.dg/template/crash13.C
trunk/gcc/testsuite/g++.dg/template/nested3.C

[Bug c++/23211] using dec in nested class doesn't import name

2011-12-28 Thread fabien at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=23211

fabien at gcc dot gnu.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #16 from fabien at gcc dot gnu.org 2011-12-28 20:04:25 UTC ---
Fixed.

[Bug c++/51680] g++ 4.7 fails to inline trivial template stuff

2011-12-28 Thread redi at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51680

--- Comment #8 from Jonathan Wakely redi at gcc dot gnu.org 2011-12-28 
20:09:32 UTC ---
(In reply to comment #6)
> Well, it's just an impression ... :]
>
> I think one reason is that unlike normal functions, template functions are
> implicitly sort of local (by necessity), in that they can have a definition
> in many compilation units without causing a link conflict.  To get this effect
> for normal functions, one must use the static or inline keywords -- so the
> impression (rightly or wrongly) is that template functions definitions are
> like one of those.


Inline functions and templates both have vague linkage, which is how they avoid
multiple definitions. That has nothing to do with inlining.

[Bug c++/51316] alignof doesn't work with arrays of unknown bound

2011-12-28 Thread paolo.carlini at oracle dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51316

Paolo Carlini paolo.carlini at oracle dot com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |ASSIGNED
   Last reconfirmed|                            |2011-12-28
         AssignedTo|unassigned at gcc dot       |paolo.carlini at oracle dot
                   |gnu.org                     |com
     Ever Confirmed|0                           |1

--- Comment #4 from Paolo Carlini paolo.carlini at oracle dot com 2011-12-28 
20:24:44 UTC ---
On it.

[Bug c/51696] New: [trans-mem] unsafe indirect function call in struct not properly displayed

2011-12-28 Thread patrick.marlier at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51696

             Bug #: 51696
           Summary: [trans-mem] unsafe indirect function call in struct
                    not properly displayed
    Classification: Unclassified
           Product: gcc
           Version: 4.7.0
            Status: UNCONFIRMED
          Severity: enhancement
          Priority: P3
         Component: c
        AssignedTo: unassig...@gcc.gnu.org
        ReportedBy: patrick.marl...@gmail.com
                CC: al...@gcc.gnu.org, torv...@gcc.gnu.org


Created attachment 26196
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26196
Attached testcase

With an unsafe indirect function call, the error message is not clear. I don't
know if it can display the declaration. In the worst case, a message like
"unsafe indirect function call within ‘transaction_safe’ function" should be ok.

$ ./gcc/xgcc -B./gcc/ -fgnu-tm -O0 testcase.i
testcase.i: In function ‘func’:
testcase.i:7:21: error: unsafe function call ‘Uf3c0’ within
‘transaction_safe’ function
testcase.i:8:12: error: unsafe function call ‘compare.1’ within
‘transaction_safe’ function

Patrick Marlier.

[Bug rtl-optimization/51623] PowerPC section type conflict

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51623

--- Comment #4 from Michael Meissner meissner at gcc dot gnu.org 2011-12-28 
20:53:33 UTC ---
Author: meissner
Date: Wed Dec 28 20:53:30 2011
New Revision: 182712

URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=182712
Log:
Backport PR 51623 change

Added:
branches/gcc-4_6-branch/gcc/testsuite/gcc.target/powerpc/pr51623.c
  - copied unchanged from r182710,
trunk/gcc/testsuite/gcc.target/powerpc/pr51623.c
Modified:
branches/gcc-4_6-branch/gcc/ChangeLog
branches/gcc-4_6-branch/gcc/config/rs6000/rs6000.c
branches/gcc-4_6-branch/gcc/testsuite/ChangeLog

[Bug libstdc++/51673] undefined references / libstdc++-7.dll

2011-12-28 Thread ktietz at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51673

--- Comment #8 from Kai Tietz ktietz at gcc dot gnu.org 2011-12-28 21:24:25 
UTC ---
(In reply to comment #7)

Thanks. Yes, I confirm that the patch fixes the reported new/delete issue.  From
my side this patch is ok.  If the C++ maintainer OKs it too, I will apply it.

Kai

[Bug c++/51316] alignof doesn't work with arrays of unknown bound

2011-12-28 Thread tsoae at mail dot ru

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51316

--- Comment #5 from Nikolka tsoae at mail dot ru 2011-12-28 22:06:18 UTC ---
> On it.

There is an active core issue about alignof:

http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2011/n3309.html#1305

Probably, you should take into account the proposed resolution.

[Bug target/51244] SH Target: Inefficient conditional branch

2011-12-28 Thread kkojima at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51244

--- Comment #7 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-28 
22:25:48 UTC ---
(In reply to comment #3)
> I haven't run all tests on it yet, but CSiBE shows average code size reduction
> of approx. -0.1% for -m4* with some code size increases in some files.
> Would something like that be OK for stage 3?

Looks good, though not appropriate for stage 3, I think.

[Bug testsuite/50988] gcc.target/powerpc/*: Several tests fail incorrectly on powerpc-linux-gnuspe

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50988

--- Comment #2 from Michael Meissner meissner at gcc dot gnu.org 2011-12-28 
22:30:22 UTC ---
Created attachment 26197
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=26197
Proposed patch

Please check this patch on the spe compiler.

[Bug c++/51316] alignof doesn't work with arrays of unknown bound

2011-12-28 Thread paolo.carlini at oracle dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51316

--- Comment #6 from Paolo Carlini paolo.carlini at oracle dot com 2011-12-28 
22:31:02 UTC ---
Yeah, just allow the types at issue, that was clarified in core/930 actually.

[Bug target/51340] SH Target: Make -mfused-madd enabled by default

2011-12-28 Thread kkojima at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51340

--- Comment #3 from Kazumoto Kojima kkojima at gcc dot gnu.org 2011-12-28 
22:31:27 UTC ---
(In reply to comment #2)
> Uhm, yes...
> The title should have been "Enable -mfused-madd by -ffast-math"

Do you mean something like this?

--- ORIG/trunk/gcc/config/sh/sh.c2011-12-03 10:03:41.0 +0900
+++ trunk/gcc/config/sh/sh.c2011-12-27 08:33:23.0 +0900
@@ -838,6 +838,11 @@ sh_option_override (void)
 align_functions = min_align;
 }

+  /* Default to use fmac insn when -ffast-math.  See PR target/29100.  */
+  if (global_options_set.x_TARGET_FMAC == 0
+      && fast_math_flags_set_p (&global_options))
+    TARGET_FMAC = 1;
+
   if (sh_fixed_range_str)
     sh_fix_range (sh_fixed_range_str);

 I don't know the exact semantics for the new patterns.  All I know is that
 rounding is supposed to be done only once after the two operations.  This is
 the case for the SH fmac insn.  Not sure whether this is enough though.

It seems that we can use the fma pattern, though that would be another issue.

[Bug testsuite/50988] gcc.target/powerpc/*: Several tests fail incorrectly on powerpc-linux-gnuspe

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50988

Michael Meissner meissner at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2011-12-28
 CC||meissner at gcc dot gnu.org
 AssignedTo|unassigned at gcc dot   |meissner at gcc dot gnu.org
   |gnu.org |
 Ever Confirmed|0   |1

[Bug testsuite/50988] gcc.target/powerpc/*: Several tests fail incorrectly on powerpc-linux-gnuspe

2011-12-28 Thread meissner at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50988

Michael Meissner meissner at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|WAITING

--- Comment #3 from Michael Meissner meissner at gcc dot gnu.org 2011-12-28 
22:32:41 UTC ---
Kyle, could you check this patch on your SPE compiler before I check it in?

[Bug fortran/51502] [4.6/4.7 Regression] Potentially wrong code generation due to wrong implict_pure check

2011-12-28 Thread tkoenig at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51502

Thomas Koenig tkoenig at gcc dot gnu.org changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
 AssignedTo|unassigned at gcc dot   |tkoenig at gcc dot gnu.org
   |gnu.org |

[Bug middle-end/42668] internal compiler error: in expand_expr_real_1, at expr.c:9314

2011-12-28 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=42668

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
  Component|c   |middle-end
 Resolution||FIXED
   Target Milestone|--- |4.4.3
   Severity|major   |normal

--- Comment #2 from Andrew Pinski pinskia at gcc dot gnu.org 2011-12-28 
22:52:12 UTC ---
Has been fixed for a while now.

[Bug target/51340] SH Target: Make -mfused-madd enabled by default

2011-12-28 Thread oleg.e...@t-online.de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51340

--- Comment #4 from Oleg Endo oleg.e...@t-online.de 2011-12-29 00:02:40 UTC 
---
(In reply to comment #3)
 (In reply to comment #2)
  Uhm, yes...
  The title should have been Enable -mfused-madd by -ffast-math
 
 Do you mean something like this?
 
 --- ORIG/trunk/gcc/config/sh/sh.c2011-12-03 10:03:41.0 +0900
 +++ trunk/gcc/config/sh/sh.c2011-12-27 08:33:23.0 +0900
 @@ -838,6 +838,11 @@ sh_option_override (void)
  align_functions = min_align;
  }
 
 +  /* Default to use fmac insn when -ffast-math.  See PR target/29100.  */
 +  if (global_options_set.x_TARGET_FMAC == 0
 +      && fast_math_flags_set_p (&global_options))
 +    TARGET_FMAC = 1;
 +
    if (sh_fixed_range_str)
      sh_fix_range (sh_fixed_range_str);
 

Yes, something like that.  Or maybe check flag_unsafe_math_optimizations, as it
is done for FSCA and FSRRA insns in sh.md.
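
To make the intent concrete, here is a minimal sketch of the "enable under fast-math unless the user said otherwise" logic.  The struct and its field names are hypothetical stand-ins, not GCC's option machinery; which fast-math flag is actually checked is still open, as noted above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the relevant option state; these field
   names are illustrative and are not GCC's.  */
struct fmac_opts_sketch
{
  bool fmac_set_by_user;   /* -mfused-madd or -mno-fused-madd was given */
  bool fmac;               /* current value of the option */
  bool fast_math;          /* -ffast-math, or whichever sub-flag is chosen */
};

/* Turn fmac on under fast-math unless the user chose explicitly.  */
static void
override_fmac_sketch (struct fmac_opts_sketch *o)
{
  if (!o->fmac_set_by_user && o->fast_math)
    o->fmac = true;
}
```

The point of tracking "set by user" separately is that an explicit -mno-fused-madd keeps winning even when -ffast-math is also given.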

  I don't know the exact semantics for the new patterns.  All I know is that
  rounding is supposed to be done only once after the two operations.  This is
  the case for the SH fmac insn.  Not sure whether this is enough though.
 
  It seems that we can use the fma pattern, though that would be another issue.

Maybe when trunk is back to stage 1.

[Bug target/51697] New: SH Target: Inefficient DImode comparisons for -Os

2011-12-28 Thread oleg.e...@t-online.de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51697

 Bug #: 51697
   Summary: SH Target: Inefficient DImode comparisons for -Os
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: target
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: oleg.e...@t-online.de
CC: kkoj...@gcc.gnu.org
Target: sh*-*-*


For -Os and everything but -m1, DImode comparisons are not optimized properly,
which results in redundant SImode comparisons and produces code worse than for
-O1.
A reduced example:

int test_0 (long long* x)
{
  return *x  0x ? -20 : -40;
}

-Os -m2/-m3/-m4:
mov     #0,r2           ! 55  movsi_ie/3    [length = 2]
tst     r2,r2           ! 57  cmpeqsi_t/1   [length = 2]
bf/s    .L12            ! 58  branch_false  [length = 2]
mov.l   @(4,r4),r3      ! 12  movsi_ie/7    [length = 2]
tst     r3,r3           ! 59  cmpeqsi_t/1   [length = 2]
.L12:
bt/s    .L11            ! 14  branch_true   [length = 2]
mov     #-40,r0         ! 5   movsi_ie/3    [length = 2]
mov     #-20,r0         ! 4   movsi_ie/3    [length = 2]
.L11:
rts
nop                     ! 65  *return_i     [length = 4]


-Os -m1:
-O2 -m4:
mov.l   @(4,r4),r1      ! 10  movsi_i/5     [length = 2]
mov     #-40,r0         ! 5   movsi_i/3     [length = 2]
tst     r1,r1           ! 15  cmpeqsi_t/1   [length = 2]
bt      .L7             ! 16  branch_true   [length = 2]
mov     #-20,r0         ! 4   movsi_i/3     [length = 2]
.L7:
rts
nop                     ! 61  *return_i     [length = 4]


-O1 -m4:
mov.l   @(4,r4),r1      ! 10  movsi_ie/7    [length = 2]
tst     r1,r1           ! 17  cmpeqsi_t/1   [length = 2]
bt/s    .L6             ! 18  branch_true   [length = 2]
mov     #-40,r0         ! 5   movsi_ie/3    [length = 2]
mov     #-20,r0         ! 4   movsi_ie/3    [length = 2]
.L6:
rts
nop                     ! 62  *return_i     [length = 4]


Another example would be:

int test_2 (unsigned long long x)
{
  return x = 0x1LL ? -20 : -40;
}

-Os -m2/-m3/-m4:
mov     #0,r2           ! 48  movsi_ie/3    [length = 2]
mov     #-1,r3          ! 49  movsi_ie/3    [length = 2]
cmp/eq  r2,r4           ! 9   cmpgtudi_t    [length = 8]
bf/s    .Ldi67
cmp/hi  r2,r4
cmp/hi  r3,r5
.Ldi67:
bf/s    .L16            ! 10  branch_false  [length = 2]
mov     #-40,r0         ! 5   movsi_ie/3    [length = 2]
mov     #-20,r0         ! 4   movsi_ie/3    [length = 2]
.L16:
rts
nop                     ! 52  *return_i     [length = 4]


-Os -m1:
tst     r4,r4           ! 9   cmpeqsi_t/1   [length = 2]
mov     #-20,r0         ! 4   movsi_i/3     [length = 2]
bf      .L12            ! 10  branch_false  [length = 2]
mov     #-40,r0         ! 5   movsi_i/3     [length = 2]
.L12:
rts
nop                     ! 56  *return_i     [length = 4]



The problem does not appear for -m1, only for -Os and -m2*, -m3*, -m4*.
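
For reference, the cmp/eq + cmp/hi sequence in the cmpgtudi_t listing follows the usual word-wise scheme for a double-word unsigned comparison.  The sketch below shows that scheme in plain C; the function name is made up for illustration and nothing here is taken from the SH backend.

```c
#include <stdbool.h>
#include <stdint.h>

/* Unsigned 64-bit "a > b" built from 32-bit word comparisons only.
   The function name is illustrative.  */
static bool
gtu64_sketch (uint64_t a, uint64_t b)
{
  uint32_t a_hi = (uint32_t) (a >> 32), a_lo = (uint32_t) a;
  uint32_t b_hi = (uint32_t) (b >> 32), b_lo = (uint32_t) b;

  if (a_hi != b_hi)        /* high words differ: they decide */
    return a_hi > b_hi;
  return a_lo > b_lo;      /* otherwise the low words decide */
}
```

When one word of the constant operand is known at compile time, one of the two word comparisons can usually be dropped, which appears to be what the shorter -m1 sequence does.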

[Bug target/49263] SH Target: underutilized TST #imm, R0 instruction

2011-12-28 Thread oleg.e...@t-online.de

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=49263

--- Comment #15 from Oleg Endo oleg.e...@t-online.de 2011-12-29 00:34:53 UTC 
---
(In reply to comment #14)
 With trunk rev 181517 I have observed the following problem, which happens 
 when
 compiling for -m2*, -m3*, -m4* and -Os:
 

This is still present as of rev 182713 and seems to be a different issue.
I've created PR51697 for it.

[Bug lto/51698] New: [trans-mem] TM runtime and application with LTO

2011-12-28 Thread patrick.marlier at gmail dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51698

 Bug #: 51698
   Summary: [trans-mem] TM runtime and application with LTO
Classification: Unclassified
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: patrick.marl...@gmail.com
CC: al...@gcc.gnu.org, r...@gcc.gnu.org,
torv...@gcc.gnu.org


Created attachment 26198
  -- http://gcc.gnu.org/bugzilla/attachment.cgi?id=26198
testcase app-itm with lto

In my attempt to get the _ITM_R/W* calls inlined into the application code, it
seems that the TM builtins and TM definitions don't work as expected with LTO.

$ gcc -flto -fgnu-tm -Wall -o bin appitm.c
`_ITM_beginTransaction' referenced in section `.text' of
/tmp/cc7uGSe1.ltrans0.ltrans.o: defined in discarded section `.text' of
/tmp/ccJk2crp.o (symbol from plugin)
`_ITM_RU4' referenced in section `.text' of /tmp/cc7uGSe1.ltrans0.ltrans.o:
defined in discarded section `.text' of /tmp/ccJk2crp.o (symbol from  
plugin)
`_ITM_commitTransaction' referenced in section `.text' of
/tmp/cc7uGSe1.ltrans0.ltrans.o: defined in discarded section `.text' of
/tmp/ccJk2crp.o (symbol from plugin)
collect2: error: ld returned 1 exit status

I have merged all .c in the same source for the testcase but it has the same
problem if TM runtime is in a library.

Patrick Marlier.

[Bug libstdc++/51699] New: Clang refuses to compile ext/rope citing scope resolution issues

2011-12-28 Thread fedorabugmail at yahoo dot com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51699

 Bug #: 51699
   Summary: Clang refuses to compile ext/rope citing scope
resolution issues
Classification: Unclassified
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: minor
  Priority: P3
 Component: libstdc++
AssignedTo: unassig...@gcc.gnu.org
ReportedBy: fedorabugm...@yahoo.com


When using clang to compile an existing program, clang refuses to compile the
ext/rope header files. One of the errors given is below.

ropeimpl.h:433:2: error: use of undeclared identifier '_Data_allocate'
_Data_allocate(_S_rounded_up_size(__old_len + __len));

g++ will compile this okay but the clang authors claim this code is invalid,
http://llvm.org/bugs/show_bug.cgi?id=6454. Below are the 7 changes to the two
files that allowed a successful compile. Line numbers may not be exact.

In ropeimpl.h

383c381
< this->_L_deallocate(__l, 1);
---
> _L_deallocate(__l, 1);
392c390
< this->_C_deallocate(__c, 1);
---
> _C_deallocate(__c, 1);
400c398
< this->_F_deallocate(__f, 1);
---
> _F_deallocate(__f, 1);
409c407
< this->_S_deallocate(__ss, 1);
---
> _S_deallocate(__ss, 1);
433c431
< _Rope_base<_CharT, _Alloc>::_Data_allocate(_S_rounded_up_size(__old_len + __len));
---
> _Data_allocate(_S_rounded_up_size(__old_len + __len));
514c512
<   _Rope_base<_CharT, _Alloc>::_C_deallocate(__result,1);
---
>   _C_deallocate(__result,1);
817c815
<   _Rope_base<_CharT, _Alloc>::_Data_allocate(_S_rounded_up_size(__result_len));
---
>   _Data_allocate(_S_rounded_up_size(__result_len));


In rope

732c730
< this->_S_free_string(_M_data, this->_M_size, this->_M_get_allocator());
---
> __STL_FREE_STRING(_M_data, this->_M_size, this->_M_get_allocator());

[Bug target/51565] [4.4/4.5/4.6/4.7 Regression] fastcall in array of method pointers: internal compiler error

2011-12-28 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51565

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2011-12-29
  Component|c++ |target
  Known to work||4.3.5
   Target Milestone|--- |4.4.7
Summary|fastcall in array of method |[4.4/4.5/4.6/4.7
   |pointers: internal compiler |Regression] fastcall in
   |error   |array of method pointers:
   ||internal compiler error
 Ever Confirmed|0   |1
  Known to fail||4.4.5, 4.7.0

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org 2011-12-29 
06:00:53 UTC ---
Confirmed.

[Bug fortran/51569] documentation on sign intrinsic

2011-12-28 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51569

--- Comment #2 from Andrew Pinski pinskia at gcc dot gnu.org 2011-12-29 
06:02:53 UTC ---
-0.0 does not exist in Fortran except when using the IEEE module IIRC.

[Bug c++/51613] [4.4/4.5/4.6/4.7 Regression] Ambiguous function template instantiations as template argument are not rejected

2011-12-28 Thread pinskia at gcc dot gnu.org

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51613

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
  Known to work||4.3.5
   Keywords||accepts-invalid
   Last reconfirmed||2011-12-29
 Ever Confirmed|0   |1
Summary|Ambiguous function template |[4.4/4.5/4.6/4.7
   |instantiations as template  |Regression] Ambiguous
   |argument are not rejected   |function template
   ||instantiations as template
   ||argument are not rejected
   Target Milestone|--- |4.4.7
  Known to fail||4.4.5, 4.7.0

--- Comment #1 from Andrew Pinski pinskia at gcc dot gnu.org 2011-12-29 
06:07:16 UTC ---
Confirmed.

[Bug testsuite/51693] New XPASSes in vectorizer testsuite on powerpc64-suse-linux

2011-12-28 Thread irar at il dot ibm.com

http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51693

--- Comment #7 from Ira Rosen irar at il dot ibm.com 2011-12-29 07:37:53 UTC 
---
(In reply to comment #6)

  Yes, vector_sizes_32B_16B seems to be ok in that case.
 Other two tests (vect-multitypes-1.c and no-section-anchors-vect-69.c) look
 like having the same problem - are you ok for similar fix for them too, i.e. 
 is
 patch
 http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01600/vec-tests-avx2_fixes-7.patch
 ok for trunk?

Yes, just please don't forget to update testsuite/ChangeLog.

Thanks,
Ira

 
 Thanks, Michael

[patch] Fix PR tree-optimization/51684

2011-12-28 Thread Ira Rosen


Hi,

This patch fixes an attempt to access gsi of pattern statement.
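
As a rough illustration of the idea (not the vectorizer's real data structures): when a statement is a pattern statement, the position lookup has to go through the original statement it is related to.  The struct and function names below are hypothetical.

```c
#include <stddef.h>

/* Hypothetical, much simplified statement record; GCC's real
   stmt_vec_info carries far more state.  */
struct stmt_sketch
{
  int is_pattern;                /* nonzero for a pattern statement */
  struct stmt_sketch *related;   /* for a pattern stmt: the original stmt */
};

/* Return the statement whose position may be queried: the original
   statement for a pattern, the statement itself otherwise.  */
static struct stmt_sketch *
stmt_for_position_sketch (struct stmt_sketch *s)
{
  if (s != NULL && s->is_pattern)
    return s->related;
  return s;
}
```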

Bootstrapped and tested on ia64-unknown-linux-gnu by Uros and on
powerpc64-suse-linux by me.

Committed.

Ira

ChangeLog:

PR tree-optimization/51684
* tree-vect-slp.c (vect_schedule_slp_instance): Get gsi of original
statement in case of a pattern.
(vect_schedule_slp): Likewise.

Index: gcc/tree-vect-slp.c
===
--- gcc/tree-vect-slp.c (revision 182703)
+++ gcc/tree-vect-slp.c (working copy)
@@ -2885,6 +2885,8 @@ vect_schedule_slp_instance (slp_tree node, slp_ins
       && REFERENCE_CLASS_P (gimple_get_lhs (stmt)))
 {
   gimple last_store = vect_find_last_store_in_slp_instance (instance);
+  if (is_pattern_stmt_p (vinfo_for_stmt (last_store)))
+   last_store = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (last_store));
   si = gsi_for_stmt (last_store);
 }

@@ -2989,6 +2991,8 @@ vect_schedule_slp (loop_vec_info loop_vinfo, bb_ve
   if (!STMT_VINFO_DATA_REF (vinfo_for_stmt (store)))
 break;

+ if (is_pattern_stmt_p (vinfo_for_stmt (store)))
+   store = STMT_VINFO_RELATED_STMT (vinfo_for_stmt (store));
   /* Free the attached stmt_vec_info and remove the stmt.  */
   gsi = gsi_for_stmt (store);
   gsi_remove (gsi, true);

[PATCH, testsuite]: Use dg-add-options ieee in gcc.dg/torture/pr50396.c

2011-12-28 Thread Uros Bizjak

Hello!

Some targets (e.g. alpha) need -mieee to handle NaNs.

2011-12-28  Uros Bizjak  ubiz...@gmail.com

* gcc.dg/torture/pr50396.c: Use dg-add-options ieee.

Tested on alphaev68-pc-linux-gnu, committed to mainline SVN and 4.6.

Uros.

Index: gcc.dg/torture/pr50396.c
===
--- gcc.dg/torture/pr50396.c(revision 182694)
+++ gcc.dg/torture/pr50396.c(working copy)
@@ -1,4 +1,5 @@
 /* { dg-do run } */
+/* { dg-add-options ieee } */

 extern void abort (void);
 typedef float vf128 __attribute__((vector_size(16)));

Ping: backport fix for PR 48660 (assigning to BLKmode return regs)

2011-12-28 Thread Richard Sandiford

Ping for backporting this expand patch, which fixes an ice-on-valid
regression from 4.4 while compiling certain C++ packages on ARM:

   http://gcc.gnu.org/ml/gcc-patches/2011-12/msg01550.html

As I understand it, this bug is the only reason Ubuntu is keeping
a GCC 4.4 package: there's no known workaround besides changing the
source, so affected packages have to be compiled with an older compiler.
I think this is going to be one of those patches that distros who care
about ARM will end up having to backport individually if we don't do it
in the FSF version.

Richard

[C++ testcase, commited] PR 51547

2011-12-28 Thread Paolo Carlini

[resend: the first time the message didn't get through because it was
miscategorized as spam]


Hi,

I'm adding the testcase to mainline and closing the PR.

Thanks,
Paolo.

/


2011-12-27  Paolo Carlini  paolo.carl...@oracle.com

PR c++/51547
* g++.dg/cpp0x/pr51547.C: New.

Index: g++.dg/cpp0x/pr51547.C
===
--- g++.dg/cpp0x/pr51547.C  (revision 0)
+++ g++.dg/cpp0x/pr51547.C  (revision 0)
@@ -0,0 +1,50 @@
+// PR c++/51547
+// { dg-options "-std=c++0x" }
+
+template <class T>
+struct vector
+{
+  T*
+  begin()
+  { return &member; }
+
+  const T*
+  begin() const
+  { return &member; }
+
+  T member;
+};
+
+struct Bar {
+  int x;
+};
+
+struct Foo {
+  const vector<Bar>& bar() const {
+    return bar_;
+  }
+
+  vector<Bar> bar_;
+};
+
+template <class X>
+struct Y {
+  void foo() {
+    Foo a;
+    auto b = a.bar().begin();
+    auto c = b->x;
+  }
+};
+
+template <class X>
+void foo() {
+  Foo a;
+  auto b = a.bar().begin();
+  auto c = b->x;
+}
+
+int main() {
+  Y<int> p;
+  p.foo();
+  foo<int>();
+}

Re: RFC: An alternative -fsched-pressure implementation

2011-12-28 Thread Richard Sandiford

Vladimir Makarov vmaka...@redhat.com writes:
 In the end I tried an ad-hoc approach in an attempt to do something
 about (2), (3) and (4b).  The idea was to construct a preliminary
 model schedule in which the only objective is to keep register
 pressure to a minimum.  This schedule ignores pipeline characteristics,
 latencies, and the number of available registers.  The maximum pressure
 seen in this initial model schedule (MP) is then the benchmark for ECC(X).

 I always had the impression that the code before the scheduler is close to
 minimal register pressure because of specific expression generation.
 Maybe I was wrong and some optimizations (global ones like PRE) change
 this a lot.

One of the examples I was looking at was:

-
#include <stdint.h>

#define COUNT 8

void
loop (uint8_t *__restrict dst, uint8_t *__restrict src,
      uint8_t *__restrict ff_cropTbl, int dstStride, int srcStride)
{
  const int w = COUNT;
  uint8_t *cm = ff_cropTbl + 1024;
  for(int i=0; i<w; i++)
    {
      const int srcB = src[-2*srcStride];
      const int srcA = src[-1*srcStride];
      const int src0 = src[0 *srcStride];
      const int src1 = src[1 *srcStride];
      const int src2 = src[2 *srcStride];
      const int src3 = src[3 *srcStride];
      const int src4 = src[4 *srcStride];
      const int src5 = src[5 *srcStride];
      const int src6 = src[6 *srcStride];
      const int src7 = src[7 *srcStride];
      const int src8 = src[8 *srcStride];
      const int src9 = src[9 *srcStride];
      const int src10 = src[10*srcStride];

      dst[0*dstStride] = cm[(((src0+src1)*20 - (srcA+src2)*5 + (srcB+src3)) + 16)>>5];
      dst[1*dstStride] = cm[(((src1+src2)*20 - (src0+src3)*5 + (srcA+src4)) + 16)>>5];
      dst[2*dstStride] = cm[(((src2+src3)*20 - (src1+src4)*5 + (src0+src5)) + 16)>>5];
      dst[3*dstStride] = cm[(((src3+src4)*20 - (src2+src5)*5 + (src1+src6)) + 16)>>5];
      dst[4*dstStride] = cm[(((src4+src5)*20 - (src3+src6)*5 + (src2+src7)) + 16)>>5];
      dst[5*dstStride] = cm[(((src5+src6)*20 - (src4+src7)*5 + (src3+src8)) + 16)>>5];
      dst[6*dstStride] = cm[(((src6+src7)*20 - (src5+src8)*5 + (src4+src9)) + 16)>>5];
      dst[7*dstStride] = cm[(((src7+src8)*20 - (src6+src9)*5 + (src5+src10)) + 16)>>5];
      dst++;
      src++;
    }
}
-

(based on the libav h264 code).  In this example the loads from src and
stores to dst are still in their original order by the time we reach sched1,
so src, dst, srcA, srcB, and src0..10 are all live at once.  There's no
aliasing reason why they can't be reordered, and we do that during
scheduling.

 During the main scheduling, an instruction X that occurs at or before
 the next point of maximum pressure in the model schedule is measured
 based on the current register pressure.  If X doesn't increase the
 current pressure beyond the current maximum, its ECC(X) is zero,
 otherwise ECC(X) is the cost of going from MP to the new maximum.
 The idea is that the final net pressure of scheduling a given set of
 instructions is going to be the same regardless of the order; we simply
 want to keep the intermediate pressure under control.  An ECC(X) of zero
 usually[*] means that scheduling X next won't send the rest of the
 sequence beyond the current maximum pressure.

[*] but not always.  There's more about this in the patch comments.

 If an instruction X occurs _after_ the next point of maximum pressure,
 X is measured based on that maximum pressure.  If the current maximum
 pressure is MP', and X increases pressure by dP, ECC(X) is the cost of
 going from MP to MP' + dP.
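
The two cases above can be put into a small sketch.  This is only an illustration of the description in this mail, not code from the patch: the function and parameter names are invented, and it returns the excess over MP in registers, leaving the conversion of that excess into an actual cost (the real ECC(X)) unspecified.

```c
#include <stdbool.h>

/* mp:        maximum pressure of the model schedule (MP)
   cur:       register pressure at the current scheduling point
   cur_max:   current maximum pressure (MP')
   dp:        pressure change caused by scheduling X (dP)
   after_max: true if X comes after the next point of maximum
              pressure in the model schedule.
   Returns the number of registers by which the new maximum exceeds MP;
   how that excess is priced is deliberately left out.  */
static int
excess_pressure_sketch (int mp, int cur, int cur_max, int dp, bool after_max)
{
  int new_max;

  if (after_max)
    new_max = cur_max + dp;   /* measured against the maximum */
  else
    {
      new_max = cur + dp;     /* measured against current pressure */
      if (new_max <= cur_max)
        return 0;             /* does not raise the current maximum */
    }
  return new_max > mp ? new_max - mp : 0;
}
```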

 Of course, this all depends on how good a value MP is, and therefore
 on how good the model schedule is.  I tried a few variations before
 settling on the one in the patch (which I hope makes conceptual sense).

 I initially stayed with the idea above about assigning different costs to
 (Ra), (Rb) and (Rc).  This produces some good results, but was still a
 little too conservative in general, in that other tests were still worse
 with -fsched-pressure than without.  I described some of the problems
 with these costs above.  Another is that if an instruction X has a spill
 cost of 6, say, then:

ECC(X) + delay(X)

 will only allow X to be scheduled if the next instruction without
 a spill cost has a delay of 6 cycles or more.  This is overly harsh,
 especially seeing as few ARM instructions have such a high latency.
 The benefit of spilling is often to avoid a series of short
 (e.g. single-cycle) stalls, rather than to avoid a single long one.

 I then adjusted positive ECC(X) values based on the priority of X
 relative to the highest-priority zero-cost instruction.  This was
 better, but a DES filter in particular still suffered from the
 lots of short stalls problem.

 Then, as an experiment, I tried ignoring MEMORY_MOVE_COST altogether
 and simply treating

Ping**1.57 [Patch, fortran] Improve common function elimination

2011-12-28 Thread Thomas Koenig


http://gcc.gnu.org/ml/fortran/2011-12/msg00102.html


OK for trunk?


Regards

Thomas

Re: [PATCH] PowerPC section type conflict (created PR 51623)

2011-12-28 Thread Michael Meissner

On Mon, Dec 19, 2011 at 11:45:35PM +0800, Chung-Lin Tang wrote:
 On 2011/12/19 03:18 AM, Richard Henderson wrote:
  On 12/17/2011 10:36 PM, Chung-Lin Tang wrote:
  I don't think it's that kind of problem; the powerpc backend uses
  unlikely_text_section_p(), which compares the passed in argument section
  and the value of function_section_1(current_function_decl,true).
  
  I think this might be the real bug, or something related.
  
  Since current_function_decl is NULL at assembly phase, it retrieves
  .text.unlikely to test for equality. It's the retrieving/lookup that
  fails here, because the default looked-up section flags set when decl ==
  NULL does not really seem to make sense (adds SECTION_WRITE).
  
  current_function_decl is only null when we're not inside a function.
  
  One possible fix is to test for current_function_section inside
  unlikely_text_section_p.  However, I think that begs the question
  of what in the world is actually going on in rs6000_assemble_integer.
  Why are we testing for emitting data in text sections?
 
 I think I sort of mis-represented the context here; this was not really
 during the assembly phase of a function, but already in
 toplev.c:output_object_blocks().
 
 I've created a bugzilla PR for this, with a testcase from U-boot, and a
 minimal testcase: http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51623

This one line patch fixes the problem by using a different test than
unlikely_text_section_p, which assumes it is called within a function context.
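
In other words, the replacement test only looks at the flags of the section currently being emitted.  A minimal sketch of that predicate, with a made-up struct and flag value standing in for GCC's section type and SECTION_CODE:

```c
#include <stddef.h>

#define SKETCH_SECTION_CODE 0x1u   /* stand-in for GCC's SECTION_CODE */

/* Minimal stand-in for the flags GCC keeps on a section.  */
struct section_sketch
{
  unsigned int flags;
};

/* Nonzero when SEC is not known to be a code section.  A null section
   (nothing selected yet) counts as "not code", which matches the shape
   of the condition in the patch.  */
static int
not_in_code_section_sketch (const struct section_sketch *sec)
{
  return (sec != NULL && (sec->flags & SKETCH_SECTION_CODE) != 0) == 0;
}
```

Unlike unlikely_text_section_p, nothing here depends on current_function_decl, so the test is usable outside a function context.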

I bootstrapped it, and there were no regressions.  I have added the test case
from the PR so it doesn't come back.  Is it ok to apply?  It is also a bug in
GCC 4.6, and I will backport the patch to that branch as well.

FWIW, I wrote -mrelocatable around 1990 or so for a specific Cygnus customer
that needed to have pseudo shared libraries in embedded code, as long as they
were willing to live with various restrictions.  At the time, the Linux shared
library code was non-existent, and this was a quick hack.  In the nature of all
quick hacks, eventually things change in the machine independent code layer,
and it has to be revisited.  In hindsight, it would have been better if the
Linux shared library code was operational, and that there was a non-GPL dynamic
linker written to handle the relocations, rather than having this quick hack.

The check for unlikely text was added in 2004 by Caroline Tice of Apple, and it
is curious that they didn't add a check for it being in a hot function as well
as a cold function.  I also suspect the check would not work as well if
-ffunction-sections was used.  The point of the check is to avoid adding to the
fixup table pointers that are stored in the read-only text section: fixing
those up would cause a segfault at runtime, although skipping them leaves a
pointer that is not fixed up when the program starts.  It was modified in 2005
by Richard Sandiford in a global change in how sections are dealt with, and
modified by Alan Modra in 2006.

[gcc]
2011-12-27  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/51623
* config/rs6000/rs6000.c (rs6000_assemble_integer): Don't call
unlikely_text_section_p.  Instead check for being in a code
section.

[gcc/testsuite]
2011-12-27  Michael Meissner  meiss...@linux.vnet.ibm.com

PR target/51623
* gcc.target/powerpc/pr51623.c: New file.


-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 182694)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -15461,7 +15461,7 @@ rs6000_assemble_integer (rtx x, unsigned
   if (TARGET_RELOCATABLE
       && in_section != toc_section
       && in_section != text_section
-      && !unlikely_text_section_p (in_section)
+      && (in_section && (in_section->common.flags & SECTION_CODE)) == 0
       && !recurse
       && GET_CODE (x) != CONST_INT
       && GET_CODE (x) != CONST_DOUBLE
Index: gcc/testsuite/gcc.target/powerpc/pr51623.c
===
--- gcc/testsuite/gcc.target/powerpc/pr51623.c  (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/pr51623.c  (revision 0)
@@ -0,0 +1,123 @@
+/* PR target/51623 */
+/* { dg-do compile { target { { powerpc*-*-linux* && ilp32 } || { powerpc-*-eabi* } } } } */
+/* { dg-options "-mrelocatable -ffreestanding" } */
+
+/* This generated an error, since the compiler was calling
+   unlikely_text_section_p in a context where it wasn't valid.  */
+
+typedef long long loff_t;
+typedef unsigned size_t;
+
+
+struct mtd_info {
+  unsigned writesize;
+  unsigned oobsize;
+  const char *name;
+};
+
+extern int strcmp(const char *,const char *);
+extern char * strchr(const char *,int);
+
+struct cmd_tbl_s {
+  char *name;
+};
+
+
+int printf(const char *fmt, ...)

PR rtl-optimization/51069 (verify_loop_info failed)

2011-12-28 Thread Jan Hubicka

Hi,
in this testcase, peeling a loop containing an irreducible region increases
the size of the region (by removing the conditional path into it).
remove_path is not quite ready for this scenario.  Still, it would be nice to
avoid creating irreducible regions where there are none.

Bootstrapped/regtested x86_64-linux, OK?

int a, b, c, d, e, f, bar (void);

void
foo (int x)
{
  for (;;)
    {
      if (!x)
        {
          for (d = 6; d >= 0; d--)
            {
              while (!b)
                ;
              if (e)
                return foo (x);
              if (f)
                {
                  a = 0;
                  continue;
                }
              for (; c; c--)
                ;
            }
        }
      if (bar ())
        break;
      e = 0;
      if (x)
        for (;;)
          ;
    }
}
PR rtl-optimization/51069
* cfgloopmanip.c (remove_path): Removing path making irreducible
region unconditional makes BB part of the region.

Index: cfgloopmanip.c
===
*** cfgloopmanip.c  (revision 182708)
--- cfgloopmanip.c  (working copy)
*************** remove_path (edge e)
*** 290,295 ****
--- 290,296 ----
    int i, nrem, n_bord_bbs;
    sbitmap seen;
    bool irred_invalidated = false;
+   edge_iterator ei;
  
    if (!can_remove_branch_p (e))
      return false;
*************** remove_path (edge e)
*** 329,337 ****
    /* Find border hexes -- i.e. those with predecessor in removed path.  */
    for (i = 0; i < nrem; i++)
      SET_BIT (seen, rem_bbs[i]->index);
    for (i = 0; i < nrem; i++)
      {
-       edge_iterator ei;
        bb = rem_bbs[i];
        FOR_EACH_EDGE (ae, ei, rem_bbs[i]->succs)
          if (ae->dest != EXIT_BLOCK_PTR && !TEST_BIT (seen, ae->dest->index))
--- 330,341 ----
    /* Find border hexes -- i.e. those with predecessor in removed path.  */
    for (i = 0; i < nrem; i++)
      SET_BIT (seen, rem_bbs[i]->index);
+   FOR_EACH_EDGE (ae, ei, e->src->succs)
+     if (ae != e && ae->dest != EXIT_BLOCK_PTR && !TEST_BIT (seen, ae->dest->index)
+         && ae->flags & EDGE_IRREDUCIBLE_LOOP)
+       irred_invalidated = true;
    for (i = 0; i < nrem; i++)
      {
        bb = rem_bbs[i];
        FOR_EACH_EDGE (ae, ei, rem_bbs[i]->succs)
          if (ae->dest != EXIT_BLOCK_PTR && !TEST_BIT (seen, ae->dest->index))

Re: PR rtl-optimization/51069 (verify_loop_info failed)

2011-12-28 Thread Jakub Jelinek

On Wed, Dec 28, 2011 at 07:31:57PM +0100, Jan Hubicka wrote:
 *** cfgloopmanip.c  (revision 182708)
 --- cfgloopmanip.c  (working copy)
 *************** remove_path (edge e)
 *** 290,295 ****
 --- 290,296 ----
     int i, nrem, n_bord_bbs;
     sbitmap seen;
     bool irred_invalidated = false;
 +   edge_iterator ei;
   
     if (!can_remove_branch_p (e))
       return false;
 *************** remove_path (edge e)
 *** 329,337 ****
     /* Find border hexes -- i.e. those with predecessor in removed path.  */
     for (i = 0; i < nrem; i++)
       SET_BIT (seen, rem_bbs[i]->index);
     for (i = 0; i < nrem; i++)
       {
 -       edge_iterator ei;
         bb = rem_bbs[i];
         FOR_EACH_EDGE (ae, ei, rem_bbs[i]->succs)
           if (ae->dest != EXIT_BLOCK_PTR && !TEST_BIT (seen, ae->dest->index))
 --- 330,341 ----
     /* Find border hexes -- i.e. those with predecessor in removed path.  */
     for (i = 0; i < nrem; i++)
       SET_BIT (seen, rem_bbs[i]->index);
 +   FOR_EACH_EDGE (ae, ei, e->src->succs)
 +     if (ae != e && ae->dest != EXIT_BLOCK_PTR && !TEST_BIT (seen, ae->dest->index)
 +         && ae->flags & EDGE_IRREDUCIBLE_LOOP)
 +       irred_invalidated = true;

Just a nit, can't you break out of the loop when irred_invalidated is set to
true as well?  There is no need to look through any further edges.  I.e.
perhaps:
  if (!irred_invalidated)
    FOR_EACH_EDGE (ae, ei, e->src->succs)
      if (ae != e
          && ae->dest != EXIT_BLOCK_PTR
          && (ae->flags & EDGE_IRREDUCIBLE_LOOP)
          && !TEST_BIT (seen, ae->dest->index))
        {
          irred_invalidated = true;
          break;
        }
Thanks for looking into this, I'll defer the review to somebody familiar
with cfgloopmanip.c though.

Jakub

[PATCH] Don't optimize away non-pure/const calls during ccp (PR tree-optimization/51683)

2011-12-28 Thread Jakub Jelinek

Hi!

For some calls (like memcpy and other builtins that are known to pass
through the first argument) we know the value of the lhs, but still
we shouldn't be replacing the call with just a mere assignment of that
known value to the LHS SSA_NAME, because the call has other side-effects.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
ok for trunk?
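
To spell out the condition used in both hunks below: a call may only be replaced by its known value when it is const or pure and is not marked as possibly looping.  A standalone sketch, with illustrative bit values rather than GCC's actual ECF_* definitions:

```c
#include <stdbool.h>

/* Illustrative bit values only; GCC defines its own ECF_* constants.  */
#define SK_ECF_CONST                 (1 << 0)
#define SK_ECF_PURE                  (1 << 1)
#define SK_ECF_LOOPING_CONST_OR_PURE (1 << 2)

/* True if a call with these flags has no side effects that would be
   lost by replacing it with a plain assignment of its known value.  */
static bool
call_replaceable_sketch (int flags)
{
  if ((flags & (SK_ECF_CONST | SK_ECF_PURE)) == 0)
    return false;   /* neither const nor pure: may write memory */
  if (flags & SK_ECF_LOOPING_CONST_OR_PURE)
    return false;   /* may not terminate, so it must stay */
  return true;
}
```

memcpy falls in the first category: its return value is known, but the copy itself is a side effect.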

2011-12-28  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/51683
* tree-ssa-propagate.c (substitute_and_fold): Don't optimize away
calls with side-effects.
* tree-ssa-ccp.c (ccp_fold_stmt): Likewise.

* gcc.dg/pr51683.c: New test.

--- gcc/tree-ssa-propagate.c.jj 2011-11-11 20:54:59.0 +0100
+++ gcc/tree-ssa-propagate.c2011-12-27 12:23:41.334187258 +0100
@@ -1056,6 +1056,12 @@ substitute_and_fold (ssa_prop_get_value_
                }
              else if (is_gimple_call (def_stmt))
                {
+                 int flags = gimple_call_flags (def_stmt);
+
+                 /* Don't optimize away calls that have side-effects.  */
+                 if ((flags & (ECF_CONST|ECF_PURE)) == 0
+                     || (flags & ECF_LOOPING_CONST_OR_PURE))
+                   continue;
                  if (update_call_from_tree (&gsi, val)
                      && maybe_clean_or_replace_eh_stmt (def_stmt, gsi_stmt (gsi)))
                    gimple_purge_dead_eh_edges (gimple_bb (gsi_stmt (gsi)));
--- gcc/tree-ssa-ccp.c.jj   2011-12-19 09:21:07.0 +0100
+++ gcc/tree-ssa-ccp.c  2011-12-27 12:29:48.620880857 +0100
@@ -1878,6 +1878,7 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi
 case GIMPLE_CALL:
   {
tree lhs = gimple_call_lhs (stmt);
+   int flags = gimple_call_flags (stmt);
tree val;
tree argt;
bool changed = false;
@@ -1888,7 +1889,10 @@ ccp_fold_stmt (gimple_stmt_iterator *gsi
   type issues.  */
        if (lhs
            && TREE_CODE (lhs) == SSA_NAME
-           && (val = get_constant_value (lhs)))
+           && (val = get_constant_value (lhs))
+           /* Don't optimize away calls that have side-effects.  */
+           && (flags & (ECF_CONST|ECF_PURE)) != 0
+           && (flags & ECF_LOOPING_CONST_OR_PURE) == 0)
  {
tree new_rhs = unshare_expr (val);
bool res;
--- gcc/testsuite/gcc.dg/pr51683.c.jj   2011-12-27 12:21:43.662925435 +0100
+++ gcc/testsuite/gcc.dg/pr51683.c  2011-12-27 12:21:23.0 +0100
@@ -0,0 +1,18 @@
+/* PR tree-optimization/51683 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-optimized" } */
+
+static inline void *
+bar (void *p, void *q, int r)
+{
+  return __builtin_memcpy (p, q, r);
+}
+
+void *
+foo (void *p)
+{
+  return bar ((void *) 0x12345000, p, 256);
+}
+
+/* { dg-final { scan-tree-dump "memcpy" "optimized" } } */
+/* { dg-final { cleanup-tree-dump "optimized" } } */

Jakub

[SH] Fix defunct -mbranch-cost option

2011-12-28 Thread Oleg Endo

Hello,

while working on another PR I've noticed that the -mbranch-cost option
in the SH target is not really working.  The attached patch brings it
back to life, leaving the default behavior unchanged.

Cheers,
Oleg


2011-12-28  Oleg Endo  oleg.e...@t-online.de

* config/sh/sh.h (BRANCH_COST): Use sh_branch_cost variable.
* config/sh/sh.c (sh_option_override): Simplify sh_branch_cost
expression.

Index: gcc/config/sh/sh.c
===
--- gcc/config/sh/sh.c	(revision 182695)
+++ gcc/config/sh/sh.c	(working copy)
@@ -724,9 +724,16 @@
   else
 sh_divsi3_libfunc = "__sdivsi3";
   if (sh_branch_cost == -1)
-sh_branch_cost
-  = TARGET_SH5 ? 1 : ! TARGET_SH2 || TARGET_HARD_SH4 ? 2 : 1;
+{
+  sh_branch_cost = 1;
 
+  /*  The SH1 does not have delay slots, hence we get a pipeline stall
+	  at every branch.  The SH4 is superscalar, so the single delay slot
+	  is not sufficient to keep both pipelines filled.  */
+  if (! TARGET_SH2 || TARGET_HARD_SH4)
+	sh_branch_cost = 2;
+}
+
   for (regno = 0; regno < FIRST_PSEUDO_REGISTER; regno++)
 if (! VALID_REGISTER_P (regno))
   sh_register_names[regno][0] = '\0';
Index: gcc/config/sh/sh.h
===
--- gcc/config/sh/sh.h	(revision 182695)
+++ gcc/config/sh/sh.h	(working copy)
@@ -2088,12 +2088,8 @@
different code that does fewer memory accesses.  */
 
 /* A C expression for the cost of a branch instruction.  A value of 1
-   is the default; other values are interpreted relative to that.
-   The SH1 does not have delay slots, hence we get a pipeline stall
-   at every branch.  The SH4 is superscalar, so the single delay slot
-   is not sufficient to keep both pipelines filled.  */
-#define BRANCH_COST(speed_p, predictable_p) \
-	(TARGET_SH5 ? 1 : ! TARGET_SH2 || TARGET_HARD_SH4 ? 2 : 1)
+   is the default; other values are interpreted relative to that.  */
+#define BRANCH_COST(speed_p, predictable_p) sh_branch_cost
 
 /* Assembler output control.  */
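
The default selection done in sh_option_override above can be sketched as a
small standalone function. This is only an illustration under assumptions: the
function name and its boolean parameters are hypothetical stand-ins for the
TARGET_SH2 and TARGET_HARD_SH4 macros, and -1 is taken to mean that no
-mbranch-cost value was given, as in the patch.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the patched logic: `requested` plays the role
   of sh_branch_cost before the override, and the two booleans stand in for
   the TARGET_SH2 and TARGET_HARD_SH4 macros.  */
int
sketch_sh_branch_cost (int requested, bool target_sh2, bool target_hard_sh4)
{
  int cost;

  /* A value other than -1 means the user chose a cost; keep it.  */
  if (requested != -1)
    return requested;

  cost = 1;

  /* No delay slots (pre-SH2) or a superscalar SH4: branches cost more.  */
  if (!target_sh2 || target_hard_sh4)
    cost = 2;

  return cost;
}
```

With BRANCH_COST expanding to the variable, a user-supplied value is returned
unchanged, which is the behavior the patch restores.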

Re: PR rtl-optimization/51069 (verify_loop_info failed)

2011-12-28 Thread Jan Hubicka

 Just a nit, can't you break out of the loop when irred_invalidated is set to
 true as well?  There is no need to look through any further edges.  I.e.

Sure, though we already have horrible time complexity when irreducible
regions are involved, including recomputing the flags for the whole CFG
after every path removal.

Honza

Re: [patch testsuite g++.dg]: Reflect ABI change for windows native targets about bitfield layout in structures

2011-12-28 Thread Mike Stump

On Dec 16, 2011, at 9:56 AM, Dave Korn dave.korn.cyg...@gmail.com wrote:

 On 16/12/2011 09:01, Kai Tietz wrote:
 2011/12/15 Dave Korn:
 
 { dg-options "-mno-align-double" { target i?86-*-cygwin* i?86-*-mingw* } }
 { dg-additional-options "-mno-ms-bitfields" { target i?86-*-mingw* } }
 
 ... so that MinGW gets both and Cygwin only the one it wants?  (Actually the
 first one could just as well be changed to dg-additional-options at the same
 time, couldn't it?)
 
 Well, interesting.  I think it should be the additional variant for
 cygwin/mingw, as otherwise -O2 gets clobbered for it, isn't it?
 
  Yes, that's what I was concerned with.
 
 So I modified patch as attached.
 
  Thanks for that.  I recommend this patch for approval.

Ok.

Re: [patch testsuite g++.old-deja]: Fix some testcases for mingw targets

2011-12-28 Thread Mike Stump

On Dec 27, 2011, at 10:55 PM, Kai Tietz ktiet...@googlemail.com wrote:

 Ping

It was previously approved in the email you quote.  See the Ok buried in there.

 2011/12/15 Dave Korn dave.korn.cyg...@gmail.com:
 On 15/12/2011 17:44, Mike Stump wrote:
 On Dec 15, 2011, at 1:43 AM, Kai Tietz wrote:
 This patch takes care that we are using for operator new/delete
 replacement test static version on mingw-targets.  As the shared (DLL)
 version isn't able to have operator overload within DLL itself, as a DLL
 is finally-linked for PE-COFF.
 
 Ok for apply?
 
 Not sure who would review this if I don't, so, Ok.  That said, if a shared
 library C++ type person wants to chime in...  I get the feeling this is
 unfortunate, and it might have been nice to manage this in some other way,
 but, I just want to step back and let others think about it.
 
  Well, it's a consequence of how you can't leave undefined references in
 Windows DLLs at link-time for the loader to just fill in with the first
 definition it comes across at run-time (as you can on ELF).  We have to jump
 through hoops to get operator new/delete replacement working on Cygwin, and
 were lucky in that the cygwin1.dll is linked against absolutely everything, so
 we had somewhere to hang our redirection hooks.  Without someone adding some
 similar amount of infrastructure to MinGW, the only time function replacement
 can work is for a statically-linked executable, when all definitions are
 visible in one single link.
 
   * g++.old-deja/g++.brendan/new3.C: Adjust test for mingw
   targets to use static-version.
 
 s/static-version/static linking/
 
 +// Avoid use of none-overridable new/delete operators in shared
 
 s/none-overridable/non-overridable/g
 s/in shared/in shared link/g
 
  Patch looks perfectly sensible to me, but I can't approve.
 
cheers,
  DaveK
 
 
 

Re: [PATCH] Don't optimize away non-pure/const calls during ccp (PR tree-optimization/51683)

2011-12-28 Thread Nathan Froyd

- Original Message -
 else if (is_gimple_call (def_stmt))
 {
 + int flags = gimple_call_flags (def_stmt);
 +
 + /* Don't optimize away calls that have side-effects. */
 + if ((flags & (ECF_CONST|ECF_PURE)) == 0
 + || (flags & ECF_LOOPING_CONST_OR_PURE))

This patch does this computation twice; grepping through the tree for ECF_CONST 
suggests it's done quite a few more times.  Could we get a predicate in 
gimple.h to encapsulate this?

-Nathan
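
For illustration, a predicate of the kind asked about here might look like the
sketch below. It is not part of GCC and not part of the patch: the function
name is hypothetical, and the flag bits are stand-ins defined only for this
example rather than the real ECF_* values.

```c
#include <stdbool.h>

/* Stand-in flag bits for this sketch only; the real ECF_* values are
   defined by GCC and are not reproduced here.  */
enum
{
  SKETCH_ECF_CONST = 1 << 0,
  SKETCH_ECF_PURE = 1 << 1,
  SKETCH_ECF_LOOPING_CONST_OR_PURE = 1 << 2
};

/* Hypothetical helper: true when the flags describe a const or pure call
   that is not marked as possibly looping forever, i.e. the case in which
   the patch still allows the call to be replaced by its known value.  */
bool
sketch_nonlooping_const_or_pure_p (int flags)
{
  return (flags & (SKETCH_ECF_CONST | SKETCH_ECF_PURE)) != 0
	 && (flags & SKETCH_ECF_LOOPING_CONST_OR_PURE) == 0;
}
```

Both tests in the patch are the two sides of this one condition: one site
skips the call when the predicate is false, the other proceeds only when it
is true.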

Re: [PATCH] Don't optimize away non-pure/const calls during ccp (PR tree-optimization/51683)

2011-12-28 Thread Jakub Jelinek

On Wed, Dec 28, 2011 at 11:53:41AM -0800, Nathan Froyd wrote:
 - Original Message -
  else if (is_gimple_call (def_stmt))
  {
  + int flags = gimple_call_flags (def_stmt);
  +
  + /* Don't optimize away calls that have side-effects. */
  + if ((flags & (ECF_CONST|ECF_PURE)) == 0
  + || (flags & ECF_LOOPING_CONST_OR_PURE))
 
 This patch does this computation twice; grepping through the tree for
 ECF_CONST suggests it's done quite a few more times.  Could we get a
 predicate in gimple.h to encapsulate this?

I think it would be overkill to have a predicate for
nonlooping_const_or_pure_flags; we don't have predicates for similar
RTL or decl flags either.
We write:
/* We can delete dead const or pure calls as long as they do not
 infinite loop.  */
   (RTL_CONST_OR_PURE_CALL_P (insn)
   && !RTL_LOOPING_CONST_OR_PURE_CALL_P (insn)))
and not RTL_CONST_OR_PURE_NONLOOPING_CALL_P (insn) etc.

Jakub

Re: [PATCH] PowerPC section type conflict (created PR 51623)

2011-12-28 Thread Richard Henderson

On 12/28/2011 09:39 AM, Michael Meissner wrote:
  && in_section != text_section
 -&& !unlikely_text_section_p (in_section)
 +&& (in_section && (in_section->common.flags & SECTION_CODE)) == 0

You should be able to delete the text_section test as well,
and in_section should *never* be null, when emitting data.

Otherwise this looks much better to me.


r~

Ping [ARM back-end and middle-end patch] stack check for threads

2011-12-28 Thread Thomas Klein


ping

I would like to introduce two new -fstack-check options named direct and
indirect.
Targets that do not support the new stack checking options will work as
before.

On the ARM platform the old generic option works as before.
(In addition, it is now possible to get a checking code sequence even when
optimization is switched on.)
The check against a given limit value during dynamic stack allocation now
works, too.
Previously it did not, because the trap function was missing.
For this case I've added a code sequence that lets generic act like the
dynamic part and compare against a given limit value.
I'm treating this as keeping old stuff alive.

Now back to the new options I would like to have.
Maybe you are happy with the above, but I'm not.
Sometimes there is no single limit value that is valid for all code.
For example, in an environment with threads, each thread uses its own
stack at a different location.
In that case all functions should share knowledge of a global limit
variable that holds the limit value.
This limit value can be used to check whether a stack overflow has occurred.


There are two ways to inform the compiler about this limit variable.
 If it is an ordinary variable (located somewhere in data space), use the
 option combination
 -fstack-check=indirect and -fstack-limit-symbol=global_stack_limit

 If it is a global register variable, use the option combination
 -fstack-check=direct and -fstack-limit-register=r6
 In this case you have to make sure that this register isn't used by
 other code.
 For example, you can add the option -ffixed-r6 to all files that are
 not going to do stack checking.
The OS is responsible for inserting the correct limit value,
for example at the end of a context switch.
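
The idea of a per-thread limit that the OS updates can be sketched in plain C
as below. This is only a model under assumptions: all names are hypothetical,
the stack is assumed to grow downwards, and the check is an ordinary function
here rather than the code sequence the compiler would emit in a prologue.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical global limit variable, playing the role of the symbol that
   would be named with -fstack-limit-symbol.  */
uintptr_t sketch_stack_limit;

/* Model of what the OS would do at the end of a context switch: install
   the limit that belongs to the thread about to run.  */
void
sketch_context_switch (uintptr_t new_thread_limit)
{
  sketch_stack_limit = new_thread_limit;
}

/* For a downward-growing stack: would allocating `size` bytes below `sp`
   move the stack pointer past the current limit?  */
bool
sketch_stack_would_overflow (uintptr_t sp, uintptr_t size)
{
  return size > sp || sp - size < sketch_stack_limit;
}
```

Because the limit is read at check time, the same compiled function gives the
right answer for whichever thread is currently running.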

I've added a little bit of documentation, too.
This may not be as good as you expect, but it is the best I can do.
Sorry for that.

I have added some tests that run on the ARM simulator and on a Linux ARM
target machine.
I'm using cross compilers configured with ../src/configure --target=arm-elf
and --target=arm-elf-eabi, and running the tests with:
gmake check-gcc RUNTESTFLAGS="--target_board=arm-sim arm_stack_check.exp"
I'm also using a native Linux compiler (on an armv7-a machine) and running
the tests with:
gmake check-gcc RUNTESTFLAGS="arm_stack_check.exp"

Each test case is done with:
- stack checking variants
  generic using a limit-symbol, generic using a limit register,
  direct using a limit-symbol, direct using a limit register and
  indirect using a limit-symbol
- various modes ARM, Thumb, (and if possible with Thumb-2)
- With and without optimization.
- Without -fpic, with -fpic and with -fpic -msingle-pic-base

I have also detected a minor bug when using the combination:
-fpic -mpic-register=r9 -march=armv4t -mthumb.
(A move of the hi register to a lo register is missing here.)
So I've added a few lines of code for that here, too.
Maybe you think this is a nasty hack; if so, please insert a better fix instead.

All tests succeed.

I still think that my idea isn't that bad.
However, any feedback from the ARM maintainers would be good,
even if it is something like: We hate this bull shit at all.
Any feedback is better than no feedback.

Regards
  Thomas Klein

references
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg01261.html
http://gcc.gnu.org/ml/gcc-patches/2011-09/msg00310.html
http://gcc.gnu.org/ml/gcc-patches/2011-08/msg00216.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00281.html
http://gcc.gnu.org/ml/gcc-patches/2011-07/msg00149.html
http://gcc.gnu.org/ml/gcc-patches/2011-06/msg01872.html
http://gcc.gnu.org/ml/gcc-patches/2011-03/msg01226.html


ChangeLog.check.bz2
Description: Binary data


gcc.diff_chk.bz2
Description: Binary data


gcc.diff_dep.bz2
Description: Binary data


ChangeLog.test.bz2
Description: Binary data


gcc.diff_test.bz2
Description: Binary data

Re: [PATCH] PowerPC section type conflict (created PR 51623)

2011-12-28 Thread Michael Meissner

On Wed, Dec 28, 2011 at 12:34:25PM -0800, Richard Henderson wrote:
 On 12/28/2011 09:39 AM, Michael Meissner wrote:
 && in_section != text_section
  -  && !unlikely_text_section_p (in_section)
  +  && (in_section && (in_section->common.flags & SECTION_CODE)) == 0
 
 You should be able to delete the text_section test as well,
 and in_section should *never* be null, when emitting data.
 
 Otherwise this looks much better to me.

Yeah, I thought about that.  I'm wondering whether any integer is ever emitted
in the text section; if not, I can just delete the two lines.  I'll try it out.

-- 
Michael Meissner, IBM
5 Technology Place Drive, M/S 2757, Westford, MA 01886-3141, USA
meiss...@linux.vnet.ibm.com fax +1 (978) 399-6899

Re: Ping**1.57 [Patch, fortran] Improve common function elimination

2011-12-28 Thread Steve Kargl

On Wed, Dec 28, 2011 at 04:21:55PM +0100, Thomas Koenig wrote:
 http://gcc.gnu.org/ml/fortran/2011-12/msg00102.html
 
 OK for trunk?
 

I did not test the patch, but it appears correct to me.

OK.

-- 
Steve

MAINTAINERS: Add myself

2011-12-28 Thread Oleg Endo

Just committed:

* MAINTAINERS (Write After Approval): Add myself.

Index: MAINTAINERS
===
--- MAINTAINERS (revision 182712)
+++ MAINTAINERS (working copy)
@@ -352,6 +352,7 @@
 Michael Eager  ea...@eagercon.com
 Phil Edwards   p...@gcc.gnu.org
 Mohan Embar    gnust...@thisiscool.com
+Oleg Endo  olege...@gcc.gnu.org
 Revital Eres   e...@il.ibm.com
 Marc Espie es...@cvs.openbsd.org
 Rafael Ávila de Espíndola  espind...@google.com

[PATCH] PR testsuite/51097 fix: a lot of FAIL: gcc.dg/vect on i686 avx build 181167 to 181177

2011-12-28 Thread Igor Zamyatin

Hi,

Here is another patch about failures in gcc.dg/vect tests. These
changes fix fails that could be seen on avx-built compilers. It also
introduces no FAILs/XFAILs/XPASSes/ERRORs on regular i686, x86_64,
avx2_32, avx2_64.
Is it ok for the trunk?

Thanks,
Igor

2011-12-28  Igor Zamyatin  igor.zamya...@intel.com

       PR testsuite/51097
       * lib/target-supports.exp (check_effective_target_vect_float_no_int):
       New function.
       (check_avx2_available): Ditto.
       * gcc.dg/vect/no-scevccp-outer-7.c: Adjust dg-scans for AVX-built
       compiler.
       * gcc.dg/vect/no-scevccp-vect-iv-3.c: Likewise.
       * gcc.dg/vect/no-vfa-vect-depend-1.c: Likewise.
       * gcc.dg/vect/no-vfa-vect-dv-2.c: Likewise.
       * gcc.dg/vect/slp-perm-9.c: Likewise.
       * gcc.dg/vect/slp-reduc-6.c: Likewise.
       * gcc.dg/vect/slp-widen-mult-half.c: Likewise.
       * gcc.dg/vect/vect-109.c: Likewise.
       * gcc.dg/vect/vect-119.c: Likewise.
       * gcc.dg/vect/vect-35-big-array.c: Likewise.
       * gcc.dg/vect/vect-91.c: Likewise.
       * gcc.dg/vect/vect-multitypes-4.c: Likewise.
       * gcc.dg/vect/vect-multitypes-6.c: Likewise.
       * gcc.dg/vect/vect-outer-4c-big-array.c: Likewise.
       * gcc.dg/vect/vect-over-widen-1.c: Likewise.
       * gcc.dg/vect/vect-over-widen-4.c: Likewise.
       * gcc.dg/vect/vect-peel-1.c: Likewise.
       * gcc.dg/vect/vect-peel-3.c: Likewise.
       * gcc.dg/vect/vect-peel-4.c: Likewise.
       * gcc.dg/vect/vect-reduc-dot-s16a.c: Likewise.
       * gcc.dg/vect/vect-reduc-dot-s8a.c: Likewise.
       * gcc.dg/vect/vect-reduc-dot-u8a.c: Likewise.
       * gcc.dg/vect/vect-reduc-dot-u8b.c: Likewise.
       * gcc.dg/vect/vect-reduc-pattern-1a.c: Likewise.
       * gcc.dg/vect/vect-reduc-pattern-1b-big-array.c: Likewise.
       * gcc.dg/vect/vect-reduc-pattern-1c-big-array.c: Likewise.
       * gcc.dg/vect/vect-reduc-pattern-2a.c: Likewise.
       * gcc.dg/vect/vect-reduc-pattern-2b-big-array.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-const-s16.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-const-u16.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-half-u8.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-half.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-sum.c: Likewise.
       * gcc.dg/vect/vect-widen-mult-u16.c: Likewise.
       * gcc.dg/vect/wrapv-vect-reduc-dot-s8b.c: Likewise.


51097.patch
Description: Binary data

Re: PR rtl-optimization/51069 (verify_loop_info failed)

2011-12-28 Thread Zdenek Dvorak

Hi,

 Just a nit, can't you break out of the loop when irred_invalidated is set to
 true as well?  There is no need to look through any further edges.  I.e.
 perhaps:
   if (!irred_invalidated)
 FOR_EACH_EDGE (ae, ei, e->src->succs)
   if (ae != e
  && ae->dest != EXIT_BLOCK_PTR
  && (ae->flags & EDGE_IRREDUCIBLE_LOOP)
  && !TEST_BIT (seen, ae->dest->index))
   {
 irred_invalidated = true;
 break;
   }
 Thanks for looking into this, I'll defer the review to somebody familiar
 with cfgloopmanip.c though.

the change looks fine to me.

 Sure, though we already have horrible time complexity when irreducible
 regions are involved, including recomputing the flags for the whole CFG
 after every path removal.

Yeah, though trying to keep it up-to-date locally was a nightmare.  We actually
do not use the information about irreducible regions all that much, so maybe
the right approach would be to just compute it when needed,

Zdenek
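
The early exit suggested in the quoted nit can be sketched outside GCC as
follows. This is a simplified model under assumptions: the edge structure and
function are hypothetical, the EXIT_BLOCK_PTR test is left out, and the
`visited` counter exists only to make the effect of the break observable.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified edge: only the two properties the scan needs.  */
struct sketch_edge
{
  bool irreducible;	/* stands in for the EDGE_IRREDUCIBLE_LOOP flag */
  bool dest_seen;	/* stands in for TEST_BIT (seen, dest index) */
};

/* Scan the successor edges, ignoring the edge at index `skip`.  Stop at the
   first irreducible edge whose destination has not been seen; `*visited`
   reports how many edges were examined.  */
bool
sketch_irred_invalidated (const struct sketch_edge *edges, size_t n,
			  size_t skip, size_t *visited)
{
  bool invalidated = false;

  *visited = 0;
  for (size_t i = 0; i < n; i++)
    {
      ++*visited;
      if (i != skip && edges[i].irreducible && !edges[i].dest_seen)
	{
	  invalidated = true;
	  break;	/* No need to look through any further edges.  */
	}
    }
  return invalidated;
}
```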

[wwwdocs] - changes to GUPC page

2011-12-28 Thread Nenad Vukicevic


Hello, I updated GUPC page with the Download section.

Attached is a patch. Ok to commit? Feedback is appreciated.

Nenad
Index: htdocs/projects/gupc.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/gupc.html,v
retrieving revision 1.5
diff -u -r1.5 gupc.html
--- htdocs/projects/gupc.html   31 Dec 2010 11:36:07 -  1.5
+++ htdocs/projects/gupc.html   28 Dec 2011 21:47:42 -
@@ -68,7 +68,7 @@
 <li>Intel x86 Linux uniprocessor and symmetric multiprocessor systems
 (CentOS 5.3)</li>
 <li>Intel x86 Apple Mac OS X uniprocessor and symmetric multiprocessor
-systems (Leopard 10.5.7+ and Snow Leopard 10.6)</li>
+systems (Leopard 10.5.7+, Snow Leopard 10.6, and Lion 1.7)</li>
 <li>Mips2 32-bit (-n32) ABI and mips4 64-bit (-n64) ABI (SGI IRIX 6.5)</li>
 <li>Cray XT3/4/5 CNL and Catamount</li>
 <li>As a front-end to the Berkeley UPC Berkeley UPC runtime
@@ -81,6 +81,16 @@
 <a href="#gupc_discuss">GUPC discussion list</a>.
 </p>
 
+<h2>Download</h2>
+
+<p>The latest release of GUPC can be downloaded from <a
+href="http://www.gccupc.org/downloads.html">gccupc.org</a>.</p>
+
+<p>Alternatively, read-only SVN access to the GUPC branch can be used to
+acquire the latest development source tree:</p>
+
+<pre>svn checkout svn://gcc.gnu.org/svn/gcc/branches/gupc</pre>
+
 <h2>Documentation</h2>
 
 <p>For a list of configuration switches that you can use to build GUPC, consult

Re: [SH] Fix defunct -mbranch-cost option

2011-12-28 Thread Kaz Kojima

Oleg Endo oleg.e...@t-online.de wrote:
 while working on another PR I've noticed that the -mbranch-cost option
 in the SH target is not really working.  The attached patch brings it
 back to life, leaving the default behavior unchanged.
 
 Cheers,
 Oleg
 
 
 2011-12-28  Oleg Endo  oleg.e...@t-online.de
 
   * config/sh/sh.h (BRANCH_COST): Use sh_branch_cost variable.
   * config/sh/sh.c (sh_option_override): Simplify sh_branch_cost
   expression.

Ok as the obvious fix.  Thanks for the patch.

Regards,
kaz

[C++ Patch] PR 51316

2011-12-28 Thread Paolo Carlini


Hi,

I think the resolution of core/930 and C++11 itself are pretty clear: 
alignof of an array of unknown bound is fine, provided the element type 
is complete of course.


Tested x86_64-linux.

Thanks,
Paolo.

//
/c-family
2011-12-29  Paolo Carlini  paolo.carl...@oracle.com

PR c++/51316
* c-common.c (c_sizeof_or_alignof_type): In C++ allow for alignof
of array types with an unknown bound.

/testsuite
2011-12-29  Paolo Carlini  paolo.carl...@oracle.com

PR c++/51316
* g++.dg/cpp0x/alignof4.C: New.
Index: testsuite/g++.dg/cpp0x/alignof4.C
===
--- testsuite/g++.dg/cpp0x/alignof4.C   (revision 0)
+++ testsuite/g++.dg/cpp0x/alignof4.C   (revision 0)
@@ -0,0 +1,7 @@
+// PR c++/51316
+// { dg-options "-std=c++0x" }
+
+int main()
+{
+  alignof(int []);
+}
Index: c-family/c-common.c
===
--- c-family/c-common.c (revision 182710)
+++ c-family/c-common.c (working copy)
@@ -4382,13 +4382,22 @@ c_sizeof_or_alignof_type (location_t loc,
 return error_mark_node;
   value = size_one_node;
 }
-  else if (!COMPLETE_TYPE_P (type))
+  else if (!COMPLETE_TYPE_P (type)
+   && (!c_dialect_cxx () || is_sizeof || type_code != ARRAY_TYPE))
 {
   if (complain)
-   error_at (loc, "invalid application of %qs to incomplete type %qT ",
+   error_at (loc, "invalid application of %qs to incomplete type %qT",
  op_name, type);
   return error_mark_node;
 }
+  else if (c_dialect_cxx () && type_code == ARRAY_TYPE
+   && !COMPLETE_TYPE_P (TREE_TYPE (type)))
+{
+  if (complain)
+   error_at (loc, "invalid application of %qs to array type %qT of "
+ "incomplete element type", op_name, type);
+  return error_mark_node;
+}
   else
 {
   if (is_sizeof)

Use DW_LANG_Go for Go

2011-12-28 Thread Ian Lance Taylor

This patch to gcc uses the new DW_LANG_Go DWARF language code for Go.
Bootstrapped and ran testsuite on x86_64-unknown-linux-gnu.  Committed
on the basis of 1) I am a middle-end maintainer; 2) I am a Go
maintainer; 3) the patch is obvious.

Ian


2011-12-28  Ian Lance Taylor  i...@google.com

* dwarf2out.c (gen_compile_unit_die): Use DW_LANG_Go for Go.


Index: dwarf2out.c
===
--- dwarf2out.c	(revision 182694)
+++ dwarf2out.c	(working copy)
@@ -18433,6 +18433,11 @@ gen_compile_unit_die (const char *filena
 	language = DW_LANG_ObjC;
   else if (strcmp (language_string, "GNU Objective-C++") == 0)
 	language = DW_LANG_ObjC_plus_plus;
+  else if (dwarf_version >= 5 || !dwarf_strict)
+	{
+	  if (strcmp (language_string, "GNU Go") == 0)
+	language = DW_LANG_Go;
+	}
 }
 
   add_AT_unsigned (die, DW_AT_language, language);
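
The gating in this hunk can be sketched as a standalone function. This is a
simplified model under assumptions: the function name and enum are
hypothetical, only the Go branch and a C89 fallback are modeled, and the
surrounding chain of other language checks is left out. The two codes used are
the DWARF values for DW_LANG_C89 and DW_LANG_Go.

```c
#include <stdbool.h>
#include <string.h>

/* Language codes used by this sketch.  */
enum
{
  SKETCH_DW_LANG_C89 = 0x01,
  SKETCH_DW_LANG_GO = 0x16
};

/* Pick the language code: the Go code is used only when DWARF 5 or later
   is requested, or when strict DWARF is not in effect.  */
int
sketch_pick_language (const char *language_string, int dwarf_version,
		      bool dwarf_strict)
{
  int language = SKETCH_DW_LANG_C89;

  if (dwarf_version >= 5 || !dwarf_strict)
    {
      if (strcmp (language_string, "GNU Go") == 0)
	language = SKETCH_DW_LANG_GO;
    }
  return language;
}
```

With strict DWARF below version 5, a Go unit keeps the fallback code, so
consumers that only know the older language codes are not surprised.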

[libitm] Remove variadic argument of _ITM_beginTransaction from libitm.h

2011-12-28 Thread Patrick Marlier

On i386, regparm(2) is not taken into account for a variadic function: all
parameters are passed on the stack.

Since this variadic argument is never used, removing it is not a problem.

This solves libitm testcases memset-1.c/memcpy-1.c on i686 (part of 
PR51655/51124).


Before:
FAIL: libitm.c/memcpy-1.c execution test
FAIL: libitm.c/memset-1.c execution test

=== libitm Summary ===

# of expected passes            21
# of unexpected failures        2
# of expected failures          5
# of unresolved testcases       1


After:
=== libitm Summary ===

# of expected passes            23
# of expected failures          5
# of unresolved testcases       1

Tested on i686. If ok, please commit. Thanks.

Patrick Marlier.


2011-12-28  Patrick Marlier  patrick.marl...@gmail.com

PR testsuite/51655
* libitm.h (_ITM_beginTransaction): Remove unused argument.



Index: libitm.h
===
--- libitm.h(revision 182549)
+++ libitm.h(working copy)
@@ -136,7 +136,7 @@ typedef uint64_t _ITM_transactionId_t;  /* Transact

 extern _ITM_transactionId_t _ITM_getTransactionId(void) ITM_REGPARM;

-extern uint32_t _ITM_beginTransaction(uint32_t, ...) ITM_REGPARM;
+extern uint32_t _ITM_beginTransaction(uint32_t) ITM_REGPARM;

 extern void _ITM_abortTransaction(_ITM_abortReason) ITM_REGPARM ITM_NORETURN;

Re: [wwwdocs] - changes to GUPC page

2011-12-28 Thread Mike Stump

On Dec 28, 2011, at 1:52 PM, Nenad Vukicevic wrote:

 -systems (Leopard 10.5.7+ and Snow Leopard 10.6)</li>
 +systems (Leopard 10.5.7+, Snow Leopard 10.6, and Lion 1.7)</li>

1.7?  Should this be 10.7?

RE: PING: [PATCH, ARM, iWMMXt][4/5]: WMMX machine description

2011-12-28 Thread Xinyu Qi

At 2011-12-22 17:53:45,Richard Earnshaw rearn...@arm.com wrote: 
 On 22/12/11 06:38, Xinyu Qi wrote:
  At 2011-12-15 01:32:13,Richard Earnshaw rearn...@arm.com wrote:
  On 24/11/11 01:33, Xinyu Qi wrote:
  Hi Ramana,
 
  I solve the conflict, please try again. The new diff is attached.
 
  Thanks,
  Xinyu
 
  At 2011-11-19 07:36:15,Ramana Radhakrishnan
  ramana.radhakrish...@linaro.org wrote:
 
  Hi Xinyu,
 
  This doesn't apply cleanly currently on trunk and the reject appears
  to come from iwmmxt.md and I've not yet investigated why.
 
  Can you have a look ?
 
 
  This patch is NOT ok.
 
  You're adding features that were new in iWMMXt2 (ie not in the original
  implementation) but you've provided no means by which the compiler can
  detect which operations are only available on the new cores.
 
  Hi Richard,
 
  All of the WMMX chips support WMMX2 instructions.
 
 This may be true for Marvell's current range of processors, but I find
 it hard to reconcile with the assembler support in GAS, which clearly
 distinguishes between iWMMXT and iWMMXT2 instruction sets.  Are you
 telling me that no cores were ever manufactured (even by Intel) that
 only supported iWMMXT?
 
 I'm concerned that this patch will break support for existing users who
 have older chips (for GCC we have to go through a deprecation cycle if
 we want to drop support for something we now believe is no-longer worth
 maintaining).
 
  What I do is to complement the WMMX2 intrinsic support in GCC.
 
 I understand that, and I'm not saying the patch can never go in; just
 that it needs to separate out the support for the different architecture
 variants.
 
  I don't think it is necessary for users to consider whether one WMMX insn
  is a WMMX2 insn or not.
 
 Users don't (unless they want their code to run on legacy processors
 that only support the original instruction set), but the compiler surely
 must know what it is targeting.  Remember that the instruction patterns
 are not entirely black boxes, the compiler can do optimizations on
 intrinsics (it's one of the reasons why they are better than inline
 assembly).  Unless the compiler knows exactly what instructions are
 legal, it could end up optimizing something that started as a WMMX insn
 into something that's a WMMX2 insn (for example, propagating a constant
 into a vector shift expression).
 
 R.

Hi, Richard,

You are right. Historically, there were chips that only support WMMX
instructions.
I distinguish between iWMMXt and iWMMXt2 in this update of the patch.

In current GCC, -march=iwmmxt and -march=iwmmxt2 (or -mcpu=iwmmxt and
-mcpu=iwmmxt2) make almost no difference at the compilation stage.
I take advantage of them to do the work: -march=iwmmxt (or -mcpu=iwmmxt)
now supports only the iWMMXt intrinsics, the iWMMXt built-ins and the WMMX
instructions, while -march=iwmmxt2 (or -mcpu=iwmmxt2) fully supports iWMMXt2.

Define a new flag FL_IWMMXT2 to represent chips that support the iWMMXt2
extension; it directly controls the initialization of the iWMMXt2 built-ins
and the defines that follow.
Define __IWMMXT2__ in TARGET_CPU_CPP_BUILTINS to control access to the
iWMMXt2 intrinsics.
Define TARGET_REALLY_IWMMXT2 to control access to the machine description of
the WMMX2 instructions.
In arm.md, define iwmmxt2 in the arch attribute to control access to the
alternatives in the shift patterns.

The updated patch 4/5 is attached here. 1/5, 2/5 and 3/5 are updated
accordingly and attached to the related mails.
Please take a look and check whether this modification is appropriate.

Changelog:

* config/arm/arm.c (arm_output_iwmmxt_shift_immediate): New function.
(arm_output_iwmmxt_tinsr): Likewise.
* config/arm/arm-protos.h (arm_output_iwmmxt_shift_immediate): Declare.
(arm_output_iwmmxt_tinsr): Likewise.
* config/arm/iwmmxt.md (WCGR0, WCGR1, WCGR2, WCGR3): New constant.
(iwmmxt_psadbw, iwmmxt_walign, iwmmxt_tmrc, iwmmxt_tmcr): Delete.
(rorv4hi3, rorv2si3, rordi3): Likewise.
(rorv4hi3_di, rorv2si3_di, rordi3_di): Likewise.
(ashrv4hi3_di, ashrv2si3_di, ashrdi3_di): Likewise.
(lshrv4hi3_di, lshrv2si3_di, lshrdi3_di): Likewise.
(ashlv4hi3_di, ashlv2si3_di, ashldi3_di): Likewise.
(iwmmxt_tbcstqi, iwmmxt_tbcsthi, iwmmxt_tbcstsi): Likewise
(*iwmmxt_clrv8qi, *iwmmxt_clrv4hi, *iwmmxt_clrv2si): Likewise.
(tbcstv8qi, tbcstv4hi, tbsctv2si): New pattern.
(iwmmxt_clrv8qi, iwmmxt_clrv4hi, iwmmxt_clrv2si): Likewise.
(*andmode3_iwmmxt, *iormode3_iwmmxt, *xormode3_iwmmxt): Likewise.
(rormode3, rormode3_di): Likewise.
(ashrmode3_di, lshrmode3_di, ashlmode3_di): Likewise.
(ashlimode3_iwmmxt, iwmmxt_waligni, iwmmxt_walignr): Likewise.
(iwmmxt_walignr0, iwmmxt_walignr1): Likewise.
(iwmmxt_walignr2, iwmmxt_walignr3): Likewise.
(iwmmxt_setwcgr0, iwmmxt_setwcgr1): Likewise.
(iwmmxt_setwcgr2, iwmmxt_setwcgr3): Likewise.
(iwmmxt_getwcgr0, iwmmxt_getwcgr1): Likewise.

RE: [PATCH, ARM, iWMMXt][1/5]: ARM code generic change

2011-12-28 Thread Xinyu Qi

 At 2011-12-15 00:47:48,Richard Earnshaw rearn...@arm.com wrote:
  On 14/07/11 08:35, Xinyu Qi wrote:
   Hi,
  
   It is the first part of iWMMXt maintenance.
  
   *config/arm/arm.c (arm_option_override):
 Enable iWMMXt with VFP. iWMMXt and NEON are incompatible.
   iWMMXt unsupported under Thumb-2 mode.
 (arm_expand_binop_builtin): Accept immediate op (with mode VOID)
   *config/arm/arm.md:
 Resettle include location of iwmmxt.md so that *arm_movdi
   and *arm_movsi_insn could be used when iWMMXt is enabled.
  
   With the current work in trunk to handle enabled attributes and
   per-alternative predicable attributes (Thanks Bernd) we should be
   able to get rid of *cond_iwmmxt_movsi_insn  in iwmmxt.md file.
   It's not a matter for this patch but for a follow-up patch.
  
   Actually we should probably do the same for the various insns that
   are dotted around all over the place with final conditions that
   prevent matching - atleast makes the backend description slightly
   smaller :).
  
 Add pipeline description file include.
  
   It is enough to say
  
(filename): Include.
  
   in the changelog entry.
  
   The include for the pipeline description file should be with the
   patch that you add this in i.e. patch #5. Please add this to
   MD_INCLUDES in t-arm as well.
  
   Also as a general note, please provide a correct Changelog entry.
  
   This is not the format that we expect Changelog entries to be in.
   Please look at the coding standards on the website for this or at
   other patches submitted with respect to Changelog entries. Please
   fix this for each patch in the patch stack.
  
  
   cheers
   Ramana
  
   Thanks for reviewing. I have updated the patches and the Changelog.
  
   *config/arm/arm.c (arm_option_override): Enable iWMMXt with VFP.
(arm_expand_binop_builtin): Accept VOIDmode op.
   *config/arm/arm.md (*arm_movdi, *arm_movsi_insn): Remove
  condition !TARGET_IWMMXT.
(iwmmxt.md): Include location.
  
   Thanks,
   Xinyu=
  
 
  + VFP and iWMMXt however can coexist.  */
  +  if (TARGET_IWMMXT && TARGET_HARD_FLOAT && !TARGET_VFP)
  +sorry ("iWMMXt and non-VFP floating point unit");
  +
  +  /* iWMMXt and NEON are incompatible.  */
  +  if (TARGET_IWMMXT && TARGET_NEON)
  +sorry ("iWMMXt and NEON");
 
  -  /* ??? iWMMXt insn patterns need auditing for Thumb-2.  */
  +  /* iWMMXt unsupported under Thumb-2 mode.  */
 if (TARGET_THUMB2 && TARGET_IWMMXT)
   sorry ("Thumb-2 iWMMXt");
 
  Don't use sorry() when a feature is not supported by the hardware;
  sorry() is used when GCC is currently unable to support something that
  it should.  Use error() in these cases.
 
  Secondly, iWMMXt is incompatible with the entire Thumb ISA, not just
  the Thumb-2 extensions to the Thumb ISA.
 
 Done.
 
 
 
  +;; Load the Intel Wireless Multimedia Extension patterns
  +(include "iwmmxt.md")
  +
 
 
  No, the extension patterns need to come at the end of the main machine
  description.  The list at the top of the MD file is purely for
  pipeline descriptions.  Why do you think this is needed?
 
 This modification is needless right now since *iwmmxt_movsi_insn and
 *iwmmxt_arm_movdi have been corrected in the fourth part of the patch.
 Revert it.
 The new modified patch is attached.
 
   * config/arm/arm.c (arm_option_override): Enable use of iWMMXt with
 VFP.
   Disable use of iWMMXt with NEON. Disable use of iWMMXt under Thumb
 mode.
   (arm_expand_binop_builtin): Accept VOIDmode op.
 
 Thanks,
 Xinyu
 
 
  Other bits are ok.
 
  R.

New changelog

* config/arm/arm.c (FL_IWMMXT2): New define.
(arm_arch_iwmmxt2): New variable.
(arm_option_override): Enable use of iWMMXt with VFP.
Disable use of iWMMXt with NEON. Disable use of iWMMXt under Thumb mode.
Set arm_arch_iwmmxt2.
(arm_expand_binop_builtin): Accept VOIDmode op.
* config/arm/arm.h (TARGET_CPU_CPP_BUILTINS): Define __IWMMXT2__.
(TARGET_IWMMXT2): New define.
(TARGET_REALLY_IWMMXT2): Likewise.
(arm_arch_iwmmxt2): Declare.
* config/arm/arm-cores.def (iwmmxt2): Add FL_IWMMXT2.
* config/arm/arm-arches.def (iwmmxt2): Likewise.
* config/arm/arm.md (arch): Add iwmmxt2.
(arch_enabled): Handle iwmmxt2.

Thanks,
Xinyu


1_generic.diff
Description: 1_generic.diff

RE: PING: [PATCH, ARM, iWMMXt][2/5]: intrinsic head file change

2011-12-28 Thread Xinyu Qi

* config/arm/mmintrin.h: Use __IWMMXT__ to enable iWMMXt intrinsics.
Use __IWMMXT2__ to enable iWMMXt2 intrinsics.
Use C name-mangling for intrinsics.
(__v8qi): Redefine.
(_mm_cvtsi32_si64, _mm_andnot_si64, _mm_sad_pu8): Revise.
(_mm_sad_pu16, _mm_align_si64, _mm_setwcx, _mm_getwcx): Likewise.
(_m_from_int): Likewise.
(_mm_sada_pu8, _mm_sada_pu16): New intrinsic.
(_mm_alignr0_si64, _mm_alignr1_si64, _mm_alignr2_si64): Likewise.
(_mm_alignr3_si64, _mm_tandcb, _mm_tandch, _mm_tandcw): Likewise.
(_mm_textrcb, _mm_textrch, _mm_textrcw, _mm_torcb): Likewise.
(_mm_torch, _mm_torcw, _mm_tbcst_pi8, _mm_tbcst_pi16): Likewise.
(_mm_tbcst_pi32): Likewise.
(_mm_abs_pi8, _mm_abs_pi16, _mm_abs_pi32): New iWMMXt2 intrinsic.
(_mm_addsubhx_pi16, _mm_absdiff_pu8, _mm_absdiff_pu16): Likewise.
(_mm_absdiff_pu32, _mm_addc_pu16, _mm_addc_pu32): Likewise.
(_mm_avg4_pu8, _mm_avg4r_pu8, _mm_maddx_pi16, _mm_maddx_pu16): Likewise.
(_mm_msub_pi16, _mm_msub_pu16, _mm_mulhi_pi32): Likewise.
(_mm_mulhi_pu32, _mm_mulhir_pi16, _mm_mulhir_pi32): Likewise.
(_mm_mulhir_pu16, _mm_mulhir_pu32, _mm_mullo_pi32): Likewise.
(_mm_qmulm_pi16, _mm_qmulm_pi32, _mm_qmulmr_pi16): Likewise.
(_mm_qmulmr_pi32, _mm_subaddhx_pi16, _mm_addbhusl_pu8): Likewise.
(_mm_addbhusm_pu8, _mm_qmiabb_pi32, _mm_qmiabbn_pi32): Likewise.
(_mm_qmiabt_pi32, _mm_qmiabtn_pi32, _mm_qmiatb_pi32): Likewise.
(_mm_qmiatbn_pi32, _mm_qmiatt_pi32, _mm_qmiattn_pi32): Likewise.
(_mm_wmiabb_si64, _mm_wmiabbn_si64, _mm_wmiabt_si64): Likewise.
(_mm_wmiabtn_si64, _mm_wmiatb_si64, _mm_wmiatbn_si64): Likewise.
(_mm_wmiatt_si64, _mm_wmiattn_si64, _mm_wmiawbb_si64): Likewise.
(_mm_wmiawbbn_si64, _mm_wmiawbt_si64, _mm_wmiawbtn_si64): Likewise.
(_mm_wmiawtb_si64, _mm_wmiawtbn_si64, _mm_wmiawtt_si64): Likewise.
(_mm_wmiawttn_si64, _mm_merge_si64): Likewise.
(_mm_torvscb, _mm_torvsch, _mm_torvscw): Likewise.
(_m_to_int): New define.

Thanks,
Xinyu


2_mmintrin.diff
Description: 2_mmintrin.diff
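
The ChangeLog above says the header uses __IWMMXT__ and __IWMMXT2__ to enable the two intrinsic sets and uses C name-mangling. A minimal sketch of that kind of guard follows. The helper `iwmmxt_level` is hypothetical and exists only to make the guard observable; the actual layout of mmintrin.h in the patch is not shown in this thread, so this should not be read as its contents.

```c
#ifdef __cplusplus
extern "C" {  /* keep C linkage names when included from C++ */
#endif

/* Hypothetical helper: report which intrinsic set the predefined macros
   would enable.  __IWMMXT2__ is tested first, so the sketch makes no
   assumption about whether it implies __IWMMXT__.  */
static inline int
iwmmxt_level (void)
{
#if defined (__IWMMXT2__)
  return 2;  /* iWMMXt2 intrinsics available */
#elif defined (__IWMMXT__)
  return 1;  /* iWMMXt intrinsics only */
#else
  return 0;  /* neither macro defined, e.g. a non-iWMMXt target */
#endif
}

#ifdef __cplusplus
}
#endif
```

On a compiler that defines neither macro the function returns 0, so the same source builds unchanged on targets without the extension.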
