Re: [RFC] Vectorization of indexed elements

2013-09-27 Thread Vidya Praveen
On Fri, Sep 27, 2013 at 03:50:08PM +0100, Vidya Praveen wrote:
[...]
   I can't really insist on the single lane load.. something like:
   
   vc:V4SI[0] = c
   vt:V4SI = vec_duplicate:V4SI (vec_select:SI vc:V4SI 0)
   va:V4SI = vb:V4SI op vt:V4SI
   
   Or is there any other way to do this?
  
  Can you elaborate on "I can't really insist on the single lane load"?
  What's the single lane load in your example? 
 
 Loading just one lane of the vector like this:
 
 vc:V4SI[0] = c // from the above scalar example
 
 or 
 
 vc:V4SI[0] = c[2] 
 
 is what I meant by single lane load. In this example:
 
 t = c[2] 
 ...
 vb:v4si = b[0:3] 
 vc:v4si = { t, t, t, t }
 va:v4si = vb:v4si op vc:v4si 
 
 If we are expanding the CONSTRUCTOR as vec_duplicate at vec_init, I cannot
 insist that 't' be a vector, or that t = c[2] become vect_t[0] = c[2] (which
 could be seen as vec_select:SI (vect_t 0) ).
 
  I'd expect the instruction
  pattern as quoted to just work (and I hope we expand an uniform
  constructor { a, a, a, a } properly using vec_duplicate).
 
 As far as I can tell from the code, this is only done through vec_init. It is
 not expanded as vec_duplicate from, for example, store_constructor() in expr.c.

Do you see any issues if we expand such a constructor as vec_duplicate directly
instead of going through vec_init?

VP
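
For readers following the thread, the pattern under discussion can be written out by hand. The following is only a sketch using the GCC/Clang vector extension; the function names are made up for illustration and '+' stands in for the unspecified 'op'. It shows the scalar loop next to the form with the uniform constructor { t, t, t, t } that the thread would like to see expanded as vec_duplicate.

```cpp
#include <cassert>
#include <cstring>

// Four 32-bit ints, mirroring the V4SI mode used in the thread.
typedef int v4si __attribute__ ((vector_size (16)));

// Scalar reference: every element of b is combined with the single element c[2].
void add_scalar (int *a, const int *b, const int *c, int n)
{
  for (int i = 0; i < n; i++)
    a[i] = b[i] + c[2];
}

// Hand-vectorized form of the same loop (hypothetical helper).
void add_broadcast (int *a, const int *b, const int *c, int n)
{
  assert (n % 4 == 0);               // this sketch has no scalar epilogue
  const int t = c[2];                // t = c[2]
  const v4si vc = { t, t, t, t };    // uniform constructor, the vec_duplicate candidate
  for (int i = 0; i < n; i += 4)
    {
      v4si vb, va;
      std::memcpy (&vb, b + i, sizeof vb);   // vb = b[i:i+3]
      va = vb + vc;                          // va = vb op vc
      std::memcpy (a + i, &va, sizeof va);
    }
}
```

Both functions should produce the same output for any n that is a multiple of 4; whether the broadcast ends up as a single by-element instruction depends on the target patterns discussed above.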




Mothballing C11 atomic work for now.

2013-09-27 Thread Andrew MacLeod
I don't have the time to finish pushing through the C11 atomic work for 
this release. Much of the remaining parts are in the parser, which I 
know very little about, and I won't be able to do a sufficient job in 
the time remaining, so I am switching my focus to the interface work and 
getting the header files re-factored before we end stage 1.


I have put all the work to date in a branch 'C11-atomic' which is based 
off of trunk on Sept 25th.  I have detailed the current status as well 
as what else needs doing on the C11 atomic wiki page

http://gcc.gnu.org/wiki/Atomic/C11

I have also uploaded the specific patches that have been applied to the 
branch, along with their revision numbers.  They are also on that wiki page.


The patches are very similar to what I posted here 
http://gcc.gnu.org/ml/gcc-patches/2013-08/msg00420.html
I addressed jsm's basic comments, but did not address the larger issues 
of warnings/errors and converting lvalues into rvalues, and other front 
end issues.  I also removed the places where I tried to treat the atomic 
qualifier like it was volatile. I think that was wrong and was masking 
other issues.


If this work is important to someone else, you are welcome to pick it 
up.  My parser expertise is minimal, and most of the remaining work is 
in that part of the compiler.   The facilities are already provided to 
expand atomic variables into the appropriate sequences; they 
just need to be called from the right places in the parser.


I may well get back to this next spring for the next release, but for 
now I am mothballing it until I have the time to learn the parts I need 
to learn to finish it.


Andrew
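
As a rough illustration of what "expansion into the appropriate sequences" means, here is a sketch built on GCC's existing __atomic builtins. The helper names are hypothetical and this is not the code on the C11-atomic branch; it only shows the kind of sequence a compound assignment on an atomic int could be lowered to.

```cpp
#include <cassert>

// Hypothetical lowering of "counter += delta" on an atomic int:
// a single read-modify-write builtin is enough.
int atomic_add_fetch_seq_cst (int *counter, int delta)
{
  assert (counter != nullptr);
  return __atomic_add_fetch (counter, delta, __ATOMIC_SEQ_CST);
}

// Hypothetical lowering of "value *= factor": there is no multiply builtin,
// so the usual shape is a compare-exchange loop.
int atomic_multiply_seq_cst (int *value, int factor)
{
  assert (value != nullptr);
  int expected = __atomic_load_n (value, __ATOMIC_SEQ_CST);
  int desired;
  do
    desired = expected * factor;   // recomputed whenever 'expected' was refreshed
  while (!__atomic_compare_exchange_n (value, &expected, desired,
                                       /* weak = */ false,
                                       __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST));
  return desired;
}
```

On failure __atomic_compare_exchange_n writes the current value back into 'expected', which is why the loop body only needs to recompute 'desired'.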


Re: Mothballing C11 atomic work for now.

2013-09-27 Thread Jeff Hammond
If C11 atomics are not going into 4.9, then comments made to reject
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58016 no longer hold and I
would ask that the resolution of both it and
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53769 be reconsidered.

Jeff



-- 
Jeff Hammond
jeff.scie...@gmail.com


[Bug c++/58548] New: ICE with local struct in function with auto parameter

2013-09-27 Thread reichelt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58548

Bug ID: 58548
   Summary: ICE with local struct in function with auto parameter
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: reichelt at gcc dot gnu.org

The following code snippet triggers an ICE on trunk (4.9.0 20130926) when
compiled with -std=gnu++1y:

===
void foo(auto)
{
  struct A { int i; };
}
===

bug.cc: In function 'void foo(auto1)':
bug.cc:3:18: error: data member 'i' cannot be a member template
   struct A { int i; };
  ^
neu40.cc:6:18: internal compiler error: in poplevel, at cp/decl.c:560
0x554850 poplevel(int, int, int)
../../gcc/gcc/cp/decl.c:560
0x58e568 end_template_decl()
../../gcc/gcc/cp/pt.c:3786
0x62ed7b finish_fully_implicit_template
../../gcc/gcc/cp/parser.c:29040
0x637ad1 cp_parser_member_declaration
../../gcc/gcc/cp/parser.c:20086
0x6381ee cp_parser_member_specification_opt
../../gcc/gcc/cp/parser.c:19630
0x6381ee cp_parser_class_specifier_1
../../gcc/gcc/cp/parser.c:18885
0x63ab90 cp_parser_class_specifier
../../gcc/gcc/cp/parser.c:19101
0x63ab90 cp_parser_type_specifier
../../gcc/gcc/cp/parser.c:14080
0x6500a9 cp_parser_decl_specifier_seq
../../gcc/gcc/cp/parser.c:11328
0x654139 cp_parser_simple_declaration
../../gcc/gcc/cp/parser.c:10918
0x656140 cp_parser_block_declaration
../../gcc/gcc/cp/parser.c:10867
0x657230 cp_parser_declaration_statement
../../gcc/gcc/cp/parser.c:10514
0x63fad7 cp_parser_statement
../../gcc/gcc/cp/parser.c:9274
0x640dde cp_parser_statement_seq_opt
../../gcc/gcc/cp/parser.c:9552
0x640f26 cp_parser_compound_statement
../../gcc/gcc/cp/parser.c:9506
0x6522db cp_parser_function_body
../../gcc/gcc/cp/parser.c:18318
0x6522db cp_parser_ctor_initializer_opt_and_function_body
../../gcc/gcc/cp/parser.c:18354
0x65331f cp_parser_function_definition_after_declarator
../../gcc/gcc/cp/parser.c:22338
0x654027 cp_parser_function_definition_from_specifiers_and_declarator
../../gcc/gcc/cp/parser.c:22259
0x654027 cp_parser_init_declarator
../../gcc/gcc/cp/parser.c:16347
Please submit a full bug report, [etc.]

Furthermore, IMHO the error message is bogus and the code should be accepted.


[Bug c++/58549] New: [c++1y] ICE with local function in function with auto parameter

2013-09-27 Thread reichelt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58549

Bug ID: 58549
   Summary: [c++1y] ICE with local function in function with auto
parameter
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: reichelt at gcc dot gnu.org

The following valid code snippet (compiled with -std=gnu++1y) triggers an ICE
on trunk (4.9.0 20130926):

===
void foo(auto)
{
  void bar();
}
===

bug.cc: In function 'void foo(auto1)':
bug.cc:4:1: internal compiler error: in finish_function, at cp/decl.c:13852
 }
 ^
0x56b38f finish_function(int)
../../gcc/gcc/cp/decl.c:13852
0x65333d cp_parser_function_definition_after_declarator
../../gcc/gcc/cp/parser.c:22344
0x654027 cp_parser_function_definition_from_specifiers_and_declarator
../../gcc/gcc/cp/parser.c:22259
0x654027 cp_parser_init_declarator
../../gcc/gcc/cp/parser.c:16347
0x6542df cp_parser_simple_declaration
../../gcc/gcc/cp/parser.c:10986
0x656140 cp_parser_block_declaration
../../gcc/gcc/cp/parser.c:10867
0x65f16e cp_parser_declaration
../../gcc/gcc/cp/parser.c:10764
0x65decd cp_parser_declaration_seq_opt
../../gcc/gcc/cp/parser.c:10650
0x65f7b6 cp_parser_translation_unit
../../gcc/gcc/cp/parser.c:3939
0x65f7b6 c_parse_file()
../../gcc/gcc/cp/parser.c:28898
0x772e94 c_common_parse_file()
../../gcc/gcc/c-family/c-opts.c:1046
Please submit a full bug report, [etc.]


[Bug c++/58548] [4.9 Regression] [c++1y] ICE with local struct in function with auto parameter

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58548

Marek Polacek mpolacek at gcc dot gnu.org changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-09-27
                 CC|                            |mpolacek at gcc dot gnu.org
   Target Milestone|---                         |4.9.0
            Summary|[c++1y] ICE with local      |[4.9 Regression] [c++1y]
                   |struct in function with     |ICE with local struct in
                   |auto parameter              |function with auto
                   |                            |parameter
     Ever confirmed|0                           |1

--- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org ---
Confirmed.  I can't comment on whether it's valid code or not though.


[Bug c++/58549] [4.9 Regression] [c++1y] ICE with local function in function with auto parameter

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58549

Marek Polacek mpolacek at gcc dot gnu.org changed:

           What    |Removed                     |Added

             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-09-27
                 CC|                            |mpolacek at gcc dot gnu.org
            Summary|[c++1y] ICE with local      |[4.9 Regression] [c++1y]
                   |function in function with   |ICE with local function in
                   |auto parameter              |function with auto
                   |                            |parameter
     Ever confirmed|0                           |1

--- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org ---
Confirmed with trunk, 4.8:
q.C:1:10: error: parameter declared ‘auto’
 void foo(auto)
  ^
Are these auto parameters really valid?  What's their purpose?

[Bug c++/58549] [4.9 Regression] [c++1y] ICE with local function in function with auto parameter

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58549

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0


[Bug target/58546] volatile bug and also larger code at -Os

2013-09-27 Thread ubizjak at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58546

Uroš Bizjak ubizjak at gmail dot com changed:

   What|Removed |Added

 CC||hubicka at gcc dot gnu.org

--- Comment #4 from Uroš Bizjak ubizjak at gmail dot com ---
(In reply to Andrew Pinski from comment #3)
 This is a target specific issue as the RTL looks fine from expand:

The splitter in question is the one with the comment:

;; Avoid redundant prefixes by splitting HImode arithmetic to SImode.

The splitter does check for aligned_operand operands, which in turn avoids
volatiles. However, outside of the operand, data layout is not known to the
predicate.

Let's ask Honza about this.

[Bug c++/58550] New: [4.9 Regression] ][c++0x] ICE with auto in function return type and lto

2013-09-27 Thread reichelt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58550

Bug ID: 58550
   Summary: [4.9 Regression] ][c++0x] ICE with auto in function
return type and lto
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: reichelt at gcc dot gnu.org

The following (probably invalid) code snippet triggers an ICE on trunk (4.9.0
20130926) when compiled with -std=c++0x -flto:


auto foo();

auto fp = foo;


bug.cc:1:10: warning: 'foo' function uses 'auto' type specifier without
trailing return type [enabled by default]
 auto foo();
  ^
bug.cc:3:14: internal compiler error: tree code 'template_type_parm' is not
supported in LTO streams
 auto fp = foo;
  ^
0xa17696 DFS_write_tree
../../gcc/gcc/lto-streamer-out.c:1244
0xa165c9 DFS_write_tree_body
../../gcc/gcc/lto-streamer-out.c:461
0xa165c9 DFS_write_tree
../../gcc/gcc/lto-streamer-out.c:1152
0xa165c9 DFS_write_tree_body
../../gcc/gcc/lto-streamer-out.c:461
0xa165c9 DFS_write_tree
../../gcc/gcc/lto-streamer-out.c:1152
0xa18907 lto_output_tree(output_block*, tree_node*, bool, bool)
../../gcc/gcc/lto-streamer-out.c:1334
0xa12cfc write_global_stream
../../gcc/gcc/lto-streamer-out.c:2084
0xa1a990 lto_output_decl_state_streams
../../gcc/gcc/lto-streamer-out.c:2128
0xa1a990 produce_asm_for_decls
../../gcc/gcc/lto-streamer-out.c:2413
0xa4e720 ipa_write_summaries_2
../../gcc/gcc/passes.c:2283
0xa4f799 ipa_write_summaries_1
../../gcc/gcc/passes.c:2314
0xa4f799 ipa_write_summaries()
../../gcc/gcc/passes.c:2371
0x807c5b ipa_passes
../../gcc/gcc/cgraphunit.c:2019
0x807c5b compile()
../../gcc/gcc/cgraphunit.c:2115
0x807ee9 finalize_compilation_unit()
../../gcc/gcc/cgraphunit.c:2269
0x61b2b0 cp_write_global_declarations()
../../gcc/gcc/cp/decl2.c:4360
Please submit a full bug report, [etc.]


In GCC 4.8.1 the code was rejected:
bug.cc:1:10: warning: 'foo' function uses 'auto' type specifier without
trailing return type [enabled by default]
 auto foo();
  ^
bug.cc:3:11: error: use of 'auto foo()' before deduction of 'auto'
 auto fp = foo;
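
For comparison, a variant that C++11 does accept gives the function a trailing return type, so nothing has to be deduced when its address is taken. This is a sketch with made-up names, not a testcase from the PR.

```cpp
#include <cassert>
#include <type_traits>

// Declared with a trailing return type: the return type is known at once.
auto answer () -> int;

// 'auto' here deduces a plain function pointer from the declaration above.
auto answer_ptr = answer;

static_assert (std::is_same<decltype (answer_ptr), int (*) ()>::value,
               "answer_ptr should be a pointer to int()");

auto answer () -> int
{
  return 42;
}

// Small helper that calls through the deduced pointer.
int call_answer ()
{
  assert (answer_ptr != nullptr);
  return answer_ptr ();
}
```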


[Bug c++/58549] [4.9 Regression] [c++1y] ICE with local function in function with auto parameter

2013-09-27 Thread reichelt at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58549

--- Comment #2 from Volker Reichelt reichelt at gcc dot gnu.org ---
To me they look like a (syntactically simpler) alternative to template
parameters. They were introduced here:

 2013-09-16  Adam Butcher  a...@jessamine.co.uk

   * cp-tree.h (type_uses_auto_or_concept): Declare.
   (is_auto_or_concept): Declare.
   * decl.c (grokdeclarator): Allow 'auto' parameters in lambdas with
   -std=gnu++1y or -std=c++1y or, as a GNU extension, in plain functions.

[...]
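
To make the "alternative to template parameters" reading concrete: under that interpretation 'void foo(auto)' stands for an implicit function template. The sketch below spells the template out explicitly, with hypothetical names, and contains the two constructs from PR 58548 and PR 58549 (a local struct and a local function declaration). Written this way the code is ordinary C++ and should compile without the extension.

```cpp
#include <cassert>

// Explicit spelling of what an 'auto' parameter is assumed to abbreviate.
template <typename T>
int size_via_local_struct (T value)
{
  struct A { int i; };      // local struct, as in PR 58548
  void bar ();              // local function declaration, as in PR 58549 (never called)

  A a = { static_cast<int> (sizeof (value)) };
  assert (a.i > 0);
  return a.i;
}
```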

[Bug c++/58550] [4.9 Regression] ][c++0x] ICE with auto in function return type and lto

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58550

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-09-27
 CC||mpolacek at gcc dot gnu.org
   Target Milestone|--- |4.9.0
 Ever confirmed|0   |1

--- Comment #1 from Marek Polacek mpolacek at gcc dot gnu.org ---
Confirmed with trunk.  Interesting is that with -std=gnu++1y:
w.C:3:11: error: use of ‘auto foo()’ before deduction of ‘auto’
 auto fp = foo;
   ^

[Bug c++/58549] [4.9 Regression] [c++1y] ICE with local function in function with auto parameter

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58549

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 CC||abutcher at gcc dot gnu.org

--- Comment #3 from Marek Polacek mpolacek at gcc dot gnu.org ---
Started with r202850.


[Bug middle-end/58547] [4.9 Regression] rtlanal.c:5482:19: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58547

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0


[Bug other/58545] [4.7/4.8/4.9 Regression] error: unable to find a register to spill in class 'POINTER_REGS'

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58545

Richard Biener rguenth at gcc dot gnu.org changed:

           What    |Removed                     |Added

   Target Milestone|---                         |4.7.4
            Summary|[4.7/4.8 Regression] error: |[4.7/4.8/4.9 Regression]
                   |unable to find a register   |error: unable to find a
                   |to spill in class           |register to spill in class
                   |'POINTER_REGS'              |'POINTER_REGS'

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Assuming 4.9 doesn't work either.


[Bug tree-optimization/58459] [4.9 regression] Loop invariant is not hoisted out of loop after r202525.

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58459

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
Author: rguenth
Date: Fri Sep 27 08:14:53 2013
New Revision: 202966

URL: http://gcc.gnu.org/viewcvs?rev=202966&root=gcc&view=rev
Log:
2013-09-27  Richard Biener  rguent...@suse.de

PR tree-optimization/58459
* tree-ssa-forwprop.c (forward_propagate_addr_expr): Remove
restriction not propagating into loops.

* gcc.dg/tree-ssa/ssa-pre-31.c: New testcase.

Added:
trunk/gcc/testsuite/gcc.dg/tree-ssa/ssa-pre-31.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-ssa-forwprop.c


[Bug c++/58550] [4.9 Regression] ][c++0x] ICE with auto in function return type and lto

2013-09-27 Thread mpolacek at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58550

Marek Polacek mpolacek at gcc dot gnu.org changed:

   What|Removed |Added

 CC||jason at gcc dot gnu.org

--- Comment #2 from Marek Polacek mpolacek at gcc dot gnu.org ---
This one seems to start with r198099 -- but it might be some other latent
issue...


[Bug middle-end/58547] [4.9 Regression] rtlanal.c:5482:19: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]

2013-09-27 Thread ebotcazou at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58547

Eric Botcazou ebotcazou at gcc dot gnu.org changed:

           What    |Removed                     |Added

             Target|hppa-unknown-linux-gnu      |
             Status|UNCONFIRMED                 |NEW
   Last reconfirmed|                            |2013-09-27
                 CC|                            |ebotcazou at gcc dot gnu.org
               Host|hppa-unknown-linux-gnu      |
     Ever confirmed|0                           |1
              Build|hppa-unknown-linux-gnu      |
           Severity|normal                      |major

--- Comment #1 from Eric Botcazou ebotcazou at gcc dot gnu.org ---
Confirmed on PowerPC.


[Bug middle-end/58551] New: [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

Bug ID: 58551
   Summary: [4.9 Regression] ICE with abort in OpenMP SESE region
inside of some loop
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Keywords: ice-on-valid-code
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jakub at gcc dot gnu.org

/* { dg-do compile } */
/* { dg-options "-O0 -fopenmp" } */

void
foo (int *a)
{
  int i;
  for (i = 0; i < 8; i++)
#pragma omp task
    if (a[i])
      __builtin_abort ();
}

ICEs in 4.9, because __builtin_abort () bb after outlining the SESE region has
bogus loop_father.


[Bug middle-end/58547] [4.9 Regression] rtlanal.c:5482:19: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]

2013-09-27 Thread iains at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58547

--- Comment #2 from Iain Sandoe iains at gcc dot gnu.org ---
Author: iains
Date: Fri Sep 27 08:59:18 2013
New Revision: 202967

URL: http://gcc.gnu.org/viewcvs?rev=202967&root=gcc&view=rev
Log:
gcc:

PR middle-end/58547
* rtlanal.c (lsb_bitfield_op_p): Make both parts of the comparison
signed.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/rtlanal.c
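
For anyone unfamiliar with the warning: in a mixed comparison the signed operand is converted to unsigned first, which changes the result for negative values. The sketch below uses hypothetical names and is not the rtlanal.c code; the first function spells out the conversion a mixed comparison performs, the second does what the ChangeLog describes and makes both sides signed.

```cpp
#include <cassert>
#include <climits>

// What "pos < len" does when pos is int and len is unsigned:
// pos is converted to unsigned before the comparison.
bool less_after_unsigned_conversion (int pos, unsigned int len)
{
  return static_cast<unsigned int> (pos) < len;
}

// Both parts of the comparison signed.
bool less_signed (int pos, unsigned int len)
{
  assert (len <= static_cast<unsigned int> (INT_MAX));  // cast below must preserve the value
  return pos < static_cast<int> (len);
}
```

With pos = -1 and len = 8 the first form answers false (since -1 becomes UINT_MAX) and the second answers true.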


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

--- Comment #1 from Jakub Jelinek jakub at gcc dot gnu.org ---
Another testcase that ICEs even with -O2 -fopenmp:
/* { dg-do compile } */
/* { dg-options "-O2 -fopenmp" } */

void bar (int, int);
void
foo (int *a)
{
  int i;
  for (i = 0; i < 8; i++)
#pragma omp task
    if (a[i])
      {
        int j, k;
        for (j = 0; j < 10; j++)
          for (k = 0; k < 8; k++)
            bar (j, k);
        for (k = 0; k < 12; k++)
          bar (-1, k);
        __builtin_abort ();
      }
}


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2013-09-27
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #2 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 30907
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30907&action=edit
gcc49-pr58551.patch

Untested fix.


[Bug tree-optimization/58459] [4.9 regression] Loop invariant is not hoisted out of loop after r202525.

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58459

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #8 from Richard Biener rguenth at gcc dot gnu.org ---
Fixed.


[Bug sanitizer/58543] Invalid unpoisoning of stack redzones on ARM

2013-09-27 Thread y.gribov at samsung dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543

--- Comment #3 from Yury Gribov y.gribov at samsung dot com ---
Created attachment 30908
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30908&action=edit
Test results

Tests seem to pass both on x86_64 and on ARM (attached).


[Bug tree-optimization/58532] [4.9 Regression] bootstrap failure with BOOT_CFLAGS=-g -O3

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58532

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|WAITING |ASSIGNED

--- Comment #4 from Richard Biener rguenth at gcc dot gnu.org ---
Ok, I reproduced it.

Bootstrap comparison failure!
gcc/dwarf2out.o differs
gcc/fortran/parse.o differs
libiberty/regex.o differs
libiberty/pic/regex.o differs

somehow GCC has miscompiled itself.


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

Jakub Jelinek jakub at gcc dot gnu.org changed:

           What    |Removed                     |Added

  Attachment #30907|0                           |1
        is obsolete|                            |

--- Comment #3 from Jakub Jelinek jakub at gcc dot gnu.org ---
Created attachment 30909
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30909&action=edit
gcc49-pr58551.patch

Updated untested patch that should also fix num_nodes adjustments.


[Bug tree-optimization/58532] [4.9 Regression] bootstrap failure with BOOT_CFLAGS=-g -O3

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58532

--- Comment #5 from Richard Biener rguenth at gcc dot gnu.org ---
There is a compare-debug failure on fortran/parse.o at least, reducing that.


[Bug target/58507] Incorrect parsing of `-mmcu=msp430*`

2013-09-27 Thread nickc at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58507

--- Comment #1 from Nick Clifton nickc at redhat dot com ---
Created attachment 30910
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30910&action=edit
Fix objdump output

Proposed patch to fix objdump output


[Bug tree-optimization/58532] [4.9 Regression] bootstrap failure with BOOT_CFLAGS=-g -O3

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58532

--- Comment #6 from Richard Biener rguenth at gcc dot gnu.org ---
One difference happens in 057.cunrolli already, we create a preheader for a
loop depending on -g:

 ;; Function bool gfc_parse_file() (_Z14gfc_parse_filev, funcdef_no=257, decl_uid=17369, symbol_order=156)

 Created preheader block for loop 4
-Created preheader block for loop 5
 ;; 10 loops found

...

@@ -18009,13 +19254,14 @@
   <bb 109>:
   st_228 = parse_spec (131);

-  # st_470 = PHI <st_228(109)>
-  # error_471 = PHI <0(109)>
+  # st_229 = PHI <st_228(109), st_232(124)>
+  # error_233 = PHI <0(109), 1(124)>
 loop:
-
-  <bb 205>:
-  # st_229 = PHI <st_470(110), st_232(207)>
-  # error_233 = PHI <error_471(110), 1(207)>
+  # DEBUG st = NULL
+  # DEBUG error = NULL
+  # DEBUG st = NULL
+  # DEBUG error = error_233
+  # DEBUG st = st_229
   st.280_230 = (int) st_229;
   switch (st.280_230) <default: <L61>, case 12: <L59>, case 33: <L60>, case 131: <L58>>

...

   <bb 124>:
   st_232 = next_statement ();
-
-  <bb 207>:
-  goto <bb 205>;
+  goto <bb 110> (loop);

(to investigate).


[Bug tree-optimization/58552] New: [4.9 Regression] -fcompare-debug failure

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58552

Bug ID: 58552
   Summary: [4.9 Regression] -fcompare-debug failure
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: rguenth at gcc dot gnu.org

Created attachment 30911
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30911&action=edit
testcase

spuriously reduced from a -O3 bootstrap miscompare.  Fails at -O2.


[Bug tree-optimization/58552] [4.9 Regression] -fcompare-debug failure

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58552

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

   Target Milestone|--- |4.9.0


[Bug tree-optimization/58532] [4.9 Regression] bootstrap failure with BOOT_CFLAGS=-g -O3

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58532

--- Comment #7 from Richard Biener rguenth at gcc dot gnu.org ---
First testcase reduction ended in PR58552, re-reducing.


[Bug lto/58528] lto1: internal compiler error: in build_abbrev_table, at dwarf2out.c:7478

2013-09-27 Thread charles.frasch at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58528

--- Comment #6 from Charles charles.frasch at gmail dot com ---
Created attachment 30912
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30912&action=edit
script to reproduce the ICE

This script reproduces the bug. It requires 27 .ii files and one archive file
of Google's gtest 1.6.0. If this is acceptable, I will either attach the .ii
files or send you a tarball directly.


[Bug c/53001] -Wfloat-conversion should be available to warn about floating point errors

2013-09-27 Thread jjcogliati-r1 at yahoo dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=53001

--- Comment #19 from Joshua Cogliati jjcogliati-r1 at yahoo dot com ---
Created attachment 30913
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30913&action=edit
Patch to add -Wfloat-conversion option against trunk

This version is against gcc trunk (rev 202818).  It now bootstraps.  

It adds about ten casts so that the existing float conversions in gcc are now
explicit instead of implicit so that gcc can bootstrap even with the new
warning.
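
The kind of change involved looks roughly like the following sketch. The function names are made up and this is not code from the patch; the point is only that a conversion which may lose precision is written as an explicit cast, so a warning about implicit conversions has nothing to report.

```cpp
#include <cassert>

// double -> float narrowing, stated explicitly.
float narrow_to_float (double d)
{
  return static_cast<float> (d);
}

// double -> int truncation toward zero, stated explicitly.
int truncate_to_int (double d)
{
  // Converting an out-of-range value is undefined, so guard the range.
  assert (d > -2147483649.0 && d < 2147483648.0);
  return static_cast<int> (d);
}
```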


[Bug tree-optimization/58552] [4.9 Regression] -fcompare-debug failure

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58552

Richard Biener rguenth at gcc dot gnu.org changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2013-09-27
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener rguenth at gcc dot gnu.org ---
starts already with early inlining.


[Bug tree-optimization/58552] [4.9 Regression] -fcompare-debug failure

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58552

--- Comment #2 from Richard Biener rguenth at gcc dot gnu.org ---
Reduced:

extern void fancy_abort () __attribute__ ((__noreturn__));
extern "C" {
struct __jmp_buf_tag   { };
typedef struct __jmp_buf_tag jmp_buf[1];
extern int _setjmp (struct __jmp_buf_tag __env[1]) throw ();
}
extern void *gfc_state_stack;
static jmp_buf eof_buf;
static void push_state ()
{
  if (!gfc_state_stack)
    fancy_abort ();
}
bool gfc_parse_file (void)
{
  int seen_program=0;
  if (_setjmp (eof_buf))
    return false;
  if (seen_program)
    goto duplicate_main;
  seen_program = 1;
  push_state ();
  push_state ();
duplicate_main:
  return true;
}


[Bug tree-optimization/58552] [4.9 Regression] -fcompare-debug failure

2013-09-27 Thread rguenth at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58552

--- Comment #3 from Richard Biener rguenth at gcc dot gnu.org ---
Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 202971)
+++ gcc/tree-cfg.c  (working copy)
@@ -1013,6 +1013,9 @@ make_abnormal_goto_edges (basic_block bb
  break;
}
}
+  if (!gsi_end_p (gsi)
+      && is_gimple_debug (gsi_stmt (gsi)))
+   gsi_next_nondebug (&gsi);
   if (!gsi_end_p (gsi))
{
  /* Make an edge to every setjmp-like call.  */

fixes it.


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

--- Comment #4 from Jakub Jelinek jakub at gcc dot gnu.org ---
Author: jakub
Date: Fri Sep 27 13:44:10 2013
New Revision: 202972

URL: http://gcc.gnu.org/viewcvs?rev=202972&root=gcc&view=rev
Log:
PR middle-end/58551
* tree-cfg.c (move_sese_region_to_fn): Also move loops that
are children of outermost saved_cfun's loop, and set it up to
be moved to dest_cfun's outermost loop.  Fix up num_nodes adjustments
if loop != loop0 and SESE region contains bbs that belong to loop0.

* c-c++-common/gomp/pr58551.c: New test.

Added:
trunk/gcc/testsuite/c-c++-common/gomp/pr58551.c
Modified:
trunk/gcc/ChangeLog
trunk/gcc/testsuite/ChangeLog
trunk/gcc/tree-cfg.c


[Bug libstdc++/57465] Failed postcondition for std::function constructed with null function pointer

2013-09-27 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57465

--- Comment #1 from Jonathan Wakely redi at gcc dot gnu.org ---
Author: redi
Date: Fri Sep 27 14:06:09 2013
New Revision: 202974

URL: http://gcc.gnu.org/viewcvs?rev=202974&root=gcc&view=rev
Log:
PR libstdc++/57465
* include/std/functional
(_Function_base::_Base_manager::_M_not_empty_function): Fix overload
for pointers.
* testsuite/20_util/function/cons/57465.cc: New.

Added:
trunk/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
Modified:
trunk/libstdc++-v3/ChangeLog
trunk/libstdc++-v3/include/std/functional
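
The postcondition in question is that a std::function constructed from a null function pointer must itself be empty. A minimal check, with hypothetical names and not the contents of the new 57465.cc test, could look like this:

```cpp
#include <cassert>
#include <functional>

typedef int (*unary_fn) (int);

// True if a std::function built from p reports itself as empty.
bool empty_when_built_from (unary_fn p)
{
  std::function<int (int)> f (p);
  return !f;
}

int twice (int x)
{
  return 2 * x;
}

// Expected behaviour on a conforming library.
void check_postcondition ()
{
  assert (empty_when_built_from (nullptr));   // null pointer gives an empty function
  assert (!empty_when_built_from (twice));    // real function gives a non-empty one
}
```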


[Bug libstdc++/57465] Failed postcondition for std::function constructed with null function pointer

2013-09-27 Thread redi at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=57465

--- Comment #2 from Jonathan Wakely redi at gcc dot gnu.org ---
Fixed on the trunk so far.


[Bug libfortran/58015] FAIL: gfortran.dg/round_4.f90: Unsatisfied symbol nextafterl

2013-09-27 Thread dave.anglin at bell dot net
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58015

--- Comment #4 from dave.anglin at bell dot net ---
On 9/21/2013 11:13 AM, dominiq at lps dot ens.fr wrote:
 http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58015

 --- Comment #2 from Dominique d'Humieres dominiq at lps dot ens.fr ---
 Is this PR different from pr58113 beside the missing nextafterl on
 hppa64-hp-hpux11.11?
I hacked c99_functions.c to provide nextafterl using nextafterq from
libquadmath.  With this, I see the bug in pr58113.

Regarding nextafterl, I'm thinking about an include hack to math.h for
hppa*-*-hpux11*.  On all HP-UX systems, the "l" and "q" long double and quad
math functions are equivalent.

Dave


[Bug tree-optimization/58359] __builtin_unreachable prevents vectorization

2013-09-27 Thread a.sinyavin at samsung dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58359

--- Comment #4 from Anatoly Sinyavin a.sinyavin at samsung dot com ---
Created attachment 30914
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30914&action=edit
First patch


[Bug tree-optimization/58359] __builtin_unreachable prevents vectorization

2013-09-27 Thread a.sinyavin at samsung dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58359

--- Comment #5 from Anatoly Sinyavin a.sinyavin at samsung dot com ---
Created attachment 30915
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30915&action=edit
Second patch


[Bug tree-optimization/58359] __builtin_unreachable prevents vectorization

2013-09-27 Thread a.sinyavin at samsung dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58359

--- Comment #6 from Anatoly Sinyavin a.sinyavin at samsung dot com ---
I have created two patches to fix this problem.

   The first patch (bug_fix_58359_builit_unreachable.patch) just moves the
functionality of optimize_unreachable from the fab pass to the cfg pass.

   The second patch (bug_fix_58359_builit_unreachable.AGGRESSIVE.patch) is a more
aggressive variant. The original implementation of optimize_unreachable doesn't
delete a basic block if there is a FORCED_LABEL, a non-debug statement, or a
function call before __builtin_unreachable in that basic block.
   I think we can't delete a basic block if it contains some statement X before
__builtin_unreachable that can potentially transfer control out of the basic
block and never return. That is possible in two cases: if statement X is a
procedure call (without return) or an assembler instruction. (See also the
__builtin_unreachable description.)
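
The point about a statement X that leaves the block and never returns can be illustrated with a small sketch. This is not taken from either patch; fatal_handler, checked_div, try_div, recover_point and log_calls are hypothetical names, and longjmp stands in for any non-returning call.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf recover_point;   /* hypothetical recovery point for the demo */
static int log_calls;           /* counts how often the handler ran */

/* Never returns to its caller: control leaves via longjmp. */
static void fatal_handler(void)
{
    log_calls++;
    longjmp(recover_point, 1);
}

/* The call before __builtin_unreachable has to survive, because it is the
   statement that actually transfers control out of the block. */
static int checked_div(int num, int den)
{
    if (den == 0) {
        fatal_handler();
        __builtin_unreachable();
    }
    return num / den;
}

/* Returns the quotient, or -1 if the handler fired. */
static int try_div(int num, int den)
{
    if (setjmp(recover_point) != 0)
        return -1;
    return checked_div(num, den);
}
```

If the whole block guarded by den == 0 were deleted, the handler call would disappear with it, which is the case the comment argues must be preserved.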


[Bug middle-end/58463] ICE with -fdump-tree-all-all in vector indexed access

2013-09-27 Thread pmatos at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58463

--- Comment #8 from pmatos at gcc dot gnu.org ---
Author: pmatos
Date: Fri Sep 27 14:54:43 2013
New Revision: 202976

URL: http://gcc.gnu.org/viewcvs?rev=202976&root=gcc&view=rev
Log:
PR middle-end/58463
* gcc.dg/pr58463.c: New test.

Added:
trunk/gcc/testsuite/gcc.dg/pr58463.c
Modified:
trunk/gcc/ChangeLog


[Bug target/58507] Incorrect parsing of `-mmcu=msp430*`

2013-09-27 Thread nickc at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58507

--- Comment #2 from Nick Clifton nickc at redhat dot com ---
Created attachment 30916
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30916&action=edit
Add parsing of known MSP430 MCU types

I am currently testing this patch to see if it introduces any regressions into
the gcc testsuite...


[Bug tree-optimization/58463] ICE with -fdump-tree-all-all in vector indexed access

2013-09-27 Thread pmatos at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58463

--- Comment #9 from pmatos at gcc dot gnu.org ---
Author: pmatos
Date: Fri Sep 27 16:30:15 2013
New Revision: 202978

URL: http://gcc.gnu.org/viewcvs?rev=202978&root=gcc&view=rev
Log:
Backport from mainline.

2013-09-27  Paulo Matos  pma...@broadcom.com
PR middle-end/58463
* gcc.dg/pr58463.c: New test.

Added:
branches/gcc-4_8-branch/gcc/testsuite/gcc.dg/pr58463.c
Modified:
branches/gcc-4_8-branch/gcc/ChangeLog


[Bug target/56716] during gcc 4.8.0 build on Cygwin: bid128_fma.c:4460:1: internal compiler error: Segmentation fault

2013-09-27 Thread pmatos at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56716

--- Comment #11 from pmatos at gcc dot gnu.org ---
Author: pmatos
Date: Fri Sep 27 16:44:39 2013
New Revision: 202979

URL: http://gcc.gnu.org/viewcvs?rev=202979&root=gcc&view=rev
Log:
Backport from mainline.

 PR middle-end/58463
 2013-03-27  Richard Biener  rguent...@suse.de

 PR tree-optimization/56716
 * tree-ssa-structalias.c (perform_var_substitution): Adjust
 dumping for ref nodes.

Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/tree-ssa-structalias.c


[Bug middle-end/58463] ICE with -fdump-tree-all-all in vector indexed access

2013-09-27 Thread pmatos at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58463

--- Comment #10 from pmatos at gcc dot gnu.org ---
Author: pmatos
Date: Fri Sep 27 16:44:39 2013
New Revision: 202979

URL: http://gcc.gnu.org/viewcvs?rev=202979&root=gcc&view=rev
Log:
Backport from mainline.

 PR middle-end/58463
 2013-03-27  Richard Biener  rguent...@suse.de

 PR tree-optimization/56716
 * tree-ssa-structalias.c (perform_var_substitution): Adjust
 dumping for ref nodes.

Modified:
branches/gcc-4_8-branch/gcc/ChangeLog
branches/gcc-4_8-branch/gcc/tree-ssa-structalias.c


[Bug tree-optimization/58553] New: New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread jgreenhalgh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug ID: 58553
   Summary: New fail in PASS-FAIL:
gcc.c-torture/execute/memcpy-2.c execution on arm and
aarch64
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: jgreenhalgh at gcc dot gnu.org

Created attachment 30917
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30917&action=edit
Preprocessed source

Jeff's change to the Jump-Threading code here:
http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01910.html

Introduced a regression for arm and aarch64 in 
gcc.c-torture/execute/memcpy-2.c, such that I now see:

 *** EXIT code
 emu: host signal 0

When executing the testcase on a model with command line:

/work/gcc-clean/build-arm-none-eabi/install/bin/arm-none-eabi-gcc
-B/work/gcc-clean/build-arm-none-eabi/obj/gcc2/gcc/
/work/gcc-clean/src/gcc/gcc/testsuite/gcc.c-torture/execute/memcpy-2.c
-fno-diagnostics-show-caret -fdiagnostics-color=never -w -O3 -g
-Wa,-mno-warn-deprecated -lm -marm -march=armv7-a -mfpu=vfpv3-d16
-mfloat-abi=softfp -o
/work/gcc-clean/build-arm-none-eabi/obj/gcc2/gcc/testsuite/gcc/memcpy-2.x
-save-temps

I've attached the preprocessed source and the output from
-fdump-tree-dom1-details


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread jgreenhalgh at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #1 from jgreenhalgh at gcc dot gnu.org ---
Created attachment 30918
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30918&action=edit
Output of dom1


[Bug middle-end/58463] ICE with -fdump-tree-all-all in vector indexed access

2013-09-27 Thread pa...@matos-sorge.com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58463

Paulo J. Matos pa...@matos-sorge.com changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #11 from Paulo J. Matos pa...@matos-sorge.com ---
Backported Richard's patch to branch 4.8 under r202979.
Will mark as fixed.


[Bug tree-optimization/58554] New: Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk

2013-09-27 Thread pthaugen at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

Bug ID: 58554
   Summary: Revision 202619 causes runtime failure in CPU2006
benchmark 445.gobmk
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: pthaugen at gcc dot gnu.org
CC: bergner at gcc dot gnu.org, dje.gcc at gmail dot com,
rguenth at gcc dot gnu.org
  Host: powerpc64-linux
Target: powerpc64-linux
 Build: powerpc64-linux

gobmk started failing at runtime with the stated revision. Tracked down
offending code (from benchmark source engine/board.c) and reduced to the
following. Generated code is ignoring control dependence and simply calling
memset to set the entire array.

[pthaugen@igoo build_base_test_32.]$ cat junk.c
extern int board_size;
extern unsigned char board[421];

void clear_board(void)
{
  int k;

  for (k = 0; k < 421; k++) {
/* Original:
if (!((unsigned) (((k) / (19 + 1) - 1)) < (unsigned) board_size &&
(unsigned) (((k) % (19 + 1) - 1)) < (unsigned) board_size)) */
if (k  board_size )
  board[k] = 3;
  }
}
[pthaugen@igoo build_base_test_32.]$
/home/pthaugen/install/gcc/trunk_work/bin/gcc -S -m32 -O3 junk.c

Generated assembler for the function:
clear_board:
lis 3,board@ha
li 4,3
la 3,board@l(3)
li 5,421
b memset
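
A minimal sketch of a runtime check for this miscompile, assuming the reduced test guards the store with k < board_size (the comparison operator was lost in this archive copy, so the direction is an assumption). board_matches, check_clear_board and BOARD_CELLS are hypothetical helpers, not part of the report.

```c
#include <assert.h>
#include <string.h>

#define BOARD_CELLS 421

int board_size;
unsigned char board[BOARD_CELLS];

/* Same shape as the reduced test case: a store guarded by a condition. */
void clear_board(void)
{
    int k;
    for (k = 0; k < BOARD_CELLS; k++) {
        if (k < board_size)     /* assumed operator, see the note above */
            board[k] = 3;
    }
}

/* Returns 1 if exactly the first `size` cells hold 3 and the rest hold `fill`. */
int board_matches(int size, unsigned char fill)
{
    int k;
    for (k = 0; k < BOARD_CELLS; k++) {
        unsigned char expected = (k < size) ? 3 : fill;
        if (board[k] != expected)
            return 0;
    }
    return 1;
}

/* Runs the loop for a given size; returns 1 when the store stayed conditional. */
int check_clear_board(int size)
{
    memset(board, 7, sizeof board);
    board_size = size;
    clear_board();
    return board_matches(size, 7);
}
```

With the unconditional memset shown in the assembler above, check_clear_board(19) would return 0, because cells 19..420 would also be overwritten.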


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #2 from Jeffrey A. Law law at redhat dot com ---
James.  Look in the .ldist dump.  In particular look at that memset call. 
We're writing off the end of the structure.  Now to walk backwards and figure
out why  :-)


[Bug middle-end/58551] [4.9 Regression] ICE with abort in OpenMP SESE region inside of some loop

2013-09-27 Thread jakub at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58551

Jakub Jelinek jakub at gcc dot gnu.org changed:

   What|Removed |Added

 Status|ASSIGNED|RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Jakub Jelinek jakub at gcc dot gnu.org ---
Fixed.


[Bug tree-optimization/58554] [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk

2013-09-27 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

   Keywords||wrong-code
   Target Milestone|--- |4.9.0
Summary|Revision 202619 causes  |[4.9 Regression] Revision
   |runtime failure in CPU2006  |202619 causes runtime
   |benchmark 445.gobmk |failure in CPU2006
   ||benchmark 445.gobmk
   Severity|normal  |blocker


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread pinskia at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Andrew Pinski pinskia at gcc dot gnu.org changed:

   What|Removed |Added

 Depends on||58554

--- Comment #3 from Andrew Pinski pinskia at gcc dot gnu.org ---
This sounds like bug 58554.


[Bug c++/58555] New: Floating point exception in want_inline_self_recursive_call_p

2013-09-27 Thread dcb314 at hotmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58555

Bug ID: 58555
   Summary: Floating point exception in
want_inline_self_recursive_call_p
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com

Created attachment 30919
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30919&action=edit
gzipped C++ source code

I just tried to compile package flamerobin-0.9.3-4.20130401snap with
gcc 4.9 trunk dated 20130925. It said

./src/metadata/root.cpp:375:1: internal compiler error: Floating point
exception
 }
 ^
0xafbfff crash_signal
../../src/trunk/gcc/toplev.c:335
0x50ed95 want_inline_self_recursive_call_p
../../src/trunk/gcc/ipa-inline.c:699
0xf72320 inline_small_functions
../../src/trunk/gcc/ipa-inline.c:1756
0xf72320 ipa_inline
../../src/trunk/gcc/ipa-inline.c:2009
0xf72320 execute
../../src/trunk/gcc/ipa-inline.c:2379
Please submit a full bug report,
with preprocessed source if appropriate.

Preprocessed source code attached. Flag -O3 required.

Checking the compiler source code, the offending line is

  if (!max_count
      && (edge->frequency * CGRAPH_FREQ_BASE / caller_freq
          >= max_prob))

I speculate that caller_freq == 0 and someone has missed out
a belt'n'braces check for zero before making the division.
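
For illustration only, the kind of zero check speculated about here could look like the sketch below. It is not the GCC fix; ratio_reaches_limit and FREQ_BASE are hypothetical stand-ins, and returning 0 for a zero caller frequency is an assumed policy. On x86, an integer division by zero raises SIGFPE, which is why the crash is reported as a "Floating point exception".

```c
#include <assert.h>

#define FREQ_BASE 1000  /* stand-in for CGRAPH_FREQ_BASE; the value is an assumption */

/* Returns 1 if the scaled edge/caller frequency ratio reaches max_prob.
   A zero caller frequency is treated as "no usable profile" and reported
   as 0 instead of dividing. */
int ratio_reaches_limit(int edge_freq, int caller_freq, int max_prob)
{
    if (caller_freq == 0)
        return 0;
    return edge_freq * FREQ_BASE / caller_freq >= max_prob;
}
```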


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

--- Comment #4 from Jeffrey A. Law law at redhat dot com ---
Andrew.  Yes it does.  I've never looked at the ldist code, but the dump seems
a bit strange:
Analyzing # of iterations of loop 3
  exit condition [1, + , 1](no_overflow) != 96
  bounds on difference of bases: 95 ... 95
  result:
# of iterations 95, bounded by 95

  __builtin_memset (MEM[(void *)u1 + 1B], 97, 96);


So it determined the right iteration count but mucked up the count in the call
to memset ?!?  Weird


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Jeffrey A. Law law at redhat dot com changed:

   What|Removed |Added

 CC||pthaugen at gcc dot gnu.org

--- Comment #5 from Jeffrey A. Law law at redhat dot com ---
*** Bug 58554 has been marked as a duplicate of this bug. ***


[Bug tree-optimization/58554] [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

Jeffrey A. Law law at redhat dot com changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 CC||law at redhat dot com
 Resolution|--- |DUPLICATE

--- Comment #1 from Jeffrey A. Law law at redhat dot com ---
Duplicate.

*** This bug has been marked as a duplicate of bug 58553 ***


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in 
CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE


[Bug tree-optimization/58553] New fail in PASS-FAIL: gcc.c-torture/execute/memcpy-2.c execution on arm and aarch64

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58553

Bug 58553 depends on bug 58554, which changed state.

Bug 58554 Summary: [4.9 Regression] Revision 202619 causes runtime failure in 
CPU2006 benchmark 445.gobmk
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

   What|Removed |Added

 Status|RESOLVED|REOPENED
 Resolution|DUPLICATE   |---


[Bug tree-optimization/58554] [4.9 Regression] Revision 202619 causes runtime failure in CPU2006 benchmark 445.gobmk

2013-09-27 Thread law at redhat dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58554

Jeffrey A. Law law at redhat dot com changed:

   What|Removed |Added

 Status|RESOLVED|REOPENED
   Last reconfirmed||2013-09-27
 Resolution|DUPLICATE   |---
 Ever confirmed|0   |1

--- Comment #2 from Jeffrey A. Law law at redhat dot com ---
Since this doesn't depend on the recent threading changes to trigger, I'm
keeping this open as I'll probably revert a tiny piece of the threading changes
which will make 58553 go latent.


[Bug c++/58555] Floating point exception in want_inline_self_recursive_call_p

2013-09-27 Thread markus at trippelsdorf dot de
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58555

Markus Trippelsdorf markus at trippelsdorf dot de changed:

   What|Removed |Added

 CC||markus at trippelsdorf dot de

--- Comment #1 from Markus Trippelsdorf markus at trippelsdorf dot de ---
Created attachment 30920
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30920&action=edit
reduced testcase


[Bug tree-optimization/58556] New: gen-vect-26.c / gen-vect-28.c regression merging from r202839 to r202981

2013-09-27 Thread amylaar at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58556

Bug ID: 58556
   Summary: gen-vect-26.c / gen-vect-28.c regression merging from
r202839 to r202981
   Product: gcc
   Version: 4.9.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amylaar at gcc dot gnu.org
Target: arc-elf32

Created attachment 30921
  --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=30921&action=edit
gen-vect-26.c.114t.vect dump file

I just merged in trunk from https://github.com/mirrors/gcc.git, and I see
four new failures (in just four days):
82870c82883
< PASS: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect vectorized 1 loops 1
---
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect vectorized 1 loops 1
82872c82885
< PASS: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect Alignment of access forced using peeling 1
---
> FAIL: gcc.dg/tree-ssa/gen-vect-26.c scan-tree-dump-times vect Alignment of access forced using peeling 1
82875c82888
< PASS: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect vectorized 1 loops 1
---
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect vectorized 1 loops 1
82877c82890
< PASS: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect Alignment of access forced using peeling 1
---
> FAIL: gcc.dg/tree-ssa/gen-vect-28.c scan-tree-dump-times vect Alignment of access forced using peeling 1


[Bug target/58490] __sync_bool_compare_and_swap sign bit failure

2013-09-27 Thread erikvanderwerf at gmail dot com
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58490

--- Comment #3 from Erik van der Werf erikvanderwerf at gmail dot com ---
I'm sorry, that patch definitely looks relevant, and I'd like to try it, but
somehow I did not manage to rebuild the arm-linux-gnueabi-gcc-4.7 package. 

I'm not a gcc expert, and trying to figure out how to configure the build for
cross compilation turns out to be rather time consuming, so for now I'll just
stay with gcc-4.6.

BTW I also tried the new atomic built-ins (__atomic_compare_exchange) and those
show the exact same problem.


RE: [PATCH]Fix computation of offset in ivopt

2013-09-27 Thread bin.cheng


 -Original Message-
 From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
 ow...@gcc.gnu.org] On Behalf Of bin.cheng
 Sent: Friday, September 27, 2013 1:07 PM
 To: 'Richard Biener'
 Cc: GCC Patches
 Subject: RE: [PATCH]Fix computation of offset in ivopt
 
 
 
  -Original Message-
  From: Richard Biener [mailto:richard.guent...@gmail.com]
  Sent: Tuesday, September 24, 2013 6:31 PM
  To: Bin Cheng
  Cc: GCC Patches
  Subject: Re: [PATCH]Fix computation of offset in ivopt
 
  On Tue, Sep 24, 2013 at 11:13 AM, bin.cheng bin.ch...@arm.com wrote:
 
  +   field = TREE_OPERAND (expr, 1);
  +   if (DECL_FIELD_BIT_OFFSET (field)
  +       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
  +     boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
  +
  +   tmp = component_ref_field_offset (expr);
  +   if (top_compref
  +       && cst_and_fits_in_hwi (tmp))
  +     {
  +       /* Strip the component reference completely.  */
  +       op0 = TREE_OPERAND (expr, 0);
  +       op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
  +       *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
  +       return op0;
  +     }
 
  the failure paths seem mangled, that is, if cst_and_fits_in_hwi is false
  for either offset part you may end up doing half accounting and not
  stripping.

  Btw, DECL_FIELD_BIT_OFFSET is always non-NULL.  I suggest to rewrite to

   if (!inside_addr)
     return orig_expr;

   tmp = component_ref_field_offset (expr);
   field = TREE_OPERAND (expr, 1);
   if (top_compref
       && cst_and_fits_in_hwi (tmp)
       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
     {
       ...
     }
 Will be refined.
 
 
  note that this doesn't really handle overflows correctly as
 
  +   *offset = off0 + int_cst_value (tmp) + boffset /
  + BITS_PER_UNIT;
 
  may still overflow.
 Since it's unsigned + signed + signed, according to implicit conversion, the
 signed operand will be converted to unsigned, so the overflow would only
 happen when off0 is a huge number and tmp/boffset is a large positive number,
 right?  Do I need to check whether off0 is larger than the overflowed result?
 Also there is a signed-unsigned problem here, see below.
 
 
  @@ -4133,6 +4142,9 @@ get_computation_cost_at (struct ivopts_data
 *data,
   bitmap_clear (*depends_on);
   }
 
  +  /* Sign-extend offset if utype has lower precision than
  +     HOST_WIDE_INT.  */
  +  offset = sext_hwi (offset, TYPE_PRECISION (utype));
  +
 
  offset is computed elsewhere in difference_cost and the bug to me
  seems that it is unsigned.  sign-extending it here is odd at least
  (and the extension should probably happen at sizetype precision, not
  that of utype).
 I agree.  The root cause is in strip_offset_1, in which offset is computed.
 Every time offset is computed in this function with a signed operand (like
 int_cst_value (tmp) above), we need to take care of the possible negative
 number problem.  Take this case as an example; we need to make the change
 below:
 
   case INTEGER_CST:
   //...
   *offset = int_cst_value (expr);
 change to
   case INTEGER_CST:
   //...
   *offset = sext_hwi (int_cst_value (expr), type);
 
 and
   case MULT_EXPR:
   //...
   *offset = sext_hwi (int_cst_value (expr), type);
 to
   case MULT_EXPR:
   //...
  HOST_WIDE_INT xxx = (HOST_WIDE_INT)off0 * int_cst_value (op1);
   *offset = sext_hwi (xxx, type);
 
 Any comments?

On second thought, I guess we can compute a signed offset in strip_offset_1
and sign-extend it for strip_offset, so we don't need to change every
computation of offset in that function.
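
The sign-extension step under discussion can be sketched in standalone C. The function below is an analogue written for illustration, not GCC's sext_hwi itself (the patch hunk quoted above calls sext_hwi with a precision in bits); the name sign_extend and the use of fixed 64-bit types are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low `prec` bits of `src` to 64 bits (1 <= prec <= 64). */
int64_t sign_extend(uint64_t src, unsigned prec)
{
    assert(prec >= 1 && prec <= 64);
    if (prec == 64)
        return (int64_t)src;
    uint64_t mask = ((uint64_t)1 << prec) - 1;
    uint64_t sign = (uint64_t)1 << (prec - 1);
    uint64_t low = src & mask;
    /* (low ^ sign) - sign maps values with the sign bit set onto negatives. */
    return (int64_t)((low ^ sign) - sign);
}
```

For example, a 32-bit offset of -4 that was widened as unsigned arrives as 0xFFFFFFFC; extending it at precision 32 recovers -4 rather than a large positive value.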

Thanks.
bin





Re: OMP4/cilkplus: simd clone function mangling

2013-09-27 Thread Richard Biener
On Thu, Sep 26, 2013 at 9:35 PM, Aldy Hernandez al...@redhat.com wrote:
 +  /* To distinguish from an OpenMP simd clone, Cilk Plus functions to
 +     be cloned have a distinctive artificial label in addition to "omp
 +     declare simd".  */
 +  bool cilk_clone = flag_enable_cilkplus
 +    && lookup_attribute ("cilk plus elemental",
 +                         DECL_ATTRIBUTES (new_node->symbol.decl));
 +  if (cilk_clone)
 +    remove_attribute ("cilk plus elemental",
 +                      DECL_ATTRIBUTES (new_node->symbol.decl));


 Oh yeah, rth had asked me why I remove the attribute.  My initial thoughts
 were that whether or not a function is a simd clone can be accessed through
 the cgraph bits (node->simdclone != NULL for the clone, and
 node->has_simd_clones for the parent).  No sense keeping the attribute.
 But I can leave it if you think it's better.

Why have it in the first place if it's marked in the cgraph?

Richard.

 Aldy


Re: [google gcc-4_8] fix size_estimation for builtin_expect

2013-09-27 Thread Richard Biener
On Fri, Sep 27, 2013 at 12:23 AM, Jan Hubicka hubi...@ucw.cz wrote:
 Hi,

 builtin_expect should be a NOP in size_estimation. Indeed, the call
 stmt itself is 0 weight in size and time. But it may introduce
 an extra relation expr which has non-zero size/time. The end result
 is: for w/ and w/o builtin_expect, we have different size/time estimation
 for early inlining.

 This patch fixes this problem.
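
As a hedged illustration of the situation described, the two functions below differ only in the branch hint. clamp_plain and clamp_hinted are hypothetical names; the sketch shows the source-level pattern, not what the inliner's estimator actually computes.

```c
#include <assert.h>

/* Plain comparison feeding the branch directly. */
int clamp_plain(int value, int limit)
{
    if (value > limit)
        return limit;
    return value;
}

/* Same logic with a branch hint: the comparison result is passed through
   __builtin_expect and then tested again by the if. */
int clamp_hinted(int value, int limit)
{
    if (__builtin_expect(value > limit, 0))
        return limit;
    return value;
}
```

Both functions compute the same result, so one would expect an inlining heuristic to give them the same size and time estimate.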

 -Rong

 2013-09-26  Rong Xu  x...@google.com

   * ipa-inline-analysis.c (estimate_function_body_sizes): fix
 the size estimation for builtin_expect.

 This seems fine with a comment in the code explaining what it is about.
 I also think we want to support multiple builtin_expects in a BB, so perhaps
 we want to have a pointer set of statements to ignore?

 To avoid spaghetti code, please just move the new logic into separate
 functions.

Looks like this could use tree-ssa.c:walk_use_def_chains (please
change its implementation as necessary, make it C++, etc. - you will
be the first user again).

Richard.

 Honza

 Index: ipa-inline-analysis.c
 ===
 --- ipa-inline-analysis.c (revision 202638)
 +++ ipa-inline-analysis.c (working copy)
 @@ -2266,6 +2266,8 @@ estimate_function_body_sizes (struct cgraph_node *
    /* Estimate static overhead for function prologue/epilogue and alignment. */
    int overhead = PARAM_VALUE (PARAM_INLINE_FUNCTION_OVERHEAD_SIZE);
    int size = overhead;
 +  gimple fix_expect_builtin;
 +
    /* Benefits are scaled by probability of elimination that is in range
       <0,2>.  */
    basic_block bb;
 @@ -2359,14 +2361,73 @@ estimate_function_body_sizes (struct cgraph_node *
   }
   }

 +  fix_expect_builtin = NULL;
       for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
   {
 gimple stmt = gsi_stmt (bsi);
 +   if (gimple_call_builtin_p (stmt, BUILT_IN_EXPECT))
 +{
 +  tree var = gimple_call_lhs (stmt);
 +  tree arg = gimple_call_arg (stmt, 0);
 +  use_operand_p use_p;
 +  gimple use_stmt;
 +  bool match = false;
 +  bool done = false;
 +  gcc_assert (var && arg);
 +  gcc_assert (TREE_CODE (var) == SSA_NAME);
 +
 +  while (TREE_CODE (arg) == SSA_NAME)
 +{
 +  gimple stmt_tmp = SSA_NAME_DEF_STMT (arg);
 +  switch (gimple_assign_rhs_code (stmt_tmp))
 +{
 +  case LT_EXPR:
 +  case LE_EXPR:
 +  case GT_EXPR:
 +  case GE_EXPR:
 +  case EQ_EXPR:
 +  case NE_EXPR:
 +match = true;
 +done = true;
 +break;
 +  case NOP_EXPR:
 +break;
 +  default:
 +done = true;
 +break;
 +}
 +  if (done)
 +break;
 +  arg = gimple_assign_rhs1 (stmt_tmp);
 +}
 +
 +  if (match && single_imm_use (var, &use_p, &use_stmt))
 +{
 +  if (gimple_code (use_stmt) == GIMPLE_COND)
 +{
 +  fix_expect_builtin = use_stmt;
 +}
 +}
 +
 +  /* We should see one builtin_expect call in one bb.  */
 +  break;
 +}
 +}
 +
 +  for (bsi = gsi_start_bb (bb); !gsi_end_p (bsi); gsi_next (&bsi))
 + {
 +   gimple stmt = gsi_stmt (bsi);
 int this_size = estimate_num_insns (stmt, eni_size_weights);
 int this_time = estimate_num_insns (stmt, eni_time_weights);
 int prob;
 struct predicate will_be_nonconstant;

 +   if (stmt == fix_expect_builtin)
 +{
 +  this_size--;
 +  this_time--;
 +}
 +
           if (dump_file && (dump_flags & TDF_DETAILS))
             {
               fprintf (dump_file, "  ");



Re: [PATCH]Fix computation of offset in ivopt

2013-09-27 Thread Richard Biener
On Fri, Sep 27, 2013 at 7:07 AM, bin.cheng bin.ch...@arm.com wrote:


 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Tuesday, September 24, 2013 6:31 PM
 To: Bin Cheng
 Cc: GCC Patches
 Subject: Re: [PATCH]Fix computation of offset in ivopt

 On Tue, Sep 24, 2013 at 11:13 AM, bin.cheng bin.ch...@arm.com wrote:

 +   field = TREE_OPERAND (expr, 1);
 +   if (DECL_FIELD_BIT_OFFSET (field)
 +       && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
 +     boffset = int_cst_value (DECL_FIELD_BIT_OFFSET (field));
 +
 +   tmp = component_ref_field_offset (expr);
 +   if (top_compref
 +       && cst_and_fits_in_hwi (tmp))
 +     {
 +       /* Strip the component reference completely.  */
 +       op0 = TREE_OPERAND (expr, 0);
 +       op0 = strip_offset_1 (op0, inside_addr, top_compref, &off0);
 +       *offset = off0 + int_cst_value (tmp) + boffset / BITS_PER_UNIT;
 +       return op0;
 +     }

 the failure paths seem mangled, that is, if cst_and_fits_in_hwi is false for
 either offset part you may end up doing half accounting and not stripping.

 Btw, DECL_FIELD_BIT_OFFSET is always non-NULL.  I suggest to rewrite to

  if (!inside_addr)
    return orig_expr;

  tmp = component_ref_field_offset (expr);
  field = TREE_OPERAND (expr, 1);
  if (top_compref
      && cst_and_fits_in_hwi (tmp)
      && cst_and_fits_in_hwi (DECL_FIELD_BIT_OFFSET (field)))
    {
      ...
    }
 Will be refined.


 note that this doesn't really handle overflows correctly as

 +   *offset = off0 + int_cst_value (tmp) + boffset /
 + BITS_PER_UNIT;

 may still overflow.
 Since it's unsigned + signed + signed, according to implicit conversion,
 the signed operand will be converted to unsigned, so the overflow would only
 happen when off0 is huge number and tmp/boffset is large positive number,
 right?  Do I need to check whether off0 is larger than the overflowed
 result?  Also there is signed-unsigned problem here, see below.


 @@ -4133,6 +4142,9 @@ get_computation_cost_at (struct ivopts_data *data,
  bitmap_clear (*depends_on);
  }

 +  /* Sign-extend offset if utype has lower precision than
 +     HOST_WIDE_INT.  */
 +  offset = sext_hwi (offset, TYPE_PRECISION (utype));
 +

 offset is computed elsewhere in difference_cost and the bug to me seems
 that it is unsigned.  sign-extending it here is odd at least (and the
 extension should probably happen at sizetype precision, not that of utype).
 I agree.  The root cause is in strip_offset_1, in which offset is computed.
 Every time offset is computed in this function with a signed operand (like
 int_cst_value (tmp) above), we need to take care of the possible negative
 number problem.  Take this case as an example; we need to make the change
 below:

   case INTEGER_CST:
   //...
   *offset = int_cst_value (expr);
 change to
   case INTEGER_CST:
   //...
   *offset = sext_hwi (int_cst_value (expr), type);

 and
   case MULT_EXPR:
   //...
   *offset = sext_hwi (int_cst_value (expr), type);
 to
   case MULT_EXPR:
   //...
  HOST_WIDE_INT xxx = (HOST_WIDE_INT)off0 * int_cst_value (op1);
   *offset = sext_hwi (xxx, type);

 Any comments?

The issue is of course that we end up converting offsets to sizetype
at some point which makes them all appear unsigned.  The fix for this
is to simply interpret them as signed ... but it's really a mess ;)
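
Once offsets are interpreted as signed, the overflow concern raised earlier in this message can be checked explicitly before adding. The sketch below is one portable way to do that in plain C; add_offsets_checked is a hypothetical helper, not something proposed in the thread.

```c
#include <assert.h>
#include <stdint.h>

/* Adds a and b as signed 64-bit offsets.  Returns 1 and stores the sum on
   success; returns 0 without touching *out if the sum would overflow. */
int add_offsets_checked(int64_t a, int64_t b, int64_t *out)
{
    if ((b > 0 && a > INT64_MAX - b) || (b < 0 && a < INT64_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}
```

The range test runs before the addition, so the signed sum is only evaluated when it is known to be representable.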

Richard.

 Thanks.
 bin





Re: [PATCH, PR 57748] Check for out of bounds access, Part 2

2013-09-27 Thread Eric Botcazou
 Sure, but the modifier is not meant to force something into memory,
 especially when it is already in a register. Remember, we are only
 talking of structures here, and we only want to access one member.
 
 It is more the other way round:
 It says: You do not have to load the value in a register, if it is already
 in memory I'm happy

   EXPAND_MEMORY means we are interested in a memory result, even if
the memory is constant and we could have propagated a constant value.  */

We definitely want to propagate constant values here, look at the code below. 
And it already lists explicit cases where we really need to spill to memory.

-- 
Eric Botcazou


Re: [PATCH, ARM, LRA] Prepare ARM build with LRA

2013-09-27 Thread Eric Botcazou
 They don't need to be kept synchronised as such.  It's fine for the index
 to allow more than must_be_index_p.  But if you're not keen on the current
 structure, does the following look better?  Tested on x86_64-linux-gnu.
 
 Thanks,
 Richard
 
 
 gcc/
   * rtlanal.c (must_be_base_p, must_be_index_p): Delete.
   (binary_scale_code_p, get_base_term, get_index_term): New functions.
   (set_address_segment, set_address_base, set_address_index)
   (set_address_disp): Accept the argument unconditionally.
   (baseness): Remove must_be_base_p and must_be_index_p checks.
   (decompose_normal_address): Classify as much as possible in the
   main loop.

Yes, fine by me, thanks.

-- 
Eric Botcazou


Re: RFA: Store the REG_BR_PROB probability directly as an int

2013-09-27 Thread Eric Botcazou
 Thanks for the testing.  It also passes bootstrap on x86_64-linux-gnu.
 OK to install?

Yes, thanks.

-- 
Eric Botcazou


Re: [patch] Separate immediate uses and phi routines from tree-flow*.h

2013-09-27 Thread Richard Biener
On Thu, Sep 26, 2013 at 6:07 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 09/25/2013 04:49 AM, Richard Biener wrote:

 On Tue, Sep 24, 2013 at 4:39 PM, Andrew MacLeod amacl...@redhat.com
 wrote:

 This larger patch moves all the immediate use and operand routines from
 tree-flow.h into tree-ssa-operands.h.
 It also moves the basic phi routines and prototypes into a newly created
 tree-phinodes.h, or tree-ssa-operands.h if they belong there.
 And finally shuffles a couple of other routines which allows
 tree-ssa-operands.h to be removed from the gimple.h header file.

 of note or interest:

 1 - dump_decl_set() was defined in tree-into-ssa.c, but isn't really ssa
 specific. It's tree-specific, so normally I'd throw it into tree.c. Looking
 forward a little, it's only used in a gimple context, so when we map to
 gimple_types it will need to be converted to/created for those. If it is in
 tree.c, I'll have to create a new version for gimple types, and then the
 routine in tree.c will become unused.  Based on that, I figured gimple.c is
 the place for it.

 2 - has_zero_uses_1() and single_imm_use_1() were both in tree-cfg.c for
 some reason.. they've been moved to tree-ssa-operands.c

 3 - a few routines seem like basic gimple routines, but really turn out to
 require the operand infrastructure to implement... so they are moved to
 tree-ssa-operands.[ch] as well.  This sort of thing showed up when removing
 tree-ssa-operands.h from the gimple.h include file.  These were things like
 gimple_vuse_op, gimple_vdef_op, update_stmt, and update_stmt_if_modified

 Note that things like gimple_vuse_op are on the interface border between
 gimple (where the SSA operands are stored) and SSA operands.  So
 it's not so clear for them given they access internal gimple fields directly
 but use the regular SSA operand API.

 I'd prefer gimple_vuse_op and gimple_vdef_op to stay in gimple.[ch].


 Ugg. I incorporated what we talked about, and it was much messier than
 expected :-P.  I ended up with a chicken and egg problem between the
 gimple_v{use,def}_op routines in gimple-ssa.h  and the operand routines in
 tree-ssa-operands.h.   They both require each other, and I couldn't get
 things into a consistent state while they are in separate files.  It was
 actually the immediate use iterators which were requiring
 gimple_vuse_op()...  So I have created a new ssa-iterators.h file  to
 resolve this problem.  They build on the operand code and clearly have other
 prerequisites, so that seems reasonable to me...

 This in fact solves a couple of other little warts. It allows me to put both
 gimple_phi_arg_imm_use_ptr() and phi_arg_index_from_use() into
 tree-phinodes.h.

 It also exposes that gimple.c::walk_stmt_load_store_addr_ops() and friends
 actually depend on the existence of PHI nodes, meaning it really belongs on
 the gimple-ssa border as well. So I moved those into gimple-ssa.c

It doesn't depend on PHI nodes but it also works for PHI nodes.  So
I'd rather have it in gimple.c.

 And finally, it turns out that a lot of files include tree-flow.h and
 depend on it to include gimple.h rather than including it themselves.
 Since tree-flow.h is losing its kitchen-sink attribute, and I needed to move
 it to the bottom of the #include list for tree-ssa.h, I have temporarily
 included gimple.h at the top of tree-ssa.h to make sure it gets hauled in.
 When I remove tree-flow.h as the "everyone includes it" file, I'll add
 gimple.h in all the appropriate .c files and remove it from tree-ssa.h.   It
 would have just made this growing patch even more annoying for now.

 Does this seem reasonable?

Yes - try leaving walk_stmt_load_store_addr_ops in gimple.c though,
if that is technically possible.  Otherwise I guess I don't mind.

Thanks,
Richard.

 Bootstraps on x86_64-unknown-linux-gnu and currently running regressions.

 Andrew

 PS Oh and I noticed the macro name for tree-outof-ssa.h wasn't right, so I
 changed it too.

 Next I'll diverge into trying to sort through putting all the phi-related
 structs and such into tree-phinodes.h


Re: [PATCH, RTL] Prepare ARM build with LRA

2013-09-27 Thread Eric Botcazou
 below is a trivial patch, which makes both parts of test signed.
 With this, bootstrap completes on powerpc-darwin9 - however, you might want
 to check that it still does what you intended.

Please install under PR middle-end/58547 if not already done.

-- 
Eric Botcazou


Re: Commit: MSP430: Pass -md on to assembler

2013-09-27 Thread nick clifton

Hi Mike,


I must say though, it seems wrong to have to provide a sign-extend pointer 
pattern when pointers (on the MSP430) are unsigned.


Agreed.  If we instead ask, is it sane for gcc to ever want to sign extend in
this case, the answer appears to be no.  Why does it?  ptr_mode is SImode, and
expand_builtin_next_arg is used to perform the addition in this mode.  It
'just' knows that it can be sign extended... and just does it that way.  This
seems like it is wrong.

Index: builtins.c
===
--- builtins.c  (revision 202634)
+++ builtins.c  (working copy)
@@ -4094,7 +4094,7 @@ expand_builtin_next_arg (void)
return expand_binop (ptr_mode, add_optab,
   crtl->args.internal_arg_pointer,
   crtl->args.arg_offset_rtx,
-  NULL_RTX, 0, OPTAB_LIB_WIDEN);
+  NULL_RTX, POINTERS_EXTEND_UNSIGNED > 0, OPTAB_LIB_WIDEN);
  }

  /* Make it easier for the backends by protecting the valist argument

would fix this problem.  If this is done, the unmodified test case then doesn't 
abort.  Arguably, the extension should be done as the port directs.  It isn't 
clear to me why they do not.
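
To make the difference concrete, here is a minimal sketch of what the flag changes when a narrow address is widened. It is not GCC code: widen16 is a hypothetical helper, and the 16-to-32-bit widths are only an assumption chosen to keep the example small.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not part of GCC.  Widen a 16-bit value to 32 bits.
   UNSIGNEDP plays the role of the flag argument changed in the patch:
   1 means zero-extend, 0 means sign-extend.  */
static uint32_t
widen16 (uint16_t value, int unsignedp)
{
  assert (unsignedp == 0 || unsignedp == 1);

  /* Zero extension, or a value whose top bit is clear: nothing to do.  */
  if (unsignedp || (value & 0x8000u) == 0)
    return value;

  /* Sign extension of a value with the top bit set fills the upper half.  */
  return (uint32_t) value | 0xFFFF0000u;
}
```

With unsigned pointers, an address such as 0x8000 must stay 0x00008000 after widening; sign extension would turn it into 0xFFFF8000.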

Ok?



OK by me, although I cannot approve that particular patch.

I did eventually find some test cases that exercised the sign-extend 
pointer pattern, so I was able to check the generated code - it worked OK.


But I ran into a very strange problem.  With your PARTIAL_INT_MODE_NAME 
patch applied GCC started erroneously eliminating NULL function pointer 
checks!  This was particularly noticeable in libgcc/crtstuff.c where for 
example:


  static void __attribute__((used))
  frame_dummy (void)
  {
    static struct object object;
    if (__register_frame_info)
      __register_frame_info (__EH_FRAME_BEGIN__, &object);
  }

(this is a simplified version of the real code) ... is compiled as if it
had been written as:


  static void __attribute__((used))
  frame_dummy (void)
  {
    static struct object object;
    __register_frame_info (__EH_FRAME_BEGIN__, &object);
  }


This only happens for the LARGE model (when pointers are PSImode) but I 
was baffled as to where it could be happening. Have you come across 
anything like this ?


Cheers
  Nick



Re: [PATCH][RFC] Remove quadratic loop with component_uses_parent_alias_set

2013-09-27 Thread Eric Botcazou
 Like the following.
 
 Bootstrap and regtest running on x86_64-unknown-linux-gnu.
 
 Richard.
 
 2013-09-26  Richard Biener  rguent...@suse.de
 
   * alias.h (component_uses_parent_alias_set): Rename to ...
   (component_uses_parent_alias_set_from): ... this.
   * alias.c (component_uses_parent_alias_set): Rename to ...
   (component_uses_parent_alias_set_from): ... this and return
   the desired parent.
   (reference_alias_ptr_type_1): Use the result from
   component_uses_parent_alias_set_from instead of stripping
   components one at a time.
   * emit-rtl.c (set_mem_attributes_minus_bitpos): Adjust.

FWIW it looks fine to me.

-- 
Eric Botcazou


Re: [ping] [PATCH] Silence an unused variable warning

2013-09-27 Thread Dodji Seketeli
Let's CC Vladimir on this easy one.

Cheers.

Jan-Benedict Glaw jbg...@lug-owl.de a écrit:

 On Fri, 2013-09-20 20:51:37 +0200, Jan-Benedict Glaw jbg...@lug-owl.de 
 wrote:
 Hi!
 
 With the VAX target, I see this warning:
 
 g++ -c   -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions 
 -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing 
 -Wwrite-strings -Wcast-qual -Wmissing-format-attribute -pedantic 
 -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -fno-common  
 -DHAVE_CONFIG_H -I. -I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. 
 -I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include  
 -I../../../../gcc/gcc/../libdecnumber 
 -I../../../../gcc/gcc/../libdecnumber/dpd -I../libdecnumber 
 -I../../../../gcc/gcc/../libbacktrace
 ../../../../gcc/gcc/lra-eliminations.c -o lra-eliminations.o
 ../../../../gcc/gcc/lra-eliminations.c: In function ‘void init_elim_table()’:
 ../../../../gcc/gcc/lra-eliminations.c:1162:8: warning: unused variable 
 ‘value_p’ [-Wunused-variable]
bool value_p;
 ^
 [...]

 Ping:

 http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01568.html
 `-- http://gcc.gnu.org/ml/gcc-patches/2013-09/txtnrNwaGiD3x.txt

 MfG, JBG

-- 
Dodji


Generic tuning in x86-tune.def 1/2

2013-09-27 Thread Jan Hubicka
Hi,
this is the second part of the generic tuning changes sanitizing the tuning
flags.  This patch again is supposed to deal with the obvious part only.
I will send a separate patch for more changes.

The flags changed agree on all CPUs considered for generic (and their
optimization manuals) + amdfam10, core2 and Atom SLM.

I also added X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL to bobcat tuning, since it
seems like an obvious omission (after double checking in the optimization
manual) and dropped X86_TUNE_FOUR_JUMP_LIMIT for bulldozer cores.  The
implementation of this feature was always a bit weird and its main purpose was
to avoid terrible branch predictor degeneration on the older AMD branch
predictors. I benchmarked both spec2k and 2k6 to verify there are no
regressions.

Especially X86_TUNE_REASSOC_FP_TO_PARALLEL seems to bring nice improvements in
specfp benchmarks.

Bootstrapped/regtested x86_64-linux, will wait for comments and commit it
during the weekend.  I will be happy to revisit any of the generic tuning if
regressions pop up.

Overall this patch also brings small code size improvements for smaller
loads/stores and less padding at -O2. Differences are sub 0.1% however.
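
For readers unfamiliar with the tuning tables, the following is a small sketch of how a per-flag processor mask can be tested against the processor being tuned for. All names here (M_K8, M_COREI7, M_BDVER, M_GENERIC, tune_masks, tune_enabled) are hypothetical stand-ins for illustration, not the identifiers or values GCC uses.

```c
#include <assert.h>

/* Hypothetical processor bits, one bit per tuning target.  */
#define M_K8       (1u << 0)
#define M_COREI7   (1u << 1)
#define M_BDVER    (1u << 2)
#define M_GENERIC  (1u << 3)

enum tune_index
{
  TUNE_UNALIGNED_LOAD_OPTIMAL,
  TUNE_AVOID_VECTOR_DECODE,
  TUNE_LAST
};

/* One mask per tuning flag: the set of processors that enable it.
   Adding M_GENERIC to a mask enables the flag for generic tuning;
   removing it disables the flag there.  */
static const unsigned tune_masks[TUNE_LAST] = {
  [TUNE_UNALIGNED_LOAD_OPTIMAL] = M_COREI7 | M_BDVER | M_GENERIC,
  [TUNE_AVOID_VECTOR_DECODE]    = M_K8,
};

/* Return nonzero if flag IDX is enabled for the processor CPU_BIT.  */
static int
tune_enabled (enum tune_index idx, unsigned cpu_bit)
{
  assert (idx < TUNE_LAST);
  return (tune_masks[idx] & cpu_bit) != 0;
}
```

In this model, each hunk in the patch below amounts to adding or removing one bit from one flag's mask.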

Honza
* x86-tune.def (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Enable for 
generic.
(X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
(X86_TUNE_FOUR_JUMP_LIMIT): Drop for generic and bulldozer.
(X86_TUNE_PAD_RETURNS): Drop for newer AMD chips.
(X86_TUNE_AVOID_VECTOR_DECODE): Drop for generic.
(X86_TUNE_REASSOC_FP_TO_PARALLEL): Enable for generic.
Index: config/i386/x86-tune.def
===
--- config/i386/x86-tune.def	(revision 202966)
+++ config/i386/x86-tune.def	(working copy)
@@ -115,9 +115,9 @@ DEF_TUNE (X86_TUNE_SSE_PARTIAL_REG_DEPEN
   m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_ATOM | m_SLM | m_AMDFAM10 
   | m_BDVER | m_GENERIC)
 DEF_TUNE (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL, sse_unaligned_load_optimal,
-  m_COREI7 | m_AMDFAM10 | m_BDVER | m_BTVER | m_SLM)
+  m_COREI7 | m_AMDFAM10 | m_BDVER | m_BTVER | m_SLM | m_GENERIC)
 DEF_TUNE (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL, sse_unaligned_store_optimal,
-  m_COREI7 | m_BDVER | m_SLM)
+  m_COREI7 | m_BDVER | m_BTVER | m_SLM | m_GENERIC)
 DEF_TUNE (X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL, 
sse_packed_single_insn_optimal,
   m_BDVER)
 /* X86_TUNE_SSE_SPLIT_REGS: Set for machines where the type and dependencies
@@ -146,8 +146,7 @@ DEF_TUNE (X86_TUNE_INTER_UNIT_CONVERSION
 /* X86_TUNE_FOUR_JUMP_LIMIT: Some CPU cores are not able to predict more
than 4 branch instructions in the 16 byte window.  */
 DEF_TUNE (X86_TUNE_FOUR_JUMP_LIMIT, four_jump_limit,
-  m_PPRO | m_P4_NOCONA | m_ATOM | m_SLM | m_AMD_MULTIPLE 
-  | m_GENERIC)
+  m_PPRO | m_P4_NOCONA | m_ATOM | m_SLM | m_ATHLON_K8 | m_AMDFAM10)
 DEF_TUNE (X86_TUNE_SCHEDULE, schedule,
   m_PENT | m_PPRO | m_CORE_ALL | m_ATOM | m_SLM | m_K6_GEODE 
   | m_AMD_MULTIPLE | m_GENERIC)
@@ -156,13 +155,13 @@ DEF_TUNE (X86_TUNE_USE_BT, use_bt,
 DEF_TUNE (X86_TUNE_USE_INCDEC, use_incdec,
   ~(m_P4_NOCONA | m_CORE_ALL | m_ATOM | m_SLM | m_GENERIC))
 DEF_TUNE (X86_TUNE_PAD_RETURNS, pad_returns,
-  m_AMD_MULTIPLE | m_GENERIC)
+  m_ATHLON_K8 | m_AMDFAM10 | m_GENERIC)
 DEF_TUNE (X86_TUNE_PAD_SHORT_FUNCTION, pad_short_function, m_ATOM)
 DEF_TUNE (X86_TUNE_EXT_80387_CONSTANTS, ext_80387_constants,
   m_PPRO | m_P4_NOCONA | m_CORE_ALL | m_ATOM | m_SLM | m_K6_GEODE
   | m_ATHLON_K8 | m_GENERIC)
 DEF_TUNE (X86_TUNE_AVOID_VECTOR_DECODE, avoid_vector_decode,
-  m_K8 | m_GENERIC)
+  m_K8)
 /* X86_TUNE_PROMOTE_HIMODE_IMUL: Modern CPUs have same latency for HImode
and SImode multiply, but 386 and 486 do HImode multiply faster.  */
 DEF_TUNE (X86_TUNE_PROMOTE_HIMODE_IMUL, promote_himode_imul,
@@ -217,7 +216,7 @@ DEF_TUNE (X86_TUNE_REASSOC_INT_TO_PARALL
 /* X86_TUNE_REASSOC_FP_TO_PARALLEL: Try to produce parallel computations
during reassociation of fp computation.  */
 DEF_TUNE (X86_TUNE_REASSOC_FP_TO_PARALLEL, reassoc_fp_to_parallel,
-  m_ATOM | m_SLM | m_HASWELL | m_BDVER1 | m_BDVER2)
+  m_ATOM | m_SLM | m_HASWELL | m_BDVER1 | m_BDVER2 | m_GENERIC)
 /* X86_TUNE_GENERAL_REGS_SSE_SPILL: Try to spill general regs to SSE
regs instead of memory.  */
 DEF_TUNE (X86_TUNE_GENERAL_REGS_SSE_SPILL, general_regs_sse_spill,


Re: [Patch] Let ordinary escaping in POSIX regex be valid

2013-09-27 Thread Jonathan Wakely
On 27 September 2013 03:15, Tim Shen wrote:
 POSIX ERE says that escaping an ordinary char, say R"\n", is not
 permitted, because 'n' is not a special char. However, they also say
 that: "Implementations are permitted to extend the language to allow
 these. Conforming applications cannot use such constructs."

 So let's support it so as not to surprise users.

 Booted and tested under -m32 and -m64

I'm wondering whether we want to have a stricter mode that doesn't
allow them, to help users avoid creating non-portable programs.  We
could check the value of the preprocessor macro __STRICT_ANSI__, which
is set by -std=c++11 but not by -std=gnu++11, although that's not
really the right flag. We want something more like the GNU shell
utils' POSIXLY_CORRECT.


Re: User-define literals for std::complex.

2013-09-27 Thread Jonathan Wakely
On 27 September 2013 05:17, Ed Smith-Rowland wrote:

 The complex user-defined literals finally passed (n3779) with the resolution
 to DR1473 allowing the suffix id to touch the quotes (Can't find it but I
 put it in not too long ago).

I think it's been approved by the LWG and looks like it will go to a
vote by the full committee, but let's wait for that to pass before
making any changes.


Re: [gomp4] Library side of depend clause support

2013-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2013 at 01:48:36AM +0200, Jakub Jelinek wrote:
 Perhaps.  What if I do just minor cleanup (use flexible array members for
 the reallocated vectors, and perhaps keep only the last out/inout task
 in the hash table chains rather than all of them), retest, commit and then
 we can discuss/incrementally improve it?

Here is what I've committed now; the incremental changes were really only
using a structure with a flex array member for the dependers vectors,
removing/making redundant earlier !ent->is_in entries when adding !is_in into
the chain, and the addition of new testcases.

Let's improve it incrementally later.
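
As a rough illustration of the "structure with a flex array member" idea, here is a sketch of a growable vector with the same shape as the gomp_dependers_vec type in the patch below. The struct task placeholder, the dependers_push helper, and the growth policy (start at 4, then double) are assumptions made for this example, not the code that was committed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct gomp_task; only its address matters.  */
struct task { int id; };

/* Header plus flexible array member, so the vector and its elements
   live in a single allocation.  */
struct dependers_vec
{
  size_t n_elem;
  size_t allocated;
  struct task *elem[];
};

/* Append T to *VECP, allocating or doubling the vector as needed.
   Returns 0 on success, -1 if allocation fails (the vector is then
   left unchanged).  */
static int
dependers_push (struct dependers_vec **vecp, struct task *t)
{
  struct dependers_vec *vec = *vecp;

  if (vec == NULL || vec->n_elem == vec->allocated)
    {
      size_t new_alloc = vec ? vec->allocated * 2 : 4;
      struct dependers_vec *grown
        = realloc (vec, sizeof (*grown) + new_alloc * sizeof (grown->elem[0]));
      if (grown == NULL)
        return -1;
      if (vec == NULL)
        grown->n_elem = 0;
      grown->allocated = new_alloc;
      *vecp = vec = grown;
    }

  assert (vec->n_elem < vec->allocated);
  vec->elem[vec->n_elem++] = t;
  return 0;
}
```

Because realloc may move the whole object, callers have to go through the pointer-to-pointer rather than caching the vector address.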

2013-09-27  Jakub Jelinek  ja...@redhat.com

* libgomp.h: Include stdlib.h.
(struct gomp_task_depend_entry,
struct gomp_dependers_vec): New types.
(struct gomp_task): Add dependers, depend_hash, depend_count,
num_dependees and depend fields.
(struct gomp_taskgroup): Add num_children field.
(gomp_finish_task): Free depend_hash if non-NULL.
* libgomp_g.h (GOMP_task): Add depend argument.
* hashtab.h: New file.
* task.c: Include hashtab.h.
(hash_entry_type): New typedef.
(htab_alloc, htab_free, htab_hash, htab_eq): New inlines.
(gomp_init_task): Clear dependers, depend_hash and depend_count
fields.
(GOMP_task): Add depend argument, handle depend clauses.  Increment
num_children field in taskgroup.
(gomp_task_run_pre): Don't increment task_running_count here,
nor clear task_pending bit.
(gomp_task_run_post_handle_depend_hash,
gomp_task_run_post_handle_dependers,
gomp_task_run_post_handle_depend): New functions.
(gomp_task_run_post_remove_parent): Clear in_taskwait before
signalling corresponding semaphore.
(gomp_task_run_post_remove_taskgroup): Decrement num_children
field and make the decrement to 0 MEMMODEL_RELEASE operation,
rather than storing NULL to taskgroup->children.  Clear
in_taskgroup_wait before signalling corresponding semaphore.
(gomp_barrier_handle_tasks): Move task_running_count increment
and task_pending bit clearing here.  Call
gomp_task_run_post_handle_depend.  If more than one new tasks
have been queued, wake other threads if needed.
(GOMP_taskwait): Call gomp_task_run_post_handle_depend.  If more
than one new tasks have been queued, wake other threads if needed.
After waiting on taskwait_sem, enter critical section again.
(GOMP_taskgroup_start): Initialize num_children field.
(GOMP_taskgroup_end): Check num_children instead of children
before critical section.  If children is NULL, but num_children
is non-zero, wait on taskgroup_sem.  Call
gomp_task_run_post_handle_depend.  If more than one new tasks have
been queued, wake other threads if needed.  After waiting on
taskgroup_sem, enter critical section again.
* testsuite/libgomp.c/depend-1.c: New test.
* testsuite/libgomp.c/depend-2.c: New test.
* testsuite/libgomp.c/depend-3.c: New test.
* testsuite/libgomp.c/depend-4.c: New test.

--- libgomp/libgomp.h.jj	2013-09-26 09:43:10.903930832 +0200
+++ libgomp/libgomp.h	2013-09-27 09:05:17.025402127 +0200
@@ -39,6 +39,7 @@
 
 #include <pthread.h>
 #include <stdbool.h>
+#include <stdlib.h>
 
 #ifdef HAVE_ATTRIBUTE_VISIBILITY
 # pragma GCC visibility push(hidden)
@@ -253,7 +254,26 @@ enum gomp_task_kind
   GOMP_TASK_TIED
 };
 
+struct gomp_task;
 struct gomp_taskgroup;
+struct htab;
+
+struct gomp_task_depend_entry
+{
+  void *addr;
+  struct gomp_task_depend_entry *next;
+  struct gomp_task_depend_entry *prev;
+  struct gomp_task *task;
+  bool is_in;
+  bool redundant;
+};
+
+struct gomp_dependers_vec
+{
+  size_t n_elem;
+  size_t allocated;
+  struct gomp_task *elem[];
+};
 
 /* This structure describes a task to be run by a thread.  */
 
@@ -268,6 +288,10 @@ struct gomp_task
   struct gomp_task *next_taskgroup;
   struct gomp_task *prev_taskgroup;
   struct gomp_taskgroup *taskgroup;
+  struct gomp_dependers_vec *dependers;
+  struct htab *depend_hash;
+  size_t depend_count;
+  size_t num_dependees;
   struct gomp_task_icv icv;
   void (*fn) (void *);
   void *fn_data;
@@ -277,6 +301,7 @@ struct gomp_task
   bool final_task;
   bool copy_ctors_done;
   gomp_sem_t taskwait_sem;
+  struct gomp_task_depend_entry depend[];
 };
 
 struct gomp_taskgroup
@@ -286,6 +311,7 @@ struct gomp_taskgroup
   bool in_taskgroup_wait;
   bool cancelled;
   gomp_sem_t taskgroup_sem;
+  size_t num_children;
 };
 
 /* This structure describes a team of threads.  These are the threads
@@ -525,6 +551,8 @@ extern void gomp_barrier_handle_tasks (g
 static void inline
 gomp_finish_task (struct gomp_task *task)
 {
+  if (__builtin_expect (task->depend_hash != NULL, 0))
+    free (task->depend_hash);
   gomp_sem_destroy (&task->taskwait_sem);
 }
 
--- 

[PING] [C++ PATCH] demangler fix (take 2)

2013-09-27 Thread Gary Benson
Gary Benson wrote:
 Hi all,
 
 This is a resubmission of my previous demangler fix [1] rewritten
 to avoid using hashtables and other libiberty features.
 
 From the above referenced email:
 
 d_print_comp maintains a certain amount of scope across calls (namely
 a stack of templates) which is used when evaluating references in
 template argument lists.  If such a reference is later used from a
 substitution then the scope in force at the time of the substitution is
 used.  This appears to be wrong (I say appears because I couldn't find
 anything in the API [2] to clarify this).
 
 The attached patch causes the demangler to capture the scope the first
 time such a reference is traversed, and to use that captured scope on
 subsequent traversals.  This fixes GDB PR 14963 [3] whereby a
 reference is resolved against the wrong template, causing an infinite
 loop and eventual stack overflow and segmentation fault.
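
 The capture-once behaviour described above can be sketched as follows. This is a simplified model, not the patch itself: the scope is reduced to an int, and the names printer, saved_scope, scope_for and MAX_SAVED are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_SAVED = 16 };

/* A component and the scope that was current when it was first traversed.  */
struct saved_scope
{
  const void *container;  /* component whose scope this is */
  int scope;              /* stand-in for the stack of templates */
};

struct printer
{
  int current_scope;
  struct saved_scope saved[MAX_SAVED];
  int num_saved;
};

/* Return the scope to use for COMPONENT.  The first traversal captures
   the current scope; later traversals reuse the captured one, whatever
   the current scope has become in the meantime.  */
static int
scope_for (struct printer *p, const void *component)
{
  for (int i = 0; i < p->num_saved; i++)
    if (p->saved[i].container == component)
      return p->saved[i].scope;

  assert (p->num_saved < MAX_SAVED);
  p->saved[p->num_saved].container = component;
  p->saved[p->num_saved].scope = p->current_scope;
  p->num_saved++;
  return p->current_scope;
}
```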
 
 I've added the result to the demangler test suite, but I know of no
 way to check the validity of the demangled symbol other than by
 inspection (and I am no expert here!)  If anybody knows a way to
 check this then please let me know!  Otherwise, I hope this
 not-really-checked demangled version is acceptable.
 
 Thanks,
 Gary
 
 [1] http://gcc.gnu.org/ml/gcc-patches/2013-09/msg00215.html
 [2] http://mentorembedded.github.io/cxx-abi/abi.html#mangling
 [3] http://sourceware.org/bugzilla/show_bug.cgi?id=14963
 
 -- 
 http://gbenson.net/
diff --git a/libiberty/ChangeLog b/libiberty/ChangeLog
index 89e108a..2ff8216 100644
--- a/libiberty/ChangeLog
+++ b/libiberty/ChangeLog
@@ -1,3 +1,20 @@
+2013-09-17  Gary Benson  gben...@redhat.com
+
+   * cp-demangle.c (struct d_saved_scope): New structure.
+   (struct d_print_info): New fields saved_scopes and
+   num_saved_scopes.
+   (d_print_init): Initialize the above.
+   (d_print_free): New function.
+   (cplus_demangle_print_callback): Call the above.
+   (d_copy_templates): New function.
+   (d_print_comp): New variables saved_templates and
+   need_template_restore.
+   [DEMANGLE_COMPONENT_REFERENCE,
+   DEMANGLE_COMPONENT_RVALUE_REFERENCE]: Capture scope the first
+   time the component is traversed, and use the captured scope for
+   subsequent traversals.
+   * testsuite/demangle-expected: Add regression test.
+
 2013-09-10  Paolo Carlini  paolo.carl...@oracle.com
 
PR bootstrap/58386
diff --git a/libiberty/cp-demangle.c b/libiberty/cp-demangle.c
index 70f5438..a199f6d 100644
--- a/libiberty/cp-demangle.c
+++ b/libiberty/cp-demangle.c
@@ -275,6 +275,18 @@ struct d_growable_string
   int allocation_failure;
 };
 
+/* A demangle component and some scope captured when it was first
+   traversed.  */
+
+struct d_saved_scope
+{
+  /* The component whose scope this is.  */
+  const struct demangle_component *container;
+  /* The list of templates, if any, that was current when this
+ scope was captured.  */
+  struct d_print_template *templates;
+};
+
 enum { D_PRINT_BUFFER_LENGTH = 256 };
 struct d_print_info
 {
@@ -302,6 +314,10 @@ struct d_print_info
   int pack_index;
   /* Number of d_print_flush calls so far.  */
   unsigned long int flush_count;
+  /* Array of saved scopes for evaluating substitutions.  */
+  struct d_saved_scope *saved_scopes;
+  /* Number of saved scopes in the above array.  */
+  int num_saved_scopes;
 };
 
 #ifdef CP_DEMANGLE_DEBUG
@@ -3665,6 +3681,30 @@ d_print_init (struct d_print_info *dpi, 
demangle_callbackref callback,
   dpi->opaque = opaque;
 
   dpi->demangle_failure = 0;
+
+  dpi->saved_scopes = NULL;
+  dpi->num_saved_scopes = 0;
+}
+
+/* Free a print information structure.  */
+
+static void
+d_print_free (struct d_print_info *dpi)
+{
+  int i;
+
+  for (i = 0; i < dpi->num_saved_scopes; i++)
+    {
+      struct d_print_template *ts, *tn;
+
+      for (ts = dpi->saved_scopes[i].templates; ts != NULL; ts = tn)
+	{
+	  tn = ts->next;
+	  free (ts);
+	}
+    }
+
+  free (dpi->saved_scopes);
 }
 
 /* Indicate that an error occurred during printing, and test for error.  */
@@ -3749,6 +3789,7 @@ cplus_demangle_print_callback (int options,
demangle_callbackref callback, void *opaque)
 {
   struct d_print_info dpi;
+  int success;
 
   d_print_init (&dpi, callback, opaque);
 
@@ -3756,7 +3797,9 @@ cplus_demangle_print_callback (int options,
 
   d_print_flush (&dpi);
 
-  return ! d_print_saw_error (&dpi);
+  success = ! d_print_saw_error (&dpi);
+  d_print_free (&dpi);
+  return success;
 }
 
 /* Turn components into a human readable string.  OPTIONS is the
@@ -3913,6 +3956,36 @@ d_print_subexpr (struct d_print_info *dpi, int options,
 d_append_char (dpi, ')');
 }
 
+/* Return a shallow copy of the current list of templates.
+   On error d_print_error is called and a partial list may
+   be returned.  Whatever is returned must be freed.  */
+
+static struct d_print_template *
+d_copy_templates (struct 

[patch] Fix PR bootstrap/58509

2013-09-27 Thread Eric Botcazou
Hi,

this fixes the ICE during the build of the Ada runtime on the SPARC, a fallout 
of the recent inliner changes:
  http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01033.html

The ICE is triggered because the ldd peephole merges an MEM with MEM_NOTRAP_P 
and a contiguous MEM without MEM_NOTRAP_P, keeping the MEM_NOTRAP_P flag on 
the result.  As a consequence, an EH edge is eliminated and a BB is orphaned.

I think this shows that my above inliner patch was too gross: when you have 
successive inlining, you can quickly end up with a mess of trapping and non-
trapping memory accesses for the same object.  So the attached seriously 
refines it, restricting it to parameters with reference type and leaning 
towards being less conservative.  Again, this should only affect Ada.
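
To illustrate the failure mode (this is not what the attached patch changes, which is on the inliner side), here is a sketch of merging two contiguous accesses where the no-trap property is kept only when both halves have it. The mem_access type and merge_contiguous helper are hypothetical and do not correspond to GCC's RTL representation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a memory access with a "cannot trap" flag.  */
struct mem_access
{
  long offset;
  long size;
  bool notrap;
};

/* Merge LO and HI into *OUT if HI starts exactly where LO ends.
   The merged access may only be marked no-trap when both inputs are;
   otherwise a possibly-trapping access would silently lose that fact.  */
static bool
merge_contiguous (const struct mem_access *lo, const struct mem_access *hi,
                  struct mem_access *out)
{
  assert (lo->size > 0 && hi->size > 0);

  if (lo->offset + lo->size != hi->offset)
    return false;

  out->offset = lo->offset;
  out->size = lo->size + hi->size;
  out->notrap = lo->notrap && hi->notrap;
  return true;
}
```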

Tested on x86_64-suse-linux, OK for the mainline?


2013-09-27  Eric Botcazou  ebotca...@adacore.com

PR bootstrap/58509
* ipa-prop.h (get_ancestor_addr_info): Declare.
* ipa-prop.c (get_ancestor_addr_info): Make public.
* tree-inline.c (is_parm): Rename into...
(is_ref_parm): ...this.
(is_based_on_ref_parm): New predicate.
(remap_gimple_op_r): Do not propagate TREE_THIS_NOTRAP on MEM_REF if
a parameter with reference type has been remapped and the result is
not based on another parameter with reference type.
(copy_tree_body_r): Likewise on INDIRECT_REF and MEM_REF.


2013-09-27  Eric Botcazou  ebotca...@adacore.com

* gnat.dg/specs/opt1.ads: New test.


-- 
Eric Botcazou

Index: tree-inline.c
===
--- tree-inline.c	(revision 202912)
+++ tree-inline.c	(working copy)
@@ -751,10 +751,11 @@ copy_gimple_bind (gimple stmt, copy_body
   return new_bind;
 }
 
-/* Return true if DECL is a parameter or a SSA_NAME for a parameter.  */
+/* Return true if DECL is a parameter with reference type or a SSA_NAME
+  for a parameter with reference type.  */
 
 static bool
-is_parm (tree decl)
+is_ref_parm (tree decl)
 {
   if (TREE_CODE (decl) == SSA_NAME)
 {
@@ -763,7 +764,40 @@ is_parm (tree decl)
 	return false;
 }
 
-  return (TREE_CODE (decl) == PARM_DECL);
+  return (TREE_CODE (decl) == PARM_DECL
+	  && TREE_CODE (TREE_TYPE (decl)) == REFERENCE_TYPE);
+}
+
+/* Return true if DECL is based on a parameter with reference type or a
+   SSA_NAME for a parameter with reference type.  */
+
+static bool
+is_based_on_ref_parm (tree decl)
+{
+  HOST_WIDE_INT offset;
+  tree obj, expr;
+  gimple def_stmt;
+
+  /* First the easy case.  */
+  if (is_ref_parm (decl))
+    return true;
+
+  /* Then look for an SSA name whose defining statement is of the form:
+
+      D.1718_7 = parm_2(D)->f1;
+
+     where parm_2 is a parameter with reference type.  */
+  if (TREE_CODE (decl) != SSA_NAME)
+    return false;
+  def_stmt = SSA_NAME_DEF_STMT (decl);
+  if (!def_stmt)
+    return false;
+
+  expr = get_ancestor_addr_info (def_stmt, &obj, &offset);
+  if (!expr)
+    return false;
+
+  return is_ref_parm (TREE_OPERAND (expr, 0));
 }
 
 /* Remap the GIMPLE operand pointed to by *TP.  DATA is really a
@@ -865,12 +899,13 @@ remap_gimple_op_r (tree *tp, int *walk_s
 	  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
 	  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
 	  TREE_NO_WARNING (*tp) = TREE_NO_WARNING (old);
-	  /* We cannot propagate the TREE_THIS_NOTRAP flag if we have
-	 remapped a parameter as the property might be valid only
-	 for the parameter itself.  */
+	  /* We cannot always propagate the TREE_THIS_NOTRAP flag if we have
+	 remapped a parameter with reference type as the property may be
+	 valid only for the parameter.  */
 	  if (TREE_THIS_NOTRAP (old)
-	      && (!is_parm (TREE_OPERAND (old, 0))
-		  || (!id->transform_parameter && is_parm (ptr))))
+	      && (!is_ref_parm (TREE_OPERAND (old, 0))
+		  || !id->transform_parameter
+		  || is_based_on_ref_parm (ptr)))
 	TREE_THIS_NOTRAP (*tp) = 1;
 	  *walk_subtrees = 0;
 	  return NULL;
@@ -1092,12 +1127,13 @@ copy_tree_body_r (tree *tp, int *walk_su
 		  TREE_THIS_VOLATILE (*tp) = TREE_THIS_VOLATILE (old);
 		  TREE_SIDE_EFFECTS (*tp) = TREE_SIDE_EFFECTS (old);
 		  TREE_READONLY (*tp) = TREE_READONLY (old);
-		  /* We cannot propagate the TREE_THIS_NOTRAP flag if we
-			 have remapped a parameter as the property might be
-			 valid only for the parameter itself.  */
+		  /* We cannot always propagate the TREE_THIS_NOTRAP flag
+			 if we have remapped a parameter with reference type as
+			 the property may be valid only for the parameter.  */
 		  if (TREE_THIS_NOTRAP (old)
-		      && (!is_parm (TREE_OPERAND (old, 0))
-			  || (!id->transform_parameter && is_parm (ptr))))
+		      && (!is_ref_parm (TREE_OPERAND (old, 0))
+			  || !id->transform_parameter
+			  || is_based_on_ref_parm (ptr)))
 		TREE_THIS_NOTRAP (*tp) = 1;
 		}
 		}
@@ -1118,12 +1154,13 @@ copy_tree_body_r (tree *tp, int 

Re: [Patch] Let ordinary escaping in POSIX regex be valid

2013-09-27 Thread Paolo Carlini

On 9/27/13 4:34 AM, Jonathan Wakely wrote:

On 27 September 2013 03:15, Tim Shen wrote:

POSIX ERE says that escaping an ordinary char, say R"\n", is not
permitted, because 'n' is not a special char. However, they also say
that: "Implementations are permitted to extend the language to allow
these. Conforming applications cannot use such constructs."

So let's support it so as not to surprise users.

Booted and tested under -m32 and -m64

I'm wondering whether we want to have a stricter mode that doesn't
allow them, to help users avoid creating non-portable programs.  We
could check the value of the preprocessor macro __STRICT_ANSI__, which
is set by -std=c++11 but not by -std=gnu++11, although that's not
really the right flag. We want something more like the GNU shell
utils' POSIXLY_CORRECT.

Indeed. I think that for now __STRICT_ANSI__ can do. It's important to
manage to accept those; otherwise, as we discovered yesterday, we easily
reject quite a few rather sensible regexes users can write or find in
examples. This started when Tim, upon my suggestion, tried the examples
in the new edition of Nicolai Josuttis' book and found in one of those an
escaped closed curly bracket (note, closed; open ones are definitely fine),
which apparently most of the other implementations do not reject.


Paolo.


Re: OMP4/cilkplus: simd clone function mangling

2013-09-27 Thread Aldy Hernandez

On 09/27/13 03:18, Richard Biener wrote:

On Thu, Sep 26, 2013 at 9:35 PM, Aldy Hernandez al...@redhat.com wrote:

+  /* To distinguish from an OpenMP simd clone, Cilk Plus functions to
+ be cloned have a distinctive artificial label in addition to omp
+ declare simd.  */
+  bool cilk_clone = flag_enable_cilkplus
+    && lookup_attribute ("cilk plus elemental",
+			  DECL_ATTRIBUTES (new_node->symbol.decl));
+  if (cilk_clone)
+    remove_attribute ("cilk plus elemental",
+		      DECL_ATTRIBUTES (new_node->symbol.decl));



Oh yeah, rth had asked me why I remove the attribute.  My initial thoughts
were that whether or not a function is a simd clone can be accessed through
the cgraph bits (node->simdclone != NULL for the clone, and
node->has_simd_clones for the parent).  No sense keeping the attribute.
But I can leave it if you think it's better.


Why have it in the first place if it's marked in the cgraph?


It would be placed there by the front-end when parsing Cilk Plus 
simd-enabled functions.  It's only in the omp stage that we transfer
that information to the cgraph bits.




Re: [Patch] Let ordinary escaping in POSIX regex be valid

2013-09-27 Thread Jonathan Wakely
On 27 September 2013 13:32, Paolo Carlini wrote:
 On 9/27/13 4:34 AM, Jonathan Wakely wrote:

 On 27 September 2013 03:15, Tim Shen wrote:

 POSIX ERE says that escaping an ordinary char, say R"\n", is not
 permitted, because 'n' is not a special char. However, they also say
 that: "Implementations are permitted to extend the language to allow
 these. Conforming applications cannot use such constructs."

 So let's support it so as not to surprise users.

 Booted and tested under -m32 and -m64

 I'm wondering whether we want to have a stricter mode that doesn't
 allow them, to help users avoid creating non-portable programs.  We
 could check the value of the preprocessor macro __STRICT_ANSI__, which
 is set by -std=c++11 but not by -std=gnu++11, although that's not
 really the right flag. We want something more like the GNU shell
 utils' POSIXLY_CORRECT.

 Indeed. I think that for now __STRICT_ANSI__ can do. It's important to
 manage to accept those; otherwise, as we discovered yesterday, we easily
 reject quite a few rather sensible regexes users can write or find in
 examples. This started when Tim, upon my suggestion, tried the examples in
 the new edition of Nicolai Josuttis' book and found in one of those an
 escaped closed curly bracket (note, closed; open ones are definitely fine),
 which apparently most of the other implementations do not reject.

Ah I see.  I definitely agree it's good to accept that instead of
being unnecessarily strict, but other people will want the option of
strict conformance, so I think we can please everyone with something
like:

else
  {
#ifdef __STRICT_ANSI__
__throw_regex_error(regex_constants::error_escape);
#else
   _M_token = _S_token_ord_char;
   _M_value.assign(1, __c);
#endif
  }
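
The same decision can be sketched outside the library as a small C function. This is only an illustration under stated assumptions: classify_escape and its result names are hypothetical, the strictness is a runtime parameter here rather than a preprocessor check, and the set of special characters is the POSIX ERE list as I understand it.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum escape_result
{
  ESC_SPECIAL,   /* backslash before a special character: always valid */
  ESC_ORDINARY,  /* backslash before an ordinary character, accepted */
  ESC_ERROR      /* backslash before an ordinary character, rejected */
};

/* Classify the character C that follows a backslash in an ERE.
   STRICT selects the conforming behaviour (reject); otherwise the
   escaped ordinary character is treated as the character itself.  */
static enum escape_result
classify_escape (char c, bool strict)
{
  /* strchr would match the terminating NUL, so rule it out first.  */
  assert (c != '\0');

  if (strchr (".[\\()*+?{|^$", c) != NULL)
    return ESC_SPECIAL;

  return strict ? ESC_ERROR : ESC_ORDINARY;
}
```

With this split, an escaped closing brace is accepted in the lenient mode and rejected in the strict one, while an escaped opening brace is valid in both.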


[committed] Fix move_sese_region_to_fn (PR middle-end/58551)

2013-09-27 Thread Jakub Jelinek
Hi!

I've committed the following fix to a regression introduced in 4.9
early loop construction.  SESE regions, as documented above
move_sese_region_to_fn, are allowed to contain calls to noreturn functions
like abort/exit.  But, basic blocks leading to noreturn functions aren't
actually placed in the loop inside of which the SESE region is present,
but directly inside of the outermost loop of the function.

So, we can't just change loop_father of bb's belonging to
entry_bb's loop_father to new function's outermost loop and move
loops which have their outer loop equal to entry_bb's loop_father
and have their header in the SESE region into the new function,
but we also have to handle the same way the outermost loop of the
original function.

Bootstrapped/regtested on x86_64-linux and i686-linux, preapproved by richi
on IRC, committed to trunk.

2013-09-27  Jakub Jelinek  ja...@redhat.com

PR middle-end/58551
* tree-cfg.c (move_sese_region_to_fn): Also move loops that
are children of outermost saved_cfun's loop, and set it up to
be moved to dest_cfun's outermost loop.  Fix up num_nodes adjustments
if loop != loop0 and SESE region contains bbs that belong to loop0.

* c-c++-common/gomp/pr58551.c: New test.

--- gcc/tree-cfg.c.jj   2013-09-13 14:41:28.0 +0200
+++ gcc/tree-cfg.c  2013-09-27 12:23:48.582217401 +0200
@@ -6662,12 +6662,13 @@ move_sese_region_to_fn (struct function
   struct function *saved_cfun = cfun;
   int *entry_flag, *exit_flag;
   unsigned *entry_prob, *exit_prob;
-  unsigned i, num_entry_edges, num_exit_edges;
+  unsigned i, num_entry_edges, num_exit_edges, num_nodes;
   edge e;
   edge_iterator ei;
   htab_t new_label_map;
   struct pointer_map_t *vars_map, *eh_map;
   struct loop *loop = entry_bb->loop_father;
+  struct loop *loop0 = get_loop (saved_cfun, 0);
   struct move_stmt_d d;
 
   /* If ENTRY does not strictly dominate EXIT, this cannot be an SESE
@@ -6760,16 +6761,29 @@ move_sese_region_to_fn (struct function
   set_loops_for_fn (dest_cfun, loops);
 
   /* Move the outlined loop tree part.  */
+  num_nodes = bbs.length ();
   FOR_EACH_VEC_ELT (bbs, i, bb)
 {
-  if (bb->loop_father->header == bb
-  && loop_outer (bb->loop_father) == loop)
+  if (bb->loop_father->header == bb)
 {
   struct loop *this_loop = bb->loop_father;
- flow_loop_tree_node_remove (bb->loop_father);
- flow_loop_tree_node_add (get_loop (dest_cfun, 0), this_loop);
- fixup_loop_arrays_after_move (saved_cfun, cfun, this_loop);
+ struct loop *outer = loop_outer (this_loop);
+ if (outer == loop
+ /* If the SESE region contains some bbs ending with
+a noreturn call, those are considered to belong
+to the outermost loop in saved_cfun, rather than
+the entry_bb's loop_father.  */
+ || outer == loop0)
+   {
+ if (outer != loop)
+   num_nodes -= this_loop->num_nodes;
+ flow_loop_tree_node_remove (bb->loop_father);
+ flow_loop_tree_node_add (get_loop (dest_cfun, 0), this_loop);
+ fixup_loop_arrays_after_move (saved_cfun, cfun, this_loop);
+   }
 }
+  else if (bb->loop_father == loop0 && loop0 != loop)
+   num_nodes--;
 
   /* Remove loop exits from the outlined region.  */
   if (loops_for_fn (saved_cfun)->exits)
@@ -6789,6 +6803,7 @@ move_sese_region_to_fn (struct function
 
   /* Setup a mapping to be used by move_block_to_fn.  */
   loop->aux = current_loops->tree_root;
+  loop0->aux = current_loops->tree_root;
 
   pop_cfun ();
 
@@ -6817,11 +6832,13 @@ move_sese_region_to_fn (struct function
 }
 
   loop->aux = NULL;
+  loop0->aux = NULL;
   /* Loop sizes are no longer correct, fix them up.  */
-  loop->num_nodes -= bbs.length ();
+  loop->num_nodes -= num_nodes;
   for (struct loop *outer = loop_outer (loop);
 outer; outer = loop_outer (outer))
-outer->num_nodes -= bbs.length ();
+outer->num_nodes -= num_nodes;
+  loop0->num_nodes -= bbs.length () - num_nodes;
 
   if (saved_cfun->has_simduid_loops || saved_cfun->has_force_vect_loops)
 {
--- gcc/testsuite/c-c++-common/gomp/pr58551.c.jj   2013-09-27 11:18:20.825251967 +0200
+++ gcc/testsuite/c-c++-common/gomp/pr58551.c   2013-09-27 11:17:56.0 +0200
@@ -0,0 +1,33 @@
+/* PR middle-end/58551 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -fopenmp" } */
+
+void
+foo (int *a)
+{
+  int i;
+  for (i = 0; i < 8; i++)
+#pragma omp task
+if (a[i])
+  __builtin_abort ();
+}
+
+void bar (int, int);
+
+void
+baz (int *a)
+{
+  int i;
+  for (i = 0; i < 8; i++)
+#pragma omp task
+if (a[i])
+  {
+   int j, k;
+   for (j = 0; j < 10; j++)
+ for (k = 0; k < 8; k++)
+   bar (j, k);
+   for (k = 0; k < 12; k++)
+ bar (-1, k);
+   __builtin_abort ();
+  }
+}


Jakub
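
The num_nodes bookkeeping in the patch above can be followed on a toy model.
The sketch below is only an illustration under simplifying assumptions: Loop,
Block and fix_loop_sizes are hypothetical stand-ins rather than GCC's data
structures, and every loop hanging off the outermost loop is assumed to lie
wholly inside the moved region.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for the loop tree and basic blocks (not the real GCC types).
struct Loop {
  Loop *outer;    // enclosing loop; nullptr for the function's outermost loop
  int num_nodes;  // blocks in this loop, nested loops included
};

struct Block {
  Loop *loop_father;  // innermost loop containing the block
  bool is_header;     // true if this block is loop_father's header
};

// Subtract the moved blocks BBS from the loop sizes.  Blocks that belong to
// LOOP0 directly (e.g. paths ending in a noreturn call), or to loops hanging
// off LOOP0, are charged to LOOP0 only, not to LOOP and its outer loops.
inline void fix_loop_sizes(const std::vector<Block>& bbs, Loop *loop, Loop *loop0)
{
  int total = static_cast<int>(bbs.size());
  int num_nodes = total;  // blocks that were counted in LOOP
  for (const Block& bb : bbs) {
    if (bb.is_header) {
      Loop *this_loop = bb.loop_father;
      if (this_loop->outer == loop0 && loop0 != loop)
        num_nodes -= this_loop->num_nodes;
    } else if (bb.loop_father == loop0 && loop0 != loop) {
      --num_nodes;
    }
  }
  loop->num_nodes -= num_nodes;
  for (Loop *outer = loop->outer; outer; outer = outer->outer)
    outer->num_nodes -= num_nodes;
  loop0->num_nodes -= total - num_nodes;
}
```

With a region of three blocks in an inner loop plus two noreturn blocks that
sit directly in the outermost loop, the inner loop shrinks by three and the
outermost loop by five.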


[patch] fix libstdc++/57465

2013-09-27 Thread Jonathan Wakely
PR libstdc++/57465
* include/std/functional
(_Function_base::_Base_manager::_M_not_empty_function): Fix overload
for pointers.
* testsuite/20_util/function/cons/57465.cc: New.


Tested x86_64-linux, committed to trunk.  I'll apply it to the
branches after it's been on trunk without problems for a while.
commit 55531e9c74a5f2b4699250b6b302d49f7dc8c5ae
Author: Jonathan Wakely jwakely@gmail.com
Date:   Wed Aug 7 01:38:39 2013 +0100

PR libstdc++/57465
* include/std/functional
(_Function_base::_Base_manager::_M_not_empty_function): Fix overload
for pointers.
* testsuite/20_util/function/cons/57465.cc: New.

diff --git a/libstdc++-v3/include/std/functional b/libstdc++-v3/include/std/functional
index 63ba777..73cddfe 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -1932,7 +1932,7 @@ _GLIBCXX_HAS_NESTED_TYPE(result_type)
 
 template<typename _Tp>
  static bool
- _M_not_empty_function(const _Tp* __fp)
+ _M_not_empty_function(_Tp* const __fp)
  { return __fp; }
 
 template<typename _Class, typename _Tp>
diff --git a/libstdc++-v3/testsuite/20_util/function/cons/57465.cc b/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
new file mode 100644
index 000..44413fb
--- /dev/null
+++ b/libstdc++-v3/testsuite/20_util/function/cons/57465.cc
@@ -0,0 +1,31 @@
+// Copyright (C) 2013 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+//
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+//
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// <http://www.gnu.org/licenses/>.
+
+// libstdc++/57465
+
+// { dg-options "-std=gnu++11" }
+
+#include <functional>
+#include <testsuite_hooks.h>
+
+int main()
+{
+  using F = void();
+  F* f = nullptr;
+  std::function<F> x(f);
+  VERIFY( !x );
+}
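
The behaviour the new test checks can also be tried without the testsuite
harness. The following is a small standalone sketch using only the standard
library; has_target and do_nothing are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>

inline void do_nothing() {}

// Returns true if a std::function built from FP holds a callable target.
// A null function pointer is expected to produce an empty std::function.
inline bool has_target(void (*fp)())
{
  std::function<void()> f(fp);
  return static_cast<bool>(f);
}
```

has_target(nullptr) should be false and has_target(&do_nothing) should be
true on an implementation that has this fix.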


[PATCH] Invalid unpoisoning of stack redzones on ARM

2013-09-27 Thread Yury Gribov

Hi all,

I've recently submitted a bug report regarding invalid unpoisoning of 
stack frame redzones 
(http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543). Could someone take 
a look at proposed patch (a simple one-liner) and check whether it's ok 
for commit?


Thanks!

-Yuri
diff --git a/gcc/asan.c b/gcc/asan.c
index 32f1837..acb00ea 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -895,7 +895,7 @@ asan_clear_shadow (rtx shadow_mem, HOST_WIDE_INT len)
 
   gcc_assert ((len & 3) == 0);
   top_label = gen_label_rtx ();
-  addr = force_reg (Pmode, XEXP (shadow_mem, 0));
+  addr = copy_to_reg (force_reg (Pmode, XEXP (shadow_mem, 0)));
   shadow_mem = adjust_automodify_address (shadow_mem, SImode, addr, 0);
   end = force_reg (Pmode, plus_constant (Pmode, addr, len));
   emit_label (top_label);


Re: [PATCH] Sanitize block partitioning under -freorder-blocks-and-partition

2013-09-27 Thread Teresa Johnson
On Thu, Sep 26, 2013 at 3:02 PM, Jan Hubicka hubi...@ucw.cz wrote:

>> Why not just have probably_never_executed_bb_p return simply return
>> false bb->frequency is non-zero (right now it does the opposite -
>
> We want to have frequencies guessed for functions that was not trained
> in the profiling run (that was patch I posted earlier that I think did not
> go in, yet).

Right, but for splitting and bb layout purposes, for these statically
guessed unprofiled functions we in fact don't want to do any splitting
or treat the bbs as never executed (which shouldn't be a change from
the status quo since all the bbs in these functions are currently 0
weight, it's only when we inline in the case of comdats that they
appear colder than the surrounding code, but in fact we don't want
this).

The only other caller to probably_never_executed_bb_p is
compute_function_frequency, but in the case of statically guessed
functions they will have profile_status != PROFILE_READ and won't
invoke probably_never_executed_bb_p. But re-reading our most recent
exchange on the comdat profile issue, it sounds like you were
suggesting guessing profiles for all 0-weight functions early, then
dropping them from PROFILE_READ to PROFILE_GUESSED only once we
determine in ipa-inline that there is a potentially non-zero call path
to them. In that case with the change I describe above to
probably_never_executed_bb_p, the 0-weight functions with 0 calls to
them will incorrectly be marked as NODE_FREQUENCY_NORMAL, which would
be bad as they would not be size optimized or moved into the cold
section.

So it seems like we want different handling of these guessed
frequencies in compute_function_frequency and bb-reorder.c. Actually I
think we can handle this by checking if the function entry block has a
0 count. If so, then we just look at the bb counts and not the
frequencies for determining bb hotness as the frequencies would
presumably have been statically-guessed. This will ensure that the
cgraph node continues to be marked unlikely and size-optimized. If the
function entry block has a non-zero count, then we look at both the bb
count and the bb frequency - if they are both zero then the bb is
probably never executed, but if either is non-zero then we should
treat the block as possibly executed (which will come into play for
splitting and bb layout).

Teresa
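
The check proposed above could look roughly like the sketch below. It is only
an illustration of the described rule, not the GCC implementation:
BlockProfile and probably_never_executed are hypothetical names, and the
real code works on basic_block and the function's entry block.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-block profile data.
struct BlockProfile {
  std::int64_t count;  // execution count read from the profile
  int frequency;       // estimated (possibly statically guessed) frequency
};

// If the function entry count is zero, the frequencies are presumably
// guessed, so only the count decides.  Otherwise the block is treated as
// never executed only when both count and frequency are zero.
inline bool probably_never_executed(const BlockProfile& bb,
                                    std::int64_t entry_count)
{
  if (entry_count == 0)
    return bb.count == 0;
  return bb.count == 0 && bb.frequency == 0;
}
```

A block with zero count but non-zero frequency is therefore cold in a
function whose entry count is zero, and possibly executed in a function
that was actually trained.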


> Currently I return true when frequency indicate that BB is executed at least in
> 1/4th of all executions.  With the cases discussed I see we may need to reduce
> this threshold.  In general I do not like much hard tests for 0 because meaning
> of 0 depends on REG_BR_FREQ_BASE that is supposed to be changeable and we may
> want to make frequencies sreal, too.
>
> I suppose we may introduce --param for this.  You are also right that I should
> update probably_never_executed_edge_p (I intended so, but obviously the code
> ended up in mainline accidentally).
>
> I however saw at least one case of jump threading where this trick did not
> help: the jump threading update confused itself by scaling via counts rather
> than frequencies and ended up with dropping everything to 0. This makes it
> more tempting to try to go with sreals for those
>
> Honza
>
>> returns true when bb->frequency is 0)? Making this change removed a
>> bunch of other failures. With this change as well, there are only 3
>> cases that still fail with 1 train run that pass with 100. Need to
>> look at those.
>>
>>> Will you look into logic of do_jump or shall I try to dive in?
>>
>> I can take a look, but probably won't have a chance until late this
>> week. If you don't get to it before then I will see if I can figure
>> out why it is applying the branch probabilities this way.
>>
>> Teresa
>>
>>> Honza
>>
>> --
>> Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413



-- 
Teresa Johnson | Software Engineer | tejohn...@google.com | 408-460-2413


Re: OMP4/cilkplus: simd clone function mangling

2013-09-27 Thread Jakub Jelinek
On Thu, Sep 26, 2013 at 02:31:33PM -0500, Aldy Hernandez wrote:
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -42806,6 +42806,43 @@ ix86_memmodel_check (unsigned HOST_WIDE_INT val)
>    return val;
>  }
>  
> +/* Return the default vector mangling ISA code when none is specified
> +   in a `processor' clause.  */
> +
> +static char
> +ix86_cilkplus_default_vector_mangling_isa_code (struct cgraph_node *clone
> +                                                ATTRIBUTE_UNUSED)
> +{
> +  return 'x';
> +}

I think rth was suggesting using vecsize_mangle, vecsize_modifier or something
else, instead of ISA, because it won't represent the ISA on all targets.
It is just some magic letter used in mangling of the simd functions.

> +
> +  /* To distinguish from an OpenMP simd clone, Cilk Plus functions to
> +     be cloned have a distinctive artificial label in addition to "omp
> +     declare simd".  */
> +  bool cilk_clone = flag_enable_cilkplus
> +    && lookup_attribute ("cilk plus elemental",
> +                         DECL_ATTRIBUTES (new_node->symbol.decl));

Formatting.  I'd say it should be
  bool cilk_clone
    = (flag_enable_cilkplus
       && lookup_attribute ("cilk plus elemental",
                            DECL_ATTRIBUTES (new_node->symbol.decl)));

> +  if (cilk_clone)
> +    remove_attribute ("cilk plus elemental",
> +                      DECL_ATTRIBUTES (new_node->symbol.decl));

I think it doesn't make sense to remove the attribute.

> +  pretty_printer vars_pp;

Do you really need two different pretty printers?
Can't you just print "_ZGV%c%c%d" into pp (is pp_printf
that cheap, wouldn't it be better to pp_string (pp, "_ZGV"),
2 pp_character + one pp_decimal_int?), and then do the loop over
the args, which right now writes into vars_pp and finally
pp_underscore and pp_string the normally mangled name?
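
The single-buffer approach suggested here can be sketched with a plain
std::string in place of GCC's pretty_printer. This is an illustration only:
mangle_simd_clone and its parameter names are hypothetical, and the meaning
of the two characters and of the per-argument codes is left unspecified.

```cpp
#include <cassert>
#include <string>

// Build "_ZGV" + two characters + a decimal number + the per-argument
// codes + '_' + the normally mangled name, all in one output buffer.
inline std::string mangle_simd_clone(char code1, char code2, int simdlen,
                                     const std::string& arg_codes,
                                     const std::string& mangled_name)
{
  std::string out = "_ZGV";
  out += code1;
  out += code2;
  out += std::to_string(simdlen);
  out += arg_codes;      // in the patch this part is produced by a loop over the args
  out += '_';
  out += mangled_name;
  return out;
}
```

For example, mangle_simd_clone('x', 'N', 4, "vv", "foo") yields
"_ZGVxN4vv_foo"; the letters here are placeholders chosen for the example.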

> +/* Create a simd clone of OLD_NODE and return it.  */
> +
> +static struct cgraph_node *
> +simd_clone_create (struct cgraph_node *old_node)
> +{
> +  struct cgraph_node *new_node;
> +  new_node = cgraph_function_versioning (old_node, vNULL, NULL, NULL, false,
> +                                         NULL, NULL, "simdclone");
> +

My understanding of how IPA cloning etc. works is that you first
set up various data structures describing how you change the arguments
and only then actually do cgraph_function_versioning which already during
the copying will do some of the transformations of the IL.
But perhaps those transformations are too complicated to describe for
tree-inline.c to make them for you.

> +  tree attr = lookup_attribute ("omp declare simd",
> +                                DECL_ATTRIBUTES (node->symbol.decl));
> +  if (!attr)
> +    return;
> +  do
> +    {
> +      struct cgraph_node *new_node = simd_clone_create (node);
> +
> +      bool inbranch_clause;
> +      simd_clone_clauses_extract (new_node, TREE_VALUE (attr),
> +                                  &inbranch_clause);
> +      simd_clone_compute_isa_and_simdlen (new_node);
> +      simd_clone_mangle (node, new_node);

As discussed on IRC, I was hoping that for OpenMP simd and selected
targets (e.g. i?86-linux and x86_64-linux) we could do better than that,
creating not just one or two clones as we do for Cilk+ where one can
select which CPU (and thus ISA) he wants to build the clones for, but
creating clones for all ISAs, and just based on command line options
either emit just one of them as the really optimized one and the others
just as thunks that would just call other simd clone functions or the
normal function possibly several times.

Jakub


Re: [PATCH] Invalid unpoisoning of stack redzones on ARM

2013-09-27 Thread Jakub Jelinek
On Fri, Sep 27, 2013 at 06:10:41PM +0400, Yury Gribov wrote:
> Hi all,
>
> I've recently submitted a bug report regarding invalid unpoisoning
> of stack frame redzones
> (http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58543). Could someone
> take a look at proposed patch (a simple one-liner) and check whether
> it's ok for commit?

Can you please be more verbose on why you think it is the right fix,
what exactly the problem is and why force_reg wasn't sufficient?
What exactly was XEXP (shadow_mem, 0) that force_reg didn't force it into
a pseudo?

Also, you are missing a ChangeLog entry.

> diff --git a/gcc/asan.c b/gcc/asan.c
> index 32f1837..acb00ea 100644
> --- a/gcc/asan.c
> +++ b/gcc/asan.c
> @@ -895,7 +895,7 @@ asan_clear_shadow (rtx shadow_mem, HOST_WIDE_INT len)
>  
>    gcc_assert ((len & 3) == 0);
>    top_label = gen_label_rtx ();
> -  addr = force_reg (Pmode, XEXP (shadow_mem, 0));
> +  addr = copy_to_reg (force_reg (Pmode, XEXP (shadow_mem, 0)));
>    shadow_mem = adjust_automodify_address (shadow_mem, SImode, addr, 0);
>    end = force_reg (Pmode, plus_constant (Pmode, addr, len));
>    emit_label (top_label);


Jakub


Add value range support into memcpy/memset expansion

2013-09-27 Thread Jan Hubicka
Hi,
this patch makes it possible to access value range info from setmem/movstr that
I plan to use in i386 memcpy/memset expansion code.  It is all quite
straightforward except that I need to deal with cases where max size does not
fit in HOST_WIDE_INT, where I use the maximal value as a marker.  It is then
translated to a NULL pointer for the expander, which is a bit inconsistent with
other places that use -1 as marker of unknown value.
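
The marker scheme can be pictured with a small standalone sketch. It is not
the code from the patch: SizeRange, unknown_max, block_size_range and has_max
are hypothetical names, and std::uint64_t stands in for unsigned
HOST_WIDE_INT.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// All-ones is used as the "upper bound not representable" marker.
constexpr std::uint64_t unknown_max = std::numeric_limits<std::uint64_t>::max();

struct SizeRange {
  std::uint64_t min_size;
  std::uint64_t max_size;  // unknown_max when the bound does not fit
};

// LO_FITS/HI_FITS say whether the corresponding bound fits in 64 bits.
inline SizeRange block_size_range(bool lo_fits, std::uint64_t lo,
                                  bool hi_fits, std::uint64_t hi)
{
  SizeRange r;
  r.min_size = lo_fits ? lo : 0;           // fall back to the trivial minimum
  r.max_size = hi_fits ? hi : unknown_max; // fall back to the marker
  return r;
}

// A consumer would pass "no operand" (NULL in the patch) when this is false.
inline bool has_max(const SizeRange& r) { return r.max_size != unknown_max; }
```

A consumer checks has_max before trusting max_size, which mirrors the
expander receiving NULL instead of a CONST_INT.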

I also think we lose some cases because of TER replacing the SSA_NAME by
something else, but it seems to work in quite a few cases. This can probably be
tracked incrementally by disabling TER here or finally getting away from
expanding calls via the generic route.

Bootstrapped/regtested x86_64-linux, OK?

Honza

* doc/md.texi (setmem, movstr): Update documentation.
* builtins.c (determine_block_size): New function.
(expand_builtin_memcpy): Use it and pass it to
emit_block_move_hints.
(expand_builtin_memset_args): Use it and pass it to
set_storage_via_setmem.
* expr.c (emit_block_move_via_movmem): Add min_size/max_size parameters;
update call to expander.
(emit_block_move_hints): Add min_size/max_size parameters.
(clear_storage_hints): Likewise.
(set_storage_via_setmem): Likewise.
(clear_storage): Update.
* expr.h (emit_block_move_hints, clear_storage_hints,
set_storage_via_setmem): Update prototype.

Index: doc/md.texi
===
--- doc/md.texi (revision 202968)
+++ doc/md.texi (working copy)
@@ -5198,6 +5198,9 @@ destination and source strings are opera
 the expansion of this pattern should store in operand 0 the address in
 which the @code{NUL} terminator was stored in the destination string.
 
+This patern has also several optional operands that are same as in
+@code{setmem}.
+
 @cindex @code{setmem@var{m}} instruction pattern
 @item @samp{setmem@var{m}}
 Block set instruction.  The destination string is the first operand,
@@ -5217,6 +5220,8 @@ respectively.  The expected alignment di
 in a way that the blocks are not required to be aligned according to it in
 all cases. This expected alignment is also in bytes, just like operand 4.
 Expected size, when unknown, is set to @code{(const_int -1)}.
+Operand 7 is the minimal size of the block and operand 8 is the
+maximal size of the block (NULL if it can not be represented as CONST_INT).
 
 The use for multiple @code{setmem@var{m}} is as for @code{movmem@var{m}}.
 
Index: builtins.c
===
--- builtins.c  (revision 202968)
+++ builtins.c  (working copy)
@@ -3070,6 +3070,51 @@ builtin_memcpy_read_str (void *data, HOS
   return c_readstr (str + offset, mode);
 }
 
+/* LEN specify length of the block of memcpy/memset operation.
+   Figure out its range and put it into MIN_SIZE/MAX_SIZE.  */
+
+static void
+determine_block_size (tree len, rtx len_rtx,
+ unsigned HOST_WIDE_INT *min_size,
+ unsigned HOST_WIDE_INT *max_size)
+{
+  if (CONST_INT_P (len_rtx))
+{
+  *min_size = *max_size = UINTVAL (len_rtx);
+  return;
+}
+  else
+{
+  double_int min, max;
+  if (TREE_CODE (len) == SSA_NAME
+  && get_range_info (len, &min, &max) == VR_RANGE)
+   {
+ if (min.fits_uhwi ())
+   *min_size = min.to_uhwi ();
+ else
+   *min_size = 0;
+ if (max.fits_uhwi ())
+   *max_size = max.to_uhwi ();
+ else
+   *max_size = (HOST_WIDE_INT)-1;
+   }
+  else
+   {
+ if (host_integerp (TYPE_MIN_VALUE (TREE_TYPE (len)), 1))
+   *min_size = tree_low_cst (TYPE_MIN_VALUE (TREE_TYPE (len)), 1);
+ else
+   *min_size = 0;
+ if (host_integerp (TYPE_MAX_VALUE (TREE_TYPE (len)), 1))
+   *max_size = tree_low_cst (TYPE_MAX_VALUE (TREE_TYPE (len)), 1);
+ else
+   *max_size = GET_MODE_MASK (GET_MODE (len_rtx));
+   }
+}
+  gcc_checking_assert (*max_size <=
+  (unsigned HOST_WIDE_INT)
+ GET_MODE_MASK (GET_MODE (len_rtx)));
+}
+
 /* Expand a call EXP to the memcpy builtin.
Return NULL_RTX if we failed, the caller should emit a normal call,
otherwise try to get the result in TARGET, if convenient (and in
@@ -3092,6 +3137,8 @@ expand_builtin_memcpy (tree exp, rtx tar
   rtx dest_mem, src_mem, dest_addr, len_rtx;
   HOST_WIDE_INT expected_size = -1;
   unsigned int expected_align = 0;
+  unsigned HOST_WIDE_INT min_size;
+  unsigned HOST_WIDE_INT max_size;
 
   /* If DEST is not a pointer type, call the normal function.  */
   if (dest_align == 0)
@@ -3111,6 +3158,7 @@ expand_builtin_memcpy (tree exp, rtx tar
   dest_mem = get_memory_rtx (dest, len);
   set_mem_align (dest_mem, dest_align);
   len_rtx = expand_normal (len);
+  

Re: Generic tuning in x86-tune.def 1/2

2013-09-27 Thread H.J. Lu
On Fri, Sep 27, 2013 at 1:56 AM, Jan Hubicka hubi...@ucw.cz wrote:
> Hi,
> this is second part of the generic tuning changes sanityzing the tuning flags.
> This patch again is supposed to deal with the obvious part only.
> I will send separate patch for more changes.
>
> The flags changed agree on all CPUs considered for generic (and their
> optimization manuals) + amdfam10, core2 and Atom SLM.
>
> I also added X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL to bobcat tuning, since it
> seems like obvious omision (after double checking in optimization manual) and
> droped X86_TUNE_FOUR_JUMP_LIMIT for buldozer cores.  Implementation of this
> feature was always bit weird and its main purpose was to avoid terrible branch
> predictor degeneration on the older AMD branch predictors. I benchmarked both
> spec2k and 2k6 to verify there are no regression.
>
> Especially X86_TUNE_REASSOC_FP_TO_PARALLEL seems to bring nice improvements
> in specfp benchmarks.
>
> Bootstrapped/regtested x86_64-linux, will wait for comments and commit it
> during weekend.  I will be happy to revisit any of the generic tuning if
> regressions pop up.
>
> Overall this patch also brings small code size improvements for smaller
> loads/stores and less padding at -O2. Differences are sub 0.1% however.
>
> Honza
>         * x86-tune.def (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Enable for generic.
>         (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
>         (X86_TUNE_FOUR_JUMP_LIMIT): Drop for generic and buldozer.
>         (X86_TUNE_PAD_RETURNS): Drop for newer AMD chips.

Can we drop generic on X86_TUNE_PAD_RETURNS?

>         (X86_TUNE_AVOID_VECTOR_DECODE): Drop for generic.
>         (X86_TUNE_REASSOC_FP_TO_PARALLEL): Enable for generic.


-- 
H.J.


Re: [ping] [PATCH] Silence an unused variable warning

2013-09-27 Thread Vladimir Makarov

On 13-09-27 4:55 AM, Dodji Seketeli wrote:

> Let's CC Vladimir on this easy one.
>
> Cheers.

All targets I know have ELIMINABLE_REGS defined.  Therefore it was not
caught before.

The patch is ok for me.  Thanks.



Jan-Benedict Glaw jbg...@lug-owl.de wrote:


On Fri, 2013-09-20 20:51:37 +0200, Jan-Benedict Glaw jbg...@lug-owl.de wrote:

Hi!

With the VAX target, I see this warning:

g++ -c   -g -O2 -DIN_GCC -DCROSS_DIRECTORY_STRUCTURE  -fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -pedantic -Wno-long-long 
-Wno-variadic-macros -Wno-overlength-strings -fno-common  -DHAVE_CONFIG_H -I. 
-I. -I../../../../gcc/gcc -I../../../../gcc/gcc/. 
-I../../../../gcc/gcc/../include -I../../../../gcc/gcc/../libcpp/include  
-I../../../../gcc/gcc/../libdecnumber -I../../../../gcc/gcc/../libdecnumber/dpd 
-I../libdecnumber -I../../../../gcc/gcc/../libbacktrace
../../../../gcc/gcc/lra-eliminations.c -o lra-eliminations.o
../../../../gcc/gcc/lra-eliminations.c: In function ‘void init_elim_table()’:
../../../../gcc/gcc/lra-eliminations.c:1162:8: warning: unused variable 
‘value_p’ [-Wunused-variable]
bool value_p;
 ^

[...]





Re: [PATCH] Make jump thread path carry more information

2013-09-27 Thread Jeff Law

On 09/27/2013 08:42 AM, James Greenhalgh wrote:

> On Thu, Sep 26, 2013 at 04:26:35AM +0100, Jeff Law wrote:
>> Bootstrapped and regression tested on x86_64-unknown-linux-gnu.
>> Installed on trunk.
>
> Hi Jeff,
>
> This patch caused a regression on Arm and AArch64 in:
>
> PASS->FAIL: gcc.c-torture/execute/memcpy-2.c execution,  -O3 -fomit-frame-pointer
>
> From what I can see, the only place the behaviour of the threader has
> changed is in this hunk:

Yes.  The old code was dropping the tail off the thread path; if we're
seeing failures on the ARM port as a result of fixing that goof we
obviously need to address them.


Let me take a looksie :-)

If you could pass along a .i file it'd be helpful in case I want to look 
at something under the debugger.



jeff



Re: Generic tuning in x86-tune.def 1/2

2013-09-27 Thread Jan Hubicka
> On Fri, Sep 27, 2013 at 1:56 AM, Jan Hubicka hubi...@ucw.cz wrote:
>> Hi,
>> this is second part of the generic tuning changes sanityzing the tuning flags.
>> This patch again is supposed to deal with the obvious part only.
>> I will send separate patch for more changes.
>>
>> The flags changed agree on all CPUs considered for generic (and their
>> optimization manuals) + amdfam10, core2 and Atom SLM.
>>
>> I also added X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL to bobcat tuning, since it
>> seems like obvious omision (after double checking in optimization manual) and
>> droped X86_TUNE_FOUR_JUMP_LIMIT for buldozer cores.  Implementation of this
>> feature was always bit weird and its main purpose was to avoid terrible branch
>> predictor degeneration on the older AMD branch predictors. I benchmarked both
>> spec2k and 2k6 to verify there are no regression.
>>
>> Especially X86_TUNE_REASSOC_FP_TO_PARALLEL seems to bring nice improvements
>> in specfp benchmarks.
>>
>> Bootstrapped/regtested x86_64-linux, will wait for comments and commit it
>> during weekend.  I will be happy to revisit any of the generic tuning if
>> regressions pop up.
>>
>> Overall this patch also brings small code size improvements for smaller
>> loads/stores and less padding at -O2. Differences are sub 0.1% however.
>>
>> Honza
>>         * x86-tune.def (X86_TUNE_SSE_UNALIGNED_LOAD_OPTIMAL): Enable for generic.
>>         (X86_TUNE_SSE_UNALIGNED_STORE_OPTIMAL): Likewise.
>>         (X86_TUNE_FOUR_JUMP_LIMIT): Drop for generic and buldozer.
>>         (X86_TUNE_PAD_RETURNS): Drop for newer AMD chips.
>
> Can we drop generic on X86_TUNE_PAD_RETURNS?

It is on my list for not-so-obvious changes.  I tested and removed it from
BDVER with the intention to drop it from generic. But after further testing I
lean towards keeping it for some extra time.

I tested it on fam10 machines and it causes over 10% regressions on some
benchmarks, including bzip and botan (where it is up to a 4-fold regression).
Missing a return on amdfam10 hardware is bad, because it causes the return stack
to go out of sync. At the same time I can not really measure benefits from
disabling it - the code size cost is very small and the runtime cost on
non-amdfam10 cores is not important either, since the function call overhead
hides the extra nop quite easily.

So I am inclined to apply extra care to this flag and keep it for an extra
release or two. Most of gcc.opensuse.org testing runs on these and adding
random branch mispredictions will trash them.

On a related note, what would you think of X86_TUNE_PARTIAL_FLAG_REG_STALL?
I benchmarked it on my I5 notebook and it seems to have no measurable effects
on spec2k6.

I also did some benchmarking of the patch to disable alignments you proposed.
Unfortunately I can measure slowdowns on fam10/bdver/and on botan/hand written
loops even for core.

I am considering dropping the branch target/function alignment and keeping only
loop alignment, but I did not test this yet.

Honza

