date:20131021

Re: [PATCH] Fix part of PR58712

2013-10-21 Thread Markus Trippelsdorf


LTO bootstrapped and tested on x86_64-unknown-linux-gnu.

I would be grateful if one of you guys could apply it.
Thanks.

2013-10-21  Markus Trippelsdorf  mar...@trippelsdorf.de

PR ipa/58712
* cgraph.c (cgraph_create_edge_1): Add indirect_unknown_callee
as argument.
(cgraph_create_edge): Use the new argument.
(cgraph_create_indirect_edge): Likewise.


diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 6ebd0c71e026..7f83a016dcf3 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -814,7 +814,8 @@ cgraph_set_call_stmt (struct cgraph_edge *e, gimple 
new_stmt,
 
 static struct cgraph_edge *
 cgraph_create_edge_1 (struct cgraph_node *caller, struct cgraph_node *callee,
-  gimple call_stmt, gcov_type count, int freq)
+  gimple call_stmt, gcov_type count, int freq,
+  bool indir_unknown_callee)
 {
   struct cgraph_edge *edge;
 
@@ -874,6 +875,7 @@ cgraph_create_edge_1 (struct cgraph_node *caller, struct 
cgraph_node *callee,
   edge-indirect_info = NULL;
   edge-indirect_inlining_edge = 0;
   edge-speculative = false;
+  edge-indirect_unknown_callee = indir_unknown_callee;
   if (call_stmt  caller-call_site_hash)
 cgraph_add_edge_to_call_site_hash (edge);
 
@@ -887,9 +889,8 @@ cgraph_create_edge (struct cgraph_node *caller, struct 
cgraph_node *callee,
gimple call_stmt, gcov_type count, int freq)
 {
   struct cgraph_edge *edge = cgraph_create_edge_1 (caller, callee, call_stmt,
-  count, freq);
+  count, freq, false);
 
-  edge-indirect_unknown_callee = 0;
   initialize_inline_failed (edge);
 
   edge-next_caller = callee-callers;
@@ -926,10 +927,9 @@ cgraph_create_indirect_edge (struct cgraph_node *caller, 
gimple call_stmt,
 gcov_type count, int freq)
 {
   struct cgraph_edge *edge = cgraph_create_edge_1 (caller, NULL, call_stmt,
-  count, freq);
+  count, freq, true);
   tree target;
 
-  edge-indirect_unknown_callee = 1;
   initialize_inline_failed (edge);
 
   edge-indirect_info = cgraph_allocate_init_indirect_info ();

-- 
Markus

Re: [PATCH] Fix PR58775

2013-10-21 Thread Jakub Jelinek

On Mon, Oct 21, 2013 at 01:20:06PM +0800, Zhenqiang Chen wrote:
 Thanks for the comments.
 
 Patch is updated to set uid in update_range_test after 
 force_gimple_operand_gsi.
 
 I am not sure the patch can cover all cases. If not, I think
 force_gimple_operand_gsi_1 maybe the only place to set uid for all new
 generated statements.

No, force_gimple_operand_gsi can add more than one stmt, thus I think it is
appropriate to do this instead (or force_gimple_operand_1 into a gimple_seq,
set uid on all stmts in that seq and then insert them).

OT, the other force_gimple_operand_gsi in negate_value lead me to a thought
that we could avoid the extra walk of all the IL in do_reassoc -
renumber_gimple_stmt_uids by setting the uids in break_up_subtract_bb,
where we already reset the visited flags to avoid the extra IL walk.

2013-10-21  Jakub Jelinek  ja...@redhat.com

PR tree-optimization/58775
* tree-ssa-reassoc.c (update_range_test): Set uid on stmts added by
force_gimple_operand_gsi.

* gcc.c-torture/compile/pr58775.c: New test.

--- gcc/tree-ssa-reassoc.c.jj   2013-10-21 09:00:25.0 +0200
+++ gcc/tree-ssa-reassoc.c  2013-10-21 09:47:24.431082049 +0200
@@ -2011,6 +2011,11 @@ update_range_test (struct range_entry *r
   gsi = gsi_for_stmt (stmt);
   tem = force_gimple_operand_gsi (gsi, tem, true, NULL_TREE, true,
  GSI_SAME_STMT);
+  for (gsi_prev (gsi); !gsi_end_p (gsi); gsi_prev (gsi))
+if (gimple_uid (gsi_stmt (gsi)))
+  break;
+else
+  gimple_set_uid (gsi_stmt (gsi), gimple_uid (stmt));
 
   /* If doing inter-bb range test optimization, update the
  stmts immediately.  Start with changing the first range test
--- gcc/testsuite/gcc.c-torture/compile/pr58775.c.jj2013-10-21 
09:48:34.034726581 +0200
+++ gcc/testsuite/gcc.c-torture/compile/pr58775.c   2013-10-21 
09:48:47.029662347 +0200
@@ -0,0 +1,26 @@
+/* PR tree-optimization/58775 */
+
+void bar (void);
+
+void
+foo (char *x)
+{
+  char a;
+  _Bool b, c, d, e, f, g, h, i, j, k, l, m;
+
+  a = *x;
+  b = a == 100;
+  c = a == 105;
+  d = b | c;
+  e = a != 111;
+  f = !d;
+  g = e  f;
+  h = a != 117;
+  i = g  h;
+  j = a != 120;
+  k = i  j;
+  l = a != 88;
+  m = k  l;
+  if (m == 0)
+bar ();
+}


Jakub

Re: [PR libstdc++/58804][PR libstdc++/58729] tr2/dynamic_bitset issues.

2013-10-21 Thread Paolo Carlini


On 10/21/2013 02:02 AM, Ed Smith-Rowland wrote:

Greetings.

Here is a patch to correct tr2/dynamic_bitset to use __builtin_xxxll 
for long long instead of the long versions.


Relevant bugs:
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58804
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=58729

Builds and *really* tests clean on x86_64-linux.

OK?

Ok, thanks. Remember to close the Bugs.

Paolo.

[PATCH][AArch64] Get %c output template tests to pass for -fPIC

2013-10-21 Thread Kyrill Tkachov


Hi all,

When I added the tests for the %c output template for aarch64 I did not take 
into account that compiling them with -fPIC would fail. This patch fixes them to 
use the 'S' constraint to get them to work.


Ok for trunk?

Tested on aarch64-none-elf/-fPIC

Thanks,
Kyrill


[gcc/testsuite]
2013-10-21  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* gcc.target/aarch64/c-output-mod-2.c: Fix for -fPIC.
* gcc.target/aarch64/c-output-mod-3.c: Likewise.diff --git a/gcc/testsuite/gcc.target/aarch64/c-output-template-2.c b/gcc/testsuite/gcc.target/aarch64/c-output-template-2.c
index 16ff58d..ced96d0 100644
--- a/gcc/testsuite/gcc.target/aarch64/c-output-template-2.c
+++ b/gcc/testsuite/gcc.target/aarch64/c-output-template-2.c
@@ -1,15 +1,9 @@
 /* { dg-do compile } */
 
-struct tracepoint {
-int dummy;
-int state;
-};
-static struct tracepoint tp;
-
 void
 test (void)
 {
-__asm__ (@ %c0 : : i (tp));
+__asm__ (@ %c0 : : S (test));
 }
 
-/* { dg-final { scan-assembler @ tp } } */
+/* { dg-final { scan-assembler @ test } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/c-output-template-3.c b/gcc/testsuite/gcc.target/aarch64/c-output-template-3.c
index e332fe1..c28837c 100644
--- a/gcc/testsuite/gcc.target/aarch64/c-output-template-3.c
+++ b/gcc/testsuite/gcc.target/aarch64/c-output-template-3.c
@@ -1,15 +1,10 @@
 /* { dg-do compile } */
-
-struct tracepoint {
-int dummy;
-int state;
-};
-static struct tracepoint tp;
+/* { dg-options -Wno-pointer-arith } */
 
 void
 test (void)
 {
-__asm__ (@ %c0 : : i (tp.state));
+__asm__ (@ %c0 : : S (test + 4));
 }
 
-/* { dg-final { scan-assembler @ tp\\+4 } } */
+/* { dg-final { scan-assembler @ test\\+4 } } */

Re: libstdc++ Testsuite extension for sort, partial_sort, partial_sort_copy, nth_element

2013-10-21 Thread Paolo Carlini


Hi,

On 10/19/2013 11:10 PM, Christopher Jefferson wrote:

The following patch (related to my 58800 patch) adds many more tests
for several important functions. While 58800 is my fault, the reason
it was not caught earlier is that many functions in libstdc++ have
almost no testing. This works toward fixing that.

These tests are kept to an acceptable amount of time (~2 seconds/
function) at the moment (and automatic reduction for simulators),
although I would like to take much more time and do much more testing,
but I realise that might annoy some people so it might not be on by
default.

I plan to write my tests if these are accepted (although writing this
reminded me why gcc is the worst open source project for submitting
code to I have ever worked with).

2013-10-16  Edward Smith-Rowland  3dw...@verizon.net

* testsuite/util/testsuite_iterators.h : Add 'val' method to read
testsuite container.
* testsuite/25_algorithms/partial_sort/nth_element.cc : New
* testsuite/25_algorithms/partial_sort/random_test.cc : New
* testsuite/25_algorithms/partial_sort_copy/random_test.cc : New
* testsuite/25_algorithms/sort/random_test.cc : New
* testsuite/util/testsuite_containergen.h : New header for
automatically testing functions on a large set of autogenerated
headers
a bunch of nits and a more substantive issue. The nits are: the name in 
the ChangeLog ;); these days prefer -std=gnu++11 to 0x, only 2013 as 
Copyright date. Also, stylistically, we don't use curly braces in 
single-line for loop bodies. Watch out overlong lines, as usual (I 
spotted a couple)


Slightly more interesting: while we are at it, we are including the 
whole random, why not using uniform_int_distribution instead of 
fiddling with % in lots of places? It's also preferable from a 
statistical point of view (probably doesn't matter much in this specific 
application, but using % isn't something we want to recommend to people 
studying free software to learn about these topics, eh!)


Thanks,
Paolo.

Re: [PATCH][i386] Enable vector_loop in memset expanding and merge expanders for memset and memmov

2013-10-21 Thread Michael V. Zolotukhin

 The patch is OK.
Thanks, the patch was committed.

  That's a good point.  I added a check for this case - so if CONST_INT is 
  passed
  we assume that mode is QI.  But usually promoted value resides in a 
  register, so
  we don't go over-conservative here.
 Hmm, so if we use broadcast constant, then we won't end up having CONST_INT 
 here?
 It is OK then.
Currently we don't end up with CONST_INT.  But maybe we'd need to revisit this
place later.

Michael
 Honza

Re: [Patch, fortran] PR58793 - Wrong value for _vtab for intrinsic types with CLASS(): storage_size of class() gives wrong result

2013-10-21 Thread Tobias Burnus


Paul Richard Thomas wrote:

This patch is fairly obvious and follows a suggestion from Tobias to
use gfc_element_size.  He even wrote the testcase!
Bootstrapped and regested on FC17/x86_64 - OK for trunk?


Looks good to me. Thanks for the patch.


PS In writing this, I have just noted that I need to trap ts-type ==
BT_HOLLERITH.  It would be a rather bizarre thing to try to do but I
just know that somebody, somewhere will eventually try it :-) Consider
that it will be done in the committed version.


I think that should be done as follow-up patch.In particular, it is not 
obvious how it should be handled.


The program below ICEs in gfc_typenode_for_spec;with both Intel's and 
Cray's compiler, it gives

 Character with len 3
 Something else
I somehow had expected that it maps to a character.

program test
  call up(abc)
  call up(3habc)
contains
  subroutine up(x)
class(*) :: x
select type(x)
 type is (character(*))
  print *, 'Character with len', len(x)
 class default
  print *, 'Something else'
end select
  end subroutine
end program test

 The following - related to TYPE(*) - is also accepted, but I think it 
shouldn't. (call up(x) is presumably invalid). It fails here at 
assembler time.


program test
  use iso_c_binding
  integer,target ::aa
  call up(c_loc(aa)) ! Probably okay but not matchable?
  call bar(aa) !OK
contains
  subroutine up(x)
class(*) :: x
select type(x)
! type is (c_ptr) ! invalid  rejected by gfortran
!  print *, 'C_ptr'
 class default
  print *, 'Something else'
end select
  end subroutine
  subroutine bar(x)
   type(*) :: x
   call up(x) ! Invalid
  end subroutine bar
end program test


Tobias


2013-10-20  Paul Thomas  pa...@gcc.gnu.org

 PR fortran 57893
 * class.c : Include target-memory.h.
 (gfc_find_intrinsic_vtab) Build a minimal expression so that
 gfc_element_size can be used to obtain the storage size, rather
 that the kind value.

2013-10-20  Paul Thomas  pa...@gcc.gnu.org

 PR fortran 57893
 * gfortran.dg/unlimited_polymorphic_12.f90 : New test.

[4.8 branch] Handle svn 1.8.1

2013-10-21 Thread Andreas Jaeger


I'd like to backport this patch from trunk to handle subversion 1.8.1
output.

Ok to commit?

Andreas

2013-10-21  Mike Stump  mikest...@comcast.net

* gcc_update (configure): Update to handle svn 1.8.1.

Index: contrib/gcc_update
===
--- contrib/gcc_update  (revision 203886)
+++ contrib/gcc_update  (working copy)
@@ -382,7 +382,7 @@
fi

revision=`$GCC_SVN info | awk '/Revision:/ { print $2 }'`
-   branch=`$GCC_SVN info | sed -ne /URL:/ {
+   branch=`$GCC_SVN info | sed -ne /^URL:/ {
s,.*/trunk,trunk,
s,.*/branches/,,
s,.*/tags/,,

-- 
 Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
  SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
   GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126

Re: [4.8 branch] Handle svn 1.8.1

2013-10-21 Thread Jakub Jelinek

On Mon, Oct 21, 2013 at 12:01:28PM +0200, Andreas Jaeger wrote:
 Ok to commit?
 
 Andreas
 
 2013-10-21  Mike Stump  mikest...@comcast.net
 
   * gcc_update (configure): Update to handle svn 1.8.1.

Ok.

 --- contrib/gcc_update(revision 203886)
 +++ contrib/gcc_update(working copy)
 @@ -382,7 +382,7 @@
   fi
 
   revision=`$GCC_SVN info | awk '/Revision:/ { print $2 }'`
 - branch=`$GCC_SVN info | sed -ne /URL:/ {
 + branch=`$GCC_SVN info | sed -ne /^URL:/ {
   s,.*/trunk,trunk,
   s,.*/branches/,,
   s,.*/tags/,,
 
 -- 
  Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
   SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn,Jennifer Guild,Felix Imendörffer,HRB16746 (AG Nürnberg)
 GPG fingerprint = 93A3 365E CE47 B889 DF7F  FED1 389A 563C C272 A126

Jakub

[Patch] Fix gcc.dg/20050922-*.c

2013-10-21 Thread Vidya Praveen

Hello,

Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h. This can
be a issue especially since they define uint32_t. Testcase writing guidelines
discourages such inclusion as well.

This patch replaces these #includes with manual declarations.

Tested for aarch64-none-elf, arm-none-eabi and x86_64-linux-gnu

OK for trunk, 4.7, 4.8?

VP.

---

gcc/testsuite/ChangeLog:

2013-10-21  Vidya Praveen  vidyaprav...@arm.com

* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort().
* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort() and exit().

diff --git a/gcc/testsuite/gcc.dg/20050922-1.c b/gcc/testsuite/gcc.dg/20050922-1.c
index ed5a3c6..982f820 100644
--- a/gcc/testsuite/gcc.dg/20050922-1.c
+++ b/gcc/testsuite/gcc.dg/20050922-1.c
@@ -4,7 +4,7 @@
 /* { dg-do run } */
 /* { dg-options -O1 -std=c99 } */
 
-#include stdlib.h
+extern void abort (void);
 
 #if __INT_MAX__ == 2147483647
 typedef unsigned int uint32_t;
diff --git a/gcc/testsuite/gcc.dg/20050922-2.c b/gcc/testsuite/gcc.dg/20050922-2.c
index c2974d0..2e8db82 100644
--- a/gcc/testsuite/gcc.dg/20050922-2.c
+++ b/gcc/testsuite/gcc.dg/20050922-2.c
@@ -4,7 +4,8 @@
 /* { dg-do run } */
 /* { dg-options -O1 -std=c99 } */
 
-#include stdlib.h
+extern void abort (void);
+extern void exit (int);
 
 #if __INT_MAX__ == 2147483647
 typedef unsigned int uint32_t;

Add FLAGS_REG clobber to kxnormode

2013-10-21 Thread Uros Bizjak

Hello!

We are splitting it to normal XOR that includes clobber, so better
start with the clobber from the beginning.

2013-10-20  Uros Bizjak  ubiz...@gmail.com

* config/i386/i386.md (kxnormode): Add FLAGS_REG clobber.

Tested on x86_64-pc-linux-gnu and committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 203885)
+++ i386.md (working copy)
@@ -8254,7 +8253,8 @@
(not:SWI12
  (xor:SWI12
(match_operand:SWI12 1 register_operand 0,Yk)
-   (match_operand:SWI12 2 register_operand r,Yk]
+   (match_operand:SWI12 2 register_operand r,Yk
+   (clobber (reg:CC FLAGS_REG))]
   TARGET_AVX512F
   @
#
@@ -8268,15 +8268,15 @@
(not:SWI12
  (xor:SWI12
(match_dup 0)
-   (match_operand:SWI12 1 general_reg_operand]
+   (match_operand:SWI12 1 general_reg_operand
+   (clobber (reg:CC FLAGS_REG))]
   TARGET_AVX512F  reload_completed
[(parallel [(set (match_dup 0)
(xor:HI (match_dup 0)

Re: [Patch, fortran] PR58793 - Wrong value for _vtab for intrinsic types with CLASS(): storage_size of class() gives wrong result

2013-10-21 Thread Paul Richard Thomas

Dear Tobias,



 Looks good to me. Thanks for the patch.



 I think that should be done as follow-up patch.In particular, it is not
 obvious how it should be handled.

OK - I'll commit tonight, as is.


 The program below ICEs in gfc_typenode_for_spec;with both Intel's and Cray's
 compiler, it gives
  Character with len 3
  Something else
 I somehow had expected that it maps to a character.

I guess that it would do no harm to do that mapping anyway, I'll
have a think about it tonight.

Cheers and thanks for the review.

Paul


 program test
   call up(abc)
   call up(3habc)
 contains
   subroutine up(x)
 class(*) :: x
 select type(x)
  type is (character(*))
   print *, 'Character with len', len(x)
  class default
   print *, 'Something else'
 end select
   end subroutine
 end program test

  The following - related to TYPE(*) - is also accepted, but I think it
 shouldn't. (call up(x) is presumably invalid). It fails here at assembler
 time.

 program test
   use iso_c_binding
   integer,target ::aa
   call up(c_loc(aa)) ! Probably okay but not matchable?
   call bar(aa) !OK
 contains
   subroutine up(x)
 class(*) :: x
 select type(x)
 ! type is (c_ptr) ! invalid  rejected by gfortran
 !  print *, 'C_ptr'
  class default
   print *, 'Something else'
 end select
   end subroutine
   subroutine bar(x)
type(*) :: x
call up(x) ! Invalid
   end subroutine bar
 end program test


 Tobias


 2013-10-20  Paul Thomas  pa...@gcc.gnu.org

  PR fortran 57893
  * class.c : Include target-memory.h.
  (gfc_find_intrinsic_vtab) Build a minimal expression so that
  gfc_element_size can be used to obtain the storage size, rather
  that the kind value.

 2013-10-20  Paul Thomas  pa...@gcc.gnu.org

  PR fortran 57893
  * gfortran.dg/unlimited_polymorphic_12.f90 : New test.





-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy

Re: [wide-int] Add is_sign_extended optimisation

2013-10-21 Thread Richard Biener

On Sun, 20 Oct 2013, Richard Sandiford wrote:

 Another follow-up to yesterday's patch.  This one implements Richard's
 suggestion of having an is_sign_extended trait to optimise cases where
 excess upper bits are known to be signs rather than undefined.
 
 The uses so far are:
 
 * make to_shwi () equivalent to slow ()
 * turn eq_p into a simple loop
 * avoid extensions in sign_mask ()
 * avoid extensions in set_len if the input was already sign-extended
 
 The first two are new (compared to wide-int svn) while the second two
 partially undo the negative effects of yesterday's patch on is_sign_extended
 values.
 
 E.g.
 
   bool
   f (wide_int x, HOST_WIDE_INT y)
   {
 return x == y;
   }
 
 now gives:
 
 xorl%eax, %eax
 cmpl$1, 264(%rsp)
 je  .L27
 ret
 .p2align 4,,10
 .p2align 3
 .L27:   
 cmpq8(%rsp), %rdi
 sete%al
 ret
 
 And:
 
   wide_int
   f (wide_int x, wide_int y)
   {
 return x == y;
   }
 
 gives:
 
 movl264(%rsp), %ecx
 xorl%eax, %eax
 cmpl528(%rsp), %ecx
 je  .L42
 rep ret
 .p2align 4,,10
 .p2align 3
 .L42:
 xorl%eax, %eax
 jmp .L38
 .p2align 4,,10
 .p2align 3
 .L44:
 addl$1, %eax
 cmpl%eax, %ecx
 je  .L43
 .L38:
 movl%eax, %edx
 movq272(%rsp,%rdx,8), %rsi
 cmpq%rsi, 8(%rsp,%rdx,8)
 je  .L44
 xorl%eax, %eax
 ret
 .p2align 4,,10
 .p2align 3
 .L43:
 movl$1, %eax
 ret
 
 (which is a bit poor -- je .L42 trivially threads to je. L38,
 although that's probably only true after RA).
 
 The code for:
 
   bool
   f (wide_int x, unsigned HOST_WIDE_INT y)
   {
 return x == y;
   }
 
 still needs some work though...
 
 The lts_p sequences of wide_int are similar to the ones Mike posted.
 
 Tested on x86_64-linux-gnu.  OK for wide-int?

Ok.

Thanks,
Richard.

 Thanks,
 Richard
 
 
 Index: gcc/rtl.h
 ===
 --- gcc/rtl.h 2013-10-20 09:38:40.254493991 +0100
 +++ gcc/rtl.h 2013-10-20 09:39:28.169894855 +0100
 @@ -1410,6 +1410,7 @@ typedef std::pair rtx, enum machine_mod
{
  static const enum precision_type precision_type = VAR_PRECISION;
  static const bool host_dependent_precision = false;
 +static const bool is_sign_extended = true;
  static unsigned int get_precision (const rtx_mode_t );
  static wi::storage_ref decompose (HOST_WIDE_INT *, unsigned int,
 const rtx_mode_t );
 Index: gcc/tree.h
 ===
 --- gcc/tree.h2013-10-20 09:38:40.254493991 +0100
 +++ gcc/tree.h2013-10-20 09:39:28.170894863 +0100
 @@ -5158,6 +5158,7 @@ #define ANON_AGGRNAME_FORMAT __anon_%d
{
  static const enum precision_type precision_type = FLEXIBLE_PRECISION;
  static const bool host_dependent_precision = false;
 +static const bool is_sign_extended = false;
  static unsigned int get_precision (const_tree);
  static wi::storage_ref decompose (HOST_WIDE_INT *, unsigned int,
 const_tree);
 Index: gcc/wide-int.h
 ===
 --- gcc/wide-int.h2013-10-20 09:39:11.527755622 +0100
 +++ gcc/wide-int.h2013-10-20 09:45:17.725820291 +0100
 @@ -335,8 +335,21 @@ #define WI_UNARY_RESULT_VAR(RESULT, VAL,
  struct wide_int_storage;
  typedef generic_wide_int wide_int_storage wide_int;
  
 +template bool SE
  struct wide_int_ref_storage;
 -typedef generic_wide_int wide_int_ref_storage wide_int_ref;
 +
 +typedef generic_wide_int wide_int_ref_storage false  wide_int_ref;
 +
 +/* This can be used instead of wide_int_ref if the referenced value is
 +   known to have type T.  It carries across properties of T's representation,
 +   such as whether excess upper bits in a HWI are defined, and can therefore
 +   help avoid redundant work.
 +
 +   The macro could be replaced with a template typedef, once we're able
 +   to use those.  */
 +#define WIDE_INT_REF_FOR(T) \
 +  generic_wide_int \
 +wide_int_ref_storage wi::int_traits T::is_sign_extended 
  
  /* Public functions for querying and operating on integers.  */
  namespace wi
 @@ -520,18 +533,6 @@ wi::storage_ref::get_val () const
return val;
  }
  
 -namespace wi
 -{
 -  template 
 -  struct int_traits wi::storage_ref
 -  {
 -static const enum precision_type precision_type = VAR_PRECISION;
 -/* wi::storage_ref can be a reference to a primitive type,
 -   so this is the conservatively-correct setting.  */
 -static const bool host_dependent_precision = true;
 -  };
 -}
 -
  /* This class defines an integer type using the storage provided by the
 template argument.  The storage class must provide the

[PATCH] Fix regression parts of PR58742

2013-10-21 Thread Richard Biener


This fixes a regression introduced in GCC 4.2 after which we no
longer fold (unsigned) (X /[ex] 4) * 4 to (unsigned) X.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Richard.

2013-10-21  Richard Biener  rguent...@suse.de

PR middle-end/58742
* fold-const.c (fold_binary_loc): Fold ((T) (X /[ex] C)) * C
to (T) X for sign-changing conversions (or no conversion).

* c-c++-common/fold-divmul-1.c: New testcase.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 203885)
--- gcc/fold-const.c(working copy)
*** fold_binary_loc (location_t loc,
*** 11002,11007 
--- 11002,11014 
fold_build2_loc (loc, MULT_EXPR, type,
 build_int_cst (type, 2) , arg1));
  
+ /* ((T) (X /[ex] C)) * C cancels out if the conversion is
+sign-changing only.  */
+ if (TREE_CODE (arg1) == INTEGER_CST
+  TREE_CODE (arg0) == EXACT_DIV_EXPR
+  operand_equal_p (arg1, TREE_OPERAND (arg0, 1), 0))
+   return fold_convert_loc (loc, type, TREE_OPERAND (arg0, 0));
+ 
  strict_overflow_p = false;
  if (TREE_CODE (arg1) == INTEGER_CST
   0 != (tem = extract_muldiv (op0, arg1, code, NULL_TREE,
Index: gcc/testsuite/c-c++-common/fold-divmul-1.c
===
*** gcc/testsuite/c-c++-common/fold-divmul-1.c  (revision 0)
--- gcc/testsuite/c-c++-common/fold-divmul-1.c  (working copy)
***
*** 0 
--- 1,11 
+ /* { dg-do compile } */
+ /* { dg-options -fdump-tree-original } */
+ 
+ int *
+ fx (int *b, int *e)
+ {
+   return b + (e - b);
+ }
+ 
+ /* { dg-final { scan-tree-dump-not /\\\[ex\\\] original } } */
+ /* { dg-final { cleanup-tree-dump original } } */

[PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-21 Thread Ilya Enkovich

Hi,

This patch starts the series which introduces Pointers Checker and its support 
in i386 via Intel MPX. Pointers Checker is described on Wiki page - 
http://gcc.gnu.org/wiki/Intel%20MPX%20support%20in%20the%20GCC%20compiler. This 
series actually replaces previously sent patch 
(http://gcc.gnu.org/ml/gcc-patches/2013-08/msg01167.html) which seems too 
inconvenient for review.

The first patch in a series introduces new target and language hooks used by 
Pointers Checker.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  ilya.enkov...@intel.com

* target.def (builtin_chkp_function): New.
(chkp_bound_type): New.
(chkp_bound_mode): New.
(fn_abi_va_list_bounds_size): New.
(load_bounds_for_arg): New.
(store_bounds_for_arg): New.
* targhooks.h (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode): New.
(default_builtin_chkp_function): New.
* targhooks.c (default_load_bounds_for_arg): New.
(default_store_bounds_for_arg): New.
(default_fn_abi_va_list_bounds_size): New.
(default_chkp_bound_type): New.
(default_chkp_bound_mode); New.
(default_builtin_chkp_function): New.
* doc/tm.texi.in (TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE): New.
(TARGET_LOAD_BOUNDS_FOR_ARG): New.
(TARGET_STORE_BOUNDS_FOR_ARG): New.
(TARGET_BUILTIN_CHKP_FUNCTION): New.
(TARGET_CHKP_BOUND_TYPE): New.
(TARGET_CHKP_BOUND_MODE): New.
* doc/tm.texi: Regenerated.
* langhooks.h (lang_hooks): Add chkp_supported field.
* langhooks-def.h (LANG_HOOKS_CHKP_SUPPORTED): New.
(LANG_HOOKS_INITIALIZER); Add LANG_HOOKS_CHKP_SUPPORTED.


diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..01462a2 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,11 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for va_list object or integer_zero_node if
+it does not have any (e.g. is scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5156,16 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx, @var{rtx}, 
@var{rtx})
+This hook is used to emit insn to load arg's bounds
+in case bounds are not passed on register.  Return loaded bounds
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx, @var{rtx}, 
@var{rtx}, @var{rtx})
+This hook is used to emit insn to store arg's bounds
+in case bounds are not passed on register.
+@end deftypefn
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -10907,6 +10922,18 @@ ignored.  This function should return the result of 
the call to the
 built-in function.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned 
@var{fcode})
+Pointers checker instrumentation pass uses this hook to obtain
+target-specific functions which implement specified generic checker
+builtins.
+@end deftypefn
+@deftypefn {Target Hook} tree TARGET_CHKP_BOUND_TYPE (void)
+Return type to be used for bounds
+@end deftypefn
+@deftypefn {Target Hook} {enum machine_mode} TARGET_CHKP_BOUND_MODE (void)
+Return mode to be used for bounds.
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_RESOLVE_OVERLOADED_BUILTIN (unsigned int 
@var{loc}, tree @var{fndecl}, void *@var{arglist})
 Select a replacement for a machine specific built-in function that
 was set up by @samp{TARGET_INIT_BUILTINS}.  This is done
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 863e843a..2828361 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -3694,6 +3694,8 @@ stack.
 
 @hook TARGET_FN_ABI_VA_LIST
 
+@hook TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE
+
 @hook TARGET_CANONICAL_VA_LIST_TYPE
 
 @hook TARGET_GIMPLIFY_VA_ARG_EXPR
@@ -4064,6 +4066,10 @@ These machine description macros help implement varargs:
 
 @hook TARGET_PRETEND_OUTGOING_VARARGS_NAMED
 
+@hook TARGET_LOAD_BOUNDS_FOR_ARG
+
+@hook TARGET_STORE_BOUNDS_FOR_ARG
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -8184,6 +8190,10 @@ to by @var{ce_info}.
 
 @hook TARGET_EXPAND_BUILTIN
 
+@hook TARGET_BUILTIN_CHKP_FUNCTION
+@hook TARGET_CHKP_BOUND_TYPE
+@hook

Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-21 Thread Joseph S. Myers

On Mon, 21 Oct 2013, Ilya Enkovich wrote:

 +DEFHOOK
 +(builtin_chkp_function,
 + Pointers checker instrumentation pass uses this hook to obtain\n\
 +target-specific functions which implement specified generic checker\n\
 +builtins.,
 + tree, (unsigned fcode),
 + default_builtin_chkp_function)

I don't think that's enough detail.  The audience for this hook 
description is back-end maintainers wanting to implement such hooks for 
their back ends, and the hook description should give sufficient 
information to do so.  This description says nothing at all about the 
semantics of the hook argument or return value.

If it seems difficult to describe things sufficiently in the context of 
individual hook descriptions, maybe an overview of the feature and 
implementation approach is needed as a new section in the internals 
manual, with hook descriptions then referring to that section, or going in 
appropriate places within that section (if the section is in tm.texi.in).

 +DEFHOOK
 +(fn_abi_va_list_bounds_size,
 + This hook returns size for va_list object or integer_zero_node if\n\
 +it does not have any (e.g. is scalar pointer to the stack).,
 + tree, (tree fndecl),
 + default_fn_abi_va_list_bounds_size)

@code{va_list}, @code{integer_zero_node}, specify semantics of fndecl 
argument.

  DEFHOOK
 +(load_bounds_for_arg,
 + This hook is used to emit insn to load arg's bounds\n\
 +in case bounds are not passed on register.  Return loaded bounds,
 + rtx, (rtx, rtx, rtx),
 + default_load_bounds_for_arg)

You need to name all the arguments and explain their semantics by name in 
the documentation (which should end with .).

 +DEFHOOK
 +(store_bounds_for_arg,
 + This hook is used to emit insn to store arg's bounds\n\
 +in case bounds are not passed on register.,
 + void, (rtx, rtx, rtx, rtx),
 + default_store_bounds_for_arg)

Likewise.

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-21 Thread Ilya Enkovich

Hi,

This patch introduces built-in functions used by Pointers Checker and flag to 
enable Pointers Checker. Builtins available for user are expanded in 
expand_builtin. All other builtins are not allowed in expand until generic 
version of Pointers Cheker is implemented.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-04  Ilya Enkovich  ilya.enkov...@intel.com

* builtin-types.def (BT_FN_VOID_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR): New.
(BT_FN_CONST_PTR_CONST_PTR): New.
(BT_FN_PTR_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR): New.
(BT_FN_VOID_PTRPTR_CONST_PTR): New.
(BT_FN_VOID_CONST_PTR_SIZE): New.
(BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE): New.
* chkp-builtins.def: New.
* builtins.def: include chkp-builtins.def.
(DEF_CHKP_BUILTIN): New.
* builtins.c (expand_builtin): Support BUILT_IN_CHKP_INIT_PTR_BOUNDS,
BUILT_IN_CHKP_NULL_PTR_BOUNDS, BUILT_IN_CHKP_COPY_PTR_BOUNDS,
BUILT_IN_CHKP_CHECK_PTR_LBOUNDS, BUILT_IN_CHKP_CHECK_PTR_UBOUNDS,
BUILT_IN_CHKP_CHECK_PTR_BOUNDS, BUILT_IN_CHKP_SET_PTR_BOUNDS,
BUILT_IN_CHKP_NARROW_PTR_BOUNDS, BUILT_IN_CHKP_STORE_PTR_BOUNDS,
BUILT_IN_CHKP_GET_PTR_LBOUND, BUILT_IN_CHKP_GET_PTR_UBOUND,
BUILT_IN_CHKP_BNDMK, BUILT_IN_CHKP_BNDSTX, BUILT_IN_CHKP_BNDCL,
BUILT_IN_CHKP_BNDCU, BUILT_IN_CHKP_BNDLDX, BUILT_IN_CHKP_BNDRET,
BUILT_IN_CHKP_INTERSECT, BUILT_IN_CHKP_ARG_BND, BUILT_IN_CHKP_NARROW,
BUILT_IN_CHKP_EXTRACT_LOWER, BUILT_IN_CHKP_EXTRACT_UPPER.
* common.opt (fcheck-pointers): New.
* toplev.c (process_options): Check Pointers Checker is supported.
* doc/extend.texi: Document Pointers Checker built-in functions.


diff --git a/gcc/builtin-types.def b/gcc/builtin-types.def
index 2634ecc..c6c5e5c 100644
--- a/gcc/builtin-types.def
+++ b/gcc/builtin-types.def
@@ -224,12 +224,15 @@ DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT64_DFLOAT64, 
BT_DFLOAT64, BT_DFLOAT64)
 DEF_FUNCTION_TYPE_1 (BT_FN_DFLOAT128_DFLOAT128, BT_DFLOAT128, BT_DFLOAT128)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_VPTR, BT_VOID, BT_VOLATILE_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_VOID_PTRPTR, BT_VOID, BT_PTR_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_VOID_CONST_PTR, BT_VOID, BT_CONST_PTR)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT_UINT, BT_UINT, BT_UINT)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONG_ULONG, BT_ULONG, BT_ULONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_ULONGLONG_ULONGLONG, BT_ULONGLONG, BT_ULONGLONG)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT16_UINT16, BT_UINT16, BT_UINT16)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT32_UINT32, BT_UINT32, BT_UINT32)
 DEF_FUNCTION_TYPE_1 (BT_FN_UINT64_UINT64, BT_UINT64, BT_UINT64)
+DEF_FUNCTION_TYPE_1 (BT_FN_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR)
+DEF_FUNCTION_TYPE_1 (BT_FN_CONST_PTR_CONST_PTR, BT_CONST_PTR, BT_CONST_PTR)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR, BT_FN_VOID_PTR)
 
@@ -341,6 +344,10 @@ DEF_FUNCTION_TYPE_2 (BT_FN_VOID_VPTR_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_VPTR_INT, BT_BOOL, BT_VOLATILE_PTR, BT_INT)
 DEF_FUNCTION_TYPE_2 (BT_FN_BOOL_SIZE_CONST_VPTR, BT_BOOL, BT_SIZE,
 BT_CONST_VOLATILE_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, BT_SIZE)
+DEF_FUNCTION_TYPE_2 (BT_FN_PTR_CONST_PTR_CONST_PTR, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_PTRPTR_CONST_PTR, BT_VOID, BT_PTR_PTR, 
BT_CONST_PTR)
+DEF_FUNCTION_TYPE_2 (BT_FN_VOID_CONST_PTR_SIZE, BT_VOID, BT_CONST_PTR, BT_SIZE)
 
 DEF_POINTER_TYPE (BT_PTR_FN_VOID_PTR_PTR, BT_FN_VOID_PTR_PTR)
 
@@ -425,6 +432,7 @@ DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I2_INT, BT_VOID, 
BT_VOLATILE_PTR, BT_I2, BT
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I4_INT, BT_VOID, BT_VOLATILE_PTR, BT_I4, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I8_INT, BT_VOID, BT_VOLATILE_PTR, BT_I8, 
BT_INT)
 DEF_FUNCTION_TYPE_3 (BT_FN_VOID_VPTR_I16_INT, BT_VOID, BT_VOLATILE_PTR, 
BT_I16, BT_INT)
+DEF_FUNCTION_TYPE_3 (BT_FN_PTR_CONST_PTR_CONST_PTR_SIZE, BT_PTR, BT_CONST_PTR, 
BT_CONST_PTR, BT_SIZE)
 
 DEF_FUNCTION_TYPE_4 (BT_FN_SIZE_CONST_PTR_SIZE_SIZE_FILEPTR,
 BT_SIZE, BT_CONST_PTR, BT_SIZE, BT_SIZE, BT_FILEPTR)
diff --git a/gcc/builtins.c b/gcc/builtins.c
index 3b16d59..b8dec3f 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -5861,7 +5861,18 @@ expand_builtin (tree exp, rtx target, rtx subtarget, 
enum machine_mode mode,
fcode != BUILT_IN_EXECVE
fcode != BUILT_IN_ALLOCA
fcode != BUILT_IN_ALLOCA_WITH_ALIGN
-   fcode != BUILT_IN_FREE)
+   fcode != BUILT_IN_FREE
+   fcode != BUILT_IN_CHKP_SET_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_INIT_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_NULL_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_COPY_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_NARROW_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_STORE_PTR_BOUNDS
+   fcode != BUILT_IN_CHKP_CHECK_PTR_LBOUNDS
+   fcode != BUILT_IN_CHKP_CHECK_PTR_UBOUNDS
+   fcode !=

RE: [PATCH] Fix PR58682

2013-10-21 Thread Paulo Matos

 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: 18 October 2013 13:12
 To: Paulo Matos
 Cc: Mike Stump; Kyrill Tkachov; gcc-patches@gcc.gnu.org
 Subject: Re: [PATCH] Fix PR58682

 Please re-post the latest version of the patch and CC Honza.

 Richard.

Patch attached. OK to submit?
Bootstrapped x86_64-unknown-linux-gnu.

2013-10-21  Paulo Matos  pma...@broadcom.com

* ipa-inline.c (inline_small_functions): Update max_count if new edges
are found after inlining.

-- 
Paulo Matos

pr58682.patch
Description: pr58682.patch

Re: [PATCH] Fix regression parts of PR58742

2013-10-21 Thread Jakub Jelinek

On Mon, Oct 21, 2013 at 01:30:37PM +0200, Richard Biener wrote:
 This fixes a regression introduced in GCC 4.2 after which we no
 longer fold (unsigned) (X /[ex] 4) * 4 to (unsigned) X.
 
 Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.

Don't we want to do the same optimization on GIMPLE?
I mean, if the testcase is
int *
fx (int *b, int *e)
{
  ptrdiff_t p = e - b;
  return b + p;
}

(or s/ptrdiff_t/long long/, etc.)?

Jakub

Re: [PATCH] Fix regression parts of PR58742

2013-10-21 Thread Richard Biener

On Mon, 21 Oct 2013, Jakub Jelinek wrote:

 On Mon, Oct 21, 2013 at 01:30:37PM +0200, Richard Biener wrote:
  This fixes a regression introduced in GCC 4.2 after which we no
  longer fold (unsigned) (X /[ex] 4) * 4 to (unsigned) X.
  
  Bootstrapped and tested on x86_64-unknown-linux-gnu, applied to trunk.
 
 Don't we want to do the same optimization on GIMPLE?
 I mean, if the testcase is
 int *
 fx (int *b, int *e)
 {
   ptrdiff_t p = e - b;
   return b + p;
 }
 
 (or s/ptrdiff_t/long long/, etc.)?

Sure, but that's not the regression part either ;)

Richard.

[PATCH, MPX, 2/X] Pointers Checker [3/25] Attributes

2013-10-21 Thread Ilya Enkovich

Hi,

This patch adds attributes 'bnd_variable_size' and 'bnd_legacy' used by 
Pointers Checker.

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  ilya.enkov...@intel.com

* c-family/c-common.c (handle_bnd_variable_size_attribute): New.
(handle_bnd_legacy): New.
(c_common_attribute_table): Add bnd_variable_size and bnd_legacy.
* doc/extend.texi: Document bnd_variable_size and bnd_legacy
attributes.


diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 8ecb70c..3902909 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -371,6 +371,8 @@ static tree ignore_attribute (tree *, tree, tree, int, bool 
*);
 static tree handle_no_split_stack_attribute (tree *, tree, tree, int, bool *);
 static tree handle_fnspec_attribute (tree *, tree, tree, int, bool *);
 static tree handle_warn_unused_attribute (tree *, tree, tree, int, bool *);
+static tree handle_bnd_variable_size_attribute (tree *, tree, tree, int, bool 
*);
+static tree handle_bnd_legacy (tree *, tree, tree, int, bool *);
 
 static void check_function_nonnull (tree, int, tree *);
 static void check_nonnull_arg (void *, tree, unsigned HOST_WIDE_INT);
@@ -745,8 +747,12 @@ const struct attribute_spec c_common_attribute_table[] =
  The name contains space to prevent its usage in source code.  */
   { fn spec,   1, 1, false, true, true,
  handle_fnspec_attribute, false },
-  { warn_unused,0, 0, false, false, false,
+  { warn_unused,   0, 0, false, false, false,
  handle_warn_unused_attribute, false },
+  { bnd_variable_size,  0, 0, true,  false, false,
+ handle_bnd_variable_size_attribute, false },
+  { bnd_legacy, 0, 0, true, false, false,
+ handle_bnd_legacy, false },
   { NULL, 0, 0, false, false, false, NULL, false }
 };
 
@@ -8000,6 +8006,38 @@ handle_fnspec_attribute (tree *node ATTRIBUTE_UNUSED, 
tree ARG_UNUSED (name),
   return NULL_TREE;
 }
 
+/* Handle a bnd_variable_size attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_variable_size_attribute (tree *node, tree name, tree ARG_UNUSED 
(args),
+   int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FIELD_DECL)
+{
+  warning (OPT_Wattributes, %qE attribute ignored, name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
+/* Handle a bnd_legacy attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_bnd_legacy (tree *node, tree name, tree ARG_UNUSED (args),
+  int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  if (TREE_CODE (*node) != FUNCTION_DECL)
+{
+  warning (OPT_Wattributes, %qE attribute ignored, name);
+  *no_add_attrs = true;
+}
+
+  return NULL_TREE;
+}
+
 /* Handle a warn_unused attribute; arguments as in
struct attribute_spec.handler.  */
 
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ef422ad..7701d60 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2137,7 +2137,7 @@ attributes are currently defined for functions on all 
targets:
 @code{warn_unused_result}, @code{nonnull}, @code{gnu_inline},
 @code{externally_visible}, @code{hot}, @code{cold}, @code{artificial},
 @code{no_sanitize_address}, @code{no_address_safety_analysis},
-@code{no_sanitize_undefined},
+@code{no_sanitize_undefined}, @code{bnd_legacy},
 @code{error} and @code{warning}.
 Several other attributes are defined for functions on particular
 target systems.  Other attributes, including @code{section} are
@@ -3508,6 +3508,12 @@ The @code{no_sanitize_undefined} attribute on functions 
is used
 to inform the compiler that it should not check for undefined behavior
 in the function when compiling with the @option{-fsanitize=undefined} option.
 
+@item bnd_legacy
+@cindex @code{bnd_legacy} function attribute
+The @code{bnd_legacy} attribute on functions is used to inform
+compiler that function should not be instrumented when compiled
+with @option{-fcheck-pointers} option.
+
 @item regparm (@var{number})
 @cindex @code{regparm} attribute
 @cindex functions that are passed arguments in registers on the 386
@@ -5280,12 +5286,12 @@ placed in either the @code{.bss_below100} section or the
 The keyword @code{__attribute__} allows you to specify special
 attributes of @code{struct} and @code{union} types when you define
 such types.  This keyword is followed by an attribute specification
-inside double parentheses.  Seven attributes are currently defined for
+inside double parentheses.  Eight attributes are currently defined for
 types: @code{aligned}, @code{packed}, @code{transparent_union},
-@code{unused}, @code{deprecated}, @code{visibility}, and
-@code{may_alias}.  Other attributes are defined for functions
-(@pxref{Function

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-21 Thread Joseph S. Myers

On Mon, 21 Oct 2013, Ilya Enkovich wrote:

 +  if (flag_check_pointers)
 +{
 +  if (flag_lto)
 + sorry (Pointers checker is not yet fully supported for link-time 
 optimization);

That sounds wrong.  It suggests some bug somewhere in your patch series 
failing to allow for LTO, which should be fixed.  At least give a more 
detailed explanation in a comment of what would need to change to remove 
this call to sorry ().

 +  if (targetm.chkp_bound_mode () == VOIDmode)
 + error (-fcheck-pointers is not supported for this target.);

Also note GNU Coding Standards for diagnostics.  They should not start 
with uppercase letters or end with ..

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH, MPX, 2/X] Pointers Checker [4/25] Constructors

2013-10-21 Thread Ilya Enkovich

Hi,

This patch introduces two new contructor types supported by 
cgraph_build_static_cdtor.

'B' type is used to initialize static objects (bounds) created by Pointers 
Checker. The difference of this type from the regular constructor is that 'B' 
constructor is never instrumented by Pointers Checker.

'P' type is used by Pointers Checker to generate constructors to initialize 
bounds of statically initialized pointers. Pointers Checker remove all stores 
from such constructors after instrumentation.

Since 'P' type constructors are created for statically initialized objects, we 
need to avoid creation of such objects during its gimplification. New 
restriction was added to gimplify_init_constructor.

Bootstrapped and checked on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  ilya.enkov...@intel.com

* ipa.c (cgraph_build_static_cdtor_1): Support contructors
with chkp ctor and bnd_legacy attributes.
* gimplify.c (gimplify_init_constructor): Avoid infinite
loop during gimplification of bounds initializer.


diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 86bda77..7c350fd 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -4033,10 +4033,19 @@ gimplify_init_constructor (tree *expr_p, gimple_seq 
*pre_p, gimple_seq *post_p,
   individual element initialization.  Also don't do this for small
   all-zero initializers (which aren't big enough to merit
   clearing), and don't try to make bitwise copies of
-  TREE_ADDRESSABLE types.  */
+  TREE_ADDRESSABLE types.
+
+  We cannot apply such transformation when compiling chkp static
+  initializer because creation of initializer image in the memory
+  will require static initialization of bounds for it.  It should
+  result in another gimplification of similar initializer and we
+  may fall into infinite loop.  */
if (valid_const_initializer
 !(cleared || num_nonzero_elements == 0)
-!TREE_ADDRESSABLE (type))
+!TREE_ADDRESSABLE (type)
+(!current_function_decl
+   || !lookup_attribute (chkp ctor,
+ DECL_ATTRIBUTES (current_function_decl
  {
HOST_WIDE_INT size = int_size_in_bytes (type);
unsigned int align;
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 92343fb2..36cc621 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -1259,9 +1259,11 @@ make_pass_ipa_whole_program_visibility (gcc::context 
*ctxt)
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor. 
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.
 
FINAL specify whether the externally visible name for collect2 should
be produced. */
@@ -1320,6 +1322,20 @@ cgraph_build_static_cdtor_1 (char which, tree body, int 
priority, bool final)
   DECL_STATIC_CONSTRUCTOR (decl) = 1;
   decl_init_priority_insert (decl, priority);
   break;
+case 'P':
+  DECL_STATIC_CONSTRUCTOR (decl) = 1;
+  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier (chkp ctor),
+ NULL,
+ NULL_TREE);
+  decl_init_priority_insert (decl, priority);
+  break;
+case 'B':
+  DECL_STATIC_CONSTRUCTOR (decl) = 1;
+  DECL_ATTRIBUTES (decl) = tree_cons (get_identifier (bnd_legacy),
+ NULL,
+ NULL_TREE);
+  decl_init_priority_insert (decl, priority);
+  break;
 case 'D':
   DECL_STATIC_DESTRUCTOR (decl) = 1;
   decl_fini_priority_insert (decl, priority);
@@ -1337,9 +1353,11 @@ cgraph_build_static_cdtor_1 (char which, tree body, int 
priority, bool final)
 }
 
 /* Generate and emit a static constructor or destructor.  WHICH must
-   be one of 'I' (for a constructor) or 'D' (for a destructor).  BODY
-   is a STATEMENT_LIST containing GENERIC statements.  PRIORITY is the
-   initialization priority for this constructor or destructor.  */
+   be one of 'I' (for a constructor), 'D' (for a destructor), 'P'
+   (for chkp static vars constructor) or 'B' (for chkp static bounds
+   constructor).  BODY is a STATEMENT_LIST containing GENERIC
+   statements.  PRIORITY is the initialization priority for this
+   constructor or destructor.  */
 
 void
 cgraph_build_static_cdtor (char which, tree body, int priority)

Re: [patch 1/8] Remove gimple-low.h from the tree-ssa.h include list.

2013-10-21 Thread Richard Biener

On Fri, Oct 18, 2013 at 9:01 PM, Andrew MacLeod amacl...@redhat.com wrote:
 On 10/18/2013 12:10 PM, Jeff Law wrote:

 On 10/18/13 07:37, Andrew MacLeod wrote:


 gimple_check_call_matching_types() was being called from 3 or 4
 different files,and seemed more appropriate as a cgraph routine (which
 called it 3 times). So I moved that and its helper to cgraph.c.

 After that, I only needed to update 4 .c files to directly include
 gimple-low.h

 bootstraps on x86_64-unknown-linux-gnu with no new regressions. OK?

 I'm less sure about this one.  I don't see that it clearly belongs in
 either location.  I could easily see it moving out of cgraph.c at some
 point.

 If it furthers your cleanup efforts at this time, that's fine. Just be
 aware that, at least IMHO, this routine doesn't fit clearly into either
 location and I wouldn't be surprised if we have to come back to it at some
 point.

 possibly, there didn't seem to be a tree call file for call helpers, so I
 went with locality :-)   I will happily move it to a more appropriate place
 as a follow up.  cgraph.h is already a collector for the exports for a half
 dozen .c file.  It needs the same treatment tree-flow got.

I think that we should remove all indirect includes again at some point
to get a clearer picture of what is used where.  After all prototypes are
in the appropriate header files, of course...

So the less indirect includes the better (for now).

Richard.

 Andrew

Re: [RFC] Isolate simplify paths with undefined behaviour

2013-10-21 Thread Richard Biener

On Fri, Oct 18, 2013 at 7:12 PM, Jeff Law l...@redhat.com wrote:


 Back in 2011 I wrote code to detect cases when traversing a particular path
 could be proven to trigger undefined behaviour (such as a null pointer
 dereference).  That original patch would find the control dependent edges
 leading to the dereference and eliminate those edges. The result being those
 paths with undefined behaviour simply vanished.

 The problem with that implementation is there could have been observable
 side effects on those paths prior to triggering the undefined behaviour.

 At that time I speculated we could isolate the path (via block copying) with
 undefined behaviour.  On the duplicate path we'd replace the undefined
 behaviour with a trap and remove only the outgoing edges. That would still
 preserve any visible side effects on the undefined path up to the undefined
 behaviour and we still often get to simplify the main path, though not as
 much as the original version from 2011.

 That's precisely what this patch does.  When we have a PHI which feeds a
 memory dereference in the same block as the PHI and one (or more) of the PHI
 args is NULL we duplicate the block, redirect incoming edges with the NULL
 PHI args to the duplicate and change the statement with the memory
 dereference to a trap.

 You can certainly ask is this actually helpful.  In my experience, yes.  I'm
 regularly seeing stuff like

 x2 = PHI (x1, x1, 0, 0);
 *x2

 When we isolate the paths with NULL incoming values, we're left with
 x2 = PHI (x1, x1)

 And we can propagate x1 into uses of x2.  The block now has just one pred
 and the current block can sometimes be merged into that single pred.  I've
 seen the combination of those two then lead to identification additional
 common subexpressions.

 You might also ask how often such paths show up.  In a bootstrap about a
 week ago, this triggered ~3500 times.

 The code tries to be reasonably smart and re-use an existing duplicate when
 there's multiple NULL values in a PHI node.  That triggers often enough to
 be worth the marginal amount of book keeping.

 The code doesn't yet handle the case where there's multiple dereferences in
 the block.  There's no guarantee the first dereference will be the first one
 we find on the immediate use list.  It's on my TODO list.


 There's no reason this same framework couldn't be used to do the same thing
 for an out of bounds array access.  Similarly, it wouldn't be hard to issue
 a warning when these transformations are applied.

 Posting at this time to get some feedback.  Planning a formal RFA prior to
 close of 4.9 stage1.

 And, of course, this bootstrapps and regression tests.  20030711-3 is
 clearly optimized further with this patch, but I'll cobble together some
 more testcases next week.


 I'm not familiar with the options stuff, so if someone could look at that
 more closely, I'd appreciate it.  Similarly for  the bits to create a new
 pass structure.


 Thoughts, comments, flames?

I wonder why this isn't part of the regular jump-threading code - after
all the opportunity is to thead to __builtin_unreachable () ;)  Otherwise
the question is always where you'd place this pass and whether it
enables jump threading or CSE opportunities (that at least it does).

Richard.






 * Makefile.in (OBJS): Add tree-ssa-isolate-paths.
 * common.opt (-ftree-isolate-paths): Add option and documentation.
 * opts.c (default_options_table): Add OPT_ftree_isolate_paths.
 * passes.def: Add path isolation pass.
 * timevar.def (TV_TREE_SSA_ISOLATE_PATHS): New timevar.
 * tree-pass.h (make_isolate_paths): Declare.
 * tree-ssa-isolate-paths.c: New file.

 * gcc.dg/pr38984.c: Add -fno-tree-isolate-paths.
 * gcc.dg/tree-ssa/20030711-3.c: Update expected output.

 * testsuite/libmudflap.c/fail20-frag.c: Add -fno-tree-isolate-paths.


 diff --git a/gcc/Makefile.in b/gcc/Makefile.in
 index ba39ac9..e7c18bc 100644
 --- a/gcc/Makefile.in
 +++ b/gcc/Makefile.in
 @@ -1412,6 +1412,7 @@ OBJS = \
 tree-ssa-dse.o \
 tree-ssa-forwprop.o \
 tree-ssa-ifcombine.o \
 +   tree-ssa-isolate-paths.o \
 tree-ssa-live.o \
 tree-ssa-loop-ch.o \
 tree-ssa-loop-im.o \
 diff --git a/gcc/common.opt b/gcc/common.opt
 index 1f11fcd..1fd1bcc 100644
 --- a/gcc/common.opt
 +++ b/gcc/common.opt
 @@ -2104,6 +2104,12 @@ foptimize-strlen
  Common Report Var(flag_optimize_strlen) Optimization
  Enable string length optimizations on trees

 +ftree-isolate-paths
 +Common Report Var(flag_tree_isolate_paths) Init(1) Optimization
 +Detect paths which trigger undefined behaviour.  Isolate those paths from
 +the main control flow and turn the statement with undefined behaviour into
 +a trap.
 +
  ftree-loop-distribution
  Common Report Var(flag_tree_loop_distribution) Optimization
  Enable loop distribution on trees
 diff --git a/gcc/opts.c b/gcc/opts.c
 index 728d36d..b6ed0c2

Re: [PATCH] Fix PR58682

2013-10-21 Thread Jan Hubicka

  -Original Message-
  From: Richard Biener [mailto:richard.guent...@gmail.com]
  Sent: 18 October 2013 13:12
  To: Paulo Matos
  Cc: Mike Stump; Kyrill Tkachov; gcc-patches@gcc.gnu.org
  Subject: Re: [PATCH] Fix PR58682

  Please re-post the latest version of the patch and CC Honza.

  Richard.

 Patch attached. OK to submit?
 Bootstrapped x86_64-unknown-linux-gnu.

 2013-10-21  Paulo Matos  pma...@broadcom.com

   * ipa-inline.c (inline_small_functions): Update max_count if new edges
   are found after inlining.

This won't work - the max_count is there to get things correctly scaled by 
badness calculation.
Changing max_count would mean the need to recompute all badnesses in the 
fibheap keys.
I would just cap the edge_count in badness calculation.

Honza

 -- 
 Paulo Matos

Re: [PATCH]Fix computation of offset in ivopt

2013-10-21 Thread Richard Biener

On Fri, Oct 18, 2013 at 5:48 PM, Bin.Cheng amker.ch...@gmail.com wrote:
 Hi Richard,
 Is the first patch OK?  Since there is a patch depending on it.
 http://gcc.gnu.org/ml/gcc-patches/2013-09/msg01931.html

Yes.

Richard.

 Thanks.
 bin

 On Fri, Oct 18, 2013 at 7:18 PM, Richard Biener
 richard.guent...@gmail.com wrote:
 On Thu, Oct 17, 2013 at 7:52 AM, bin.cheng bin.ch...@arm.com wrote:
 Hi,
 As noted in previous messages, GCC forces offset to unsigned in middle end.
 It also gets offset value and stores it in HOST_WIDE_INT then uses it in
 various computation.  In this scenario, It should use int_cst_value to do
 additional sign extension against the type of tree node, otherwise we might
 lose sign information.  Take function fold_plusminus_mult_expr as an
 example, the sign information of arg01/arg11 might be lost because it uses
 TREE_INT_CST_LOW directly.  I think this is the offender of the problem in
 this thread.  I think we should fix the offender, rather the hacking
 strip_offset as in the original patch, so I split original patch into two as
 attached, with one fixing the offset of COMPONENT_REF in strip_offset_1, the
 other fixing the mentioned sign extension problem.

 Index: gcc/fold-const.c
 ===
 --- gcc/fold-const.c (revision 203267)
 +++ gcc/fold-const.c (working copy)
 @@ -7270,8 +7270,8 @@ fold_plusminus_mult_expr (location_t loc, enum tre
HOST_WIDE_INT int01, int11, tmp;
bool swap = false;
tree maybe_same;
 -  int01 = TREE_INT_CST_LOW (arg01);
 -  int11 = TREE_INT_CST_LOW (arg11);
 +  int01 = int_cst_value (arg01);
 +  int11 = int_cst_value (arg11);

 this is not correct - it will mishandle all large unsigned numbers.

 The real issue is that we rely on twos-complement arithmetic to work
 when operating on pointer offsets (because we convert them all to
 unsigned sizetype).  That makes interpreting the offsets or expressions
 that compute offsets hard (or impossible in some cases), as you can
 see from the issues in IVOPTs.

 The above means that strip_offset_1 cannot reliably look through
 MULT_EXPRs as that can make an offset X * CST that is
 effectively signed surely unsigned in the process.  Think of
 this looking into X * CST as performing a division - whose result
 is dependent on the sign of X which we lost with our unsigned
 casting.  Now, removing the MULT_EXPR look-through might
 be too pessimizing ... but it may be worth trying to see if it fixes
 your issue?  In this context I also remember a new bug filed
 recently of us not folding x /[ex] 4 * 4 to x.

 In the past I was working multiple times on stopping doing that
 (make all offsets unsigned), but I never managed to finish ...

 Richard.

 Bootstrap and test on x86/x86_64/arm.  Is it OK?

 Thanks.
 bin

 Patch a:
 2013-10-17  Bin Cheng  bin.ch...@arm.com

 * tree-ssa-loop-ivopts.c (strip_offset_1): Change parameter type.
 Count DECL_FIELD_BIT_OFFSET when computing offset for COMPONENT_REF.

 Patch b:
 2013-10-17  Bin Cheng  bin.ch...@arm.com

 * fold-const.c (fold_plusminus_mult_expr): Use int_cst_value instead
 of TREE_INT_CST_LOW.

 gcc/testsuite/ChangeLog
 2013-10-17  Bin Cheng  bin.ch...@arm.com

 * gcc.dg/tree-ssa/ivopts-outside-loop-use-1.c: New test.

 -Original Message-
 From: Richard Biener [mailto:richard.guent...@gmail.com]
 Sent: Wednesday, October 02, 2013 4:32 PM
 To: Bin.Cheng
 Cc: Bin Cheng; GCC Patches
 Subject: Re: [PATCH]Fix computation of offset in ivopt

 On Tue, Oct 1, 2013 at 6:13 PM, Bin.Cheng amker.ch...@gmail.com wrote:
  On Tue, Oct 1, 2013 at 6:50 PM, Richard Biener
  richard.guent...@gmail.com wrote:
  On Mon, Sep 30, 2013 at 7:39 AM, bin.cheng bin.ch...@arm.com
 wrote:
 
 
 
  I don't think you need
 
  +  /* Sign extend off if expr is in type which has lower precision
  + than HOST_WIDE_INT.  */
  +  if (TYPE_PRECISION (TREE_TYPE (expr)) = HOST_BITS_PER_WIDE_INT)
  +off = sext_hwi (off, TYPE_PRECISION (TREE_TYPE (expr)));
 
  at least it would be suspicious if you did ...
  There is a problem for example of the first message.  The iv base if
 like:
  pretmp_184 + ((sizetype) KeyIndex_180 + 1073741823) * 4 I am not sure
  why but it seems (-4/0xFFFC) is represented by (1073741823*4).
  For each operand strip_offset_1 returns exactly the positive number
  and result of multiplication never get its chance of sign extend.
  That's why the sign extend is necessary to fix the problem. Or it
  should be fixed elsewhere by representing iv base with:
  pretmp_184 + ((sizetype) KeyIndex_180 + 4294967295) * 4 in the first
 place.

 Yeah, that's why I said the whole issue with forcing all offsets to be
 unsigned
 is a mess ...

 There is really no good answer besides not doing that I fear.

 Yes, in the above case we could fold the whole thing differently
 (interpret
 the offset of a POINTER_PLUS_EXPR as signed).  You can try tracking down

[PATCH, MPX, 2/X] Pointers Checker [5/25] Tree and gimple ifaces

2013-10-21 Thread Ilya Enkovich

Hi,

This patch adds bounds related iface macros and functions for tree and gimple. 

Bootstrapped and tested on linux-x86_64.

Thanks,
Ilya
--

gcc/

2013-10-21  Ilya Enkovich  ilya.enkov...@intel.com

* tree-core.h (tree_index): Add TI_BOUND_TYPE.
* tree.h (BOUND_P): New.
(BOUNDED_TYPE_P): New.
(BOUNDED_P): New.
(bound_type_node): New.
* tree.c (build_common_tree_nodes): Initialize bound_type_node.
* gimple.h (gimple_call_get_nobnd_arg_index): New.
(gimple_call_num_nobnd_args): New.
(gimple_call_nobnd_arg): New.
(gimple_return_retbnd): New.
(gimple_return_set_retbnd): New
* gimple.c (gimple_build_return): Increase number of ops
for return statement.
(gimple_call_get_nobnd_arg_index): New.
* gimple-pretty-print.c (dump_gimple_return): Print second op.


diff --git a/gcc/gimple-pretty-print.c b/gcc/gimple-pretty-print.c
index 1e985e0..fddcee0 100644
--- a/gcc/gimple-pretty-print.c
+++ b/gcc/gimple-pretty-print.c
@@ -535,11 +535,12 @@ dump_gimple_assign (pretty_printer *buffer, gimple gs, 
int spc, int flags)
 static void
 dump_gimple_return (pretty_printer *buffer, gimple gs, int spc, int flags)
 {
-  tree t;
+  tree t, t2;
 
   t = gimple_return_retval (gs);
+  t2 = gimple_return_retbnd (gs);
   if (flags  TDF_RAW)
-dump_gimple_fmt (buffer, spc, flags, %G %T, gs, t);
+dump_gimple_fmt (buffer, spc, flags, %G %T %T, gs, t, t2);
   else
 {
   pp_string (buffer, return);
@@ -548,6 +549,11 @@ dump_gimple_return (pretty_printer *buffer, gimple gs, int 
spc, int flags)
  pp_space (buffer);
  dump_generic_node (buffer, t, spc, flags, false);
}
+  if (t2)
+   {
+ pp_string (buffer, , );
+ dump_generic_node (buffer, t2, spc, flags, false);
+   }
   pp_semicolon (buffer);
 }
 }
diff --git a/gcc/gimple.c b/gcc/gimple.c
index e12f7d9..eb2d365 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -179,7 +179,7 @@ gimple_build_with_ops_stat (enum gimple_code code, unsigned 
subcode,
 gimple
 gimple_build_return (tree retval)
 {
-  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 1);
+  gimple s = gimple_build_with_ops (GIMPLE_RETURN, ERROR_MARK, 2);
   if (retval)
 gimple_return_set_retval (s, retval);
   return s;
@@ -371,6 +371,26 @@ gimple_build_call_from_tree (tree t)
 }
 
 
+/* Return index of INDEX's non bound argument of the call.  */
+
+unsigned
+gimple_call_get_nobnd_arg_index (const_gimple gs, unsigned index)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  for (unsigned n = 0; n  num_args; n++)
+{
+  if (BOUND_P (gimple_call_arg (gs, n)))
+   continue;
+  else if (index)
+   index--;
+  else
+   return n;
+}
+
+  gcc_unreachable ();
+}
+
+
 /* Extract the operands and code for expression EXPR into *SUBCODE_P,
*OP1_P, *OP2_P and *OP3_P respectively.  */
 
diff --git a/gcc/gimple.h b/gcc/gimple.h
index 376fda2..4376408 100644
--- a/gcc/gimple.h
+++ b/gcc/gimple.h
@@ -910,6 +910,7 @@ extern tree get_initialized_tmp_var (tree, gimple_seq *, 
gimple_seq *);
 extern tree get_formal_tmp_var (tree, gimple_seq *);
 extern void declare_vars (tree, gimple, bool);
 extern void annotate_all_with_location (gimple_seq, location_t);
+extern unsigned gimple_call_get_nobnd_arg_index (const_gimple, unsigned);
 
 /* Validation of GIMPLE expressions.  Note that these predicates only check
the basic form of the expression, they don't recurse to make sure that
@@ -2379,6 +2380,31 @@ gimple_call_arg (const_gimple gs, unsigned index)
 }
 
 
+/* Return the number of arguments used by call statement GS.  */
+
+static inline unsigned
+gimple_call_num_nobnd_args (const_gimple gs)
+{
+  unsigned num_args = gimple_call_num_args (gs);
+  unsigned res = num_args;
+  for (unsigned n = 0; n  num_args; n++)
+if (BOUND_P (gimple_call_arg (gs, n)))
+  res--;
+  return res;
+}
+
+
+/* Return INDEX's call argument ignoring bound ones.  */
+static inline tree
+gimple_call_nobnd_arg (const_gimple gs, unsigned index)
+{
+  /* No bound args may exist if pointers checker is off.  */
+  if (!flag_check_pointers)
+return gimple_call_arg (gs, index);
+  return gimple_call_arg (gs, gimple_call_get_nobnd_arg_index (gs, index));
+}
+
+
 /* Return a pointer to the argument at position INDEX for call
statement GS.  */
 
@@ -4886,6 +4912,26 @@ gimple_return_set_retval (gimple gs, tree retval)
 }
 
 
+/* Return the return bounds for GIMPLE_RETURN GS.  */
+
+static inline tree
+gimple_return_retbnd (const_gimple gs)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  return gimple_op (gs, 1);
+}
+
+
+/* Set RETVAL to be the return bounds for GIMPLE_RETURN GS.  */
+
+static inline void
+gimple_return_set_retbnd (gimple gs, tree retval)
+{
+  GIMPLE_CHECK (gs, GIMPLE_RETURN);
+  gimple_set_op (gs, 1, retval);
+}
+
+
 /* Returns true when the gimple statement STMT is any of the OpenMP types.  */
 
 #define

[PATCH] Fix DECL_BIT_FIELD dependent on flag_strict_volatile_bitfields

2013-10-21 Thread Bernd Edlinger

Hello,

this patch removes the dependency of DECL_BIT_FIELD on the 
flag_strict_volatile_bitfields,
and makes get_inner_reference not returning a different mode for non-volatile 
members
if flag_strict-volatile_bitfields is used.

This fixes the regression on the SH-target, due to the patch Fix asymmetric 
volatile handling
in optimize_bit_field_compare: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01629.html

Manually verified the mentioned test cases on a SH-cross-compiler.

Boot-strapped and regression-tested on x86_64-linux-gnu.

Ok for trunk?

Bernd.

RE: [PATCH] Fix DECL_BIT_FIELD dependent on flag_strict_volatile_bitfields

2013-10-21 Thread Bernd Edlinger

Oops...


 Hello,

 this patch removes the dependency of DECL_BIT_FIELD on the 
 flag_strict_volatile_bitfields,
 and makes get_inner_reference not returning a different mode for non-volatile 
 members
 if flag_strict-volatile_bitfields is used.

 This fixes the regression on the SH-target, due to the patch Fix asymmetric 
 volatile handling
 in optimize_bit_field_compare: 
 http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01629.html

 Manually verified the mentioned test cases on a SH-cross-compiler.

 Boot-strapped and regression-tested on x86_64-linux-gnu.

 Ok for trunk?

 Bernd.  2013-10-21  Bernd Edlinger  bernd.edlin...@hotmail.de

Fix DECL_BIT_FIELD depencency on flag_strict_volatile_bitfields
and get_inner_reference returning different pmode for non-volatile
bit-field members dependent on flag_strict_volatile_bitfields.
* stor-layout.c (layout_decl): Remove special handling of
flag_strict_volatile_bitfields.
* expr.c (get_inner_reference): Don't use DECL_BIT_FIELD
if flag_strict_volatile_bitfields  0 and TREE_THIS_VOLATILE.



patch-stor-layout.diff
Description: Binary data

Re: [PATCH] Fix DECL_BIT_FIELD dependent on flag_strict_volatile_bitfields

2013-10-21 Thread Richard Biener

On Mon, Oct 21, 2013 at 2:33 PM, Bernd Edlinger
bernd.edlin...@hotmail.de wrote:
 Oops...


 Hello,

 this patch removes the dependency of DECL_BIT_FIELD on the 
 flag_strict_volatile_bitfields,
 and makes get_inner_reference not returning a different mode for 
 non-volatile members
 if flag_strict-volatile_bitfields is used.

 This fixes the regression on the SH-target, due to the patch Fix asymmetric 
 volatile handling
 in optimize_bit_field_compare: 
 http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01629.html

 Manually verified the mentioned test cases on a SH-cross-compiler.

 Boot-strapped and regression-tested on x86_64-linux-gnu.

 Ok for trunk?

Ok.

Thanks,
Richard.

 Bernd.

[PATCH, rs6000] Be careful with special permute masks for little endian

2013-10-21 Thread Bill Schmidt

Hi,

In altivec_expand_vec_perm_const, we look for special masks that match
the behavior of specific instructions, so we can use those instructions
rather than load a constant control vector and perform a permute.  Some
of the masks must be treated differently for little endian mode.

The masks that represent merge-high and merge-low operations have
reversed meanings in little-endian, because of the reversed ordering of
the vector elements.  

The masks that represent vector-pack operations remain correct when the
mode of the input operands matches the natural mode of the instruction,
but not otherwise.  This is because the pack instructions always select
the rightmost, low-order bits of the vector element.  There are cases
where we use this, for example, with a V8SI vector matching a vpkuwum
mask in order to select the odd-numbered elements of the vector.  In
little endian mode, this instruction will get us the even-numbered
elements instead.  There is no alternative instruction with the desired
behavior, so I've just disabled use of those masks for little endian
when the mode isn't natural.

These changes fix 32 failures in the test suite for little endian mode.
Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
failures.  Is this ok for trunk?

Thanks,
Bill


2013-10-21  Bill Schmidt  wschm...@vnet.ibm.com

* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
meaning of merge-high and merge-low masks for little endian; avoid
use of vector-pack masks for little endian for mismatched modes.


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 203792)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
   {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
 { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
   {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+{ OPTION_MASK_ALTIVEC, 
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
   {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
   {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
   {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
   {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
   {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
   {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
 { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
   {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,22 @@ altivec_expand_vec_perm_const (rtx operands[4])
  enum machine_mode omode = insn_data[icode].operand[0].mode;
  enum machine_mode imode = insn_data[icode].operand[1].mode;
 
+ /* For little-endian, don't use vpkuwum and vpkuhum if the
+underlying vector type is not V4SI and V8HI, respectively.
+For example, using vpkuwum with a V8HI picks up the even
+halfwords (BE numbering) when the even halfwords (LE
+numbering) are what we need.  */
+ if (!BYTES_BIG_ENDIAN
+  icode == CODE_FOR_altivec_vpkuwum
+  GET_CODE (op0) == SUBREG
+  GET_MODE (XEXP (op0, 0)) != V4SImode)
+   continue;
+ if (!BYTES_BIG_ENDIAN
+  icode == CODE_FOR_altivec_vpkuhum
+  GET_CODE (op0) == SUBREG
+  GET_MODE (XEXP (op0, 0)) != V8HImode)
+   continue;
+
   /* For little-endian, the two input operands must be swapped
  (or swapped back) to ensure proper right-to-left numbering
  from 0 to 2N-1.  */

[PATCH] Fix PR58794

2013-10-21 Thread Richard Biener


This fixes another case of unequal addresses of volatile fields.

Boostrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2013-10-21  Richard Biener  rguent...@suse.de

PR tree-optimization/58794
* fold-const.c (operand_equal_p): Compare FIELD_DECL operand
of COMPONENT_REFs with OEP_CONSTANT_ADDRESS_OF left in place.

* c-c++-common/torture/pr58794-1.c: New testcase.
* c-c++-common/torture/pr58794-2.c: Likewise.

Index: gcc/fold-const.c
===
*** gcc/fold-const.c(revision 203886)
--- gcc/fold-const.c(working copy)
*** operand_equal_p (const_tree arg0, const_
*** 2715,2724 
case COMPONENT_REF:
  /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
 may be NULL when we're called to compare MEM_EXPRs.  */
! if (!OP_SAME_WITH_NULL (0))
return 0;
  flags = ~OEP_CONSTANT_ADDRESS_OF;
! return OP_SAME (1)  OP_SAME_WITH_NULL (2);
  
case BIT_FIELD_REF:
  if (!OP_SAME (0))
--- 2715,2725 
case COMPONENT_REF:
  /* Handle operand 2 the same as for ARRAY_REF.  Operand 0
 may be NULL when we're called to compare MEM_EXPRs.  */
! if (!OP_SAME_WITH_NULL (0)
! || !OP_SAME (1))
return 0;
  flags = ~OEP_CONSTANT_ADDRESS_OF;
! return OP_SAME_WITH_NULL (2);
  
case BIT_FIELD_REF:
  if (!OP_SAME (0))
Index: gcc/testsuite/c-c++-common/torture/pr58794-1.c
===
*** gcc/testsuite/c-c++-common/torture/pr58794-1.c  (revision 0)
--- gcc/testsuite/c-c++-common/torture/pr58794-1.c  (working copy)
***
*** 0 
--- 1,29 
+ /* { dg-do compile } */
+ 
+ struct S0
+ {
+   int f;
+ };
+ 
+ struct S1
+ {
+   struct S0 f1;
+   volatile int f2;
+ };
+ 
+ struct S2
+ {
+   struct S1 g;
+ } a, b; 
+ 
+ static int *c[1][2] = {{0, (int *)a.g.f2}};
+ static int d; 
+ 
+ int
+ main ()
+ {
+   for (d = 0; d  1; d++)
+ for (b.g.f1.f = 0; b.g.f1.f  1; b.g.f1.f++)
+   *c[b.g.f1.f][d + 1] = 0;
+   return 0;
+ }
Index: gcc/testsuite/c-c++-common/torture/pr58794-2.c
===
*** gcc/testsuite/c-c++-common/torture/pr58794-2.c  (revision 0)
--- gcc/testsuite/c-c++-common/torture/pr58794-2.c  (working copy)
***
*** 0 
--- 1,21 
+ /* { dg-do compile } */
+ 
+ struct S 
+ {
+   volatile int f;
+ } a;
+ 
+ unsigned int b;
+ 
+ static int *c[1][2] = {{0, (int *)a.f}};
+ static unsigned int d;
+ 
+ int 
+ main ()
+ {
+   for (; d  1; d++)
+ for (; b  1; b++)
+   *c[b][d + 1] = 0;
+ 
+   return 0;
+ }

Re: [patch] Cleanup tree-ssa-ter.c exports

2013-10-21 Thread Andrew MacLeod


On 10/20/2013 09:53 AM, Eric Botcazou wrote:

Unfortunately,  tree-ssa-ter.c also has 2 functions
(find_replaceable_exprs() and  dump_replaceable_exprs()) which are
exported and utilized by tree-outof-ssa.c  (the file is a part of the
out-of-ssa module).  So I moved the prototypes from tree-ssa-live.h into
a newly created tree-ssa-ter.h file,  and included it directly from
tree-outof-ssa.c, the only consumer.

Apparently something went wrong when tree-ssa-ter.h was created.


what went wrong?  I see it in my checkouts...?

Andrew

Re: [PATCH][i386] Enable vector_loop in memset expanding and merge expanders for memset and memmov

2013-10-21 Thread Jan Hubicka

  The patch is OK.
 Thanks, the patch was committed.
 
   That's a good point.  I added a check for this case - so if CONST_INT is 
   passed
   we assume that mode is QI.  But usually promoted value resides in a 
   register, so
   we don't go over-conservative here.
  Hmm, so if we use broadcast constant, then we won't end up having CONST_INT 
  here?
  It is OK then.
 Currently we don't end up with CONST_INT.  But maybe we'd need to revisit this
 place later.

OK, I merged in my misaligned prologues changes and will post patch after full
testing.  It seemed to go seamlessly.  I spotted there are still few places for
cleanup, so i will try to handle there incrementally. 

Do you have plans on enabling the SSE memcpy/memset on Atom now?  What about
the move_by_pieces/set_by_pieces via SSE now?

Thanks!
Honza
 
 Michael
  Honza

[PATCH][ARM] Update Cortex A9 DImode mult extend rtx costs

2013-10-21 Thread Kyrill Tkachov


Hi all,

There's a slight error in the Cortex A9 costs I committed recently. The extend 
and extend_add costs refer to 32x32-64 bit operations and thus have valid costs 
in aarch32. This patches updates those costs for the Cortex A9. The other 
existing cost tables are correct in this regard and don't need updating. This is 
just a thinko in the A9 costs.


Tested arm-none-eabi on qemu.

Ok for trunk?

Thanks,
Kyrill

[gcc/]
2013-10-21  Kyrylo Tkachov  kyrylo.tkac...@arm.com

* config/arm/arm.c (cortexa9_extra_costs): Update mult costs for
extend and extend_add.diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index e2baa9b..f85aa5f 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -1078,18 +1078,18 @@ const struct cpu_cost_table cortexa9_extra_costs =
 {
   COSTS_N_INSNS (3),	/* Simple.  */
   COSTS_N_INSNS (3),	/* Flag_setting.  */
-  COSTS_N_INSNS (4),	/* Extend.  */
+  COSTS_N_INSNS (2),	/* Extend.  */
   COSTS_N_INSNS (3),	/* Add.  */
-  COSTS_N_INSNS (4),	/* Extend_add.  */
+  COSTS_N_INSNS (2),	/* Extend_add.  */
   COSTS_N_INSNS (30)	/* Idiv.  No HW div on Cortex A9.  */
 },
 /* MULT DImode */
 {
   0,			/* Simple (N/A).  */
   0,			/* Flag_setting (N/A).  */
-  0,			/* Extend (N/A).  */
+  COSTS_N_INSNS (4),	/* Extend.  */
   0,			/* Add (N/A).  */
-  0,			/* Extend_add (N/A).  */
+  COSTS_N_INSNS (4),	/* Extend_add.  */
   0/* Idiv (N/A).  */
 }
   },

Re: [PATCH][ARM] Update Cortex A9 DImode mult extend rtx costs

2013-10-21 Thread Richard Earnshaw

On 21/10/13 14:37, Kyrill Tkachov wrote:
 Hi all,
 
 There's a slight error in the Cortex A9 costs I committed recently. The 
 extend 
 and extend_add costs refer to 32x32-64 bit operations and thus have valid 
 costs 
 in aarch32. This patches updates those costs for the Cortex A9. The other 
 existing cost tables are correct in this regard and don't need updating. This 
 is 
 just a thinko in the A9 costs.
 
 Tested arm-none-eabi on qemu.
 
 Ok for trunk?
 
 Thanks,
 Kyrill
 
 [gcc/]
 2013-10-21  Kyrylo Tkachov  kyrylo.tkac...@arm.com
 
  * config/arm/arm.c (cortexa9_extra_costs): Update mult costs for
  extend and extend_add.
 
 

OK.

R.

Re: [PATCH][i386] Enable vector_loop in memset expanding and merge expanders for memset and memmov

2013-10-21 Thread Michael V. Zolotukhin

 OK, I merged in my misaligned prologues changes and will post patch after full
 testing.  It seemed to go seamlessly.  I spotted there are still few places 
 for
 cleanup, so i will try to handle there incrementally. 
Great!  Hope these recent memcpy/memset changes will lead to some gains soon:)

 Do you have plans on enabling the SSE memcpy/memset on Atom now?  What about
 the move_by_pieces/set_by_pieces via SSE now?
Actually, I'm a bit busy with other stuff right now, so I'm not sure that I'll
find time for this before stage1 finish.  But maybe somebody will carry on this
and do necessary experiments and tuning.

And the same for the move_by_pieces.

Michael
 Honza

Re: [PATCH, rs6000] Be careful with special permute masks for little endian

2013-10-21 Thread David Edelsohn

On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
wschm...@linux.vnet.ibm.com wrote:
 Hi,

 In altivec_expand_vec_perm_const, we look for special masks that match
 the behavior of specific instructions, so we can use those instructions
 rather than load a constant control vector and perform a permute.  Some
 of the masks must be treated differently for little endian mode.

 The masks that represent merge-high and merge-low operations have
 reversed meanings in little-endian, because of the reversed ordering of
 the vector elements.

 The masks that represent vector-pack operations remain correct when the
 mode of the input operands matches the natural mode of the instruction,
 but not otherwise.  This is because the pack instructions always select
 the rightmost, low-order bits of the vector element.  There are cases
 where we use this, for example, with a V8SI vector matching a vpkuwum
 mask in order to select the odd-numbered elements of the vector.  In
 little endian mode, this instruction will get us the even-numbered
 elements instead.  There is no alternative instruction with the desired
 behavior, so I've just disabled use of those masks for little endian
 when the mode isn't natural.

 These changes fix 32 failures in the test suite for little endian mode.
 Bootstrapped and tested on powerpc64{,le}-unknown-linux-gnu with no new
 failures.  Is this ok for trunk?

 Thanks,
 Bill


 2013-10-21  Bill Schmidt  wschm...@vnet.ibm.com

 * config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
 meaning of merge-high and merge-low masks for little endian; avoid
 use of vector-pack masks for little endian for mismatched modes.

Okay.

Thanks, David

Re: [C++ PATCH] Fix PR58705

2013-10-21 Thread Marek Polacek

Ping.

On Mon, Oct 14, 2013 at 02:23:59PM +0200, Marek Polacek wrote:
 We were ICEing on the attached testcase, because in check_narrowing,
 for = {{}}, we wanted to check recursively the CONSTRUCTOR_ELTs,
 even though init in this case has 0 CONSTRUCTOR_NELTS.  So I added
 the check for CONSTRUCTOR_NELTS  0.  Moreover, since empty scalar
 initializers are forbidden in C FE, I think we should error out here
 too.  (Complex type is considered as an arithmetic type as a GNU
 extension and arithmetic types are scalar types.)
 This isn't C++11-specific as it may look from the PR.  The bug
 exhibits when -Wnarrowing (that is implicitly enabled with C++11) is
 on, since in check_narrowing we have
 
   if (!warn_narrowing || !ARITHMETIC_TYPE_P (type))
   return;
 
 and with -Wno-narrowing we just return early.
 
 Regtested/bootstrapped on x86_64-linux, ok for trunk?
 
 2013-10-14  Marek Polacek  pola...@redhat.com
 
   PR c++/58705
 cp/
   * typeck2.c (check_narrowing): Give an error when the scalar
   initializer is empty.
 testsuite/
   * g++.dg/parse/pr58705.C: New test.
 
 --- gcc/cp/typeck2.c.mp   2013-10-14 11:11:36.971293089 +0200
 +++ gcc/cp/typeck2.c  2013-10-14 11:52:14.582061052 +0200
 @@ -833,7 +833,10 @@ check_narrowing (tree type, tree init)
 TREE_CODE (type) == COMPLEX_TYPE)
  {
tree elttype = TREE_TYPE (type);
 -  check_narrowing (elttype, CONSTRUCTOR_ELT (init, 0)-value);
 +  if (CONSTRUCTOR_NELTS (init)  0)
 + check_narrowing (elttype, CONSTRUCTOR_ELT (init, 0)-value);
 +  else
 + error (empty scalar initializer);
if (CONSTRUCTOR_NELTS (init)  1)
   check_narrowing (elttype, CONSTRUCTOR_ELT (init, 1)-value);
return;
 --- gcc/testsuite/g++.dg/parse/pr58705.C.mp   2013-10-14 11:14:46.460955343 
 +0200
 +++ gcc/testsuite/g++.dg/parse/pr58705.C  2013-10-14 11:51:02.698810118 
 +0200
 @@ -0,0 +1,5 @@
 +// PR c++/58705
 +// { dg-do compile }
 +// { dg-options -Wnarrowing }
 +
 +_Complex float f = {{}};  // { dg-error empty scalar initializer }
 
   Marek

Marek

Re: [PATCH, MPX, 2/X] Pointers Checker [1/25] Hooks

2013-10-21 Thread Ilya Enkovich

esOn 21 Oct 11:44, Joseph S. Myers wrote:
 On Mon, 21 Oct 2013, Ilya Enkovich wrote:
 
  +DEFHOOK
  +(builtin_chkp_function,
  + Pointers checker instrumentation pass uses this hook to obtain\n\
  +target-specific functions which implement specified generic checker\n\
  +builtins.,
  + tree, (unsigned fcode),
  + default_builtin_chkp_function)
 
 I don't think that's enough detail.  The audience for this hook 
 description is back-end maintainers wanting to implement such hooks for 
 their back ends, and the hook description should give sufficient 
 information to do so.  This description says nothing at all about the 
 semantics of the hook argument or return value.
 
 If it seems difficult to describe things sufficiently in the context of 
 individual hook descriptions, maybe an overview of the feature and 
 implementation approach is needed as a new section in the internals 
 manual, with hook descriptions then referring to that section, or going in 
 appropriate places within that section (if the section is in tm.texi.in).
 
  +DEFHOOK
  +(fn_abi_va_list_bounds_size,
  + This hook returns size for va_list object or integer_zero_node if\n\
  +it does not have any (e.g. is scalar pointer to the stack).,
  + tree, (tree fndecl),
  + default_fn_abi_va_list_bounds_size)
 
 @code{va_list}, @code{integer_zero_node}, specify semantics of fndecl 
 argument.
 
   DEFHOOK
  +(load_bounds_for_arg,
  + This hook is used to emit insn to load arg's bounds\n\
  +in case bounds are not passed on register.  Return loaded bounds,
  + rtx, (rtx, rtx, rtx),
  + default_load_bounds_for_arg)
 
 You need to name all the arguments and explain their semantics by name in 
 the documentation (which should end with .).
 
  +DEFHOOK
  +(store_bounds_for_arg,
  + This hook is used to emit insn to store arg's bounds\n\
  +in case bounds are not passed on register.,
  + void, (rtx, rtx, rtx, rtx),
  + default_store_bounds_for_arg)
 
 Likewise.
 
 -- 
 Joseph S. Myers
 jos...@codesourcery.com

Hello Joseph,

Thanks for your comments! I attach a new patch version with changed hooks 
documentation. Hope it is more informative now.

Thanks,
Ilya
--

diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index 8d220f3..79bd0f9 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -4334,6 +4334,13 @@ This hook returns the va_list type of the calling 
convention specified by
 The default version of this hook returns @code{va_list_type_node}.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_FN_ABI_VA_LIST_BOUNDS_SIZE (tree 
@var{fndecl})
+This hook returns size for @code{va_list} object in function specified
+by @var{fndecl}.  This hook is used by Pointers Checker to build bounds for
+@code{va_list} object.  Return @code{integer_zero_node} if no bounds should
+be used (e.g. @code{va_list} is a scalar pointer to the stack).
+@end deftypefn
+
 @deftypefn {Target Hook} tree TARGET_CANONICAL_VA_LIST_TYPE (tree @var{type})
 This hook returns the va_list type of the calling convention specified by the
 type of @var{type}. If @var{type} is not a valid va_list type, it returns
@@ -5151,6 +5158,19 @@ defined, then define this hook to return @code{true} if
 Otherwise, you should not define this hook.
 @end deftypefn
 
+@deftypefn {Target Hook} rtx TARGET_LOAD_BOUNDS_FOR_ARG (rtx @var{slot}, rtx 
@var{arg}, rtx @var{slot_no})
+This hook is used to emit insn to load bounds of @var{arg} passed
+in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
+constant holding number of the special slot we should get bounds from.
+Return loaded bounds.
+@end deftypefn
+
+@deftypefn {Target Hook} void TARGET_STORE_BOUNDS_FOR_ARG (rtx @var{arg}, rtx 
@var{slot}, rtx @var{bounds}, rtx @var{slot_no})
+This hook is used to emit insn to store bounds of @var{arg} passed
+in @var{slot}.  In case @var{slot} is not a memory, @var{slot_no} is RTX
+constant holding number of the special slot we should store bounds to.
+@end deftypefn
+
 @node Trampolines
 @section Trampolines for Nested Functions
 @cindex trampolines for nested functions
@@ -10907,6 +10927,27 @@ ignored.  This function should return the result of 
the call to the
 built-in function.
 @end deftypefn
 
+@deftypefn {Target Hook} tree TARGET_BUILTIN_CHKP_FUNCTION (unsigned 
@var{fcode})
+This hook allows target to redefine built-in functions used by
+Pointers Checker for code instrumentation.  Hook should return
+fndecl of function implementing generic builtin whose code is
+passed in @var{fcode}.  Currently following built-in functions are
+obtained using this hook:
+@code{BUILT_IN_CHKP_BNDMK}, @code{BUILT_IN_CHKP_BNDSTX},
+@code{BUILT_IN_CHKP_BNDLDX}, @code{BUILT_IN_CHKP_BNDCL},
+@code{BUILT_IN_CHKP_BNDCU}, @code{BUILT_IN_CHKP_BNDRET},
+@code{BUILT_IN_CHKP_INTERSECT}, @code{BUILT_IN_CHKP_SET_PTR_BOUNDS},
+@code{BUILT_IN_CHKP_NARROW}, @code{BUILT_IN_CHKP_ARG_BND},
+@code{BUILT_IN_CHKP_SIZEOF}, @code{BUILT_IN_CHKP_EXTRACT_LOWER},
+@code{BUILT_IN_CHKP_EXTRACT_UPPER}.
+@end deftypefn
+@deftypefn

RE: [PATCH] Fix PR58682

2013-10-21 Thread Paulo Matos

 -Original Message-
 From: Jan Hubicka [mailto:hubi...@ucw.cz]
 Sent: 21 October 2013 13:21
 To: Paulo Matos
 Cc: Richard Biener; Mike Stump; Kyrill Tkachov; gcc-patches@gcc.gnu.org; Jan
 Hubicka (hubi...@ucw.cz)
 Subject: Re: [PATCH] Fix PR58682

 This won't work - the max_count is there to get things correctly scaled by
 badness calculation.
 Changing max_count would mean the need to recompute all badnesses in the 
 fibheap
 keys.
 I would just cap the edge_count in badness calculation.

 Honza

Makes sense. Thanks for the comment. I have created a new patch. Now attached.

2013-10-21  Paulo Matos  pma...@broadcom.com

* ipa-inline.c (edge_badness): Cap edge-count at max_count for badness
calculations.

-- 
PMatos

pr58682-v2.patch
Description: pr58682-v2.patch

Doc patch committed: Improve attributes doc in -fno-toplevel-reorder

2013-10-21 Thread Ian Lance Taylor

This patch adds the words when possible to make clear that it is not
always possible to use attributes to replace the effects of
-fno-toplevel-reorder.  I also fixed a typo s/includes files/include
files/ I noticed a while ago.  Bootstrapped on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian


2013-10-21  Ian Lance Taylor  i...@google.com

* doc/invoke.texi (Optimize Options): For -fno-toplevel-reorder,
don't imply that attributes can solve all problems.
(Directory Options): Fix typo.


Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 203893)
+++ doc/invoke.texi	(working copy)
@@ -8364,7 +8364,7 @@ statements.  Output them in the same ord
 input file.  When this option is used, unreferenced static variables
 are not removed.  This option is intended to support existing code
 that relies on a particular ordering.  For new code, it is better to
-use attributes.
+use attributes when possible.
 
 Enabled at level @option{-O0}.  When disabled explicitly, it also implies
 @option{-fno-section-anchors}, which is otherwise enabled at @option{-O0} on some
@@ -10367,7 +10367,7 @@ separator character at the end of the pa
 @option{-B} prefixes that effectively specify directory names also apply
 to libraries in the linker, because the compiler translates these
 options into @option{-L} options for the linker.  They also apply to
-includes files in the preprocessor, because the compiler translates these
+include files in the preprocessor, because the compiler translates these
 options into @option{-isystem} options for the preprocessor.  In this case,
 the compiler appends @samp{include} to the prefix.

Re: [PATCH, MPX, 2/X] Pointers Checker [2/25] Builtins

2013-10-21 Thread Ilya Enkovich

On 21 Oct 12:08, Joseph S. Myers wrote:
 On Mon, 21 Oct 2013, Ilya Enkovich wrote:
 
  +  if (flag_check_pointers)
  +{
  +  if (flag_lto)
  +   sorry (Pointers checker is not yet fully supported for link-time 
  optimization);
 
 That sounds wrong.  It suggests some bug somewhere in your patch series 
 failing to allow for LTO, which should be fixed.  At least give a more 
 detailed explanation in a comment of what would need to change to remove 
 this call to sorry ().

It has been a while since I saw problem with LTO (some functions appeared to 
miss instrumentation; it seemed compilation flags for checker were not handled 
correctly for some reason).  When I made this branch public I put this guard to 
prevent users from falling into this problem.  Actually I do not know what is 
the current status of that issue.  A lot of things has changed including 
checker flags definition.  I'll try mpx testsuite with LTO and be back with the 
result.

 
  +  if (targetm.chkp_bound_mode () == VOIDmode)
  +   error (-fcheck-pointers is not supported for this target.);
 
 Also note GNU Coding Standards for diagnostics.  They should not start 
 with uppercase letters or end with ..

Thanks for input! I'll fix it.

Thanks,
Ilya

 
 -- 
 Joseph S. Myers
 jos...@codesourcery.com

[PATCH] Make fold_range_test work only on integral types (PR middle-end/58809)

2013-10-21 Thread Marek Polacek

In this PR (well, in the first testcase in it), the problem was that
fold_range_test got op0 and op1 of a COMPLEX_TYPE, subsequent call
of make_range then failed because it wants to create an integer constant
using build_int_cst - and that can't handle COMPLEX_TYPE.  Also,
we can't just compare complex numbers, so I think it's sane to bail out
if fold_range_test isn't dealing with integer types...

I didn't want to put the check between the declarations, so the
make_range calls were moved slightly below.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-10-21  Marek Polacek  pola...@redhat.com

PR middle-end/58809
* fold-const.c (fold_range_test): Return 0 if the type is not
an integral type.
testsuite/
* gcc.dg/gomp/pr58809.c: New test.

--- gcc/fold-const.c.mp 2013-10-21 16:12:31.494179417 +0200
+++ gcc/fold-const.c2013-10-21 16:19:08.673580775 +0200
@@ -4984,12 +4984,16 @@ fold_range_test (location_t loc, enum tr
   int in0_p, in1_p, in_p;
   tree low0, low1, low, high0, high1, high;
   bool strict_overflow_p = false;
-  tree lhs = make_range (op0, in0_p, low0, high0, strict_overflow_p);
-  tree rhs = make_range (op1, in1_p, low1, high1, strict_overflow_p);
-  tree tem;
+  tree tem, lhs, rhs;
   const char * const warnmsg = G_(assuming signed overflow does not occur 
  when simplifying range test);
 
+  if (!INTEGRAL_TYPE_P (type))
+return 0;
+
+  lhs = make_range (op0, in0_p, low0, high0, strict_overflow_p);
+  rhs = make_range (op1, in1_p, low1, high1, strict_overflow_p);
+
   /* If this is an OR operation, invert both sides; we will invert
  again at the end.  */
   if (or_op)
--- gcc/testsuite/gcc.dg/gomp/pr58809.c.mp  2013-10-21 16:12:13.333114856 
+0200
+++ gcc/testsuite/gcc.dg/gomp/pr58809.c 2013-10-21 16:11:49.296022857 +0200
@@ -0,0 +1,13 @@
+/* PR middle-end/58809 */
+/* { dg-do compile } */
+/* { dg-options -fopenmp -O } */
+
+int i;
+#pragma omp threadprivate (i)
+
+void foo()
+{
+  _Complex int j;
+#pragma omp parallel copyin (i) reduction (:j)
+  ;
+}

Marek

Re: [PATCH] Fix PR58682

2013-10-21 Thread Jan Hubicka

  -Original Message-
  From: Jan Hubicka [mailto:hubi...@ucw.cz]
  Sent: 21 October 2013 13:21
  To: Paulo Matos
  Cc: Richard Biener; Mike Stump; Kyrill Tkachov; gcc-patches@gcc.gnu.org; Jan
  Hubicka (hubi...@ucw.cz)
  Subject: Re: [PATCH] Fix PR58682

  This won't work - the max_count is there to get things correctly scaled by
  badness calculation.
  Changing max_count would mean the need to recompute all badnesses in the 
  fibheap
  keys.
  I would just cap the edge_count in badness calculation.

  Honza

 Makes sense. Thanks for the comment. I have created a new patch. Now attached.

 2013-10-21  Paulo Matos  pma...@broadcom.com

   * ipa-inline.c (edge_badness): Cap edge-count at max_count for badness
   calculations.

OK, thanks!

Honza

[PATCH] More jump threading cleanups

2013-10-21 Thread Jeff Law



Two cleanups I noticed late last week while working on the general case 
FSA optimization.


First, we were not cancelling jump threads through joiner blocks when 
there is any edge on the jump threading path with a recorded jump thread 
opportunity.  We were not nearly as aggressive at pruning as we should 
have been resulting in useless block duplication.


Second, in the general form of the FSA optimization, we can have jump 
threading path which starts with a join point and traverses through a 
normal jump threading block (ie, has side effects and a compile-time 
determinable out-edge).  To facilitate these opportunities, we want to 
call thread_through_normal_block after threading through a joiner block. 
 Thus we need thread_through_normal_block to accept the bitmap of 
blocks we have already visited and check it appropriately.


Bootstrapped  regression tested on x86_64-unknown-linux-gnu.  Installed 
on trunk.



diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index c5b794e..54298e4 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,15 @@
+2013-10-21  Jeff Law  l...@redhat.com
+
+   * tree-ssa-threadedge.c (thread_through_normal_block): New argument 
VISITED.
+   Remove VISISTED as a local variable.  When we have a threadable jump, 
verify
+   the destination of the jump has not been visised.
+   (thread_across_edge): Allocate VISITED bitmap once at function scope and
+   use it throughout.  Make sure to set appropriate bits in VISITED for E 
(start
+   of jump thread path).
+
+   * tree-ssa-threadupdate.c (mark_threaded_blocks): Reject threading 
through
+   a joiner if any edge on the path has a recorded jump thread.
+
 2013-10-21  Ian Lance Taylor  i...@google.com
 
* doc/invoke.texi (Optimize Options): For -fno-toplevel-reorder,
diff --git a/gcc/tree-ssa-threadedge.c b/gcc/tree-ssa-threadedge.c
index f567557..ebd93cb 100644
--- a/gcc/tree-ssa-threadedge.c
+++ b/gcc/tree-ssa-threadedge.c
@@ -883,7 +883,8 @@ thread_through_normal_block (edge e,
 bool handle_dominating_asserts,
 vectree *stack,
 tree (*simplify) (gimple, gimple),
-vecjump_thread_edge * *path)
+vecjump_thread_edge * *path,
+bitmap visited)
 {
   /* If E is a backedge, then we want to verify that the COND_EXPR,
  SWITCH_EXPR or GOTO_EXPR at the end of e-dest is not affected
@@ -922,11 +923,10 @@ thread_through_normal_block (edge e,
{
  edge taken_edge = find_taken_edge (e-dest, cond);
  basic_block dest = (taken_edge ? taken_edge-dest : NULL);
- bitmap visited;
 
  /* DEST could be NULL for a computed jump to an absolute
 address.  */
- if (dest == NULL || dest == e-dest)
+ if (dest == NULL || dest == e-dest || bitmap_bit_p (visited, 
dest-index))
return false;
 
   jump_thread_edge *x
@@ -944,7 +944,6 @@ thread_through_normal_block (edge e,
{
  /* We don't want to thread back to a block we have already
 visited.  This may be overly conservative.  */
- visited = BITMAP_ALLOC (NULL);
  bitmap_set_bit (visited, dest-index);
  bitmap_set_bit (visited, e-dest-index);
  thread_around_empty_blocks (taken_edge,
@@ -953,7 +952,6 @@ thread_through_normal_block (edge e,
  simplify,
  visited,
  path);
- BITMAP_FREE (visited);
}
  return true;
}
@@ -995,15 +993,21 @@ thread_across_edge (gimple dummy_cond,
vectree *stack,
tree (*simplify) (gimple, gimple))
 {
+  bitmap visited = BITMAP_ALLOC (NULL);
+
   stmt_count = 0;
 
   vecjump_thread_edge * *path = new vecjump_thread_edge * ();
+  bitmap_clear (visited);
+  bitmap_set_bit (visited, e-src-index);
+  bitmap_set_bit (visited, e-dest-index);
   if (thread_through_normal_block (e, dummy_cond, handle_dominating_asserts,
-  stack, simplify, path))
+  stack, simplify, path, visited))
 {
   propagate_threaded_block_debug_into (path-last ()-e-dest,
   e-dest);
   remove_temporary_equivalences (stack);
+  BITMAP_FREE (visited);
   register_jump_thread (path);
   return;
 }
@@ -1030,7 +1034,6 @@ thread_across_edge (gimple dummy_cond,
 edge taken_edge;
 edge_iterator ei;
 bool found;
-bitmap visited = BITMAP_ALLOC (NULL);
 
 /* Look at each successor of E-dest to see if we can thread through it.  
*/
 FOR_EACH_EDGE (taken_edge, ei, e-dest-succs)
diff --git a/gcc/tree-ssa-threadupdate.c b/gcc/tree-ssa-threadupdate.c
index e791269..737a6a2 100644
---

Doc patch RFA: Remove docs for -fno-default-inline

2013-10-21 Thread Ian Lance Taylor

According to c.opt, the -fdefault-inline patch is ignored, and only
exists for backward compatibility.  The only use of flag_default_inline
was removed by Honza here:
http://gcc.gnu.org/ml/gcc-patches/2008-07/msg02104.html

This patch removes the docs for -fdefault-inline.

Bootstrapped on x86_64-unknown-linux-gnu.  OK for mainline?

Ian


2013-10-21  Ian Lance Taylor  i...@google.com

* doc/invoke.texi (Option Summary): Remove -fno-default-inline.
(C++ Dialect Options): Likewise.
(Optimize Options): Likewise.


Index: doc/invoke.texi
===
--- doc/invoke.texi	(revision 203894)
+++ doc/invoke.texi	(working copy)
@@ -190,7 +190,7 @@ in the following sections.
 -frepo  -fno-rtti  -fstats  -ftemplate-backtrace-limit=@var{n} @gol
 -ftemplate-depth=@var{n} @gol
 -fno-threadsafe-statics -fuse-cxa-atexit  -fno-weak  -nostdinc++ @gol
--fno-default-inline  -fvisibility-inlines-hidden @gol
+-fvisibility-inlines-hidden @gol
 -fvtable-verify=@var{std|preinit|none} @gol
 -fvtv-counts -fvtv-debug @gol
 -fvisibility-ms-compat @gol
@@ -385,7 +385,6 @@ Objective-C and Objective-C++ Dialects}.
 -flto-partition=@var{alg} -flto-report -flto-report-wpa -fmerge-all-constants @gol
 -fmerge-constants -fmodulo-sched -fmodulo-sched-allow-regmoves @gol
 -fmove-loop-invariants fmudflap -fmudflapir -fmudflapth -fno-branch-count-reg @gol
--fno-default-inline @gol
 -fno-defer-pop -fno-function-cse -fno-guess-branch-probability @gol
 -fno-inline -fno-math-errno -fno-peephole -fno-peephole2 @gol
 -fno-sched-interblock -fno-sched-spec -fno-signed-zeros @gol
@@ -2403,13 +2402,6 @@ In addition, these optimization, warning
 have meanings only for C++ programs:
 
 @table @gcctabopt
-@item -fno-default-inline
-@opindex fno-default-inline
-Do not assume @samp{inline} for functions defined inside a class scope.
-@xref{Optimize Options,,Options That Control Optimization}.  Note that these
-functions have linkage like inline functions; they just aren't
-inlined by default.
-
 @item -Wabi @r{(C, Objective-C, C++ and Objective-C++ only)}
 @opindex Wabi
 @opindex Wno-abi
@@ -6822,14 +6814,6 @@ can use the following flags in the rare
 optimizations to be performed is desired.
 
 @table @gcctabopt
-@item -fno-default-inline
-@opindex fno-default-inline
-Do not make member functions inline by default merely because they are
-defined inside the class scope (C++ only).  Otherwise, when you specify
-@w{@option{-O}}, member functions defined inside class scope are compiled
-inline by default; i.e., you don't need to add @samp{inline} in front of
-the member function name.
-
 @item -fno-defer-pop
 @opindex fno-defer-pop
 Always pop the arguments to each function call as soon as that function

Re: patch to enable LRA for ppc

2013-10-21 Thread Michael Meissner

On Sun, Oct 20, 2013 at 10:48:08PM -0400, Vladimir Makarov wrote:
 On 13-10-18 11:26 AM, David Edelsohn wrote:
 On Thu, Oct 3, 2013 at 5:02 PM, Vladimir Makarov vmaka...@redhat.com wrote:
 The following patch permits today trunk to use LRA for ppc by default.
 To switch it off -mno-lra can be used.
 
 The patch was bootstrapped on ppc64.  GCC testsuite does not have
 regressions too (in comparison with reload).  The change in rs6000.md is
 for fix LRA failure on a recently added ppc test.
 Vlad,
 
 I have not forgotten this patch. We are trying to figure out the right
 timeframe to make this change. The patch does affect performance --
 both positively and negatively; most are in the noise but not all. And
 there still are some SPEC benchmarks that fail to build with the
 patch, at least in Mike's tests. And Mike is implementing some patches
 to utilize reload to improve use of VSX registers, which would need to
 be mirrored in LRA for the equivalent functionality.
 Thanks for informing me, David.
 
 I am ready to work on any LRA ppc issues when it will be in the
 trunk.  It would be easier for me to work on LRA ppc if the patch is
 committed to the trunk and of course LRA is used as non-default
 local RA.
 
 I don't know what Mike is doing on reload to use VSX registers.  I
 guess it is usage of  VSX regs as spilled locations for GENERAL regs
 instead of memory.  If it is so, it is 2 day work to add this
 functionality in LRA (as it already has analogous functionality for
 Intel processors and that gave a nice SPECFP2000 improvement for
 them) and probably more work on resolving issues especially as I
 have no power8.

I would say lets add -mlra, but make the default OFF for the time being.  We
can always switch the default later.

Vladimir, I thought I included you in the list when I gave status.  The big
thing is several of the Spec 2006 benchmarks don't work in 32-bit mode, and I
get a lot of Fortran errors, again in 32-bit.  I also saw some decimal floating
point problems.

What I'm doing is adding secondary reload support so that up until reload time,
we can represent VSX addresses as reg+offset, and in secondary reload, create
the addition instructions to put the offset in a base register.  I haven't made
any changes to the machine independent portions of the compiler.  As long as
IRA uses the secondary reload interface, it should be ok.  However, right now,
I need to focus most of my attention on getting the secondary reload support to
work.

One thing that I've asked for before, but to remind you, is I really, really
wish secondary reload could allocate two scratch registers if it is given an
insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, since
that gives 2 registers, but future changes will want TFmode to go into a single
vector register, and I will need to create another type, like V4DI that does
take 2 registers.  The case that this is needed for is moving an item from GPRs
to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
move into, and then I will do the combine operation.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping

2013-10-21 Thread Martin Jambor

Hi,

in spring I have suggested to shedule pass_cprop_hardreg before
pass_thread_prologue_and_epilogue in order to create many more
shrink-wrapping opportunities.  The problem is that formal arguments
of a functions which need to be saved across calls on slow paths often
get assigned callee saved registers and the very first BB thus needs
prologue, which makes any attempt at shrink-wrapping difficult.

Steven suggested that splitting live-ranges of these registers might
be a better approach (and Vlad suggested that I should look at
http://gcc.gnu.org/ml/gcc-patches/2013-04/msg01137.html, a big thanks
to both) and this is what the patch below does.

The basic idea is to split live ranges of problematic pseudos in the
common dominator of all non-sibling calls and all uses of those
pseudos in BBs which are reachable from calls, if such dominator is
not in a loop and if it does not post-dominate the entry BB.

As far as the number of performed shrink-wrappings is concerned, it
has even bigger effect to scheduling pass_cprop_hardreg earlier that I
have proposed in spring.  Apart from looking at bootstrap and SPEC
2006 (at -Ofast) I have tried compiling Python and Ruby (with whatever
their default flags were, I believe mostly -O3) because David
suggested at his Cauldron talk last year that shrink-wrapping may be
important:

Number of performed shrink-wrappings:
| Source  | Trunk | With patch |   % |
|-+---++-|
| SPEC 2006 Int   |   734 |   1670 | +127.52 |
| SPEC 2006 FP (*)|   575 |   1022 |  +77.74 |
| C/CPP bootstrap stage 2 |  1748 |   3015 |  +72.48 |
| Python 2.7.5|   231 |401 |  +73.59 |
| Python 3.3.2|   234 |416 |  +77.78 |
| Ruby 2.0.0-p247 |   331 |464 |  +40.18 |

(*) I forgot to increase the stack limit and I believe that is the
reason why 410.bwaves, 416.gamess, 447.dealII and 481.wrf SPEC 2006 FP
benchmarks failed, regardless of my patch, but I have not investigated
what exactly happened, so there might have even been compile-time
issues.

As far as run-times are concerned, I have tried running the benchmarks
from hg.python.org/benchmarks but I do not have anything to report.
David claimed that shrink-wrapping in function lookdict_string is
likely to help but this patch alone does not make that happen, I am
investigating what else needs to be done for this to happen.  I have
not yet tried benchmarking ruby.  Spec 2006 (*) run times are a bit
encouraging.

Run-times (seconds, median of three runs):
| Benchmark  | Trunk | Live range splitting | % |
|+---+--+---|
| 400.perlbench  |   294 |  290 | -1.36 |
| 401.bzip2  |   384 |  387 | +0.78 |
| 403.gcc|   239 |  237 | -0.84 |
| 429.mcf|   227 |  228 | +0.44 |
| 445.gobmk  |   392 |  390 | -0.51 |
| 456.hmmer  |   343 |  342 | -0.29 |
| 458.sjeng  |   418 |  411 | -1.67 |
| 462.libquantum |   287 |  288 | +0.35 |
| 464.h264ref|   417 |  418 | +0.24 |
| 471.omnetpp|   275 |  280 | +1.82 |
| 473.astar  |   312 |  295 | -5.45 |
| 483.xalancbmk  |   183 |  183 |  0.00 |
|+---+--+---|
| 433.milc   |   336 |  335 | -0.30 |
| 434.zeusmp |   247 |  246 | -0.40 |
| 435.gromacs|   259 |  259 |  0.00 |
| 436.cactusADM  |   192 |  186 | -3.12 |
| 437.leslie3d   |   227 |  226 | -0.44 |
| 444.namd   |   315 |  315 |  0.00 |
| 450.soplex |   202 |  200 | -0.99 |
| 453.povray |   136 |  134 | -1.47 |
| 454.calculix   |   300 |  300 |  0.00 |
| 459.GemsFDTD   |   273 |  267 | -2.20 |
| 465.tonto  |   685 |  685 |  0.00 |
| 470.lbm|   212 |  213 | +0.47 |
| 482.sphinx3|   363 |  360 | -0.83 |

The patch is speculative, quite a few modifications do not lead to
shrink wrappings, as outlined in the following table which shows how
many of the changed functions still needed their prologue in the first
BB.

Number of functions touched:
| |  Modified | |   |
| Source  | functions | In vain | % |
|-+---+-+---|
| SPEC 2006 Int   |  2004 | 745 | 37.18 |
| SPEC 2006 FP (*)|  1491 | 881 | 59.09 |
| C/C++ bootstrap stage 2 |  2864 | 935 | 32.65 |

Re: [PATCH, rs6000] Be careful with special permute masks for little endian

2013-10-21 Thread Bill Schmidt

Whoops, looks like I missed some simpler cases (REG with the wrong mode
instead of SUBREG with the wrong mode).  Is this revised version ok,
assuming it passes testing?  It should fix a few more test cases.

The changed code from the previous version is in the last hunk.

Thanks,
Bill


Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 203792)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
   {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
 { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
   {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+{ OPTION_MASK_ALTIVEC, 
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
   {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
   {  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
   {  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
   {  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
   {  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
   {  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
 { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
   {  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
@@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
  enum machine_mode omode = insn_data[icode].operand[0].mode;
  enum machine_mode imode = insn_data[icode].operand[1].mode;
 
+ /* For little-endian, don't use vpkuwum and vpkuhum if the
+underlying vector type is not V4SI and V8HI, respectively.
+For example, using vpkuwum with a V8HI picks up the even
+halfwords (BE numbering) when the even halfwords (LE
+numbering) are what we need.  */
+ if (!BYTES_BIG_ENDIAN
+  icode == CODE_FOR_altivec_vpkuwum
+  ((GET_CODE (op0) == REG
+   GET_MODE (op0) != V4SImode)
+ || (GET_CODE (op0) == SUBREG
+  GET_MODE (XEXP (op0, 0)) != V4SImode)))
+   continue;
+ if (!BYTES_BIG_ENDIAN
+  icode == CODE_FOR_altivec_vpkuhum
+  ((GET_CODE (op0) == REG
+   GET_MODE (op0) != V8HImode)
+ || (GET_CODE (op0) == SUBREG
+  GET_MODE (XEXP (op0, 0)) != V8HImode)))
+   continue;
+
   /* For little-endian, the two input operands must be swapped
  (or swapped back) to ensure proper right-to-left numbering
  from 0 to 2N-1.  */


On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
 On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
 wschm...@linux.vnet.ibm.com wrote:
  Hi,
 
  In altivec_expand_vec_perm_const, we look for special masks that match
  the behavior of specific instructions, so we can use those instructions
  rather than load a constant control vector and perform a permute.  Some
  of the masks must be treated differently for little endian mode.
 
  The masks that represent merge-high and merge-low operations have
  reversed meanings in little-endian, because of the reversed ordering of
  the vector elements.
 
  The masks that represent vector-pack operations remain correct when the
  mode of the input operands matches the natural mode of the instruction,
  but not otherwise.  This is because the pack instructions always select
  the rightmost, low-order bits of the vector element.  There are cases
  where we use this, for example, with a V8SI vector matching a vpkuwum
  mask in order to select the odd-numbered elements of the vector.  In
  little endian mode, this instruction will get us the even-numbered
  elements instead.  There is no alternative instruction with the desired
  behavior, so I've just disabled use of those masks for little endian
  when the mode isn't natural.
 
  These changes fix 32 failures in the test suite for little endian mode.

Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.

2013-10-21 Thread Cong Hou

Jeff, thank you for installing this patch. Actually I already have the
write privileges. I just came back from a trip.

Thank you again!



thanks,
Cong


On Fri, Oct 18, 2013 at 10:22 PM, Jeff Law l...@redhat.com wrote:
 On 10/18/13 03:56, Richard Biener wrote:

 On Thu, 17 Oct 2013, Cong Hou wrote:

 I tested this case with -fno-tree-loop-im and -fno-tree-pre, and it
 seems that GCC could hoist j+1 outside of the i loop:

 t3.c:5:5: note: hoisting out of the vectorized loop: _10 = (sizetype)
 j_25;
 t3.c:5:5: note: hoisting out of the vectorized loop: _11 = _10 + 1;
 t3.c:5:5: note: hoisting out of the vectorized loop: _12 = _11 * 4;
 t3.c:5:5: note: hoisting out of the vectorized loop: _14 = b_13(D) + _12;
 t3.c:5:5: note: hoisting out of the vectorized loop: _15 = *_14;
 t3.c:5:5: note: hoisting out of the vectorized loop: _16 = _15 + 1;


 But your suggestion is still nice as it can remove a branch and make
 the code more brief. I have updated the patch and also included the
 nested loop example into the test case.


 Ok if it passes bootstrap  regtest.

 Bootstrapped  regression tested on x86_64-unknown-linux-gnu.  Installed on
 Cong's behalf.

 Cong -- if you plan on contributing regularly to GCC, please start the
 process for write privileges.  This form should have everything you need:

 https://sourceware.org/cgi-bin/pdw/ps_form.cgi

 Jeff

Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.

2013-10-21 Thread Jeff Law


On 10/21/13 10:45, Cong Hou wrote:

Jeff, thank you for installing this patch. Actually I already have the
write privileges. I just came back from a trip.
Ah.  I didn't see you in the MAINTAINERS file.  Can you update that file 
please.


Thanks,
jeff

Re: [Patch] Fix gcc.dg/20050922-*.c

2013-10-21 Thread Jeff Law


On 10/21/13 04:28, Vidya Praveen wrote:

Hello,

Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h. This can
be a issue especially since they define uint32_t. Testcase writing guidelines
discourages such inclusion as well.

This patch replaces these #includes with manual declarations.

Tested for aarch64-none-elf, arm-none-eabi and x86_64-linux-gnu

OK for trunk, 4.7, 4.8?

VP.

---

gcc/testsuite/ChangeLog:

2013-10-21  Vidya Praveen  vidyaprav...@arm.com

* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort().
* gcc.dg/20050922-1.c: Remove stdlib.h and declare abort() and exit().

OK  installed on trunk.

Release managers would need to make a decision about whether or not to 
include this for the 4.7/4.8 branches.


Thanks,
Jeff

Re-factor inclusion of tree.h

2013-10-21 Thread Diego Novillo

This moves tree.h out of every header. This exposes dependencies of
tree.h in files that should probably not need it after tree and gimple
are separated.

After this change, no header should include tree.h directly. It should
only be included by a .c file. Unfortunately, I did not find an
automatic way of forcing this.

Tested on x86_64 with all languages enabled and using
contrib/config-list.mk.

Andrew, I think that some of the inclusions of tree-core.h can
actually go away. Last time we chatted about this, I think we
decided that including tree-core.h from a header file is OK. Not
sure if that's still the case.

I plan to commit this by tomorrow, unless there are objections.

Diego.


* asan.c: Include tree.h
* bb-reorder.c: Likewise.
* cfgcleanup.c: Likewise.
* cfgloopmanip.c: Likewise.
* data-streamer-in.c: Likewise.
* data-streamer-out.c: Likewise.
* data-streamer.c: Likewise.
* dwarf2cfi.c: Likewise.
* graphite-blocking.c: Likewise.
* graphite-clast-to-gimple.c: Likewise.
* graphite-dependences.c: Likewise.
* graphite-interchange.c: Likewise.
* graphite-optimize-isl.c: Likewise.
* graphite-poly.c: Likewise.
* graphite-scop-detection.c: Likewise.
* graphite-sese-to-poly.c: Likewise.
* graphite.c: Likewise.
* ipa-devirt.c: Likewise.
* ipa-profile.c: Likewise.
* ipa.c: Likewise.
* ira.c: Likewise.
* loop-init.c: Likewise.
* loop-unroll.c: Likewise.
* lower-subreg.c: Likewise.
* lto/lto-object.c: Likewise.
* recog.c: Likewise.
* reginfo.c: Likewise.
* tree-loop-distribution.c: Likewise.
* tree-parloops.c: Likewise.
* tree-ssa-strlen.c: Likewise.
* tree-streamer.c: Likewise.
* value-prof.c: Likewise.
* expr.h: Include tree-core.h instead of tree.h.
* gimple.h: Likewise.
* ipa-prop.h: Likewise.
* ipa-utils.h: Likewise.
* lto-streamer.h: Likewise.
* streamer-hooks.h: Likewise.
* ipa-reference.h: Include cgraph.h instead of tree.h.
* cgraph.h: Include basic-block.h instead of tree.h.
* tree-streamer.h: Do not include tree.h.

testsuite/ChangeLog

* g++.dg/plugin/selfassign.c: Include tree.h.
* gcc.dg/plugin/finish_unit_plugin.c: Likewise.
* gcc.dg/plugin/ggcplug.c: Likewise.
* gcc.dg/plugin/one_time_plugin.c: Likewise.
* gcc.dg/plugin/selfassign.c: Likewise.
* gcc.dg/plugin/start_unit_plugin.c: Likewise.
---
 gcc/ChangeLog| 44 
 gcc/asan.c   |  1 +
 gcc/bb-reorder.c |  1 +
 gcc/cfgcleanup.c |  1 +
 gcc/cfgloopmanip.c   |  1 +
 gcc/cgraph.h |  2 +-
 gcc/data-streamer-in.c   |  1 +
 gcc/data-streamer-out.c  |  1 +
 gcc/data-streamer.c  |  1 +
 gcc/dwarf2cfi.c  |  1 +
 gcc/expr.h   |  2 +-
 gcc/gimple.h |  2 +-
 gcc/graphite-blocking.c  |  1 +
 gcc/graphite-clast-to-gimple.c   |  1 +
 gcc/graphite-dependences.c   |  1 +
 gcc/graphite-interchange.c   |  1 +
 gcc/graphite-optimize-isl.c  |  1 +
 gcc/graphite-poly.c  |  1 +
 gcc/graphite-scop-detection.c|  1 +
 gcc/graphite-sese-to-poly.c  |  1 +
 gcc/graphite.c   |  1 +
 gcc/ipa-devirt.c |  1 +
 gcc/ipa-profile.c|  1 +
 gcc/ipa-prop.h   |  2 +-
 gcc/ipa-reference.h  |  2 +-
 gcc/ipa-utils.h  |  2 +-
 gcc/ipa.c|  1 +
 gcc/ira.c|  1 +
 gcc/loop-init.c  |  1 +
 gcc/loop-unroll.c|  1 +
 gcc/lower-subreg.c   |  1 +
 gcc/lto-streamer.h   |  2 +-
 gcc/lto/lto-object.c |  1 +
 gcc/recog.c  |  1 +
 gcc/reginfo.c|  1 +
 gcc/streamer-hooks.h |  2 +-
 gcc/testsuite/ChangeLog  |  9 +
 gcc/testsuite/g++.dg/plugin/selfassign.c |  1 +
 gcc/testsuite/gcc.dg/plugin/finish_unit_plugin.c |  1 +
 gcc/testsuite/gcc.dg/plugin/ggcplug.c|  1 +
 gcc/testsuite/gcc.dg/plugin/one_time_plugin.c|  1 +

Re: [PATCH] Make fold_range_test work only on integral types (PR middle-end/58809)

2013-10-21 Thread Jeff Law


On 10/21/13 09:13, Marek Polacek wrote:

In this PR (well, in the first testcase in it), the problem was that
fold_range_test got op0 and op1 of a COMPLEX_TYPE, subsequent call
of make_range then failed because it wants to create an integer constant
using build_int_cst - and that can't handle COMPLEX_TYPE.  Also,
we can't just compare complex numbers, so I think it's sane to bail out
if fold_range_test isn't dealing with integer types...

I didn't want to put the check between the declarations, so the
make_range calls were moved slightly below.

Regtested/bootstrapped on x86_64-linux, ok for trunk?

2013-10-21  Marek Polacek  pola...@redhat.com

PR middle-end/58809
* fold-const.c (fold_range_test): Return 0 if the type is not
an integral type.
testsuite/
* gcc.dg/gomp/pr58809.c: New test.
For exceedingly annoying reasons BOOLEAN_TYPE is not considered an 
integral type.  You might consider letting fold_range_test do it's thing 
on BOOLEAN_TYPEs, though I'd be surprised if that matters in practice.


Your call -- fine as-is or you can let BOOLEAN_TYPEs through with a 
pre-approved patch.


jeff

Re: Re-factor inclusion of tree.h

2013-10-21 Thread Jeff Law


On 10/21/13 10:52, Diego Novillo wrote:

This moves tree.h out of every header. This exposes dependencies of
tree.h in files that should probably not need it after tree and gimple
are separated.

After this change, no header should include tree.h directly. It should
only be included by a .c file. Unfortunately, I did not find an
automatic way of forcing this.

Tested on x86_64 with all languages enabled and using
contrib/config-list.mk.

Andrew, I think that some of the inclusions of tree-core.h can
actually go away. Last time we chatted about this, I think we
decided that including tree-core.h from a header file is OK. Not
sure if that's still the case.

I plan to commit this by tomorrow, unless there are objections.

I can't think of a good reason to even bother waiting :-)

jeff

Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.

2013-10-21 Thread Cong Hou

OK. Have done that. And this is also a patch, right? ;)


thanks,
Cong



diff --git a/MAINTAINERS b/MAINTAINERS
index 15b6cc7..a6954da 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -406,6 +406,7 @@ Fergus Hendersonf...@cs.mu.oz.au
 Stuart Henderson   shend...@gcc.gnu.org
 Matthew Hiller hil...@redhat.com
 Manfred Hollstein  m...@suse.com
+Cong Hou   co...@google.com
 Falk Hueffner  f...@debian.org
 Andrew John Hughes gnu_and...@member.fsf.org
 Andy Hutchinsonhutchinsona...@aim.com





On Mon, Oct 21, 2013 at 9:46 AM, Jeff Law l...@redhat.com wrote:
 On 10/21/13 10:45, Cong Hou wrote:

 Jeff, thank you for installing this patch. Actually I already have the
 write privileges. I just came back from a trip.

 Ah.  I didn't see you in the MAINTAINERS file.  Can you update that file
 please.

 Thanks,
 jeff

Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-21 Thread Jeff Law


On 10/18/13 15:06, Iyer, Balaji V wrote:

Hi Jeff, Please see my comments below. Also, I am adding all these
changes to the files as you requested in my local directory. Should I
send you an updated patch along the way?
I'll let you know when I've worked my way through everything.  ISTM an 
updated patch after that would be best.  I'm hoping to get through the 
remaining changes today, then review your replies in detail.



jeff

Re: [PATCH] Hoist loop invariant statements containing data refs with zero-step during loop-versioning in vectorization.

2013-10-21 Thread Jeff Law


On 10/21/13 11:00, Cong Hou wrote:

OK. Have done that. And this is also a patch, right? ;)

Right :-)  Approved.
jeff

[Patch, Fortran] PR58803 - Prevent a double-free with proc-pointer components

2013-10-21 Thread Tobias Burnus


As written in the PR:

The problem is that in free_components one frees:

2054free_components (gfc_component *p)
...
2058  for (; p; p = q)
2059{
2060  q = p-next;
2061
2062  gfc_free_array_spec (p-as);
2063  gfc_free_expr (p-initializer);
2064  free (p-tb);
2065
2066  free (p);
2067}

Here:
  p-name == f1
  p-tb == (gfc_typebound_proc *) 0x17e0070

when one then cycles to p = q (i.e. to p-next), one has:
  p-name == f2
  p-tb == (gfc_typebound_proc *) 0x17e0070


The patch is rather straight forward.

Build and regtested on x86-64-gnu-linux.
OK for the trunk?

Tobias
2013-10-21  Tobias Burnus  bur...@net-b.de

	PR fortran/58803
	* decl.c (match_ppc_decl): Prevent later
	double free.

2013-10-21  Tobias Burnus  bur...@net-b.de

	PR fortran/58803
	* gfortran.dg/proc_ptr_comp_38.f90: New.

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 3a8175f..9c9fd4f 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -5055,7 +5055,14 @@ match_ppc_decl (void)
   if (!gfc_add_proc (c-attr, name, NULL))
 	return MATCH_ERROR;
 
-  c-tb = tb;
+  if (num == 1)
+	c-tb = tb;
+  else
+	{
+	  c-tb = XCNEW (gfc_typebound_proc);
+	  c-tb-where = gfc_current_locus;
+	  *c-tb = *tb;
+	}
 
   /* Set interface.  */
   if (proc_if != NULL)
diff --git a/gcc/testsuite/gfortran.dg/proc_ptr_comp_38.f90 b/gcc/testsuite/gfortran.dg/proc_ptr_comp_38.f90
new file mode 100644
index 000..2a71ca0
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/proc_ptr_comp_38.f90
@@ -0,0 +1,12 @@
+! { dg-do compile }
+!
+! PR fortran/58803
+!
+! Contributed by Vittorio Zecca
+!
+! Was before ICEing due to a double free
+!
+  type t
+   procedure(real), pointer, nopass  :: f1, f2
+  end type
+  end

Re: [PATCH] Make fold_range_test work only on integral types (PR middle-end/58809)

2013-10-21 Thread Marek Polacek

On Mon, Oct 21, 2013 at 10:56:52AM -0600, Jeff Law wrote:
 On 10/21/13 09:13, Marek Polacek wrote:
 In this PR (well, in the first testcase in it), the problem was that
 fold_range_test got op0 and op1 of a COMPLEX_TYPE, subsequent call
 of make_range then failed because it wants to create an integer constant
 using build_int_cst - and that can't handle COMPLEX_TYPE.  Also,
 we can't just compare complex numbers, so I think it's sane to bail out
 if fold_range_test isn't dealing with integer types...
 
 I didn't want to put the check between the declarations, so the
 make_range calls were moved slightly below.
 
 Regtested/bootstrapped on x86_64-linux, ok for trunk?
 
 2013-10-21  Marek Polacek  pola...@redhat.com
 
  PR middle-end/58809
  * fold-const.c (fold_range_test): Return 0 if the type is not
  an integral type.
 testsuite/
  * gcc.dg/gomp/pr58809.c: New test.
 For exceedingly annoying reasons BOOLEAN_TYPE is not considered an
 integral type.  You might consider letting fold_range_test do it's
 thing on BOOLEAN_TYPEs, though I'd be surprised if that matters in
 practice.

Really?  In tree.h I can see

#define INTEGRAL_TYPE_P(TYPE)  \
  (TREE_CODE (TYPE) == ENUMERAL_TYPE  \
   || TREE_CODE (TYPE) == BOOLEAN_TYPE \
   || TREE_CODE (TYPE) == INTEGER_TYPE)

 Your call -- fine as-is or you can let BOOLEAN_TYPEs through with a
 pre-approved patch.

...so I'm keeping the patch as-is.  Thanks,

Marek

[jit] Implement gcc_jit_lvalue_get_address

2013-10-21 Thread David Malcolm

Committed to dmalcolm/jit:

gcc/jit/
* TODO.rst (the C unary prefix  operator): Remove completed item.
* internal-api.c (gcc::jit::lvalue::get_address): New.
* internal-api.h (gcc::jit::lvalue::get_address): New.
* libgccjit.c (gcc_jit_lvalue_get_address): New.
* libgccjit.h (gcc_jit_lvalue_get_address): New.
* libgccjit.map (gcc_jit_lvalue_get_address): New.

gcc/testsuite/
* jit.dg/test-expressions.c (test_global): New.
(make_test_of_get_address): New.
(verify_get_address): New.
(code_making_callback): Add call to make_test_of_get_address.
(verify_code): Add call to verify_get_address.
---
 gcc/jit/ChangeLog.jit   |  9 ++
 gcc/jit/TODO.rst|  5 ---
 gcc/jit/internal-api.c  | 13 
 gcc/jit/internal-api.h  |  3 ++
 gcc/jit/libgccjit.c |  9 ++
 gcc/jit/libgccjit.h |  7 +
 gcc/jit/libgccjit.map   |  1 +
 gcc/testsuite/ChangeLog.jit |  8 +
 gcc/testsuite/jit.dg/test-expressions.c | 54 +
 9 files changed, 104 insertions(+), 5 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index 7c574e8..e16902d 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,3 +1,12 @@
+2013-10-21  David Malcolm  dmalc...@redhat.com
+
+   * TODO.rst (the C unary prefix  operator): Remove completed item.
+   * internal-api.c (gcc::jit::lvalue::get_address): New.
+   * internal-api.h (gcc::jit::lvalue::get_address): New.
+   * libgccjit.c (gcc_jit_lvalue_get_address): New.
+   * libgccjit.h (gcc_jit_lvalue_get_address): New.
+   * libgccjit.map (gcc_jit_lvalue_get_address): New.
+
 2013-10-18  David Malcolm  dmalc...@redhat.com
 
* internal-api.c (gcc::jit::context::new_param): Add context
diff --git a/gcc/jit/TODO.rst b/gcc/jit/TODO.rst
index 41cc4ad..d0daeb4 100644
--- a/gcc/jit/TODO.rst
+++ b/gcc/jit/TODO.rst
@@ -55,11 +55,6 @@ Initial Release
 
   and, indeed, clarify all of the other operations.
 
-* the C unary prefix  operator::
-
-extern gcc_jit_rvalue *
-gcc_jit_lvalue_get_address (gcc_jit_lvalue *lvalue);
-
 * array types, in case they're needed for structs::
 
 extern gcc_jit_type *
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 81178e1..8f84e14 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -757,6 +757,19 @@ dereference (gcc::jit::location *loc)
   return new lvalue (get_context (), datum);
 }
 
+gcc::jit::rvalue *
+gcc::jit::lvalue::
+get_address (location *loc)
+{
+  tree t_lvalue = as_tree ();
+  tree t_thistype = TREE_TYPE (t_lvalue);
+  tree t_ptrtype = build_pointer_type (t_thistype);
+  tree ptr = build1 (ADDR_EXPR, t_ptrtype, t_lvalue);
+  if (loc)
+get_context ()-set_tree_location (ptr, loc);
+  return new rvalue (get_context (), ptr);
+}
+
 void *
 gcc::jit::wrapper::
 operator new (size_t sz)
diff --git a/gcc/jit/internal-api.h b/gcc/jit/internal-api.h
index e31dbc8..1a8499f 100644
--- a/gcc/jit/internal-api.h
+++ b/gcc/jit/internal-api.h
@@ -422,6 +422,9 @@ public:
   access_field (location *loc,
const char *fieldname);
 
+  rvalue *
+  get_address (location *loc);
+
 };
 
 class param : public lvalue
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 023c213..8d77761 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -529,6 +529,15 @@ gcc_jit_rvalue_dereference (gcc_jit_rvalue *rvalue,
   return (gcc_jit_lvalue *)rvalue-dereference (loc);
 }
 
+gcc_jit_rvalue *
+gcc_jit_lvalue_get_address (gcc_jit_lvalue *lvalue,
+   gcc_jit_location *loc)
+{
+  RETURN_NULL_IF_FAIL (lvalue, NULL, NULL lvalue);
+
+  return (gcc_jit_rvalue *)lvalue-get_address (loc);
+}
+
 gcc_jit_lvalue *
 gcc_jit_function_new_local (gcc_jit_function *func,
gcc_jit_location *loc,
diff --git a/gcc/jit/libgccjit.h b/gcc/jit/libgccjit.h
index 2187ede..12a715d 100644
--- a/gcc/jit/libgccjit.h
+++ b/gcc/jit/libgccjit.h
@@ -601,6 +601,13 @@ extern gcc_jit_lvalue *
 gcc_jit_rvalue_dereference (gcc_jit_rvalue *rvalue,
gcc_jit_location *loc);
 
+/* Taking the address of an lvalue; analogous to:
+ (EXPR)
+   in C.  */
+extern gcc_jit_rvalue *
+gcc_jit_lvalue_get_address (gcc_jit_lvalue *lvalue,
+   gcc_jit_location *loc);
+
 extern gcc_jit_lvalue *
 gcc_jit_function_new_local (gcc_jit_function *func,
gcc_jit_location *loc,
diff --git a/gcc/jit/libgccjit.map b/gcc/jit/libgccjit.map
index 7700788..e2503e4 100644
--- a/gcc/jit/libgccjit.map
+++ b/gcc/jit/libgccjit.map
@@ -41,6 +41,7 @@
 gcc_jit_loop_end;
 gcc_jit_lvalue_as_rvalue;
 gcc_jit_lvalue_access_field;
+gcc_jit_lvalue_get_address;
 gcc_jit_param_as_lvalue;
 gcc_jit_param_as_rvalue;

Re: [RFC] Isolate simplify paths with undefined behaviour

2013-10-21 Thread Jeff Law


On 10/21/13 06:19, Richard Biener wrote:



I wonder why this isn't part of the regular jump-threading code - after
all the opportunity is to thead to __builtin_unreachable () ;)  Otherwise
the question is always where you'd place this pass and whether it
enables jump threading or CSE opportunities (that at least it does).
In theory one could do this as part of jump threading.  Doing it in jump 
threading would have the advantage that we could use a NULL generated in 
a different block than the dereference.  Of course, we'd have a full 
path to duplicate at that point.


The downside is complexity.  This pass is amazingly simple and 
effective.   The jump threading code is considerably more complex and 
bolting this on the side would just make it worse.


From a pipeline location, if kept as a separate pass, it probably needs 
to be between DOM1  phi-only cprop.  DOM1 does a reasonable job at 
propagating the NULL values into the PHI nodes to expose the 
optimization.  And the pass tends to expose precisely the kinds of PHI 
nodes that phi-only cprop can optimize.  VRP2/PRE/DOM2 do a good job at 
finding the newly exposed CSEs and jump threading opportunities.


I haven't looked to see if there's anything left to optimize after DOM2, 
but I can certainly speculate that there would be cases exposed by the 
various passes that currently exist between DOM1  DOM2.  That's 
certainly worth exploring.


As you know, phase ordering will always be a concern.  We already have 
acknowledged that we're punting some of those issues once we stopped 
iterating DOM.


Jeff

[Patch, Fortran] PR58793 - reject passing TYPE() to CLASS()

2013-10-21 Thread Tobias Burnus


The issue came up while reviewing the patch for PR58793.

Build and regtested on x86-64-gnu-linux
OK for the trunk?

Tobias
2013-10-21  Tobias Burnus  bur...@net-b.de

	PR fortran/58793
	* interface.c (compare_parameter): Reject passing TYPE(*)
	to CLASS(*).

2013-10-21  Tobias Burnus  bur...@net-b.de

	PR fortran/58793
	* gfortran.dg/assumed_type_8.f90: New.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index b3ddf5f..0504c90 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -1972,6 +1972,15 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
   return 0;
 }
 
+  if (actual-ts.type == BT_ASSUMED  formal-ts.type != BT_ASSUMED)
+{
+  if (where)
+	gfc_error (Assumed-type actual argument at %L requires that dummy 
+		   argument '%s' is of assumed type, actual-where,
+		   formal-name);
+  return 0;
+}
+
   /* F2008, 12.5.2.5; IR F08/0073.  */
   if (formal-ts.type == BT_CLASS  formal-attr.class_ok
actual-expr_type != EXPR_NULL
diff --git a/gcc/testsuite/gfortran.dg/assumed_type_8.f90 b/gcc/testsuite/gfortran.dg/assumed_type_8.f90
new file mode 100644
index 000..543e693
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/assumed_type_8.f90
@@ -0,0 +1,19 @@
+! { dg-do compile }
+!
+! Issue came up during the review of PR fortran/58793
+!
+! Test for TS29113:2012's C407b.
+! 
+program test
+  use iso_c_binding
+  integer,target ::aa
+  call up(c_loc(aa))
+contains
+  subroutine up(x)
+class(*) :: x
+  end subroutine
+  subroutine bar(x)
+   type(*) :: x
+   call up(x) ! { dg-error Assumed-type actual argument at .1. requires that dummy argument 'x' is of assumed type }
+  end subroutine bar
+end program test

Re: [Patch] Fix gcc.dg/20050922-*.c

2013-10-21 Thread Mike Stump

On Oct 21, 2013, at 3:28 AM, Vidya Praveen vidyaprav...@arm.com wrote:
 Tests gcc.dg/20050922-1.c and gcc.dg/20050922-2.c includes stdlib.h. This can
 be a issue especially since they define uint32_t.

 OK for 4.7, 4.8?

For release branches, you'd need to transition from the theoretical to the 
practical.  On which systems (software) does it fail?  If none, then no, a back 
port isn't necessary.  If it fails on a system (or software) on which real 
users use, then I'll approve it once you name the system (software) and let it 
bake on trunk for a week and see if anyone objects…

Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-21 Thread Jeff Law


On 10/18/13 15:06, Iyer, Balaji V wrote:


The main reason why I made it volatile (as expressed by the volatil
bool variable) is that I want to make sure these values aren't
optimized by the compiler and the value is fetched from memory on
every access. I have added an explanation to the header comment.
I figured that was the case; it's also what origianlly got me thinking 
about memory models.  volatile can help with getting the right number of 
accesses (though there are cases where it gets in the way).  But 
volatile is not a fence/barrier :-)




Presumably users never concern themselves with the cilkrts_pedigree
structure, it's entirely hidden, right?  So there's no way for a
user to ask for an array of these things, right?
Ok, so this relates to the question about keeping all the structures in 
sync






I note you set DECL_ALIGN to BIGGEST_ALIGNMENT, which seems
excessive. Why not compute the natural alignment for that structure
and use that? Similarly for the cilkrts_stack_frame, except that it
seems to use PREFERRED_STACK_BOUNDARY.  Is there some reason why
each shouldn't be aligned on whatever boundary the target would
normally align those structures?



Fixed, I now let the compiler set these values
Actually, thinking more about this, we've got to make sure we're 
compatible with ICC on this stuff.  So if you need to weak the 
alignments to be ICC compatible, then, well, we have to do it.   If (for 
example) the compilers differ on their notion of the structure's 
alignment, then you couldn't make an array of them and use it reliably 
across the ICC  GCC implementations.


You might consider some sort of interoperatibity test.  Obviously the 
test wouldn't run if ICC wasn't installed, but having a test to catch 
this stuff early (if it ever happens) would be wise.



These structures don't change often and whenever they do (i.e. fields
get added into it) it is kind of a big deal.  So far, when we have
changed - once in the past 3 yrs. or so - we have allowed backward
compatibility.
Good to hear it's a big deal.  Once there are multiple implementations 
in the wild relying upon them, it'll be an even bigger deal..





We've given it some thought, but have neither the manpower nor
charter to port Cilk to non-Intel architectures.  We've done a trial
port to ARM to prove it could be done, but we ran fib (a small
example) once and declared victory.  We're sure that more work needs
to be done here and would welcome changes to the runtime to support
other architectures.  In particular, most or all uses of
assembly-language instructions should be replaced by compiler
intrinsics and memory barriers probably need to be added for
architectures that have a more relaxed memory model than the x86.
Our hope is that, once Cilk Plus was added to gcc, that members of
the community would help us port the runtime to other architectures.
Such ports could probably start with full barriers, it'd be painful from 
a performance standpoint, but as a starting point, sufficient.


jeff

[C++ Patch] PR 58810

2013-10-21 Thread Paolo Carlini


Hi,

this is just an ICE on invalid, but I think it hints to a small 
clean-up. Essentially the issue is that we diagnose qualified free 
functions in two different places: in the grokdeclarator itself and in 
grokfndecl. The former handles non-friend declarations and does type = 
TYPE_MAIN_VARIANT (type), the latter handles friends and then doesn't 
even try to do what grokdeclarator does to type. Then in situations like 
58810, when we have *both* at the same time:


typedef int F() const;

F f;

struct A
{
  friend F f;
};

the two types become inconsistent, one has the const and the other 
doesn't, and an assert in typeck.c:merge_types triggers. Thus the idea 
of resolving the inconsistency by removing completely one of the two 
checks: that in grokfndecl is more precise (separate messages for 
cv-qualifiers and ref-qualifiers) and seems the best candidate. The 
patch below passes testing modulo trivial adjustments to a couple of 
existing testcases.


Admittedly, the change is a bit risky, but it's still Stage 1, maybe we 
can take the chance and try to shave some lines off the mammuth 
grokdeclarator ;) What do you think?


Thanks,
Paolo.

///
/cp
2013-10-21  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58810
* decl.c (grokdeclarator): Don't handle qualified free functions here,
leave the diagnostic to grokfndecl.

/testsuite
2013-10-21  Paolo Carlini  paolo.carl...@oracle.com

PR c++/58810
* g++.dg/other/cv_func3.C: New.
* g++.dg/other/cv_func.C: Adjust.
* g++.dg/parse/fn-typedef2.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 203893)
+++ cp/decl.c   (working copy)
@@ -10242,21 +10242,6 @@ grokdeclarator (const cp_declarator *declarator,
 
   if (decl_context != TYPENAME)
{
- /* A cv-qualifier-seq shall only be part of the function type
-for a non-static member function. A ref-qualifier shall only
- /same as above/ [dcl.fct] */
- if ((type_memfn_quals (type) != TYPE_UNQUALIFIED
-  || type_memfn_rqual (type) != REF_QUAL_NONE)
-  (current_class_type == NULL_TREE || staticp) )
-   {
- error (staticp
- ? G_(qualified function types cannot be used to 
-  declare static member functions)
- : G_(qualified function types cannot be used to 
-  declare free functions));
- type = TYPE_MAIN_VARIANT (type);
-   }
-
  /* The qualifiers on the function type become the qualifiers on
 the non-static member function. */
  memfn_quals |= type_memfn_quals (type);
Index: testsuite/g++.dg/other/cv_func.C
===
--- testsuite/g++.dg/other/cv_func.C(revision 203893)
+++ testsuite/g++.dg/other/cv_func.C(working copy)
@@ -3,7 +3,7 @@
 typedef int FIC(int) const;
 typedef int FI(int);
 
-FIC f; // { dg-error qualified }
+FIC f; // { dg-error cv-qualifier }
 struct S {
   FIC f; // OK
 
Index: testsuite/g++.dg/other/cv_func3.C
===
--- testsuite/g++.dg/other/cv_func3.C   (revision 0)
+++ testsuite/g++.dg/other/cv_func3.C   (working copy)
@@ -0,0 +1,10 @@
+// PR c++/58810
+
+typedef int F() const;
+
+F f;   // { dg-error cv-qualifier }
+
+struct A
+{
+  friend F f;  // { dg-error cv-qualifier }
+};
Index: testsuite/g++.dg/parse/fn-typedef2.C
===
--- testsuite/g++.dg/parse/fn-typedef2.C(revision 203893)
+++ testsuite/g++.dg/parse/fn-typedef2.C(working copy)
@@ -4,4 +4,4 @@ typedef void ft() const;
 typedef void V;
 typedef V ft() const;
 
-ft f;  // { dg-error qualified }
+ft f;  // { dg-error cv-qualifier }

Re: [PATCH, i386, MPX, 1/X] Support of Intel MPX ISA. 1/2 Bound type and modes

2013-10-21 Thread Jeff Law


On 10/15/13 07:31, Ilya Enkovich wrote:

Hey guys,

could please someone look at this small patch? It blocks approved MPX
ISA support on i386 target.




diff --git a/gcc/doc/rtl.texi b/gcc/doc/rtl.texi
index 1d62223..02b1214 100644
--- a/gcc/doc/rtl.texi
+++ b/gcc/doc/rtl.texi
@@ -1382,6 +1382,10 @@ any @code{CC_MODE} modes listed in the 
@file{@var{machine}-modes.def}.
  @xref{Jump Patterns},
  also see @ref{Condition Code}.

+@findex MODE_BOUND
+@item MODE_BOUND
+Bound modes class.  Used to represent values of pointer bounds.
+
  @findex MODE_RANDOM
  @item MODE_RANDOM
  This is a catchall mode class for modes which don't fit into the above
So why are bounds distinct modes?Is there some inherent reason why 
bounds are something other than an integer mode (MODE_INT)?


Similarly what's the rationale behind having new types for bounds?  Is 
there some reason why they couldn't be implemented with one of the 
existing types?


ISTM the entire patch is gated on being able to answer those two questions.


jeff

Re: patch to enable LRA for ppc

2013-10-21 Thread Steven Bosscher

On Mon, Oct 21, 2013 at 5:51 PM, Michael Meissner wrote:
 What I'm doing is adding secondary reload support so that up until reload 
 time,
 we can represent VSX addresses as reg+offset, and in secondary reload, create
 the addition instructions to put the offset in a base register.  I haven't 
 made
 any changes to the machine independent portions of the compiler.  As long as
 IRA uses the secondary reload interface, it should be ok.  However, right now,
 I need to focus most of my attention on getting the secondary reload support 
 to
 work.

 One thing that I've asked for before, but to remind you, is I really, really
 wish secondary reload could allocate two scratch registers if it is given an
 insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, 
 since
 that gives 2 registers, but future changes will want TFmode to go into a 
 single
 vector register, and I will need to create another type, like V4DI that does
 take 2 registers.  The case that this is needed for is moving an item from 
 GPRs
 to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
 64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
 move into, and then I will do the combine operation.


Eh, perhaps I'm missing something, but...

Isn't one of the great advantages of LRA over reload, that LRA allows
you to create new pseudos so that you shouldn't ever need secondary
reloads??

Ciao!
Steven

Re: [C PATCH] Warn for _Alignas in an array declarator (PR c/58267)

2013-10-21 Thread Marek Polacek

On Sat, Oct 19, 2013 at 07:23:20PM +, Joseph S. Myers wrote:
 On Wed, 16 Oct 2013, Marek Polacek wrote:
 
  @@ -2946,7 +2957,8 @@ c_parser_declarator (c_parser *parser, b
 struct c_declspecs *quals_attrs = build_null_declspecs ();
 struct c_declarator *inner;
 c_parser_consume_token (parser);
  -  c_parser_declspecs (parser, quals_attrs, false, false, true, 
  cla_prefer_id);
  +  c_parser_declspecs (parser, quals_attrs, false, false, true,
  + true, cla_prefer_id);
 inner = c_parser_declarator (parser, type_seen_p, kind, seen_id);
 if (inner == NULL)
  return NULL;
 
 Looking at this again, shouldn't the new argument be false (with 
 associated testcase)?  This is parsing pointer declarators, and _Alignas 
 isn't allowed there any more than it is in array declarators

Ah, yeah, I think so, fixed  testcase added.  Thanks for catching it.
 
  @@ -3715,7 +3730,8 @@ c_parser_type_name (c_parser *parser)
 struct c_declarator *declarator;
 struct c_type_name *ret;
 bool dummy = false;
  -  c_parser_declspecs (parser, specs, false, true, true, cla_prefer_type);
  +  c_parser_declspecs (parser, specs, false, true, true, false,
  + cla_prefer_type);
 if (!specs-declspecs_seen_p)
   {
 c_parser_error (parser, expected specifier-qualifier-list);
 
 And this should get a testcase added, that _Alignas is correctly rejected 
 in type names where previously it would have been wrongly accepted.

I added the tests.
 
 (Strictly by the standard it should be false in 
 c_parser_struct_declaration as well - the syntax there doesn't allow 
 _Alignas - but it appears to have been intended to allow it there, so 
 probably best not to change anything there until WG14 reaches a conclusion 
 on the issues I raised in N1731.)

I added a comment there to that effect.

Tested via make check -C gcc RUNTESTFLAGS=dg.exp=c1x-*.c, ok for
trunk?

2013-10-21  Marek Polacek  pola...@redhat.com

c/
* c-parser.c (c_parser_struct_declaration): Add a comment.
(c_parser_declarator): Don't allow _Alignas here.
testsuite/
* gcc.dg/c1x-align-5.c: Add more testing.

--- gcc/c/c-parser.c.mp 2013-10-21 19:34:17.877539421 +0200
+++ gcc/c/c-parser.c2013-10-21 20:22:45.980220027 +0200
@@ -2645,6 +2645,11 @@ c_parser_struct_declaration (c_parser *p
 }
   specs = build_null_declspecs ();
   decl_loc = c_parser_peek_token (parser)-location;
+  /* Strictly by the standard, we shouldn't allow _Alignas here,
+ but it appears to have been intended to allow it there, so
+ we're keeping it as it is until WG14 reaches a conclusion
+ of N1731.
+ http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1731.pdf  */
   c_parser_declspecs (parser, specs, false, true, true,
  true, cla_nonabstract_decl);
   if (parser-error)
@@ -2959,7 +2964,7 @@ c_parser_declarator (c_parser *parser, b
   struct c_declarator *inner;
   c_parser_consume_token (parser);
   c_parser_declspecs (parser, quals_attrs, false, false, true,
- true, cla_prefer_id);
+ false, cla_prefer_id);
   inner = c_parser_declarator (parser, type_seen_p, kind, seen_id);
   if (inner == NULL)
return NULL;
--- gcc/testsuite/gcc.dg/c1x-align-5.c.mp   2013-10-21 19:41:20.088041095 
+0200
+++ gcc/testsuite/gcc.dg/c1x-align-5.c  2013-10-21 20:16:55.096984545 +0200
@@ -14,6 +14,14 @@ void foo (int a[_Alignas (0) 10]) { } /*
 void
 test (void)
 {
+  int *_Alignas (long) p; /* { dg-error expected } */
+  int *const _Alignas (long) *q; /* { dg-error expected } */
+  struct s { int n; };
+  __builtin_offsetof (struct s _Alignas (int), n); /* { dg-error expected } 
*/
+  __typeof (long double _Alignas (0)) e; /* { dg-error expected } */
+  sizeof (int _Alignas (int)); /* { dg-error expected } */
+  _Alignas (int _Alignas (float)) int t; /* { dg-error expected } */
+  __builtin_types_compatible_p (signed _Alignas (0), unsigned); /* { dg-error 
expected } */
   int a[_Alignas (int) 10]; /* { dg-error expected expression before } */
   int b[10];
   foo (b);

Marek

[jit] Improve error message in gcc_jit_function_new_local

2013-10-21 Thread David Malcolm

Committed to dmalcolm/jit:

gcc/jit/
* libgccjit.c (gcc_jit_function_new_local): Use a more clear
error message for the case where someone tries to add a local
to a function imported from elsewhere.
---
 gcc/jit/ChangeLog.jit | 6 ++
 gcc/jit/libgccjit.c   | 5 -
 2 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index e16902d..df08834 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,11 @@
 2013-10-21  David Malcolm  dmalc...@redhat.com
 
+   * libgccjit.c (gcc_jit_function_new_local): Use a more clear
+   error message for the case where someone tries to add a local
+   to a function imported from elsewhere.
+
+2013-10-21  David Malcolm  dmalc...@redhat.com
+
* TODO.rst (the C unary prefix  operator): Remove completed item.
* internal-api.c (gcc::jit::lvalue::get_address): New.
* internal-api.h (gcc::jit::lvalue::get_address): New.
diff --git a/gcc/jit/libgccjit.c b/gcc/jit/libgccjit.c
index 8d77761..40ac98d 100644
--- a/gcc/jit/libgccjit.c
+++ b/gcc/jit/libgccjit.c
@@ -544,7 +544,10 @@ gcc_jit_function_new_local (gcc_jit_function *func,
gcc_jit_type *type,
const char *name)
 {
-  RETURN_NULL_IF_NOT_FUNC_DEFINITION (func);
+  RETURN_NULL_IF_FAIL (func, NULL, NULL function);
+  RETURN_NULL_IF_FAIL (func-get_kind () != GCC_JIT_FUNCTION_IMPORTED,
+  NULL,
+  Cannot add locals to an imported function);
   RETURN_NULL_IF_FAIL (type, NULL, NULL type);
   RETURN_NULL_IF_FAIL (name, NULL, NULL name);
 
-- 
1.7.11.7

Re: [Patch, Fortran] PR58803 - Prevent a double-free with proc-pointer components

2013-10-21 Thread Steve Kargl

On Mon, Oct 21, 2013 at 07:15:07PM +0200, Tobias Burnus wrote:
 -  c-tb = tb;
 +  if (num == 1)
 + c-tb = tb;
 +  else
 + {
 +   c-tb = XCNEW (gfc_typebound_proc);
 +   c-tb-where = gfc_current_locus;
 +   *c-tb = *tb;

I haven't convinced myself yet, but is this leaking memory?

Lines 5006-7 are

  tb = XCNEW (gfc_typebound_proc);
  tb-where = gfc_current_locus;

In the num = 1 case, c-tb = tb and I presume during teardown
c-tb is freed and by extension tb.  But, in the else case,
are you copying the contents from *tb into *c-tb and the tb
is never freed?

-- 
Steve

[jit] Show source locations (if any) in gimple dump

2013-10-21 Thread David Malcolm

Committed to dmalcolm/jit:

gcc/jit/
* internal-api.c (gcc::jit::context::postprocess): Show source
line numbers (if any) in gimple dump.
---
 gcc/jit/ChangeLog.jit  | 5 +
 gcc/jit/internal-api.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index df08834..adf273b 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,10 @@
 2013-10-21  David Malcolm  dmalc...@redhat.com
 
+   * internal-api.c (gcc::jit::context::postprocess): Show source
+   line numbers (if any) in gimple dump.
+
+2013-10-21  David Malcolm  dmalc...@redhat.com
+
* libgccjit.c (gcc_jit_function_new_local): Use a more clear
error message for the case where someone tries to add a local
to a function imported from elsewhere.
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 8f84e14..8bb6741 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -864,7 +864,7 @@ postprocess ()
 
   current_function_decl = m_inner_fndecl;
   if (m_ctxt-get_bool_option (GCC_JIT_BOOL_OPTION_DUMP_INITIAL_GIMPLE))
-   dump_function_to_file (m_inner_fndecl, stderr, TDF_VOPS|TDF_MEMSYMS);
+   dump_function_to_file (m_inner_fndecl, stderr, 
TDF_VOPS|TDF_MEMSYMS|TDF_LINENO);
   //debug_tree (m_inner_fndecl);
 
   //printf(about to add to cgraph\n);
-- 
1.7.11.7

[jit] Fix error-handling within gcc::jit::context::compile (timevar issues)

2013-10-21 Thread David Malcolm

Committed to dmalcolm/jit:

gcc/jit/
* internal-api.c (gcc::jit::context::compile): Correctly cleanup
timevars in error-handling, preventing an issue where an error
on a context left timevar.c in an unstopped state, leading to an
assertion failure when restarting timevars in the next compile.
Found via fuzz-testing.
---
 gcc/jit/ChangeLog.jit  |  8 
 gcc/jit/internal-api.c | 13 +++--
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/gcc/jit/ChangeLog.jit b/gcc/jit/ChangeLog.jit
index adf273b..2cf5c8d 100644
--- a/gcc/jit/ChangeLog.jit
+++ b/gcc/jit/ChangeLog.jit
@@ -1,5 +1,13 @@
 2013-10-21  David Malcolm  dmalc...@redhat.com
 
+   * internal-api.c (gcc::jit::context::compile): Correctly cleanup
+   timevars in error-handling, preventing an issue where an error
+   on a context left timevar.c in an unstopped state, leading to an
+   assertion failure when restarting timevars in the next compile.
+   Found via fuzz-testing.
+
+2013-10-21  David Malcolm  dmalc...@redhat.com
+
* internal-api.c (gcc::jit::context::postprocess): Show source
line numbers (if any) in gimple dump.
 
diff --git a/gcc/jit/internal-api.c b/gcc/jit/internal-api.c
index 8bb6741..d90f001 100644
--- a/gcc/jit/internal-api.c
+++ b/gcc/jit/internal-api.c
@@ -1261,7 +1261,11 @@ compile ()
   active_jit_ctxt = NULL;
 
   if (errors_occurred ())
-goto error;
+{
+  timevar_stop (TV_TOTAL);
+  timevar_print (stderr);
+  goto error;
+}
 
   timevar_push (TV_ASSEMBLE);
 
@@ -1279,7 +1283,12 @@ compile ()
   printf (cmd: %s\n, cmd);
 int ret = system (cmd);
 if (ret)
-  goto error;
+  {
+   timevar_pop (TV_ASSEMBLE);
+   timevar_stop (TV_TOTAL);
+   timevar_print (stderr);
+   goto error;
+  }
   }
   timevar_pop (TV_ASSEMBLE);
 
-- 
1.7.11.7

Re: [PATCH] Cilk Keywords (_Cilk_spawn and _Cilk_sync) for C (and C++)

2013-10-21 Thread Jeff Law


On 09/11/13 12:18, Iyer, Balaji V wrote:

Hello Everyone, Couple weeks back, I had submitted a patch for review
that will implement Cilk keywords (_Cilk_spawn and _Cilk_sync) into
the C compiler. I recently finished C++ implementation also. In this
email, I am attaching 2 patches: 1 for C (and the common parts for C
and C++) and 1 for C++. The C++ Changelog is labelled
cp-ChangeLog.cilkplus and the other one is just ChangeLog.cilkplus.
There isn't much changes in the C patch. Only noticeable changes
would be moving functions to the common parts so that C++ can use
them.

It passes all the tests and does not affect  (by affect I mean fail a
passing test or pass a failing one) any of the other tests in the
testsuite directory.

Is this Ok for trunk?
Here are my final notes from this pass over the changes.  If you could 
send an updated patch to the list, it would be appreciated.  I may have 
more comments now that I've looked over everything and have a slightly 
better understanding of the overall structure.



I'm not a C++ guy, but are you going to have to do anything special with 
lambdas?  Such as in cilk_set_spawn_marker?


I'm going to assume you don't need to do anything for the C++ specific 
decls in copy_decl_for_cilk.  Note I didn't look at any of the C++ 
specific files, so if you're handling it there, that's fine with me.


I didn't look at the tests closely.  How thorough are those tests, 
particularly for front-end issues?  The reason I ask is those tests will 
help ensure that folks don't break the cilk+ support as often as they 
might otherwise.  Basically they cover for the lack of knowledge about 
the cilk+ implementation by providing developers a heads-up that they 
broke something.  Are there any internal testsuites Intel could 
contribute to help beef up testing?


Can you #undef the helper macros SUBTREE, MODIFIED, INITIALIZED that are 
defined in extract_free_variables?  A nit, but those kind of defines can 
certainly surprised folks.  Actually, unless there's a compelling 
reason, why not just go ahead and make the calls to 
extract_free_variables explicit and drop the macros completely?


In all, I didn't see anything that made me say wait, this is a huge 
issue we need to address.  Presumably you and Aldy worked through those 
before I got involved ;-)


jeff

Re: patch to enable LRA for ppc

2013-10-21 Thread Michael Meissner

On Mon, Oct 21, 2013 at 08:18:22PM +0200, Steven Bosscher wrote:
 On Mon, Oct 21, 2013 at 5:51 PM, Michael Meissner wrote:
  What I'm doing is adding secondary reload support so that up until reload 
  time,
  we can represent VSX addresses as reg+offset, and in secondary reload, 
  create
  the addition instructions to put the offset in a base register.  I haven't 
  made
  any changes to the machine independent portions of the compiler.  As long as
  IRA uses the secondary reload interface, it should be ok.  However, right 
  now,
  I need to focus most of my attention on getting the secondary reload 
  support to
  work.
 
  One thing that I've asked for before, but to remind you, is I really, really
  wish secondary reload could allocate two scratch registers if it is given an
  insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, 
  since
  that gives 2 registers, but future changes will want TFmode to go into a 
  single
  vector register, and I will need to create another type, like V4DI that does
  take 2 registers.  The case that this is needed for is moving an item from 
  GPRs
  to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
  64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
  move into, and then I will do the combine operation.
 
 
 Eh, perhaps I'm missing something, but...
 
 Isn't one of the great advantages of LRA over reload, that LRA allows
 you to create new pseudos so that you shouldn't ever need secondary
 reloads??

You still need secondary reload.

For example, on the powerpc, you have 5 addressing modes for GPRs and FPRS:

1) base register
2) base register + index register
3) base register + offset
4) auto-update for base register + index register
5) auto-update for base register + offset register

Now for VSX registers you only have:

1) base register
2) base register + index register

So, in the work I'm doing right now, I want to allow reg + offset addressing,
but if the register being loaded is an Altivec register (high part of the VSX
registers), I need to create a secondary reload to load the offset into and
then convert the address to indirect or indexed addressing.  You don't want to
always disallow offset based addressing, but you want to create the secondary
reload when you need to.

Similarly, vector types can do indexed addressing (reg+reg) but if you are
loading or storing the value into GPR registers, you can't do reg+reg
addressing on multi-word items.

Finally, one of the features of ISA 2.07 is the notion of load fusion, where
the hardware will fuse together a load immediate with an adjacent load (for
GPRs, you need the load immediate shifted to be the register that is being
loaded, for VSX registers, you just need the instructions adjacent).  In this
case, before reload we will want to pretend that the machine has addressing to
include the fusion forms, and in secondary reload, you will generate the
combined insn that will become the fusion instruction.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Re: [Patch, Fortran] PR58803 - Prevent a double-free with proc-pointer components

2013-10-21 Thread Tobias Burnus


Steve Kargl wrote:

On Mon, Oct 21, 2013 at 07:15:07PM +0200, Tobias Burnus wrote:

-  c-tb = tb;
+  if (num == 1)
+   c-tb = tb;
+  else
+   {
+ c-tb = XCNEW (gfc_typebound_proc);
+ c-tb-where = gfc_current_locus;
+ *c-tb = *tb;

I haven't convinced myself yet, but is this leaking memory?

Lines 5006-7 are

   tb = XCNEW (gfc_typebound_proc);
   tb-where = gfc_current_locus;

In the num = 1 case, c-tb = tb and I presume during teardown
c-tb is freed and by extension tb.  But, in the else case,
are you copying the contents from *tb into *c-tb and the tb
is never freed?


Well, as the double-free PR shows, free_components frees the memory: 
free (p-tb);. And as it walks all components …


Tobias

Re: [C++ Patch] PR 58810

2013-10-21 Thread Paolo Carlini

... this is an alternate, more conservative approach: a check remains in 
grokdeclarator and it handles friends too. At variance with the check in 
grokfndecl it's about function *types*.


Thanks,
Paolo.

///
Index: cp/decl.c
===
--- cp/decl.c   (revision 203907)
+++ cp/decl.c   (working copy)
@@ -10247,7 +10247,7 @@ grokdeclarator (const cp_declarator *declarator,
  /same as above/ [dcl.fct] */
  if ((type_memfn_quals (type) != TYPE_UNQUALIFIED
   || type_memfn_rqual (type) != REF_QUAL_NONE)
-  (current_class_type == NULL_TREE || staticp) )
+  (current_class_type == NULL_TREE || staticp || friendp) )
{
  error (staticp
  ? G_(qualified function types cannot be used to 
Index: testsuite/g++.dg/other/cv_func3.C
===
--- testsuite/g++.dg/other/cv_func3.C   (revision 0)
+++ testsuite/g++.dg/other/cv_func3.C   (working copy)
@@ -0,0 +1,10 @@
+// PR c++/58810
+
+typedef int F() const;
+
+F f;   // { dg-error qualified }
+
+struct A
+{
+  friend F f;  // { dg-error qualified }
+};

Re: [Patch, Fortran] PR58803 - Prevent a double-free with proc-pointer components

2013-10-21 Thread Steve Kargl

On Mon, Oct 21, 2013 at 09:04:22PM +0200, Tobias Burnus wrote:
 Steve Kargl wrote:
  On Mon, Oct 21, 2013 at 07:15:07PM +0200, Tobias Burnus wrote:
  -  c-tb = tb;
  +  if (num == 1)
  +  c-tb = tb;
  +  else
  +  {
  +c-tb = XCNEW (gfc_typebound_proc);

(see below)

  +c-tb-where = gfc_current_locus;
  +*c-tb = *tb;
  I haven't convinced myself yet, but is this leaking memory?
 
  Lines 5006-7 are
 
 tb = XCNEW (gfc_typebound_proc);
 tb-where = gfc_current_locus;
 
  In the num = 1 case, c-tb = tb and I presume during teardown
  c-tb is freed and by extension tb.  But, in the else case,
  are you copying the contents from *tb into *c-tb and the tb
  is never freed?
 
 Well, as the double-free PR shows, free_components frees the memory: 
 free (p-tb);. And as it walks all components ?

Argh.  I misread the code.  Instead of 'c-tb = XCNEW(...' above,
I saw 'tb = XCNEW(...'.  So, I thought you might be leaking a 
local copy of tb.

With that misunderstanding cleared up, the patch looks fine.

-- 
Steve

Re: Re-factor inclusion of tree.h

2013-10-21 Thread Diego Novillo

On Mon, Oct 21, 2013 at 12:57 PM, Jeff Law l...@redhat.com wrote:
 On 10/21/13 10:52, Diego Novillo wrote:

 I plan to commit this by tomorrow, unless there are objections.

 I can't think of a good reason to even bother waiting :-)

Heh, OK, thanks.

After analyzing all the build failures in config-list.mk, I found that
we also need explicit inclusion of tree.h in the gen* binaries. Some
of these generate all their includes in header files, but I figured we
can revise this later on. This was breaking cr16-elf and mips-netbsd.

Can anyone think of some way that we can use to automatically block
inclusions of tree.h from header files? Code review is the only way
that comes to mind.

Committed both patches to trunk.

Diego.

diff --git a/gcc/genattrtab.c b/gcc/genattrtab.c
index 973cade..f79380d 100644
--- a/gcc/genattrtab.c
+++ b/gcc/genattrtab.c
@@ -5100,6 +5100,7 @@ write_header (FILE *outf)
   fprintf (outf, #include \system.h\\n);
   fprintf (outf, #include \coretypes.h\\n);
   fprintf (outf, #include \tm.h\\n);
+  fprintf (outf, #include \tree.h\\n);
   fprintf (outf, #include \rtl.h\\n);
   fprintf (outf, #include \insn-attr.h\\n);
   fprintf (outf, #include \tm_p.h\\n);
diff --git a/gcc/genautomata.c b/gcc/genautomata.c
index a0bf076..f6c4b91c4 100644
--- a/gcc/genautomata.c
+++ b/gcc/genautomata.c
@@ -9665,6 +9665,7 @@ main (int argc, char **argv)
  #include \system.h\\n
  #include \coretypes.h\\n
  #include \tm.h\\n
+ #include \tree.h\\n
  #include \rtl.h\\n
  #include \tm_p.h\\n
  #include \insn-config.h\\n
diff --git a/gcc/genemit.c b/gcc/genemit.c
index d4bb301..724a114 100644
--- a/gcc/genemit.c
+++ b/gcc/genemit.c
@@ -790,6 +790,7 @@ from the machine description file `md'.  */\n\n);
   printf (#include \system.h\\n);
   printf (#include \coretypes.h\\n);
   printf (#include \tm.h\\n);
+  printf (#include \tree.h\\n);
   printf (#include \rtl.h\\n);
   printf (#include \tm_p.h\\n);
   printf (#include \function.h\\n);
diff --git a/gcc/genopinit.c b/gcc/genopinit.c
index 9c7cf2c..3efb71e 100644
--- a/gcc/genopinit.c
+++ b/gcc/genopinit.c
@@ -404,6 +404,7 @@ main (int argc, char **argv)
#include \system.h\\n
#include \coretypes.h\\n
#include \tm.h\\n
+   #include \tree.h\\n
#include \rtl.h\\n
#include \tm_p.h\\n
#include \flags.h\\n
diff --git a/gcc/genoutput.c b/gcc/genoutput.c
index c3a0936..2a7ee23 100644
--- a/gcc/genoutput.c
+++ b/gcc/genoutput.c
@@ -238,6 +238,7 @@ output_prologue (void)
   printf (#include \tm.h\\n);
   printf (#include \flags.h\\n);
   printf (#include \ggc.h\\n);
+  printf (#include \tree.h\\n);
   printf (#include \rtl.h\\n);
   printf (#include \expr.h\\n);
   printf (#include \insn-codes.h\\n);
diff --git a/gcc/genpeep.c b/gcc/genpeep.c
index a14d061..877fde3 100644
--- a/gcc/genpeep.c
+++ b/gcc/genpeep.c
@@ -359,6 +359,7 @@ from the machine description file `md'.  */\n\n);
   printf (#include \coretypes.h\\n);
   printf (#include \tm.h\\n);
   printf (#include \insn-config.h\\n);
+  printf (#include \tree.h\\n);
   printf (#include \rtl.h\\n);
   printf (#include \tm_p.h\\n);
   printf (#include \regs.h\\n);
diff --git a/gcc/target-globals.c b/gcc/target-globals.c
index 65ccb8a..9d223fc 100644
--- a/gcc/target-globals.c
+++ b/gcc/target-globals.c
@@ -23,6 +23,7 @@ along with GCC; see the file COPYING3.  If not see
 #include tm.h
 #include insn-config.h
 #include machmode.h
+#include tree.h
 #include ggc.h
 #include toplev.h
 #include target-globals.h

pr37868

2013-10-21 Thread Mike Stump

Concerning:

2008-11-20  Richard Guenther  rguent...@suse.de

PR tree-optimization/37868  
* gcc.dg/torture/pr37868.c: New testcase.
* gcc.c-torture/execute/pr38048-1.c: Likewise.
* gcc.c-torture/execute/pr38048-2.c: Likewise.

So, is there any reason why we can't make the test case portable as in the 
below instead of skipping on all the targets that blow on misaligned data?

Ok?

diff --git a/gcc/testsuite/gcc.dg/torture/pr37868.c 
b/gcc/testsuite/gcc.dg/torture/pr37868.c
index 380b089..b7ff43f 100644
--- a/gcc/testsuite/gcc.dg/torture/pr37868.c
+++ b/gcc/testsuite/gcc.dg/torture/pr37868.c
@@ -24,15 +24,18 @@ struct X {
 
 int main (void)
 {
-  struct X x;
+  union {
+struct X x;
+unsigned int i;
+  } s;
   unsigned int bad_bits;
 
-  x.pad = -1;
-  x.a = -1;
-  x.b = -1;
-  x.c = -1;
+  s.x.pad = -1;
+  s.x.a = -1;
+  s.x.b = -1;
+  s.x.c = -1;
 
-  bad_bits = ((unsigned int)-1) ^ *(1+(unsigned int *) x);
+  bad_bits = ((unsigned int)-1) ^ *(1+(unsigned int *) s.x);
   if (bad_bits != 0)
 abort ();
   return 0;

Re: Patch: Add #pragma ivdep support to the ME and C FE

2013-10-21 Thread Tobias Burnus


Dear all,

attached is a new version of the patch. Changes:
* #pragma GCC ivdep instead of #pragma ivdep
* Corrections to the error message in c-parser.c and a test case for it
* New wording in the .texi and examples

I am still not completely happy with the wording – and I am open for 
suggestions. In the example, I played safe and mention k  -m and k =m; 
even if k = 0 probably works.


I also didn't know how to best state the reason for requiring a 
condition. (Internal reason: The annotation is attached to the condition 
- thus, it has to be present. External reason: For vectorization, there 
shouldn't be a branching in the body of the loop and without a condition 
in either the for header or in its body, one has an endless loop.)


Do you have suggestions for a better wording? If not, is the patch okay 
for the trunk?
Built and regtested (C only). [An all-language bootstrap + regtesting is 
underway.]



Crossrefs: C review comments: 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg00803.html
OpenMPv4's omp simd wording, see bottom of the email at 
http://gcc.gnu.org/ml/gcc-patches/2013-10/msg01194.html


Tobias
2013-08-21  Tobias Burnus  bur...@net-b.de

	PR other/33426
* c-pragma.c (init_pragma) Add #pragma ivdep handling.
* c-pragma.h (pragma_kind): Add PRAGMA_IVDEP.

	PR other/33426
* c-parser.c (c_parser_pragma, c_parser_for_statement):
Handle PRAGMA_IVDEP.
(c_parser_statement_after_labels): Update call.

	PR other/33426
* cfgloop.c (flow_loops_find): Search for IFN_ANNOTATE
and set safelen.
* gimplify.c (gimple_boolify, gimplify_expr): Handle ANNOTATE_EXPR.
* internal-fn.c (expand_ANNOTATE): New function.
* internal-fn.def (ANNOTATE): Define as new internal function.
* tree-core.h (tree_node_kind): Add annot_expr_ivdep_kind.
(tree_base) Update a comment.
* tree-pretty-print.c (dump_generic_node): Handle ANNOTATE_EXPR.
* tree.def (ANNOTATE_EXPR): New DEFTREECODE.
* tree.h (ANNOTATE_EXPR_ID, SET_ANNOTATE_EXPR_ID): New macros.
	* doc/extend.texi (Pragmas): Document #pragma ivdep.

	PR other/33426
	* testsuite/gcc.dg/ivdep.c: New.
	* testsuite/gcc.dg/vect/vect-ivdep-1.c: New.

diff --git a/gcc/c-family/c-pragma.c b/gcc/c-family/c-pragma.c
index be5748b..1656000 100644
--- a/gcc/c-family/c-pragma.c
+++ b/gcc/c-family/c-pragma.c
@@ -1362,6 +1362,8 @@ init_pragma (void)
 cpp_register_deferred_pragma (parse_in, GCC, pch_preprocess,
   PRAGMA_GCC_PCH_PREPROCESS, false, false);
 
+  cpp_register_deferred_pragma (parse_in, GCC, ivdep, PRAGMA_IVDEP, false,
+false);
 #ifdef HANDLE_PRAGMA_PACK_WITH_EXPANSION
   c_register_pragma_with_expansion (0, pack, handle_pragma_pack);
 #else
diff --git a/gcc/c-family/c-pragma.h b/gcc/c-family/c-pragma.h
index c421284..705bcb4 100644
--- a/gcc/c-family/c-pragma.h
+++ b/gcc/c-family/c-pragma.h
@@ -53,6 +53,7 @@ typedef enum pragma_kind {
   PRAGMA_OMP_TEAMS,
 
   PRAGMA_GCC_PCH_PREPROCESS,
+  PRAGMA_IVDEP,
 
   PRAGMA_FIRST_EXTERNAL
 } pragma_kind;
diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 9b6abe0..c721f8a 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -1159,7 +1159,7 @@ static void c_parser_if_statement (c_parser *);
 static void c_parser_switch_statement (c_parser *);
 static void c_parser_while_statement (c_parser *);
 static void c_parser_do_statement (c_parser *);
-static void c_parser_for_statement (c_parser *);
+static void c_parser_for_statement (c_parser *, bool);
 static tree c_parser_asm_statement (c_parser *);
 static tree c_parser_asm_operands (c_parser *);
 static tree c_parser_asm_goto_operands (c_parser *);
@@ -4580,7 +4580,7 @@ c_parser_statement_after_labels (c_parser *parser)
 	  c_parser_do_statement (parser);
 	  break;
 	case RID_FOR:
-	  c_parser_for_statement (parser);
+	  c_parser_for_statement (parser, false);
 	  break;
 	case RID_GOTO:
 	  c_parser_consume_token (parser);
@@ -5033,7 +5033,7 @@ c_parser_do_statement (c_parser *parser)
 */
 
 static void
-c_parser_for_statement (c_parser *parser)
+c_parser_for_statement (c_parser *parser, bool ivdep)
 {
   tree block, cond, incr, save_break, save_cont, body;
   /* The following are only used when parsing an ObjC foreach statement.  */
@@ -5139,8 +5139,17 @@ c_parser_for_statement (c_parser *parser)
 	{
 	  if (c_parser_next_token_is (parser, CPP_SEMICOLON))
 	{
-	  c_parser_consume_token (parser);
-	  cond = NULL_TREE;
+	  if (ivdep)
+		{
+		  c_parser_error (parser, missing loop condition in loop with 
+  %GCC ivdep% pragma);
+		  cond = error_mark_node;
+		}
+	  else
+		{
+		  c_parser_consume_token (parser);
+		  cond = NULL_TREE;
+		}
 	}
 	  else
 	{
@@ -5154,6 +5163,12 @@ c_parser_for_statement (c_parser *parser)
 	  c_parser_skip_until_found (parser, CPP_SEMICOLON,
 	 expected %;%);
 	}
+	  if (ivdep  cond != error_mark_node)
+	{
+	  cond = build1 (ANNOTATE_EXPR, TREE_TYPE (cond), cond);

Re: [PATCH, rs6000] Be careful with special permute masks for little endian

2013-10-21 Thread Bill Schmidt

Please hold off reviewing this.  I see at least one testcase that will
have to be modified (expected code generation pattern will be different
for LE vs. BE).  I'll resubmit the whole thing later today.

Thanks,
Bill

On Mon, 2013-10-21 at 11:39 -0500, Bill Schmidt wrote:
 Whoops, looks like I missed some simpler cases (REG with the wrong mode
 instead of SUBREG with the wrong mode).  Is this revised version ok,
 assuming it passes testing?  It should fix a few more test cases.
 
 The changed code from the previous version is in the last hunk.
 
 Thanks,
 Bill
 
 
 Index: gcc/config/rs6000/rs6000.c
 ===
 --- gcc/config/rs6000/rs6000.c(revision 203792)
 +++ gcc/config/rs6000/rs6000.c(working copy)
 @@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
{  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
  { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
{  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
 +{ OPTION_MASK_ALTIVEC, 
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
{  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
 +{ OPTION_MASK_ALTIVEC,
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
{  0,  1, 16, 17,  2,  3, 18, 19,  4,  5, 20, 21,  6,  7, 22, 23 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghw,
 +{ OPTION_MASK_ALTIVEC,
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghw : CODE_FOR_altivec_vmrglw,
{  0,  1,  2,  3, 16, 17, 18, 19,  4,  5,  6,  7, 20, 21, 22, 23 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglb,
 +{ OPTION_MASK_ALTIVEC,
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglb : CODE_FOR_altivec_vmrghb,
{  8, 24,  9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglh,
 +{ OPTION_MASK_ALTIVEC,
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglh : CODE_FOR_altivec_vmrghh,
{  8,  9, 24, 25, 10, 11, 26, 27, 12, 13, 28, 29, 14, 15, 30, 31 } },
 -{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrglw,
 +{ OPTION_MASK_ALTIVEC,
 +  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrglw : CODE_FOR_altivec_vmrghw,
{  8,  9, 10, 11, 24, 25, 26, 27, 12, 13, 14, 15, 28, 29, 30, 31 } },
  { OPTION_MASK_P8_VECTOR, CODE_FOR_p8_vmrgew,
{  0,  1,  2,  3, 16, 17, 18, 19,  8,  9, 10, 11, 24, 25, 26, 27 } },
 @@ -28980,6 +28987,26 @@ altivec_expand_vec_perm_const (rtx operands[4])
 enum machine_mode omode = insn_data[icode].operand[0].mode;
 enum machine_mode imode = insn_data[icode].operand[1].mode;
  
 +   /* For little-endian, don't use vpkuwum and vpkuhum if the
 +  underlying vector type is not V4SI and V8HI, respectively.
 +  For example, using vpkuwum with a V8HI picks up the even
 +  halfwords (BE numbering) when the even halfwords (LE
 +  numbering) are what we need.  */
 +   if (!BYTES_BIG_ENDIAN
 +icode == CODE_FOR_altivec_vpkuwum
 +((GET_CODE (op0) == REG
 + GET_MODE (op0) != V4SImode)
 +   || (GET_CODE (op0) == SUBREG
 +GET_MODE (XEXP (op0, 0)) != V4SImode)))
 + continue;
 +   if (!BYTES_BIG_ENDIAN
 +icode == CODE_FOR_altivec_vpkuhum
 +((GET_CODE (op0) == REG
 + GET_MODE (op0) != V8HImode)
 +   || (GET_CODE (op0) == SUBREG
 +GET_MODE (XEXP (op0, 0)) != V8HImode)))
 + continue;
 +
/* For little-endian, the two input operands must be swapped
   (or swapped back) to ensure proper right-to-left numbering
   from 0 to 2N-1.  */
 
 
 On Mon, 2013-10-21 at 10:02 -0400, David Edelsohn wrote:
  On Mon, Oct 21, 2013 at 8:49 AM, Bill Schmidt
  wschm...@linux.vnet.ibm.com wrote:
   Hi,
  
   In altivec_expand_vec_perm_const, we look for special masks that match
   the behavior of specific instructions, so we can use those instructions
   rather than load a constant control vector and perform a permute.  Some
   of the masks must be treated differently for little endian mode.
  
   The masks that represent merge-high and merge-low operations have
   reversed meanings in little-endian, because of the reversed ordering of
   the vector elements.
  
   The masks that represent vector-pack operations remain correct when the
   mode of the input operands matches the natural mode of the instruction,
   but not otherwise.  This is because the pack instructions always select
   the rightmost, low-order bits of the vector element.  There are cases
   where we use this, for example, with a V8SI vector matching a vpkuwum
   mask in order to select the odd-numbered elements of the

Re: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2

2013-10-21 Thread DJ Delorie


 have an option for true AAPCS compliance, which will
 be allowed to break the C++11 memory model and

 And an option that addresses your requirements,
 which will _not_ break the C++11 memory model

So the problem isn't that what *I* need conflicts with C++11, it's
that what AAPCS needs conflicts?

[PATCH] Generate fused widening multiply-and-accumulate operations only when the widening multiply has single use

2013-10-21 Thread Yufeng Zhang


Hi,

This patch changes the widening_mul pass to fuse the widening multiply 
with accumulate only when the multiply has single use.  The widening_mul 
pass currently does the conversion regardless of the number of the uses, 
which can cause poor code-gen in cases like the following:


typedef int ArrT [10][10];

void
foo (ArrT Arr, int Idx)
{
  Arr[Idx][Idx] = 1;
  Arr[Idx + 10][Idx] = 2;
}

On AArch64, after widening_mul, the IR is like

  _2 = (long unsigned int) Idx_1(D);
  _3 = Idx_1(D) w* 40;   
  _5 = Arr_4(D) + _3;
  *_5[Idx_1(D)] = 1;
  _8 = WIDEN_MULT_PLUS_EXPR Idx_1(D), 40, 400; 
  _9 = Arr_4(D) + _8;
  *_9[Idx_1(D)] = 2;

Where the arrows point, there are redundant widening multiplies.

Bootstrap successfully on x86_64.

The patch passes the regtest on aarch64, arm and x86_64.

OK for the trunk?

Thanks,
Yufeng

p.s. Note that x86_64 doesn't suffer from this issue as the 
corresponding widening multiply accumulate op is not available on the 
target.


gcc/

* tree-ssa-math-opts.c (convert_plusminus_to_widen): Call
has_single_use () and not do the conversion if has_single_use ()
returns false for the multiplication result.diff --git a/gcc/tree-ssa-math-opts.c b/gcc/tree-ssa-math-opts.c
index f7f8ec9..d316990 100644
--- a/gcc/tree-ssa-math-opts.c
+++ b/gcc/tree-ssa-math-opts.c
@@ -2425,12 +2425,16 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, 
gimple stmt,
 
  It might also appear that it would be sufficient to use the existing
  operands of the widening multiply, but that would limit the choice of
- multiply-and-accumulate instructions.  */
+ multiply-and-accumulate instructions.
+
+ If the widened-multiplication result has more than one uses, it is
+ probably wiser not to do the conversion.  */
   if (code == PLUS_EXPR
(rhs1_code == MULT_EXPR || rhs1_code == WIDEN_MULT_EXPR))
 {
   if (!is_widening_mult_p (rhs1_stmt, type1, mult_rhs1,
-  type2, mult_rhs2))
+  type2, mult_rhs2)
+ || !has_single_use (rhs1))
return false;
   add_rhs = rhs2;
   conv_stmt = conv1_stmt;
@@ -2438,7 +2442,8 @@ convert_plusminus_to_widen (gimple_stmt_iterator *gsi, 
gimple stmt,
   else if (rhs2_code == MULT_EXPR || rhs2_code == WIDEN_MULT_EXPR)
 {
   if (!is_widening_mult_p (rhs2_stmt, type1, mult_rhs1,
-  type2, mult_rhs2))
+  type2, mult_rhs2)
+ || !has_single_use (rhs2))
return false;
   add_rhs = rhs1;
   conv_stmt = conv2_stmt;

Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping

2013-10-21 Thread Steven Bosscher

On Mon, Oct 21, 2013 at 6:32 PM, Martin Jambor wrote:
 --- a/gcc/ira.c
 +++ b/gcc/ira.c
 @@ -4314,6 +4314,197 @@ find_moveable_pseudos (void)
free_dominance_info (CDI_DOMINATORS);
  }

 +
 +/* If insn is interesting for parameter range-splitting shring-wrapping
 +   preparation, i.e. it is a single set from a hard register to a pseudo, 
 which
 +   is live at CALL_DOM, return the destination.  Otherwise return NULL.  */
 +
 +static rtx
 +interesting_dest_for_shprep (rtx insn, basic_block call_dom)
 +{
 +  rtx set = single_set (insn);
 +  if (!set)
 +return NULL;
 +  rtx src = SET_SRC (set);
 +  rtx dest = SET_DEST (set);
 +  if (!REG_P (src) || !HARD_REGISTER_P (src)
 +  || !REG_P (dest) || HARD_REGISTER_P (dest)
 +  || (call_dom  !bitmap_bit_p (df_get_live_in (call_dom), REGNO 
 (dest

See below...


 +return NULL;
 +  return dest;
 +}
 +
 +/* Split live ranges of pseudos that are loaded from hard registers in the
 +   first BB in a BB that dominates all non-sibling call if such a BB can be
 +   found and is not in a loop.  */
 +
 +static void
 +split_live_ranges_for_shrink_wrap (void)
 +{
 +  basic_block bb, call_dom = NULL;
 +  basic_block first = single_succ (ENTRY_BLOCK_PTR);
 +  rtx insn, last_interesting_insn = NULL;
 +  bitmap_head need_new, reachable;
 +  vecbasic_block queue;
 +
 +  if (!flag_shrink_wrap)
 +return;
 +
 +  bitmap_initialize (need_new, 0);
 +  bitmap_initialize (reachable, 0);
 +  queue.create (n_basic_blocks);
 +
 +  FOR_EACH_BB (bb)
 +FOR_BB_INSNS (bb, insn)
 +  if (CALL_P (insn)  !SIBLING_CALL_P (insn))
 +   {
 + if (bb == first)
 +   {
 + bitmap_clear (need_new);
 + bitmap_clear (reachable);
 + queue.release ();
 + return;
 +   }
 +
 + bitmap_set_bit (need_new, bb-index);
 + bitmap_set_bit (reachable, bb-index);
 + queue.quick_push (bb);
 + break;
 +   }
 +
 +  if (queue.is_empty ())
 +{
 +  bitmap_clear (need_new);
 +  bitmap_clear (reachable);
 +  queue.release ();
 +  return;
 +}
 +
 +  while (!queue.is_empty ())
 +{
 +  edge e;
 +  edge_iterator ei;
 +
 +  bb = queue.pop ();
 +  FOR_EACH_EDGE (e, ei, bb-succs)
 +   if (e-dest != EXIT_BLOCK_PTR
 +bitmap_set_bit (reachable, e-dest-index))
 + queue.quick_push (e-dest);
 +}
 +  queue.release ();
 +
 +  FOR_BB_INSNS (first, insn)
 +{
 +  rtx dest = interesting_dest_for_shprep (insn, NULL);
 +  if (!dest)
 +   continue;
 +
 +  if (DF_REG_DEF_COUNT (REGNO (dest))  1)

See below...


 +   {
 + bitmap_clear (need_new);
 + bitmap_clear (reachable);
 + return;
 +   }
 +
 +  for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
 +  use;
 +  use = DF_REF_NEXT_REG (use))

You're using DF in these places. But IRA and LRA don't work with DF.
After update_equiv_regs DF caches and liveness may be incorrect. You'd
have to add a df_analyze call but I'm not sure how that will interact
with IRA/LRA's own dataflow frameworks (e.g. w.r.t.
REG_DEAD/REG_UNUSED notes).


 + rtx uin = DF_REF_INSN (use);
 + int ubbi = BLOCK_FOR_INSN (uin)-index;

int ubbi = DF_REF_BB (use)?

Ciao!
Steven

Re: [PATCH i386 4/8] [AVX512] [1/n] Add substed patterns.

2013-10-21 Thread Richard Henderson

On 10/17/2013 07:15 AM, Kirill Yukhin wrote:
 +(define_mode_attr ssescalarsize
 +  [(V8DI  64) (V4DI  64) (V2DI  64)
 +   (V32HI 16) (V16HI 16) (V8HI 16)
 +   (V16SI 32) (V8SI 32) (V4SI 32)
 +   (V16SF 16) (V8DF 64)])

Error on V16SF.  Probably better to fill this out.

 +(define_insn avx512f_loadmode_mask
 +  [(set (match_operand:VI48F_512 0 register_operand =v,v)
 + (vec_merge:VI48F_512
 +   (match_operand:VI48F_512 1 nonimmediate_operand v,m)
 +   (match_operand:VI48F_512 2 vector_move_operand 0C,0C)
 +   (match_operand:avx512fmaskmode 3 register_operand k,k)))]
 +  TARGET_AVX512F
 +{
 +  switch (get_attr_mode (insn))

Better to just use sseinsnmode here, as it's a compile-time constant.

 +{
 +case MODE_V8DF:
 +case MODE_V16SF:
 +  return vmovassemodesuffix\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1};
 +default:
 +  return vmovdqassescalarsize\t{%1, %0%{%3%}%N2|%0%{%3%}%N2, %1};
 +}
 +}
 +  [(set_attr type ssemov)
 +   (set_attr prefix evex)
 +   (set_attr memory none,load)
 +   (set_attr mode sseinsnmode)])
 +

 +(define_insn avx512f_storemode_mask

Likewise.

 +(define_insn avx512f_movesmode_mask
 +  [(set (match_operand:VF_128 0 register_operand =v)
 + (vec_merge:VF_128
 +   (vec_merge:VF_128
 + (match_operand:VF_128 2 register_operand v)
 + (match_operand:VF_128 3 vector_move_operand 0C)
 + (match_operand:avx512fmaskmode 4 register_operand k))
 +   (match_operand:VF_128 1 register_operand v)
 +   (const_int 1)))]
 +  TARGET_AVX512F
 +  vmovssescalarmodesuffix\t{%2, %1, %0%{%4%}%N3|%0%{%4%}%N3, %1, %2}
 +  [(set_attr type ssemov)
 +   (set_attr prefix evex)
 +   (set_attr mode sseinsnmode)])

Nested vec_merge?  That seems... odd to say the least.
How in the world does this get matched?

 +(define_insn *avx512f_loadsmode_mask

Likewise.

 +(define_insn avx512f_storesmode_mask
 +  [(set (match_operand:ssescalarmode 0 memory_operand =m)
 + (vec_select:ssescalarmode
 +   (vec_merge:VF_128
 + (match_operand:VF_128 1 register_operand v)
 + (vec_duplicate:VF_128
 +   (match_dup 0))
 + (match_operand:avx512fmaskmode 2 register_operand k))
 +   (parallel [(const_int 0)])))]

This seems similar, though of course it's an extract.
I still can't imagine how it could be used.

 -(define_insn rcp14mode
 +(define_insn mask_codeforrcp14modemask_name

What, this name isn't used for non-masked anymore?

 -(define_insn srcp14mode
 +(define_insn *srcp14mode

Likewise.  These changes don't belong in this patch.

 -(define_insn rsqrt14mode
 +(define_insn *rsqrt14mode

Likewise.

 @@ -2565,9 +2751,9 @@
  (define_insn *fma_fmadd_mode
[(set (match_operand:FMAMODE 0 register_operand =v,v,v,x,x)
   (fma:FMAMODE
 -   (match_operand:FMAMODE 1 nonimmediate_operand %0, 0, v, x,x)
 -   (match_operand:FMAMODE 2 nonimmediate_operand vm, v,vm, x,m)
 -   (match_operand:FMAMODE 3 nonimmediate_operand  v,vm, 0,xm,x)))]
 +   (match_operand:FMAMODE 1 nonimmediate_operand %0,0,v,x,x)
 +   (match_operand:FMAMODE 2 nonimmediate_operand vm,v,vm,x,m)
 +   (match_operand:FMAMODE 3 nonimmediate_operand v,vm,0,xm,x)))]

Unrelated changes.  Repeated throughout the fma patterns.

 +(define_insn *fmai_fmadd_mode_maskz
 +  [(set (match_operand:VF_128 0 register_operand =v,v)
 + (vec_merge:VF_128
 +   (vec_merge:VF_128
 + (fma:VF_128
 +   (match_operand:VF_128 1 nonimmediate_operand 0,0)
 +   (match_operand:VF_128 2 nonimmediate_operand vm,v)
 +   (match_operand:VF_128 3 nonimmediate_operand v,vm))
 + (match_operand:VF_128 4 const0_operand)
 + (match_operand:QI 5 register_operand k,k))
 +   (match_dup 1)
 +   (const_int 1)))]
 +  TARGET_AVX512F
 +  @
 +   vfmadd132ssescalarmodesuffix\t{%2, %3, %0%{%5%}%N4|%0%{%5%}%N4, %3, %2}
 +   vfmadd213ssescalarmodesuffix\t{%3, %2, %0%{%5%}%N4|%0%{%5%}%N4, %2, %3}
 +  [(set_attr type ssemuladd)
 +   (set_attr mode MODE)])

These seem like useless patterns.  If they're for builtins,
then they seem like useless builtins.  See above.

 @@ -3686,8 +4328,8 @@
 (set_attr athlon_decode vector,double,*)
 (set_attr amdfam10_decode vector,double,*)
 (set_attr bdver1_decode direct,direct,*)
 -   (set_attr btver2_decode double,double,double)
 (set_attr prefix orig,orig,vex)
 +   (set_attr btver2_decode double,double,double)
 (set_attr mode SF)])

Unrelated changes.

 +(define_expand vec_unpacku_float_hi_v16si
 +  [(match_operand:V8DF 0 register_operand)
 +   (match_operand:V16SI 1 register_operand)]
 +  TARGET_AVX512F
 +{
 +  REAL_VALUE_TYPE TWO32r;
 +  rtx k, x, tmp[4];
 +
 +  real_ldexp (TWO32r, dconst1, 32);
 +  x = const_double_from_real_value (TWO32r, DFmode);
 +
 +  tmp[0] = force_reg (V8DFmode, CONST0_RTX (V8DFmode));
 +  tmp[1] = force_reg (V8DFmode, ix86_build_const_vector (V8DFmode, 1, x));
 +  tmp[2] = gen_reg_rtx (V8DFmode);
 +  tmp[3] = gen_reg_rtx (V8SImode);
 +  k = gen_reg_rtx

Re: [patch] Cleanup tree-ssa-ter.c exports

2013-10-21 Thread Eric Botcazou

 what went wrong?  I see it in my checkouts...?

Just open the file in your preferred editor. :-)

-- 
Eric Botcazou

[PATCH, rs6000] Be careful with special permute masks for little endian, take 2

2013-10-21 Thread Bill Schmidt

Hi,

This is a revision of my earlier patch on the subject, expanded to catch
a few more cases and with some attendant test-case adjustments:

In altivec_expand_vec_perm_const, we look for special masks that match
the behavior of specific instructions, so we can use those instructions
rather than load a constant control vector and perform a permute.  Some
of the masks must be treated differently for little endian mode.

The masks that represent merge-high and merge-low operations have
reversed meanings in little-endian, because of the reversed ordering of
the vector elements.  

The masks that represent vector-pack operations remain correct when the
mode of the input operands matches the natural mode of the instruction,
but not otherwise.  This is because the pack instructions always select
the rightmost, low-order bits of the vector element.  There are cases
where we use this, for example, with a V8SI vector matching a vpkuwum
mask in order to select the odd-numbered elements of the vector.  In
little endian mode, this instruction will get us the even-numbered
elements instead.  There is no alternative instruction with the desired
behavior, so I've just disabled use of those masks for little endian
when the mode isn't natural.

This requires adjusting the altivec-perm-1.c test case.  The vector pack
tests are moved to a new altivec-perm-3.c test, which is restricted to
big-endian targets.

These changes fix 49 failures in the test suite for little endian mode
(9 vector failures left to go!).  Bootstrapped and tested on
powerpc64{,le}-unknown-linux-gnu with no new failures.  Is this ok for
trunk?

Thanks,
Bill


gcc:

2013-10-21  Bill Schmidt  wschm...@vnet.ibm.com

* config/rs6000/rs6000.c (altivec_expand_vec_perm_const): Reverse
meaning of merge-high and merge-low masks for little endian; avoid
use of vector-pack masks for little endian for mismatched modes.

gcc/testsuite:

2013-10-21  Bill Schmidt  wschm...@vnet.ibm.com

* gcc.target/powerpc/altivec-perm-1.c: Move the two vector pack
tests into...
* gcc.target/powerpc/altivec-perm-3.c: ...this new test, which is
restricted to big-endian targets.


Index: gcc/testsuite/gcc.target/powerpc/altivec-perm-3.c
===
--- gcc/testsuite/gcc.target/powerpc/altivec-perm-3.c   (revision 0)
+++ gcc/testsuite/gcc.target/powerpc/altivec-perm-3.c   (revision 0)
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_altivec_ok } */
+/* { dg-skip-if  { powerpc*le-*-* } { * } {  } } */
+/* { dg-options -O -maltivec -mno-vsx } */
+
+typedef unsigned char V __attribute__((vector_size(16)));
+
+V p2(V x, V y)
+{
+  return __builtin_shuffle(x, y,
+   (V){ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 });
+
+}
+
+V p4(V x, V y)
+{
+  return __builtin_shuffle(x, y,
+   (V){ 2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 });
+}
+
+/* { dg-final { scan-assembler-not vperm } } */
+/* { dg-final { scan-assembler vpkuhum } } */
+/* { dg-final { scan-assembler vpkuwum } } */
Index: gcc/testsuite/gcc.target/powerpc/altivec-perm-1.c
===
--- gcc/testsuite/gcc.target/powerpc/altivec-perm-1.c   (revision 203792)
+++ gcc/testsuite/gcc.target/powerpc/altivec-perm-1.c   (working copy)
@@ -19,19 +19,6 @@ V b4(V x)
   return __builtin_shuffle(x, (V){ 4,5,6,7, 4,5,6,7, 4,5,6,7, 4,5,6,7, });
 }
 
-V p2(V x, V y)
-{
-  return __builtin_shuffle(x, y,
-   (V){ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 });
-
-}
-
-V p4(V x, V y)
-{
-  return __builtin_shuffle(x, y,
-   (V){ 2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 });
-}
-
 V h1(V x, V y)
 {
   return __builtin_shuffle(x, y,
@@ -72,5 +59,3 @@ V l4(V x, V y)
 /* { dg-final { scan-assembler vspltb } } */
 /* { dg-final { scan-assembler vsplth } } */
 /* { dg-final { scan-assembler vspltw } } */
-/* { dg-final { scan-assembler vpkuhum } } */
-/* { dg-final { scan-assembler vpkuwum } } */
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 203792)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -28837,17 +28838,23 @@ altivec_expand_vec_perm_const (rtx operands[4])
   {  1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31 } },
 { OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vpkuwum,
   {  2,  3,  6,  7, 10, 11, 14, 15, 18, 19, 22, 23, 26, 27, 30, 31 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghb,
+{ OPTION_MASK_ALTIVEC, 
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghb : CODE_FOR_altivec_vmrglb,
   {  0, 16,  1, 17,  2, 18,  3, 19,  4, 20,  5, 21,  6, 22,  7, 23 } },
-{ OPTION_MASK_ALTIVEC, CODE_FOR_altivec_vmrghh,
+{ OPTION_MASK_ALTIVEC,
+  BYTES_BIG_ENDIAN ? CODE_FOR_altivec_vmrghh : CODE_FOR_altivec_vmrglh,
   {  0,

RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 1/2

2013-10-21 Thread Bernd Edlinger

Hi,

On Sun, 20 Oct 2013 20:23:49, Sandra Loosemore wrote:
 Hr. After some further testing, I'm afraid this patch is still
 buggy. :-(

 I tried a backport to GCC 4.8 and tested on arm-none-eabi. On the new
 pr56997-1.c testcase, it got stuck in infinite recursion between
 store_split_bit_field/store_fixed_bit_field and/or
 extract_split_bit_field/extract_fixed_bit_field. This didn't show up in
 my previous mainline testing.

 The difference between 4.8 and mainline head is the alignment of the
 incoming str_rtx passed to store_bit_field/extract_bit_field, due to the
 changes in r199898. The alignment is 8 bits on mainline, and 32 on 4.8.
 It seems to me that the bitfield code ought to handle store/extract
 from a more-aligned object and it's probably possible to construct an
 example that fails in this way on mainline as well.

 It looks like there are conflicting assumptions between get_best_mode,
 the places that call it in store_fixed_bit_field and
 extract_fixed_bit_field, and the code that actually does the splitting
 -- which uses a unit based on the MEM_ALIGN of the incoming rtx, rather
 than its mode. In the case where it's failing, get_best_mode is
 deciding it can't do a HImode access without splitting, but then the
 split code is assuming SImode units because of the 32-bit alignment, but
 not actually changing the mode of the rtx to match that.

 On top of that, this is one of the cases that strict_volatile_bitfield_p
 checks for and returns false, but the callers of store_bit_field and
 extract_bit_field in expr.c have already fiddled with the mode of the
 incoming rtx assuming that -fstrict-volatile-bitfields does apply. It
 doesn't get into that infinite recursion if it's compiled with
 -fno-strict-volatile-bitfields instead; in that case the incoming rtx
 has BLKmode, get_best_mode chooses SImode, and it's able to do the
 access without splitting at all.

 Anyway I tried a couple different bandaids that solved the infinite
 recursion problem but caused regressions elsewhere, and now I'm not sure
 of the right place to fix this. Given that there is also still ongoing
 discussion about making this a 3-way switch (etc) I am going to hold off
 on committing this patch for now.

 -Sandra


You are right,
if I modify pr56997-1 the patch crashes on trunk:

typedef struct s{
 char Prefix;
 short Type;
}__attribute__((packed,aligned(4))) ss;

please add a test case for this.

This way we have op0 aligned 32 and HI mode selected bitoffset=8
numbits=16.
crashes only when reading this value, because the access tries to use
SImode.

For some reason the alignment seems to be wrong in 4.8.

Thanks
Bernd.

RE: [PATCH] reimplement -fstrict-volatile-bitfields v4, part 2/2

2013-10-21 Thread Bernd Edlinger


 have an option for true AAPCS compliance, which will
 be allowed to break the C++11 memory model and

 And an option that addresses your requirements,
 which will _not_ break the C++11 memory model

 So the problem isn't that what *I* need conflicts with C++11, it's
 that what AAPCS needs conflicts?

Yes, there are two written specifications which are in conflict
AAPCS and C++11. We cannot follow both at the same time.

But from this discussion I've learned, that your target's requirements
can easily co-exist with the C++ memory model.

Because if you only use well-formed bit-fields, the C++ memory
model just allows everything, and we can choose what to do.

Regards
Bernd.

Re: patch to enable LRA for ppc

2013-10-21 Thread Vladimir Makarov


On 13-10-21 11:51 AM, Michael Meissner wrote:

On Sun, Oct 20, 2013 at 10:48:08PM -0400, Vladimir Makarov wrote:

On 13-10-18 11:26 AM, David Edelsohn wrote:

On Thu, Oct 3, 2013 at 5:02 PM, Vladimir Makarov vmaka...@redhat.com wrote:

The following patch permits today trunk to use LRA for ppc by default.
To switch it off -mno-lra can be used.

The patch was bootstrapped on ppc64.  GCC testsuite does not have
regressions too (in comparison with reload).  The change in rs6000.md is
for fix LRA failure on a recently added ppc test.

Vlad,

I have not forgotten this patch. We are trying to figure out the right
timeframe to make this change. The patch does affect performance --
both positively and negatively; most are in the noise but not all. And
there still are some SPEC benchmarks that fail to build with the
patch, at least in Mike's tests. And Mike is implementing some patches
to utilize reload to improve use of VSX registers, which would need to
be mirrored in LRA for the equivalent functionality.

Thanks for informing me, David.

I am ready to work on any LRA ppc issues when it will be in the
trunk.  It would be easier for me to work on LRA ppc if the patch is
committed to the trunk and of course LRA is used as non-default
local RA.

I don't know what Mike is doing on reload to use VSX registers.  I
guess it is usage of  VSX regs as spilled locations for GENERAL regs
instead of memory.  If it is so, it is 2 day work to add this
functionality in LRA (as it already has analogous functionality for
Intel processors and that gave a nice SPECFP2000 improvement for
them) and probably more work on resolving issues especially as I
have no power8.

I would say lets add -mlra, but make the default OFF for the time being.  We
can always switch the default later.
Sure, if you know some LRA problems it should not be on default. 
Moreover, if we still have the problems when releasing gcc4.9, I think 
we should exclude any possibility for a user to use LRA for ppc.  I 
don't want to have GGC-4.9 users blaming LRA.


But adding LRA to PPC on the trunk (switched OFF by default) earlier 
could help me a lot to work on the issues.

Vladimir, I thought I included you in the list when I gave status.  The big
thing is several of the Spec 2006 benchmarks don't work in 32-bit mode, and I
get a lot of Fortran errors, again in 32-bit.  I also saw some decimal floating
point problems.

No, I did not see the message (or may be missed).  I need to check.

What I'm doing is adding secondary reload support so that up until reload time,
we can represent VSX addresses as reg+offset, and in secondary reload, create
the addition instructions to put the offset in a base register.  I haven't made
any changes to the machine independent portions of the compiler.  As long as
IRA uses the secondary reload interface, it should be ok.  However, right now,
I need to focus most of my attention on getting the secondary reload support to
work.
I completely understand.  You are quite busy this time as me rushing 
some stuff into gcc-4.9.

One thing that I've asked for before, but to remind you, is I really, really
wish secondary reload could allocate two scratch registers if it is given an
insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, since
that gives 2 registers, but future changes will want TFmode to go into a single
vector register, and I will need to create another type, like V4DI that does
take 2 registers.  The case that this is needed for is moving an item from GPRs
to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
move into, and then I will do the combine operation.

Ok.  I guess LRA can be adapted to some new secondary_reload hook 
returning two scratch registers.

Re: patch to enable LRA for ppc

2013-10-21 Thread Vladimir Makarov


On 13-10-21 2:55 PM, Michael Meissner wrote:

On Mon, Oct 21, 2013 at 08:18:22PM +0200, Steven Bosscher wrote:

On Mon, Oct 21, 2013 at 5:51 PM, Michael Meissner wrote:

What I'm doing is adding secondary reload support so that up until reload time,
we can represent VSX addresses as reg+offset, and in secondary reload, create
the addition instructions to put the offset in a base register.  I haven't made
any changes to the machine independent portions of the compiler.  As long as
IRA uses the secondary reload interface, it should be ok.  However, right now,
I need to focus most of my attention on getting the secondary reload support to
work.

One thing that I've asked for before, but to remind you, is I really, really
wish secondary reload could allocate two scratch registers if it is given an
insn that takes 4 arguments.  Right now, I'm allocating a TFmode scratch, since
that gives 2 registers, but future changes will want TFmode to go into a single
vector register, and I will need to create another type, like V4DI that does
take 2 registers.  The case that this is needed for is moving an item from GPRs
to VSX registers that takes 2 GPR registers, such as moving 128-bit items in
64-bit mode, or 64-bit items in 32-bit mode.  I need two registers to do the
move into, and then I will do the combine operation.


Eh, perhaps I'm missing something, but...

Isn't one of the great advantages of LRA over reload, that LRA allows
you to create new pseudos so that you shouldn't ever need secondary
reloads??

You still need secondary reload.
As I understand, Mike is telling about secondary_reload hook.  LRA can 
generate chain of reloads as long as it is needed.  It is achieved by 
subsequent processing generated reload insns on one or more 
lra-constraints subpasses.  Porting LRA frequently consists of removing 
secondary reload hook as in many cases it is smart enough to find 
necessary reloads just from insns constraints.  But there are still 
really complicated situations when LRA can not do this and it still 
needs directions from secondary_reload hook.  I am sure PPC has real 
needs to use this hook even for LRA.

For example, on the powerpc, you have 5 addressing modes for GPRs and FPRS:

 1) base register
 2) base register + index register
 3) base register + offset
 4) auto-update for base register + index register
 5) auto-update for base register + offset register

Now for VSX registers you only have:

 1) base register
 2) base register + index register

So, in the work I'm doing right now, I want to allow reg + offset addressing,
but if the register being loaded is an Altivec register (high part of the VSX
registers), I need to create a secondary reload to load the offset into and
then convert the address to indirect or indexed addressing.  You don't want to
always disallow offset based addressing, but you want to create the secondary
reload when you need to.

Similarly, vector types can do indexed addressing (reg+reg) but if you are
loading or storing the value into GPR registers, you can't do reg+reg
addressing on multi-word items.

Finally, one of the features of ISA 2.07 is the notion of load fusion, where
the hardware will fuse together a load immediate with an adjacent load (for
GPRs, you need the load immediate shifted to be the register that is being
loaded, for VSX registers, you just need the instructions adjacent).  In this
case, before reload we will want to pretend that the machine has addressing to
include the fusion forms, and in secondary reload, you will generate the
combined insn that will become the fusion instruction.

Re: [PATCH, PR 10474] Split live-ranges of function arguments to help shrink-wrapping

2013-10-21 Thread Vladimir Makarov


On 13-10-21 6:56 PM, Steven Bosscher wrote:


+   {
+ bitmap_clear (need_new);
+ bitmap_clear (reachable);
+ return;
+   }
+
+  for (df_ref use = DF_REG_USE_CHAIN (REGNO(dest));
+  use;
+  use = DF_REF_NEXT_REG (use))
You're using DF in these places. But IRA and LRA don't work with DF.
After update_equiv_regs DF caches and liveness may be incorrect. You'd
have to add a df_analyze call but I'm not sure how that will interact
with IRA/LRA's own dataflow frameworks (e.g. w.r.t.
REG_DEAD/REG_UNUSED notes).


Sorry, Martin.  I think Steven is right.  IRA/LRA (and reload pass) 
creates so many changes in RTL that DF infrastructure would slow down 
the compiler a lot and therefore df info is not updated during RA.  Your 
patch mostly uses a correct DF-info because there are few changes since 
updating is off.


You could move your optimization a bit up before df_clear_flags 
(DF_NO_INSN_RESCAN); or move this call right after your optimizations 
(possibly some minor df calls are needed too to restore live info for RA 
after your RTL changes).

[patch] fix tree-ssa-ter.h dup

2013-10-21 Thread Andrew MacLeod


On 10/21/2013 07:56 PM, Eric Botcazou wrote:

what went wrong?  I see it in my checkouts...?

Just open the file in your preferred editor. :-)


ah. open it in an editor with more than 25 lines on the screen you mean :-)

Thanks
fixed... revision 203914

Andrew

	* tree-ssa-ter.h: Remove duplicate copy of file contents.

Index: tree-ssa-ter.h
===
*** tree-ssa-ter.h	(revision 203836)
--- tree-ssa-ter.h	(working copy)
*** extern bitmap find_replaceable_exprs (va
*** 24,52 
  extern void dump_replaceable_exprs (FILE *, bitmap);
  
  #endif /* GCC_TREE_SSA_TER_H */
- /* Header file for tree-ssa-ter.c exports.
-Copyright (C) 2013 Free Software Foundation, Inc.
- 
- This file is part of GCC.
- 
- GCC is free software; you can redistribute it and/or modify it under
- the terms of the GNU General Public License as published by the Free
- Software Foundation; either version 3, or (at your option) any later
- version.
- 
- GCC is distributed in the hope that it will be useful, but WITHOUT ANY
- WARRANTY; without even the implied warranty of MERCHANTABILITY or
- FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
-  for more details.
- 
- You should have received a copy of the GNU General Public License
- along with GCC; see the file COPYING3.  If not see
- http://www.gnu.org/licenses/.  */
- 
- #ifndef GCC_TREE_SSA_TER_H
- #define GCC_TREE_SSA_TER_H
- 
- extern bitmap find_replaceable_exprs (var_map);
- extern void dump_replaceable_exprs (FILE *, bitmap);
- 
- #endif /* GCC_TREE_SSA_TER_H */
--- 24,26

Re: [Patch, fortran] PR58793 - Wrong value for _vtab for intrinsic types with CLASS(): storage_size of class() gives wrong result

2013-10-21 Thread Paul Richard Thomas

Committed as r203915.  I have a fix for the hollerith nonsense that
yields the same behaviour as other brands. I'll submit tonight.

Cheers

Paul

On 21 October 2013 11:20, Tobias Burnus bur...@net-b.de wrote:
 Paul Richard Thomas wrote:

 This patch is fairly obvious and follows a suggestion from Tobias to
 use gfc_element_size.  He even wrote the testcase!
 Bootstrapped and regested on FC17/x86_64 - OK for trunk?


 Looks good to me. Thanks for the patch.


 PS In writing this, I have just noted that I need to trap ts-type ==
 BT_HOLLERITH.  It would be a rather bizarre thing to try to do but I
 just know that somebody, somewhere will eventually try it :-) Consider
 that it will be done in the committed version.


 I think that should be done as follow-up patch.In particular, it is not
 obvious how it should be handled.

 The program below ICEs in gfc_typenode_for_spec;with both Intel's and Cray's
 compiler, it gives
  Character with len 3
  Something else
 I somehow had expected that it maps to a character.

 program test
   call up(abc)
   call up(3habc)
 contains
   subroutine up(x)
 class(*) :: x
 select type(x)
  type is (character(*))
   print *, 'Character with len', len(x)
  class default
   print *, 'Something else'
 end select
   end subroutine
 end program test

  The following - related to TYPE(*) - is also accepted, but I think it
 shouldn't. (call up(x) is presumably invalid). It fails here at assembler
 time.

 program test
   use iso_c_binding
   integer,target ::aa
   call up(c_loc(aa)) ! Probably okay but not matchable?
   call bar(aa) !OK
 contains
   subroutine up(x)
 class(*) :: x
 select type(x)
 ! type is (c_ptr) ! invalid  rejected by gfortran
 !  print *, 'C_ptr'
  class default
   print *, 'Something else'
 end select
   end subroutine
   subroutine bar(x)
type(*) :: x
call up(x) ! Invalid
   end subroutine bar
 end program test


 Tobias


 2013-10-20  Paul Thomas  pa...@gcc.gnu.org

  PR fortran 57893
  * class.c : Include target-memory.h.
  (gfc_find_intrinsic_vtab) Build a minimal expression so that
  gfc_element_size can be used to obtain the storage size, rather
  that the kind value.

 2013-10-20  Paul Thomas  pa...@gcc.gnu.org

  PR fortran 57893
  * gfortran.dg/unlimited_polymorphic_12.f90 : New test.





-- 
The knack of flying is learning how to throw yourself at the ground and miss.
   --Hitchhikers Guide to the Galaxy

97 matches

Mail list logo