[PATCH][PR rtl-optimization/70024] Fix argument to CROSSING_JUMP_P

2016-03-19 Thread Jeff Law


As noted in the BZ, we were passing a SEQUENCE to CROSSING_JUMP_P, which 
triggers an RTL checking failure.  It's pretty obvious that we should 
have been passing in "delay_jump_insn" and doing so, of course, fixes 
the failure.
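
As an illustration of the failure mode described above, here is a toy sketch (not GCC code). `ToyRtx`, `toy_crossing_jump_p`, and `toy_mark_crossing` are hypothetical names: the accessor is assumed to reject any node that is not a jump, which is roughly what an RTL-checking build does when a flag macro is applied to a SEQUENCE instead of the jump it wraps.

```cpp
#include <cassert>
#include <stdexcept>

// Toy stand-ins for RTL codes; GCC's real rtx layout is far richer.
enum class ToyCode { JumpInsn, Sequence };

struct ToyRtx {
  ToyCode code = ToyCode::JumpInsn;
  bool crossing_jump = false;      // only meaningful for JumpInsn
  ToyRtx *first_in_seq = nullptr;  // for Sequence: the wrapped jump
};

// Checked accessor (hypothetical): refuses any node that is not a jump.
bool &toy_crossing_jump_p(ToyRtx &x) {
  if (x.code != ToyCode::JumpInsn)
    throw std::logic_error("flag check: expected a jump insn");
  return x.crossing_jump;
}

// Mark the jump held in a delay-slot sequence.  Passing `seq` itself to the
// accessor would throw; the flag belongs on the wrapped jump.
void toy_mark_crossing(ToyRtx &seq) {
  assert(seq.code == ToyCode::Sequence && seq.first_in_seq != nullptr);
  toy_crossing_jump_p(*seq.first_in_seq) = true;
}
```

In this model, setting the flag through the sequence node throws, while setting it through the wrapped jump succeeds.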


I haven't been able to put together a sparc64 system for testing under 
qemu, but I'm highly confident we've got the right fix.


I've committed this to the trunk.  I'm removing the gcc-6 regression 
marker, but adding one for gcc-5 as I believe gcc-5 suffers from the 
same problem -- even if this testcase doesn't trigger.


Jeff
commit 295d529101bb79b4f876e119d8e3e8dbd43963d2
Author: law 
Date:   Wed Mar 16 16:58:12 2016 +

PR rtl-optimization/70024
* reorg.c (relax_delay_slots): Pass right argument to CROSSING_JUMP_P.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234262 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b673443..ef16b27 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-11  Jeff Law  
+
+   PR rtl-optimization/70024
+   * reorg.c (relax_delay_slots): Pass right argument to CROSSING_JUMP_P.
+
 2016-03-16  Richard Henderson  
 
PR middle-end/70199
diff --git a/gcc/reorg.c b/gcc/reorg.c
index a02141f..7b28821 100644
--- a/gcc/reorg.c
+++ b/gcc/reorg.c
@@ -3307,7 +3307,7 @@ relax_delay_slots (rtx_insn *first)
  reorg_redirect_jump (delay_jump_insn, trial);
  target_label = trial;
  if (crossing)
-   CROSSING_JUMP_P (insn) = 1;
+   CROSSING_JUMP_P (delay_jump_insn) = 1;
}
 
   /* If the first insn at TARGET_LABEL is redundant with a previous


[PATCH] Use simplify_replace_rtx instead of replace_rtx for DEBUG_INSNs in reload

2016-03-19 Thread Jakub Jelinek
Hi!

This patch fixes one of the spots that use replace_rtx.  As this one changes
debug insns, it clearly wants to replace based on regno, not on pointer
equality, and for debug insns simplification is always desirable too.
The other place in reload1 that modifies DEBUG_INSNs also uses
simplify_replace_rtx.
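
The difference between replacing by pointer equality and replacing by register number can be sketched with a toy expression tree. This is not GCC code: `ToyExpr`, `replace_by_pointer`, and `replace_by_regno` are hypothetical names, and the simplification step that simplify_replace_rtx also performs is left out.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy expression: a register reference or a PLUS of two operands.
struct ToyExpr {
  enum Kind { Reg, Plus } kind = Reg;
  int regno = 0;                      // valid when kind == Reg
  std::shared_ptr<ToyExpr> lhs, rhs;  // valid when kind == Plus
};
using ExprPtr = std::shared_ptr<ToyExpr>;

ExprPtr toy_reg(int regno) {
  auto e = std::make_shared<ToyExpr>();
  e->kind = ToyExpr::Reg;
  e->regno = regno;
  return e;
}

ExprPtr toy_plus(ExprPtr a, ExprPtr b) {
  auto e = std::make_shared<ToyExpr>();
  e->kind = ToyExpr::Plus;
  e->lhs = std::move(a);
  e->rhs = std::move(b);
  return e;
}

// Replace by identity: only the exact node `from` is swapped out.
ExprPtr replace_by_pointer(const ExprPtr &x, const ExprPtr &from,
                           const ExprPtr &to) {
  if (x == from) return to;
  if (x->kind == ToyExpr::Plus)
    return toy_plus(replace_by_pointer(x->lhs, from, to),
                    replace_by_pointer(x->rhs, from, to));
  return x;
}

// Replace by value: every register with the same number is swapped out.
ExprPtr replace_by_regno(const ExprPtr &x, int regno, const ExprPtr &to) {
  if (x->kind == ToyExpr::Reg) return x->regno == regno ? to : x;
  return toy_plus(replace_by_regno(x->lhs, regno, to),
                  replace_by_regno(x->rhs, regno, to));
}

// Count references to register `regno` in the tree.
int count_reg(const ExprPtr &x, int regno) {
  assert(x);
  if (x->kind == ToyExpr::Reg) return x->regno == regno ? 1 : 0;
  return count_reg(x->lhs, regno) + count_reg(x->rhs, regno);
}
```

If a tree holds two distinct nodes for the same register, the pointer-based variant leaves one stale reference behind, while the regno-based variant rewrites both.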

Bootstrapped/regtested on {x86_64,i686,powerpc64{,le}}-linux, ok for trunk?

2016-03-18  Jakub Jelinek  

* reload1.c (emit_input_reload_insns): Use simplify_replace_rtx
instead of replace_rtx for DEBUG_INSNs.

--- gcc/reload1.c.jj2016-03-02 07:39:13.0 +0100
+++ gcc/reload1.c   2016-03-16 10:41:34.622921016 +0100
@@ -7395,7 +7395,9 @@ emit_input_reload_insns (struct insn_cha
  /* Adjust any debug insns between temp and insn.  */
  while ((temp = NEXT_INSN (temp)) != insn)
if (DEBUG_INSN_P (temp))
- replace_rtx (PATTERN (temp), old, reloadreg);
+ INSN_VAR_LOCATION_LOC (temp)
+   = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (temp),
+   old, reloadreg);
else
  gcc_assert (NOTE_P (temp));
}

Jakub


Re: [PATCH] PR testsuite/70150: Check non-pic/ia32 in stackprotectexplicit2.C

2016-03-19 Thread Rainer Orth
"H.J. Lu"  writes:

> For ia32, __stack_chk_fail isn't called in PIC.  We need to check
> non-pic or non-ia32 before scanning for __stack_chk_fail.
>
> OK for trunk?
>
> H.J.
> ---
>   PR testsuite/70150

There's already PR c++/66400 (and a bunch of others for several related
issues) which I'd filed on your request when working on Solaris PIE
support.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [C++ PATCH] Fix -flifetime-dse bug in dtors too (PR c++/70272)

2016-03-19 Thread Jason Merrill

OK.

Jason


New French PO file for 'cpplib' (version 6.1-b20160131)

2016-03-19 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'cpplib' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/cpplib/fr.po

(This file, 'cpplib-6.1-b20160131.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/cpplib/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/cpplib.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [RFA][PR rtl-optimization/70263] Fix creation of new REG_EQUIV notes

2016-03-19 Thread David Malcolm
On Fri, 2016-03-18 at 13:20 -0600, Jeff Law wrote:
> On 03/18/2016 01:16 PM, Bernd Schmidt wrote:
> > On 03/18/2016 08:14 PM, Jeff Law wrote:
> > > I also added a blurb to the dump file when we create these
> > > equivalences
> > > and included a test to verify the code fires.  I verified it
> > > fired on
> > > x86 and x86-64.  It may or may not fire on other targets, so I
> > > left the
> > > test in the i386 specific subdirectory.
> > 
> > This is the sort of thing I'd want to do with rtl unit tests.

Would you like an RTL frontend?  (and something like a 
 gcc/testsuite/rtl.dg/ )

> Yea.  Along the same lines, my patch for the coalescing problem 
> introduces a new bitmap function that I'd like to cover with some
> unit 
> tests.  

Presumably the -fself-test idea could help here?

> I'm sure we're going to find oodles of these things as we 
> continue development and ponder how to better test things than
> scanning 
> dump files.

 



Re: [PATCH] Fix PR c++/70218 (illegal access to private field succeeds)

2016-03-19 Thread Matthias Klose

On 13.03.2016 21:03, Patrick Palka wrote:

> Here we are mishandling the deferred_access_stack by not coherently
> pushing/popping from it.  In cp_parser_lambda_expression we are calling
> (in order):
>
>    push_deferring_access_checks (dk_no_deferred);
>    cp_parser_start_tentative_firewall (parser);
>    ...
>    pop_deferring_access_checks ();
>    cp_parser_end_tentative_firewall (parser, start, lambda_expr);
>
> But the order of the last two popping calls does not correspond with the order
> of the first two pushing calls.  pop_deferring_access_checks should be
> called last.  This error may cause us to drop deferred access checks
> instead of performing them.
>
> Bootstrap + regtest in progress, does this look OK to commit if testing
> succeeds?
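
The nesting requirement in the description above can be sketched with a toy model. It rests on a simplifying assumption: both mechanisms are collapsed into a single stack of tags, which is not how the C++ front end stores them. `ToyScopeStack` and the helper names are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy scope stack: each begin() pushes a tag, each end() should pop that tag.
class ToyScopeStack {
 public:
  void begin(const std::string &tag) { open_.push_back(tag); }

  // Returns true when `tag` is the innermost open scope (well nested).
  bool end(const std::string &tag) {
    assert(!open_.empty());
    bool well_nested = (open_.back() == tag);
    open_.pop_back();
    return well_nested;
  }

 private:
  std::vector<std::string> open_;
};

// Order from the description: the scope opened first is also closed first.
bool toy_mismatched_order() {
  ToyScopeStack s;
  s.begin("access_checks");
  s.begin("firewall");
  bool a = s.end("access_checks");  // innermost scope is "firewall"
  bool b = s.end("firewall");
  return a && b;
}

// Corrected order: the scope opened first is closed last.
bool toy_nested_order() {
  ToyScopeStack s;
  s.begin("access_checks");
  s.begin("firewall");
  bool a = s.end("firewall");
  bool b = s.end("access_checks");
  return a && b;
}
```

Only the second ordering keeps every pop matched with its own push.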


When applying this patch to the gcc-5-branch I see regressions like

/scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C: 
In function 'void foo()':
/scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:6:8: 
error: 'int X::i' is private
/scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:16:18: 
error: within this context


Excess errors:
/scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:6:8: 
error: 'int X::i' is private
/scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:16:18: 
error: within this context



I haven't yet checked the trunk.  I don't see any other regressions besides the 
usual noise in the ubsan tests.


Matthias




Re: [PATCH] Fix PR64764

2016-03-19 Thread Tom de Vries

On 16/03/16 17:15, H.J. Lu wrote:

On Wed, Mar 16, 2016 at 9:12 AM, H.J. Lu  wrote:



Any particular reason why this test was changed to DOS format?


FWIW, the test was in DOS format from the start.

Thanks
- Tom



Re: [PATCH, PR70269] Set dump_file to NULL in cgraph_node::get_body

2016-03-19 Thread Richard Biener
On Thu, 17 Mar 2016, Tom de Vries wrote:

> Hi,
> 
> this patch fixes PR70269, a 5/6 regression.
> 
> When compiling with "-O2 -fipa-pta -fdump-ipa-pta-graph" we try to initialize
> a graph dump file for ipa-cp, while the dump file is not enabled, which causes
> an ICE because dump_file_name is NULL.
> 
> This condition in pass_init_dump_file enables the unnecessary initialization,
> because dump_file is non-NULL:
> ...
>   if (initializing_dump
>   && dump_file && (dump_flags & TDF_GRAPH)
>   && cfun && (cfun->curr_properties & PROP_cfg))
> ...
> 
> The dump_file is non-NULL, but it's the dump file for ipa-pta, the pass that
> calls cgraph_node::get_body which triggers the ipa transform of ipa-cp.
> 
> The patch fixes this by resetting dump_file to NULL in cgraph_node::get_body.
> 
> OK for stage 4 trunk/5 branch if bootstrap and reg-test succeed?

Ok.

Richard.
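
The idea in the quoted description (hide the outer pass's dump stream while a nested transform runs) can be sketched as follows. This is a toy, not the patch: `toy_dump_file` and `ToyDumpSuspender` are hypothetical, and restoring the previous value afterwards is an assumption, since the message only says the pointer is reset to NULL.

```cpp
#include <cassert>
#include <cstdio>

// Toy global standing in for a "current dump stream"; null means no dump.
static std::FILE *toy_dump_file = nullptr;

// RAII helper: clears the global for a nested operation, restores it after.
class ToyDumpSuspender {
 public:
  ToyDumpSuspender() : saved_(toy_dump_file) { toy_dump_file = nullptr; }
  ~ToyDumpSuspender() { toy_dump_file = saved_; }
  ToyDumpSuspender(const ToyDumpSuspender &) = delete;
  ToyDumpSuspender &operator=(const ToyDumpSuspender &) = delete;

 private:
  std::FILE *saved_;
};

// Nested pass that must not mistake the outer pass's stream for its own.
bool toy_inner_pass_sees_dump() { return toy_dump_file != nullptr; }

// Outer pass: owns a stream, but hides it while the inner pass runs.
bool toy_outer_pass(std::FILE *outer_stream) {
  toy_dump_file = outer_stream;
  bool leaked;
  {
    ToyDumpSuspender guard;
    leaked = toy_inner_pass_sees_dump();
  }
  assert(toy_dump_file == outer_stream);  // restored after the nested call
  toy_dump_file = nullptr;
  return leaked;
}
```

With the guard in place, the inner pass never observes a non-null stream that belongs to another pass.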


Rename GOMP_MAP_FORCE_DEALLOC to GOMP_MAP_DELETE (was: [gomp4.1] map clause parsing improvements)

2016-03-19 Thread Thomas Schwinge
Hi!

On Thu, 17 Mar 2016 15:37:04 +0100, Jakub Jelinek  wrote:
> On Thu, Mar 17, 2016 at 03:34:09PM +0100, Thomas Schwinge wrote:
> > That's simple enough; OK to commit?  (I'm also including the related
> > change, to rename the Fortran OMP_MAP_FORCE_DEALLOC to OMP_MAP_DELETE,
> > because I think that's what you'd do, once starting the OpenMP 4.5
> > Fortran front end work.)
> 
> Ok, thanks.

Committed unchanged to trunk in r234294:

commit 5cb6b0b9685d5c63e87abb10abac60312dab1378
Author: tschwinge 
Date:   Thu Mar 17 15:07:54 2016 +

Rename GOMP_MAP_FORCE_DEALLOC to GOMP_MAP_DELETE

Also rename the Fortran OMP_MAP_FORCE_DEALLOC to OMP_MAP_DELETE.

include/
* gomp-constants.h (enum gomp_map_kind): Rename
GOMP_MAP_FORCE_DEALLOC to GOMP_MAP_DELETE.  Adjust all users.

gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Rename OMP_MAP_FORCE_DEALLOC
to OMP_MAP_DELETE.  Adjust all users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234294 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/c/c-parser.c   | 2 +-
 gcc/cp/parser.c| 2 +-
 gcc/fortran/ChangeLog  | 5 +
 gcc/fortran/gfortran.h | 2 +-
 gcc/fortran/openmp.c   | 2 +-
 gcc/fortran/trans-openmp.c | 6 +++---
 gcc/gimplify.c | 2 +-
 gcc/omp-low.c  | 2 +-
 gcc/tree-pretty-print.c| 2 +-
 include/ChangeLog  | 5 +
 include/gomp-constants.h   | 6 ++
 libgomp/oacc-parallel.c| 6 +++---
 12 files changed, 25 insertions(+), 17 deletions(-)

diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 60ec996..82d6eca 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10715,7 +10715,7 @@ c_parser_oacc_data_clause (c_parser *parser, 
pragma_omp_clause c_kind,
   kind = GOMP_MAP_FORCE_ALLOC;
   break;
 case PRAGMA_OACC_CLAUSE_DELETE:
-  kind = GOMP_MAP_FORCE_DEALLOC;
+  kind = GOMP_MAP_DELETE;
   break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
diff --git gcc/cp/parser.c gcc/cp/parser.c
index 62570d4..8ba4ffe 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -30086,7 +30086,7 @@ cp_parser_oacc_data_clause (cp_parser *parser, 
pragma_omp_clause c_kind,
   kind = GOMP_MAP_FORCE_ALLOC;
   break;
 case PRAGMA_OACC_CLAUSE_DELETE:
-  kind = GOMP_MAP_FORCE_DEALLOC;
+  kind = GOMP_MAP_DELETE;
   break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
diff --git gcc/fortran/ChangeLog gcc/fortran/ChangeLog
index 9ed112e..105e7b4 100644
--- gcc/fortran/ChangeLog
+++ gcc/fortran/ChangeLog
@@ -1,3 +1,8 @@
+2016-03-17  Thomas Schwinge  
+
+   * gfortran.h (enum gfc_omp_map_op): Rename OMP_MAP_FORCE_DEALLOC
+   to OMP_MAP_DELETE.  Adjust all users.
+
 2016-03-13  Jerry DeLisle  
Jim MacArthur  
 
diff --git gcc/fortran/gfortran.h gcc/fortran/gfortran.h
index 33fffd8..a0fb5fd 100644
--- gcc/fortran/gfortran.h
+++ gcc/fortran/gfortran.h
@@ -1112,8 +1112,8 @@ enum gfc_omp_map_op
   OMP_MAP_TO,
   OMP_MAP_FROM,
   OMP_MAP_TOFROM,
+  OMP_MAP_DELETE,
   OMP_MAP_FORCE_ALLOC,
-  OMP_MAP_FORCE_DEALLOC,
   OMP_MAP_FORCE_TO,
   OMP_MAP_FORCE_FROM,
   OMP_MAP_FORCE_TOFROM,
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index 51ab96e..a6c39cd 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -764,7 +764,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
   if ((mask & OMP_CLAUSE_DELETE)
  && gfc_match ("delete ( ") == MATCH_YES
  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-  OMP_MAP_FORCE_DEALLOC))
+  OMP_MAP_DELETE))
continue;
   if ((mask & OMP_CLAUSE_PRESENT)
  && gfc_match ("present ( ") == MATCH_YES
diff --git gcc/fortran/trans-openmp.c gcc/fortran/trans-openmp.c
index 5990202..a905ca6 100644
--- gcc/fortran/trans-openmp.c
+++ gcc/fortran/trans-openmp.c
@@ -2119,12 +2119,12 @@ gfc_trans_omp_clauses (stmtblock_t *block, 
gfc_omp_clauses *clauses,
case OMP_MAP_TOFROM:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TOFROM);
  break;
+   case OMP_MAP_DELETE:
+ OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE);
+ break;
case OMP_MAP_FORCE_ALLOC:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC);
  break;
-   case OMP_MAP_FORCE_DEALLOC:
- OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_DEALLOC);
- break;
case OMP_MAP_FORCE_TO:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_TO);
  break;
diff --git gcc/gimplify.c gcc/gimplify.c
index f3e5c39..3687e7a 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -8194,7 +8194,7 @@ 

Re: [PATCH][ARM] Split out armv7ve effective target check

2016-03-19 Thread Ramana Radhakrishnan
On Wed, Mar 2, 2016 at 1:32 PM, Kyrill Tkachov
 wrote:
> Hi all,
>
> I'm seeing the fails:
> FAIL: gcc.target/arm/atomic_loaddi_2.c scan-assembler-times ldrd\tr[0-9]+,
> r[0-9]+, \\[r[0-9]+\\] 1
> FAIL: gcc.target/arm/atomic_loaddi_5.c scan-assembler-times ldrd\tr[0-9]+,
> r[0-9]+, \\[r[0-9]+\\] 1
> FAIL: gcc.target/arm/atomic_loaddi_8.c scan-assembler-times ldrd\tr[0-9]+,
> r[0-9]+, \\[r[0-9]+\\] 1
>
> when testing an arm multilib with /-march=armv7-a.
>
> The tests have an effective target check for armv7ve but it doesn't work
> because
> under the hood the check is the same as for armv7-a, that is it checks for
> the __ARM_ARCH_7A__
> predefine which is set for both march values.
>
> To check for armv7ve using predefines we need to check for both
> __ARM_ARCH_7A__ and for the hardware
> integer division predefine, making armv7ve special.
>
> So this patch separates the effective target check definition from the rest
> of the architectures
> and defines it appropriately.
>
> With this patch the aforementioned tests appear UNSUPPORTED when testing the
> /-march=armv7-a multilib.
>
> Ok for trunk?

Ok, but please follow up with updating sourcebuild.texi.

Ramana

>
> Thanks,
> Kyrill
>
> 2016-03-02  Kyrylo Tkachov  
>
> * lib/target-supports.exp: Remove v7ve entry from loop
> creating effective target checks.
> (check_effective_target_arm_arch_v7ve_ok): New procedure.
> (add_options_for_arm_arch_v7ve): Likewise.
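
The predefine-based distinction described in the quoted message can be sketched in C++. One assumption is labeled in the code: `__ARM_ARCH_EXT_IDIV__` is taken to be the hardware integer division predefine, since the message does not spell out its name. The constant and function names are hypothetical.

```cpp
#include <cassert>

// __ARM_ARCH_7A__ is set for both -march=armv7-a and -march=armv7ve
// (per the message above), so it cannot tell the two apart on its own.
#if defined(__ARM_ARCH_7A__)
constexpr bool kIsArmv7a = true;
#else
constexpr bool kIsArmv7a = false;
#endif

// Assumption: this is the hardware integer division predefine meant above.
#if defined(__ARM_ARCH_EXT_IDIV__)
constexpr bool kHasHwDiv = true;
#else
constexpr bool kHasHwDiv = false;
#endif

// Both predefines together are what makes armv7ve distinguishable.
constexpr bool kLooksLikeArmv7ve = kIsArmv7a && kHasHwDiv;

bool toy_armv7ve_ok() {
  assert(!kLooksLikeArmv7ve || kIsArmv7a);  // armv7ve implies the 7-A macro
  return kLooksLikeArmv7ve;
}
```

On any non-ARM host both constants are false, so the check reports the target as unsupported.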


Scan for parallelization of the oacc kernels test-cases in gfortran.dg/goacc (was: [PATCH, 15/16] Add libgomp.oacc-c-c++-common/kernels-*.c)

2016-03-19 Thread Thomas Schwinge
Hi!

On Wed, 9 Mar 2016 10:17:28 +0100, Tom de Vries  wrote:
> [Should have cited
> 
> instead of the C/C++ tests]

> Retested on current trunk.
> 
> Committed, minus the kernels-parallel-loop-data-enter-exit.f95 test.

Is there a reason why you omitted the following tree scanning tests (as
done for C/C++, and also present for Fortran on gomp-4_0-branch)?  (Note
that I had to XFAIL gfortran.dg/goacc/kernels-loop-n.f95.)  OK to commit?

commit f0294eeb30ef285c3930b975ccbc1b6d7052cc03
Author: Thomas Schwinge 
Date:   Fri Mar 18 12:52:37 2016 +0100

Scan for parallelization of the oacc kernels test-cases in gfortran.dg/goacc

gcc/testsuite/
* gfortran.dg/goacc/kernels-loop-2.f95: Scan for parallelization.
* gfortran.dg/goacc/kernels-loop-data-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-enter-exit.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data-update.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-data.f95: Likewise.
* gfortran.dg/goacc/kernels-loop.f95: Likewise.
* gfortran.dg/goacc/kernels-loop-n.f95: Likewise, XFAILed.
---
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 | 2 ++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95| 1 +
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 | 2 ++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95   | 2 ++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95   | 2 ++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-data.f95  | 2 ++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop-n.f95 | 7 +++
 gcc/testsuite/gfortran.dg/goacc/kernels-loop.f95   | 2 ++
 8 files changed, 20 insertions(+)

diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95 
gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
index 5cc2e8b..865f7a6 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-2.f95
@@ -40,3 +40,5 @@ end program main
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.0 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.1 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.2 " 1 
"optimized" } }
+
+! { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } }
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95 
gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
index d1bfc70..c9f3a62 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-2.f95
@@ -47,3 +47,4 @@ end program main
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.1 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.2 " 1 
"optimized" } }
 
+! { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } }
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95 
gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
index feac7b2..3361607 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit-2.f95
@@ -46,3 +46,5 @@ end program main
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.0 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.1 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.2 " 1 
"optimized" } }
+
+! { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } }
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95 
gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
index 632983f..5ba56fb 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-enter-exit.f95
@@ -44,3 +44,5 @@ end program main
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.0 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.1 " 1 
"optimized" } }
 ! { dg-final { scan-tree-dump-times "(?n);; Function MAIN__._omp_fn.2 " 1 
"optimized" } }
+
+! { dg-final { scan-tree-dump-times "(?n)oacc function \\(0," 3 "parloops1" } }
diff --git gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95 
gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
index 41b0d96..a622a96 100644
--- gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
+++ gcc/testsuite/gfortran.dg/goacc/kernels-loop-data-update.f95
@@ -43,3 +43,5 @@ end program main
 ! Check that the loop has been split off into a 

Re: [PATCH][ARM][testsuite][committed] Do not override -mcpu in no-volatile-in-it.c

2016-03-19 Thread Ramana Radhakrishnan
On Fri, Mar 18, 2016 at 10:31 AM, Andre Vieira (lists)
 wrote:
> On 16/07/15 16:31, Kyrill Tkachov wrote:
>> Hi all,
>>
>> This scan-assembler test was failing for me when testing with an
>> explicit /-march=armv7-a variant because
>> it clashed with the -mcpu=cortex-m7 and overrode it.
>>
>> This patch skips the test if the user forces an incompatible -march or
>> -mcpu option.
>> The test now appears as UNSUPPORTED in these conditions and PASSes
>> normally.
>>
>> Applied as obvious with r225892.
>>
>> Thanks,
>> Kyrill
>>
>> 2015-07-16  Kyrylo Tkachov  
>>
>> * gcc.target/arm/no-volatile-in-it.c: Skip if -mcpu is overridden.
>
> OK to backport this to gcc-5-branch?

Ok ... it was *obvious* , if you're hitting it on the gcc-5 branch
then apply it.

Ramana
>
> Cheers,
> Andre


[PATCH, PR tree-optimization/70251] Disable VEC_COND_EXPR transformation into VIEW_CONVERT_EXPR for scalar mask case

2016-03-19 Thread Ilya Enkovich
Hi,

This patch disables two match.pd patterns which transform
VEC_COND_EXPR into simple conversion in case it uses a scalar mask.
The patch was bootstrapped and regtested on x86_64-pc-linux-gnu +
separate check for new test on SDE.  OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-17  Ilya Enkovich  

* match.pd (A + (B vcmp C ? 1 : 0) -> A - (B vcmp C)): Apply
for boolean vector with vector mode only.
(A - (B vcmp C ? 1 : 0) -> A + (B vcmp C)): Likewise.

gcc/testsuite/

2016-03-17  Ilya Enkovich  

* gcc.target/i386/pr70251.c: New test.
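
A small worked example of why the rewrite A + (B vcmp C ? 1 : 0) -> A - (B vcmp C) depends on the mask representation. This is a toy model in plain C++, not match.pd: vector masks are modeled as lanes holding -1 or 0, and the scalar mask is modeled as one bit per lane packed into an integer, which is an assumption about the representation. All names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kLanes = 4;
using Lanes = std::array<std::int32_t, kLanes>;

// Vector-style comparison: each lane is all-ones (-1) when true, else 0.
Lanes toy_vcmp_ge(const Lanes &b, const Lanes &c) {
  Lanes m{};
  for (std::size_t i = 0; i < kLanes; ++i) m[i] = (b[i] >= c[i]) ? -1 : 0;
  return m;
}

// A + (mask ? 1 : 0), the original form.
Lanes toy_plus_select(const Lanes &a, const Lanes &mask) {
  Lanes r{};
  for (std::size_t i = 0; i < kLanes; ++i) r[i] = a[i] + (mask[i] ? 1 : 0);
  return r;
}

// A - mask, the rewritten form; valid only because true lanes hold -1.
Lanes toy_minus_mask(const Lanes &a, const Lanes &mask) {
  Lanes r{};
  for (std::size_t i = 0; i < kLanes; ++i) {
    assert(mask[i] == 0 || mask[i] == -1);
    r[i] = a[i] - mask[i];
  }
  return r;
}

// Scalar-style mask: one bit per lane.  There is no per-lane -1 to subtract,
// so the rewritten form has nothing meaningful to operate on.
std::uint8_t toy_scalar_mask_ge(const Lanes &b, const Lanes &c) {
  std::uint8_t bits = 0;
  for (std::size_t i = 0; i < kLanes; ++i)
    if (b[i] >= c[i]) bits |= static_cast<std::uint8_t>(1u << i);
  return bits;
}
```

The two vector forms agree lane by lane, while the packed scalar mask is a single integer and cannot be reinterpreted as a vector of the same shape as A.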


diff --git a/gcc/match.pd b/gcc/match.pd
index 112deb3..7245ff4 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -1759,6 +1759,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (plus:c @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
  (if (VECTOR_TYPE_P (type)
+  && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (@0)))
   && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
   && (TYPE_MODE (TREE_TYPE (type))
   == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
@@ -1768,6 +1769,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
 (simplify
  (minus @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
  (if (VECTOR_TYPE_P (type)
+  && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (@0)))
   && TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
   && (TYPE_MODE (TREE_TYPE (type))
   == TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
diff --git a/gcc/testsuite/gcc.target/i386/pr70251.c 
b/gcc/testsuite/gcc.target/i386/pr70251.c
new file mode 100644
index 000..97078cd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr70251.c
@@ -0,0 +1,52 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -mavx512bw" } */
+/* { dg-require-effective-target avx512bw } */
+
+#define AVX512BW
+#include "avx512f-helper.h"
+
+unsigned long long int
+hash(unsigned long long int seed, unsigned long long int v)
+{
+  return seed ^ (v + 0x9e3779b9 + (seed<<6) + (seed>>2));
+}
+
+unsigned int a [100];
+signed char b [100];
+signed char c [100];
+
+void
+init ()
+{
+  for (int i = 0; i < 100; ++i)
+{
+  a [i] = 1000L;
+  b [i] = 10;
+  c [i] = 5;
+}
+}
+
+void
+foo ()
+{
+  for (int i = 0; i < 100; ++i)
+b [i] = (!b [i] ^ (a [i] >= b [i])) + c [i];
+}
+
+unsigned long long int
+checksum ()
+{
+  unsigned long long int seed = 0ULL;
+  for (int i = 0; i < 100; ++i)
+seed = hash (seed, b[i]);
+  return seed;
+}
+
+void
+TEST ()
+{
+  init ();
+  foo ();
+  if (checksum () != 5785906989299578598ULL)
+__builtin_abort ();
+}


C++ PATCH for c++/70147 (-fsanitize=vptr, -flifetime-dse, and virtual bases)

2016-03-19 Thread Jason Merrill
The first patch factors out testing of current_in_charge_parm from 
various places in the compiler into a new build_if_in_charge function.


The second patch implements Bernd's suggestion for modifying 
-flifetime-dse so that when we have virtual bases, we clobber the entire 
object, but only when we are in charge (and therefore know we are in the 
constructor for a complete object, and don't need to worry about tail 
padding).


The third patch adjusts the -fsanitize=vptr vptr clearing so that we 
don't clear the vptr for a virtual base when we aren't in charge of 
virtual bases, even if the current class shares the vptr from a primary 
virtual base.


The testcase tests all of these changes.

Tested x86_64-pc-linux-gnu, applying to trunk.
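
For readers less familiar with the "in charge" notion used above, here is a minimal standard C++ sketch of the language rule behind it: a virtual base is initialized only by the constructor of the most derived (complete) object. The type and function names are made up for illustration and are not taken from the testcase.

```cpp
#include <cassert>

struct VBase {
  int tag;
  explicit VBase(int t) : tag(t) {}
};

// When Middle is the complete object, its constructor is in charge of VBase.
struct Middle : virtual VBase {
  Middle() : VBase(1) {}
};

// When Derived is the complete object, Derived initializes VBase and the
// VBase(1) initializer in Middle's constructor is skipped.
struct Derived : Middle {
  Derived() : VBase(2), Middle() {}
};

int toy_tag_of_complete_middle() {
  Middle m;
  return m.tag;
}

int toy_tag_of_derived() {
  Derived d;
  // There is exactly one VBase subobject, reached the same way via Middle.
  assert(&static_cast<Middle &>(d).tag == &d.tag);
  return d.tag;
}
```

The same Middle constructor source therefore has to behave differently depending on whether it is constructing a complete object or a base subobject.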

commit 4cffe7ea961ed6b602a954c616f5186f27c85db5
Author: Jason Merrill 
Date:   Thu Mar 17 17:22:43 2016 -0400

	* class.c (build_if_in_charge): Split out from build_base_path.

	* init.c (expand_virtual_init, expand_default_init): Use it.
	* call.c (build_special_member_call): Use it.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 34c1d9b..d445163 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -8082,11 +8082,7 @@ build_special_member_call (tree instance, tree name, vec **args,
   vtt = decay_conversion (vtt, complain);
   if (vtt == error_mark_node)
 	return error_mark_node;
-  vtt = build3 (COND_EXPR, TREE_TYPE (vtt),
-		build2 (EQ_EXPR, boolean_type_node,
-			current_in_charge_parm, integer_zero_node),
-		current_vtt_parm,
-		vtt);
+  vtt = build_if_in_charge (vtt, current_vtt_parm);
   if (BINFO_SUBVTT_INDEX (binfo))
 	sub_vtt = fold_build_pointer_plus (vtt, BINFO_SUBVTT_INDEX (binfo));
   else
diff --git a/gcc/cp/class.c b/gcc/cp/class.c
index f8ecfa1..866a0a4 100644
--- a/gcc/cp/class.c
+++ b/gcc/cp/class.c
@@ -225,6 +225,24 @@ int n_convert_harshness = 0;
 int n_compute_conversion_costs = 0;
 int n_inner_fields_searched = 0;
 
+/* Return a COND_EXPR that executes TRUE_STMT if this execution of the
+   'structor is in charge of 'structing virtual bases, or FALSE_STMT
+   otherwise.  */
+
+tree
+build_if_in_charge (tree true_stmt, tree false_stmt)
+{
+  gcc_assert (DECL_HAS_IN_CHARGE_PARM_P (current_function_decl));
+  tree cmp = build2 (NE_EXPR, boolean_type_node,
+		 current_in_charge_parm, integer_zero_node);
+  tree type = unlowered_expr_type (true_stmt);
+  if (VOID_TYPE_P (type))
+type = unlowered_expr_type (false_stmt);
+  tree cond = build3 (COND_EXPR, type,
+		  cmp, true_stmt, false_stmt);
+  return cond;
+}
+
 /* Convert to or from a base subobject.  EXPR is an expression of type
`A' or `A*', an expression of type `B' or `B*' is returned.  To
convert A to a base B, CODE is PLUS_EXPR and BINFO is the binfo for
@@ -470,12 +488,9 @@ build_base_path (enum tree_code code,
 	/* Negative fixed_type_p means this is a constructor or destructor;
 	   virtual base layout is fixed in in-charge [cd]tors, but not in
 	   base [cd]tors.  */
-	offset = build3 (COND_EXPR, ptrdiff_type_node,
-			 build2 (EQ_EXPR, boolean_type_node,
- current_in_charge_parm, integer_zero_node),
-			 v_offset,
-			 convert_to_integer (ptrdiff_type_node,
-	 BINFO_OFFSET (binfo)));
+	offset = build_if_in_charge
+	  (convert_to_integer (ptrdiff_type_node, BINFO_OFFSET (binfo)),
+	   v_offset);
   else
 	offset = v_offset;
 }
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index a50d92c..497430a 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -5642,6 +5642,7 @@ extern tree get_function_version_dispatcher	(tree);
 
 /* in class.c */
 extern tree build_vfield_ref			(tree, tree);
+extern tree build_if_in_charge			(tree true_stmt, tree false_stmt = void_node);
 extern tree build_base_path			(enum tree_code, tree,
 		 tree, int, tsubst_flags_t);
 extern tree convert_to_base			(tree, tree, bool, bool,
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index 22c039b..aee3b84 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -1243,12 +1243,7 @@ expand_virtual_init (tree binfo, tree decl)
   /* The actual initializer is the VTT value only in the subobject
 	 constructor.  In maybe_clone_body we'll substitute NULL for
 	 the vtt_parm in the case of the non-subobject constructor.  */
-  vtbl = build3 (COND_EXPR,
-		 TREE_TYPE (vtbl),
-		 build2 (EQ_EXPR, boolean_type_node,
-			 current_in_charge_parm, integer_zero_node),
-		 vtbl2,
-		 vtbl);
+  vtbl = build_if_in_charge (vtbl, vtbl2);
 }
 
   /* Compute the location of the vtpr.  */
@@ -1741,11 +1736,7 @@ expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int flags,
 	, binfo, flags,
 	complain);
   base = fold_build_cleanup_point_expr (void_type_node, base);
-  rval = build3 (COND_EXPR, void_type_node,
-		 build2 (EQ_EXPR, boolean_type_node,
-			 current_in_charge_parm, integer_zero_node),
-		 base,
-		 complete);
+  rval = 

Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread H.J. Lu
On Tue, Mar 15, 2016 at 7:51 PM, Jason Merrill  wrote:
> On 03/15/2016 08:25 PM, Joseph Myers wrote:
>>
>> On Tue, 15 Mar 2016, H.J. Lu wrote:
>>
>>> On Tue, Mar 15, 2016 at 3:34 PM, Joseph Myers 
>>> wrote:

 On Tue, 15 Mar 2016, H.J. Lu wrote:

> On Tue, Mar 15, 2016 at 2:39 PM, Joseph Myers 
> wrote:
>>
>> I'm not sure if the zero-size arrays (a GNU extension) are considered
>> to
>> make a struct non-empty, but in any case I think the tests should
>> cover
>> such arrays as elements of structs.
>
>
> There are couple tests for structs with members of array
> of empty types.  testsuite/g++.dg/abi/empty14.h has


 My concern is the other way round - structs with elements such as
 "int a[0];", an array [0] of a nonempty type.  My reading of the
 subobject
 definition is that such an array should not cause the struct to be
 considered nonempty (it doesn't result in any int subobjects).
>>>
>>>
>>> This is a test for struct with zero-size array, which isn't treated
>>> as empty type.  C++ and C are compatible in its passing.
>>
>>
>> Where is the current definition of empty types you're proposing for use in
>> GCC?  Is the behavior of this case clear from that definition?
>
>
> "An empty type is a type where it and all of its subobjects (recursively)
> are of structure, union, or array type.  No memory slot nor register should
> be used to pass or return an object of empty type."
>
> It seems to me that such a struct should be considered an empty type under
> this definition, since a zero-length array has no subobjects.
>

Since the zero-size array is a GCC extension, we can change it.  Do we
want to change its passing for C?

-- 
H.J.
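
As a side note on the definition quoted above, the following sketch shows that it is broader than the standard `std::is_empty` trait: a struct whose only member is an array of empty structs has nothing but structure and array subobjects, yet `std::is_empty` rejects it. The type names are hypothetical; zero-length arrays are left out because they are a GNU extension and not portable.

```cpp
#include <cassert>
#include <type_traits>

struct Empty {};
struct ArrayOfEmpty { Empty e[2]; };  // subobjects: an array and two structs
struct NotEmpty { int x; };           // has a scalar subobject

static_assert(std::is_empty<Empty>::value, "no non-static data members");
static_assert(!std::is_empty<ArrayOfEmpty>::value,
              "std::is_empty counts the array member");
static_assert(!std::is_empty<NotEmpty>::value, "holds an int");

// The array member still occupies storage in the object representation,
// even though no scalar value is stored in it.
bool toy_array_of_empty_has_storage() {
  assert(sizeof(Empty) >= 1);
  return sizeof(ArrayOfEmpty) >= 2 * sizeof(Empty);
}
```

So a type can count as empty for argument passing under the quoted wording while still having a nonzero size and failing `std::is_empty`.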


Re: [PATCH] Fix PR c++/70121 (premature folding of const var that was implicitly captured)

2016-03-19 Thread Patrick Palka
On Thu, Mar 10, 2016 at 6:06 PM, Patrick Palka  wrote:
> On Thu, Mar 10, 2016 at 5:58 PM, Patrick Palka  wrote:
>> Within a lambda we should implicitly capture an outer const variable
>> only if it's odr-used in the body of the lambda.  But we are currently
>> making the decision of whether to capture such a variable, or else to
>> fold it to a constant, too early -- before we can know whether it's
>> being odr-used or not.  So we currently always fold a const variable to
>> a constant if possible instead of otherwise capturing it, but of course
>> doing this is wrong if e.g. the address of this variable is taken inside
>> the lambda's body.
>>
>> This patch reverses the behavior of process_outer_var_ref, so that we
>> always implicitly capture a const variable if it's capturable, instead
>> of always trying to first fold it to a constant.  This behavior however
>> is wrong too, and introduces a different but perhaps less important
>> regression: if we implicitly capture by value a const object that is not
>> actually odr-used within the body of the lambda, we may introduce a
>> redundant call to its copy/move constructor, see pr70121-2.C.
>>
>> Ideally we should be capturing a variable only if it's not odr-used
>
> Er, this sentence should read
>
>   Ideally we should be _implicitly_ capturing a variable only if it
> _is_ odr-used ...

Ping.
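
A short standard C++ sketch of the odr-use distinction discussed in the quoted text. The function names are made up, and this is not one of the pr70121 testcases: reading the value of a const int with a constant initializer needs no capture, while taking its address is an odr-use and forces an implicit capture.

```cpp
#include <cassert>

// Reading the value of a const int with a constant initializer is not an
// odr-use, so the lambda needs no capture at all.
int toy_read_without_capture() {
  const int n = 42;
  auto f = [] { return n; };
  return f();
}

// Taking the address is an odr-use, so [=] captures n by value, and &n in
// the body names the closure's own copy rather than the outer variable.
bool toy_address_is_of_copy() {
  const int n = 42;
  auto g = [=] { return &n; };
  const int *inner = g();
  assert(*inner == 42);
  return inner != &n;
}
```

Folding n to a constant too early would be harmless in the first function but wrong in the second, which is the situation the patch description is about.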


Re: [PATCH] c++/67376 Comparison with pointer to past-the-end, of array fails inside constant expression

2016-03-19 Thread Jeff Law

On 03/14/2016 04:13 PM, Jakub Jelinek wrote:

> On Mon, Mar 14, 2016 at 03:25:07PM -0600, Martin Sebor wrote:
>
>> PR c++/67376 - [5/6 regression] Comparison with pointer to past-the-end
>> of array fails inside constant expression
>> PR c++/70170 - [6 regression] bogus not a constant expression error comparing
>> pointer to array to null
>> PR c++/70172 - incorrect reinterpret_cast from integer to pointer error
>> on invalid constexpr initialization
>> PR c++/60760 - arithmetic on null pointers should not be allowed in constant
>> expressions
>> PR c++/70228 - insufficient detail in diagnostics for a constexpr out of bounds
>> array subscript
>
>
> Can you please check up the formatting in the patch?
> Seems e.g. you've replaced tons of tabs with 8 spaces etc. (check your
> editor setting, and check the patch with contrib/check-GNU-style.sh).
> There is some trailing whitespace too, spaces before [, etc.

Jakub, do you have any comments on the substance of the patch?  If so, 
it would help immensely if you could provide them so that Martin could 
address technical issues at the same time as he fixes up whitespace nits.


jeff


Re: [RFA][PR rtl-optimization/70263] Fix creation of new REG_EQUIV notes

2016-03-19 Thread Jakub Jelinek
On Fri, Mar 18, 2016 at 04:05:08PM -0400, David Malcolm wrote:
> On Fri, 2016-03-18 at 13:20 -0600, Jeff Law wrote:
> > On 03/18/2016 01:16 PM, Bernd Schmidt wrote:
> > > On 03/18/2016 08:14 PM, Jeff Law wrote:
> > > > I also added a blurb to the dump file when we create these
> > > > equivalences
> > > > and included a test to verify the code fires.  I verified it
> > > > fired on
> > > > x86 and x86-64.  It may or may not fire on other targets, so I
> > > > left the
> > > > test in the i386 specific subdirectory.
> > > 
> > > This is the sort of thing I'd want to do with rtl unit tests.
> 
> Would you like an RTL frontend?  (and something like a 
>  gcc/testsuite/rtl.dg/ )

That really shouldn't be hard, we already have RTL readers and RTL writers,
of course e.g. stuff where RTL refers to trees will be harder (or we could
just not fill it in).

Jakub


Fix 70278 (LRA split_regs followup patch)

2016-03-19 Thread Bernd Schmidt
This fixes an oversight in my previous patch here. I used biggest_mode 
on the assumption that if the reg was used in the function, it would be 
set to something other than VOIDmode, but that fails if we have a 
multiword access - only the first hard reg gets its biggest_mode 
assigned in that case.


Bootstrapped and tested on x86_64-linux, ran (just) the new arm testcase 
manually with arm-eabi. Ok?


(The testcase seems to be from glibc. Do we keep the copyright notices 
on the reduced form?)



Bernd
	PR rtl-optimization/70278
	* lra-constraints.c (split_reg): Handle the case where biggest_mode is
	VOIDmode.

testsuite/
	* gcc.dg/torture/pr70278.c: New test.
	* gcc.target/arm/pr70278.c: New test.

Index: gcc/lra-constraints.c
===
--- gcc/lra-constraints.c	(revision 234184)
+++ gcc/lra-constraints.c	(working copy)
@@ -4982,7 +4982,12 @@ split_reg (bool before_p, int original_r
   nregs = 1;
   mode = lra_reg_info[hard_regno].biggest_mode;
   machine_mode reg_rtx_mode = GET_MODE (regno_reg_rtx[hard_regno]);
-  if (GET_MODE_SIZE (mode) > GET_MODE_SIZE (reg_rtx_mode))
+  /* A reg can have a biggest_mode of VOIDmode if it was only ever seen
+	 as part of a multi-word register.  In that case, or if the biggest
+	 mode was larger than a register, just use the reg_rtx.  Otherwise,
+	 limit the size to that of the biggest access in the function.  */
+  if (mode == VOIDmode
+	  || GET_MODE_SIZE (mode) > GET_MODE_SIZE (reg_rtx_mode))
 	{
 	  original_reg = regno_reg_rtx[hard_regno];
 	  mode = reg_rtx_mode;
Index: gcc/testsuite/gcc.dg/torture/pr70278.c
===
--- gcc/testsuite/gcc.dg/torture/pr70278.c	(revision 0)
+++ gcc/testsuite/gcc.dg/torture/pr70278.c	(working copy)
@@ -0,0 +1,37 @@
+/* { dg-do compile } */
+/*
+ * 
+ * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
+ *
+ * Developed at SunPro, a Sun Microsystems, Inc. business.
+ * Permission to use, copy, modify, and distribute this
+ * software is freely granted, provided that this notice 
+ * is preserved.
+ * 
+ */
+typedef union
+{
+  double value;
+  struct
+  {
+unsigned int msw;
+  } parts;
+} ieee_double_shape_type;
+double __ieee754_hypot(double x, double y)
+{
+ double a=x,b=y,t1,t2,y1,y2,w;
+ int j,k,ha,hb;
+ do { ieee_double_shape_type gh_u; gh_u.value = (x); (ha) = gh_u.parts.msw; } while (0);;
+ if(hb > ha) {a=y;b=x;j=ha; ha=hb;hb=j;} else {a=x;b=y;}
+ if(ha > 0x5f30) {
+do { ieee_double_shape_type sh_u; sh_u.value = (a); sh_u.parts.msw = (ha); (a) = sh_u.value; } while (0);;
+ }
+ w = a-b;
+ if (w <= b)
+ {
+ t2 = a - t1;
+ w = t1*y1-(w*(-w)-(t1*y2+t2*b));
+ }
+ if(k!=0) {
+ } else return w;
+}
Index: gcc/testsuite/gcc.target/arm/pr70278.c
===
--- gcc/testsuite/gcc.target/arm/pr70278.c	(revision 0)
+++ gcc/testsuite/gcc.target/arm/pr70278.c	(working copy)
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-march=*" } { "-march=armv4t" } } */
+/* { dg-skip-if "avoid conflicting multilib options" { *-*-* } { "-marm" } { "" } } */
+/* { dg-options "-mthumb" } */
+/* { dg-add-options arm_arch_v4t } */
+/*
+ * 
+ * Copyright (C) 1993 by Sun Microsystems, Inc. All rights reserved.
+ *
+ * Developed at SunPro, a Sun Microsystems, Inc. business.
+ * Permission to use, copy, modify, and distribute this
+ * software is freely granted, provided that this notice 
+ * is preserved.
+ * 
+ */
+typedef union
+{
+  double value;
+  struct
+  {
+unsigned int msw;
+  } parts;
+} ieee_double_shape_type;
+double __ieee754_hypot(double x, double y)
+{
+ double a=x,b=y,t1,t2,y1,y2,w;
+ int j,k,ha,hb;
+ do { ieee_double_shape_type gh_u; gh_u.value = (x); (ha) = gh_u.parts.msw; } while (0);;
+ if(hb > ha) {a=y;b=x;j=ha; ha=hb;hb=j;} else {a=x;b=y;}
+ if(ha > 0x5f30) {
+do { ieee_double_shape_type sh_u; sh_u.value = (a); sh_u.parts.msw = (ha); (a) = sh_u.value; } while (0);;
+ }
+ w = a-b;
+ if (w <= b)
+ {
+ t2 = a - t1;
+ w = t1*y1-(w*(-w)-(t1*y2+t2*b));
+ }
+ if(k!=0) {
+ } else return w;
+}



Re: C++ PATCH to fix missing warning (PR c++/70194)

2016-03-19 Thread Martin Sebor

On 03/17/2016 10:48 AM, Patrick Palka wrote:

On Thu, Mar 17, 2016 at 12:27 PM, Jeff Law  wrote:

On 03/16/2016 06:43 PM, Martin Sebor wrote:


@@ -3974,6 +3974,38 @@ build_vec_cmp (tree_code code, tree type,
 return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
   }

+/* Possibly warn about an address never being NULL.  */
+
+static void
+warn_for_null_address (location_t location, tree op, tsubst_flags_t
complain)
+{


...


+  if (TREE_CODE (cop) == ADDR_EXPR
+  && decl_with_nonnull_addr_p (TREE_OPERAND (cop, 0))
+  && !TREE_NO_WARNING (cop))
+warning_at (location, OPT_Waddress, "the address of %qD will never "
+"be NULL", TREE_OPERAND (cop, 0));
+
+  if (CONVERT_EXPR_P (op)
+  && TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) == REFERENCE_TYPE)
+{
+  tree inner_op = op;
+  STRIP_NOPS (inner_op);
+
+  if (DECL_P (inner_op))
+warning_at (location, OPT_Waddress,
+"the compiler can assume that the address of "
+"%qD will never be NULL", inner_op);



Since I noted the subtle differences between the phrasing of
the various -Waddress warnings in the bug, I have to ask: what is
the significance of the difference between the two warnings here?

Would it not be appropriate to issue the first warning in the latter
case?  Or perhaps even use the same text as is already used elsewhere:
"the address of %qD will always evaluate as ‘true’" (since it may not
be the macro NULL that's mentioned in the expression).


They were added at different times AFAICT.  The former is fairly old
(Douglas Gregor, 2008) at this point.  The latter was added by Patrick Palka
for 65168 about a year ago.

You could directly ask Patrick about motivations for a different message.


There is no plausible way for the address of a non-reference variable
to be NULL even in code with UB (aside from __attribute__ ((weak)) in
which case the warning is suppressed).  But the address of a reference
can easily seem to be NULL if one performs UB and assigns to it *(int
*)NULL or something like that.  I think that was my motivation, anyway
:)


Thanks (everyone) for the explanation.

I actually think the warning Patrick added is the most accurate
and would be appropriate in all cases.

I suppose what bothers me besides the mention of NULL even when
there is no NULL in the code, is that a) the text of the warnings
is misleading (contradictory) in some interesting cases, and b)
I can't think of a way in which the difference between the three
phrasings of the diagnostic could be useful to a user.  All three
imply the same thing: compilers can and GCC in some cases does
assume that the address of an ordinary (non weak) function, object,
or reference is not null.

To see (a), consider the invalid test program below, in which
GCC makes this assumption about the address of i even though
the warning doesn't mention it (but it makes a claim that's
contrary to the actual address), yet doesn't make the same
assumption about the address of the reference even though
the diagnostic says it can.

While I would find the warning less misleading if it simply said
in all three cases: "the address of 'x' will always evaluate as
‘true’" I think it would be even more accurate if it said
"the address of 'x' may be assumed to evaluate to ‘true’"  That
avoids making claims about whether or not it actually is null,
doesn't talk about the NULL macro when one isn't used in the
code, and by saying "may assume" it allows for both making
the assumption as well as not making one.

I'm happy to submit a patch to make this change in stage 1 if
no one objects to it.

Martin

$ cat x.c && /home/msebor/build/gcc-trunk-svn/gcc/xgcc 
-B/home/msebor/build/gcc-trunk-svn/gcc -c -xc++ x.c && 
/home/msebor/build/gcc-trunk-svn/gcc/xgcc 
-B/home/msebor/build/gcc-trunk-svn/gcc -DMAIN -Wall -Wextra -Wpedantic 
x.o -xc++ x.c && ./a.out

#if MAIN

extern int i;
extern int &r;

extern void f ();

int main ()
{
f ();

#define TEST(x) __builtin_printf ("%s is %s\n", #x, (x) ? "true" : "false")

TEST (&i != 0);
TEST (&r != 0);
TEST (&i);
}

#else
extern __attribute__ ((weak)) int i;
int &r = i;

void f ()
{
__builtin_printf ("&i = %p\n&r = %p\n", &i, &r);
}

#endif
x.c: In function ‘int main()’:
x.c:14:17: warning: the address of ‘i’ will never be NULL [-Waddress]
 TEST (&i != 0);
 ^
x.c:12:54: note: in definition of macro ‘TEST’
 #define TEST(x) __builtin_printf ("%s is %s\n", #x, (x) ? "true" : 
"false")

  ^
x.c:15:14: warning: the compiler can assume that the address of ‘r’ will 
never be NULL [-Waddress]

 TEST (&r != 0);
   ~~~^~~~
x.c:12:54: note: in definition of macro ‘TEST’
 #define TEST(x) __builtin_printf ("%s is %s\n", #x, (x) ? "true" : 
"false")

  ^
x.c:12:68: warning: the address of ‘i’ will always evaluate as ‘true’ 
[-Waddress]
 #define TEST(x) 

Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread H.J. Lu
On Wed, Mar 16, 2016 at 2:45 AM, Bernhard Reutner-Fischer
 wrote:
> On March 16, 2016 3:17:20 AM GMT+01:00, "H.J. Lu"  wrote:
>
>>> Where is the current definition of empty types you're proposing for
>>use in
>>> GCC?  Is the behavior of this case clear from that definition?
>>
>>https://gcc.gnu.org/ml/gcc/2016-03/msg00071.html
>>
>>Jason's patch follows it.  Here is a test for struct with zero-size
>>array of empty type, which is treated as empty type.
>
> index 000..489eb3a
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/abi/empty19.C
> @@ -0,0 +1,17 @@
> +// PR c++/60336
> +// { dg-do run }
> +// { dg-options "-Wabi=9 -x c" }
> +// { dg-additional-sources "empty14a.c" }
>
> 14a ? Not 19a ?
> Thanks
>
>

Here is the updated patch.

-- 
H.J.
From d7da4b56dddbd75da163b9fd3cc9ff4241be6ca9 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Tue, 15 Mar 2016 19:14:30 -0700
Subject: [PATCH] Add a test for struct with zero-size array of empty type

---
 gcc/testsuite/g++.dg/abi/empty19.C  | 17 +
 gcc/testsuite/g++.dg/abi/empty19.h  | 10 ++
 gcc/testsuite/g++.dg/abi/empty19a.c |  6 ++
 3 files changed, 33 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/abi/empty19.C
 create mode 100644 gcc/testsuite/g++.dg/abi/empty19.h
 create mode 100644 gcc/testsuite/g++.dg/abi/empty19a.c

diff --git a/gcc/testsuite/g++.dg/abi/empty19.C b/gcc/testsuite/g++.dg/abi/empty19.C
new file mode 100644
index 000..e3e855a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/empty19.C
@@ -0,0 +1,17 @@
+// PR c++/60336
+// { dg-do run }
+// { dg-options "-Wabi=9 -x c" }
+// { dg-additional-sources "empty19a.c" }
+// { dg-prune-output "command line option" }
+
+#include "empty19.h"
+extern "C" void fun(struct dummy, struct foo);
+
+int main()
+{
+  struct dummy d;
+  struct foo f = { -1, -2, -3, -4, -5 };
+
+  fun(d, f); // { dg-warning "empty" }
+  return 0;
+}
diff --git a/gcc/testsuite/g++.dg/abi/empty19.h b/gcc/testsuite/g++.dg/abi/empty19.h
new file mode 100644
index 000..616b87b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/empty19.h
@@ -0,0 +1,10 @@
+struct dummy0 { };
+struct dummy { struct dummy0 d[0]; };
+struct foo
+{
+  int i1;
+  int i2;
+  int i3;
+  int i4;
+  int i5;
+};
diff --git a/gcc/testsuite/g++.dg/abi/empty19a.c b/gcc/testsuite/g++.dg/abi/empty19a.c
new file mode 100644
index 000..767b1eb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/abi/empty19a.c
@@ -0,0 +1,6 @@
+#include "empty19.h"
+void fun(struct dummy d, struct foo f)
+{
+  if (f.i1 != -1)
+__builtin_abort();
+}
-- 
2.5.0



Re: [PATCH] Fix rs6000 vector builtin macro handling if it is followed by a fn-like macro without arguments (PR target/70296)

2016-03-19 Thread David Edelsohn
On Fri, Mar 18, 2016 at 5:34 PM, Jakub Jelinek  wrote:
> Hi!
>
> The following testcase is diagnosed as erroneous, because the preprocessor
> mishandles
>
> #define c(x) x
> vector c;
>
> and
>
> #define int(x) x
> vector int n;
>
> The thing is if a function-like macro is not followed by (, then it is kept
> as is, but the builtin conditional macro handling expects it always expands
> as something and calls cpp_get_token on it.  For non-function-like macros
> or function-like macros followed by ( that is not a problem, that
> cpp_get_token call just eats the macro token and pushes instead the
> replacement tokens, but for function-like macro not followed by ( it results
> in the token being dropped on the floor.
> So, in the above mentioned cases we preprocess it as
> vector ;
> and
> vector n;
> and when compiling, error on the first one, and (due to previous
> typedef int vector;) handle it at int n; rather than
> __attribute__((__vector)) int n;
>
> Fixed by peeking at the next token after the macro token (or more, if there
> are CPP_PADDING tokens) and if it is not followed by CPP_OPEN_PAREN, not
> calling cpp_get_token.  Unfortunately, cpp_macro structure is opaque outside
> of libcpp, so I had to add a helper function into libcpp.
>
> Bootstrapped/regtested on powerpc64{,le}-linux, ok for trunk?
>
> 2016-03-18  Jakub Jelinek  
>
> PR target/70296
> * include/cpplib.h (cpp_fun_like_macro_p): New prototype.
> * macro.c (cpp_fun_like_macro_p): New function.
>
> * config/rs6000/rs6000-c.c (rs6000_macro_to_expand): If IDENT is
> function-like macro, peek following token(s) if it is followed
> by CPP_OPEN_PAREN token with optional padding in between, and
> if not, don't treat it like a macro.
>
> * gcc.target/powerpc/altivec-36.c: New test.

I'm not an expert in this part of the compiler, but the rs6000 bits
are fine with me.

Thanks, David
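
The preprocessor rule the quoted description relies on can be sketched as below: a function-like macro name that is not followed by ( is left alone as an ordinary identifier. This is a plain illustration, not the altivec-36.c testcase, and it does not involve the rs6000 conditional vector keyword; the function name is made up for the example.

```cpp
#include <cassert>

#define c(x) x

// Here 'c' is followed by '=', not '(', so the macro is not invoked and
// this declares an ordinary variable named c.
int c = 5;

// c(c) is a macro invocation and expands to c; the trailing c is again
// just the variable.
int sum_both_uses()
{
  return c(c) + c;
}
```

A hook that calls cpp_get_token on such a name unconditionally would, per the description above, drop the identifier instead of keeping it.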


Re: [C PATCH] Prevent -Wunused-value warning with __atomic_fetch_* (PR c/69407)

2016-03-19 Thread Jeff Law

On 03/14/2016 05:48 AM, Marek Polacek wrote:

Ping.

On Fri, Mar 04, 2016 at 07:03:09PM +0100, Marek Polacek wrote:

On Fri, Mar 04, 2016 at 06:41:26PM +0100, Jakub Jelinek wrote:

I'm ok with it for gcc6.


Cool.


But IMHO you should add dg-bogus directives here.


Ok, version with dg-bogus:

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-03-04  Marek Polacek  

PR c/69407
* c-common.c (resolve_overloaded_builtin): Set TREE_USED for the fetch
operations.

* gcc.dg/atomic-op-6.c: New test.

OK.
jeff



Re: C++ PATCH to fix missing warning (PR c++/70194)

2016-03-19 Thread Jeff Law

On 03/17/2016 10:45 AM, Jason Merrill wrote:

On 03/16/2016 08:43 PM, Martin Sebor wrote:

@@ -3974,6 +3974,38 @@ build_vec_cmp (tree_code code, tree type,
return build3 (VEC_COND_EXPR, type, cmp, minus_one_vec, zero_vec);
  }

+/* Possibly warn about an address never being NULL.  */
+
+static void
+warn_for_null_address (location_t location, tree op, tsubst_flags_t
complain)
+{

...

+  if (TREE_CODE (cop) == ADDR_EXPR
+  && decl_with_nonnull_addr_p (TREE_OPERAND (cop, 0))
+  && !TREE_NO_WARNING (cop))
+warning_at (location, OPT_Waddress, "the address of %qD will
never "
+"be NULL", TREE_OPERAND (cop, 0));
+
+  if (CONVERT_EXPR_P (op)
+  && TREE_CODE (TREE_TYPE (TREE_OPERAND (op, 0))) ==
REFERENCE_TYPE)
+{
+  tree inner_op = op;
+  STRIP_NOPS (inner_op);
+
+  if (DECL_P (inner_op))
+warning_at (location, OPT_Waddress,
+"the compiler can assume that the address of "
+"%qD will never be NULL", inner_op);


Since I noted the subtle differences between the phrasing of
the various -Waddress warnings in the bug, I have to ask: what is
the significance of the difference between the two warnings here?


The difference is that in the second case, a reference could be bound to
a null address, but that has undefined behavior, so the compiler can
assume it won't happen.
So the first can't happen, the second could, but would be considered 
undefined behavior.


jeff
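
The two cases Jason and Jeff distinguish can be sketched as below, assuming a validly bound reference (no undefined behavior anywhere). The names are made up for the example. A compiler may diagnose both comparisons under -Waddress; the exact wording of the diagnostics is what the thread is discussing.

```cpp
#include <cassert>

int object = 0;
int &reference = object;   // validly bound, so its address is &object

// First case: the address of an ordinary (non-weak) object cannot be null.
bool object_address_is_nonnull()
{
  return &object != nullptr;
}

// Second case: a reference could only appear to have a null address after
// undefined behavior, so the compiler may assume this is true as well.
bool reference_address_is_nonnull()
{
  return &reference != nullptr;
}
```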


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread H.J. Lu
On Tue, Mar 15, 2016 at 12:32 PM, Jason Merrill  wrote:
> On 03/15/2016 12:00 PM, H.J. Lu wrote:
>>
>> On Tue, Mar 15, 2016 at 8:35 AM, Jason Merrill  wrote:
>>>
>>> I'm concerned about how this patch changes both target-independent code
>>> and
>>> target-specific code, with a passing remark that other targets might need
>>> to
>>> make similar changes.  I'm also concerned about the effect of this on
>>> other
>>> languages that might not want the same change.  So, here's an alternative
>>> patch that implements the change in the front end (and includes your
>>> testcases, thanks!).
>>>
>>> Thoughts?
>>
>>
>> On x86-64, I got
>>
>>
>> /export/gnu/import/git/sources/gcc/libstdc++-v3/src/c++11/cxx11-shim_facets.cc:273:23:
>> error: empty class ‘std::__facet_shims::other_abi {aka
>> std::integral_constant}’ parameter passing ABI changes in
>> -fabi-version=10 (GCC 6) [-Werror=abi]
>>  __collate_transform(other_abi{}, _M_get(), st, lo, hi);
>
>
> Right, need to remove the -Werror=abi bit from the patch until Jonathan
> updates libstdc++.
>
> Jason
>

I got

FAIL: g++.dg/abi/pr60336-1.C   scan-assembler jmp[\t ]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-5.C   scan-assembler jmp[\t ]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-6.C   scan-assembler jmp[\t ]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-7.C   scan-assembler jmp[\t ]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-9.C   scan-assembler jmp[\t ]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr68355.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx17integral_constantIbLb1EE

They are expected since get_ref_base_and_extent needs to be
changed to set bitsize to 0 for empty types so that when
ref_maybe_used_by_call_p_1 calls get_ref_base_and_extent to
get 0 as the maximum size on empty type.  Otherwise, find_tail_calls
won't perform tail call optimization for functions with empty type
parameters.

-- 
H.J.


Re: C PATCH for c/70093 (ICE with nested-function returning VM type)

2016-03-19 Thread Jakub Jelinek
On Wed, Mar 16, 2016 at 04:11:56PM +0100, Marek Polacek wrote:
> > 2016-03-09  Marek Polacek  
> > 
> > PR c/70093
> > * c-typeck.c (build_function_call_vec): Create a TARGET_EXPR for
> > nested functions returning VM types.
> > 
> > * cgraphunit.c (cgraph_node::expand_thunk): Also build call to the
> > function being thunked if the result type doesn't have fixed size.
> > * gimplify.c (gimplify_modify_expr): Also set LHS if the result type
> > doesn't have fixed size.
> > 
> > * gcc.dg/nested-func-10.c: New test.
> > * gcc.dg/nested-func-9.c: New test.

Ok, thanks.

Jakub


Re: [PATCH] Retry to emit global variables in HSA (PR hsa/70234)

2016-03-19 Thread Martin Jambor
Hi,

On Tue, Mar 15, 2016 at 12:59:03PM +0100, Martin Liska wrote:
> Hi.
> 
> As emission of a HSAIL function can fail for various reasons (-Whsa),
> we must guarantee that a global variable is declared, and at most once.
> 
> Following patch does that, patch can survive make check-target-libgomp and
> HSAILAsm is happy with BRIG output of declare_target-5.c source file.
> 
> Currently, I'm running bootstrap on x86_64-linux-gnu.
> Ready to install after it finishes?
> 
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2016-03-15  Martin Liska  
> 
>   PR hsa/70234
>   * hsa-brig.c (emit_function_directives): Mark unemitted
>   global variables for emission.
>   * hsa-gen.c (hsa_symbol::hsa_symbol): Initialize a new flag.
>   (get_symbol_for_decl): Likewise.
>   * hsa.h (struct hsa_symbol): New flag.
> ---
>  gcc/hsa-brig.c |  2 ++
>  gcc/hsa-gen.c  | 22 +++---
>  gcc/hsa.h  |  3 +++
>  3 files changed, 24 insertions(+), 3 deletions(-)
> 
> diff --git a/gcc/hsa-brig.c b/gcc/hsa-brig.c
> index 2a301be..9b6c0b8 100644
> --- a/gcc/hsa-brig.c
> +++ b/gcc/hsa-brig.c
> @@ -643,6 +643,8 @@ emit_function_directives (hsa_function_representation *f, 
> bool is_declaration)
>if (!f->m_declaration_p)
>  for (int i = 0; f->m_global_symbols.iterate (i, &sym); i++)
>{
> + gcc_assert (!sym->m_emitted_to_brig);
> + sym->m_emitted_to_brig = true;
>   emit_directive_variable (sym);
>   brig_insn_count++;
>}
> diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
> index 5939a57..473d4bd 100644
> --- a/gcc/hsa-gen.c
> +++ b/gcc/hsa-gen.c
> @@ -162,7 +162,7 @@ hsa_symbol::hsa_symbol ()
>  m_directive_offset (0), m_type (BRIG_TYPE_NONE),
>  m_segment (BRIG_SEGMENT_NONE), m_linkage (BRIG_LINKAGE_NONE), m_dim (0),
>  m_cst_value (NULL), m_global_scope_p (false), m_seen_error (false),
> -m_allocation (BRIG_ALLOCATION_AUTOMATIC)
> +m_allocation (BRIG_ALLOCATION_AUTOMATIC), m_emitted_to_brig (false)
>  {
>  }
>  
> @@ -174,7 +174,7 @@ hsa_symbol::hsa_symbol (BrigType16_t type, BrigSegment8_t 
> segment,
>  m_directive_offset (0), m_type (type), m_segment (segment),
>  m_linkage (linkage), m_dim (0), m_cst_value (NULL),
>  m_global_scope_p (global_scope_p), m_seen_error (false),
> -m_allocation (allocation)
> +m_allocation (allocation), m_emitted_to_brig (false)
>  {
>  }
>  
> @@ -880,11 +880,27 @@ get_symbol_for_decl (tree decl)
>gcc_checking_assert (slot);
>if (*slot)
>  {
> +  hsa_symbol *sym = (*slot);
> +
>/* If the symbol is problematic, mark current function also as
>problematic.  */
> -  if ((*slot)->m_seen_error)
> +  if (sym->m_seen_error)
>   hsa_fail_cfun ();
>  
> +  /* PR hsa/70234: If a global variable was marked to be emitted,
> +  but HSAIL generation of a function using the variable fails,
> +  we should retry to emit the variable in context of a different
> +  function.
> +
> +  Iterate elements whether a symbol is already in m_global_symbols
> +  of not.  */
> +  for (unsigned i = 0; i < hsa_cfun->m_global_symbols.length (); i++)
> + if (hsa_cfun->m_global_symbols[i] == sym)
> +   return *slot;
> +
> +  if (is_in_global_vars && !sym->m_emitted_to_brig)
> + hsa_cfun->m_global_symbols.safe_push (sym);
> +

Hopefully the linear search in m_global_symbols never becomes
prohibitively expensive.  But it is only necessary when
is_in_global_vars is true, so at least we could do something like:

  if (is_in_global_vars && !sym->m_emitted_to_brig)
{
  for (unsigned i = 0; i < hsa_cfun->m_global_symbols.length (); i++)
if (hsa_cfun->m_global_symbols[i] == sym)
  return *slot;
hsa_cfun->m_global_symbols.safe_push (sym);
}

OK with that change.  And even though I have seen the bug only on the
hsa branch, commit the fix to trunk too, I think it can happen there
as well.

Thanks a lot,

Martin
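
The shape of the suggested change, doing the cheap flag checks first so the linear search only runs when a push is actually possible, can be sketched with standard containers as below. This is not the hsa-gen.c code: the symbol struct, the function name and the use of std::vector are stand-ins invented for the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for hsa_symbol; only the flag that matters here.
struct symbol
{
  bool emitted = false;
};

// Queue SYM for emission unless it is not a global, has already been
// emitted, or is already queued.  Returns true if SYM was pushed.
bool queue_for_emission(std::vector<symbol *> &pending, symbol *sym,
                        bool is_global)
{
  // Cheap checks first: most calls return here without any search.
  if (!is_global || sym->emitted)
    return false;

  // Linear search, only reached when a push is still possible.
  for (symbol *queued : pending)
    if (queued == sym)
      return false;

  pending.push_back(sym);
  return true;
}
```

If the list ever grew large, a set keyed on the pointer would avoid the search entirely, but the thread does not suggest that this is needed.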


Re: [PATCH, PR tree-optimization/70251] Disable VEC_COND_EXPR transformation into VIEW_CONVERT_EXPR for scalar mask case

2016-03-19 Thread Richard Biener
On Thu, Mar 17, 2016 at 11:23 AM, Ilya Enkovich  wrote:
> Hi,
>
> This patch disables two match.pd patterns which transform
> VEC_COND_EXPR into simple conversion in case it uses a scalar mask.
> The patch was bootstrapped and regtested on x86_64-pc-linux-gnu +
> separate check for new test on SDE.  OK for trunk?

Ok.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-03-17  Ilya Enkovich  
>
> * match.pd (A + (B vcmp C ? 1 : 0) -> A - (B vcmp C)): Apply
> for boolean vector with vector mode only.
> (A - (B vcmp C ? 1 : 0) -> A + (B vcmp C)): Likewise.
>
> gcc/testsuite/
>
> 2016-03-17  Ilya Enkovich  
>
> * gcc.target/i386/pr70251.c: New test.
>
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index 112deb3..7245ff4 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -1759,6 +1759,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (plus:c @3 (view_convert? (vec_cond @0 integer_each_onep@1 
> integer_zerop@2)))
>   (if (VECTOR_TYPE_P (type)
> +  && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (@0)))
>&& TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
>&& (TYPE_MODE (TREE_TYPE (type))
>== TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
> @@ -1768,6 +1769,7 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>  (simplify
>   (minus @3 (view_convert? (vec_cond @0 integer_each_onep@1 integer_zerop@2)))
>   (if (VECTOR_TYPE_P (type)
> +  && VECTOR_MODE_P (TYPE_MODE (TREE_TYPE (@0)))
>&& TYPE_VECTOR_SUBPARTS (type) == TYPE_VECTOR_SUBPARTS (TREE_TYPE (@0))
>&& (TYPE_MODE (TREE_TYPE (type))
>== TYPE_MODE (TREE_TYPE (TREE_TYPE (@0)
> diff --git a/gcc/testsuite/gcc.target/i386/pr70251.c 
> b/gcc/testsuite/gcc.target/i386/pr70251.c
> new file mode 100644
> index 000..97078cd
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr70251.c
> @@ -0,0 +1,52 @@
> +/* { dg-do run } */
> +/* { dg-options "-O3 -mavx512bw" } */
> +/* { dg-require-effective-target avx512bw } */
> +
> +#define AVX512BW
> +#include "avx512f-helper.h"
> +
> +unsigned long long int
> +hash(unsigned long long int seed, unsigned long long int v)
> +{
> +  return seed ^ (v + 0x9e3779b9 + (seed<<6) + (seed>>2));
> +}
> +
> +unsigned int a [100];
> +signed char b [100];
> +signed char c [100];
> +
> +void
> +init ()
> +{
> +  for (int i = 0; i < 100; ++i)
> +{
> +  a [i] = 1000L;
> +  b [i] = 10;
> +  c [i] = 5;
> +}
> +}
> +
> +void
> +foo ()
> +{
> +  for (int i = 0; i < 100; ++i)
> +b [i] = (!b [i] ^ (a [i] >= b [i])) + c [i];
> +}
> +
> +unsigned long long int
> +checksum ()
> +{
> +  unsigned long long int seed = 0ULL;
> +  for (int i = 0; i < 100; ++i)
> +seed = hash (seed, b[i]);
> +  return seed;
> +}
> +
> +void
> +TEST ()
> +{
> +  init ();
> +  foo ();
> +  if (checksum () != 5785906989299578598ULL)
> +__builtin_abort ();
> +}


Re: [AArch64] Emit square root using the Newton series

2016-03-19 Thread James Greenhalgh
On Wed, Mar 16, 2016 at 02:45:37PM -0500, Evandro Menezes wrote:
> On 03/08/16 16:08, Evandro Menezes wrote:
> >On 02/16/16 14:56, Evandro Menezes wrote:
> >>On 12/08/15 15:35, Evandro Menezes wrote:
> >>>Emit square root using the Newton series
> >>>
> >>>   2015-12-03  Evandro Menezes  
> >>>
> >>>   gcc/
> >>>* config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt):
> >>>   Declare new
> >>>function.
> >>>* config/aarch64/aarch64-simd.md (sqrt2): New
> >>>   expansion and
> >>>insn definitions.
> >>>* config/aarch64/aarch64-tuning-flags.def
> >>>(AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.
> >>>* config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define
> >>>   new function.
> >>>* config/aarch64/aarch64.md (sqrt2): New expansion
> >>>   and insn
> >>>definitions.
> >>>* config/aarch64/aarch64.opt (mlow-precision-recip-sqrt):
> >>>   Expand option
> >>>description.
> >>>* doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.
> >>>
> >>>This patch extends the patch that added support for
> >>>implementing x^-1/2 using the Newton series by adding support
> >>>for x^1/2 as well.
> >>>
> >>>Is it OK at this point of stage 3?
> >>>
> >>>Thank you,
> >>>
> >>
> >>James,
> >>
> >>As I was saying, this patch results in some validation errors in
> >>CPU2000 benchmarks using DF.  Although proving the algorithm to
> >>be pretty solid with a vast set of random values, I'm confused
> >>why some benchmarks fail to validate with this implementation of
> >>the Newton series for square root too, when they pass with the
> >>Newton series for reciprocal square root.
> >>
> >>Since I had no problems with the same algorithm on x86-64, I
> >>wonder if the initial estimate on AArch64, which offers just 8
> >>bits, whereas x86-64 offers 11 bits, has to do with it.  Then
> >>again, the algorithm iterated 1 less time on x86-64 than on
> >>AArch64.
> >>
> >>Since it seems that the initial estimate is sufficient for
> >>CPU2000 to validate when using SF, I'm leaning towards
> >>restricting the Newton series for square root only for SF.
> >>
> >>Your thoughts on the matter are appreciated,
> >
> >Add choices for the reciprocal square root approximation
> >
> >Allow a target to prefer such operation depending on the FP
> >   precision.
> >
> >gcc/
> >* config/aarch64/aarch64-protos.h
> >(AARCH64_EXTRA_TUNE_APPROX_RSQRT): New macro.
> >* config/aarch64/aarch64-tuning-flags.def
> >(AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF): New mask.
> >(AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF): Likewise.
> >* config/aarch64/aarch64.c
> >(use_rsqrt_p): New argument for the mode.
> >(aarch64_builtin_reciprocal): Devise mode from builtin.
> >(aarch64_optab_supported_p): New argument for the mode.
> >
> >
> >Now that the patch is attached, feedback is appreciated.
> 
> Ping.

Hi Evandro,

I thought this was on hold while you looked into the underlying issue for
the failures in the other thread? With that said, I'm struggling to keep
up with where we are on this, so maybe it is time for a clean break - a new
thread for patch set v2, proposed as an explicit patch series (just to keep
the dependencies clear to me).

I'm not convinced of the value of this split, nor why we would stop here
if it was useful (vector modes vs. scalar modes would also seem an
important distinction).

If you no longer need the workaround this enables then I'm not sure I see a
good reason for it to go in, maybe I'm missing a target for which this
would be important?

Thanks,
James
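
The iteration under discussion can be sketched as below. This is not aarch64_emit_swsqrt: the function names are made up, and the starting estimate and step count are passed in by the caller rather than coming from a hardware estimate instruction. Each Newton step refines an estimate y of x^-1/2, and x^1/2 is then obtained as x * y. Because each step roughly doubles the number of correct bits, a coarser initial estimate needs more steps to reach the same precision, which is consistent with the 8-bit versus 11-bit observation quoted above.

```cpp
#include <cassert>

// Refine an estimate of 1/sqrt(x) with STEPS Newton iterations:
//   y' = y * (1.5 - 0.5 * x * y * y)
double approx_rsqrt(double x, double estimate, int steps)
{
  assert(x > 0.0);
  double y = estimate;
  for (int i = 0; i < steps; ++i)
    y = y * (1.5 - 0.5 * x * y * y);
  return y;
}

// sqrt(x) = x * (1/sqrt(x)).
double approx_sqrt(double x, double estimate, int steps)
{
  return x * approx_rsqrt(x, estimate, steps);
}
```

For example, approx_sqrt (4.0, 0.4, 6) starts from a rough guess of 0.4 for 1/sqrt(4) and converges to a value very close to 2.0.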



Re: C++ PATCH to fix missing warning (PR c++/70194)

2016-03-19 Thread Martin Sebor

While I would find the warning less misleading if it simply said
in all three cases: "the address of 'x' will always evaluate as
‘true’" I think it would be even more accurate if it said
"the address of 'x' may be assumed to evaluate to ‘true’"  That
avoids making claims about whether or not it actually is null,
doesn't talk about the NULL macro when one isn't used in the
code, and by saying "may assume" it allows for both making
the assumption as well as not making one.


That sounds good except that talking about 'true' is wrong when there is
an explicit comparison to a null pointer constant.  I'd be fine with
changing "NULL" to "null" or similar.


Sounds good.  I will use bug 47931 - missing -Waddress warning
for comparison with NULL, to take care of the outstanding cases
where a warning still isn't issued (in either C++ or C) and also
adjust the text of the warning.

Martin

PS It seems that just adding STRIP_NOPS (op) to Marek's patch
significantly increases the number of successfully diagnosed
cases.  (The small patch I attached to 47931 covers nearly all
the remaining cases I could think of.)


Re: [RFA][PR rtl-optimization/70263] Fix creation of new REG_EQUIV notes

2016-03-19 Thread Bernd Schmidt

On 03/18/2016 08:14 PM, Jeff Law wrote:

I also added a blurb to the dump file when we create these equivalences
and included a test to verify the code fires.  I verified it fired on
x86 and x86-64.  It may or may not fire on other targets, so I left the
test in the i386 specific subdirectory.


This is the sort of thing I'd want to do with rtl unit tests.


Bootstrapped and regression tested on x86_64-linux-gnu.  And unlike the
last version, it doesn't totally disable the note creation (as verified
by the new test ;-)

OK for the trunk?


Ok.


Bernd


Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop bahavior if possible

2016-03-19 Thread Bin.Cheng
On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
 wrote:
>
> On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng  wrote:
> > Hi,
> > ..
> > Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 7?
>
> Hmm.
Hi,
Thanks for reviewing.
>
> +  equal_p = true;
> +  if (e1->base_address && e2->base_address)
> +equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
> +  if (e1->offset && e2->offset)
> +equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>
> surely better to return false early.
>
> I think we don't want this in tree-data-refs.h also because of ...
>
> @@ -615,15 +619,29 @@
> hash_memrefs_baserefs_and_store_DRs_read_written_info
> (data_reference_p a)
>    data_reference_p *master_dr, *base_master_dr;
>tree ref = DR_REF (a);
>tree base_ref = DR_BASE_OBJECT (a);
> +  innermost_loop_behavior *innermost = &DR_INNERMOST (a);
>tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>bool exist1, exist2;
>
> -  while (TREE_CODE (ref) == COMPONENT_REF
> -|| TREE_CODE (ref) == IMAGPART_EXPR
> -|| TREE_CODE (ref) == REALPART_EXPR)
> -ref = TREE_OPERAND (ref, 0);
> +  /* If reference in DR has innermost loop behavior and it is not
> + a compound memory reference, we store it to innermost_DR_map,
> + otherwise to ref_DR_map.  */
> +  if (TREE_CODE (ref) == COMPONENT_REF
> +  || TREE_CODE (ref) == IMAGPART_EXPR
> +  || TREE_CODE (ref) == REALPART_EXPR
> +  || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
> +  || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
> +{
> +  while (TREE_CODE (ref) == COMPONENT_REF
> +|| TREE_CODE (ref) == IMAGPART_EXPR
> +|| TREE_CODE (ref) == REALPART_EXPR)
> +   ref = TREE_OPERAND (ref, 0);
> +
> +  master_dr = &ref_DR_map->get_or_insert (ref, &exist1);
> +}
> +  else
> +master_dr = &innermost_DR_map->get_or_insert (innermost, &exist1);
>
> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd need
> to strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
> and REALPART) before creating the DR (or adjust the equality function
> and hashing to disregard them, which means subtracting their offset from
> DR_INIT).
I am not sure if I understand correctly.  But for component reference,
it is the base object that we want to record/track.  For example,

  for (i = 0; i < N; i++) {
    m = *data++;

    m1 = p1->x - m;
    m2 = p2->x + m;

    p3->y = (m1 >= m2) ? p1->y : p2->y;

    p1++;
    p2++;
    p3++;
  }
We want to infer that the reads of p1/p2 in the conditional statement won't
trap because there are unconditional reads of the structures, though the
unconditional reads are actually of other sub-objects.  Here it is the
invariant part of the address that we want to track.
Also, as this example illustrates, we can't rely on the data-ref analyzer
here, because in gathering/scattering cases the address might not be
affine at all.
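
To make the point above concrete, here is a minimal sketch in plain C (not GCC internals) of the loop and of the branch-free form that if-conversion aims for.  The struct layout, the element types and both function names are assumptions made only for illustration; the thread shows just the fields x and y.

```c
#include <stddef.h>

/* Hypothetical layout: only the field names x and y come from the thread.  */
struct pt { int x; int y; };

/* Original shape: p1->y and p2->y are read only on one side of a branch.  */
void select_branchy (struct pt *p3, const struct pt *p1,
                     const struct pt *p2, const int *data, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int m = data[i];
      int m1 = p1[i].x - m;   /* unconditional read of the p1 object */
      int m2 = p2[i].x + m;   /* unconditional read of the p2 object */
      if (m1 >= m2)
        p3[i].y = p1[i].y;
      else
        p3[i].y = p2[i].y;
    }
}

/* If-converted shape: both y fields are loaded unconditionally.  This is
   only safe because x in the same object was already read on every
   iteration, so the object is known to be accessible.  */
void select_ifcvt (struct pt *p3, const struct pt *p1,
                   const struct pt *p2, const int *data, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int m = data[i];
      int m1 = p1[i].x - m;
      int m2 = p2[i].x + m;
      int y1 = p1[i].y;
      int y2 = p2[i].y;
      p3[i].y = (m1 >= m2) ? y1 : y2;
    }
}
```

Both functions should produce the same p3 contents; the second has no branch in the loop body, which is what makes it a candidate for vectorization.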
>
> To adjust the references we collect you could maybe use a callback
> to get_references_in_stmt
> to adjust them.
>
> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
Is this a part of the method you suggested above, or is it an
alternative one?  If it's the latter, then I have questions embedded
below.
>
> Index: tree-if-conv.c
> ===
> --- tree-if-conv.c  (revision 234215)
> +++ tree-if-conv.c  (working copy)
> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>
>    for (i = 0; refs->iterate (i, &dr); i++)
>  {
> +  tree *refp = &DR_REF (dr);
> +  while ((TREE_CODE (*refp) == COMPONENT_REF
> + && TREE_OPERAND (*refp, 2) == NULL_TREE)
> +|| TREE_CODE (*refp) == IMAGPART_EXPR
> +|| TREE_CODE (*refp) == REALPART_EXPR)
> +   refp = &TREE_OPERAND (*refp, 0);
> +  if (refp != &DR_REF (dr))
> +   {
> + tree saved_base = *refp;
> + *refp = integer_zero_node;
> +
> + if (DR_INIT (dr))
> +   {
> + tree poffset;
> + int punsignedp, preversep, pvolatilep;
> + machine_mode pmode;
> + HOST_WIDE_INT pbitsize, pbitpos;
> + get_inner_reference (DR_REF (dr), &pbitsize, &pbitpos, &poffset,
> +  &pmode, &punsignedp, &preversep,
> +  &pvolatilep, false);
> + gcc_assert (poffset == NULL_TREE);
> +
> + DR_INIT (dr)
> +   = wide_int_to_tree (ssizetype,
> +   wi::sub (DR_INIT (dr),
> +pbitpos / BITS_PER_UNIT));
> +   }
> +
> + *refp = saved_base;
> + DR_REF (dr) = *refp;
> +   }
Looks to me the code is trying to resolve difference between two (or
more) component references, which is DR_INIT in the code.  But DR_INIT
is not the 

Re: [PATCH, PR70161] Fix fdump-ipa-all-graph

2016-03-19 Thread Richard Biener
On Thu, 17 Mar 2016, Tom de Vries wrote:

> On 15/03/16 12:37, Richard Biener wrote:
> > On Mon, 14 Mar 2016, Tom de Vries wrote:
> > 
> > > Hi,
> > > 
> > > this patch fixes PR70161, a 4.9/5/6 regression.
> > > 
> > > Currently when using -fdump-ipa-all-graph, the compiler ICEs in
> > > execute_function_dump when testing for pass->graph_dump_initialized,
> > > because
> > > pass == NULL.
> > > 
> > > The patch fixes:
> > > - the ICE by setting the pass argument in the call to
> > >execute_function_dump in execute_one_ipa_transform_pass
> > > - a subsequent ICE (triggered with -fipa-pta) by saving, resetting and
> > >restoring dump_file_name in cgraph_node::get_body, alongside the
> > >saving and restoring of the dump_file variable.
> > > - the duplicate edges in the subsequently generated dot file by
> > >ensuring that execute_function_dump is called only once per function
> > >per pass. [ Note that this bit also has an effect for the normal dump
> > >files for the ipa passes with transform function. For those functions,
> > >atm execute_function_dump is called both after execute and after
> > >transform. With the patch, it's only called after transform. ]
> > > 
> > > Bootstrapped and reg-tested on x86_64.
> > > 
> > > OK for stage4?
> > 
> > Ok.
> 
> All of the patch also OK for 4.9/5 branch?

Yes, after a few days on trunk w/o issues.

Richard.

> [ The first 2 bits fix ICEs. The last part fixes a duplicate edges problem in
> the dot file, I'm not sure if that's needed in the release branches. ]
> 
> Thanks,
> - Tom
> 
> 
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: [PATCH]PR other/70268: map one directory name (old) to another (new) in __FILE__

2016-03-19 Thread Joseph Myers
On Thu, 17 Mar 2016, Hongxu Jia wrote:

> +  if (add_file_prefix_map(arg) < 0)

Bad formatting (missing space before '(').  Likewise elsewhere in this 
patch.

> +@item -ffile-prefix-map=@var{old}=@var{new}
> +@opindex ffile-prefix-map
> +When parsing __FILE__, __BASE_FILE__ and __builtin_FILE(), use directory
> +@file{@var{new}} to replace @file{@var{old}}.

Missing use of @code{} around literal source code text.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH V3]PR other/70268: map one directory name (old) to another (new) in __FILE__

2016-03-19 Thread Hongxu Jia

On 03/18/2016 03:56 PM, Bernhard Reutner-Fischer wrote:

On March 18, 2016 6:16:46 AM GMT+01:00, Hongxu Jia  
wrote:



+/* Perform user-specified mapping of __FILE__ prefixes.  Return
+   the new name corresponding to filename.  */
+
+const char *
+remap_file_filename (const char *filename)
+{
+  file_prefix_map *map;
+  char *s;
+  const char *name;
+  size_t name_len;
+
+  for (map = file_prefix_maps; map; map = map->next)
+if (filename_ncmp (filename, map->old_prefix, map->old_len) == 0)
+  break;
+  if (!map)
+return filename;
+  name = filename + map->old_len;
+  name_len = strlen (name) + 1;
+  s = (char *) alloca (name_len + map->new_len);
+  memcpy (s, map->new_prefix, map->new_len);
+  memcpy (s + map->new_len, name, name_len);
+
+  return xstrdup (s);
+}

Please explain why you first alloca() and then strdup the result instead of
using XNEWVEC.


1. alloca - allocate memory that is automatically freed when the
function remap_file_filename returns

2. XNEW - allocate memory for struct file_prefix_map

3. xstrdup - duplicate a string

//Hongxu
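
For comparison, here is a minimal sketch of the single-allocation variant the review comment seems to be asking about.  It is not the patch: it uses plain malloc and strncmp where GCC would use XNEWVEC and filename_ncmp, and struct prefix_map and remap_filename_sketch are hypothetical names chosen for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the patch's file_prefix_map.  */
struct prefix_map
{
  const char *old_prefix;
  const char *new_prefix;
  size_t old_len;
  size_t new_len;
  struct prefix_map *next;
};

/* Return FILENAME with the first matching old prefix replaced.  The result
   is FILENAME itself when no mapping applies; otherwise it is a malloc'ed
   string owned by the caller, or NULL if allocation fails.  */
const char *
remap_filename_sketch (const struct prefix_map *maps, const char *filename)
{
  const struct prefix_map *map;

  for (map = maps; map; map = map->next)
    if (strncmp (filename, map->old_prefix, map->old_len) == 0)
      break;
  if (!map)
    return filename;

  const char *rest = filename + map->old_len;
  size_t rest_len = strlen (rest) + 1;   /* include the terminating NUL */

  /* Allocate the final buffer directly, so no temporary copy is needed.  */
  char *s = malloc (map->new_len + rest_len);
  if (!s)
    return NULL;
  memcpy (s, map->new_prefix, map->new_len);
  memcpy (s + map->new_len, rest, rest_len);
  return s;
}
```

Building the string in heap memory from the start avoids the stack buffer and the extra copy made by xstrdup; the observable result should be the same.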




Thanks,






Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread H.J. Lu
On Wed, Mar 16, 2016 at 9:58 AM, Jason Merrill  wrote:
> On 03/16/2016 08:38 AM, H.J. Lu wrote:
>>
>> FAIL: g++.dg/abi/pr60336-1.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx9true_type
>> FAIL: g++.dg/abi/pr60336-5.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx9true_type
>> FAIL: g++.dg/abi/pr60336-6.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx9true_type
>> FAIL: g++.dg/abi/pr60336-7.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx9true_type
>> FAIL: g++.dg/abi/pr60336-9.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx9true_type
>> FAIL: g++.dg/abi/pr68355.C   scan-assembler jmp[\t
>> ]+[^$]*?_Z3xxx17integral_constantIbLb1EE
>
>
> These pass for me on x86_64, but I do see calls with -m32.
>
>> They are expected since get_ref_base_and_extent needs to be
>> changed to set bitsize to 0 for empty types, so that
>> ref_maybe_used_by_call_p_1 gets 0 as the maximum size of an empty
>> type when it calls get_ref_base_and_extent.  Otherwise, find_tail_calls
>> won't perform tail call optimization for functions with empty type
>> parameters.
>
>
> That isn't why the optimization isn't happening in pr68355 with -m32; the
> .optimized dump has
>
>   xxx (D.2289); [tail call]
>
> Rather, the failure seems to happen in load_register_parameter, at
>
>>   /* Check for overlap with already clobbered argument area,
>>  providing that this has non-zero size.  */
>>   if (is_sibcall
>>   && (size == 0
>>   || mem_overlaps_already_clobbered_arg_p
>>(XEXP (args[i].value, 0),
>> size)))
>> *sibcall_failure = 1;
>
>
> The code seems to contradict the comment, and seems to have been broken by
> r162402.  Applying this additional patch fixes those tests.
>

I am running the full test now.

-- 
H.J.
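
The mismatch between code and comment that Jason points out can be reduced to two small predicates.  The additional patch he mentions is not shown in this message, so the second function is only one reading of the comment and not necessarily the committed fix; both function names are hypothetical.

```c
#include <stdbool.h>

/* Condition as written in the quoted code: a zero-size argument always
   forces the sibcall to fail, whether or not anything overlaps.  */
bool sibcall_fails_as_written (bool is_sibcall, int size, bool overlaps)
{
  return is_sibcall && (size == 0 || overlaps);
}

/* Condition as the comment describes it: check for overlap only when the
   argument has non-zero size.  */
bool sibcall_fails_as_commented (bool is_sibcall, int size, bool overlaps)
{
  return is_sibcall && size != 0 && overlaps;
}
```

The two predicates differ only for size == 0, which is exactly the empty-class case under discussion.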


[PATCH, PR70183] Propagate dump flags in pass_manager::register_pass

2016-03-19 Thread Tom de Vries

Hi,

At the moment, the vzeroupper dump file is not influenced by the flags
given in -fdump-rtl-all-flags.

The patch fixes this by copying the flags in pass_manager::register_pass.

OK for stage1 if bootstrap and reg-test succeed?

Thanks,
- Tom
Propagate dump flags in pass_manager::register_pass

2016-03-16  Tom de Vries  

	PR other/70183
	* passes.c (pass_manager::register_pass): Propagate pflags.

	* gcc.target/i386/vzeroupper-dump-flags.c: New test.

---
 gcc/passes.c  |  6 +-
 gcc/testsuite/gcc.target/i386/vzeroupper-dump-flags.c | 10 ++
 2 files changed, 15 insertions(+), 1 deletion(-)

diff --git a/gcc/passes.c b/gcc/passes.c
index 9d90251..62fcc03 100644
--- a/gcc/passes.c
+++ b/gcc/passes.c
@@ -1497,8 +1497,12 @@ pass_manager::register_pass (struct register_pass_info *pass_info)
 tdi = TDI_rtl_all;
   /* Check if dump-all flag is specified.  */
   if (dumps->get_dump_file_info (tdi)->pstate)
-dumps->get_dump_file_info (added_pass_nodes->pass->static_pass_number)
+	{
+	  dumps->get_dump_file_info (added_pass_nodes->pass->static_pass_number)
 ->pstate = dumps->get_dump_file_info (tdi)->pstate;
+	  dumps->get_dump_file_info (added_pass_nodes->pass->static_pass_number)
+	->pflags = dumps->get_dump_file_info (tdi)->pflags;
+	}
   XDELETE (added_pass_nodes);
   added_pass_nodes = next_node;
 }
diff --git a/gcc/testsuite/gcc.target/i386/vzeroupper-dump-flags.c b/gcc/testsuite/gcc.target/i386/vzeroupper-dump-flags.c
new file mode 100644
index 000..933e595
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/vzeroupper-dump-flags.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options " -fdump-rtl-all-slim -mavx -mvzeroupper -fexpensive-optimizations" } */
+
+int
+foo (void)
+{
+  return 0;
+}
+
+/* { dg-final { scan-rtl-dump-not "\\(insn " "vzeroupper" } }  */


Re: [PATCH][SPARC] sparc: switch -fasynchronous-unwind-tables on by default.

2016-03-19 Thread Richard Henderson
On 02/29/2016 07:50 AM, Jose E. Marchesi wrote:
> The backtrace(3) implementation for sparc contains a simple unwinder
> that works well in most cases, but that unwinder is not used if
> libgcc_s.so can be dlopened and it provides _Unwind_Backtrace.

There's no reason that simple unwinder can't be put into
MD_FALLBACK_FRAME_STATE_FOR.

Currently we only use that for unwinding through signal stacks, but it could be
used for anything that the dwarf2 unwinder doesn't have data for.  Given sparc
register windows, this seems particularly reliable.


r~


Re: [PATCH, PR70185] Only finalize dot files that have been initialized

2016-03-19 Thread Richard Biener
On Thu, Mar 17, 2016 at 10:19 AM, Tom de Vries  wrote:
> On 16/03/16 12:34, Richard Biener wrote:
>>
>> On Wed, Mar 16, 2016 at 11:57 AM, Tom de Vries 
>> wrote:
>>>
>>> Hi,
>>>
>>> Atm, using fdump-tree-all-graph produces invalid dot files:
>>> ...
>>> $ rm *.c.* ; gcc test.c -O2 -S -fdump-tree-all-graph
>>> $ for f in *.dot; do dot -Tpdf $f -o dot.pdf; done
>>> Warning: test.c.006t.omplower.dot: syntax error in line 1 near '}'
>>> Warning: test.c.007t.lower.dot: syntax error in line 1 near '}'
>>> Warning: test.c.010t.eh.dot: syntax error in line 1 near '}'
>>> Warning: test.c.292t.statistics.dot: syntax error in line 1 near '}'
>>> $ cat test.c.006t.omplower.dot
>>> }
>>> $
>>> ...
>>> These dot files are finalized, but never initialized or used.
>>>
>>> The 006/007/010 files are not used because '(fn->curr_properties &
>>> PROP_cfg)
>>> == 0' at the corresponding passes.
>>>
>>> And the file test.c.292t.statistics.dot is not used, because it doesn't
>>> belong to a single pass.
>>>
>>> The current finalization code doesn't handle these cases:
>>> ...
>>>/* Do whatever is necessary to finish printing the graphs.  */
>>>for (i = TDI_end; (dfi = dumps->get_dump_file_info (i)) != NULL; ++i)
>>>  if (dumps->dump_initialized_p (i)
>>>  && (dfi->pflags & TDF_GRAPH) != 0
>>>  && (name = dumps->get_dump_file_name (i)) != NULL)
>>>{
>>>  finish_graph_dump_file (name);
>>>  free (name);
>>>}
>>> ...
>>>
>>> The patch fixes this by simply testing for pass->graph_dump_initialized
>>> instead.
>>>
>>> [ That fix exposes the lack of initialization of graph_dump_initialized.
>>> It
>>> seems to be initialized for static passes, but for dynamically added
>>> passes,
>>> such as f.i. vzeroupper the value is uninitialized. The patch also fixes
>>> this. ]
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> OK for stage1?
>>
>>
>> Seeing this I wonder if it makes more sense to move
>> ->graph_dump_initialized
>> from pass to dump_file_info?
>
>
> Done.
>
>> Also in the above shouldn't it use
>> dfi->pfilename rather than dumps->get_dump_file_name (i)?
>
>
> That one isn't defined anymore once we get to finish_optimization_passes.
>
> OK for stage1 if bootstrap and reg-test succeeds?

Ok.

Richard.

> Thanks,
> - Tom
>
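
The failure mode discussed in this thread (a dot file that contains nothing but "}") and the guard that avoids it can be sketched in a few lines.  This is an illustration under assumptions, not the GCC implementation: only the flag name graph_dump_initialized comes from the thread, and struct dump_sketch and both functions are hypothetical.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a per-dump record; the text buffer plays the
   role of the .dot file.  */
struct dump_sketch
{
  bool graph_dump_initialized;
  char text[64];
};

/* Write the graph header and remember that we did so.  */
void start_graph (struct dump_sketch *d)
{
  strcpy (d->text, "digraph \"unit\" {\n");
  d->graph_dump_initialized = true;
}

/* Close the graph only if it was opened.  Returns true when a closing
   brace was written.  Without the check, an unused dump would end up
   holding just "}", which dot rejects.  */
bool finish_graph_if_started (struct dump_sketch *d)
{
  if (!d->graph_dump_initialized)
    return false;
  strcat (d->text, "}\n");
  return true;
}
```

Keeping the flag next to the dump data, rather than on the pass, matches the direction Richard suggests above.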


[hsa branch] Use an obstack instead of multiple alloc pools

2016-03-19 Thread Martin Jambor
Hi,

when I started working on expansion to HSAIL almost three years ago, I
decided to allocate memory for most of the structures from various
alloc-pools for reasons that never materialized and the number of
pools later grew to unreasonable numbers.  So after an internal
discussion, Martin Liska wrote the following patch which changes
allocations from the various hsa alloc pools to allocations from one
obstack.  I have just committed the patch to the hsa branch after
testing it.

Thanks,

Martin
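
For readers who have not used obstacks, the allocation pattern described above looks roughly like this.  It is a sketch, not the hsa-gen.c code: it assumes the GNU C Library's <obstack.h>, and the structure and function names (op_reg, insn, sketch_init, sketch_new_insn, sketch_deinit) are made up for this example.

```c
#include <obstack.h>
#include <stdlib.h>

/* obstack.h expects the user to say how chunks are obtained and freed.  */
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

/* Two unrelated object types that previously would each have needed
   their own pool.  */
struct op_reg { int regno; };
struct insn { int opcode; struct op_reg *dest; };

static struct obstack sketch_obstack;

void sketch_init (void)
{
  obstack_init (&sketch_obstack);
}

/* Objects of different types come out of the same obstack.  */
struct insn *sketch_new_insn (int opcode, int regno)
{
  struct op_reg *r = obstack_alloc (&sketch_obstack, sizeof *r);
  struct insn *i = obstack_alloc (&sketch_obstack, sizeof *i);

  r->regno = regno;
  i->opcode = opcode;
  i->dest = r;
  return i;
}

/* Passing NULL releases everything allocated from the obstack at once.  */
void sketch_deinit (void)
{
  obstack_free (&sketch_obstack, NULL);
}
```

One allocator with a single release point replaces the per-type bookkeeping, which is the simplification the ChangeLog below reflects.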


2016-03-17  Martin Liska  
Martin Jambor 

* hsa-gen.c (hsa_allocp_operand_address): Removed.
(hsa_allocp_operand_immed): Likewise.
(hsa_allocp_operand_reg): Likewise.
(hsa_allocp_operand_code_list): Likewise.
(hsa_allocp_operand_operand_list): Likewise.
(hsa_allocp_inst_basic): Likewise.
(hsa_allocp_inst_phi): Likewise.
(hsa_allocp_inst_mem): Likewise.
(hsa_allocp_inst_atomic): Likewise.
(hsa_allocp_inst_signal): Likewise.
(hsa_allocp_inst_seg): Likewise.
(hsa_allocp_inst_cmp): Likewise.
(hsa_allocp_inst_br): Likewise.
(hsa_allocp_inst_sbr): Likewise.
(hsa_allocp_inst_call): Likewise.
(hsa_allocp_inst_arg_block): Likewise.
(hsa_allocp_inst_comment): Likewise.
(hsa_allocp_inst_queue): Likewise.
(hsa_allocp_inst_srctype): Likewise.
(hsa_allocp_inst_packed): Likewise.
(hsa_allocp_inst_cvt): Likewise.
(hsa_allocp_inst_alloca): Likewise.
(hsa_allocp_bb): Likewise.
(hsa_obstack): New.
(hsa_init_data_for_cfun): Initialize obstack.
(hsa_deinit_data_for_cfun): Release memory of the obstack.
(hsa_op_immed::operator new): Use obstack instead of
object_allocator.
(hsa_op_reg::operator new): Likewise.
(hsa_op_address::operator new): Likewise.
(hsa_op_code_list::operator new): Likewise.
(hsa_op_operand_list::operator new): Likewise.
(hsa_insn_basic::operator new): Likewise.
(hsa_insn_phi::operator new): Likewise.
(hsa_insn_br::operator new): Likewise.
(hsa_insn_sbr::operator new): Likewise.
(hsa_insn_cmp::operator new): Likewise.
(hsa_insn_mem::operator new): Likewise.
(hsa_insn_atomic::operator new): Likewise.
(hsa_insn_signal::operator new): Likewise.
(hsa_insn_seg::operator new): Likewise.
(hsa_insn_call::operator new): Likewise.
(hsa_insn_arg_block::operator new): Likewise.
(hsa_insn_comment::operator new): Likewise.
(hsa_insn_srctype::operator new): Likewise.
(hsa_insn_packed::operator new): Likewise.
(hsa_insn_cvt::operator new): Likewise.
(hsa_insn_alloca::operator new): Likewise.
(hsa_init_new_bb): Likewise.
---
 gcc/hsa-gen.c | 227 ++
 1 file changed, 68 insertions(+), 159 deletions(-)

diff --git a/gcc/hsa-gen.c b/gcc/hsa-gen.c
index f66eb53..36bc52d 100644
--- a/gcc/hsa-gen.c
+++ b/gcc/hsa-gen.c
@@ -38,7 +38,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "dumpfile.h"
 #include "gimple-pretty-print.h"
 #include "diagnostic-core.h"
-#include "alloc-pool.h"
 #include "gimple-ssa.h"
 #include "tree-phinodes.h"
 #include "stringpool.h"
@@ -125,31 +124,7 @@ struct hsa_queue
   uint64_t id;
 };
 
-/* Alloc pools for allocating basic hsa structures such as operands,
-   instructions and other basic entities.  */
-static object_allocator *hsa_allocp_operand_address;
-static object_allocator *hsa_allocp_operand_immed;
-static object_allocator *hsa_allocp_operand_reg;
-static object_allocator *hsa_allocp_operand_code_list;
-static object_allocator *hsa_allocp_operand_operand_list;
-static object_allocator *hsa_allocp_inst_basic;
-static object_allocator *hsa_allocp_inst_phi;
-static object_allocator *hsa_allocp_inst_mem;
-static object_allocator *hsa_allocp_inst_atomic;
-static object_allocator *hsa_allocp_inst_signal;
-static object_allocator *hsa_allocp_inst_seg;
-static object_allocator *hsa_allocp_inst_cmp;
-static object_allocator *hsa_allocp_inst_br;
-static object_allocator *hsa_allocp_inst_sbr;
-static object_allocator *hsa_allocp_inst_call;
-static object_allocator *hsa_allocp_inst_arg_block;
-static object_allocator *hsa_allocp_inst_comment;
-static object_allocator *hsa_allocp_inst_queue;
-static object_allocator *hsa_allocp_inst_srctype;
-static object_allocator *hsa_allocp_inst_packed;
-static object_allocator *hsa_allocp_inst_cvt;
-static object_allocator *hsa_allocp_inst_alloca;
-static object_allocator *hsa_allocp_bb;
+static struct obstack hsa_obstack;
 
 /* List of pointers to all instructions that come from an object allocator.  */
 static vec  hsa_instructions;
@@ -467,52 +442,7 @@ static void
 hsa_init_data_for_cfun ()
 {
   hsa_init_compilation_unit_data ();
-  hsa_allocp_operand_address
-= new 

Re: C PATCH for c/70093 (ICE with nested-function returning VM type)

2016-03-19 Thread Marek Polacek
Ping.

On Wed, Mar 09, 2016 at 04:55:23PM +0100, Marek Polacek wrote:
> On Wed, Mar 09, 2016 at 04:31:37PM +0100, Jakub Jelinek wrote:
> > No, I meant:
> >   switch (n)
> > {
> >   struct S x;
> > case 1:
> >   fn ();
> >   break;
> > case 2:
> >   fn2 ();
> >   break;
> > case 3:
> >   x = fn ();
> >   if (x.a[0] != 42)
> > __builtin_abort ();
> >   break;
> > case 4:
> >   if (fn ().a[0] != 42)
> > __builtin_abort ();
> >   break;
> > ...
> > 
> > The reason is that anything after a noreturn call can be optimized away
> > shortly afterwards.  Perhaps you want __attribute__((noinline, noclone)) on
> > the function too just in case (I know you haven't included -O*).
>  
> Aha.  I couldn't do exactly this because of 
> error: switch jumps into scope of identifier with variably modified type
> so I moved the decl out of the switch.
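
The diagnostic quoted above comes from a C constraint: a case label may not sit inside the scope of a variably modified declaration that begins within the switch.  A minimal sketch with an ordinary VLA instead of the test's VLA-containing struct (the function name and values are hypothetical, and n is assumed to be positive):

```c
/* Fill a VLA and pick an element depending on WHICH.  */
int fill_and_pick (int n, int which)
{
  int buf[n];   /* declared before the switch, so no label jumps past it */

  for (int i = 0; i < n; i++)
    buf[i] = i * 10;

  switch (which)
    {
    /* Declaring "int buf[n];" here instead would put the case labels
       below inside the scope of a variably modified type, and the
       compiler would reject the switch.  */
    case 0:
      return buf[0];
    case 1:
      return buf[n - 1];
    default:
      return -1;
    }
}
```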
> 
> > Otherwise LGTM.
> 
> Thanks.
> 
> Bootstrapped/regtested on x86_64-linux.
> 
> 2016-03-09  Marek Polacek  
> 
>   PR c/70093
>   * c-typeck.c (build_function_call_vec): Create a TARGET_EXPR for
>   nested functions returning VM types.
> 
>   * cgraphunit.c (cgraph_node::expand_thunk): Also build call to the
>   function being thunked if the result type doesn't have fixed size.
>   * gimplify.c (gimplify_modify_expr): Also set LHS if the result type
>   doesn't have fixed size.
> 
>   * gcc.dg/nested-func-10.c: New test.
>   * gcc.dg/nested-func-9.c: New test.
> 
> diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
> index 6aa0f03..de9d465 100644
> --- gcc/c/c-typeck.c
> +++ gcc/c/c-typeck.c
> @@ -3068,6 +3068,16 @@ build_function_call_vec (location_t loc, 
> vec arg_loc,
>  result = build_call_array_loc (loc, TREE_TYPE (fntype),
>  function, nargs, argarray);
>  
> +  /* In this improbable scenario, a nested function returns a VM type.
> + Create a TARGET_EXPR so that the call always has a LHS, much as
> + what the C++ FE does for functions returning non-PODs.  */
> +  if (variably_modified_type_p (TREE_TYPE (fntype), NULL_TREE))
> +{
> +  tree tmp = create_tmp_var_raw (TREE_TYPE (fntype));
> +  result = build4 (TARGET_EXPR, TREE_TYPE (fntype), tmp, result,
> +NULL_TREE, NULL_TREE);
> +}
> +
>if (VOID_TYPE_P (TREE_TYPE (result)))
>  {
>if (TYPE_QUALS (TREE_TYPE (result)) != TYPE_UNQUALIFIED)
> diff --git gcc/cgraphunit.c gcc/cgraphunit.c
> index 8b3fddc..4351ae4 100644
> --- gcc/cgraphunit.c
> +++ gcc/cgraphunit.c
> @@ -1708,7 +1708,9 @@ cgraph_node::expand_thunk (bool output_asm_thunks, bool 
> force_gimple_thunk)
>  
>/* Build call to the function being thunked.  */
>if (!VOID_TYPE_P (restype)
> -   && (!alias_is_noreturn || TREE_ADDRESSABLE (restype)))
> +   && (!alias_is_noreturn
> +   || TREE_ADDRESSABLE (restype)
> +   || TREE_CODE (TYPE_SIZE_UNIT (restype)) != INTEGER_CST))
>   {
> if (DECL_BY_REFERENCE (resdecl))
>   {
> diff --git gcc/gimplify.c gcc/gimplify.c
> index b331e41..692d168 100644
> --- gcc/gimplify.c
> +++ gcc/gimplify.c
> @@ -4838,7 +4838,8 @@ gimplify_modify_expr (tree *expr_p, gimple_seq *pre_p, 
> gimple_seq *post_p,
>   }
>notice_special_calls (call_stmt);
>if (!gimple_call_noreturn_p (call_stmt)
> -   || TREE_ADDRESSABLE (TREE_TYPE (*to_p)))
> +   || TREE_ADDRESSABLE (TREE_TYPE (*to_p))
> +   || TREE_CODE (TYPE_SIZE_UNIT (TREE_TYPE (*to_p))) != INTEGER_CST)
>   gimple_call_set_lhs (call_stmt, *to_p);
>assign = call_stmt;
>  }
> diff --git gcc/testsuite/gcc.dg/nested-func-10.c 
> gcc/testsuite/gcc.dg/nested-func-10.c
> index e69de29..ac6f76f 100644
> --- gcc/testsuite/gcc.dg/nested-func-10.c
> +++ gcc/testsuite/gcc.dg/nested-func-10.c
> @@ -0,0 +1,56 @@
> +/* PR c/70093 */
> +/* { dg-do compile } */
> +/* { dg-options "" } */
> +
> +void __attribute__((noinline, noclone))
> +foo (int n)
> +{
> +  struct S { int a[n]; };
> +
> +  struct S __attribute__((noreturn))
> +  fn (void)
> +  {
> +__builtin_abort ();
> +  }
> +
> +  auto struct S __attribute__((noreturn))
> +  fn2 (void)
> +  {
> +__builtin_abort ();
> +  }
> +
> +  struct S x;
> +  __typeof__ (fn ()) *p = &x;
> +  switch (n)
> +{
> +case 1:
> +  fn ();
> +  break;
> +case 2:
> +  fn2 ();
> +  break;
> +case 3:
> +  x = fn ();
> +  if (x.a[0] != 42)
> + __builtin_abort ();
> +  break;
> +case 4:
> +  if (fn ().a[0] != 42)
> + __builtin_abort ();
> +  break;
> +case 5:
> +  if (p->a[0] != 42)
> + __builtin_abort ();
> +  break;
> +case 6:
> +  if (fn2 ().a[0] != 42)
> + __builtin_abort ();
> +  break;
> +}
> +}
> +
> +int
> +main (void)
> +{
> +  foo (1);
> +}
> diff --git gcc/testsuite/gcc.dg/nested-func-9.c 
> gcc/testsuite/gcc.dg/nested-func-9.c
> index e69de29..902c258 

[AArch64] Emit division using the Newton series

2016-03-19 Thread Evandro Menezes

Emit division using the Newton series

2016-03-17  Evandro Menezes  

gcc/
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_APPROX_DIV_{SF,DF}): New tuning macros.
* config/aarch64/aarch64-protos.h
(AARCH64_EXTRA_TUNE_APPROX_DIV): New macro.
(aarch64_emit_approx_div): Declare new function.
* config/aarch64/aarch64.c
(aarch64_emit_approx_div): Define new function.
* config/aarch64/aarch64.md ("div<mode>3"): New expansion.
* config/aarch64/aarch64-simd.md ("div<mode>3"): Likewise.

This patch implements FP division by an approximation using the Newton 
series.
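
As background, the Newton iteration for a reciprocal can be sketched in portable C.  This does not reproduce aarch64_emit_approx_div, whose body is not shown in full here; in particular the linear initial estimate and the iteration count below are textbook choices assumed for this example, and the function names are hypothetical.

```c
/* Approximate 1/d for finite, nonzero d using Newton-Raphson steps.  */
double approx_recip (double d)
{
  double sign = d < 0.0 ? -1.0 : 1.0;
  double m = d < 0.0 ? -d : d;
  double scale = 1.0;

  /* Bring m into [0.5, 1) using exact powers of two; scale accumulates
     the matching factor so that 1/|d| == (1/m) * scale.  */
  while (m >= 1.0)
    {
      m *= 0.5;
      scale *= 0.5;
    }
  while (m < 0.5)
    {
      m *= 2.0;
      scale *= 2.0;
    }

  /* Linear initial estimate of 1/m on [0.5, 1); its relative error is
     at most 1/17.  */
  double x = 48.0 / 17.0 - 32.0 / 17.0 * m;

  /* Each step x = x * (2 - m * x) roughly squares the relative error,
     so four steps are enough to reach double precision.  */
  for (int i = 0; i < 4; i++)
    x = x * (2.0 - m * x);

  return sign * x * scale;
}

/* Approximate a / d as a * (1/d).  */
double approx_div (double a, double d)
{
  /* Zero, infinite and NaN divisors are outside the approximation's
     domain (d - d is 0 only for finite d); use ordinary division.  */
  if (d == 0.0 || !(d - d == 0.0))
    return a / d;
  return a * approx_recip (d);
}
```

The result is not correctly rounded in general, which is one reason such an expansion is normally guarded by flags like the unsafe-math ones checked in the patch.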


With this patch, DF division is sped up by over 100% and SF division, 
zilch, both on A57 and on M1.


Feedback welcome.

Thank you,

--
Evandro Menezes

>From 750bd4f64cea8787eb077b7537cc7d8dceafac57 Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Thu, 17 Mar 2016 14:44:55 -0500
Subject: [PATCH] Emit division using the Newton series

2016-03-17  Evandro Menezes  

gcc/
	* config/aarch64/aarch64-tuning-flags.def
	(AARCH64_EXTRA_TUNE_APPROX_DIV_{SF,DF}): New tuning macros.
	* config/aarch64/aarch64-protos.h
	(AARCH64_EXTRA_TUNE_APPROX_DIV): New macro.
	(aarch64_emit_approx_div): Declare new function.
	* config/aarch64/aarch64.c
	(aarch64_emit_approx_div): Define new function.
	* config/aarch64/aarch64.md ("div<mode>3"): New expansion.
	* config/aarch64/aarch64-simd.md ("div<mode>3"): Likewise.
---
 gcc/config/aarch64/aarch64-protos.h |  4 ++
 gcc/config/aarch64/aarch64-simd.md  | 26 ++-
 gcc/config/aarch64/aarch64-tuning-flags.def |  3 +-
 gcc/config/aarch64/aarch64.c| 67 -
 gcc/config/aarch64/aarch64.md   | 31 +++--
 5 files changed, 124 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index dced209..847a282 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -263,6 +263,9 @@ enum aarch64_extra_tuning_flags
 };
 #undef AARCH64_EXTRA_TUNING_OPTION
 
+#define AARCH64_EXTRA_TUNE_APPROX_DIV \
+(AARCH64_EXTRA_TUNE_APPROX_DIV_DF | AARCH64_EXTRA_TUNE_APPROX_DIV_SF)
+
 extern struct tune_params aarch64_tune_params;
 
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
@@ -362,6 +365,7 @@ void aarch64_relayout_simd_types (void);
 void aarch64_reset_previous_fndecl (void);
 void aarch64_save_restore_target_globals (tree);
 void aarch64_emit_approx_rsqrt (rtx, rtx);
+void aarch64_emit_approx_div (rtx, rtx, rtx);
 
 /* Initialize builtins for SIMD intrinsics.  */
 void init_aarch64_simd_builtins (void);
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index bd73bce..f1e53be 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -1509,7 +1509,31 @@
   [(set_attr "type" "neon_fp_mul_")]
 )
 
-(define_insn "div<mode>3"
+(define_expand "div<mode>3"
+ [(set (match_operand:VDQF 0 "register_operand" "=w")
+   (div:VDQF (match_operand:VDQF 1 "register_operand" "w")
+		 (match_operand:VDQF 2 "register_operand" "w")))]
+ "TARGET_SIMD"
+{
+  machine_mode mode = GET_MODE_INNER (GET_MODE (operands[1]));
+
+  if (flag_finite_math_only
+  && !flag_trapping_math
+  && flag_unsafe_math_optimizations
+  && !optimize_function_for_size_p (cfun)
+  && ((mode == SFmode
+   && (aarch64_tune_params.extra_tuning_flags
+   & AARCH64_EXTRA_TUNE_APPROX_DIV_SF))
+  || (mode == DFmode
+  && (aarch64_tune_params.extra_tuning_flags
+  & AARCH64_EXTRA_TUNE_APPROX_DIV_DF))))
+{
+  aarch64_emit_approx_div (operands[0], operands[1], operands[2]);
+  DONE;
+}
+})
+
+(define_insn "*div<mode>3"
  [(set (match_operand:VDQF 0 "register_operand" "=w")
(div:VDQF (match_operand:VDQF 1 "register_operand" "w")
 		 (match_operand:VDQF 2 "register_operand" "w")))]
diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index 7e45a0c..ececdc1 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -30,4 +30,5 @@
 
 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
 AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT)
-
+AARCH64_EXTRA_TUNING_OPTION ("approx_div", APPROX_DIV_DF)
+AARCH64_EXTRA_TUNING_OPTION ("approx_divf", APPROX_DIV_SF)
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 12e498d..97af0c0 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -538,7 +538,8 @@ static const struct tune_params exynosm1_tunings =
   48,	/* max_case_values.  */
   64,	/* cache_line_size.  */
   tune_params::AUTOPREFETCHER_OFF, /* autoprefetcher_model.  */
-  

Re: [patch, driver] Ignore -ftree-parallelize-loops={0,1}

2016-03-19 Thread Thomas Schwinge
Hi!

On Tue, 14 Jul 2015 10:36:25 +0200, Tom de Vries  wrote:
> On 14/07/15 06:54, Jeff Law wrote:
> > On 07/13/2015 04:58 AM, Tom de Vries wrote:
> >> On 07/07/15 09:53, Tom de Vries wrote:
> >>> currently, we have these spec strings in gcc/gcc.c involving
> >>> ftree-parallelize-loops:
> >>> ...
> >>> %{fopenacc|fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}
> >>>
> >>>
> >>>
> >>> %{fopenacc|fopenmp|ftree-parallelize-loops=*:-pthread}"
> >>> ...
> >>>
> >>> Actually, ftree-parallelize-loops={0,1} means that no parallelization is
> >>> done, but these spec strings still get activated for these values.
> >>>
> >>>
> >>> Attached patch fixes that, by introducing a spec function gt (short for
> >>> greater than), and using it in the spec lines.

> Committed the patch using the gt function, as attached (formatting 
> fixed, ChangeLog entry added).

> Ignore -ftree-parallelize-loops={0,1} using gt
> 
> 2015-07-14  Tom de Vries  
> 
>   * gcc.c (greater_than_spec_func): Declare forward.
>   (LINK_COMMAND_SPEC, GOMP_SELF_SPECS): Use gt to ignore
>   -ftree-parallelize-loops={0,1}.
>   (static_spec_functions): Add greater_than_spec_func function with name
>   "gt".
>   (greater_than_spec_func): New function.

I recently noticed that this change failed to update the instances of
ftree-parallelize-loops in other spec strings.  I can't easily test my
proposed changes, but I just mechanically changed
"ftree-parallelize-loops=*" to "%:gt(%{ftree-parallelize-loops=*:%*} 1)"
(which is the spelling to use after Jakub's "[PATCH] Fix driver handling
of multiple -ftree-parallelize-loops= options (PR driver/69805)").
OK to commit?
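
To illustrate what the spec "%:gt(%{ftree-parallelize-loops=*:%*} 1)" is meant to decide, here is a small stand-alone sketch of a "greater than" check on its last two arguments.  The real spec function's signature and return convention are not shown in this message, so gt_sketch and its int result are assumptions for illustration only.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the "gt" spec function: return 1 when the
   second-to-last argument is numerically greater than the last one,
   0 otherwise (including when an argument is missing or not a number).  */
int gt_sketch (int argc, const char **argv)
{
  char *end;

  if (argc < 2)
    return 0;

  long arg = strtol (argv[argc - 2], &end, 10);
  if (end == argv[argc - 2])
    return 0;

  long lim = strtol (argv[argc - 1], &end, 10);
  if (end == argv[argc - 1])
    return 0;

  return arg > lim;
}
```

With -ftree-parallelize-loops=4 the arguments would be "4" and "1" and the check succeeds; with =0 or =1 it fails, so libgomp and -pthread would not be pulled in.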

commit df7d7943ae64f6df74d360e71f7c495c78647fda
Author: Thomas Schwinge 
Date:   Thu Mar 17 17:17:36 2016 +0100

Complete changes to "Ignore -ftree-parallelize-loops={0,1} using gt"

Apply the r225764 and r233573 changes to all relevant spec strings.

gcc/
* config/arc/arc.h (LINK_COMMAND_SPEC): Use gt to ignore
-ftree-parallelize-loops={0,1}.
* config/darwin.h (LINK_COMMAND_SPEC_A): Likewise.
* config/i386/mingw32.h (GOMP_SELF_SPECS): Likewise.
* config/ia64/hpux.h (LIB_SPEC): Likewise.
* config/pa/pa-hpux11.h (LIB_SPEC): Likewise.
* config/pa/pa64-hpux.h (LIB_SPEC): Likewise.
---
 gcc/config/arc/arc.h  |  3 ++-
 gcc/config/darwin.h   |  2 +-
 gcc/config/i386/mingw32.h |  2 +-
 gcc/config/ia64/hpux.h|  2 +-
 gcc/config/pa/pa-hpux11.h |  2 +-
 gcc/config/pa/pa64-hpux.h | 12 ++--
 6 files changed, 12 insertions(+), 11 deletions(-)

diff --git gcc/config/arc/arc.h gcc/config/arc/arc.h
index 21c049f..1c2a38d 100644
--- gcc/config/arc/arc.h
+++ gcc/config/arc/arc.h
@@ -188,7 +188,8 @@ along with GCC; see the file COPYING3.  If not see
 %(linker) %l " LINK_PIE_SPEC "%X %{o*} %{A} %{d} %{e*} %{m} %{N} %{n} %{r}\
 %{s} %{t} %{u*} %{x} %{z} %{Z} %{!A:%{!nostdlib:%{!nostartfiles:%S}}}\
 %{static:} %{L*} %(mfwrap) %(link_libgcc) %o\
-%{fopenacc|fopenmp|ftree-parallelize-loops=*:%:include(libgomp.spec)%(link_gomp)}\
+%{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1):\
+   %:include(libgomp.spec)%(link_gomp)}\
 %(mflib)\
 %{fprofile-arcs|fprofile-generate|coverage:-lgcov}\
 %{!nostdlib:%{!nodefaultlibs:%(link_ssp) %(link_gcc_c_sequence)}}\
diff --git gcc/config/darwin.h gcc/config/darwin.h
index 9f686d3..c9981b8 100644
--- gcc/config/darwin.h
+++ gcc/config/darwin.h
@@ -177,7 +177,7 @@ extern GTY(()) int darwin_ms_struct;
 %{o*}%{!o:-o a.out} \
 %{!nostdlib:%{!nostartfiles:%S}} \
 %{L*} %(link_libgcc) %o %{fprofile-arcs|fprofile-generate*|coverage:-lgcov} \
-%{fopenacc|fopenmp|ftree-parallelize-loops=*: \
+%{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libgomp.a%s; : -lgomp } } \
 %{fgnu-tm: \
   %{static|static-libgcc|static-libstdc++|static-libgfortran: libitm.a%s; : -litm } } \
diff --git gcc/config/i386/mingw32.h gcc/config/i386/mingw32.h
index 4ac5f68..e048189 100644
--- gcc/config/i386/mingw32.h
+++ gcc/config/i386/mingw32.h
@@ -207,7 +207,7 @@ do {
 \
 
 /* mingw32 uses the  -mthreads option to enable thread support.  */
 #undef GOMP_SELF_SPECS
-#define GOMP_SELF_SPECS "%{fopenacc|fopenmp|ftree-parallelize-loops=*: " \
+#define GOMP_SELF_SPECS "%{fopenacc|fopenmp|%:gt(%{ftree-parallelize-loops=*:%*} 1): " \
"-mthreads -pthread}"
 #undef GTM_SELF_SPECS
 #define GTM_SELF_SPECS "%{fgnu-tm:-mthreads -pthread}"
diff --git gcc/config/ia64/hpux.h gcc/config/ia64/hpux.h
index 8b90c99..008c4f6 100644
--- gcc/config/ia64/hpux.h
+++ gcc/config/ia64/hpux.h
@@ 

Re: [PATHCH] Disable inline asm for in-tree mpfr (PR69134)

2016-03-19 Thread Marc Glisse

On Tue, 5 Jan 2016, Richard Biener wrote:

On January 5, 2016 2:20:42 PM GMT+01:00, Bernd Edlinger wrote:

On 05.01.2016 13:58, Bernd Schmidt wrote:

On 01/05/2016 09:44 AM, Bernd Edlinger wrote:

Using asm code is generally not desirable for in-tree mpfr builds.


Why not?


for the same reason why we disable the asm code for in-tree gmp.
If we think mpfr is fine to use assembler, why don't we let gmp use the
assembler code too?


IIRC the logic at some point at least used host CPU detection to select asm.


Note that GMP only does host CPU detection if you let it (config.guess). 
As soon as you pass an explicit --host= option to configure (easy for 
gcc), that mechanism is disabled (at least that's what I think happens).


So I looked for a way to disable the asm code, and found it can be 
done, but differently than for in-tree gmp. See the attached patch.


As noted in PR 67728, it seems that gcc's intrusive way of overriding 
CFLAGS also breaks GMP itself, not just MPFR, by hiding the macro NO_ASM 
that GMP tries to define through its own CFLAGS. So maybe Bernd's patch 
should be duplicated to also apply to GMP?


--
Marc Glisse


Re: [RFA][PR rtl-optimization/70263] Fix creation of new REG_EQUIV notes

2016-03-19 Thread Jeff Law

On 03/17/2016 12:23 PM, Bernd Schmidt wrote:

On 03/17/2016 06:37 PM, Jeff Law wrote:

+  bitmap seen_insns;

+  seen_insns = BITMAP_ALLOC (NULL);


You could save an allocation here by making this a bitmap_head and using
bitmap_initialize.

Done.




+  bitmap_set_bit (seen_insns, INSN_UID (insn));
+
   if (! INSN_P (insn))
 continue;

@@ -3646,7 +3656,8 @@ update_equiv_regs (void)
   && ! find_reg_note (XEXP (reg_equiv[regno].init_insns, 0),
   REG_EQUIV, NULL_RTX)
   && ! contains_replace_regs (XEXP (dest, 0))
-  && ! pdx_subregs[regno])
+  && ! pdx_subregs[regno]
+  && ! bitmap_bit_p (seen_insns, INSN_UID (insn)))


This looks odd to me. Isn't this condition always false? Did you want to
test the init_insn?
Right, we need to test that we found INIT_INSN prior to handling INSN. 
The test needs to go into the next conditional, after we initialize 
INIT_INSN.
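
The ordering check described here can be sketched independently of GCC's data structures. The following is a minimal model, assuming a plain array of insn UIDs in chain order and a fixed-size boolean table in place of the bitmap; all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_UID 64	/* Arbitrary bound for this sketch.  */

/* UIDS lists insn ids in chain order.  Return true only if INIT_UID
   was already visited when the walk reaches USE_UID, i.e. the
   initializing insn does not come after the insn being handled.  */
bool
init_seen_before_use (const unsigned *uids, size_t n,
		      unsigned init_uid, unsigned use_uid)
{
  bool seen[MAX_UID] = { false };

  for (size_t i = 0; i < n; i++)
    {
      assert (uids[i] < MAX_UID);
      /* Mark the insn first, then test, mirroring the order used in
	 the loop of the patch.  */
      seen[uids[i]] = true;
      if (uids[i] == use_uid)
	return init_uid < MAX_UID && seen[init_uid];
    }
  return false;		/* USE_UID is not in the chain.  */
}
```

Testing the insn currently being walked, rather than the init insn, would always find the bit set, which is why the first version of the condition never allowed a note to be created.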


I double-checked, we'll only set REG_BASIC_BLOCK if the set and all 
users are in the same block.  That was my recollection, but after the 
major goof in V1 of the patch, I wanted to be sure.


I also added a blurb to the dump file when we create these equivalences 
and included a test to verify the code fires.  I verified it fired on 
x86 and x86-64.  It may or may not fire on other targets, so I left the 
test in the i386 specific subdirectory.



Bootstrapped and regression tested on x86_64-linux-gnu.  And unlike the 
last version, it doesn't totally disable the note creation (as verified 
by the new test ;-)


OK for the trunk?

Thanks,
Jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index b7711b8..6b57495 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2016-03-18  Jeff Law  
+
+   PR rtl-optimization/70263
+   * ira.c (memref_used_between_p): Assert we found END in the insn chain.
+   (update_equiv_regs): When trying to move a store to after the insn
+   that sets the source of the store, make sure the store occurs after
+   the insn that sets the source of the store.  When successful note
+   the REG_EQUIV note created in the dump file.
+
 2016-03-17  H.J. Lu  
 
PR driver/70192
diff --git a/gcc/ira.c b/gcc/ira.c
index 062b8a4..c12318a 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -3225,13 +3225,18 @@ memref_referenced_p (rtx memref, rtx x)
 }
 
 /* TRUE if some insn in the range (START, END] references a memory location
-   that would be affected by a store to MEMREF.  */
+   that would be affected by a store to MEMREF.
+
+   Callers should not call this routine if START is after END in the
+   RTL chain.  */
+
 static int
 memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
 {
   rtx_insn *insn;
 
-  for (insn = NEXT_INSN (start); insn != NEXT_INSN (end);
+  for (insn = NEXT_INSN (start);
+   insn && insn != NEXT_INSN (end);
insn = NEXT_INSN (insn))
 {
   if (!NONDEBUG_INSN_P (insn))
@@ -3245,6 +3250,7 @@ memref_used_between_p (rtx memref, rtx_insn *start, rtx_insn *end)
return 1;
 }
 
+  gcc_assert (insn == NEXT_INSN (end));
   return 0;
 }
 
@@ -3337,6 +3343,7 @@ update_equiv_regs (void)
   int loop_depth;
   bitmap cleared_regs;
   bool *pdx_subregs;
+  bitmap_head seen_insns;
 
   /* Use pdx_subregs to show whether a reg is used in a paradoxical
  subreg.  */
@@ -3606,11 +3613,14 @@ update_equiv_regs (void)
   /* A second pass, to gather additional equivalences with memory.  This needs
  to be done after we know which registers we are going to replace.  */
 
+  bitmap_initialize (&seen_insns, NULL);
   for (insn = get_insns (); insn; insn = NEXT_INSN (insn))
 {
   rtx set, src, dest;
   unsigned regno;
 
+  bitmap_set_bit (&seen_insns, INSN_UID (insn));
+
   if (! INSN_P (insn))
continue;
 
@@ -3651,6 +3661,7 @@ update_equiv_regs (void)
  rtx_insn *init_insn =
as_a <rtx_insn *> (XEXP (reg_equiv[regno].init_insns, 0));
  if (validate_equiv_mem (init_insn, src, dest)
+ && bitmap_bit_p (&seen_insns, INSN_UID (init_insn))
  && ! memref_used_between_p (dest, init_insn, insn)
  /* Attaching a REG_EQUIV note will fail if INIT_INSN has
 multiple sets.  */
@@ -3661,9 +3672,15 @@ update_equiv_regs (void)
  ira_reg_equiv[regno].init_insns
= gen_rtx_INSN_LIST (VOIDmode, insn, NULL_RTX);
  df_notes_rescan (init_insn);
+ if (dump_file)
+   fprintf (dump_file,
+"Adding REG_EQUIV to insn %d for source of insn %d\n",
+INSN_UID (init_insn),
+INSN_UID (insn));
}
}
 }
+  bitmap_clear (&seen_insns);
 
   cleared_regs = BITMAP_ALLOC (NULL);
   /* Now scan all regs killed in an insn to see if any of them are
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 7fe7295..336de6a 100644
--- a/gcc/testsuite/ChangeLog
+++ 

[PATCH, i386, AVX-512] Emit vpbroadcastq instead of non-existent vbroadcastsd.

2016-03-19 Thread Kirill Yukhin
Hello,
Intel spec [1] states that almost all variants of the broadcasting
instructions are available, except for (p. 2-4)
vbroadcastsd %xmm, %xmm
It is safe to emit
vpbroadcastq %xmm, %xmm
instead.

I was unable to extract a testcase, but if this insn is generated,
we'll get an asm error.

[1] - 
https://software.intel.com/sites/default/files/managed/b4/3a/319433-024.pdf

Bootstrapped and regtested.

Richard,
is it ok to check in to main trunk?

gcc/
* config/i386/sse.md: Use vpbroadcastq for broadcasting DF
values to 128b regs.

--
Thanks, K

commit 72e85f1b936d61edc93603862c810a8b4817b8a7
Author: Kirill Yukhin 
Date:   Thu Mar 17 18:05:22 2016 +0300

AVX-512. Use vpbroadcastq for broadcasting DF values to 128b regs.

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 3c521b3..b25c246 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -17269,7 +17269,14 @@
(match_operand: 1 "nonimmediate_operand" "vm")
(parallel [(const_int 0)]]
   "TARGET_AVX512F"
-  "vbroadcast\t{%1, %0|%0, %1}"
+{
+  /*  There is no DF broadcast (in AVX-512*) to 128b register.
+  Mimic it with integer variant.  */
+  if (mode == V2DFmode)
+return "vpbroadcastq\t{%1, %0|%0, %1}";
+  else
+return "vbroadcast\t{%1, %0|%0, %1}";
+}
   [(set_attr "type" "ssemov")
(set_attr "prefix" "evex")
(set_attr "mode" "")])


[PATCH, aarch64] Fix 70048

2016-03-19 Thread Richard Henderson

This fixes only the regression described in the PR.

There was quite a bit of follow-up that points to new work that ought to be 
done during the gcc7 cycle, but isn't really appropriate now.


Tested on aarch64-linux; committed as reviewed in the PR.


r~
PR target/70048
* config/aarch64/aarch64.c (virt_or_elim_regno_p): New.
(aarch64_classify_address): Use it.
(aarch64_legitimize_address): Force all subexpressions of PLUS
into registers.  Simplify as (sfp+const)+reg or (reg+reg)+const.


diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index cf1239d..12e498d 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -3847,6 +3847,18 @@ aarch64_mode_valid_for_sched_fusion_p (machine_mode mode)
 && GET_MODE_SIZE (mode) == 8);
 }
 
+/* Return true if REGNO is a virtual pointer register, or an eliminable
+   "soft" frame register.  Like REGNO_PTR_FRAME_P except that we don't
+   include stack_pointer or hard_frame_pointer.  */
+static bool
+virt_or_elim_regno_p (unsigned regno)
+{
+  return ((regno >= FIRST_VIRTUAL_REGISTER
+  && regno <= LAST_VIRTUAL_POINTER_REGISTER)
+ || regno == FRAME_POINTER_REGNUM
+ || regno == ARG_POINTER_REGNUM);
+}
+
 /* Return true if X is a valid address for machine mode MODE.  If it is,
fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
effect.  OUTER_CODE is PARALLEL for a load/store pair.  */
@@ -3890,9 +3902,7 @@ aarch64_classify_address (struct aarch64_address_info *info,
 
   if (! strict_p
  && REG_P (op0)
- && (op0 == virtual_stack_vars_rtx
- || op0 == frame_pointer_rtx
- || op0 == arg_pointer_rtx)
+ && virt_or_elim_regno_p (REGNO (op0))
  && CONST_INT_P (op1))
{
  info->type = ADDRESS_REG_IMM;
@@ -4953,74 +4963,43 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
 
   if (GET_CODE (x) == PLUS && CONST_INT_P (XEXP (x, 1)))
 {
-  HOST_WIDE_INT offset = INTVAL (XEXP (x, 1));
-  HOST_WIDE_INT base_offset;
+  rtx base = XEXP (x, 0);
+  rtx offset_rtx = XEXP (x, 1);
+  HOST_WIDE_INT offset = INTVAL (offset_rtx);
 
-  if (GET_CODE (XEXP (x, 0)) == PLUS)
+  if (GET_CODE (base) == PLUS)
{
- rtx op0 = XEXP (XEXP (x, 0), 0);
- rtx op1 = XEXP (XEXP (x, 0), 1);
+ rtx op0 = XEXP (base, 0);
+ rtx op1 = XEXP (base, 1);
 
- /* Address expressions of the form Ra + Rb + CONST.
+ /* Force any scaling into a temp for CSE.  */
+ op0 = force_reg (Pmode, op0);
+ op1 = force_reg (Pmode, op1);
 
-If CONST is within the range supported by the addressing
-mode "reg+offset", do not split CONST and use the
-sequence
-  Rt = Ra + Rb;
-  addr = Rt + CONST.  */
- if (REG_P (op0) && REG_P (op1))
-   {
- machine_mode addr_mode = GET_MODE (x);
- rtx base = gen_reg_rtx (addr_mode);
- rtx addr = plus_constant (addr_mode, base, offset);
+ /* Let the pointer register be in op0.  */
+ if (REG_POINTER (op1))
+   std::swap (op0, op1);
 
- if (aarch64_legitimate_address_hook_p (mode, addr, false))
-   {
- emit_insn (gen_adddi3 (base, op0, op1));
- return addr;
-   }
-   }
- /* Address expressions of the form Ra + Rb< 16
  || mode == TImode)
base_offset = ((offset + 64 * GET_MODE_SIZE (mode))
@@ -5032,15 +5011,12 @@ aarch64_legitimize_address (rtx x, rtx /* orig_x  */, machine_mode mode)
   else
base_offset = offset & ~0xfff;
 
-  if (base_offset == 0)
-   return x;
-
-  offset -= base_offset;
-  rtx base_reg = gen_reg_rtx (Pmode);
-  rtx val = force_operand (plus_constant (Pmode, XEXP (x, 0), base_offset),
-  NULL_RTX);
-  emit_move_insn (base_reg, val);
-  x = plus_constant (Pmode, base_reg, offset);
+  if (base_offset != 0)
+   {
+ base = plus_constant (Pmode, base, base_offset);
+ base = force_operand (base, NULL_RTX);
+ return plus_constant (Pmode, base, offset - base_offset);
+   }
 }
 
   return x;
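
The arithmetic in the last hunk (base_offset = offset & ~0xfff, then offset - base_offset) can be tried in isolation. The sketch below covers only that default 4 KiB case; the struct and function names are hypothetical, and the other mode-dependent cases of the real function are left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: ADDR + OFFSET is rewritten as
   (ADDR + base_offset) + residual.  */
struct split_offset
{
  int64_t base_offset;	/* Multiple of 4096, added to the base once.  */
  int64_t residual;	/* Left in the address, in [0, 0xfff].  */
};

struct split_offset
split_offset_4k (int64_t offset)
{
  struct split_offset s;

  s.base_offset = offset & ~(int64_t) 0xfff;
  s.residual = offset - s.base_offset;
  /* Clearing the low 12 bits always leaves a non-negative residual
     that fits a 12-bit unsigned immediate.  */
  assert (s.residual >= 0 && s.residual <= 0xfff);
  return s;
}
```

For example, an offset of 0x12345 splits into a base adjustment of 0x12000 and a residual of 0x345, so nearby accesses can share the adjusted base.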


Re: [PATCH] Fix PR64764

2016-03-19 Thread H.J. Lu
On Wed, Mar 16, 2016 at 9:35 AM, Tom de Vries  wrote:
> On 16/03/16 17:15, H.J. Lu wrote:
>>
>> On Wed, Mar 16, 2016 at 9:12 AM, H.J. Lu  wrote:
>
>
>>> Any particular reason why this test was changed to DOS format?
>
>
> FWIW, the test was in DOS format from the start.
>
>

DOS format was introduced by r220530:

Index: gcc.dg/uninit-19.c
===
--- gcc.dg/uninit-19.c  (revision 220529)
+++ gcc.dg/uninit-19.c  (revision 220530)
@@ -10,7 +10,7 @@ fn1 (int p1, float *f1, float *f2, float
  unsigned char *c2, float *p10)^M
 {^M
   if (p1 & 8)^M
-b[3] = p10[a];  /* { dg-warning "may be used uninitialized" } */^M
+b[3] = p10[a];  /* 13.  */^M
 }^M
 ^M
 void^M
@@ -19,5 +19,8 @@ fn2 ()
   float *n;^M
   if (l & 6)^M
 n =  + m;^M
-  fn1 (l, , , , , , , n);^M
+  fn1 (l, , , , , , , n);  /* 22.  */^M
 }^M
+^M
+/* { dg-warning "may be used uninitialized" "" { target nonpic } 13 } */^M
+/* { dg-warning "may be used uninitialized" "" { target { ! nonpic } } 22 } */^M

"^M" was added to those changed lines.

-- 
H.J.
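
One way to check whether a file is in DOS format is to count the lines that end in "\r\n" (the "^M" shown above is the "\r"). A minimal sketch, assuming the file contents are already in memory; the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the lines in BUF[0..LEN) that are terminated by "\r\n".
   A result of zero means no DOS line endings were found.  */
size_t
count_crlf_lines (const char *buf, size_t len)
{
  size_t count = 0;

  assert (buf != NULL || len == 0);
  for (size_t i = 1; i < len; i++)
    if (buf[i] == '\n' && buf[i - 1] == '\r')
      count++;
  return count;
}
```

Comparing this count with the total number of newlines tells whether a file is entirely DOS format, entirely Unix format, or mixed.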


Patch ping

2016-03-19 Thread Jakub Jelinek
Hi!

I'd like to ping a C++ PR70144 patch:
  http://gcc.gnu.org/ml/gcc-patches/2016-03/msg00653.html
Since then, Zdenek has filed another PR with __builtin_constant_p (foo)
where foo is FUNCTION_DECL, and only the above version fixes that, not the
other one.

Jakub


Re: [C PATCH] Fix up composite_types (PR c/70280)

2016-03-19 Thread Joseph Myers
On Thu, 17 Mar 2016, Jakub Jelinek wrote:

> Hi!
> 
> Zdenek reported a compare debug issue, where it is dumping used function
> prototypes and there is a difference between -g0 and -g in
> -2:   static int BIO_vsnprintf (char *, size_t, const char *, struct  *, 
> void, ...);
> +2:   static int BIO_vsnprintf (char *, size_t, const char *, struct  *);
> The former is of course wrong, and the reason for that is that C FE's
> composite_type mishandles void_list_node - that should terminate the list
> if there are no varargs, it is not a void argument or something similar,
> and if the original lists are terminated by that, the new one should be too.
> Testcase is not included, as it is too large and reduction didn't work very
> well for that.
> 
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Use simplify_replace_rtx instead of replace_rtx for DEBUG_INSNs in reload

2016-03-19 Thread Richard Biener
On Fri, 18 Mar 2016, Jakub Jelinek wrote:

> Hi!
> 
> This patch fixes one of the spots that use replace_rtx, as this changes
> debug insns, it clearly wants to replace just based on regno, not on pointer
> equality, and for debug insns simplification is always desirable too.
> The other place in reload1 that modifies DEBUG_INSNs also uses
> simplify_replace_rtx.
> 
> Bootstrapped/regtested on {x86_64,i686,powerpc64{,le}}-linux, ok for trunk?

Ok.

Thanks,
Richard.

> 2016-03-18  Jakub Jelinek  
> 
>   * reload1.c (emit_input_reload_insns): Use simplify_replace_rtx
>   instead of replace_rtx for DEBUG_INSNs.
> 
> --- gcc/reload1.c.jj  2016-03-02 07:39:13.0 +0100
> +++ gcc/reload1.c 2016-03-16 10:41:34.622921016 +0100
> @@ -7395,7 +7395,9 @@ emit_input_reload_insns (struct insn_cha
> /* Adjust any debug insns between temp and insn.  */
> while ((temp = NEXT_INSN (temp)) != insn)
>   if (DEBUG_INSN_P (temp))
> -   replace_rtx (PATTERN (temp), old, reloadreg);
> +   INSN_VAR_LOCATION_LOC (temp)
> + = simplify_replace_rtx (INSN_VAR_LOCATION_LOC (temp),
> + old, reloadreg);
>   else
> gcc_assert (NOTE_P (temp));
>   }


Re: [PATCH, PR70161] Fix fdump-ipa-all-graph

2016-03-19 Thread Tom de Vries

On 15/03/16 12:37, Richard Biener wrote:

On Mon, 14 Mar 2016, Tom de Vries wrote:


Hi,

this patch fixes PR70161, a 4.9/5/6 regression.

Currently when using -fdump-ipa-all-graph, the compiler ICEs in
execute_function_dump when testing for pass->graph_dump_initialized, because
pass == NULL.

The patch fixes:
- the ICE by setting the pass argument in the call to
   execute_function_dump in execute_one_ipa_transform_pass
- a subsequent ICE (triggered with -fipa-pta) by saving, resetting and
   restoring dump_file_name in cgraph_node::get_body, alongside the
   saving and restoring of the dump_file variable.
- the duplicate edges in the subsequently generated dot file by
   ensuring that execute_function_dump is called only once per function
   per pass. [ Note that this bit also has an effect for the normal dump
   files for the ipa passes with transform function. For those functions,
   atm execute_function_dump is called both after execute and after
   transform. With the patch, it's only called after transform. ]

Bootstrapped and reg-tested on x86_64.

OK for stage4?


Ok.


All of the patch also OK for 4.9/5 branch?

[ The first 2 bits fix ICEs. The last part fixes a duplicate edges 
problem in the dot file, I'm not sure if that's needed in the release 
branches. ]


Thanks,
- Tom





[PATCH, PR70269] Set dump_file to NULL in cgraph_node::get_body

2016-03-19 Thread Tom de Vries

Hi,

this patch fixes PR70269, a 5/6 regression.

When compiling with "-O2 -fipa-pta -fdump-ipa-pta-graph" we try to 
initialize a graph dump file for ipa-cp, while the dump file is not 
enabled, which causes an ICE because dump_file_name is NULL.


This condition in pass_init_dump_file enables the unnecessary 
initialization, because dump_file is non-NULL:

...
  if (initializing_dump
  && dump_file && (dump_flags & TDF_GRAPH)
  && cfun && (cfun->curr_properties & PROP_cfg))
...

The dump_file is non-NULL, but it's the dump file for ipa-pta, the pass 
that calls cgraph_node::get_body which triggers the ipa transform of ipa-cp.


The patch fixes this by resetting dump_file to NULL in 
cgraph_node::get_body.
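
The save, reset and restore pattern involved can be sketched outside of GCC. In the sketch below the two globals only mimic the names used in the mail; the nested function and the flag it sets are hypothetical stand-ins for the ipa transforms and for the check in pass_init_dump_file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the dump state of the pass that is currently running.  */
FILE *dump_file;
const char *dump_file_name;

/* Set by the nested work if it found a dump file that was not its own.  */
int nested_saw_dump_file;

static void
run_nested_transforms (void)
{
  nested_saw_dump_file = (dump_file != NULL);
}

void
get_body_sketch (void)
{
  /* Save the caller's dump state.  */
  FILE *saved_dump_file = dump_file;
  const char *saved_dump_file_name = dump_file_name;

  /* Reset both, so the nested work cannot mistake the caller's dump
     file for its own.  */
  dump_file = NULL;
  dump_file_name = NULL;

  run_nested_transforms ();

  /* Restore the caller's dump state.  */
  dump_file = saved_dump_file;
  dump_file_name = saved_dump_file_name;
}
```

Resetting only one of the two variables leaves them inconsistent for the nested work, which is the situation described above: a non-NULL dump_file together with a NULL dump_file_name.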


OK for stage 4 trunk/5 branch if bootstrap and reg-test succeed?

Thanks,
- Tom
Set dump_file to NULL in cgraph_node::get_body

2016-03-17  Tom de Vries  

	PR ipa/70269
	* cgraph.c (cgraph_node::get_body): Set dump_file to NULL after save.

	* gcc.dg/pr70269.c: New test.

---
 gcc/cgraph.c   | 1 +
 gcc/testsuite/gcc.dg/pr70269.c | 7 +++
 2 files changed, 8 insertions(+)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 518ef24..4804081 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -3372,6 +3372,7 @@ cgraph_node::get_body (void)
   const char *saved_dump_file_name = dump_file_name;
   int saved_dump_flags = dump_flags;
   dump_file_name = NULL;
+  dump_file = NULL;
 
   push_cfun (DECL_STRUCT_FUNCTION (decl));
   execute_all_ipa_transforms ();
diff --git a/gcc/testsuite/gcc.dg/pr70269.c b/gcc/testsuite/gcc.dg/pr70269.c
new file mode 100644
index 000..030cea1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70269.c
@@ -0,0 +1,7 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fipa-pta -fdump-ipa-pta-graph" } */
+
+void
+foo (void)
+{
+}


Re: [PATCH V3]PR other/70268: map one directory name (old) to another (new) in __FILE__

2016-03-19 Thread Bernhard Reutner-Fischer
On March 18, 2016 6:16:46 AM GMT+01:00, Hongxu Jia wrote:



+/* Perform user-specified mapping of __FILE__ prefixes.  Return
+   the new name corresponding to filename.  */
+
+const char *
+remap_file_filename (const char *filename)
+{
+  file_prefix_map *map;
+  char *s;
+  const char *name;
+  size_t name_len;
+
+  for (map = file_prefix_maps; map; map = map->next)
+if (filename_ncmp (filename, map->old_prefix, map->old_len) == 0)
+  break;
+  if (!map)
+return filename;
+  name = filename + map->old_len;
+  name_len = strlen (name) + 1;
+  s = (char *) alloca (name_len + map->new_len);
+  memcpy (s, map->new_prefix, map->new_len);
+  memcpy (s + map->new_len, name, name_len);
+
+  return xstrdup (s);
+}

Please explain why you first alloca() and then xstrdup() the result instead of 
using XNEWVEC.
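
For comparison, a single-allocation variant might look like the sketch below. It is not the proposed patch: it uses plain malloc and strncmp as stand-ins for GCC's XNEWVEC and filename_ncmp, takes a single prefix pair instead of walking a map list, and returns a fresh copy even when nothing matches so the caller can always free the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Replace a leading OLD_PREFIX of FILENAME with NEW_PREFIX.  Returns
   a malloc'd string (NULL on allocation failure); the caller frees it.  */
char *
remap_prefix (const char *filename, const char *old_prefix,
	      const char *new_prefix)
{
  size_t old_len, new_len, rest_len;
  const char *rest;
  char *s;

  assert (filename != NULL && old_prefix != NULL && new_prefix != NULL);
  old_len = strlen (old_prefix);
  new_len = strlen (new_prefix);

  if (strncmp (filename, old_prefix, old_len) == 0)
    rest = filename + old_len;
  else
    {
      /* No match: copy the name unchanged.  */
      rest = filename;
      new_len = 0;
    }

  rest_len = strlen (rest) + 1;		/* Include the terminating NUL.  */
  s = malloc (new_len + rest_len);	/* The only allocation.  */
  if (s == NULL)
    return NULL;
  memcpy (s, new_prefix, new_len);
  memcpy (s + new_len, rest, rest_len);
  return s;
}
```

Building the string directly in heap memory avoids both the temporary stack buffer and the second copy.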

Thanks,



Re: [PATCH] c++/65579 - set readonly bit on static constexpr members of templates

2016-03-19 Thread Martin Sebor

On 03/17/2016 08:07 AM, Jason Merrill wrote:

On 03/09/2016 05:09 PM, Martin Sebor wrote:

While going through constexpr bugs looking for background
on one I'm currently working on I came across bug 65579 -
[C++11] gcc requires definition of a static constexpr member
even though it is not odr-used.

The bug points out that GCC (sometimes) emits references to
static constexpr data members of class templates even when
they aren't odr-used.  (A more detailed analysis of why this
happens is in my comment #1 on the bug.)


This should have been fixed up in complete_vars; why didn't it work?


IIUC, the job of complete_vars(CLASS) is to complete the type of
variables of class CLASS.  In the test case from the bug (below),
the type of constexpr_member is not CLASS (struct B) but rather
struct A<0>, so its type is never completed there, or apparently
anywhere else.

  template  struct A { int i; };
  struct B {
  static constexpr A<0> constexpr_member = { 1 };
  };

Martin


Re: [AArch64] Emit square root using the Newton series

2016-03-19 Thread Evandro Menezes

On 03/17/16 09:55, James Greenhalgh wrote:

On Wed, Mar 16, 2016 at 02:45:37PM -0500, Evandro Menezes wrote:

On 03/08/16 16:08, Evandro Menezes wrote:

On 02/16/16 14:56, Evandro Menezes wrote:

On 12/08/15 15:35, Evandro Menezes wrote:

Emit square root using the Newton series

   2015-12-03  Evandro Menezes  

   gcc/
* config/aarch64/aarch64-protos.h (aarch64_emit_swsqrt): Declare new
function.
* config/aarch64/aarch64-simd.md (sqrt2): New expansion and insn
definitions.
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_FAST_SQRT): New tuning macro.
* config/aarch64/aarch64.c (aarch64_emit_swsqrt): Define new function.
* config/aarch64/aarch64.md (sqrt2): New expansion and insn
definitions.
* config/aarch64/aarch64.opt (mlow-precision-recip-sqrt): Expand option
description.
* doc/invoke.texi (mlow-precision-recip-sqrt): Likewise.

This patch extends the patch that added support for
implementing x^-1/2 using the Newton series by adding support
for x^1/2 as well.

Is it OK at this point of stage 3?

Thank you,


James,

As I was saying, this patch results in some validation errors in
CPU2000 benchmarks using DF.  Although proving the algorithm to
be pretty solid with a vast set of random values, I'm confused
why some benchmarks fail to validate with this implementation of
the Newton series for square root too, when they pass with the
Newton series for reciprocal square root.

Since I had no problems with the same algorithm on x86-64, I
wonder if the initial estimate on AArch64, which offers just 8
bits, whereas x86-64 offers 11 bits, has to do with it.  Then
again, the algorithm iterated 1 less time on x86-64 than on
AArch64.

Since it seems that the initial estimate is sufficient for
CPU2000 to validate when using SF, I'm leaning towards
restricting the Newton series for square root only for SF.
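
For reference, the textbook form of the iteration under discussion is sketched below. It is not the code from the patch, which is not shown in this message: the function names are hypothetical, the initial estimate is passed in rather than coming from a hardware estimate instruction, and zero, infinity and NaN inputs are not handled.

```c
#include <assert.h>

/* Refine ESTIMATE of 1/sqrt(X) with ITERATIONS Newton steps:
     y' = y * (1.5 - 0.5 * x * y * y)
   Each step roughly squares the relative error, so the number of
   correct bits about doubles per step.  */
double
newton_rsqrt (double x, double estimate, int iterations)
{
  double y = estimate;

  assert (x > 0.0 && estimate > 0.0 && iterations >= 0);
  for (int i = 0; i < iterations; i++)
    y = y * (1.5 - 0.5 * x * y * y);
  return y;
}

/* sqrt(x) obtained as x * (1/sqrt(x)).  */
double
newton_sqrt (double x, double estimate, int iterations)
{
  return x * newton_rsqrt (x, estimate, iterations);
}
```

Because the correct bits about double with every step, the accuracy of the starting estimate bounds how many steps a given precision needs, which is why the 8-bit versus 11-bit difference is a plausible suspect.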

Your thoughts on the matter are appreciated,

Add choices for the reciprocal square root approximation

Allow a target to prefer such operation depending on the FP
   precision.

gcc/
* config/aarch64/aarch64-protos.h
(AARCH64_EXTRA_TUNE_APPROX_RSQRT): New macro.
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF): New mask.
(AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF): Likewise.
* config/aarch64/aarch64.c
(use_rsqrt_p): New argument for the mode.
(aarch64_builtin_reciprocal): Devise mode from builtin.
(aarch64_optab_supported_p): New argument for the mode.


Now that the patch is attached, feedback is appreciated.

Ping.

Hi Evandro,

I thought this was on hold while you looked in to the underlying issue for
the failures in the other thread? With that said, I'm struggling to keep
up with where we are on this, so maybe it is time for a clean break - a new
thread for patch set v2, proposed as an explicit patch series (just to keep
the dependencies clear to me).

I'm not convinced of the value of this split, nor why we would stop here
if it was useful (vector modes vs. scalar modes would also seem an
important distinction).

If you no longer need the workaround this enables then I'm not sure I see a
good reason for it to go in, maybe I'm missing a target for which this
would be important?


Hi, James.

OK, I'll start a thread over.

Thank you,

--
Evandro Menezes



Re: [PATCH] Add debug_varinfo and debug_varmap

2016-03-19 Thread Richard Biener
On Wed, 16 Mar 2016, Tom de Vries wrote:

> [ was: Re: [RFC] dump_varmap in tree-ssa-structalias.c ]
> On 10/03/16 10:07, Richard Biener wrote:
> > On Thu, 10 Mar 2016, Tom de Vries wrote:
> > 
> > > Hi,
> > > 
> > > I wrote attached patch to print the actual contents of the varmap variable
> > > in
> > > tree-ssa-structalias.c.
> > > 
> > > Does it make sense to rewrite this into a dump_varmap/debug_varmap patch?
> > 
> > Yes (but please not dump it by default)
> 
> Right, that was my intention as well.
> 
> > and I'd rather have a
> > split-out dump_varinfo to work with when debugging.
> 
> Done.
> 
> OK for stage1 if bootstrap and reg-test succeeds?

Ok.

Richard.


[gomp-nvptx 7/7] nvptx backend: define STACK_SIZE_MODE

2016-03-19 Thread Alexander Monakov
Default definition of STACK_SIZE_MODE is word_mode, which is DImode on NVPTX.
However, stack pointer mode matches pointer mode, so needs to be SImode on
32-bit NVPTX ABI.  Define it to Pmode to fix 32-bit code generation.

* config/nvptx/nvptx.h (STACK_SIZE_MODE): Define.
---
 gcc/ChangeLog.gomp-nvptx | 4 
 gcc/config/nvptx/nvptx.h | 1 +
 2 files changed, 5 insertions(+)

diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
index 7810cca..6da4d06 100644
--- a/gcc/config/nvptx/nvptx.h
+++ b/gcc/config/nvptx/nvptx.h
@@ -83,6 +83,7 @@
 
 #define POINTER_SIZE (TARGET_ABI64 ? 64 : 32)
 #define Pmode (TARGET_ABI64 ? DImode : SImode)
+#define STACK_SIZE_MODE Pmode
 
 /* Registers.  Since ptx is a virtual target, we just define a few
hard registers for special purposes and leave pseudos unallocated.


Re: [PATCH] Fix PR c++/70218 (illegal access to private field succeeds)

2016-03-19 Thread Patrick Palka
On Wed, Mar 16, 2016 at 5:20 PM, Matthias Klose  wrote:
> On 13.03.2016 21:03, Patrick Palka wrote:
>>
>> Here we are mishandling the deferred_access_stack by not coherently
>> pushing/popping from it.  In cp_parser_lambda_expression we are calling
>> (in order):
>>
>>push_deferring_access_checks (dk_no_deferred);
>>cp_parser_start_tentative_firewall (parser);
>>...
>>pop_deferring_access_checks ();
>>cp_parser_end_tentative_firewall (parser, start, lambda_expr);
>>
>> But the order of the last two popping calls does not correspond with the
>> order
>> of the first two pushing calls.  pop_deferring_access_checks should be
>> called last.  This error may cause us to drop deferred access checks
>> instead of performing them.
>>
>> Bootstrap + regtest in progress, does this look OK to commit if testing
>> succeeds?
>
>
> when applying this patch to the gcc-5-branch I see regressions like
>
> /scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:
> In function 'void foo()':
> /scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:6:8:
> error: 'int X::i' is private
> /scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:16:18:
> error: within this context
>
> Excess errors:
> /scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:6:8:
> error: 'int X::i' is private
> /scratch/packages/gcc/5/gcc-5-5.3.1/src/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-70218.C:16:18:
> error: within this context
>
>
> haven't yet checked the trunk. I don't see any other regressions besides the
> usual noise in the ubsan tests.

lambda-70218.C is the test case that this patch adds.  It looks like
GCC 5 and 6 report access errors that originate inside a lambda
slightly differently.  So I think the dg-error directives in
lambda-70218.C would just have to be trivially adjusted for a GCC 5
backport.


[AArch64] Add precision choices for the reciprocal square root approximation

2016-03-19 Thread Evandro Menezes

Add precision choices for the reciprocal square root approximation

Allow a target to prefer such operation depending on the FP
   precision.

gcc/
* config/aarch64/aarch64-protos.h
(AARCH64_EXTRA_TUNE_APPROX_RSQRT): New macro.
* config/aarch64/aarch64-tuning-flags.def
(AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF): New mask.
(AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF): Likewise.
* config/aarch64/aarch64.c
(use_rsqrt_p): New argument for the mode.
(aarch64_builtin_reciprocal): Devise mode from builtin.
(aarch64_optab_supported_p): New argument for the mode.

This patch allows a target to choose for which FP precision the 
reciprocal square root approximation is used.


For example, although this approximation noticeably improves performance 
for DF on A57, for SF it helps little, if at all.
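
The per-precision selection can be sketched as a simple flag test. The names below are hypothetical simplifications of the macros in the patch, and the sketch leaves out the flag_trapping_math and flag_unsafe_math_optimizations conditions that the real use_rsqrt_p also checks.

```c
#include <assert.h>
#include <stdbool.h>

enum fp_mode { MODE_SF, MODE_DF };	/* Element mode of the operation.  */

#define TUNE_APPROX_RSQRT_DF (1u << 0)
#define TUNE_APPROX_RSQRT_SF (1u << 1)
/* Both precisions, matching the combined macro added by the patch.  */
#define TUNE_APPROX_RSQRT (TUNE_APPROX_RSQRT_DF | TUNE_APPROX_RSQRT_SF)

/* Return true if the approximation should be used for MODE, either
   because the tuning flags ask for it at that precision or because
   the user forced it on the command line.  */
bool
use_rsqrt_for_mode (unsigned tuning_flags, enum fp_mode mode,
		    bool user_forced)
{
  if (user_forced)
    return true;

  switch (mode)
    {
    case MODE_SF:
      return (tuning_flags & TUNE_APPROX_RSQRT_SF) != 0;
    case MODE_DF:
      return (tuning_flags & TUNE_APPROX_RSQRT_DF) != 0;
    }
  assert (0 && "unknown mode");
  return false;
}
```

A core that only benefits at double precision would then set just the DF bit in its tuning flags.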


Feedback appreciated.

Thank you,

--
Evandro Menezes

From 95581aefcf324233c3603f4d8232ee18c5836f8a Mon Sep 17 00:00:00 2001
From: Evandro Menezes 
Date: Thu, 17 Mar 2016 17:00:03 -0500
Subject: [PATCH] Add precision choices for the reciprocal square root
 approximation

Allow a target to prefer such operation depending on the FP precision.

gcc/
	* config/aarch64/aarch64-protos.h
	(AARCH64_EXTRA_TUNE_APPROX_RSQRT): New macro.
	* config/aarch64/aarch64-tuning-flags.def
	(AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF): New mask.
	(AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF): Likewise.
	* config/aarch64/aarch64.c
	(use_rsqrt_p): New argument for the mode.
	(aarch64_builtin_reciprocal): Devise mode from builtin.
	(aarch64_optab_supported_p): New argument for the mode.
---
 gcc/config/aarch64/aarch64-protos.h |  3 +++
 gcc/config/aarch64/aarch64-tuning-flags.def |  3 ++-
 gcc/config/aarch64/aarch64.c| 23 +++
 3 files changed, 20 insertions(+), 9 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index dced209..58e5d73 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -263,6 +263,9 @@ enum aarch64_extra_tuning_flags
 };
 #undef AARCH64_EXTRA_TUNING_OPTION
 
+#define AARCH64_EXTRA_TUNE_APPROX_RSQRT \
+  (AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF | AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF)
+
 extern struct tune_params aarch64_tune_params;
 
 HOST_WIDE_INT aarch64_initial_elimination_offset (unsigned, unsigned);
diff --git a/gcc/config/aarch64/aarch64-tuning-flags.def b/gcc/config/aarch64/aarch64-tuning-flags.def
index 7e45a0c..57d9588 100644
--- a/gcc/config/aarch64/aarch64-tuning-flags.def
+++ b/gcc/config/aarch64/aarch64-tuning-flags.def
@@ -29,5 +29,6 @@
  AARCH64_TUNE_ to give an enum name. */
 
 AARCH64_EXTRA_TUNING_OPTION ("rename_fma_regs", RENAME_FMA_REGS)
-AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT)
+AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrt", APPROX_RSQRT_DF)
+AARCH64_EXTRA_TUNING_OPTION ("approx_rsqrtf", APPROX_RSQRT_SF)
 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 12e498d..e651123 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -7440,12 +7440,16 @@ aarch64_memory_move_cost (machine_mode mode ATTRIBUTE_UNUSED,
to optimize 1.0/sqrt.  */
 
 static bool
-use_rsqrt_p (void)
+use_rsqrt_p (machine_mode mode)
 {
   return (!flag_trapping_math
 	  && flag_unsafe_math_optimizations
-	  && ((aarch64_tune_params.extra_tuning_flags
-	   & AARCH64_EXTRA_TUNE_APPROX_RSQRT)
+	  && ((GET_MODE_INNER (mode) == SFmode
+	   && (aarch64_tune_params.extra_tuning_flags
+		   & AARCH64_EXTRA_TUNE_APPROX_RSQRT_SF))
+	  || (GET_MODE_INNER (mode) == DFmode
+		  && (aarch64_tune_params.extra_tuning_flags
+		  & AARCH64_EXTRA_TUNE_APPROX_RSQRT_DF))
 	  || flag_mrecip_low_precision_sqrt));
 }
 
@@ -7455,9 +7459,12 @@ use_rsqrt_p (void)
 static tree
 aarch64_builtin_reciprocal (tree fndecl)
 {
-  if (!use_rsqrt_p ())
-return NULL_TREE;
-  return aarch64_builtin_rsqrt (DECL_FUNCTION_CODE (fndecl));
+  machine_mode mode = TYPE_MODE (TREE_TYPE (fndecl));
+
+  if (use_rsqrt_p (mode))
+return aarch64_builtin_rsqrt (DECL_FUNCTION_CODE (fndecl));
+
+  return NULL_TREE;
 }
 
 typedef rtx (*rsqrte_type) (rtx, rtx);
@@ -13952,13 +13959,13 @@ aarch64_promoted_type (const_tree t)
 /* Implement the TARGET_OPTAB_SUPPORTED_P hook.  */
 
 static bool
-aarch64_optab_supported_p (int op, machine_mode, machine_mode,
+aarch64_optab_supported_p (int op, machine_mode mode1, machine_mode,
 			   optimization_type opt_type)
 {
   switch (op)
 {
 case rsqrt_optab:
-  return opt_type == OPTIMIZE_FOR_SPEED && use_rsqrt_p ();
+  return opt_type == OPTIMIZE_FOR_SPEED && use_rsqrt_p (mode1);
 
 default:
   return true;
-- 
1.9.1



Re: [PATCH] Fix PR c++/70218 (illegal access to private field succeeds)

2016-03-19 Thread Jason Merrill

OK.

Jason


[i386] Support .lbss etc. sections with Solaris as (PR target/59407)

2016-03-19 Thread Rainer Orth
gcc.target/i386/pr58218.c currently FAILs on 64-bit Solaris/x86 with the
native assembler:

FAIL: gcc.target/i386/pr58218.c (test for excess errors)

Excess errors:
Assembler: pr58218.c
"/var/tmp//cciHFIO7.s", line 3 : Section attributes do not match

.section .lbss,"aw",@nobits

It turns out x86_64 large sections need to be marked with an 'h' section
flag for as.  gas implicitly sets SHF_AMD64_LARGE based on section
names, but also accepts an 'l' for the same purpose.

The following patch fixes this by using the SECTION_MACH_DEP section
flag to mark large sections and emit the right flag in
default_elf_asm_named_section.
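
For illustration only, the flag-emitting step could be sketched roughly
as below.  The DEMO_* names and bit values are made up for this example;
they are not the constants from output.h, and this is not the code in
varasm.c.  The only point is that one machine-dependent bit maps to one
assembler-specific character ('h' for Sun as, 'l' for gas).

```c
/* Hypothetical stand-ins for section flag bits (not GCC's real values).  */
#define DEMO_SECTION_WRITE    0x1u
#define DEMO_SECTION_MACH_DEP 0x2u

/* Build an ELF section flag string such as "awh" into BUF, which must
   have room for at least 4 bytes.  MACH_DEP_FLAG is the character the
   assembler in use expects for the machine-dependent bit.  */
static const char *
demo_section_flag_string (unsigned int flags, char mach_dep_flag, char *buf)
{
  char *p = buf;

  *p++ = 'a';			/* This sketch treats every section as allocatable.  */
  if (flags & DEMO_SECTION_WRITE)
    *p++ = 'w';
  if (flags & DEMO_SECTION_MACH_DEP)
    *p++ = mach_dep_flag;	/* 'h' or 'l', chosen by the target header.  */
  *p = '\0';
  return buf;
}
```

With the write and machine-dependent bits set and 'h' passed in, the
sketch produces "awh"; a read-only large section with 'l' gives "al".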

Given this comment in output.h

#define SECTION_MACH_DEP 0x400  /* subsequent bits reserved for target */

handling only a single SECTION_MACH_DEP can be considered a hack.
Currently, only one user of SECTION_MACH_DEP (avr) uses more than one
section flag, so maybe I can get away with this for now.

A full solution would split out the part of
default_elf_asm_named_section that emits the flags into a new
default_elf_asm_section_flags which prints the flag string to a stream,
invoking it either via a macro that can be overridden or perhaps a target
hook (which seems not fully right either since those are object file
format agnostic and this is just a small part of emitting ELF named
sections).

The patch has been bootstrapped without regressions on
i386-pc-solaris2.12 (with both as and gas) and x86_64-pc-linux-gnu.
This is not a regression, so this may have to wait for GCC 7 stage 1.

Ok for mainline now or then?

Thanks.
Rainer


2016-03-15  Rainer Orth  

PR target/59407
* config/i386/i386.c (SECTION_LARGE): Define.
(x86_64_elf_select_section): Set it for large data/bss sections.
Only clear SECTION_WRITE for .lrodata.
(x86_64_elf_section_type_flags): Set SECTION_LARGE for large
data/bss sections.
* config/i386/sol2.h (MACH_DEP_SECTION_ASM_FLAG): Define.
* varasm.c (default_elf_asm_named_section): Grow flagchars.
[MACH_DEP_SECTION_ASM_FLAG] Emit MACH_DEP_SECTION_ASM_FLAG for
SECTION_MACH_DEP.
* doc/tm.texi.in (Sections, MACH_DEP_SECTION_ASM_FLAG): Describe.
* doc/tm.texi: Regenerate.

# HG changeset patch
# Parent  8470acf190fb0e7e0d710db9583ed9725f6a2888
Support .lbss etc. sections with Solaris as (PR target/59407)

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -6473,6 +6473,9 @@ ix86_in_large_data_p (tree exp)
   return false;
 }
 
+/* i386-specific section flag to mark large sections.  */
+#define SECTION_LARGE SECTION_MACH_DEP
+
 /* Switch to the appropriate section for output of DECL.
DECL is either a `VAR_DECL' node or a constant of some sort.
RELOC indicates whether forming the initial value of DECL requires
@@ -6485,7 +6488,7 @@ x86_64_elf_select_section (tree decl, in
   if (ix86_in_large_data_p (decl))
 {
   const char *sname = NULL;
-  unsigned int flags = SECTION_WRITE;
+  unsigned int flags = SECTION_WRITE | SECTION_LARGE;
   switch (categorize_decl_for_section (decl, reloc))
 	{
 	case SECCAT_DATA:
@@ -6512,7 +6515,7 @@ x86_64_elf_select_section (tree decl, in
 	case SECCAT_RODATA_MERGE_STR_INIT:
 	case SECCAT_RODATA_MERGE_CONST:
 	  sname = ".lrodata";
-	  flags = 0;
+	  flags &= ~SECTION_WRITE;
 	  break;
 	case SECCAT_SRODATA:
 	case SECCAT_SDATA:
@@ -6547,6 +6550,9 @@ x86_64_elf_section_type_flags (tree decl
 {
   unsigned int flags = default_section_type_flags (decl, name, reloc);
 
+  if (ix86_in_large_data_p (decl))
+flags |= SECTION_LARGE;
+
   if (decl == NULL_TREE
   && (strcmp (name, ".ldata.rel.ro") == 0
 	  || strcmp (name, ".ldata.rel.ro.local") == 0))
diff --git a/gcc/config/i386/sol2.h b/gcc/config/i386/sol2.h
--- a/gcc/config/i386/sol2.h
+++ b/gcc/config/i386/sol2.h
@@ -208,6 +208,14 @@ along with GCC; see the file COPYING3.  
 #undef TARGET_ASM_NAMED_SECTION
 #define TARGET_ASM_NAMED_SECTION i386_solaris_elf_named_section
 
+/* Sun as requires "h" flag for large sections, GNU as can do without, but
+   accepts "l".  */
+#ifdef USE_GAS
+#define MACH_DEP_SECTION_ASM_FLAG 'l'
+#else
+#define MACH_DEP_SECTION_ASM_FLAG 'h'
+#endif
+
 #ifndef USE_GAS
 /* Emit COMDAT group signature symbols for Sun as.  */
 #undef TARGET_ASM_FILE_END
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -7097,6 +7097,12 @@ defined, GCC will assume such a section 
 both this macro and @code{FINI_SECTION_ASM_OP}.
 @end defmac
 
+@defmac MACH_DEP_SECTION_ASM_FLAG
+If defined, a C expression whose value is a character constant
+containing the flag used to mark a machine-dependent section.  This
+corresponds to the @code{SECTION_MACH_DEP} section flag.
+@end defmac
+
 @defmac CRT_CALL_STATIC_FUNCTION (@var{section_op}, @var{function})
 If defined, an ASM statement that switches 

Re: PING: [PATCH] PR driver/70192: Properly set flag_pie and flag_pic

2016-03-19 Thread Bernd Schmidt

On 03/17/2016 02:59 PM, H.J. Lu wrote:

On Fri, Mar 11, 2016 at 9:09 AM, H.J. Lu  wrote:

We can't set flag_pie to the default when flag_pic == 0, which may be
set by -fno-pic or -fno-PIC, since the default value of flag_pie is
non-zero when GCC is configured with --enable-default-pie.  We need
to initialize flag_pic to -1 so that we can tell if -fpic, -fPIC,
-fno-pic or -fno-PIC is used.
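
A minimal sketch of the tri-state idea described above, assuming a
simplified option record.  struct demo_opts and demo_finish_options are
hypothetical names for this example and are not the actual opts.c code;
default_pie stands in for the value configured by --enable-default-pie.

```c
/* Hypothetical option record; -1 means "not given on the command line".  */
struct demo_opts
{
  int flag_pic;	/* -1 unset, 0 = -fno-pic/-fno-PIC, 1 = -fpic, 2 = -fPIC  */
  int flag_pie;	/* -1 unset, 0 = -fno-pie/-fno-PIE, 1 = -fpie, 2 = -fPIE  */
};

static void
demo_finish_options (struct demo_opts *opts, int default_pie)
{
  /* Fall back to the configured PIE default only when no PIC option
     of either polarity was given.  */
  if (opts->flag_pie == -1)
    opts->flag_pie = (opts->flag_pic == -1) ? default_pie : 0;

  if (opts->flag_pie)
    opts->flag_pic = opts->flag_pie;	/* PIE code is also PIC.  */
  else if (opts->flag_pic == -1)
    opts->flag_pic = 0;			/* Resolve the sentinel.  */
}
```

Because flag_pic starts at -1, an explicit -fno-pic (value 0) can be told
apart from "nothing specified", which is what keeps the default-PIE
setting from overriding it.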



 PR driver/70192
 * opts.c (finish_options): Don't set flag_pie to the default if
 -fpic, -fPIC, -fno-pic or -fno-PIC is used.  Set flag_pic to 0
 if it is -1.


I think this part is ok.


diff --git a/gcc/testsuite/gcc.dg/pie-2.c b/gcc/testsuite/gcc.dg/pie-2.c
new file mode 100644
index 000..e185e51
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pie-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fPIE" } */
+
+#if __PIC__ != 2
+# error __PIC__ is not 2!
+#endif
+
+#if __PIE__ != 2
+# error __PIE__ is not 2!
+#endif


In normal code that should probably use the "__PIC__ - 0" trick to guard 
against cases where the macro isn't defined, but I suppose we'd be 
getting an error in that case as well.
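
As a sketch of the "- 0" idiom mentioned above, using made-up DEMO_*
macros rather than __PIC__ itself: in #if an undefined identifier already
evaluates to 0, so the subtraction mostly helps when a macro is defined
but expands to nothing, where a bare comparison would not parse.

```c
/* Hypothetical macros for illustration; the tests use __PIC__/__PIE__.  */
#define DEMO_EMPTY		/* Defined, but expands to nothing.  */
#define DEMO_TWO 2

/* "#if DEMO_EMPTY != 2" would expand to the invalid "#if != 2";
   with "- 0" it becomes the valid expression "( - 0) != 2".  */
#if (DEMO_EMPTY - 0) != 2
# define DEMO_EMPTY_IS_TWO 0
#else
# define DEMO_EMPTY_IS_TWO 1
#endif

#if (DEMO_TWO - 0) != 2
# define DEMO_TWO_IS_TWO 0
#else
# define DEMO_TWO_IS_TWO 1
#endif

/* An undefined identifier is treated as 0 here, giving "0 - 0".  */
#if (DEMO_UNDEFINED - 0) != 2
# define DEMO_UNDEFINED_IS_TWO 0
#else
# define DEMO_UNDEFINED_IS_TWO 1
#endif
```

In the testcase either form ends in the #error when the macro is missing,
which matches the remark that an error would be reported anyway.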



diff --git a/gcc/testsuite/gcc.dg/pie-3.c b/gcc/testsuite/gcc.dg/pie-3.c
new file mode 100644
index 000..fe46c98
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pie-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-pie" } */
+
+#ifdef __PIC__
+# error __PIC__ is defined!
+#endif
+
+#ifdef __PIE__
+# error __PIE__ is defined!
+#endif
diff --git a/gcc/testsuite/gcc.dg/pie-4.c b/gcc/testsuite/gcc.dg/pie-4.c
new file mode 100644
index 000..977baf0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pie-4.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-fno-PIE" } */
+
+#ifdef __PIC__
+# error __PIC__ is defined!
+#endif
+
+#ifdef __PIE__
+# error __PIE__ is defined!
+#endif

>> diff --git a/gcc/testsuite/gcc.dg/pie-6.c b/gcc/testsuite/gcc.dg/pie-6.c
>> new file mode 100644
>> index 000..85529a8
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/pie-6.c
>> @@ -0,0 +1,10 @@
>> +/* { dg-do compile { target { ! pie_enabled } } } */
>> +/* { dg-options "" } */
>> +
>> +#ifdef __PIC__
>> +# error __PIC__ is defined!
>> +#endif
>> +
>> +#ifdef __PIE__
>> +# error __PIE__ is defined!
>> +#endif

These I'm not so sure about. I could imagine there are targets where pic 
is the default. I'd remove these tests or the test for __PIC__. So, ok 
with that change.



Bernd


Re: PING: [PATCH] PR driver/70192: Properly set flag_pie and flag_pic

2016-03-19 Thread Bernd Schmidt

On 03/17/2016 04:13 PM, H.J. Lu wrote:

On Thu, Mar 17, 2016 at 8:09 AM, Bernd Schmidt  wrote:

On 03/17/2016 04:06 PM, H.J. Lu wrote:


This is the patch I am going to check in.

That still mentions darwin which I imagine might not be an exhaustive test.


We can add an effective target, something like ignore_pic_pie, and
use it instead of *-*-darwin*.


That should have been done _before_ committing the patch in a form that 
was not approved.



Bernd


[C++ PATCH] Diagnose invalid _Jv_AllocObject prototype (PR c++/70267)

2016-03-19 Thread Jakub Jelinek
Hi!

_Jv_AllocObject returns a pointer, and as the testcase below shows,
we easily ICE if a wrong prototype is provided for it instead.
There are already other diagnostics (e.g. when it is missing, or when
it is an overloaded function), so this ensures at least the return type
is sane.

Wonder about all the other spots where the C++ FE relies on user prototypes
for __cxa* etc. functions, perhaps some sanity checking will be needed too
to avoid ICEs on invalid stuff.

Bootstrapped/regtested on x86_64-linux and i686-linux (including java on
both as usual), ok for trunk?

2016-03-17  Jakub Jelinek  

PR c++/70267
* init.c (build_new_1): Complain and return error_mark_node
if alloc_fn is not _Jv_AllocObject function returning pointer.

* g++.dg/ext/java-3.C: New test.

--- gcc/cp/init.c.jj2016-03-05 07:46:50.0 +0100
+++ gcc/cp/init.c   2016-03-17 17:18:21.326917746 +0100
@@ -2872,6 +2872,14 @@ build_new_1 (vec **placemen
  return error_mark_node;
}
   alloc_fn = OVL_CURRENT (alloc_fn);
+  if (TREE_CODE (alloc_fn) != FUNCTION_DECL
+ || TREE_CODE (TREE_TYPE (alloc_fn)) != FUNCTION_TYPE
+ || !POINTER_TYPE_P (TREE_TYPE (TREE_TYPE (alloc_fn))))
+   {
+ if (complain & tf_error)
+   error ("%qD is not a function returning a pointer", alloc_fn);
+ return error_mark_node;
+   }
   class_addr = build1 (ADDR_EXPR, jclass_node, class_decl);
   alloc_call = cp_build_function_call_nary (alloc_fn, complain,
class_addr, NULL_TREE);
--- gcc/testsuite/g++.dg/ext/java-3.C.jj2016-03-17 18:25:26.417381245 
+0100
+++ gcc/testsuite/g++.dg/ext/java-3.C   2016-03-17 18:24:48.0 +0100
@@ -0,0 +1,39 @@
+// PR c++/70267
+// { dg-do compile }
+// { dg-options "-O2" }
+
+extern "Java"
+{
+  typedef __java_int jint;
+  namespace java
+  {
+namespace lang
+{
+  class Class;
+  class Object;
+  class Throwable {};
+  class Foo;
+}
+  }
+} 
+
+typedef struct java::lang::Object * jobject;
+typedef struct java::lang::Throwable * jthrowable;
+typedef class  java::lang::Class * jclass;
+
+using java::lang::Foo;
+
+class Foo: public java::lang::Throwable
+{
+  public:static::java::lang::Class class$;
+};
+
+extern "C" Foo _Jv_AllocObject (jclass);
+extern "C" void _Jv_Throw (jthrowable) __attribute__ ((__noreturn__));
+
+void 
+Bar4 (void)
+{
+  Foo * f = new java::lang::Foo;   // { dg-error "is not a function returning a pointer" }
+  throw (f);
+}

Jakub


Re: [PATCH, aarch64] Fix 70048

2016-03-19 Thread James Greenhalgh
On Wed, Mar 16, 2016 at 02:25:27PM -0700, Richard Henderson wrote:
> PR target/70048
> * config/aarch64/aarch64.c (virt_or_elim_regno_p): New.
> (aarch64_classify_address): Use it.
> (aarch64_legitimize_address): Force all subexpressions of PLUS
> into registers.  Simplify as (sfp+const)+reg or (reg+reg)+const.
> 
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index cf1239d..12e498d 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -3847,6 +3847,18 @@ aarch64_mode_valid_for_sched_fusion_p (machine_mode 
> mode)
>&& GET_MODE_SIZE (mode) == 8);
>  }
>  
> +/* Return true if REGNO is a virtual pointer register, or an eliminable
> +   "soft" frame register.  Like REGNO_PTR_FRAME_P except that we don't
> +   include stack_pointer or hard_frame_pointer.  */

In that case, do we want to write this as:

  return REGNO_PTR_FRAME_P (regno)
 && regno != STACK_POINTER_REGNUM
 && regno != HARD_FRAME_POINTER_REGNUM;

for clarity?

> +static bool
> +virt_or_elim_regno_p (unsigned regno)

Most functions in here get the "aarch64" in their name even if they are
static.

> +{
> +  return ((regno >= FIRST_VIRTUAL_REGISTER
> +&& regno <= LAST_VIRTUAL_POINTER_REGISTER)
> +   || regno == FRAME_POINTER_REGNUM
> +   || regno == ARG_POINTER_REGNUM);
> +}

Otherwise, this looks OK to me. Thanks for the fix.

James



[PATCH]PR other/70268: map one directory name (old) to another (new) in __FILE__

2016-03-19 Thread Hongxu Jia
Similar to -fdebug-prefix-map, add option -ffile-prefix-map to map one
directory name (old) to another (new) in __FILE__, __BASE_FILE__ and
__builtin_FILE().
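
To make the intended behaviour concrete, here is a rough sketch of a
single old=new prefix substitution.  demo_remap_prefix is a hypothetical
helper written for this example; it is not the remap_file_filename
function added by the patch, and it handles only one mapping.

```c
#include <stdio.h>
#include <string.h>

/* If FILENAME starts with OLD_PREFIX, write the name with that prefix
   replaced by NEW_PREFIX into OUT (truncating to OUT_SIZE) and return
   OUT.  Otherwise return FILENAME unchanged.  */
static const char *
demo_remap_prefix (const char *filename, const char *old_prefix,
		   const char *new_prefix, char *out, size_t out_size)
{
  size_t old_len = strlen (old_prefix);

  if (strncmp (filename, old_prefix, old_len) != 0)
    return filename;		/* No match: leave the name alone.  */

  snprintf (out, out_size, "%s%s", new_prefix, filename + old_len);
  return out;
}
```

Under these assumptions, mapping "/build" to "/usr/src" turns
"/build/gcc/foo.c" into "/usr/src/gcc/foo.c", while a path outside the
old prefix is returned as is.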

PR other/70268

* gcc/c-family/c.opt(-ffile-prefix-map=): New option.
* gcc/c-family/c-opts.c: Include file-map.h
(c_common_handle_option): Handle -ffile-prefix-map.
* gcc/gimplify.c: Include file-map.h
(gimplify_call_expr): Call remap_file_filename
* gcc/dwarf2out.c (gen_producer_string): Ignore -ffile-prefix-map.
* libcpp/macro.c: Include file-map.h
(_cpp_builtin_macro_text): Call remap_file_filename
* libcpp/include/file-map.h (remap_file_filename,
add_file_prefix_map): Declare.
* libcpp/file-map.c: Include config.h, system.h, file-map.h.
(struct file_prefix_map, file_prefix_maps, add_file_prefix_map,
remap_file_filename): New.
* libcpp/Makefile.in (file-map.c, file-map.o,
file-map.h): Update dependencies.
* doc/invoke.texi (-ffile-prefix-map): Document.

Signed-off-by: Hongxu Jia 
---
 gcc/ChangeLog |  9 +
 gcc/c-family/c-opts.c |  6 
 gcc/c-family/c.opt|  4 +++
 gcc/doc/invoke.texi   |  6 
 gcc/dwarf2out.c   |  1 +
 gcc/gimplify.c|  2 ++
 libcpp/ChangeLog  | 12 +++
 libcpp/Makefile.in| 10 +++---
 libcpp/file-map.c | 92 +++
 libcpp/include/file-map.h | 30 
 libcpp/macro.c|  2 ++
 11 files changed, 169 insertions(+), 5 deletions(-)
 create mode 100644 libcpp/file-map.c
 create mode 100644 libcpp/include/file-map.h

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 68fcd05..d58f6ee 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2016-03-17  Hongxu Jia  
+   PR **/**
+   * c-family/c-opts.c: Include file-map.h
+   (c_common_handle_option): Handle -ffile-prefix-map.
+   * c-family/c.opt(-ffile-prefix-map=): New option.
+   * gimplify.c: Include file-map.h
+   (gimplify_call_expr): Call remap_file_filename
+   * dwarf2out.c (gen_producer_string): Ignore -ffile-prefix-map.
+
 2016-03-16  Carlos O'Donell  
Sandra Loosemore  
 
diff --git a/gcc/c-family/c-opts.c b/gcc/c-family/c-opts.c
index fec58bc..4dab155 100644
--- a/gcc/c-family/c-opts.c
+++ b/gcc/c-family/c-opts.c
@@ -39,6 +39,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "plugin.h"/* For PLUGIN_INCLUDE_FILE event.  */
 #include "mkdeps.h"
 #include "dumpfile.h"
+#include "file-map.h"
 
 #ifndef DOLLARS_IN_IDENTIFIERS
 # define DOLLARS_IN_IDENTIFIERS true
@@ -503,6 +504,11 @@ c_common_handle_option (size_t scode, const char *arg, int 
value,
   cpp_opts->narrow_charset = arg;
   break;
 
+case OPT_ffile_prefix_map_:
+  if (add_file_prefix_map(arg) < 0)
+error ("invalid argument %qs to -ffile-prefix-map", arg);
+  break;
+
 case OPT_fwide_exec_charset_:
   cpp_opts->wide_charset = arg;
   break;
diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 7c5f6c7..2b88874 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1204,6 +1204,10 @@ fexec-charset=
 C ObjC C++ ObjC++ Joined RejectNegative
 -fexec-charset=  Convert all strings and character constants to 
character set .
 
+ffile-prefix-map=
+C ObjC C++ ObjC++ Joined RejectNegative

[PATCH] PR c/70281: C FE: fix uninitialized range for __builtin_types_compatible_p

2016-03-19 Thread David Malcolm
PR c/70281 reports another case where Valgrind identified an uninitialized
src_range in a c_expr in the C frontend, this time in
the parsing of __builtin_types_compatible_p.

For gcc 7 I hope to fix this more robustly (via poisoning the values in
a c_expr ctor), but for now, this patch fixes the specific issue found.

Successfully bootstrapped on x86_64-pc-linux-gnu; adds 7 PASS results
to gcc.sum.

OK for trunk?

gcc/c/ChangeLog:
PR c/70281
* c-parser.c (c_parser_postfix_expression): Set the source range
for uses of "__builtin_types_compatible_p".

gcc/testsuite/ChangeLog:
PR c/70281
* gcc.dg/plugin/diagnostic-test-expressions-1.c
(test_builtin_types_compatible_p): New test function.
* gcc.dg/pr70281.c: New test case.
---
 gcc/c/c-parser.c   |  6 --
 .../gcc.dg/plugin/diagnostic-test-expressions-1.c  | 18 ++
 gcc/testsuite/gcc.dg/pr70281.c |  9 +
 3 files changed, 31 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr70281.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 60ec996..b80b279 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -7782,9 +7782,10 @@ c_parser_postfix_expression (c_parser *parser)
  expr.value = error_mark_node;
  break;
}
- c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
-"expected %<)%>");
  {
+   location_t close_paren_loc = c_parser_peek_token (parser)->location;
+   c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
+  "expected %<)%>");
tree e1, e2;
e1 = groktypename (t1, NULL, NULL);
e2 = groktypename (t2, NULL, NULL);
@@ -7799,6 +7800,7 @@ c_parser_postfix_expression (c_parser *parser)
 
expr.value
  = comptypes (e1, e2) ? integer_one_node : integer_zero_node;
+   set_c_expr_source_range (&expr, loc, close_paren_loc);
  }
  break;
case RID_BUILTIN_CALL_WITH_STATIC_CHAIN:
diff --git a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c 
b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
index 49ad670..c2d27d7 100644
--- a/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
+++ b/gcc/testsuite/gcc.dg/plugin/diagnostic-test-expressions-1.c
@@ -534,6 +534,24 @@ void test_builtin_choose_expr (int i)
 }
 
 extern int f (int);
+
+void test_builtin_types_compatible_p (unsigned long i)
+{
+  __emit_expression_range (0,
+  f (i) + __builtin_types_compatible_p (long, int)); 
/* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   f (i) + __builtin_types_compatible_p (long, int));
+   ~~^~
+   { dg-end-multiline-output "" } */
+
+  __emit_expression_range (0,
+  __builtin_types_compatible_p (long, int) + f (i)); 
/* { dg-warning "range" } */
+/* { dg-begin-multiline-output "" }
+   __builtin_types_compatible_p (long, int) + f (i));
+   ~^~~
+   { dg-end-multiline-output "" } */
+}
+
 void test_builtin_call_with_static_chain (int i, void *ptr)
 {
   __emit_expression_range (0, __builtin_call_with_static_chain (f (i), ptr));  
/* { dg-warning "range" } */
diff --git a/gcc/testsuite/gcc.dg/pr70281.c b/gcc/testsuite/gcc.dg/pr70281.c
new file mode 100644
index 000..9447fb1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70281.c
@@ -0,0 +1,9 @@
+/* { dg-options "-Wall -fdiagnostics-show-caret" } */
+int bch_stats_show ()
+{
+  return __builtin_types_compatible_p (unsigned, int) ? "" : "";  /* { dg-warning "cast" } */
+/* { dg-begin-multiline-output "" }
+   return __builtin_types_compatible_p (unsigned, int) ? "" : "";
+  ~~^~~~
+   { dg-end-multiline-output "" } */
+}
-- 
1.8.5.3



Re: PING: [PATCH] PR driver/70192: Properly set flag_pie and flag_pic

2016-03-19 Thread H.J. Lu
On Thu, Mar 17, 2016 at 7:18 AM, Bernd Schmidt  wrote:
> On 03/17/2016 02:59 PM, H.J. Lu wrote:
>>
>> On Fri, Mar 11, 2016 at 9:09 AM, H.J. Lu  wrote:
>>>
>>> We can't set flag_pie to the default when flag_pic == 0, which may be
>>> set by -fno-pic or -fno-PIC, since the default value of flag_pie is
>>> non-zero when GCC is configured with --enable-default-pie.  We need
>>> to initialize flag_pic to -1 so that we can tell if -fpic, -fPIC,
>>> -fno-pic or -fno-PIC is used.
>
>
>>>  PR driver/70192
>>>  * opts.c (finish_options): Don't set flag_pie to the default if
>>>  -fpic, -fPIC, -fno-pic or -fno-PIC is used.  Set flag_pic to 0
>>>  if it is -1.
>
>
> I think this part is ok.
>
>>> diff --git a/gcc/testsuite/gcc.dg/pie-2.c b/gcc/testsuite/gcc.dg/pie-2.c
>>> new file mode 100644
>>> index 000..e185e51
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pie-2.c
>>> @@ -0,0 +1,10 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-fPIE" } */
>>> +
>>> +#if __PIC__ != 2
>>> +# error __PIC__ is not 2!
>>> +#endif
>>> +
>>> +#if __PIE__ != 2
>>> +# error __PIE__ is not 2!
>>> +#endif
>
>
> In normal code that should probably use the "__PIC__ - 0" trick to guard
> against cases where the macro isn't defined, but I suppose we'd be getting
> an error in that case as well.
>
>
>>> diff --git a/gcc/testsuite/gcc.dg/pie-3.c b/gcc/testsuite/gcc.dg/pie-3.c
>>> new file mode 100644
>>> index 000..fe46c98
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pie-3.c
>>> @@ -0,0 +1,10 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-fno-pie" } */
>>> +
>>> +#ifdef __PIC__
>>> +# error __PIC__ is defined!
>>> +#endif
>>> +
>>> +#ifdef __PIE__
>>> +# error __PIE__ is defined!
>>> +#endif
>>> diff --git a/gcc/testsuite/gcc.dg/pie-4.c b/gcc/testsuite/gcc.dg/pie-4.c
>>> new file mode 100644
>>> index 000..977baf0
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pie-4.c
>>> @@ -0,0 +1,10 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-options "-fno-PIE" } */
>>> +
>>> +#ifdef __PIC__
>>> +# error __PIC__ is defined!
>>> +#endif
>>> +
>>> +#ifdef __PIE__
>>> +# error __PIE__ is defined!
>>> +#endif
>
>>> diff --git a/gcc/testsuite/gcc.dg/pie-6.c b/gcc/testsuite/gcc.dg/pie-6.c
>>> new file mode 100644
>>> index 000..85529a8
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/pie-6.c
>>> @@ -0,0 +1,10 @@
>>> +/* { dg-do compile { target { ! pie_enabled } } } */
>>> +/* { dg-options "" } */
>>> +
>>> +#ifdef __PIC__
>>> +# error __PIC__ is defined!
>>> +#endif
>>> +
>>> +#ifdef __PIE__
>>> +# error __PIE__ is defined!
>>> +#endif
>
> These I'm not so sure about. I could imagine there are targets where pic is
> the default. I'd remove these tests or the test for __PIC__. So, ok with
> that change.

Darwin is such a target.  Here is a follow-up patch I was planning to
submit.  But I will remove __PIC__ instead.

-- 
H.J.
From ff1ef6e4e969b244984d1ae4c93960e36edd1334 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Fri, 11 Mar 2016 09:24:41 -0800
Subject: [PATCH] Skip some PIC/PIE tests for *-*-darwin* targets

Since Darwin defaults to PIC, not PIE, skip tests of default __PIC__
and __PIE__ setting for *-*-darwin* targets.

	* gcc.dg/pic-1.c: Skip for *-*-darwin* targets.
	* gcc.dg/pic-3.c: Likewise.
	* gcc.dg/pic-4.c: Likewise.
	* gcc.dg/pie-1.c: Likewise.
	* gcc.dg/pie-3.c: Likewise.
	* gcc.dg/pie-4.c: Likewise.
	* gcc.dg/pie-6.c: Likewise.
---
 gcc/testsuite/gcc.dg/pic-1.c | 2 +-
 gcc/testsuite/gcc.dg/pic-3.c | 2 +-
 gcc/testsuite/gcc.dg/pic-4.c | 2 +-
 gcc/testsuite/gcc.dg/pie-1.c | 2 +-
 gcc/testsuite/gcc.dg/pie-3.c | 2 +-
 gcc/testsuite/gcc.dg/pie-4.c | 2 +-
 gcc/testsuite/gcc.dg/pie-6.c | 2 +-
 7 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/pic-1.c b/gcc/testsuite/gcc.dg/pic-1.c
index 7eb0765..86360aa 100644
--- a/gcc/testsuite/gcc.dg/pic-1.c
+++ b/gcc/testsuite/gcc.dg/pic-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! *-*-darwin* } } } */
 /* { dg-options "-fpic" } */
 
 #if __PIC__ != 1
diff --git a/gcc/testsuite/gcc.dg/pic-3.c b/gcc/testsuite/gcc.dg/pic-3.c
index d7d861b..7c4bbce 100644
--- a/gcc/testsuite/gcc.dg/pic-3.c
+++ b/gcc/testsuite/gcc.dg/pic-3.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! *-*-darwin* } } } */
 /* { dg-options "-fno-pic" } */
 
 #ifdef __PIC__
diff --git a/gcc/testsuite/gcc.dg/pic-4.c b/gcc/testsuite/gcc.dg/pic-4.c
index 732f61f..727fe14 100644
--- a/gcc/testsuite/gcc.dg/pic-4.c
+++ b/gcc/testsuite/gcc.dg/pic-4.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { ! *-*-darwin* } } } */
 /* { dg-options "-fno-PIC" } */
 
 #ifdef __PIC__
diff --git a/gcc/testsuite/gcc.dg/pie-1.c b/gcc/testsuite/gcc.dg/pie-1.c
index ff6281f..ca43e8b 100644
--- a/gcc/testsuite/gcc.dg/pie-1.c
+++ b/gcc/testsuite/gcc.dg/pie-1.c
@@ -1,4 +1,4 @@
-/* { dg-do compile } */
+/* { dg-do compile { target { 

C++ PATCH for c++/70259 (-flifetime-dse vs. empty bases)

2016-03-19 Thread Jason Merrill
The constructor for an empty class can't do the -flifetime-dse clobber 
because when the class is used as a base it might be assigned the same 
offset as a real base, so the clobber would mess with real data.


Tested x86_64-pc-linux-gnu, applying to trunk.

commit e1a5f038350d1881153d8f65359bd883f7452237
Author: Jason Merrill 
Date:   Wed Mar 16 13:46:32 2016 -0400

	PR c++/70259
	* decl.c (start_preparsed_function): Don't clobber an empty base.

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 4ee4ccc..e783163 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14125,6 +14125,8 @@ start_preparsed_function (tree decl1, tree attrs, int flags)
   && (flag_lifetime_dse > 1)
   && DECL_CONSTRUCTOR_P (decl1)
   && !DECL_CLONED_FUNCTION_P (decl1)
+  /* Clobbering an empty base is harmful if it overlays real data.  */
+  && !is_empty_class (current_class_type)
   /* We can't clobber safely for an implicitly-defined default constructor
 	 because part of the initialization might happen before we enter the
 	 constructor, via AGGR_INIT_ZERO_FIRST (c++/68006).  */
diff --git a/gcc/testsuite/g++.dg/opt/flifetime-dse5.C b/gcc/testsuite/g++.dg/opt/flifetime-dse5.C
new file mode 100644
index 000..2c49021
--- /dev/null
+++ b/gcc/testsuite/g++.dg/opt/flifetime-dse5.C
@@ -0,0 +1,13 @@
+// PR c++/70259
+// { dg-options -O2 }
+// { dg-do run }
+
+struct Empty { };
+struct A { A() : a(true) { } bool a; };
+struct B : Empty { B() : Empty() { } };
+struct C : A, B { C() : A(), B() { } };
+int main() {
+  C c;
+  if ( c.a == false )
+__builtin_abort();
+};


New French PO file for 'gcc' (version 6.1-b20160131)

2016-03-19 Thread Translation Project Robot
Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the French team of translators.  The file is available at:

http://translationproject.org/latest/gcc/fr.po

(This file, 'gcc-6.1-b20160131.fr.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

http://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

http://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.




Re: [PATCH, PR tree-optimization/70252] Fix boolean vectors conversion

2016-03-19 Thread Richard Biener
On Thu, Mar 17, 2016 at 1:02 PM, Ilya Enkovich  wrote:
> Hi,
>
> Current widening and narrowing vectorization functions may work
> incorrectly for scalar masks because we may have different boolean
> vector types having the same mode.  E.g. vec(4) and vec(8)
> both have QImode.  That means if we need to convert vec(4) into
> vec(16) we may actually find QImode->HImode conversion optab
> and try to use it which is incorrect because this optab entry is
> used for vec(8) to vec(16) conversion.
>
> I suppose the best fix for GCC 6 is to just catch and disable such
> conversion by checking number of vector elements.  It doesn't disable
> any vectorization because we don't have any conversion patterns
> for vec(4) anyway.
>
> It's not clear what to do for GCC 7 though to enable such conversions.
> It looks like for AVX-512 we have three boolean vectors sharing the
> same QImode: vec(2), vec(4) and vec(8).  It means
> we can't use optabs to check operations on these vectors even
> using conversion optabs instead of direct ones.  Can we use half/quarter
> byte modes for such masks or something like that?  Another option is to
> handle their conversion separately with no optabs usage at all (target 
> hooks?).
>
> The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
> OK for trunk?

Ok.

Richard.

> Thanks,
> Ilya
> --
> gcc/
>
> 2016-03-17  Ilya Enkovich  
>
> PR tree-optimization/70252
> * tree-vect-stmts.c (supportable_widening_operation): Check resulting
> boolean vector has a proper number of elements.
> (supportable_narrowing_operation): Likewise.
>
> gcc/testsuite/
>
> 2016-03-17  Ilya Enkovich  
>
> PR tree-optimization/70252
> * gcc.dg/pr70252.c: New test.
>
>
> diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c
> new file mode 100644
> index 000..209e691
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/pr70252.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3" } */
> +/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* 
> x86_64-*-* } } } */
> +
> +extern unsigned char a [150];
> +extern unsigned char b [150];
> +extern unsigned char c [150];
> +extern unsigned char d [150];
> +extern unsigned char e [150];
> +
> +void foo () {
> +  for (int i = 92; i <= 141; i += 2) {
> +int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
> +e [i] = tmp >> b [i];
> +  }
> +}
> diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
> index 06b1ab7..d12c062 100644
> --- a/gcc/tree-vect-stmts.c
> +++ b/gcc/tree-vect-stmts.c
> @@ -8940,7 +8940,12 @@ supportable_widening_operation (enum tree_code code, 
> gimple *stmt,
>
>if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>&& insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -return true;
> +  /* For scalar masks we may have different boolean
> +vector types having the same QImode.  Thus we
> +add additional check for elements number.  */
> +return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +   || (TYPE_VECTOR_SUBPARTS (vectype) / 2
> +   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>/* Check if it's a multi-step conversion that can be done using 
> intermediate
>   types.  */
> @@ -8991,7 +8996,9 @@ supportable_widening_operation (enum tree_code code, 
> gimple *stmt,
>
>if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
>   && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
> -   return true;
> +   return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +   || (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2
> +   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
>
>prev_type = intermediate_type;
>prev_mode = intermediate_mode;
> @@ -9075,7 +9082,12 @@ supportable_narrowing_operation (enum tree_code code,
>*code1 = c1;
>
>if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -return true;
> +/* For scalar masks we may have different boolean
> +   vector types having the same QImode.  Thus we
> +   add additional check for elements number.  */
> +return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +   || (TYPE_VECTOR_SUBPARTS (vectype) * 2
> +   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>/* Check if it's a multi-step conversion that can be done using 
> intermediate
>   types.  */
> @@ -9140,7 +9152,9 @@ supportable_narrowing_operation (enum tree_code code,
>(*multi_step_cvt)++;
>
>if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
> -   return true;
> +   return (!VECTOR_BOOLEAN_TYPE_P (vectype)
> +   || (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2
> +   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
>
>prev_mode = intermediate_mode;
>prev_type = 

Re: [AArch64] Disable pcrelative_literal_loads with fix-cortex-a53-843419

2016-03-19 Thread Richard Earnshaw (lists)
On 14/03/16 15:34, Christophe Lyon wrote:
> On 10 March 2016 at 14:24, James Greenhalgh  wrote:
>> On Thu, Mar 10, 2016 at 01:37:50PM +0100, Christophe Lyon wrote:
>>> On 10 March 2016 at 12:43, James Greenhalgh  
>>> wrote:
 On Tue, Jan 26, 2016 at 03:43:36PM +0100, Christophe Lyon wrote:
> With the attachment
>
>
> On 26 January 2016 at 15:42, Christophe Lyon  
> wrote:
>> Hi,
>>
>> This is a followup to PR63304.
>>
>> As discussed in bugzilla, this patch disables pcrelative_literal_loads
>> when -mfix-cortex-a53-843419 (or its default configure option) is
>> used.
>>
>> I copied the behavior of -mfix-cortex-a53-835769 (e.g. in
>> aarch64_can_inline_p), and I have tested by building the Linux kernel
>> using -mfix-cortex-a53-843419 and checked that
>> R_AARCH64_ADR_PREL_PG_HI21 relocations are not emitted anymore (under
>> CONFIG_ARM64_ERRATUM_843419).
>>
>> For reference, this is motivated by:
>> https://bugs.linaro.org/show_bug.cgi?id=1994
>> and further details on Launchpad:
>> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1533009
>>
>> OK for trunk?

 Thanks, this looks like a clear regression from GCC 5 (we can no longer
 build the kernel, so this workaround is fine to go in now). Please remember
 to add the link to the relevant PR in the ChangeLog.

 I'd also really appreciate a nice big comment over this code:

> +  /* If it is not set on the command line, we default to no pc
> + relative literal loads, unless the workaround for Cortex-A53
> + erratum 843419 is in effect.  */
> +  if (opts->x_nopcrelative_literal_loads == 2
> +  && !TARGET_FIX_ERR_A53_843419)

 Explaining why this is important (i.e. some summary of the discussion
 in PR63304 regarding the kernel module loader).

 Can you repost with that comment added? I don't have any other objections
 to the patch.

>>>
>>> OK, here is an updated version.
>>
>> Thanks.
>>
>> This is OK for trunk.
>>
> 
> When GCC is configured to enable the A53 erratum 843419 workaround by default,
> this patch caused gcc.target/aarch64/pr63304_1.c to fail.
> 
> The attached patch fixes the problem by forcing the use of
> -mno-fix-cortex-a53-843419.
> 
> OK, or do we prefer not to bother?
> 
> Thanks,
> 
> Christophe
> 
> 
>> James
>>
>>
>> pr70113.log.txt
>>
>>
>> 2016-03-14  Christophe Lyon  
>>
>>  * gcc.target/aarch64/pr63304_1.c: Add -mno-fix-cortex-a53-843419.
>>

OK.

R.



Re: PING: [PATCH] PR driver/70192: Properly set flag_pie and flag_pic

2016-03-19 Thread Bernd Schmidt

On 03/17/2016 04:26 PM, H.J. Lu wrote:

On Thu, Mar 17, 2016 at 8:23 AM, Bernd Schmidt  wrote:

On 03/17/2016 04:13 PM, H.J. Lu wrote:

We can add an effective target, something like ignore_pic_pie, and
use it instead of *-*-darwin*.



That should have been done _before_ committing the patch in a form that was
not approved.



How should we move forward?


Maybe an effective target pic_default, which tests whether __PIC__ is 
defined without any options. Please prepare a patch.
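
For illustration, the kind of probe such an effective target could compile, assuming the usual GCC convention that __PIC__ is predefined only when position-independent code is being generated. The helper name pic_level is hypothetical, and the dejagnu proc that would wrap it is not shown.

```c
/* Report the PIC level the compiler uses when no options are given.
   Hypothetical helper, not part of the testsuite.  */
static int
pic_level (void)
{
#ifdef __PIC__
  return __PIC__;   /* Typically 1 for -fpic/-fpie, 2 for -fPIC/-fPIE.  */
#else
  return 0;         /* Position-independent code is not the default.  */
#endif
}
```

A non-zero result with an empty option list would mean the target defaults to PIC.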



Bernd



Re: [PATCH, match] Fix pr68714

2016-03-19 Thread Jakub Jelinek
On Tue, Mar 15, 2016 at 08:09:54AM -0700, Richard Henderson wrote:
> Ah, sure.  I should have simply tested the reassoc1 dump file, before
> generic vector lowering.

The testcase fails on i386 (and I assume fails on powerpc too), due to the
psABI warnings/notes.

I've committed this as obvious.

2016-03-16  Jakub Jelinek  

PR tree-optimization/68714
* gcc.dg/tree-ssa/pr68714.c: Add -w -Wno-psabi to dg-options.

--- gcc/testsuite/gcc.dg/tree-ssa/pr68714.c.jj  2016-03-15 17:10:18.627539190 
+0100
+++ gcc/testsuite/gcc.dg/tree-ssa/pr68714.c 2016-03-16 14:20:34.160133852 
+0100
@@ -1,5 +1,6 @@
+/* PR tree-optimization/68714 */
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-reassoc1" } */
+/* { dg-options "-O2 -fdump-tree-reassoc1 -w -Wno-psabi" } */
 
 typedef int vec __attribute__((vector_size(16)));
 vec f(vec x,vec y){


Jakub


Re: [PATCH PR69489/01]Improve tree ifcvt by storing/tracking DR against its innermost loop behavior if possible

2016-03-19 Thread Richard Biener
On Wed, Mar 16, 2016 at 5:17 PM, Bin.Cheng  wrote:
> On Wed, Mar 16, 2016 at 12:20 PM, Richard Biener
>  wrote:
>>
>> On Wed, Mar 16, 2016 at 10:59 AM, Bin Cheng  wrote:
>> > Hi,
>> > ..
>> > Bootstrap and test on x86_64 and AArch64.  Is it OK, not sure if it's GCC 
>> > 7?
>>
>> Hmm.
> Hi,
> Thanks for reviewing.
>>
>> +  equal_p = true;
>> +  if (e1->base_address && e2->base_address)
>> +equal_p &= operand_equal_p (e1->base_address, e2->base_address, 0);
>> +  if (e1->offset && e2->offset)
>> +equal_p &= operand_equal_p (e1->offset, e2->offset, 0);
>>
>> surely better to return false early.
>>
>> I think we don't want this in tree-data-refs.h also because of ...
>>
>> @@ -615,15 +619,29 @@
>> hash_memrefs_baserefs_and_store_DRs_read_written_info
>> (data_reference_p a)
>>data_reference_p *master_dr, *base_master_dr;
>>tree ref = DR_REF (a);
>>tree base_ref = DR_BASE_OBJECT (a);
>> +  innermost_loop_behavior *innermost = _INNERMOST (a);
>>tree ca = bb_predicate (gimple_bb (DR_STMT (a)));
>>bool exist1, exist2;
>>
>> -  while (TREE_CODE (ref) == COMPONENT_REF
>> -|| TREE_CODE (ref) == IMAGPART_EXPR
>> -|| TREE_CODE (ref) == REALPART_EXPR)
>> -ref = TREE_OPERAND (ref, 0);
>> +  /* If reference in DR has innermost loop behavior and it is not
>> + a compound memory reference, we store it to innermost_DR_map,
>> + otherwise to ref_DR_map.  */
>> +  if (TREE_CODE (ref) == COMPONENT_REF
>> +  || TREE_CODE (ref) == IMAGPART_EXPR
>> +  || TREE_CODE (ref) == REALPART_EXPR
>> +  || !(DR_BASE_ADDRESS (a) || DR_OFFSET (a)
>> +  || DR_INIT (a) || DR_STEP (a) || DR_ALIGNED_TO (a)))
>> +{
>> +  while (TREE_CODE (ref) == COMPONENT_REF
>> +|| TREE_CODE (ref) == IMAGPART_EXPR
>> +|| TREE_CODE (ref) == REALPART_EXPR)
>> +   ref = TREE_OPERAND (ref, 0);
>> +
>> +  master_dr = _DR_map->get_or_insert (ref, );
>> +}
>> +  else
>> +master_dr = _DR_map->get_or_insert (innermost, );
>>
>> we don't want an extra hashmap but replace ref_DR_map entirely.  So we'd 
>> need to
>> strip outermost non-variant handled-components (COMPONENT_REF, IMAGPART
>> and REALPART) before creating the DR (or adjust the equality function
>> and hashing
>> to disregard them which means subtracting their offset from DR_INIT.
> I am not sure if I understand correctly.  But for component reference,
> it is the base object that we want to record/track.  For example,
>
>   for (i = 0; i < N; i++) {
> m = *data++;
>
> m1 = p1->x - m;
> m2 = p2->x + m;
>
> p3->y = (m1 >= m2) ? p1->y : p2->y;
>
> p1++;
> p2++;
> p3++;
>   }
> We want to infer that reads of p1/p2 in condition statement won't trap
> because there are unconditional reads of the structures, though the
> unconditional reads are actually of other sub-objects.  Here it is the
> invariant part of address that we want to track.

Well, the variant parts - we want to strip invariant parts as far as we can
(offsetof (x) and offsetof (y))
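
To illustrate in plain C (a sketch only; struct pt and the helper names are hypothetical and do not come from the patch): the addresses of p->x and p->y differ from p only by constant offsetof values, so once those are subtracted both references share the same variant part.

```c
#include <stddef.h>
#include <stdint.h>

struct pt { int x; int y; };   /* Hypothetical layout.  */

/* Address of field x minus its constant offset: the part that varies
   with the loop.  */
static uintptr_t
strip_x (const struct pt *p)
{
  return (uintptr_t) &p->x - offsetof (struct pt, x);
}

/* Same for field y; the result matches strip_x for the same P.  */
static uintptr_t
strip_y (const struct pt *p)
{
  return (uintptr_t) &p->y - offsetof (struct pt, y);
}
```

With the invariant offsets removed, a read of p1->x and a read of p1->y in the same iteration map to the same key.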

> Also illustrated by this example, we can't rely on data-ref analyzer
> here.  Because in gathering/scattering cases, the address could be not
> affine at all.

Sure, but that's a different issue.

>>
>> To adjust the references we collect, you could maybe use a callback
>> to get_references_in_stmt.
>>
>> OTOH post-processing the DRs in if_convertible_loop_p_1 can be as simple as
> Is this a part of the method you suggested above, or is it an
> alternative one?  If it's the latter, then I have below questions
> embedded.

It is an alternative to adding a hook to get_references_in_stmt and
probably "easier".

>>
>> Index: tree-if-conv.c
>> ===
>> --- tree-if-conv.c  (revision 234215)
>> +++ tree-if-conv.c  (working copy)
>> @@ -1235,6 +1220,38 @@ if_convertible_loop_p_1 (struct loop *lo
>>
>>for (i = 0; refs->iterate (i, ); i++)
>>  {
>> +  tree *refp = _REF (dr);
>> +  while ((TREE_CODE (*refp) == COMPONENT_REF
>> + && TREE_OPERAND (*refp, 2) == NULL_TREE)
>> +|| TREE_CODE (*refp) == IMAGPART_EXPR
>> +|| TREE_CODE (*refp) == REALPART_EXPR)
>> +   refp = _OPERAND (*refp, 0);
>> +  if (refp != _REF (dr))
>> +   {
>> + tree saved_base = *refp;
>> + *refp = integer_zero_node;
>> +
>> + if (DR_INIT (dr))
>> +   {
>> + tree poffset;
>> + int punsignedp, preversep, pvolatilep;
>> + machine_mode pmode;
>> + HOST_WIDE_INT pbitsize, pbitpos;
>> + get_inner_reference (DR_REF (dr), , , 
>> ,
>> +  , , , 
>> ,
>> +  false);
>> + gcc_assert (poffset == 

Re: [PATCH, 16/16] Add libgomp.oacc-fortran/kernels-*.f95

2016-03-19 Thread Thomas Schwinge
Hi!

On Wed, 9 Mar 2016 10:19:09 +0100, Tom de Vries  wrote:
> On 09/11/15 21:12, Tom de Vries wrote:
> > This patch adds Fortran oacc kernels execution tests.
> 
> Retested on current trunk.
> 
> Committed, minus the kernels-parallel-loop-data-enter-exit.f95 test.

As obvious, committed in r234257:

commit baeaf028bfed958e14abc8b9f3ca10949bacaf97
Author: tschwinge 
Date:   Wed Mar 16 13:10:20 2016 +

Nowadays, we use plain -fopenacc to enable OpenACC kernels processing

libgomp/
* testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Adjust to
-ftree-parallelize-loops/-fopenacc changes.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95:
Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: Likewise.
* testsuite/libgomp.oacc-fortran/kernels-loop.f95: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@234257 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 15 +++
 libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-2.f95 |  1 -
 .../libgomp.oacc-fortran/kernels-loop-data-2.f95  |  1 -
 .../kernels-loop-data-enter-exit-2.f95|  1 -
 .../libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95 |  1 -
 .../libgomp.oacc-fortran/kernels-loop-data-update.f95 |  1 -
 .../testsuite/libgomp.oacc-fortran/kernels-loop-data.f95  |  1 -
 libgomp/testsuite/libgomp.oacc-fortran/kernels-loop.f95   |  1 -
 8 files changed, 15 insertions(+), 7 deletions(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index 5a91504..fca65e6 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,18 @@
+2016-03-16  Thomas Schwinge  
+
+   * testsuite/libgomp.oacc-fortran/kernels-loop-2.f95: Adjust to
+   -ftree-parallelize-loops/-fopenacc changes.
+   * testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95:
+   Likewise.
+   * testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95:
+   Likewise.
+   * testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95:
+   Likewise.
+   * testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95:
+   Likewise.
+   * testsuite/libgomp.oacc-fortran/kernels-loop-data.f95: Likewise.
+   * testsuite/libgomp.oacc-fortran/kernels-loop.f95: Likewise.
+
 2016-03-13  Thomas Schwinge  
 
* testsuite/lib/libgomp.exp (libgomp_init): Potentially append to
diff --git libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-2.f95 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-2.f95
index 1fb40ee..163e8d5 100644
--- libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-2.f95
+++ libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-2.f95
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-ftree-parallelize-loops=32" }
 
 program main
   implicit none
diff --git libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95
index 7b52253..4c73606 100644
--- libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95
+++ libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-2.f95
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-ftree-parallelize-loops=32" }
 
 program main
   implicit none
diff --git 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95
index af98efa..da11aaf 100644
--- libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95
+++ libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit-2.f95
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-ftree-parallelize-loops=32" }
 
 program main
   implicit none
diff --git 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95
index bb6f8dc..f4b4eb3 100644
--- libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95
+++ libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-enter-exit.f95
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-ftree-parallelize-loops=32" }
 
 program main
   implicit none
diff --git libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95 
libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95
index cab1f2c..d2083e2 100644
--- libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95
+++ libgomp/testsuite/libgomp.oacc-fortran/kernels-loop-data-update.f95
@@ -1,5 +1,4 @@
 ! { dg-do run }
-! { dg-options "-ftree-parallelize-loops=32" }
 
 

Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread Jason Merrill

On 03/16/2016 03:39 PM, H.J. Lu wrote:

On Wed, Mar 16, 2016 at 10:02 AM, H.J. Lu  wrote:

On Wed, Mar 16, 2016 at 9:58 AM, Jason Merrill  wrote:

On 03/16/2016 08:38 AM, H.J. Lu wrote:


FAIL: g++.dg/abi/pr60336-1.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-5.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-6.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-7.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr60336-9.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx9true_type
FAIL: g++.dg/abi/pr68355.C   scan-assembler jmp[\t
]+[^$]*?_Z3xxx17integral_constantIbLb1EE



These pass for me on x86_64, but I do see calls with -m32.


They are expected since get_ref_base_and_extent needs to be
changed to set bitsize to 0 for empty types, so that
ref_maybe_used_by_call_p_1 gets 0 as the maximum size of an empty
type when it calls get_ref_base_and_extent.  Otherwise, find_tail_calls
won't perform tail call optimization for functions with empty type
parameters.



That isn't why the optimization isn't happening in pr68355 with -m32; the
.optimized dump has

   xxx (D.2289); [tail call]

Rather, the failure seems to happen in load_register_parameter, at


   /* Check for overlap with already clobbered argument area,
  providing that this has non-zero size.  */
   if (is_sibcall
   && (size == 0
   || mem_overlaps_already_clobbered_arg_p
(XEXP (args[i].value, 0),
size)))
 *sibcall_failure = 1;



The code seems to contradict the comment, and seems to have been broken by
r162402.  Applying this additional patch fixes those tests.



I am running the full test now.


On x86-64, I got

export/gnu/import/git/sources/gcc/gcc/testsuite/gcc.dg/ubsan/object-size-9.c:11:13:
runtime error: load of address 0x00600ffa with insufficient space
for an object of type 'char'
0x00600ffa: note: pointer points here

PASS: gcc.dg/ubsan/object-size-9.c   -O2  execution test
FAIL: gcc.dg/ubsan/object-size-9.c   -O2  output pattern test
Output was:


That looks like a dejagnu glitch; the output you quote seems to match 
the expected output from the test.


Jason



[PATCH] PR lto/70258: [6 Regression] flag_pic is cleared for PIE in lto_post_options

2016-03-19 Thread H.J. Lu
Since PIE implies PIC, we should set flag_pic to flag_pie for PIE in
LTO.

Tested on x86-64.  OK for trunk?

H.J.
---
PR lto/70258
* lto-lang.c (lto_post_options): Set flag_pic to flag_pie for
PIE.
---
 gcc/lto/lto-lang.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/lto/lto-lang.c b/gcc/lto/lto-lang.c
index 691e9e2..b5efe3a 100644
--- a/gcc/lto/lto-lang.c
+++ b/gcc/lto/lto-lang.c
@@ -836,7 +836,7 @@ lto_post_options (const char **pfilename ATTRIBUTE_UNUSED)
   /* If -fPIC or -fPIE was used at compile time, be sure that
  flag_pie is 2.  */
   flag_pie = MAX (flag_pie, flag_pic);
-  flag_pic = 0;
+  flag_pic = flag_pie;
   break;
 
 case LTO_LINKER_OUTPUT_EXEC: /* Normal executable */
-- 
2.5.0
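
The effect of the one-line change can be sketched in isolation. This is a simplified model, not the lto-lang.c code: struct pic_flags and normalize_pie are hypothetical names, and the values assume the usual convention of 1 for -fpic/-fpie and 2 for -fPIC/-fPIE.

```c
struct pic_flags { int pic; int pie; };   /* Hypothetical model.  */

#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* Model of the PIE case after the patch: PIE implies PIC, so pic
   follows pie instead of being cleared.  */
static struct pic_flags
normalize_pie (struct pic_flags f)
{
  f.pie = MAX (f.pie, f.pic);
  f.pic = f.pie;
  return f;
}
```

Before the patch the second assignment would have been f.pic = 0, which is what the PR title describes.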



Re: [PATCH] Fix PR64764

2016-03-19 Thread H.J. Lu
On Wed, Mar 16, 2016 at 9:12 AM, H.J. Lu  wrote:
> On Mon, Feb 9, 2015 at 2:30 AM, Tom de Vries  wrote:
>> On 09-02-15 09:59, Richard Biener wrote:
>>>
>>> On Thu, 5 Feb 2015, Tom de Vries wrote:
>>>
 On 26-01-15 15:47, Richard Biener wrote:
>
> Index: gcc/testsuite/gcc.dg/uninit-19.c
> ===
> --- gcc/testsuite/gcc.dg/uninit-19.c(revision 0)
> +++ gcc/testsuite/gcc.dg/uninit-19.c(working copy)
> @@ -0,0 +1,23 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O -Wuninitialized" } */
> +
> +int a, l, m;
> +float *b;
> +float c, d, e, g, h;
> +unsigned char i, k;
> +void
> +fn1 (int p1, float *f1, float *f2, float *f3, unsigned char *c1, float
> *f4,
> + unsigned char *c2, float *p10)
> +{
> +  if (p1 & 8)
> +b[3] = p10[a];  /* { dg-warning "may be used uninitialized" } */
> +}
> +
> +void
> +fn2 ()
> +{
> +  float *n;
> +  if (l & 6)
> +n =  + m;
> +  fn1 (l, , , , , , , n);
> +}


 Hi Richard,

 this new test fails with -fpic, because fn1 is not inlined.

 Adding static to fn1 allows it to pass both with and without -fpic. But
 that
 change might affect whether it still serves as a regression test for this
 PR,
 I'm not sure.

 Another way to fix this could be to use the warning line number 22
 instead 13
 for fpic.
>>>
>>>
>>> Either way is fine with me.
>>>
>>
>> Committed using the method of different line number for -fpic.
>>
>> Thanks,
>> - Tom
>>
>> 2015-02-09  Tom de Vries  
>>
>> * gcc.dg/uninit-19.c: Fix warning line for fpic.
>> ---
>>  gcc/testsuite/gcc.dg/uninit-19.c | 7 +--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/testsuite/gcc.dg/uninit-19.c
>> b/gcc/testsuite/gcc.dg/uninit-19.c
>> index 3113cab..fc7acea 100644
>> --- a/gcc/testsuite/gcc.dg/uninit-19.c
>> +++ b/gcc/testsuite/gcc.dg/uninit-19.c
>> @@ -10,7 +10,7 @@ fn1 (int p1, float *f1, float *f2, float *f3, unsigned
>> char *c1, float *f4,
>>   unsigned char *c2, float *p10)
>>  {
>>if (p1 & 8)
>> -b[3] = p10[a];  /* { dg-warning "may be used uninitialized" } */
>> +b[3] = p10[a];  /* 13.  */
>>  }
>>
>>  void
>> @@ -19,5 +19,8 @@ fn2 ()
>>float *n;
>>if (l & 6)
>>  n =  + m;
>> -  fn1 (l, , , , , , , n);
>> +  fn1 (l, , , , , , , n);  /* 22.  */
>>  }
>> +
>> +/* { dg-warning "may be used uninitialized" "" { target nonpic } 13 } */
>> +/* { dg-warning "may be used uninitialized" "" { target { ! nonpic } } 22 }
>> */
>> --
>> 1.9.1
>>
>
> Any particular reason why this test was changed to DOS format?
>

I ran dos2unix on gcc.dg/uninit-19.c and checked it in.


-- 
H.J.


[gomp4] Merge trunk r234323 (2016-03-18) into gomp-4_0-branch

2016-03-19 Thread Thomas Schwinge
Hi!

Committed to gomp-4_0-branch in r234351:

commit 4514391426e57f78cb3bfd66d09f5065eff66243
Merge: 2d2924a 666094f
Author: tschwinge 
Date:   Sat Mar 19 15:34:52 2016 +

svn merge -r 232931:234323 svn+ssh://gcc.gnu.org/svn/gcc/trunk


git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/branches/gomp-4_0-branch@234351 
138bc75d-0d04-0410-961f-82ee72b054a4


Grüße
 Thomas


Re: Please include ada-hurd.diff upstream (try2)

2016-03-19 Thread Arnaud Charlet
> > The copyright notices are wrong (or at least incomplete).
> 
> Hi, what is wrong then, copyright years and/or the text?

Both. The copyright year should include 2016 and the text should be
copyright FSF, not AdaCore.

Arno


Re: [PATCH, PR70185] Only finalize dot files that have been initialized

2016-03-19 Thread Richard Biener
On Wed, Mar 16, 2016 at 11:57 AM, Tom de Vries  wrote:
> Hi,
>
> Atm, using -fdump-tree-all-graph produces invalid dot files:
> ...
> $ rm *.c.* ; gcc test.c -O2 -S -fdump-tree-all-graph
> $ for f in *.dot; do dot -Tpdf $f -o dot.pdf; done
> Warning: test.c.006t.omplower.dot: syntax error in line 1 near '}'
> Warning: test.c.007t.lower.dot: syntax error in line 1 near '}'
> Warning: test.c.010t.eh.dot: syntax error in line 1 near '}'
> Warning: test.c.292t.statistics.dot: syntax error in line 1 near '}'
> $ cat test.c.006t.omplower.dot
> }
> $
> ...
> These dot files are finalized, but never initialized or used.
>
> The 006/007/010 files are not used because '(fn->curr_properties & PROP_cfg)
> == 0' at the corresponding passes.
>
> And the file test.c.292t.statistics.dot is not used, because it doesn't
> belong to a single pass.
>
> The current finalization code doesn't handle these cases:
> ...
>   /* Do whatever is necessary to finish printing the graphs.  */
>   for (i = TDI_end; (dfi = dumps->get_dump_file_info (i)) != NULL; ++i)
> if (dumps->dump_initialized_p (i)
> && (dfi->pflags & TDF_GRAPH) != 0
> && (name = dumps->get_dump_file_name (i)) != NULL)
>   {
> finish_graph_dump_file (name);
> free (name);
>   }
> ...
>
> The patch fixes this by simply testing for pass->graph_dump_initialized
> instead.
>
> [ That fix exposes the lack of initialization of graph_dump_initialized. It
> seems to be initialized for static passes, but for dynamically added passes,
> such as f.i. vzeroupper the value is uninitialized. The patch also fixes
> this. ]
>
> Bootstrapped and reg-tested on x86_64.
>
> OK for stage1?

Seeing this I wonder if it makes more sense to move ->graph_dump_initialized
from pass to dump_file_info?  Also in the above shouldn't it use
dfi->pfilename rather than dumps->get_dump_file_name (i)?
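
A rough sketch of what keeping the flag in the dump file record might look like. This is an assumption about the suggestion, not GCC code: struct dump_info, its fields, and finish_graphs are hypothetical stand-ins for dump_file_info and the finalization loop.

```c
#include <stdio.h>
#include <stdbool.h>

struct dump_info
{
  const char *pfilename;          /* Dump file name, may be NULL.  */
  bool graph_dump_initialized;    /* Set once the header was written.  */
};

/* Write the closing brace only for files whose header was written.
   Returns the number of files finalized.  */
static int
finish_graphs (const struct dump_info *dfi, int n)
{
  int finished = 0;
  for (int i = 0; i < n; i++)
    if (dfi[i].graph_dump_initialized && dfi[i].pfilename != NULL)
      {
        FILE *fp = fopen (dfi[i].pfilename, "a");
        if (fp == NULL)
          continue;
        fputs ("}\n", fp);
        fclose (fp);
        finished++;
      }
  return finished;
}
```

Files that were never initialized are skipped, so no stray "}" ends up in an otherwise empty .dot file.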

Richard.

> Thanks,
> - Tom


Re: RFA: PATCH to load_register_parameters for empty structs and sibcalls

2016-03-19 Thread Bernd Schmidt

On 03/16/2016 07:45 PM, Jason Merrill wrote:

Discussion of empty class parameter passing ABI led me to notice that
r162402 broke sibcalls with arguments of size 0 in some cases.  Before
that commit, the code read


else if ((partial == 0 || args[i].pass_on_stack)
 && size != 0)
{

[...]

  if (is_sibcall
  && mem_overlaps_already_clobbered_arg_p (XEXP (args[i].value,
0), size))
 *sibcall_failure = 1;


and after,



if (is_sibcall
&& (size == 0
|| mem_overlaps_already_clobbered_arg_p
 (XEXP (args[i].value, 0),
size)))



So now we set *sibcall_failure if size==0, whereas before we didn't
enter the outer block.  The comment also contradicts the code.
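
The difference between the two versions comes down to how a zero size is treated. A reduced model follows, in which the overlaps argument stands in for the result of mem_overlaps_already_clobbered_arg_p and only the size part of the outer condition is kept; the function names are hypothetical.

```c
#include <stdbool.h>

/* Before r162402: the block was only entered for size != 0.  */
static bool
sibcall_fails_before (bool is_sibcall, int size, bool overlaps)
{
  return size != 0 && is_sibcall && overlaps;
}

/* After r162402: a zero size alone forces failure.  */
static bool
sibcall_fails_after (bool is_sibcall, int size, bool overlaps)
{
  return is_sibcall && (size == 0 || overlaps);
}
```

The two agree for any non-zero size and differ only for size == 0, which is the empty-struct case.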


The patch looks ok. I was trying to research the earlier change, but I 
can't find a message in the archives. Cc'ing Iain in case he has input.



Bernd



Re: [PATCH] Fix PR64764

2016-03-19 Thread H.J. Lu
On Wed, Mar 16, 2016 at 9:41 AM, H.J. Lu  wrote:
> On Wed, Mar 16, 2016 at 9:35 AM, Tom de Vries  wrote:
>> On 16/03/16 17:15, H.J. Lu wrote:
>>>
>>> On Wed, Mar 16, 2016 at 9:12 AM, H.J. Lu  wrote:
>>
>>
 Any particular reason why this test was changed to DOS format?
>>
>>
>> FWIW, the test was in DOS format from the start.
>>
>>
>
> DOS format was introduced by r220530:
>
> Index: gcc.dg/uninit-19.c
> ===
> --- gcc.dg/uninit-19.c  (revision 220529)
> +++ gcc.dg/uninit-19.c  (revision 220530)
> @@ -10,7 +10,7 @@ fn1 (int p1, float *f1, float *f2, float
>   unsigned char *c2, float *p10)^M
>  {^M
>if (p1 & 8)^M
> -b[3] = p10[a];  /* { dg-warning "may be used uninitialized" } */^M
> +b[3] = p10[a];  /* 13.  */^M
>  }^M
>  ^M
>  void^M
> @@ -19,5 +19,8 @@ fn2 ()
>float *n;^M
>if (l & 6)^M
>  n =  + m;^M
> -  fn1 (l, , , , , , , n);^M
> +  fn1 (l, , , , , , , n);  /* 22.  */^M
>  }^M
> +^M
> +/* { dg-warning "may be used uninitialized" "" { target nonpic } 13 } */^M
> +/* { dg-warning "may be used uninitialized" "" { target { ! nonpic }
> } 22 } */^M
>
> "^M" was added to those changed lines.
>

Never mind.  "^M" was there before.

-- 
H.J.


[PATCH] PR testsuite/70150: Check pie_enabled target in PIC tests

2016-03-19 Thread H.J. Lu
We need to check pie_enabled target in PIC tests to support GCC where
PIE is enabled by default when configured with --enable-default-pie.

OK for master?

H.J.
---
PR testsuite/70150
* gcc.dg/20020312-2.c (dg-additional-options): Set to "-no-pie"
for pie_enabled target.
* gcc.dg/uninit-19.c: Check pie_enabled for PIC.
* gcc.target/i386/pr34256.c: Likewise.
---
 gcc/testsuite/gcc.dg/20020312-2.c   | 1 +
 gcc/testsuite/gcc.dg/uninit-19.c| 4 ++--
 gcc/testsuite/gcc.target/i386/pr34256.c | 4 ++--
 3 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/20020312-2.c 
b/gcc/testsuite/gcc.dg/20020312-2.c
index 5fce50d..5c5cb09 100644
--- a/gcc/testsuite/gcc.dg/20020312-2.c
+++ b/gcc/testsuite/gcc.dg/20020312-2.c
@@ -7,6 +7,7 @@
 
 /* { dg-do run } */
 /* { dg-options "-O -fno-pic" } */
+/* { dg-additional-options "-no-pie" { target pie_enabled } } */
 /* { dg-require-effective-target nonlocal_goto } */
 
 extern void abort (void);
diff --git a/gcc/testsuite/gcc.dg/uninit-19.c b/gcc/testsuite/gcc.dg/uninit-19.c
index d7b9ed0..8c2bbeb 100644
--- a/gcc/testsuite/gcc.dg/uninit-19.c
+++ b/gcc/testsuite/gcc.dg/uninit-19.c
@@ -22,5 +22,5 @@ fn2 ()
   fn1 (l, , , , , , , n);  /* 22.  */
 }
 
-/* { dg-warning "may be used uninitialized" "" { target nonpic } 13 } */
-/* { dg-warning "may be used uninitialized" "" { target { ! nonpic } } 22 } */
+/* { dg-warning "may be used uninitialized" "" { target { nonpic || 
pie_enabled } } 13 } */
+/* { dg-warning "may be used uninitialized" "" { target { { ! nonpic } && { ! 
pie_enabled } } } 22 } */
diff --git a/gcc/testsuite/gcc.target/i386/pr34256.c 
b/gcc/testsuite/gcc.target/i386/pr34256.c
index 992312a..6987457 100644
--- a/gcc/testsuite/gcc.target/i386/pr34256.c
+++ b/gcc/testsuite/gcc.target/i386/pr34256.c
@@ -10,5 +10,5 @@ unsigned long long  foo(__m64 m) {
   return _mm_cvtm64_si64(_mm_add_pi32(x, y));
 }
 
-/* { dg-final { scan-assembler-times "mov" 2 { target nonpic } } } */
-/* { dg-final { scan-assembler-times "mov" 4 { target { ! nonpic } } } } */
+/* { dg-final { scan-assembler-times "mov" 2 { target { nonpic || pie_enabled 
} } } } */
+/* { dg-final { scan-assembler-times "mov" 4 { target { { ! nonpic } && { ! 
pie_enabled } } } } } */
-- 
2.5.0



[GCC][ARM] Skip tests that assume target supports arm mode, when testing M profiles

2016-03-19 Thread Andre Vieira (lists)
Hello,

This patch skips four tests that assume a target supports ARM mode when
testing M-profiles.
Tested it by running the four tests for A-profiles and M-profiles.

Is this ok?

Cheers,
Andre

gcc/testsuite/ChangeLog:
2016-03-17  Andre Vieira  

* gcc/testsuite/gcc.target/arm/attr-align1.c: Skip if M-profile.
* gcc/testsuite/gcc.target/arm/attr-align3.c: Likewise.
* gcc/testsuite/gcc.target/arm/attr_arm.c: Likewise.
* gcc/testsuite/gcc.target/arm/flip-thumb.c: Likewise.
diff --git a/gcc/testsuite/gcc.target/arm/attr-align1.c 
b/gcc/testsuite/gcc.target/arm/attr-align1.c
index 
96d29a9eed5a81306cb90393a2eb4fe7236ae50b..a53f16706860b69fcc60071b818fbc9f89fc33c7
 100644
--- a/gcc/testsuite/gcc.target/arm/attr-align1.c
+++ b/gcc/testsuite/gcc.target/arm/attr-align1.c
@@ -2,6 +2,7 @@
Verify alignment when both attribute optimize and target are used.  */
 /* { dg-do compile } */
 /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
+/* { dg-skip-if "" arm_cortex_m } */
 
 void
 __attribute__ ((target ("arm")))
diff --git a/gcc/testsuite/gcc.target/arm/attr-align3.c 
b/gcc/testsuite/gcc.target/arm/attr-align3.c
index 
edcf64b45e053eca4ae5f0be2de3afd7b674f464..593d7fbc2b999d264cb06f54363c471480117f32
 100644
--- a/gcc/testsuite/gcc.target/arm/attr-align3.c
+++ b/gcc/testsuite/gcc.target/arm/attr-align3.c
@@ -2,6 +2,7 @@
Verify alignment when attribute target is used.  */
 /* { dg-do compile } */
 /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
+/* { dg-skip-if "" arm_cortex_m } */
 /* { dg-options "-Os -mthumb" }  */
 
 /* Check that arm code is always 4 bytes aligned.  */
diff --git a/gcc/testsuite/gcc.target/arm/attr_arm.c 
b/gcc/testsuite/gcc.target/arm/attr_arm.c
index 
f5c70ef690fc68425e0c4a0f458cd73ebde2f0ab..d765d121e2965a440234a1793688bc97aa60d831
 100644
--- a/gcc/testsuite/gcc.target/arm/attr_arm.c
+++ b/gcc/testsuite/gcc.target/arm/attr_arm.c
@@ -1,5 +1,6 @@
 /* Check that attribute target arm is recognized.  */
 /* { dg-do compile } */
+/* { dg-skip-if "" arm_cortex_m } */
 /* { dg-final { scan-assembler "\\.arm" } } */
 /* { dg-final { scan-assembler-not "\\.thumb_func" } } */
 
diff --git a/gcc/testsuite/gcc.target/arm/flip-thumb.c 
b/gcc/testsuite/gcc.target/arm/flip-thumb.c
index 
355d66377558d9007f58056180940122fcf148e0..4bbe546b6325b2cbc9f9b7f7c52c29815c231916
 100644
--- a/gcc/testsuite/gcc.target/arm/flip-thumb.c
+++ b/gcc/testsuite/gcc.target/arm/flip-thumb.c
@@ -2,6 +2,7 @@
 /* { dg-do compile } */
 /* Make sure the current multilib supports thumb.  */
 /* { dg-skip-if "" { ! { arm_thumb1_ok || arm_thumb2_ok } } } */
+/* { dg-skip-if "" arm_cortex_m } */
 /* { dg-options "-O2 -mflip-thumb -mno-restrict-it" } */
 /* { dg-final { scan-assembler "\\.arm" } } */
 /* { dg-final { scan-assembler-times "\\.thumb_func" 1} } */


Re: [patch] libstdc++/69945 Add __gnu_cxx::__freeres hook

2016-03-19 Thread Mark Wielaard
On Thu, 2016-03-03 at 16:34 +0100, Mark Wielaard wrote:
> On Wed, 2016-02-24 at 18:35 +, Jonathan Wakely wrote:
> > This adds a new function to libsupc++ which will free the memory still
> > in use by the pool used for allocating exceptions when malloc fails.
> > 
> > This is similar to glibc's __libc_freeres, which valgrind (and other
> > tools?) use to tell glibc to deallocate everything before exiting.
> > 
> > I initially called it __gnu_cxx::__free_eh_pool() but I figured we
> > might have other memory in use at some later date, and we wouldn't
> > want valgrind to have to start calling a second function, nor make a
> > function called __free_eh_pool() actually free other things.
> 
> I tested this on x86_64-pc-linux-gnu with Ivo's valgrind patch from
> https://bugs.kde.org/show_bug.cgi?id=345307 and it works pretty nicely.
> No more spurious still reachable memory issues with memcheck.
> 
> Is there any possibility to get this backported for 5.4?

If there is anything I can do to help move this patch forward, please
let me know.

Thanks,

Mark


Re: [gomp4.1] map clause parsing improvements

2016-03-19 Thread Thomas Schwinge
Hi!

On Mon, 19 Oct 2015 12:34:08 +0200, Jakub Jelinek  wrote:
> On Mon, Oct 19, 2015 at 12:20:23PM +0200, Thomas Schwinge wrote:
> > > @@ -77,7 +79,21 @@ enum gomp_map_kind

> > > +/* OpenMP 4.1 alias for forced deallocation.  */
> > > +GOMP_MAP_DELETE =GOMP_MAP_FORCE_DEALLOC,
> > 
> > To avoid confusion about two different identifiers naming the same
> > functionality, I'd prefer to avoid such aliases ("GOMP_MAP_DELETE =
> > GOMP_MAP_FORCE_DEALLOC"), and instead just rename GOMP_MAP_FORCE_DEALLOC
> > to GOMP_MAP_DELETE, if that's the name you prefer.
> 
> If you are ok with removing GOMP_MAP_FORCE_DEALLOC and just use
> GOMP_MAP_DELETE, that is ok by me, just post a patch.

That's simple enough; OK to commit?  (I'm also including the related
change, to rename the Fortran OMP_MAP_FORCE_DEALLOC to OMP_MAP_DELETE,
because I think that's what you'd do, once starting the OpenMP 4.5
Fortran front end work.)
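
To make the aliasing concern concrete, a small sketch with made-up enumerator names and values (the real values in gomp-constants.h are not reproduced here). With the alias, two spellings compare equal and either can show up in code; after the rename only one remains.

```c
/* With an alias: two spellings for one value (values are illustrative).  */
enum map_kind_with_alias
{
  A_MAP_FORCE_DEALLOC = 7,
  A_MAP_DELETE = A_MAP_FORCE_DEALLOC
};

/* After the rename: a single spelling.  */
enum map_kind_renamed
{
  R_MAP_DELETE = 7
};

/* Returns 1 if KIND requests deletion, under the renamed scheme.  */
static int
is_delete (enum map_kind_renamed kind)
{
  return kind == R_MAP_DELETE;
}
```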

commit d60e36a2a935a9319602221360b1a6abf282f434
Author: Thomas Schwinge 
Date:   Wed Mar 16 18:10:26 2016 +0100

Rename GOMP_MAP_FORCE_DEALLOC to GOMP_MAP_DELETE

Also rename the Fortran OMP_MAP_FORCE_DEALLOC to OMP_MAP_DELETE.

include/
* gomp-constants.h (enum gomp_map_kind): Rename
GOMP_MAP_FORCE_DEALLOC to GOMP_MAP_DELETE.  Adjust all users.

gcc/fortran/
* gfortran.h (enum gfc_omp_map_op): Rename OMP_MAP_FORCE_DEALLOC
to OMP_MAP_DELETE.  Adjust all users.
---
 gcc/c/c-parser.c   | 2 +-
 gcc/cp/parser.c| 2 +-
 gcc/fortran/gfortran.h | 2 +-
 gcc/fortran/openmp.c   | 2 +-
 gcc/fortran/trans-openmp.c | 6 +++---
 gcc/gimplify.c | 2 +-
 gcc/omp-low.c  | 2 +-
 gcc/tree-pretty-print.c| 2 +-
 include/gomp-constants.h   | 6 ++
 libgomp/oacc-parallel.c| 6 +++---
 10 files changed, 15 insertions(+), 17 deletions(-)

diff --git gcc/c/c-parser.c gcc/c/c-parser.c
index 60ec996..82d6eca 100644
--- gcc/c/c-parser.c
+++ gcc/c/c-parser.c
@@ -10715,7 +10715,7 @@ c_parser_oacc_data_clause (c_parser *parser, pragma_omp_clause c_kind,
   kind = GOMP_MAP_FORCE_ALLOC;
   break;
 case PRAGMA_OACC_CLAUSE_DELETE:
-  kind = GOMP_MAP_FORCE_DEALLOC;
+  kind = GOMP_MAP_DELETE;
   break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
diff --git gcc/cp/parser.c gcc/cp/parser.c
index 62570d4..8ba4ffe 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -30086,7 +30086,7 @@ cp_parser_oacc_data_clause (cp_parser *parser, pragma_omp_clause c_kind,
   kind = GOMP_MAP_FORCE_ALLOC;
   break;
 case PRAGMA_OACC_CLAUSE_DELETE:
-  kind = GOMP_MAP_FORCE_DEALLOC;
+  kind = GOMP_MAP_DELETE;
   break;
 case PRAGMA_OACC_CLAUSE_DEVICE:
   kind = GOMP_MAP_FORCE_TO;
diff --git gcc/fortran/gfortran.h gcc/fortran/gfortran.h
index 33fffd8..a0fb5fd 100644
--- gcc/fortran/gfortran.h
+++ gcc/fortran/gfortran.h
@@ -1112,8 +1112,8 @@ enum gfc_omp_map_op
   OMP_MAP_TO,
   OMP_MAP_FROM,
   OMP_MAP_TOFROM,
+  OMP_MAP_DELETE,
   OMP_MAP_FORCE_ALLOC,
-  OMP_MAP_FORCE_DEALLOC,
   OMP_MAP_FORCE_TO,
   OMP_MAP_FORCE_FROM,
   OMP_MAP_FORCE_TOFROM,
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index 51ab96e..a6c39cd 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -764,7 +764,7 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, uint64_t mask,
   if ((mask & OMP_CLAUSE_DELETE)
  && gfc_match ("delete ( ") == MATCH_YES
  && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],
-  OMP_MAP_FORCE_DEALLOC))
+  OMP_MAP_DELETE))
continue;
   if ((mask & OMP_CLAUSE_PRESENT)
  && gfc_match ("present ( ") == MATCH_YES
diff --git gcc/fortran/trans-openmp.c gcc/fortran/trans-openmp.c
index 5990202..a905ca6 100644
--- gcc/fortran/trans-openmp.c
+++ gcc/fortran/trans-openmp.c
@@ -2119,12 +2119,12 @@ gfc_trans_omp_clauses (stmtblock_t *block, gfc_omp_clauses *clauses,
case OMP_MAP_TOFROM:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_TOFROM);
  break;
+   case OMP_MAP_DELETE:
+ OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_DELETE);
+ break;
case OMP_MAP_FORCE_ALLOC:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_ALLOC);
  break;
-   case OMP_MAP_FORCE_DEALLOC:
- OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_DEALLOC);
- break;
case OMP_MAP_FORCE_TO:
  OMP_CLAUSE_SET_MAP_KIND (node, GOMP_MAP_FORCE_TO);
  break;
diff --git gcc/gimplify.c gcc/gimplify.c
index f3e5c39..3687e7a 100644
--- gcc/gimplify.c
+++ gcc/gimplify.c
@@ -8194,7 +8194,7 @@ gimplify_oacc_declare_1 (tree clause)
   case GOMP_MAP_ALLOC:
   case GOMP_MAP_FORCE_ALLOC:
   case GOMP_MAP_FORCE_TO:
- 

Re: [PATCH] Change replace_rtx if from is a REG (PR target/70245, take 2)

2016-03-19 Thread Bernd Schmidt

On 03/17/2016 12:16 PM, Jakub Jelinek wrote:


Thus, I've reverted the patch (kept the testcase), and after some
discussions on IRC bootstrapped/regtested on x86_64-linux and i686-linux
following version, which right now should change behavior just for the i?86
case and nothing else, so shouldn't break other targets.
I believe at least the epiphany and sh peepholes that use replace_rtx will
want similar treatment, but will leave testing of that to their maintainers.

Ok for trunk?


Ok.


Bernd


[PATCH, PR tree-optimization/70252] Fix boolean vectors conversion

2016-03-19 Thread Ilya Enkovich
Hi,

Current widening and narrowing vectorization functions may work
incorrectly for scalar masks because we may have different boolean
vector types having the same mode.  E.g. vec(4) and vec(8)
both have QImode.  That means if we need to convert vec(4) into
vec(16) we may actually find QImode->HImode conversion optab
and try to use it which is incorrect because this optab entry is
used for vec(8) to vec(16) conversion.

I suppose the best fix for GCC 6 is to just catch and disable such
conversion by checking the number of vector elements.  It doesn't disable
any vectorization because we don't have any conversion patterns
for vec(4) anyway.
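
To illustrate the problem described above, here is a minimal sketch in
plain C.  It is only a toy model, not GCC code: the toy_* names, the
two-entry mode enum and the rule that masks of up to 8 elements share
one mode are all assumptions made for the example.  It shows why a
lookup keyed on the mode alone accepts a 4 -> 16 element conversion,
and how an additional element-count check rejects it.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for machine modes; these are not GCC's real modes.  */
enum toy_mode { TOY_QImode, TOY_HImode };

/* A toy boolean vector type: element count plus the scalar mode
   that holds the mask.  */
struct toy_mask_type
{
  unsigned nelts;
  enum toy_mode mode;
};

/* Assumption for the sketch: masks of up to 8 elements share
   TOY_QImode, masks of 9..16 elements use TOY_HImode.  */
static struct toy_mask_type
toy_make_mask (unsigned nelts)
{
  assert (nelts >= 1 && nelts <= 16);
  struct toy_mask_type t = { nelts, nelts <= 8 ? TOY_QImode : TOY_HImode };
  return t;
}

/* Mode-only check, analogous to finding a QImode->HImode entry.  */
static bool
toy_convert_ok_by_mode (struct toy_mask_type from, struct toy_mask_type to)
{
  return from.mode == TOY_QImode && to.mode == TOY_HImode;
}

/* Mode check plus a check that the element count really doubles.  */
static bool
toy_convert_ok_checked (struct toy_mask_type from, struct toy_mask_type to)
{
  return toy_convert_ok_by_mode (from, to) && to.nelts == 2 * from.nelts;
}
```

With these definitions a 4 -> 16 conversion passes the mode-only check
but fails the checked one, while 8 -> 16 passes both.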

It's not clear what to do for GCC 7 though to enable such conversions.
It looks like for AVX-512 we have three boolean vectors sharing the
same QImode: vec(2), vec(4) and vec(8).  It means
we can't use optabs to check operations on these vectors even
using conversion optabs instead of direct ones.  Can we use half/quarter
byte modes for such masks or something like that?  Another option is to
handle their conversion separately with no optabs usage at all (target hooks?).

The patch was bootstrapped and regtested on x86_64-unknown-linux-gnu.
OK for trunk?

Thanks,
Ilya
--
gcc/

2016-03-17  Ilya Enkovich  

PR tree-optimization/70252
* tree-vect-stmts.c (supportable_widening_operation): Check resulting
boolean vector has a proper number of elements.
(supportable_narrowing_operation): Likewise.

gcc/testsuite/

2016-03-17  Ilya Enkovich  

PR tree-optimization/70252
* gcc.dg/pr70252.c: New test.


diff --git a/gcc/testsuite/gcc.dg/pr70252.c b/gcc/testsuite/gcc.dg/pr70252.c
new file mode 100644
index 0000000..209e691
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr70252.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+/* { dg-additional-options "-march=skylake-avx512" { target { i?86-*-* x86_64-*-* } } } */
+
+extern unsigned char a [150];
+extern unsigned char b [150];
+extern unsigned char c [150];
+extern unsigned char d [150];
+extern unsigned char e [150];
+
+void foo () {
+  for (int i = 92; i <= 141; i += 2) {
+int tmp = (d [i] && b [i]) <= (a [i] > c [i]);
+e [i] = tmp >> b [i];
+  }
+}
diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index 06b1ab7..d12c062 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -8940,7 +8940,12 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
   && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
-return true;
+  /* For scalar masks we may have different boolean
+vector types having the same QImode.  Thus we
+add additional check for elements number.  */
+return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+   || (TYPE_VECTOR_SUBPARTS (vectype) / 2
+   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
 
   /* Check if it's a multi-step conversion that can be done using intermediate
  types.  */
@@ -8991,7 +8996,9 @@ supportable_widening_operation (enum tree_code code, gimple *stmt,
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (wide_vectype)
  && insn_data[icode2].operand[0].mode == TYPE_MODE (wide_vectype))
-   return true;
+   return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+   || (TYPE_VECTOR_SUBPARTS (intermediate_type) / 2
+   == TYPE_VECTOR_SUBPARTS (wide_vectype)));
 
   prev_type = intermediate_type;
   prev_mode = intermediate_mode;
@@ -9075,7 +9082,12 @@ supportable_narrowing_operation (enum tree_code code,
   *code1 = c1;
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
-return true;
+/* For scalar masks we may have different boolean
+   vector types having the same QImode.  Thus we
+   add additional check for elements number.  */
+return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+   || (TYPE_VECTOR_SUBPARTS (vectype) * 2
+   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
 
   /* Check if it's a multi-step conversion that can be done using intermediate
  types.  */
@@ -9140,7 +9152,9 @@ supportable_narrowing_operation (enum tree_code code,
   (*multi_step_cvt)++;
 
   if (insn_data[icode1].operand[0].mode == TYPE_MODE (narrow_vectype))
-   return true;
+   return (!VECTOR_BOOLEAN_TYPE_P (vectype)
+   || (TYPE_VECTOR_SUBPARTS (intermediate_type) * 2
+   == TYPE_VECTOR_SUBPARTS (narrow_vectype)));
 
   prev_mode = intermediate_mode;
   prev_type = intermediate_type;


Re: [RFA 1/2]: Don't ignore target_header_dir when deciding inhibit_libc

2016-03-19 Thread Andre Vieira (lists)
On 23/10/15 12:31, Bernd Schmidt wrote:
> On 10/12/2015 11:58 AM, Ulrich Weigand wrote:
>>
>> Index: gcc/configure.ac
>> ===
>> --- gcc/configure.ac(revision 228530)
>> +++ gcc/configure.ac(working copy)
>> @@ -1993,7 +1993,7 @@ elif test "x$TARGET_SYSTEM_ROOT" != x; t
>>   fi
>>
>>   if test x$host != x$target || test "x$TARGET_SYSTEM_ROOT" != x; then
>> -  if test "x$with_headers" != x; then
>> +  if test "x$with_headers" != x && test "x$with_headers" != xyes; then
>>   target_header_dir=$with_headers
>> elif test "x$with_sysroot" = x; then
>>  
>> target_header_dir="${test_exec_prefix}/${target_noncanonical}/sys-include"
>>
> 
> I'm missing the beginning of this conversation, but this looks like a
> reasonable change (avoiding target_header_dir=yes for --with-headers).
> So, approved.
> 
> 
> Bernd
> 
Hi there,

I was wondering why this never made it to trunk. I am currently running
into an issue that this patch would fix.

Cheers,
Andre


Re: PING^1: [PATCH] Add TYPE_EMPTY_RECORD for C++ empty class

2016-03-19 Thread Jason Merrill

On 03/16/2016 07:55 AM, H.J. Lu wrote:

On Tue, Mar 15, 2016 at 7:51 PM, Jason Merrill  wrote:

On 03/15/2016 08:25 PM, Joseph Myers wrote:


On Tue, 15 Mar 2016, H.J. Lu wrote:


On Tue, Mar 15, 2016 at 3:34 PM, Joseph Myers 
wrote:


On Tue, 15 Mar 2016, H.J. Lu wrote:


On Tue, Mar 15, 2016 at 2:39 PM, Joseph Myers 
wrote:


I'm not sure if the zero-size arrays (a GNU extension) are considered
to
make a struct non-empty, but in any case I think the tests should
cover
such arrays as elements of structs.



There are couple tests for structs with members of array
of empty types.  testsuite/g++.dg/abi/empty14.h has



My concern is the other way round - structs with elements such as
"int a[0];", an array [0] of a nonempty type.  My reading of the
subobject
definition is that such an array should not cause the struct to be
considered nonempty (it doesn't result in any int subobjects).



This is a test for struct with zero-size array, which isn't treated
as empty type.  C++ and C are compatible in its passing.



Where is the current definition of empty types you're proposing for use in
GCC?  Is the behavior of this case clear from that definition?



"An empty type is a type where it and all of its subobjects (recursively)
are of structure, union, or array type.  No memory slot nor register should
be used to pass or return an object of empty type."

It seems to me that such a struct should be considered an empty type under
this definition, since a zero-length array has no subobjects.


Since zero-size array is GCC extension, we can change it.   Do we
want to change its passing for C?


I would think so; it seems to follow clearly from this definition.  I 
have trouble imagining that anyone would ever pass an object containing 
a zero-length array by value, so it shouldn't matter much either way, 
but consistency is good.


Jason



[gomp-nvptx 4/7] nvptx backend: re-enable line info generation

2016-03-19 Thread Alexander Monakov
* config/nvptx/nvptx.c (nvptx_option_override): Remove custom handling
of debug info options.
---
 gcc/ChangeLog.gomp-nvptx | 5 +
 gcc/config/nvptx/nvptx.c | 9 -
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index 81dd9a2..e69e0be 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -156,15 +156,6 @@ nvptx_option_override (void)
   /* Assumes that it will see only hard registers.  */
   flag_var_tracking = 0;
 
-  if (write_symbols == DBX_DEBUG)
-/* The stabs testcases want to know stabs isn't supported.  */
-sorry ("stabs debug format not supported");
-
-  /* Actually we don't have any debug format, but don't be
- unneccesarily noisy.  */
-  write_symbols = NO_DEBUG;
-  debug_info_level = DINFO_LEVEL_NONE;
-
   if (nvptx_optimize < 0)
 nvptx_optimize = optimize > 0;
 


Re: [PATCH] Change replace_rtx if from is a REG (PR target/70245, take 2)

2016-03-19 Thread Jakub Jelinek
On Thu, Mar 17, 2016 at 11:07:03PM +1030, Alan Modra wrote:
> On Thu, Mar 17, 2016 at 12:16:58PM +0100, Jakub Jelinek wrote:
> > the rs6000 backend for whatever strange reason I haven't understood
> > really wants pointer equality instead of REGNO comparison (even when the
> > modes match), one (reg:DI 12) should be replaced, another (reg:DI 12)
> > should not.
> 
> By the look of what you posted in the bugzilla, the pattern is the
> parallel emitted by rs6000_emit_savres_rtx.  In that parallel, the
> stack memory locations for register saves are described relative to
> whatever frame_reg_rtx is in use, which may be r12.
> rs6000_frame_related wants to translate the frame_reg_rtx into stack
> pointer plus offset for debug info.
> 
> The parallel matches save_gpregs_<mode>_r12 and similar in rs6000.md,
> which emit a call to an out-of-line register save function.  This
> function actually takes r12 as a parameter, hence the (use (reg:P 12))
> in the pattern.
> 
> rs6000_frame_related probably should just be replacing individual
> SETs in the parallel using simplify_replace_rtx.  Especially since
> after calling replace_rtx, it already iterates over them to simplify.

That was one thing; another was in the combiner, where it replaced
just a subset of the registers and left others in (e.g. with different mode).

And on aarch64 trying to replace flags reg with CC_NZ mode when there is CC
mode (or vice versa?).  I bet for GCC 7 we want to analyze all uses of
replace_rtx for what exactly we want.
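
For readers following along, the difference between the two matching
strategies can be sketched in plain C.  This is a toy model only: the
toy_reg struct and both helper functions are made up for the example
and are not GCC's replace_rtx or simplify_replace_rtx.

```c
#include <assert.h>
#include <stddef.h>

/* Toy register object; two distinct objects may carry the same
   register number.  */
struct toy_reg
{
  unsigned regno;
};

/* Replace only operands that are the very same object as FROM.
   Returns the number of operands replaced.  */
static size_t
toy_replace_by_pointer (struct toy_reg **ops, size_t n,
			const struct toy_reg *from, struct toy_reg *to)
{
  size_t count = 0;
  assert (ops != NULL && from != NULL && to != NULL);
  for (size_t i = 0; i < n; i++)
    if (ops[i] == from)
      {
	ops[i] = to;
	count++;
      }
  return count;
}

/* Replace every operand whose register number equals FROM's.
   Returns the number of operands replaced.  */
static size_t
toy_replace_by_regno (struct toy_reg **ops, size_t n,
		      const struct toy_reg *from, struct toy_reg *to)
{
  size_t count = 0;
  assert (ops != NULL && from != NULL && to != NULL);
  for (size_t i = 0; i < n; i++)
    if (ops[i]->regno == from->regno)
      {
	ops[i] = to;
	count++;
      }
  return count;
}
```

Given two separate objects that both describe register 12, the
pointer-based helper replaces only the one passed as FROM, while the
regno-based helper replaces both.  Which of the two a caller wants is
exactly what would need to be audited per use.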

Jakub


C++ PATCH for c++/70139 (-fno-elide-constructors breaks regex)

2016-03-19 Thread Jason Merrill
The constexpr code for shortcutting trivial copy ctor/op= didn't get 
updated for the C++14 constexpr implementation, where we need to 
consider side effects.


For GCC 5.4 I'm just going to disable the shortcut.

Tested x86_64-pc-linux-gnu, applying to trunk.

commit a805189949e8ed36713d5eb78c283a5000566bf0
Author: Jason Merrill 
Date:   Fri Mar 18 14:57:58 2016 -0400

	PR c++/70139
	* constexpr.c (cxx_eval_call_expression): Fix trivial copy.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 5f97c9d..1f496b5 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1239,19 +1239,39 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
   return t;
 }
 
+  constexpr_ctx new_ctx = *ctx;
+  if (DECL_CONSTRUCTOR_P (fun) && !ctx->object
+  && TREE_CODE (t) == AGGR_INIT_EXPR)
+{
+  /* We want to have an initialization target for an AGGR_INIT_EXPR.
+	 If we don't already have one in CTX, use the AGGR_INIT_EXPR_SLOT.  */
+  new_ctx.object = AGGR_INIT_EXPR_SLOT (t);
+  tree ctor = new_ctx.ctor = build_constructor (DECL_CONTEXT (fun), NULL);
+  CONSTRUCTOR_NO_IMPLICIT_ZERO (ctor) = true;
+  ctx->values->put (new_ctx.object, ctor);
+  ctx = &new_ctx;
+}
+
   /* Shortcut trivial constructor/op=.  */
   if (trivial_fn_p (fun))
 {
+  tree init = NULL_TREE;
   if (call_expr_nargs (t) == 2)
-	{
-	  tree arg = convert_from_reference (get_nth_callarg (t, 1));
-	  return cxx_eval_constant_expression (ctx, arg,
-	   lval, non_constant_p,
-	   overflow_p);
-	}
+	init = convert_from_reference (get_nth_callarg (t, 1));
   else if (TREE_CODE (t) == AGGR_INIT_EXPR
 	   && AGGR_INIT_ZERO_FIRST (t))
-	return build_zero_init (DECL_CONTEXT (fun), NULL_TREE, false);
+	init = build_zero_init (DECL_CONTEXT (fun), NULL_TREE, false);
+  if (init)
+	{
+	  tree op = get_nth_callarg (t, 0);
+	  if (is_dummy_object (op))
+	op = ctx->object;
+	  else
+	op = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (op)), op);
+	  tree set = build2 (MODIFY_EXPR, TREE_TYPE (op), op, init);
+	  return cxx_eval_constant_expression (ctx, set, lval,
+	   non_constant_p, overflow_p);
+	}
 }
 
   /* We can't defer instantiating the function any longer.  */
@@ -1287,19 +1307,6 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 }
 }
 
-  constexpr_ctx new_ctx = *ctx;
-  if (DECL_CONSTRUCTOR_P (fun) && !ctx->object
-  && TREE_CODE (t) == AGGR_INIT_EXPR)
-{
-  /* We want to have an initialization target for an AGGR_INIT_EXPR.
-	 If we don't already have one in CTX, use the AGGR_INIT_EXPR_SLOT.  */
-  new_ctx.object = AGGR_INIT_EXPR_SLOT (t);
-  tree ctor = new_ctx.ctor = build_constructor (DECL_CONTEXT (fun), NULL);
-  CONSTRUCTOR_NO_IMPLICIT_ZERO (ctor) = true;
-  ctx->values->put (new_ctx.object, ctor);
-  ctx = &new_ctx;
-}
-
   bool non_constant_args = false;
  cxx_bind_parameters_in_call (ctx, t, &new_call,
 			   non_constant_p, overflow_p, &non_constant_args);
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-trivial1.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-trivial1.C
new file mode 100644
index 0000000..f4b74a7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-trivial1.C
@@ -0,0 +1,20 @@
+// PR c++/70139
+// { dg-options "-fno-elide-constructors" }
+// { dg-do compile { target c++11 } }
+
+template<typename T, typename U>
+struct A
+{
+  T a;
+  U b;
+  constexpr A () : a (), b () { }
+  constexpr A (const T &x, const U &y) : a (x), b (y) { }
+};
+struct B
+{
+  constexpr B (const bool x) : c (x) {}
+  constexpr bool operator!= (const B x) const { return c != x.c; }
+  bool c;
+};
+constexpr static A d[] = { { B (true), nullptr }, { B (false), nullptr } };
+static_assert (d[0].a != d[1].a, "");


Re: [PATCH][PR rtl-optimization/70024] Fix argument to CROSSING_JUMP_P

2016-03-19 Thread Jeff Law

On 03/16/2016 11:37 AM, Andreas Schwab wrote:

Jeff Law  writes:


PR rtl-optimization/70024


That's probably a typo.

Already fixed.
jeff


Re: [RFA][PATCH][PR tree-optimization/64058] Improve and stabilize sorting of coalesce pairs

2016-03-19 Thread Jeff Law

On 03/15/2016 08:22 AM, Richard Biener wrote:

To work around the narrow API in the comparison function we have to either
store additional data in each node or have them available in globals.  The
former would be horribly wasteful, the latter is just ugly.  I choose the
latter in the lazy evaluation of the conflicts version.


Works for me.
I'm going to take a look at Trevor's suggestion to use std::sort with a 
suitable class.  That may ultimately be cleaner.
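
As a rough illustration of the workaround mentioned above (a qsort
comparison function only receives the two elements, so extra data has
to come from a global), here is a minimal sketch in plain C.  It is
not the coalescing code: toy_cost_table, toy_compare_by_cost and
toy_sort_by_cost are hypothetical names, and the "cost" array merely
stands in for whatever per-pair data the real comparison needs.

```c
#include <assert.h>
#include <stdlib.h>

/* Context for the comparison function.  qsort gives the callback no
   way to receive it as an argument, hence the file-scope variable.  */
static const int *toy_cost_table;

/* Order ids by descending cost; break ties on the id itself so the
   result does not depend on the qsort implementation.  */
static int
toy_compare_by_cost (const void *a, const void *b)
{
  int ia = *(const int *) a;
  int ib = *(const int *) b;
  int ca = toy_cost_table[ia];
  int cb = toy_cost_table[ib];

  if (ca != cb)
    return (ca < cb) - (ca > cb);	/* Higher cost sorts first.  */
  return (ia > ib) - (ia < ib);		/* Lower id first on ties.  */
}

/* Sort IDS (indices into COSTS) using the global as side channel.  */
static void
toy_sort_by_cost (int *ids, size_t n, const int *costs)
{
  assert (ids != NULL && costs != NULL);
  toy_cost_table = costs;
  qsort (ids, n, sizeof ids[0], toy_compare_by_cost);
  toy_cost_table = NULL;
}
```

The explicit tie-break is what makes the order deterministic; the
global is the ugly part, and it also makes the helper non-reentrant.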




As far as a testcase goes we want to scan the dumps for the actual
coalesces
being done.  Might be a bit fragile though...


I suspect that's going to be quite fragile and may have more target
dependencies than we'd like (due to branch costing and such).


Yes.

Otherwise -ENOPATCH.
Right.  I haven't yet written the part that counts the number of unique
bits across two bitmaps as an exported function from bitmap.[ch].  So no
patch was included.  Off to do that now :-)


jeff


[Patch] [x86_64]: minor latency changes for znver1.md

2016-03-19 Thread Kumar, Venkataramanan
Hi Uros,

The below patch changes the latency values for fp type load reservations. 

It passes normal bootstrap and bootstrap with
BOOT_CFLAGS="-O2 -g -march=znver1 -mno-clzero -mno-sha" on an avx2 target.
I also compiled and ran SPEC2006 with -march=znver1 and -Ofast.

Ok for trunk?

ChangeLog
2016-03-17  Venkataramanan Kumar  

    * config/i386/znver1.md : Fix latency for FP/SSE/AVX load type 
reservations.

---snip---
diff --git a/gcc/config/i386/znver1.md b/gcc/config/i386/znver1.md
index 1d28c05..7db0562 100644
--- a/gcc/config/i386/znver1.md
+++ b/gcc/config/i386/znver1.md
@@ -328,7 +328,7 @@
  (eq_attr "type" "fcmov"))
 "znver1-vector,znver1-fvector")
 
-(define_insn_reservation "znver1_fp_mov_direct_load" 5
+(define_insn_reservation "znver1_fp_mov_direct_load" 8 
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "znver1_decode" "direct")
   (and (eq_attr "type" "fmov")
@@ -349,7 +349,7 @@
(eq_attr "memory" "none"
 "znver1-double,znver1-fp3")
 
-(define_insn_reservation "znver1_fp_mov_double_load" 9
+(define_insn_reservation "znver1_fp_mov_double_load" 12
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "znver1_decode" "double")
   (and (eq_attr "type" "fmov")
@@ -386,7 +386,7 @@
(eq_attr "type" "fcmp"
 "znver1-double,znver1-fp0,znver1-fp2")
 
-(define_insn_reservation "znver1_fp_fcmp_load" 6
+(define_insn_reservation "znver1_fp_fcmp_load" 9
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "memory" "none")
   (and (eq_attr "znver1_decode" "double")
@@ -400,13 +400,13 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp0*5")
 
-(define_insn_reservation "znver1_fp_op_mul_load" 9
+(define_insn_reservation "znver1_fp_op_mul_load" 12 
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "fop,fmul")
   (eq_attr "memory" "load")))
 "znver1-direct,znver1-load,znver1-fp0*5")
 
-(define_insn_reservation "znver1_fp_op_imul_load" 13
+(define_insn_reservation "znver1_fp_op_imul_load" 16
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "fop,fmul")
   (and (eq_attr "fp_int_src" "true")
@@ -419,13 +419,13 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp3*15")
 
-(define_insn_reservation "znver1_fp_op_div_load" 19
+(define_insn_reservation "znver1_fp_op_div_load" 22
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "fdiv")
   (eq_attr "memory" "load")))
 "znver1-direct,znver1-load,znver1-fp3*15")
 
-(define_insn_reservation "znver1_fp_op_idiv_load" 24
+(define_insn_reservation "znver1_fp_op_idiv_load" 27
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "fdiv")
   (and (eq_attr "fp_int_src" "true")
@@ -444,7 +444,7 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp0|znver1-fp1|znver1-fp3")
 
-(define_insn_reservation "znver1_mmx_add_load" 5
+(define_insn_reservation "znver1_mmx_add_load" 8
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "mmxadd")
   (eq_attr "memory" "load")))
@@ -456,7 +456,7 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp0|znver1-fp3")
 
-(define_insn_reservation "znver1_mmx_cmp_load" 5
+(define_insn_reservation "znver1_mmx_cmp_load" 8
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "mmxcmp")
   (eq_attr "memory" "load")))
@@ -468,7 +468,7 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp1|znver1-fp2")
 
-(define_insn_reservation "znver1_mmx_cvt_pck_shuf_load" 5
+(define_insn_reservation "znver1_mmx_cvt_pck_shuf_load" 8
 (and (eq_attr "cpu" "znver1")
  (and (eq_attr "type" "mmxcvt,sseshuf,sseshuf1")
   (eq_attr "memory" "load")))
@@ -480,7 +480,7 @@
   (eq_attr "memory" "none")))
 "znver1-direct,znver1-fp2")
 
