date:20170901

[PATCH] Fix profile update in tree-ssa-isolate-paths.c (PR tree-optimization/82059).

2017-09-01 Thread Martin Liška

Hello.

To keep the profile valid, we should add an edge's counts only when the edge
is really redirected and has not already been redirected in a previous
invocation of the function.
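
A minimal sketch of the idea in plain C, not the actual GCC code: the `toy_*` types and functions are hypothetical stand-ins for `basic_block`, `edge` and `redirect_edge_and_branch`, and the sketch assumes that redirection reports failure by returning NULL when the edge already reaches the destination.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for GCC's basic_block and edge; NOT the real types.  */
struct toy_block { long count; };
struct toy_edge { long count; struct toy_block *dest; };

/* Hypothetical model of redirect_edge_and_branch: return E on success,
   or NULL when E already reaches DEST (assumed failure convention).  */
static struct toy_edge *
toy_redirect (struct toy_edge *e, struct toy_block *dest)
{
  assert (e != NULL && dest != NULL);
  if (e->dest == dest)
    return NULL;
  e->dest = dest;
  return e;
}

/* Mirror of the patched logic: BB's count grows by E's count only
   when the redirection to DUPLICATE really happened.  */
static void
toy_isolate (struct toy_edge *e, struct toy_block *duplicate,
             struct toy_block *bb)
{
  if (toy_redirect (e, duplicate))
    bb->count += e->count;
}
```

Calling `toy_isolate` twice with the same edge adds the count once; with the unconditional update the second call would add it again.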

The patch bootstraps on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin

gcc/ChangeLog:

2017-08-31  Martin Liska  

PR tree-optimization/82059
* gimple-ssa-isolate-paths.c (isolate_path): Add profile and
frequency only when an edge is redirected.

gcc/testsuite/ChangeLog:

2017-08-31  Martin Liska  

PR tree-optimization/82059
* gcc.dg/tree-ssa/pr82059.c: New test.
---
 gcc/gimple-ssa-isolate-paths.c  |  9 ++---
 gcc/testsuite/gcc.dg/tree-ssa/pr82059.c | 22 ++
 2 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr82059.c


diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
index fbc41057463..807e0032410 100644
--- a/gcc/gimple-ssa-isolate-paths.c
+++ b/gcc/gimple-ssa-isolate-paths.c
@@ -160,14 +160,17 @@ isolate_path (basic_block bb, basic_block duplicate,
 	for (ei = ei_start (duplicate->succs); (e2 = ei_safe_edge (ei)); )
 	  remove_edge (e2);
     }
-  bb->frequency += EDGE_FREQUENCY (e);
-  bb->count += e->count;
 
   /* Complete the isolation step by redirecting E to reach DUPLICATE.  */
   e2 = redirect_edge_and_branch (e, duplicate);
   if (e2)
-    flush_pending_stmts (e2);
+    {
+      flush_pending_stmts (e2);
 
+      /* Update profile only when redirection is really processed.  */
+      bb->frequency += EDGE_FREQUENCY (e);
+      bb->count += e->count;
+    }
 
   /* There may be more than one statement in DUPLICATE which exhibits
  undefined behavior.  Ultimately we want the first such statement in
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c b/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c
new file mode 100644
index 00000000000..0285b03cc04
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c
@@ -0,0 +1,22 @@
+/* PR tree-optimization/82059 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-isolate-paths" } */
+
+struct a
+{
+  char b;
+  struct a *c;
+} d (), f;
+void *e;
+long g;
+void
+h ()
+{
+  struct a *i = 0;
+  if (g)
+    i = e;
+  if (!i)
+    d ();
+  i->c = &f;
+  i->b = *(char *) h;
+}

Re: [PATCH] Expand switch statements with a single (or none) non-default case.

2017-09-01 Thread Martin Liška

On 08/30/2017 02:56 PM, Richard Biener wrote:
> On Wed, Aug 30, 2017 at 2:32 PM, Martin Liška  wrote:
>> On 08/30/2017 02:28 PM, Richard Biener wrote:
>>> On Wed, Aug 30, 2017 at 1:13 PM, Martin Liška  wrote:
>>>> Hi.
>>>>
>>>> Simple transformation of switch statements where degenerated switch can
>>>> be interpreted as gimple condition (or removed if not having any
>>>> non-default case). I originally thought that we don't have switch
>>>> statements without non-default cases, but PR82032 shows we can see it.
>>>>
>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>>
>>>> Ready to be installed?
>>>
>>> While I guess this case is ok to handle here it would be nice if CFG cleanup
>>> would do the same.  I suppose find_taken_edge somehow doesn't work for
>>> this case even after my last enhancement?  Or is CFG cleanup for some reason
>>> not run?
>>
>> Do you mean both with # of non-default edges equal to 0 and 1?
>> Let me take a look.
> 
> First and foremost 0.  The case of 1 non-default and a default would
> need extra code.

For the test-case I reduced, one needs:

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index b7593068ea9..13af516c6ac 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -8712,7 +8712,7 @@ const pass_data pass_data_split_crit_edges =
   PROP_no_crit_edges, /* properties_provided */
   0, /* properties_destroyed */
   0, /* todo_flags_start */
-  0, /* todo_flags_finish */
+  TODO_cleanup_cfg, /* todo_flags_finish */
 };

 class pass_split_crit_edges : public gimple_opt_pass

And the code eliminates the problematic switch statement. Do you believe it's
the right approach to add the clean up and preserve the assert in
tree-switch-conversion.c?

For the case with # of edges == 1, should I place it in tree-cfg.c in order to
trigger it as a clean-up?
Thoughts?

Martin

> 
> Richard.
> 
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> Martin
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> 2017-08-25  Martin Liska
>>>>
>>>> PR tree-optimization/82032
>>>> * tree-switch-conversion.c (generate_high_low_equality): New
>>>> function.
>>>> (expand_degenerated_switch): Likewise.
>>>> (process_switch): Call expand_degenerated_switch.
>>>> (try_switch_expansion): Likewise.
>>>> (emit_case_nodes): Use generate_high_low_equality.
>>>>
>>>> gcc/testsuite/ChangeLog:
>>>>
>>>> 2017-08-25  Martin Liska
>>>>
>>>> PR tree-optimization/82032
>>>> * gcc.dg/tree-ssa/pr68198.c: Update jump threading expectations.
>>>> * gcc.dg/tree-ssa/switch-expansion.c: New test.
>>>> * gcc.dg/tree-ssa/vrp34.c: Update scanned pattern.
>>>> * g++.dg/other/pr82032.C: New test.
>>>> ---
>>>>  gcc/testsuite/g++.dg/other/pr82032.C             |  36 +++
>>>>  gcc/testsuite/gcc.dg/tree-ssa/pr68198.c          |   6 +-
>>>>  gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c |  14 +++
>>>>  gcc/testsuite/gcc.dg/tree-ssa/vrp34.c            |   5 +-
>>>>  gcc/tree-switch-conversion.c                     | 123 ++-
>>>>  5 files changed, 152 insertions(+), 32 deletions(-)
>>>>  create mode 100644 gcc/testsuite/g++.dg/other/pr82032.C
>>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c

>>

Re: [PATCH] Expand switch statements with a single (or none) non-default case.

2017-09-01 Thread Richard Biener

On Fri, Sep 1, 2017 at 10:07 AM, Martin Liška  wrote:
> On 08/30/2017 02:56 PM, Richard Biener wrote:
>> On Wed, Aug 30, 2017 at 2:32 PM, Martin Liška  wrote:
>>> On 08/30/2017 02:28 PM, Richard Biener wrote:
>>>> On Wed, Aug 30, 2017 at 1:13 PM, Martin Liška wrote:
>>>>> Hi.
>>>>>
>>>>> Simple transformation of switch statements where degenerated switch can
>>>>> be interpreted as gimple condition (or removed if not having any
>>>>> non-default case). I originally thought that we don't have switch
>>>>> statements without non-default cases, but PR82032 shows we can see it.
>>>>>
>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
>>>>>
>>>>> Ready to be installed?
>>>>
>>>> While I guess this case is ok to handle here it would be nice if CFG cleanup
>>>> would do the same.  I suppose find_taken_edge somehow doesn't work for
>>>> this case even after my last enhancement?  Or is CFG cleanup for some reason
>>>> not run?
>>>
>>> Do you mean both with # of non-default edges equal to 0 and 1?
>>> Let me take a look.
>>
>> First and foremost 0.  The case of 1 non-default and a default would
>> need extra code.
>
> For the test-case I reduced, one needs:
>
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index b7593068ea9..13af516c6ac 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -8712,7 +8712,7 @@ const pass_data pass_data_split_crit_edges =
>    PROP_no_crit_edges, /* properties_provided */
>    0, /* properties_destroyed */
>    0, /* todo_flags_start */
> -  0, /* todo_flags_finish */
> +  TODO_cleanup_cfg, /* todo_flags_finish */
>  };
>
>  class pass_split_crit_edges : public gimple_opt_pass
>
> And the code eliminates the problematic switch statement. Do you believe it's 
> the right approach
> to add the clean up and preserve the assert in tree-switch-conversion.c?

Eh, no.  If you run cleanup-cfg after critical edge splitting, the edges
will be unsplit immediately, making it (mostly) a no-op.

OTOH I wanted to eliminate that "pass" in favor of just calling
split_critical_edges () where needed (that's already done in some places).

> For the case with # of edges == 1, should I place it to tree-cfg.c in order 
> to trigger it as a clean-up?

I believe the code for edges == 1 can reside in cleanup_control_expr_graph.
Probably easiest from a flow perspective if we do the switch -> cond
transform before the existing code handling cond and switch via
find-taken-edge.
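
As a rough source-level illustration of the switch -> cond idea (plain C, not GIMPLE and not code from the patch), a switch with exactly one non-default case computes the same thing as a single condition. The function names below are hypothetical, and the range helper only sketches what a high/low comparison for a single case range could look like.

```c
#include <assert.h>

/* Degenerate switch: one non-default case plus the default.  */
static int
switch_form (int x)
{
  switch (x)
    {
    case 3:
      return 1;
    default:
      return 0;
    }
}

/* The same function written as a plain condition.  */
static int
cond_form (int x)
{
  if (x == 3)
    return 1;
  return 0;
}

/* Illustrative test for a single case covering [LOW, HIGH]: an
   equality when the range is one value, a bounds check otherwise.  */
static int
in_case_range (int x, int low, int high)
{
  assert (low <= high);
  if (low == high)
    return x == low;
  return low <= x && x <= high;
}
```

A switch with no non-default case at all is simpler still: control always reaches the default label, so the statement can be replaced by an unconditional jump.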

> Thoughts?
>
> Martin
>
>>
>> Richard.
>>
>>> Martin
>>>

>>>> Richard.

> Martin
>
> gcc/ChangeLog:
>
> 2017-08-25  Martin Liska  
>
> PR tree-optimization/82032
> * tree-switch-conversion.c (generate_high_low_equality): New
> function.
> (expand_degenerated_switch): Likewise.
> (process_switch): Call expand_degenerated_switch.
> (try_switch_expansion): Likewise.
> (emit_case_nodes): Use generate_high_low_equality.
>
> gcc/testsuite/ChangeLog:
>
> 2017-08-25  Martin Liska  
>
> PR tree-optimization/82032
> * gcc.dg/tree-ssa/pr68198.c: Update jump threading expectations.
> * gcc.dg/tree-ssa/switch-expansion.c: New test.
> * gcc.dg/tree-ssa/vrp34.c: Update scanned pattern.
> * g++.dg/other/pr82032.C: New test.
> ---
>  gcc/testsuite/g++.dg/other/pr82032.C |  36 +++
>  gcc/testsuite/gcc.dg/tree-ssa/pr68198.c  |   6 +-
>  gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c |  14 +++
>  gcc/testsuite/gcc.dg/tree-ssa/vrp34.c|   5 +-
>  gcc/tree-switch-conversion.c | 123 
> ++-
>  5 files changed, 152 insertions(+), 32 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/other/pr82032.C
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c
>
>
>>>
>

[PATCH, Makefile] improve libsubdir variable transmission to sub-makes on Windows

2017-09-01 Thread Olivier Hainque

Hello,

To convey the value of libsubdir to sub-makes, gcc/Makefile.in has

  # Flags to pass to recursive makes.
  # CC is set by configure.
  # ??? The choices here will need some experimenting with.

  export AR_FOR_TARGET
  ...
  export libsubdir

then:

  FLAGS_TO_PASS = \
  ...
  (libsubdir not here)

This does not work well in Cygwin environments, where environment
variable names are translated to uppercase, so sub-makes that evaluate
the variable under its lowercase name do not get the value.

The attached patch is a suggestion to address this by simply adding
"libsubdir" to the list of explicit FLAGS_TO_PASS.

Used in-house for a while, with gcc-6 on a wide variety of host/target
configurations.

Bootstrapped and regression tested with mainline on x86_64-linux.

OK to commit?

Thanks in advance,

With Kind Regards,

Olivier

2017-09-01  Jerome Lambourg  

* Makefile.in (FLAGS_TO_PASS): Add libsubdir.




gcc-libsubdir.diff
Description: Binary data

Re: [PATCH] Fix bug in simplify_ternary_operation

2017-09-01 Thread Tom de Vries


On 08/31/2017 11:44 PM, Jeff Law wrote:

> On 08/28/2017 12:26 PM, Tom de Vries wrote:
>> Hi,
>>
>> I think I found a bug in r17465:
>> ...
>>
>> * cse.c (simplify_ternary_operation): Handle more IF_THEN_ELSE
>> simplifications.
>>
>> diff --git a/gcc/cse.c b/gcc/cse.c
>> index e001597..3c27387 100644
>> --- a/gcc/cse.c
>> +++ b/gcc/cse.c
>> @@ -4713,6 +4713,17 @@ simplify_ternary_operation (code, mode, op0_mode, op0, op1, op2)
>>
>> Note: the parameters of simplify_ternary_operation have the following
>> meaning:
>> ...
>> /* Simplify CODE, an operation with result mode MODE and three operands,
>>    OP0, OP1, and OP2.  OP0_MODE was the mode of OP0 before it became
>>    a constant.  Return 0 if no simplifications is possible.  */
>>
>> rtx
>> simplify_ternary_operation (code, mode, op0_mode, op0, op1, op2)
>>   enum rtx_code code;
>>   enum machine_mode mode, op0_mode;
>>   rtx op0, op1, op2;
>> ...
>>
>>    && rtx_equal_p (XEXP (op0, 1), op1)
>>    && rtx_equal_p (XEXP (op0, 0), op2))
>>  return op2;
>> +  else if (! side_effects_p (op0))
>> +   {
>> + rtx temp;
>> + temp = simplify_relational_operation (GET_CODE (op0), op0_mode,
>> +   XEXP (op0, 0), XEXP (op0, 1));
>>
>> We're handling code == IF_THEN_ELSE here, so op0 is the condition, op1
>> is the 'then expr' and op2 is the 'else expr'.
>>
>> The parameters of simplify_relational_operation have the following meaning:
>> ...
>> /* Like simplify_binary_operation except used for relational operators.
>>    MODE is the mode of the operands, not that of the result.  If MODE
>>    is VOIDmode, both operands must also be VOIDmode and we compare the
>>    operands in "infinite precision".
>>
>>    If no simplification is possible, this function returns zero.
>>    Otherwise, it returns either const_true_rtx or const0_rtx.  */
>>
>> rtx
>> simplify_relational_operation (code, mode, op0, op1)
>>   enum rtx_code code;
>>   enum machine_mode mode;
>>   rtx op0, op1;
>> ...
>>
>> The problem in the patch is that we use op0_mode argument for the mode
>> parameter. The mode parameter of simplify_relational_operation needs to
>> be the mode of the operands of the condition, while op0_mode is the mode
>> of the condition.
>>
>> Patch below fixes this on current trunk.
>>
>> [ I found this by running into an ICE in
>> gcc.c-torture/compile/pr28776-2.c for gcn target. I haven't been able to
>> reproduce this with an upstream branch yet. ]
>>
>> OK for trunk if bootstrap and reg-test for x86_64 succeeds?

> So clearly setting cmp_mode to op0_mode is wrong.   But we also have to
> make sure that if cmp_mode is VOIDmode that either XEXP (op0, 0) has a
> non-void mode or that XEXP (op0, 1) has a non-void mode, otherwise we're
> likely to abort down in simplify_const_relational_operation.



You're referring to this assert:
...
/* Check if the given comparison (done in the given MODE) is actually
   a tautology or a contradiction.  If the mode is VOID_mode, the
   comparison is done in "infinite precision".  If no simplification
   is possible, this function returns zero.  Otherwise, it returns
   either const_true_rtx or const0_rtx.  */

rtx
simplify_const_relational_operation (enum rtx_code code,
                                     machine_mode mode,
                                     rtx op0, rtx op1)
{
  ...

  gcc_assert (mode != VOIDmode
              || (GET_MODE (op0) == VOIDmode
                  && GET_MODE (op1) == VOIDmode));
...

added by Honza:
...
* simplify-rtx.c (simplify_relational_operation): Verify that
mode == VOIDmode implies both operands to be VOIDmode.
...

In other words, rewriting the assert in more readable form:
...
#define BOOL_IMPLIES(a, b) (!(a) || (b))
  gcc_assert (BOOL_IMPLIES (mode == VOIDmode,
                            (GET_MODE (op0) == VOIDmode
                             && GET_MODE (op1) == VOIDmode)));
...
[ I'd be in favor of rewriting imply relations using a macro or some 
such, I find it easier to understand. ]
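
To make the equivalence of the two forms concrete, here is a small standalone sketch in plain C. The `toy_mode` enum and the function names are hypothetical stand-ins for `machine_mode` and the GCC code; only the `BOOL_IMPLIES` macro is taken from the message above.

```c
#include <assert.h>
#include <stdbool.h>

#define BOOL_IMPLIES(a, b) (!(a) || (b))

/* Toy stand-in for GCC's machine_mode; only VOIDmode matters here.  */
enum toy_mode { TOY_VOIDmode, TOY_SImode };

/* The assert condition in its original "a != X || b" spelling.  */
static bool
cond_original (enum toy_mode mode, enum toy_mode m0, enum toy_mode m1)
{
  return mode != TOY_VOIDmode
         || (m0 == TOY_VOIDmode && m1 == TOY_VOIDmode);
}

/* The same condition spelled as an implication.  */
static bool
cond_implies (enum toy_mode mode, enum toy_mode m0, enum toy_mode m1)
{
  return BOOL_IMPLIES (mode == TOY_VOIDmode,
                       m0 == TOY_VOIDmode && m1 == TOY_VOIDmode);
}

/* Both spellings must agree for every combination of modes.  */
static void
check_all_combinations (void)
{
  for (int mode = TOY_VOIDmode; mode <= TOY_SImode; mode++)
    for (int m0 = TOY_VOIDmode; m0 <= TOY_SImode; m0++)
      for (int m1 = TOY_VOIDmode; m1 <= TOY_SImode; m1++)
        assert (cond_original ((enum toy_mode) mode, (enum toy_mode) m0,
                               (enum toy_mode) m1)
                == cond_implies ((enum toy_mode) mode, (enum toy_mode) m0,
                                 (enum toy_mode) m1));
}
```

The condition fails only when the mode is void while at least one operand mode is not, which is the case the assert is meant to catch.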


Now, simplify_relational_operation starts like this:
...
rtx
simplify_relational_operation (enum rtx_code code, machine_mode mode,
                               machine_mode cmp_mode, rtx op0, rtx op1)
{
  rtx tem, trueop0, trueop1;

  if (cmp_mode == VOIDmode)
    cmp_mode = GET_MODE (op0);
  if (cmp_mode == VOIDmode)
    cmp_mode = GET_MODE (op1);

  tem = simplify_const_relational_operation (code, cmp_mode, op0, op1);
...

AFAIU, the cmp_mode ifs ensure that the assert in 
simplify_const_relational_operation doesn't trigger.
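
The argument can be checked on a toy model. The sketch below is plain C with hypothetical `sk_*` names standing in for `machine_mode` and the two GCC functions; it models only the mode bookkeeping, not any RTL simplification.

```c
#include <assert.h>

/* Toy stand-in for GCC's machine_mode; only VOIDmode matters here.  */
enum sk_mode { SK_VOIDmode, SK_SImode, SK_DImode };

/* Mirror of the two fallback ifs: a void CMP_MODE is replaced by the
   mode of the first operand, then by the mode of the second.  */
static enum sk_mode
sk_resolve_cmp_mode (enum sk_mode cmp_mode, enum sk_mode op0_mode,
                     enum sk_mode op1_mode)
{
  if (cmp_mode == SK_VOIDmode)
    cmp_mode = op0_mode;
  if (cmp_mode == SK_VOIDmode)
    cmp_mode = op1_mode;
  return cmp_mode;
}

/* Stand-in for the callee: it only checks the assert condition and
   hands the mode back.  */
static enum sk_mode
sk_const_relational (enum sk_mode mode, enum sk_mode op0_mode,
                     enum sk_mode op1_mode)
{
  assert (mode != SK_VOIDmode
          || (op0_mode == SK_VOIDmode && op1_mode == SK_VOIDmode));
  return mode;
}
```

After the fallback, the resolved mode is void only when both operand modes are void, so the checked condition holds for every input combination in this model.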



> ISTM a better fix is to return NULL_RTX if cmp_mode is VOIDmode and both
> the sub-operations are VOIDmode as well.



I don't think we need that. simplify_const_relational_operation can 
handle the situation that mode == VOIDmode && GET_MODE (op0) == VOIDmode 
&& GET_MODE (op1) == VOIDmode.


Thanks,
- Tom


> Can you try that and verify that pr28776-2.c continues to work?
> jeff

Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread dbroemmel

Hi all,

attached are a proposed fix and new testcase for PR81841. The
THREADPRIVATE statement is currently wrongly rejected as part of BLOCK DATA.

The testcase also does (very basic) runtime checks. It fails (compiling)
prior to the patch and completes after. Tested on x86_64 GNU/Linux.

Thanks,
Dirk


2017-09-01 dbroemmel

PR fortran/81841
* parse.c (parse_spec): Add ST_OMP_THREADPRIVATE as an allowed
statement.

2017-09-01 dbroemmel

PR fortran/81841
* gfortran.dg/gomp/omp_threadprivate3.f90: New testcase.
Index: gcc/fortran/parse.c
===================================================================
--- gcc/fortran/parse.c	(revision 251553)
+++ gcc/fortran/parse.c	(working copy)
@@ -3694,6 +3694,7 @@
 	case ST_EQUIVALENCE:
 	case ST_IMPLICIT:
 	case ST_IMPLICIT_NONE:
+	case ST_OMP_THREADPRIVATE:
 	case ST_PARAMETER:
 	case ST_STRUCTURE_DECL:
 	case ST_TYPE:
Index: gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90
===================================================================
--- gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90	(revision 0)
+++ gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90	(working copy)
@@ -0,0 +1,158 @@
+! { dg-do run }
+! { dg-options "-fopenmp" }
+! PR fortran/81841
+
+
+! Test OpenMP THREADPRIVATE statement together w/ BLOCK DATA for COMMON BLOCKs.
+!
+! We try to test for correct initialisation per thread of compile and runtime values.
+!(The latter should be correctly defined on the master thread only, the former should
+! be identical on all threads.)
+! We also test behaviour in between two OpenMP PARALLEL regions.
+!(Changed values only expected to be valid between the first overlapping threads.)
+!
+! This could be extended to allocatables and pointers, however, assuming
+! THREADPRIVATE works in first place, this should test the BLOCK DATA case...
+program block_data_threadprivate
+   use omp_lib
+   implicit none
+
+   integer :: int1, int2, int3! variables 1 and 2 end up in two common blocks
+   real:: flt1, flt2, flt3! variables 3 should stay undefined
+  ! (and are not necessary but included to
+  ! check undefined values)
+   common /c_block_1/ int1, flt1  ! to be initialised at runtime
+   !$OMP THREADPRIVATE (/c_block_1/)
+   common /c_block_2/ int2, flt2  ! to be initialised via BLOCK DATA
+   !$OMP THREADPRIVATE (/c_block_2/)
+
+   ! runtime init int1 and flt1
+   !   (should be defined on master thread only)
+   int1 = 1
+   flt1 = 1.1
+   !   (int2 and flt2 should be available on all threads via BLOCK DATA and COMMON BLOCK)
+   write(*,'(a27,1x,3(i5,1x,f5.2,1x))') 'main thread, variables are:', int1, flt1,  int2,  flt2,  int3,  flt3
+   write(*,'(a27,1x,3(a5,1x,a5  ,1x))') 'they should be:', '1',  '1.1', '2',   '2.2', '???', '???'
+
+   ! spawn fixed number of threads to have predictable behaviour 
+   write(*,'(a)') 'spawning 4 threads'
+   !$OMP PARALLEL default(none) private(int3, flt3) num_threads(4)
+   !$OMP CRITICAL
+   ! critical to get nicer, sorted output
+   write(*,'(a22,1x,i3,a1,1x)',advance='no') 'thread id', omp_get_thread_num(), ':'
+   write(*,'(3(i5,1x,f5.2,1x))') int1, flt1, int2,  flt2,  int3,  flt3
+   select case (omp_get_thread_num())
+   case (0)
+  write(*,'(a27,1x,3(a5,1x,a5  ,1x))') 'they should be:', '1',  '1.1', '2',   '2.2', '???', '???'
+  if (int1 /= 1)   call abort
+  if (int2 /= 2)   call abort
+  if (flt1 >= 1.11 .or.  flt1 <= 1.09) call abort
+  if (flt2 >= 2.21 .or.  flt2 <= 2.19) call abort
+   case (1)
+  write(*,'(a27,1x,3(a5,1x,a5  ,1x))') 'they should be:', '???',  '???', '2',   '2.2', '???', '???'
+  if (int1 == 1)   call abort
+  if (int2 /= 2)   call abort
+  if (flt1 <= 1.11 .and. flt1 >= 1.09) call abort
+  if (flt2 >= 2.21 .or.  flt2 <= 2.19) call abort
+   case (2)
+  write(*,'(a27,1x,3(a5,1x,a5  ,1x))') 'they should be:', '???',  '???', '2',   '2.2', '???', '???'
+  if (int1 == 1)   call abort
+  if (int2 /= 2)   call abort
+  if (flt1 <= 1.11 .and. flt1 >= 1.09) call abort
+  if (flt2 >= 2.21 .or.  flt2 <= 2.19) call abort
+   case (3)
+  write(*,'(a27,1x,3(a5,1x,a5  ,1x))') 'they should be:', '???',  '???', '2',   '2.2', '???', '???'
+  if (int1 == 1)   call abort
+  if (int2 /= 2)   call abort
+  if (flt1 <= 1.11 .and. flt1 >= 1.09) call abort
+  if (flt2 >= 2.21 .or.  flt2 <= 2.19) call abort
+   case default
+  call abort
+   end select
+   !$OMP END CRITICAL
+   !$OMP END PARALLEL
+
+   ! spawn fixed (but lower) number of threads and change private data
+   write(*,'(a)') 'spawning 2 threads (changing thread private variables)'
+   !$OMP PARALLEL default(none) private(int3, flt3) num_threads(2)
+   int1 = 1

Re: [PATCH] streambuf_iterator: avoid debug-dependent behaviour

2017-09-01 Thread Jonathan Wakely


On 24/08/17 12:57 +0300, Petr Ovtchenkov wrote:

> Explicitly do sgetc from the associated streambuf. Avoid debug-dependent
> sgetc (within _M_at_eof()):
>
>   __glibcxx_requires_cond(!_M_at_eof(),
>                           _M_message(__gnu_debug::__msg_inc_istreambuf)
>                           ._M_iterator(*this));
>
> Increment operators do not require not-eof preconditions.
>
> Don't unlink the associated streambuf if eof is detected (in _M_get).
>
> Clean up logic in postfix increment operator.


I find it very hard to understand the reasons for this patch. What
you've written above is too terse for me.

Are you fixing a bug? If so, do you have a testcase that demonstrates
the problem, and is fixed by these changes?

Is this just refactoring, without changing behaviour?

Finally, and very importantly, do you have a copyright assignment for
changes to GCC? See https://gcc.gnu.org/contribute.html#legal

Re: [PATCH] Expand switch statements with a single (or none) non-default case.

2017-09-01 Thread Martin Liška

On 09/01/2017 10:26 AM, Richard Biener wrote:
> On Fri, Sep 1, 2017 at 10:07 AM, Martin Liška  wrote:
>> On 08/30/2017 02:56 PM, Richard Biener wrote:
>>> On Wed, Aug 30, 2017 at 2:32 PM, Martin Liška  wrote:
>>>> On 08/30/2017 02:28 PM, Richard Biener wrote:
>>>>> On Wed, Aug 30, 2017 at 1:13 PM, Martin Liška wrote:
>>>>>> Hi.
>>>>>>
>>>>>> Simple transformation of switch statements where degenerated switch can
>>>>>> be interpreted as gimple condition (or removed if not having any
>>>>>> non-default case). I originally thought that we don't have switch
>>>>>> statements without non-default cases, but PR82032 shows we can see it.
>>>>>>
>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>> tests.
>>>>>>
>>>>>> Ready to be installed?
>>>>>
>>>>> While I guess this case is ok to handle here it would be nice if CFG
>>>>> cleanup would do the same.  I suppose find_taken_edge somehow doesn't
>>>>> work for this case even after my last enhancement?  Or is CFG cleanup
>>>>> for some reason not run?
>>>>
>>>> Do you mean both with # of non-default edges equal to 0 and 1?
>>>> Let me take a look.
>>>
>>> First and foremost 0.  The case of 1 non-default and a default would
>>> need extra code.
>>
>> For the test-case I reduced, one needs:
>>
>> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
>> index b7593068ea9..13af516c6ac 100644
>> --- a/gcc/tree-cfg.c
>> +++ b/gcc/tree-cfg.c
>> @@ -8712,7 +8712,7 @@ const pass_data pass_data_split_crit_edges =
>>    PROP_no_crit_edges, /* properties_provided */
>>    0, /* properties_destroyed */
>>    0, /* todo_flags_start */
>> -  0, /* todo_flags_finish */
>> +  TODO_cleanup_cfg, /* todo_flags_finish */
>>  };
>>
>>  class pass_split_crit_edges : public gimple_opt_pass
>>
>> And the code eliminates the problematic switch statement. Do you believe 
>> it's the right approach
>> to add the clean up and preserve the assert in tree-switch-conversion.c?
> 
> Eh, no.  If you run cleanup-cfg after critical edge splitting they
> will be unsplit immediately, making
> it (mostly) a no-op.
> 
> OTOH I wanted to eliminate that "pass" in favor of just calling
> split_critical_edges () where needed
> (that's already done in some places).

Good, so I will leave it to you. Should I remove the assert in
tree-switch-conversion.c in the meantime?

> 
>> For the case with # of edges == 1, should I place it to tree-cfg.c in order 
>> to trigger it as a clean-up?
> 
> I believe the code for edges == 1 can reside in
> cleanup_control_expr_graph.  Probably easiest
> from a flow perspective if we do the switch -> cond transform before
> the existing code handling
> cond and switch via find-taken-edge.

Working on that, good place to do the transformation.

Martin

> 
>> Thoughts?
>>
>> Martin
>>
>>>
>>> Richard.
>>>
>>>> Martin

>
> Richard.
>
>> Martin
>>
>> gcc/ChangeLog:
>>
>> 2017-08-25  Martin Liska  
>>
>> PR tree-optimization/82032
>> * tree-switch-conversion.c (generate_high_low_equality): New
>> function.
>> (expand_degenerated_switch): Likewise.
>> (process_switch): Call expand_degenerated_switch.
>> (try_switch_expansion): Likewise.
>> (emit_case_nodes): Use generate_high_low_equality.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2017-08-25  Martin Liska  
>>
>> PR tree-optimization/82032
>> * gcc.dg/tree-ssa/pr68198.c: Update jump threading expectations.
>> * gcc.dg/tree-ssa/switch-expansion.c: New test.
>> * gcc.dg/tree-ssa/vrp34.c: Update scanned pattern.
>> * g++.dg/other/pr82032.C: New test.
>> ---
>>  gcc/testsuite/g++.dg/other/pr82032.C |  36 +++
>>  gcc/testsuite/gcc.dg/tree-ssa/pr68198.c  |   6 +-
>>  gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c |  14 +++
>>  gcc/testsuite/gcc.dg/tree-ssa/vrp34.c|   5 +-
>>  gcc/tree-switch-conversion.c | 123 
>> ++-
>>  5 files changed, 152 insertions(+), 32 deletions(-)
>>  create mode 100644 gcc/testsuite/g++.dg/other/pr82032.C
>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c
>>
>>

>>

Re: [PATCH] Expand switch statements with a single (or none) non-default case.

2017-09-01 Thread Richard Biener

On Fri, Sep 1, 2017 at 12:44 PM, Martin Liška  wrote:
> On 09/01/2017 10:26 AM, Richard Biener wrote:
>> On Fri, Sep 1, 2017 at 10:07 AM, Martin Liška  wrote:
>>> On 08/30/2017 02:56 PM, Richard Biener wrote:
>>>> On Wed, Aug 30, 2017 at 2:32 PM, Martin Liška wrote:
>>>>> On 08/30/2017 02:28 PM, Richard Biener wrote:
>>>>>> On Wed, Aug 30, 2017 at 1:13 PM, Martin Liška wrote:
>>>>>>> Hi.
>>>>>>>
>>>>>>> Simple transformation of switch statements where degenerated switch can
>>>>>>> be interpreted as gimple condition (or removed if not having any
>>>>>>> non-default case). I originally thought that we don't have switch
>>>>>>> statements without non-default cases, but PR82032 shows we can see it.
>>>>>>>
>>>>>>> Patch can bootstrap on ppc64le-redhat-linux and survives regression
>>>>>>> tests.
>>>>>>>
>>>>>>> Ready to be installed?
>>>>>>
>>>>>> While I guess this case is ok to handle here it would be nice if CFG
>>>>>> cleanup would do the same.  I suppose find_taken_edge somehow doesn't
>>>>>> work for this case even after my last enhancement?  Or is CFG cleanup
>>>>>> for some reason not run?
>>>>>
>>>>> Do you mean both with # of non-default edges equal to 0 and 1?
>>>>> Let me take a look.
>>>>
>>>> First and foremost 0.  The case of 1 non-default and a default would
>>>> need extra code.
>>>
>>> For the test-case I reduced, one needs:
>>>
>>> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
>>> index b7593068ea9..13af516c6ac 100644
>>> --- a/gcc/tree-cfg.c
>>> +++ b/gcc/tree-cfg.c
>>> @@ -8712,7 +8712,7 @@ const pass_data pass_data_split_crit_edges =
>>>    PROP_no_crit_edges, /* properties_provided */
>>>    0, /* properties_destroyed */
>>>    0, /* todo_flags_start */
>>> -  0, /* todo_flags_finish */
>>> +  TODO_cleanup_cfg, /* todo_flags_finish */
>>>  };
>>>
>>>  class pass_split_crit_edges : public gimple_opt_pass
>>>
>>> And the code eliminates the problematic switch statement. Do you believe 
>>> it's the right approach
>>> to add the clean up and preserve the assert in tree-switch-conversion.c?
>>
>> Eh, no.  If you run cleanup-cfg after critical edge splitting they
>> will be unsplit immediately, making
>> it (mostly) a no-op.
>>
>> OTOH I wanted to eliminate that "pass" in favor of just calling
>> split_critical_edges () where needed
>> (that's already done in some places).
>
> Good, so I will leave it to you. Should I remove the assert in
> tree-switch-conversion.c in the meantime?

Yes, as said your patch was generally OK, I just wondered why we left
the switches "unoptimized".

Richard.

>>
>>> For the case with # of edges == 1, should I place it to tree-cfg.c in order 
>>> to trigger it as a clean-up?
>>
>> I believe the code for edges == 1 can reside in
>> cleanup_control_expr_graph.  Probably easiest
>> from a flow perspective if we do the switch -> cond transform before
>> the existing code handling
>> cond and switch via find-taken-edge.
>
> Working on that, good place to do the transformation.
>
> Martin
>
>>
>>> Thoughts?
>>>
>>> Martin
>>>

>>>> Richard.

> Martin
>
>>
>> Richard.
>>
>>> Martin
>>>
>>> gcc/ChangeLog:
>>>
>>> 2017-08-25  Martin Liska  
>>>
>>> PR tree-optimization/82032
>>> * tree-switch-conversion.c (generate_high_low_equality): New
>>> function.
>>> (expand_degenerated_switch): Likewise.
>>> (process_switch): Call expand_degenerated_switch.
>>> (try_switch_expansion): Likewise.
>>> (emit_case_nodes): Use generate_high_low_equality.
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2017-08-25  Martin Liska  
>>>
>>> PR tree-optimization/82032
>>> * gcc.dg/tree-ssa/pr68198.c: Update jump threading expectations.
>>> * gcc.dg/tree-ssa/switch-expansion.c: New test.
>>> * gcc.dg/tree-ssa/vrp34.c: Update scanned pattern.
>>> * g++.dg/other/pr82032.C: New test.
>>> ---
>>>  gcc/testsuite/g++.dg/other/pr82032.C |  36 +++
>>>  gcc/testsuite/gcc.dg/tree-ssa/pr68198.c  |   6 +-
>>>  gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c |  14 +++
>>>  gcc/testsuite/gcc.dg/tree-ssa/vrp34.c|   5 +-
>>>  gcc/tree-switch-conversion.c | 123 
>>> ++-
>>>  5 files changed, 152 insertions(+), 32 deletions(-)
>>>  create mode 100644 gcc/testsuite/g++.dg/other/pr82032.C
>>>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c
>>>
>>>
>
>>>
>

Re: [PATCH] Improve -Ofast vectorization of std::sin etc. (PR libstdc++/81706)

2017-09-01 Thread Jakub Jelinek

On Mon, Aug 07, 2017 at 05:27:42PM +0200, Jakub Jelinek wrote:
> > This should really be a separate function.  Perhaps "merge_one_attribute"?
> 
> If it is outlined without the first 7 lines, i.e. just the body of if (b),
> then it could be duplicate_one_attribute (tree *, tree, const char *);
> called like if (b) duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
> "omp declare simd");
> If it is duplicated as whole, it should be called
> duplicate_one_attr_to_builtin or something similar.
> In any case, it could be in tree.c or attribs.c.
> 
> The primary question is if we want this behavior, or if we should go the
> libstdc++ patch routine (and for Jonathan the question is if he knows
> why __builtin_XXXf has been used there rather than the ::XXXf).

Here is an updated patch that commonizes a big part of that into
duplicate_one_attribute.  Bootstrapped/regtested on x86_64-linux and
i686-linux.  The question stands: do we want to go this way, or use some
libstdc++ solution?
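
For readers who do not have the tree API in their head, the copy-if-missing loop can be sketched on a plain linked list. Everything named `toy_*` below is hypothetical: GCC uses TREE_LIST nodes, `lookup_attribute`, `attribute_value_equal` and `copy_node` instead, and an attribute value is a tree rather than an int.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy attribute list node; a stand-in for a TREE_LIST entry.  */
struct toy_attr
{
  const char *name;
  int value;
  struct toy_attr *chain;
};

/* Return the first node named NAME in LIST, or NULL.  */
static struct toy_attr *
toy_lookup (const char *name, struct toy_attr *list)
{
  for (; list; list = list->chain)
    if (strcmp (list->name, name) == 0)
      return list;
  return NULL;
}

/* Copy every NAME attribute found in ATTR's list to the front of
   *ATTRS unless *ATTRS already held one with the same value.  */
static void
toy_duplicate_one_attribute (struct toy_attr **attrs, struct toy_attr *attr,
                             const char *name)
{
  assert (attrs != NULL && name != NULL);
  /* Looked up once, so only the pre-existing entries are compared.  */
  struct toy_attr *a = toy_lookup (name, *attrs);
  for (attr = toy_lookup (name, attr); attr;
       attr = toy_lookup (name, attr->chain))
    {
      struct toy_attr *a2;
      for (a2 = a; a2; a2 = toy_lookup (name, a2->chain))
        if (a2->value == attr->value)
          break;
      if (!a2)
        {
          a2 = malloc (sizeof *a2);
          if (!a2)
            abort ();
          *a2 = *attr;
          a2->chain = *attrs;
          *attrs = a2;
        }
    }
}
```

With a source list holding values 1 and 2 and a destination that already holds 1, only the entry with value 2 is copied, and it ends up at the head of the destination list.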

2017-09-01  Jakub Jelinek  

PR libstdc++/81706
* attribs.c (attribute_value_equal): Use omp_declare_simd_clauses_equal
for comparison of OMP_CLAUSEs regardless of flag_openmp{,_simd}.
(duplicate_one_attribute): New function.
* attribs.h (duplicate_one_attribute): New declaration.

* c-decl.c (merge_decls): Copy "omp declare simd" attributes from
newdecl to corresponding __builtin_ if any.

* decl.c (duplicate_decls): Copy "omp declare simd" attributes from
newdecl to corresponding __builtin_ if any.

* gcc.target/i386/pr81706.c: New test.
* g++.dg/ext/pr81706.C: New test.

--- gcc/attribs.c.jj2017-09-01 09:25:37.0 +0200
+++ gcc/attribs.c   2017-09-01 09:54:28.581146071 +0200
@@ -1116,9 +1116,9 @@ attribute_value_equal (const_tree attr1,
 TREE_VALUE (attr2)) == 1);
 }
 
-  if ((flag_openmp || flag_openmp_simd)
-  && TREE_VALUE (attr1) && TREE_VALUE (attr2)
+  if (TREE_VALUE (attr1)
   && TREE_CODE (TREE_VALUE (attr1)) == OMP_CLAUSE
+  && TREE_VALUE (attr2)
   && TREE_CODE (TREE_VALUE (attr2)) == OMP_CLAUSE)
 return omp_declare_simd_clauses_equal (TREE_VALUE (attr1),
   TREE_VALUE (attr2));
@@ -1310,6 +1310,31 @@ merge_decl_attributes (tree olddecl, tre
   DECL_ATTRIBUTES (newdecl));
 }
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+void
+duplicate_one_attribute (tree *attrs, tree attr, const char *name)
+{
+  if (!attr)
+    return;
+  tree a = lookup_attribute (name, *attrs);
+  while (attr)
+    {
+      tree a2;
+      for (a2 = a; a2; a2 = lookup_attribute (name, TREE_CHAIN (a2)))
+        if (attribute_value_equal (attr, a2))
+          break;
+      if (!a2)
+        {
+          a2 = copy_node (attr);
+          TREE_CHAIN (a2) = *attrs;
+          *attrs = a2;
+        }
+      attr = lookup_attribute (name, TREE_CHAIN (attr));
+    }
+}
+
 #if TARGET_DLLIMPORT_DECL_ATTRIBUTES
 
 /* Specialization of merge_decl_attributes for various Windows targets.
--- gcc/attribs.h.jj2017-09-01 09:25:37.0 +0200
+++ gcc/attribs.h   2017-09-01 09:54:57.366809765 +0200
@@ -77,6 +77,11 @@ extern tree remove_attribute (const char
 
 extern tree merge_attributes (tree, tree);
 
+/* Duplicate all attributes with name NAME in ATTR list to *ATTRS if
+   they are missing there.  */
+
+extern void duplicate_one_attribute (tree *, tree, const char *);
+
 /* Given two Windows decl attributes lists, possibly including
dllimport, return a list of their union .  */
 extern tree merge_dllimport_decl_attributes (tree, tree);
--- gcc/c/c-decl.c.jj   2017-09-01 09:25:40.707410605 +0200
+++ gcc/c/c-decl.c  2017-09-01 09:49:57.419314085 +0200
@@ -2569,6 +2569,17 @@ merge_decls (tree newdecl, tree olddecl,
set_builtin_decl_declared_p (fncode, true);
  break;
}
+
+ tree s = lookup_attribute ("omp declare simd",
+DECL_ATTRIBUTES (newdecl));
+ if (s)
+   {
+ tree b
+   = builtin_decl_explicit (DECL_FUNCTION_CODE (newdecl));
+ if (b)
+   duplicate_one_attribute (&DECL_ATTRIBUTES (b), s,
+"omp declare simd");
+   }
}
}
  else
--- gcc/cp/decl.c.jj 2017-09-01 09:26:24.748892739 +0200
+++ gcc/cp/decl.c   2017-09-01 09:55:52.940160495 +0200
@@ -2470,6 +2470,16 @@ next_arg:;
  break;
}
}
+
+ tree s = lookup_attribute ("omp declare simd",
+DECL_ATTRIBUTES (newdecl));
+ if (s)
+   {
+ tree b = builtin_decl_explicit (DECL_FUNCTIO

[PATCH] Add UBSAN_{PTR,BOUNDS} folding (PR sanitizer/81981)

2017-09-01 Thread Jakub Jelinek

Hi!

This patch fixes the following testcase by folding some ubsan internal fns
we'd either remove anyway during sanopt, or lower into if (cond)
do_something during sanopt where cond would be always false.

Additionally, I've tried to clean up a little bit IFN_UBSAN_OBJECT_SIZE
handling by using variables for the call arguments that make it clear
what the arguments are.
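
As a rough illustration of the folding condition (not GCC code), the sketch
below models each call argument as either a compile-time constant or a value
known only at run time.  The struct operand type and the function
bounds_check_is_redundant are hypothetical names; the real patch also
converts the index to the bound's type before comparing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a call argument: either a known
   compile-time constant or a value only known at run time.  */
struct operand
{
  bool is_constant;
  uint64_t value;	/* Meaningful only when is_constant is true.  */
};

/* Return true when the check "index <= bound" can never fail and
   could therefore be dropped at compile time.  */
bool
bounds_check_is_redundant (struct operand index, struct operand bound)
{
  if (!index.is_constant || !bound.is_constant)
    return false;	/* The run-time check has to stay.  */
  return index.value <= bound.value;
}
```

Only the case where both operands are constants and the comparison holds
lets the check disappear; everything else keeps the run-time test.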

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-09-01  Jakub Jelinek  

PR sanitizer/81981
* gimple-fold.c (gimple_fold_call): Optimize away useless UBSAN_PTR
and UBSAN_BOUNDS internal calls.  Clean up IFN_UBSAN_OBJECT_SIZE
handling.

* gcc.dg/ubsan/pr81981.c: New test.

--- gcc/gimple-fold.c.jj 2017-08-10 02:31:21.0 +0200
+++ gcc/gimple-fold.c   2017-08-29 18:50:49.993673432 +0200
@@ -3938,11 +3938,23 @@ gimple_fold_call (gimple_stmt_iterator *
gimple_call_arg (stmt, 2));
  break;
case IFN_UBSAN_OBJECT_SIZE:
- if (integer_all_onesp (gimple_call_arg (stmt, 2))
- || (TREE_CODE (gimple_call_arg (stmt, 1)) == INTEGER_CST
- && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST
- && tree_int_cst_le (gimple_call_arg (stmt, 1),
- gimple_call_arg (stmt, 2
+ {
+   tree offset = gimple_call_arg (stmt, 1);
+   tree objsize = gimple_call_arg (stmt, 2);
+   if (integer_all_onesp (objsize)
+   || (TREE_CODE (offset) == INTEGER_CST
+   && TREE_CODE (objsize) == INTEGER_CST
+   && tree_int_cst_le (offset, objsize)))
+ {
+   gsi_replace (gsi, gimple_build_nop (), false);
+   unlink_stmt_vdef (stmt);
+   release_defs (stmt);
+   return true;
+ }
+ }
+ break;
+   case IFN_UBSAN_PTR:
+ if (integer_zerop (gimple_call_arg (stmt, 1)))
{
  gsi_replace (gsi, gimple_build_nop (), false);
  unlink_stmt_vdef (stmt);
@@ -3950,6 +3962,25 @@ gimple_fold_call (gimple_stmt_iterator *
  return true;
}
  break;
+   case IFN_UBSAN_BOUNDS:
+ {
+   tree index = gimple_call_arg (stmt, 1);
+   tree bound = gimple_call_arg (stmt, 2);
+   if (TREE_CODE (index) == INTEGER_CST
+   && TREE_CODE (bound) == INTEGER_CST)
+ {
+   index = fold_convert (TREE_TYPE (bound), index);
+   if (TREE_CODE (index) == INTEGER_CST
+   && tree_int_cst_le (index, bound))
+ {
+   gsi_replace (gsi, gimple_build_nop (), false);
+   unlink_stmt_vdef (stmt);
+   release_defs (stmt);
+   return true;
+ }
+ }
+ }
+ break;
case IFN_GOACC_DIM_SIZE:
case IFN_GOACC_DIM_POS:
  result = fold_internal_goacc_dim (stmt);
--- gcc/testsuite/gcc.dg/ubsan/pr81981.c.jj 2017-08-29 18:54:33.826107761 +0200
+++ gcc/testsuite/gcc.dg/ubsan/pr81981.c 2017-08-29 18:55:36.721386827 +0200
@@ -0,0 +1,21 @@
+/* PR sanitizer/81981 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wmaybe-uninitialized -fsanitize=undefined -ffat-lto-objects" } */
+
+int v;
+
+int
+foo (int i)
+{
+  int t[1], u[1];
+  int n = 0;
+
+  if (i)
+{
+  t[n] = i;
+  u[0] = i;
+}
+
+  v = u[0]; /* { dg-warning "may be used uninitialized in this function" } */
+  return t[0]; /* { dg-warning "may be used uninitialized in this function" } */
+}

Jakub

[PATCH] Document -fsanitize=pointer-overflow (PR sanitizer/81902)

2017-09-01 Thread Jakub Jelinek

Hi!

Martin Sebor reported I forgot to document this new sanitizer option.
Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-09-01  Jakub Jelinek  

PR sanitizer/81902
* doc/invoke.texi: Document -fsanitize=pointer-overflow.

--- gcc/doc/invoke.texi.jj  2017-08-29 19:03:08.0 +0200
+++ gcc/doc/invoke.texi 2017-08-29 20:30:36.563053682 +0200
@@ -11074,6 +11074,12 @@ This option enables instrumentation of C
 accesses and some conversions between pointers to base and derived classes,
 to verify the referenced object has the correct dynamic type.
 
+@item -fsanitize=pointer-overflow
+@opindex fsanitize=pointer-overflow
+
+This option enables instrumentation of pointer arithmetics.  If the pointer
+arithmetics overflows, a run-time error is issued.
+
 @end table
 
 While @option{-ftrapv} causes traps for signed overflows to be emitted,

Jakub
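
To make "pointer arithmetic overflows" concrete, here is a conceptual sketch
of the condition on plain integers.  It is not the instrumentation GCC emits;
the function name pointer_add_wraps is hypothetical, and the addresses are
modelled as uintptr_t so that the wrap-around itself is well defined.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Return true if adding the signed byte OFFSET to the address BASE
   would wrap around the address space.  */
bool
pointer_add_wraps (uintptr_t base, intptr_t offset)
{
  /* Unsigned arithmetic is modulo 2^N, so this never traps.  */
  uintptr_t result = base + (uintptr_t) offset;

  if (offset >= 0)
    return result < base;	/* Moved forward but ended up lower.  */
  return result > base;		/* Moved backward but ended up higher.  */
}
```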

[OBVIOUS][PATCH] Fix warning for simple-object-elf.c.

2017-09-01 Thread Martin Liška

Installed after discussion with Richi.

Martin
From ec7cf772d8083f5c25558a9d2d1316f610ea28dd Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 1 Sep 2017 12:53:46 +0200
Subject: [PATCH] Fix warning for simple-object-elf.c.

libiberty/ChangeLog:

2017-09-01  Martin Liska  

	* simple-object-elf.c (simple_object_elf_copy_lto_debug_sections):
	Remove duplicate declaration.
---
 libiberty/simple-object-elf.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/libiberty/simple-object-elf.c b/libiberty/simple-object-elf.c
index 646208a85b9..6774eb273c9 100644
--- a/libiberty/simple-object-elf.c
+++ b/libiberty/simple-object-elf.c
@@ -1232,7 +1232,6 @@ simple_object_elf_copy_lto_debug_sections (simple_object_read *sobj,
   off_t offset;
   off_t length;
   int ret;
-  const char *errmsg;
   simple_object_write_section *dest;
   off_t flags;
   unsigned char *buf;
-- 
2.14.1

[committed] Recognize #pragma omp ordered simd with -fopenmp-simd (PR c/81887)

2017-09-01 Thread Jakub Jelinek

Hi!

#pragma omp ordered simd is significant even in -fopenmp-simd mode; if it
is ignored, some valid code might be miscompiled.  This patch therefore
parses it with that option too.
#pragma omp ordered threads simd
or
#pragma omp ordered simd threads
is treated like #pragma omp ordered simd, while
#pragma omp ordered
#pragma omp ordered threads
or
#pragma omp ordered depend(...)
is ignored.  Bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk so far.
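
The rules listed above can be summarized as a small decision function.  This
is only a sketch of the described behaviour, not code from the patch; struct
ordered_clauses and ordered_kept_with_openmp_simd are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags for the clauses seen on "#pragma omp ordered".  */
struct ordered_clauses
{
  bool simd;
  bool threads;
  bool depend;
};

/* With -fopenmp-simd but without -fopenmp: return true if the
   directive is honored (as plain "ordered simd"), false if it is
   ignored.  */
bool
ordered_kept_with_openmp_simd (struct ordered_clauses c)
{
  if (c.depend)
    return false;	/* ordered depend(...) is ignored.  */
  return c.simd;	/* threads is dropped; simd alone decides.  */
}
```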

2017-09-01  Jakub Jelinek  

PR c/81887
c-family/
* c-pragma.c (omp_pragmas): Move "ordered" entry from here to ...
(omp_pragmas_simd): ... here.
* c-omp.c (c_finish_omp_ordered): If clauses isn't simd clause alone,
create new clauses list containing just simd clause.
c/
* c-parser.c (c_parser_omp_ordered): Handle -fopenmp-simd.
cp/
* parser.c (cp_parser_omp_ordered): Handle -fopenmp-simd.
fortran/
* parse.c (decode_omp_directive): Use matchs instead of matcho for
end ordered and ordered directives, except for ordered depend.  For
-fopenmp-simd and ordered depend, reject the stmt.
* trans-openmp.c (gfc_trans_omp_ordered): For -fopenmp-simd ignore
threads clause and if simd clause isn't present, just translate the
body.
testsuite/
* c-c++-common/gomp/pr81887.c: New test.
* gfortran.dg/gomp/pr81887.f90: New test.

--- gcc/c-family/c-pragma.c.jj  2017-08-29 19:03:08.0 +0200
+++ gcc/c-family/c-pragma.c 2017-08-29 21:04:22.140967018 +0200
@@ -1277,7 +1277,6 @@ static const struct omp_pragma_def omp_p
   { "end", PRAGMA_OMP_END_DECLARE_TARGET },
   { "flush", PRAGMA_OMP_FLUSH },
   { "master", PRAGMA_OMP_MASTER },
-  { "ordered", PRAGMA_OMP_ORDERED },
   { "section", PRAGMA_OMP_SECTION },
   { "sections", PRAGMA_OMP_SECTIONS },
   { "single", PRAGMA_OMP_SINGLE },
@@ -1291,6 +1290,7 @@ static const struct omp_pragma_def omp_p
   { "declare", PRAGMA_OMP_DECLARE },
   { "distribute", PRAGMA_OMP_DISTRIBUTE },
   { "for", PRAGMA_OMP_FOR },
+  { "ordered", PRAGMA_OMP_ORDERED },
   { "parallel", PRAGMA_OMP_PARALLEL },
   { "simd", PRAGMA_OMP_SIMD },
   { "target", PRAGMA_OMP_TARGET },
--- gcc/c-family/c-omp.c.jj 2017-08-10 02:31:19.0 +0200
+++ gcc/c-family/c-omp.c 2017-08-29 21:28:34.030112023 +0200
@@ -116,6 +116,10 @@ c_finish_omp_ordered (location_t loc, tr
   tree t = make_node (OMP_ORDERED);
   TREE_TYPE (t) = void_type_node;
   OMP_ORDERED_BODY (t) = stmt;
+  if (!flag_openmp /* flag_openmp_simd */
+  && (OMP_CLAUSE_CODE (clauses) != OMP_CLAUSE_SIMD
+ || OMP_CLAUSE_CHAIN (clauses)))
+clauses = build_omp_clause (loc, OMP_CLAUSE_SIMD);
   OMP_ORDERED_CLAUSES (t) = clauses;
   SET_EXPR_LOCATION (t, loc);
   return add_stmt (t);
--- gcc/c/c-parser.c.jj 2017-08-29 19:03:08.0 +0200
+++ gcc/c/c-parser.c 2017-08-29 21:08:48.749059955 +0200
@@ -15647,6 +15647,11 @@ c_parser_omp_ordered (c_parser *parser,
 
   if (!strcmp ("depend", p))
{
+ if (!flag_openmp) /* flag_openmp_simd  */
+   {
+ c_parser_skip_to_pragma_eol (parser, false);
+ return false;
+   }
  if (context == pragma_stmt)
{
  error_at (loc,
@@ -15667,6 +15672,11 @@ c_parser_omp_ordered (c_parser *parser,
 
   tree clauses = c_parser_omp_all_clauses (parser, OMP_ORDERED_CLAUSE_MASK,
   "#pragma omp ordered");
+
+  if (!flag_openmp /* flag_openmp_simd  */
+  && omp_find_clause (clauses, OMP_CLAUSE_SIMD) == NULL_TREE)
+return false;
+
   c_finish_omp_ordered (loc, clauses,
c_parser_omp_structured_block (parser, if_p));
   return true;
--- gcc/cp/parser.c.jj  2017-08-29 19:03:09.0 +0200
+++ gcc/cp/parser.c 2017-08-29 21:19:36.816368363 +0200
@@ -35406,6 +35406,11 @@ cp_parser_omp_ordered (cp_parser *parser
 
   if (strcmp (p, "depend") == 0)
{
+ if (!flag_openmp) /* flag_openmp_simd */
+   {
+ cp_parser_skip_to_pragma_eol (parser, pragma_tok);
+ return false;
+   }
  if (context == pragma_stmt)
{
  error_at (pragma_tok->location, "%<#pragma omp ordered%> with "
@@ -35426,6 +35431,11 @@ cp_parser_omp_ordered (cp_parser *parser
   tree clauses
 = cp_parser_omp_all_clauses (parser, OMP_ORDERED_CLAUSE_MASK,
 "#pragma omp ordered", pragma_tok);
+
+  if (!flag_openmp /* flag_openmp_simd  */
+  && omp_find_clause (clauses, OMP_CLAUSE_SIMD) == NULL_TREE)
+return false;
+
   c_finish_omp_ordered (loc, clauses,
cp_parser_omp_structured_block (parser, if_p));
   return true;
--- gcc/fortran/parse.c.jj  2017-08-07 18:50:09.0 +0200
+++ gcc/fortran/parse.c 2017-08-30 00:29:00.795203626 +0200
@@ -875,7 +875,7 @@ decode_omp_directive (void)
   matcho ("end do", gfc_

[PATCH] Fix x86_64 ICE with -fpie -mcmodel=large (PR target/81766)

2017-09-01 Thread Jakub Jelinek

Hi!

As mentioned in the PR, ix86_init_pic_reg for -mcmodel=large PIC creates
invalid RTL.  Shrink wrapping managed to work around it by unconditionally
running find_many_sub_basic_blocks that has been invoked even when the
prologue or split prologue actually didn't contain anything, but that isn't
done anymore.

The problem is that we add a label into the sequence that we then insert on
the single succ edge after ENTRY; but this insertion places the label after
the NOTE_INSN_BASIC_BLOCK note, which is invalid, because labels should
precede that note.
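
A toy model of the reordering, assuming the instruction stream is just an
array of kinds (the real code works on the RTL insn chain and also updates
the head of the basic block).  The enum and move_label_before_note are
hypothetical names used only for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, much simplified instruction kinds.  */
enum insn_kind { BLOCK_NOTE, CODE_LABEL, PLAIN_INSN };

/* If the label at LABEL_IDX is directly preceded by a block note,
   swap the two so that the label comes first.  Return true if a
   swap was done.  */
bool
move_label_before_note (enum insn_kind *insns, size_t n, size_t label_idx)
{
  assert (insns != NULL && label_idx < n);

  if (label_idx == 0
      || insns[label_idx] != CODE_LABEL
      || insns[label_idx - 1] != BLOCK_NOTE)
    return false;

  insns[label_idx - 1] = CODE_LABEL;
  insns[label_idx] = BLOCK_NOTE;
  return true;
}
```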

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2017-09-01  Jakub Jelinek  

PR target/81766
* config/i386/i386.c (ix86_init_large_pic_reg): Return label instead of
void.
(ix86_init_pic_reg): Remember label from ix86_init_large_pic_reg, if
non-NULL and preceded by NOTE_INSN_BASIC_BLOCK, swap the note and label.

* gcc.target/i386/pr81766.c: New test.

--- gcc/config/i386/i386.c.jj   2017-08-07 18:50:10.0 +0200
+++ gcc/config/i386/i386.c  2017-08-08 16:01:41.917136120 +0200
@@ -8829,7 +8829,7 @@ ix86_use_pseudo_pic_reg (void)
 
 /* Initialize large model PIC register.  */
 
-static void
+static rtx_code_label *
 ix86_init_large_pic_reg (unsigned int tmp_regno)
 {
   rtx_code_label *label;
@@ -8846,6 +8846,7 @@ ix86_init_large_pic_reg (unsigned int tm
   emit_insn (gen_set_got_offset_rex64 (tmp_reg, label));
   emit_insn (ix86_gen_add3 (pic_offset_table_rtx,
pic_offset_table_rtx, tmp_reg));
+  return label;
 }
 
 /* Create and initialize PIC register if required.  */
@@ -8854,6 +8855,7 @@ ix86_init_pic_reg (void)
 {
   edge entry_edge;
   rtx_insn *seq;
+  rtx_code_label *label = NULL;
 
   if (!ix86_use_pseudo_pic_reg ())
 return;
@@ -8863,7 +8865,7 @@ ix86_init_pic_reg (void)
   if (TARGET_64BIT)
 {
   if (ix86_cmodel == CM_LARGE_PIC)
-   ix86_init_large_pic_reg (R11_REG);
+   label = ix86_init_large_pic_reg (R11_REG);
   else
emit_insn (gen_set_got_rex64 (pic_offset_table_rtx));
 }
@@ -8887,6 +8889,22 @@ ix86_init_pic_reg (void)
   entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
   insert_insn_on_edge (seq, entry_edge);
   commit_one_edge_insertion (entry_edge);
+
+  if (label)
+{
+  basic_block bb = BLOCK_FOR_INSN (label);
+  rtx_insn *bb_note = PREV_INSN (label);
+  /* If the note preceding the label starts a basic block, and the
+label is a member of the same basic block, interchange the two.  */
+  if (bb_note != NULL_RTX
+ && NOTE_INSN_BASIC_BLOCK_P (bb_note)
+ && bb != NULL
+ && bb == BLOCK_FOR_INSN (bb_note))
+   {
+ reorder_insns_nobb (bb_note, bb_note, label);
+ BB_HEAD (bb) = label;
+   }
+}
 }
 
 /* Initialize a variable CUM of type CUMULATIVE_ARGS
--- gcc/testsuite/gcc.target/i386/pr81766.c.jj  2017-08-08 16:10:04.299459808 +0200
+++ gcc/testsuite/gcc.target/i386/pr81766.c 2017-08-08 16:09:28.0 +0200
@@ -0,0 +1,9 @@
+/* PR target/81766 */
+/* { dg-do compile { target { pie && lp64 } } } */
+/* { dg-options "-O2 -fpie -mcmodel=large" } */
+
+int
+main ()
+{
+  return 0;
+}

Jakub

[PATCH] Fix asan create_odr_indicator (PR sanitizer/81923)

2017-09-01 Thread Jakub Jelinek

Hi!

glibc fails to build with -fsanitize=address, because DECL_ASSEMBLER_NAME
on some variables starts with the * character (e.g. for vars with __asm
specified names).  We need to strip the name encoding from those names
before appending them after __odr_asan_.  Fixed thusly, bootstrapped/regtested
on x86_64-linux and i686-linux, ok for trunk?
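
A standalone sketch of the naming step, assuming the encoding to strip is
just a single leading '*' (targets may strip more than that).  Both
strip_leading_star and make_odr_indicator_name are hypothetical helpers and
not the functions used in asan.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for the target hook that strips name encoding:
   here it only skips one leading '*'.  */
const char *
strip_leading_star (const char *name)
{
  assert (name != NULL);
  return name[0] == '*' ? name + 1 : name;
}

/* Write "__odr_asan_<name>" into BUF of size LEN, stripping the
   encoding first.  Returns what snprintf returns.  */
int
make_odr_indicator_name (char *buf, size_t len, const char *asm_name)
{
  return snprintf (buf, len, "__odr_asan_%s",
		   strip_leading_star (asm_name));
}
```

Without the stripping step the '*' would end up in the middle of the
generated symbol name.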

2017-09-01  Jakub Jelinek  

PR sanitizer/81923
* asan.c (create_odr_indicator): Strip name encoding from assembler
name before appending it after __odr_asan_.

* gcc.dg/asan/pr81923.c: New test.

--- gcc/asan.c.jj   2017-08-10 02:31:21.0 +0200
+++ gcc/asan.c  2017-08-29 17:25:58.337595628 +0200
@@ -2529,9 +2529,12 @@ create_odr_indicator (tree decl, tree ty
   /* DECL_NAME theoretically might be NULL.  Bail out with 0 in this case.  */
   if (decl_name == NULL_TREE)
 return build_int_cst (uptr, 0);
-  size_t len = strlen (IDENTIFIER_POINTER (decl_name)) + sizeof ("__odr_asan_");
+  const char *dname = IDENTIFIER_POINTER (decl_name);
+  if (HAS_DECL_ASSEMBLER_NAME_P (decl))
+dname = targetm.strip_name_encoding (dname);
+  size_t len = strlen (dname) + sizeof ("__odr_asan_");
   name = XALLOCAVEC (char, len);
-  snprintf (name, len, "__odr_asan_%s", IDENTIFIER_POINTER (decl_name));
+  snprintf (name, len, "__odr_asan_%s", dname);
 #ifndef NO_DOT_IN_LABEL
   name[sizeof ("__odr_asan") - 1] = '.';
 #elif !defined(NO_DOLLAR_IN_LABEL)
--- gcc/testsuite/gcc.dg/asan/pr81923.c.jj  2017-08-29 18:08:59.183881570 +0200
+++ gcc/testsuite/gcc.dg/asan/pr81923.c 2017-08-29 18:09:27.643550083 +0200
@@ -0,0 +1,10 @@
+/* PR sanitizer/81923 */
+/* { dg-do link } */
+
+int foobar __asm (__USER_LABEL_PREFIX__ "barbaz") = 34;
+
+int
+main ()
+{
+  return 0;
+}

Jakub

RE: [PATCH 6/7] [ARC] Reimplement ZOL support.

2017-09-01 Thread Claudiu Zissulescu

> I'm happy with this if the doc is updated inline with Sandra's
> suggestions.

Updated with Sandra's suggestions. Thank you for your feedback,
Claudiu

RE: [PATCH 7/7] [ARC] Fix errors in arc_ifcvt.

2017-09-01 Thread Claudiu Zissulescu

> Looks good.

Committed. Thank you for your review,
Claudiu

RE: [PATCH 5/7] [ARC] Update various patterns

2017-09-01 Thread Claudiu Zissulescu

> That looks fine to me.

Committed. Thank you for your review,
Claudiu

RE: [PATCH 4/7] [ARC] Use TARGET_USE_ANCHORS_FOR_SYMBOL_P.

2017-09-01 Thread Claudiu Zissulescu

> Looks good.

Committed. Thank you for your review,
Claudiu

Re: [1/2] PR 78736: New warning -Wenum-conversion

2017-09-01 Thread Joseph Myers

On Fri, 1 Sep 2017, Prathamesh Kulkarni wrote:

> > If it's an implicit conversion between different enum types, the warning
> > is correct.  The more important question for the review is: is it correct
> > to replace the implicit conversion by an explicit one?  That is, for each
> > value in the source type, what enumerator of the destination type has the
> > same value, and do the semantics match in whatever way is required for the
> > code in question?
> Well, for instance unit_sign in libgfortran/io.h is defined as:
> typedef enum
> { SIGN_PROCDEFINED, SIGN_SUPPRESS, SIGN_PLUS, SIGN_UNSPECIFIED }
> unit_sign;
> 
> and unit_sign_s is defined:
> typedef enum
> { SIGN_S, SIGN_SS, SIGN_SP }
> unit_sign_s;
> 
> Since the enum types are different, I assume warnings for implicit
> conversion from unit_sign_s to unit_sign would be correct ?
> And since unit_sign_s is a subset of unit_sign in terms of
> value-ranges, I assume replacing implicit by explicit conversion would
> be OK ?

Whether an explicit conversion is OK depends on *the semantics of the 
individual values and the intended semantics of the code doing the 
conversion*.  That is, for the semantics of whatever code converts 
unit_sign_s to unit_sign, is it indeed correct that an input of SIGN_S 
should result in an output of SIGN_PROCDEFINED, an input of SIGN_SS should 
result in an output of SIGN_SUPPRESS and an input of SIGN_SP should result 
in an output of SIGN_PLUS?

That is not a question one should expect C front-end maintainers to have 
any expertise in.  Thus, I strongly advise sending each patch that fixes 
the warning fallout for such conversions separately, CC:ing the 
maintainers of the relevant part of GCC, and including an explanation with 
each such patch of what the semantics are in the relevant code and why an 
explicit conversion is correct.  It would also seem a good idea to me to 
make sure that each enum in question has a comment on its definition, 
saying that the values have to be kept consistent (in whatever way) with 
the values of another enum, because of the conversions in whatever 
function named in the constant, so that people editing either enum know 
they can't just insert values in the middle of it.
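
One possible shape for such a conversion, using the two enums quoted above.
This is a sketch only: whether SIGN_S really should map to SIGN_PROCDEFINED
(and so on) is exactly the semantic question raised here, so the mapping is
an assumption for illustration, and unit_sign_from_s is a hypothetical name.

```c
#include <assert.h>

typedef enum
{ SIGN_PROCDEFINED, SIGN_SUPPRESS, SIGN_PLUS, SIGN_UNSPECIFIED }
unit_sign;

/* Keep in sync with unit_sign: unit_sign_from_s assumes the three
   values below correspond to the first three values of unit_sign.  */
typedef enum
{ SIGN_S, SIGN_SS, SIGN_SP }
unit_sign_s;

/* Fail the build if someone inserts a value in the middle.  */
_Static_assert ((int) SIGN_S == (int) SIGN_PROCDEFINED, "enums out of sync");
_Static_assert ((int) SIGN_SS == (int) SIGN_SUPPRESS, "enums out of sync");
_Static_assert ((int) SIGN_SP == (int) SIGN_PLUS, "enums out of sync");

/* Explicit, value-by-value conversion (assumed semantics).  */
unit_sign
unit_sign_from_s (unit_sign_s s)
{
  switch (s)
    {
    case SIGN_S:
      return SIGN_PROCDEFINED;
    case SIGN_SS:
      return SIGN_SUPPRESS;
    case SIGN_SP:
      return SIGN_PLUS;
    }
  return SIGN_UNSPECIFIED;
}
```

Spelling out each case makes the intended correspondence reviewable by the
maintainers of the code in question, which a bare cast does not.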

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Fix profile update in tree-ssa-isolate-paths.c (PR tree-optimization/82059).

2017-09-01 Thread Jan Hubicka

> Hello.
> 
> In order to have valid profile, we should add counts of an edge only when
> it's really redirected and hasn't been redirected in previous invocation of 
> the function.
> 
> Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.
> 
> Ready to be installed?
> Martin
> 
> gcc/ChangeLog:
> 
> 2017-08-31  Martin Liska  
> 
>   PR tree-optimization/82059
>   * gimple-ssa-isolate-paths.c (isolate_path): Add profile and
>   frequency only when an edge is redirected.

OK, thanks!
Honza
> 
> gcc/testsuite/ChangeLog:
> 
> 2017-08-31  Martin Liska  
> 
>   PR tree-optimization/82059
>   * gcc.dg/tree-ssa/pr82059.c: New test.
> ---
>  gcc/gimple-ssa-isolate-paths.c  |  9 ++---
>  gcc/testsuite/gcc.dg/tree-ssa/pr82059.c | 22 ++
>  2 files changed, 28 insertions(+), 3 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr82059.c
> 
> 

> diff --git a/gcc/gimple-ssa-isolate-paths.c b/gcc/gimple-ssa-isolate-paths.c
> index fbc41057463..807e0032410 100644
> --- a/gcc/gimple-ssa-isolate-paths.c
> +++ b/gcc/gimple-ssa-isolate-paths.c
> @@ -160,14 +160,17 @@ isolate_path (basic_block bb, basic_block duplicate,
>   for (ei = ei_start (duplicate->succs); (e2 = ei_safe_edge (ei)); )
> remove_edge (e2);
>  }
> -  bb->frequency += EDGE_FREQUENCY (e);
> -  bb->count += e->count;
>  
>/* Complete the isolation step by redirecting E to reach DUPLICATE.  */
>e2 = redirect_edge_and_branch (e, duplicate);
>if (e2)
> -flush_pending_stmts (e2);
> +{
> +  flush_pending_stmts (e2);
>  
> +  /* Update profile only when redirection is really processed.  */
> +  bb->frequency += EDGE_FREQUENCY (e);
> +  bb->count += e->count;
> +}
>  
>/* There may be more than one statement in DUPLICATE which exhibits
>   undefined behavior.  Ultimately we want the first such statement in
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c
> new file mode 100644
> index 000..0285b03cc04
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr82059.c
> @@ -0,0 +1,22 @@
> +/* PR tree-optimization/82059 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-isolate-paths" } */
> +
> +struct a
> +{
> +  char b;
> +  struct a *c;
> +} d (), f;
> +void *e;
> +long g;
> +void
> +h ()
> +{
> +  struct a *i = 0;
> +  if (g)
> +i = e;
> +  if (!i)
> +d ();
> +  i->c = &f;
> +  i->b = *(char *) h;
> +}
>
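
The rule in the quoted patch (update the profile only when the edge really
is redirected) can be sketched with toy data structures.  Everything below
is hypothetical: struct block, struct edge and redirect_edge are simplified
stand-ins, a failed redirection is modelled as "the edge already points at
the target", and which block receives the count is simplified as well.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified profile data.  */
struct block { long count; };
struct edge { long count; struct block *dest; };

/* Redirect E to NEW_DEST.  Fails (returns false) if E already points
   there, mimicking an edge redirected by an earlier invocation.  */
static bool
redirect_edge (struct edge *e, struct block *new_dest)
{
  if (e->dest == new_dest)
    return false;
  e->dest = new_dest;
  return true;
}

/* Add E's count to TARGET only when the redirection happened.  */
void
isolate_edge (struct edge *e, struct block *target)
{
  assert (e != NULL && target != NULL);
  if (redirect_edge (e, target))
    target->count += e->count;
}
```

Calling isolate_edge twice on the same edge adds its count once; adding it
unconditionally would count the edge twice.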

Re: Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread Jakub Jelinek

On Fri, Sep 01, 2017 at 11:09:47AM +0200, dbroemmel wrote:
> Hi all,
> 
> attached are a proposed fix and new testcase for PR81841. The
> THREADPRIVATE statement is currently wrongly rejected as part of BLOCK DATA.
> 
> The testcase also does (very basic) runtime checks. It fails (compiling)
> prior to the patch and completes after. Tested on x86_64 GNU/Linux.
> 
> Thanks,
> Dirk
> 
> 
> 2017-09-01 dbroemmel

The ChangeLog format is date, two spaces, real name, two spaces, e-mail
address in angle brackets.
> 
> PR fortran/81841
> * parse.c (parse_spec): adding ST_OMP_THREADPRIVATE as allowed

"Add" instead of "adding".
Also all the ChangeLog lines except empty and ones starting with date
should be tab indented, not sure if it is your mailer that ate it or
omission.

> statement.
> 
> 2017-09-01 dbroemmel
> 
> PR fortran/81841
> * gfortran.dg/gomp/omp_threadprivate3.f90: New testcase.

> Index: gcc/fortran/parse.c
> ===
> --- gcc/fortran/parse.c   (revision 251553)
> +++ gcc/fortran/parse.c   (working copy)
> @@ -3694,6 +3694,7 @@
>   case ST_EQUIVALENCE:
>   case ST_IMPLICIT:
>   case ST_IMPLICIT_NONE:
> + case ST_OMP_THREADPRIVATE:
>   case ST_PARAMETER:
>   case ST_STRUCTURE_DECL:
>   case ST_TYPE:

This looks good.

> Index: gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90
> ===
> --- gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90 (revision 0)
> +++ gcc/testsuite/gfortran.dg/gomp/omp_threadprivate3.f90 (working copy)
> @@ -0,0 +1,158 @@
> +! { dg-do run }

This is wrong.  Runtime testcases for OpenMP belong into libgomp/testsuite/.
That said, I fail to see why such a large testcase is needed, wouldn't a
simple
! PR fortran/81841
! { dg-do compile }

block data
  implicit none
  integer :: int2
  real:: flt2
  common /c_block_2/ int2, flt2
  !$OMP THREADPRIVATE(/c_block_2/)
  data int2, flt2 /2, 2.2/
end block data

testcase in gfortran.dg/gomp/ be sufficient here?

Jakub

Re: Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread dbroemmel

> This is wrong.  Runtime testcases for OpenMP belong into libgomp/testsuite/.
Well, that's a path where I found some Fortran OpenMP stuff, I didn't
look for other places.

> That said, I fail to see why such a large testcase is needed, wouldn't a
> simple
> ! PR fortran/81841
> ! { dg-do compile }
> 
> block data
>   implicit none
>   integer :: int2
>   real:: flt2
>   common /c_block_2/ int2, flt2
>   !$OMP THREADPRIVATE(/c_block_2/)
>   data int2, flt2 /2, 2.2/
> end block data
> 
> testcase in gfortran.dg/gomp/ be sufficient here?
That would suffice and is the first testcase I added to PR81841. It was
suggested I could add runtime tests as well, so I tried.

Dirk

Re: [PATCH] Fix asan create_odr_indicator (PR sanitizer/81923)

2017-09-01 Thread Richard Biener

On September 1, 2017 1:48:04 PM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>glibc fails to build with -fsanitize=address, because
>DECL_ASSEMBLER_NAME
>on some variables starts with the * character (e.g. for vars with __asm
>specified names).  We need to strip name encoding from those before
>appending after __odr_asan. Fixed thusly, bootstrapped/regtested on
>x86_64-linux and i686-linux, ok for trunk?

OK. 

Richard. 

>2017-09-01  Jakub Jelinek  
>
>   PR sanitizer/81923
>   * asan.c (create_odr_indicator): Strip name encoding from assembler
>   name before appending it after __odr_asan_.
>
>   * gcc.dg/asan/pr81923.c: New test.
>
>--- gcc/asan.c.jj  2017-08-10 02:31:21.0 +0200
>+++ gcc/asan.c 2017-08-29 17:25:58.337595628 +0200
>@@ -2529,9 +2529,12 @@ create_odr_indicator (tree decl, tree ty
>/* DECL_NAME theoretically might be NULL.  Bail out with 0 in this
>case.  */
>   if (decl_name == NULL_TREE)
> return build_int_cst (uptr, 0);
>-  size_t len = strlen (IDENTIFIER_POINTER (decl_name)) + sizeof
>("__odr_asan_");
>+  const char *dname = IDENTIFIER_POINTER (decl_name);
>+  if (HAS_DECL_ASSEMBLER_NAME_P (decl))
>+dname = targetm.strip_name_encoding (dname);
>+  size_t len = strlen (dname) + sizeof ("__odr_asan_");
>   name = XALLOCAVEC (char, len);
>-  snprintf (name, len, "__odr_asan_%s", IDENTIFIER_POINTER
>(decl_name));
>+  snprintf (name, len, "__odr_asan_%s", dname);
> #ifndef NO_DOT_IN_LABEL
>   name[sizeof ("__odr_asan") - 1] = '.';
> #elif !defined(NO_DOLLAR_IN_LABEL)
>--- gcc/testsuite/gcc.dg/asan/pr81923.c.jj 2017-08-29
>18:08:59.183881570 +0200
>+++ gcc/testsuite/gcc.dg/asan/pr81923.c2017-08-29 18:09:27.643550083
>+0200
>@@ -0,0 +1,10 @@
>+/* PR sanitizer/81923 */
>+/* { dg-do link } */
>+
>+int foobar __asm (__USER_LABEL_PREFIX__ "barbaz") = 34;
>+
>+int
>+main ()
>+{
>+  return 0;
>+}
>
>   Jakub

Re: [PATCH] Document -fsanitize=pointer-overflow (PR sanitizer/81902)

2017-09-01 Thread Richard Biener

On September 1, 2017 1:19:34 PM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>Martin Sebor reported I forgot to document this new sanitizer option.
>Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
>for
>trunk?

OK. Also do a changes.html entry?

Richard. 

>2017-09-01  Jakub Jelinek  
>
>   PR sanitizer/81902
>   * doc/invoke.texi: Document -fsanitize=pointer-overflow.
>
>--- gcc/doc/invoke.texi.jj 2017-08-29 19:03:08.0 +0200
>+++ gcc/doc/invoke.texi2017-08-29 20:30:36.563053682 +0200
>@@ -11074,6 +11074,12 @@ This option enables instrumentation of C
>accesses and some conversions between pointers to base and derived
>classes,
> to verify the referenced object has the correct dynamic type.
> 
>+@item -fsanitize=pointer-overflow
>+@opindex fsanitize=pointer-overflow
>+
>+This option enables instrumentation of pointer arithmetics.  If the
>pointer
>+arithmetics overflows, a run-time error is issued.
>+
> @end table
> 
>While @option{-ftrapv} causes traps for signed overflows to be emitted,
>
>   Jakub

Re: [PATCH] Add UBSAN_{PTR,BOUNDS} folding (PR sanitizer/81981)

2017-09-01 Thread Richard Biener

On September 1, 2017 1:16:54 PM GMT+02:00, Jakub Jelinek  
wrote:
>Hi!
>
>This patch fixes the following testcase by folding some ubsan internal
>fns
>we'd either remove anyway during sanopt, or lower into if (cond)
>do_something during sanopt where cond would be always false.
>
>Additionally, I've tried to clean up a little bit IFN_UBSAN_OBJECT_SIZE
>handling by using variables for the call arguments that make it clear
>what the arguments are.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

I think there's a helper for replace-with-nop.

Richard. 

>2017-09-01  Jakub Jelinek  
>
>   PR sanitizer/81981
>   * gimple-fold.c (gimple_fold_call): Optimize away useless UBSAN_PTR
>   and UBSAN_BOUNDS internal calls.  Clean up IFN_UBSAN_OBJECT_SIZE
>   handling.
>
>   * gcc.dg/ubsan/pr81981.c: New test.
>
>--- gcc/gimple-fold.c.jj   2017-08-10 02:31:21.0 +0200
>+++ gcc/gimple-fold.c  2017-08-29 18:50:49.993673432 +0200
>@@ -3938,11 +3938,23 @@ gimple_fold_call (gimple_stmt_iterator *
>   gimple_call_arg (stmt, 2));
> break;
>   case IFN_UBSAN_OBJECT_SIZE:
>-if (integer_all_onesp (gimple_call_arg (stmt, 2))
>-|| (TREE_CODE (gimple_call_arg (stmt, 1)) == INTEGER_CST
>-&& TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST
>-&& tree_int_cst_le (gimple_call_arg (stmt, 1),
>-gimple_call_arg (stmt, 2
>+{
>+  tree offset = gimple_call_arg (stmt, 1);
>+  tree objsize = gimple_call_arg (stmt, 2);
>+  if (integer_all_onesp (objsize)
>+  || (TREE_CODE (offset) == INTEGER_CST
>+  && TREE_CODE (objsize) == INTEGER_CST
>+  && tree_int_cst_le (offset, objsize)))
>+{
>+  gsi_replace (gsi, gimple_build_nop (), false);
>+  unlink_stmt_vdef (stmt);
>+  release_defs (stmt);
>+  return true;
>+}
>+}
>+break;
>+  case IFN_UBSAN_PTR:
>+if (integer_zerop (gimple_call_arg (stmt, 1)))
>   {
> gsi_replace (gsi, gimple_build_nop (), false);
> unlink_stmt_vdef (stmt);
>@@ -3950,6 +3962,25 @@ gimple_fold_call (gimple_stmt_iterator *
> return true;
>   }
> break;
>+  case IFN_UBSAN_BOUNDS:
>+{
>+  tree index = gimple_call_arg (stmt, 1);
>+  tree bound = gimple_call_arg (stmt, 2);
>+  if (TREE_CODE (index) == INTEGER_CST
>+  && TREE_CODE (bound) == INTEGER_CST)
>+{
>+  index = fold_convert (TREE_TYPE (bound), index);
>+  if (TREE_CODE (index) == INTEGER_CST
>+  && tree_int_cst_le (index, bound))
>+{
>+  gsi_replace (gsi, gimple_build_nop (), false);
>+  unlink_stmt_vdef (stmt);
>+  release_defs (stmt);
>+  return true;
>+}
>+}
>+}
>+break;
>   case IFN_GOACC_DIM_SIZE:
>   case IFN_GOACC_DIM_POS:
> result = fold_internal_goacc_dim (stmt);
>--- gcc/testsuite/gcc.dg/ubsan/pr81981.c.jj2017-08-29
>18:54:33.826107761 +0200
>+++ gcc/testsuite/gcc.dg/ubsan/pr81981.c   2017-08-29 18:55:36.721386827
>+0200
>@@ -0,0 +1,21 @@
>+/* PR sanitizer/81981 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -Wmaybe-uninitialized -fsanitize=undefined
>-ffat-lto-objects" } */
>+
>+int v;
>+
>+int
>+foo (int i)
>+{
>+  int t[1], u[1];
>+  int n = 0;
>+
>+  if (i)
>+{
>+  t[n] = i;
>+  u[0] = i;
>+}
>+
>+  v = u[0];   /* { dg-warning "may be used uninitialized in this
>function" } */
>+  return t[0];/* { dg-warning "may be used uninitialized in 
>this
>function" } */
>+}
>
>   Jakub

[PATCH] [ARC][ZOL] Account for empty body loops

2017-09-01 Thread Claudiu Zissulescu

From: claziss 

Hi Andrew,

By mistake I've pushed an incomplete ZOL-rework patch, and it is missing the
attached parts. Could you please check whether it is OK?

Thank you,
Claudiu

gcc/
2017-09-01  Claudiu Zissulescu 

* config/arc/arc.c (hwloop_optimize): Account for empty
body loops.

testsuite/
2017-09-01  Claudiu Zissulescu 

* gcc.target/arc/loop-1.c: Add test.
---
 gcc/config/arc/arc.c  | 13 +++--
 gcc/testsuite/gcc.target/arc/loop-1.c | 12 
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100755 gcc/testsuite/gcc.target/arc/loop-1.c

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 52a9b24..d519063 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -7240,6 +7240,12 @@ hwloop_optimize (hwloop_info loop)
fprintf (dump_file, ";; loop %d too long\n", loop->loop_no);
   return false;
 }
+  else if (!loop->length)
+{
+  if (dump_file)
+   fprintf (dump_file, ";; loop %d is empty\n", loop->loop_no);
+  return false;
+}
 
   /* Check if we use a register or not.  */
   if (!REG_P (loop->iter_reg))
@@ -7311,8 +7317,11 @@ hwloop_optimize (hwloop_info loop)
   && INSN_P (last_insn)
   && (JUMP_P (last_insn) || CALL_P (last_insn)
  || GET_CODE (PATTERN (last_insn)) == SEQUENCE
- || get_attr_type (last_insn) == TYPE_BRCC
- || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT))
+ /* At this stage we can have (insn (clobber (mem:BLK
+(reg instructions, ignore them.  */
+ || (GET_CODE (PATTERN (last_insn)) != CLOBBER
+ && (get_attr_type (last_insn) == TYPE_BRCC
+ || get_attr_type (last_insn) == TYPE_BRCC_NO_DELAY_SLOT
 {
   if (loop->length + 2 > ARC_MAX_LOOP_LENGTH)
{
diff --git a/gcc/testsuite/gcc.target/arc/loop-1.c 
b/gcc/testsuite/gcc.target/arc/loop-1.c
new file mode 100755
index 000..274bb46
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arc/loop-1.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+/* Check how we handle empty body loops.  */
+
+int a;
+void fn1(void) {
+  int i;
+  for (; i < 8; i++) {
+double A[a];
+  }
+}
-- 
1.9.1

Re: C/C++ PATCH to add __remove_qualifiers (PR c/65455, c/39985)

2017-09-01 Thread Marek Polacek

On Thu, Aug 31, 2017 at 04:38:27PM +, Joseph Myers wrote:
> I think the documentation needs to say (and the tests need to test) that 
> this produces a non-atomic type (like lvalue-to-rvalue conversion), if 
> that's the intent for how it handles atomic types, since _Atomic is 
> syntactically a qualifier but largely not treated like one in the 
> standard.

True, updated patch here:

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-09-01  Marek Polacek  

PR c/39985
PR c/65455
* c-common.c (c_common_reswords): Add __remove_qualifiers and
__remove_qualifiers__.
(keyword_begins_type_specifier): Handle RID_REMOVE_QUALS.
* c-common.h (enum rid): Add RID_REMOVE_QUALS.

* c-decl.c (start_struct): Also check in_remove_qualifiers.
(finish_struct): Likewise.
(start_enum): Likewise.
(finish_enum): Likewise.
* c-parser.c (c_keyword_starts_typename): Handle RID_REMOVE_QUALS.
(c_token_starts_declspecs): Likewise.
(c_parser_declaration_or_fndef): For __auto_type, remove all type
qualifiers.
(c_parser_declspecs): Handle RID_REMOVE_QUALS.
(c_parser_remove_qualifiers_specifier): New function.
(c_parser_objc_selector): Handle RID_REMOVE_QUALS.
* c-tree.h (enum c_typespec_kind): Update a comment.
Declare in_remove_qualifiers.
* c-typeck.c (in_remove_qualifiers): New global variable.
(build_external_ref): Also check in_remove_qualifiers.
(struct maybe_used_decl): Likewise.
(record_maybe_used_decl): Likewise.
(pop_maybe_used): Likewise.

* parser.c (cp_keyword_starts_decl_specifier_p): Handle
RID_REMOVE_QUALS.
(cp_parser_simple_type_specifier): Likewise.
(cp_parser_sizeof_operand): For __remove_qualifiers, remove all type
qualifiers.

* doc/extend.texi: Document __remove_qualifiers.

* c-c++-common/remove-quals-1.c: New test.
* c-c++-common/remove-quals-2.c: New test.
* c-c++-common/remove-quals-3.c: New test.
* c-c++-common/remove-quals-4.c: New test.
* g++.dg/ext/remove-quals-1.C: New test.
* g++.dg/ext/remove-quals-2.C: New test.
* gcc.dg/auto-type-3.c: New test.
* gcc.dg/remove-quals-1.c: New test.
* gcc.dg/remove-quals-2.c: New test.
* gcc.dg/remove-quals-3.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index d959dbc25bb..ae92ff440f6 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -423,6 +423,8 @@ const struct c_common_resword c_common_reswords[] =
   { "__null",  RID_NULL,   0 },
   { "__real",  RID_REALPART,   0 },
   { "__real__",RID_REALPART,   0 },
+  { "__remove_qualifiers", RID_REMOVE_QUALS, 0 },
+  { "__remove_qualifiers__", RID_REMOVE_QUALS, 0 },
   { "__restrict",  RID_RESTRICT,   0 },
   { "__restrict__",RID_RESTRICT,   0 },
   { "__signed",RID_SIGNED, 0 },
@@ -7525,6 +7527,7 @@ keyword_begins_type_specifier (enum rid keyword)
 case RID_CLASS:
 case RID_UNION:
 case RID_ENUM:
+case RID_REMOVE_QUALS:
   return true;
 default:
   if (keyword >= RID_FIRST_INT_N
diff --git gcc/c-family/c-common.h gcc/c-family/c-common.h
index 8e367680600..e726aa8844b 100644
--- gcc/c-family/c-common.h
+++ gcc/c-family/c-common.h
@@ -101,7 +101,7 @@ enum rid
   RID_ASM,   RID_TYPEOF,   RID_ALIGNOF,  RID_ATTRIBUTE,  RID_VA_ARG,
   RID_EXTENSION, RID_IMAGPART, RID_REALPART, RID_LABEL,  RID_CHOOSE_EXPR,
   RID_TYPES_COMPATIBLE_P,  RID_BUILTIN_COMPLEX, 
RID_BUILTIN_SHUFFLE,
-  RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,
+  RID_DFLOAT32, RID_DFLOAT64, RID_DFLOAT128,  RID_REMOVE_QUALS,

   /* TS 18661-3 keywords, in the same sequence as the TI_* values.  */
   RID_FLOAT16,
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index d526f0e88e4..b9cd5f8cf56 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -7516,12 +7516,14 @@ start_struct (location_t loc, enum tree_code code, tree name,
  within a statement expr used within sizeof, et. al.  This is not
  terribly serious as C++ doesn't permit statement exprs within
  sizeof anyhow.  */
-  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof))
+  if (warn_cxx_compat && (in_sizeof || in_typeof || in_alignof
+ || in_remove_qualifiers))
 warning_at (loc, OPT_Wc___compat,
"defining type in %qs expression is invalid in C++",
-   (in_sizeof
-? "sizeof"
-: (in_typeof ? "typeof" : "alignof")));
+   (in_sizeof ? "sizeof"
+: (in_typeof ? "typeof"
+ : (in_alignof ? "alignof"
+   : "__remove_qualifiers"))));

   return ref;
 }
@@ -8159,7 +8161,7 @@ finish_struct (location_t loc, tree t, tree fieldlist, 
tree attributes,
  struc

Re: c-family PATCH to improve -Wtautological-compare (PR c/81783)

2017-09-01 Thread Marek Polacek

Ping.

On Fri, Aug 25, 2017 at 02:47:45PM +0200, Marek Polacek wrote:
> Ping.
> 
> On Wed, Aug 16, 2017 at 05:24:56PM +0200, Marek Polacek wrote:
> > On Wed, Aug 16, 2017 at 11:07:36AM -0400, David Malcolm wrote:
> > > On Wed, 2017-08-16 at 16:29 +0200, Marek Polacek wrote:
> > > > This patch improves -Wtautological-compare so that it also detects
> > > > bitwise comparisons involving & and | that are always true or false,
> > > > e.g.
> > > > 
> > > >   if ((a & 16) == 10)
> > > > return 1;
> > > > 
> > > > can never be true.  Note that e.g. "(a & 9) == 8" is *not* always
> > > > false
> > > > or true.
> > > > 
> > > > I think it's pretty straightforward with one snag: we shouldn't warn
> > > > if
> > > > the constant part of the bitwise operation comes from a macro, but
> > > > currently
> > > > that's not possible, so I XFAILed this in the new test.
> > > 
> > > Maybe I'm missing something here, but why shouldn't it warn when the
> > > constant comes from a macro?
> > 
> > Just my past experience.  Sometimes you can't really control the macro
> > and then you get annoying warnings.
> > 
> > E.g. I had to tweak the warning that warns about if (i == i) to not warn 
> > about
> >   
> >   #define N 2
> >   if (a[N] == a[2]) {}
> > 
> > because that gave bogus warning during bootstrap, if I recall well.
> > 
> > > At the end of your testcase you have this example:
> > > 
> > > #define N 0x10
> > >   if ((a & N) == 10) /* { dg-bogus "bitwise comparison always evaluates 
> > > to false" "" { xfail *-*-* } } */
> > >  return 1;
> > >   if ((a | N) == 10) /* { dg-bogus "bitwise comparison always evaluates 
> > > to false" "" { xfail *-*-* } } */
> > >return 1;
> > > 
> > > That code looks bogus to me (and if the defn of "N" is further away,
> > > it's harder to spot that it's wrong): shouldn't we warn about it?
> > 
> > I'm glad you think so.  More than happy to make it an expected warning.
> > 
> > > > This has found one issue in the GCC codebase and it's a genuine bug:
> > > > .  
> > > 
> > > In this example GOVD_WRITTEN is from an enum, not a macro, but if
> > > GOVD_WRITTEN had been a macro, shouldn't we still issue a warning?
> > 
> > I feel like we should, but some might feel otherwise.
> > 
> > Thanks,
> > 
> > Marek
> 
>   Marek

Marek
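
For readers following the thread, the bit arithmetic behind the examples above can be written down as a minimal standalone sketch. This is not the code from the patch, and the helper names are made up for illustration. It assumes unsigned operands and only covers the `==' comparison: `(a & M) == C' can never hold when C has a bit set outside M, and `(a | M) == C' can never hold when M has a bit set outside C.

```c
#include <stdbool.h>

/* Hypothetical helper: true if (x & mask) == cst is false for every x.
   The result of x & mask can only contain bits that are set in mask,
   so any bit of cst outside mask makes the comparison impossible.  */
static bool
and_eq_always_false (unsigned mask, unsigned cst)
{
  return (cst & ~mask) != 0;
}

/* Hypothetical helper: true if (x | mask) == cst is false for every x.
   The result of x | mask always contains every bit of mask, so any
   bit of mask missing from cst makes the comparison impossible.  */
static bool
or_eq_always_false (unsigned mask, unsigned cst)
{
  return (mask & ~cst) != 0;
}
```

With these, and_eq_always_false (16, 10) holds, matching the "(a & 16) == 10" example, while and_eq_always_false (9, 8) does not, matching the note that "(a & 9) == 8" is neither always true nor always false.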

Re: [PATCH] Expand switch statements with a single (or none) non-default case.

2017-09-01 Thread Martin Liška

On 09/01/2017 12:57 PM, Richard Biener wrote:
> On Fri, Sep 1, 2017 at 12:44 PM, Martin Liška  wrote:
>> On 09/01/2017 10:26 AM, Richard Biener wrote:
>>> On Fri, Sep 1, 2017 at 10:07 AM, Martin Liška  wrote:
 On 08/30/2017 02:56 PM, Richard Biener wrote:
> On Wed, Aug 30, 2017 at 2:32 PM, Martin Liška  wrote:
>> On 08/30/2017 02:28 PM, Richard Biener wrote:
>>> On Wed, Aug 30, 2017 at 1:13 PM, Martin Liška  wrote:
 Hi.

 Simple transformation of switch statements where a degenerate switch
 can be interpreted as a gimple condition (or removed if it has no
 non-default case). I originally thought that we don't have switch
 statements without non-default cases, but PR82032 shows we can see
 them.

 Patch can bootstrap on ppc64le-redhat-linux and survives regression 
 tests.

 Ready to be installed?
>>>
>>> While I guess this case is ok to handle here it would be nice if CFG 
>>> cleanup
>>> would do the same.  I suppose find_taken_edge somehow doesn't work for
>>> this case even after my last enhancement?  Or is CFG cleanup for some 
>>> reason
>>> not run?
>>
>> Do you mean both with # of non-default edges equal to 0 and 1?
>> Let me take a look.
>
> First and foremost 0.  The case of 1 non-default and a default would
> need extra code.

 For the test-case I reduced, one needs:

 diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
 index b7593068ea9..13af516c6ac 100644
 --- a/gcc/tree-cfg.c
 +++ b/gcc/tree-cfg.c
 @@ -8712,7 +8712,7 @@ const pass_data pass_data_split_crit_edges =
PROP_no_crit_edges, /* properties_provided */
0, /* properties_destroyed */
0, /* todo_flags_start */
 -  0, /* todo_flags_finish */
 +  TODO_cleanup_cfg, /* todo_flags_finish */
  };

  class pass_split_crit_edges : public gimple_opt_pass

 And the code eliminates the problematic switch statement. Do you believe 
 it's the right approach
 to add the clean up and preserve the assert in tree-switch-conversion.c?
>>>
>>> Eh, no.  If you run cleanup-cfg after critical edge splitting they
>>> will be unsplit immediately, making
>>> it (mostly) a no-op.
>>>
>>> OTOH I wanted to eliminate that "pass" in favor of just calling
>>> split_critical_edges () where needed
>>> (that's already done in some places).
>>
>> Good, so I will leave it to you. Should I in meantime remove the assert in 
>> tree-switch-conversion.c ?
> 
> Yes, as said your patch was generally OK, I just wondered why we left
> the switches "unoptimized".

Good.

I'm sending v2 for single non-default case situation.
Patch can bootstrap on ppc64le-redhat-linux and survives regression tests.

Ready to be installed?
Martin
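
To illustrate what "single non-default case" means at the source level, here is a small sketch. It is only an example of the equivalence, not the GIMPLE the patch generates, and the function names are invented. A switch whose only non-default case covers the values 3 to 7 behaves like one range test, which can be expressed as a single unsigned comparison.

```c
/* A switch with one non-default case (covering 3..7) plus a default.  */
static int
with_switch (int x)
{
  switch (x)
    {
    case 3: case 4: case 5: case 6: case 7:
      return 1;
    default:
      return 0;
    }
}

/* The same decision as a single condition.  Subtracting the low bound
   in unsigned arithmetic maps values below 3 to very large numbers, so
   one comparison against (high - low) checks both bounds.  */
static int
with_condition (int x)
{
  return (unsigned) x - 3u <= 4u;
}
```

Both functions return the same value for every int argument; when the case has a single value instead of a range, the condition degenerates to a plain equality test.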

> 
> Richard.
> 
>>>
 For the case with # of edges == 1, should I place it to tree-cfg.c in 
 order to trigger it as a clean-up?
>>>
>>> I believe the code for edges == 1 can reside in
>>> cleanup_control_expr_graph.  Probably easiest
>>> from a flow perspective if we do the switch -> cond transform before
>>> the existing code handling
>>> cond and switch via find-taken-edge.
>>
>> Working on that, good place to do the transformation.
>>
>> Martin
>>
>>>
 Thoughts?

 Martin

>
> Richard.
>
>> Martin
>>
>>>
>>> Richard.
>>>
 Martin

 gcc/ChangeLog:

 2017-08-25  Martin Liska  

 PR tree-optimization/82032
 * tree-switch-conversion.c (generate_high_low_equality): New
 function.
 (expand_degenerated_switch): Likewise.
 (process_switch): Call expand_degenerated_switch.
 (try_switch_expansion): Likewise.
 (emit_case_nodes): Use generate_high_low_equality.

 gcc/testsuite/ChangeLog:

 2017-08-25  Martin Liska  

 PR tree-optimization/82032
 * gcc.dg/tree-ssa/pr68198.c: Update jump threading 
 expectations.
 * gcc.dg/tree-ssa/switch-expansion.c: New test.
 * gcc.dg/tree-ssa/vrp34.c: Update scanned pattern.
 * g++.dg/other/pr82032.C: New test.
 ---
  gcc/testsuite/g++.dg/other/pr82032.C |  36 +++
  gcc/testsuite/gcc.dg/tree-ssa/pr68198.c  |   6 +-
  gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c |  14 +++
  gcc/testsuite/gcc.dg/tree-ssa/vrp34.c|   5 +-
  gcc/tree-switch-conversion.c | 123 
 ++-
  5 files changed, 152 insertions(+), 32 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/other/pr82032.C
  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/switch-expansion.c


>>

>>

>From

Re: [PATCH] PR target/80556

2017-09-01 Thread Simon Wright

On 29 Jun 2017, at 21:41, Simon Wright  wrote:
> 
> On 28 Jun 2017, at 18:40, Jeff Law  wrote:
>> 
>> On 06/09/2017 07:57 AM, Simon Wright wrote:
>>>   2017-06-09 Simon Wright 
>>> 
>>>   PR target/80556
>>>   * configure.ac (stage1_ldflags): For Darwin, include -lSystem.
>>> (poststage1_ldflags): likewise.
>>>   * configure: regenerated.
>> I'm a bit confused here.  Isn't -lSystem included in darwin's LIB_SPEC
>> in which case the right things ought to already be happening, shouldn't it?
> 
> The specs that involve -lSystem are
> 
> *link_gcc_c_sequence:
> %:version-compare(>= 10.6 mmacosx-version-min= -no_compact_unwind)
> %{!static:%{!static-libgcc:   %:version-compare(>= 10.6 
> mmacosx-version-min= -lSystem) } }
> %{fno-pic|fno-PIC|fno-pie|fno-PIE|fapple-kext|mkernel|static|mdynamic-no-pic: 
>   %:version-compare(>= 10.7 mmacosx-version-min= -no_pie) } %G %L
> 
> *lib:
> %{!static:-lSystem}
> 
> but I also see
> 
> *libgcc:
> %{static-libgcc|static: -lgcc_eh -lgcc; 
> 
> which might be the root of the problem?
> 
> Looking at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80556#c39, I report 
> that
> 
>   $ gnatmake raiser -largs -static-libgcc -static-libstdc++
> 
> resulted in the link command
> 
>   /usr/bin/ld -dynamic -arch x86_64 -macosx_version_min 10.12.0
>   -weak_reference_mismatches non-weak -o raiser -L./
>   -L/opt/gcc-7.1.0/lib/gcc/x86_64-apple-darwin15/7.1.0/adalib/
>   -L/opt/gcc-7.1.0/lib/gcc/x86_64-apple-darwin15/7.1.0
>   -L/opt/gcc-7.1.0/lib/gcc/x86_64-apple-darwin15/7.1.0/../../.. b~raiser.o
>   ./raiser.o -v
>   /opt/gcc-7.1.0/lib/gcc/x86_64-apple-darwin15/7.1.0/adalib/libgnat.a
>   -no_compact_unwind -lgcc_eh -lgcc -lSystem
> 
> i.e. -lSystem is *after* -lgcc, so that its exception handling won't be 
> invoked.
> 
> I don't know what -lgcc_eh does, but my patch would be pretty much equivalent 
> to changing the libgcc spec above to
> 
> *libgcc:
> %{static-libgcc|static: -lSystem -lgcc_eh -lgcc; 
> 
> and if that would be OK it would obviously be much better.
> 
> I've rebuilt gcc-8-20170528 with this change alone (i.e. not the patch 
> currently posted here), successfully.

I've rebuilt and tested gcc-8-20170820 with this change, successfully.

gcc/Changelog:

   2017-09-01 Simon Wright 

   PR target/80556
   * config/darwin.h (REAL_LIBGCC_SPEC): for static-libgcc|static, include 
-lSystem first.



80556-darwin.h.diff
Description: Binary data

[C++ PATCH] restore CLASSTYPE_SORTED_FIELDS

2017-09-01 Thread Nathan Sidwell

I've reverted last week's conversion of CLASSTYPE_SORTED_FIELDS to a 
hash-map.  It became clear this was a rat hole I have insufficient time 
to go down.


I'm now attacking the problem from another direction, which should be 
more fruitful.


nathan
--
Nathan Sidwell
2017-09-01  Nathan Sidwell  

	Revert 2017-08-28  Nathan Sidwell  
	Restore sorted_fields vector.
	* cp-tree.h (lang_type): Restore sorted_fields vector.
	(CLASSTYPE_SORTED_FIELDS): Restore.
	(CLASSTYPE_BINDINGS): Delete.
	* name-lookup.c (lookup_field_1): Restore binary search.
	(sorted_fields_type_new, count_fields,
	add_fields_to_record_type, add_enum_fields_to_record_type): Restore
	(set_class_bindings): Revert.
	(insert_late_enum_def_binding): Restore field_vec insertion.
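
The restored lookup_field_1 logic is a binary search over a vector sorted by name, followed by a short walk over entries that share the name. A simplified standalone sketch of that shape follows. It is an assumption-laden illustration: `struct entry' and `lookup' are hypothetical, integer keys stand in for identifier pointers, and types are assumed to sort before non-types with the same name.

```c
#include <stddef.h>

/* Hypothetical stand-in for a field decl: NAME plays the role of the
   identifier pointer used as sort key, IS_TYPE marks a type decl.  */
struct entry
{
  int name;
  int is_type;
};

/* Search FIELDS[0..LEN), sorted by name.  With WANT_TYPE, return the
   first entry for NAME if it is a type; otherwise return the last
   entry for NAME.  Return NULL if nothing suitable is found.  */
static const struct entry *
lookup (const struct entry *fields, int len, int name, int want_type)
{
  int lo = 0, hi = len;

  while (lo < hi)
    {
      int i = (lo + hi) / 2;

      if (fields[i].name > name)
        hi = i;
      else if (fields[i].name < name)
        lo = i + 1;
      else
        {
          const struct entry *field;

          if (want_type)
            {
              /* Walk back to the first entry with this name.  */
              do
                field = &fields[i--];
              while (i >= lo && fields[i].name == name);
              if (!field->is_type)
                field = NULL;
            }
          else
            {
              /* Walk forward to the last entry with this name.  */
              do
                field = &fields[i++];
              while (i < hi && fields[i].name == name);
            }
          return field;
        }
    }
  return NULL;
}
```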

Index: cp-tree.h
===
--- cp-tree.h	(revision 251585)
+++ cp-tree.h	(working copy)
@@ -2007,10 +2007,10 @@ struct GTY(()) lang_type {
  as a list of adopted protocols or a pointer to a corresponding
  @interface.  See objc/objc-act.h for details.  */
   tree objc_info;
-
-  /* Map from IDENTIFIER nodes to DECLS.  */
-  hash_map *bindings;
-
+  /* sorted_fields is sorted based on a pointer, so we need to be able
+ to resort it if pointers get rearranged.  */
+  struct sorted_fields_type * GTY ((reorder ("resort_sorted_fields")))
+sorted_fields;
   /* FIXME reuse another field?  */
   tree lambda_expr;
 };
@@ -3236,9 +3236,10 @@ extern void decl_shadowed_for_var_insert
&& TREE_CODE (TYPE_NAME (NODE)) == TYPE_DECL	\
&& TYPE_DECL_ALIAS_P (TYPE_NAME (NODE)))
 
-/* The binding map for a class (not always present).  */
-#define CLASSTYPE_BINDINGS(NODE) \
-  (LANG_TYPE_CLASS_CHECK (NODE)->bindings)
+/* For a class type: if this structure has many fields, we'll sort them
+   and put them into a TREE_VEC.  */
+#define CLASSTYPE_SORTED_FIELDS(NODE) \
+  (LANG_TYPE_CLASS_CHECK (NODE)->sorted_fields)
 
 /* If non-NULL for a VAR_DECL, FUNCTION_DECL, TYPE_DECL or
TEMPLATE_DECL, the entity is either a template specialization (if
Index: name-lookup.c
===
--- name-lookup.c	(revision 251585)
+++ name-lookup.c	(working copy)
@@ -1183,33 +1183,58 @@ lookup_fnfields_slot_nolazy (tree type,
 tree
 lookup_field_1 (tree type, tree name, bool want_type)
 {
-  tree field = NULL_TREE;
+  tree field;
 
   gcc_assert (identifier_p (name) && RECORD_OR_UNION_TYPE_P (type));
 
-  if (CLASSTYPE_BINDINGS (type))
+  if (CLASSTYPE_SORTED_FIELDS (type))
 {
-  tree *slot = CLASSTYPE_BINDINGS (type)->get (name);
-
-  if (slot)
-	{
-	  field = *slot;
-
-	  if (STAT_HACK_P (field))
+  tree *fields = &CLASSTYPE_SORTED_FIELDS (type)->elts[0];
+  int lo = 0, hi = CLASSTYPE_SORTED_FIELDS (type)->len;
+  int i;
+
+  while (lo < hi)
+	{
+	  i = (lo + hi) / 2;
+
+	  if (DECL_NAME (fields[i]) > name)
+	hi = i;
+	  else if (DECL_NAME (fields[i]) < name)
+	lo = i + 1;
+	  else
 	{
+	  field = NULL_TREE;
+
+	  /* We might have a nested class and a field with the
+		 same name; we sorted them appropriately via
+		 field_decl_cmp, so just look for the first or last
+		 field with this name.  */
 	  if (want_type)
-		field = STAT_TYPE (field);
+		{
+		  do
+		field = fields[i--];
+		  while (i >= lo && DECL_NAME (fields[i]) == name);
+		  if (!DECL_DECLARES_TYPE_P (field))
+		field = NULL_TREE;
+		}
 	  else
-		field = STAT_DECL (field);
-	}
+		{
+		  do
+		field = fields[i++];
+		  while (i < hi && DECL_NAME (fields[i]) == name);
+		}
+
+	  if (field)
+	  	{
+	  	  field = strip_using_decl (field);
+	  	  if (is_overloaded_fn (field))
+	  	field = NULL_TREE;
+	  	}
 
-	  field = strip_using_decl (field);
-	  if (OVL_P (field))
-	field = NULL_TREE;
-	  else if (want_type && !DECL_DECLARES_TYPE_P (field))
-	field = NULL_TREE;
+	  return field;
+	}
 	}
-  return field;
+  return NULL_TREE;
 }
 
   field = TYPE_FIELDS (type);
@@ -1287,62 +1312,113 @@ lookup_fnfields_slot (tree type, tree na
   return lookup_fnfields_slot_nolazy (type, name);
 }
 
-/* Add DECL into MAP under NAME.  Collisions fail silently.  Doesn't
-   do sophisticated collision checking.  Deals with STAT_HACK.  */
+/* Allocate and return an instance of struct sorted_fields_type with
+   N fields.  */
 
-static void
-add_class_member (hash_map *map, tree name, tree decl)
+static struct sorted_fields_type *
+sorted_fields_type_new (int n)
 {
-  bool existed;
-  tree *slot = &map->get_or_insert (name, &existed);
-  if (!existed)
-*slot = decl;
-  else if (TREE_CODE (*slot) == TYPE_DECL && DECL_ARTIFICIAL (*slot))
-*slot = stat_hack (decl, *slot);
-  else if (TREE_CODE (decl) == TYPE_DECL && DECL_ARTIFICIAL (decl))
-*slot = stat_hack (*slot, decl);
+  struct sorted_fields_type *sft;
+  sft = (sorted_fields_type *) ggc_internal_alloc (sizeof (sorted_fields_type)
+  +

Re: Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread Jakub Jelinek

On Fri, Sep 01, 2017 at 02:27:40PM +0200, dbroemmel wrote:
> > This is wrong.  Runtime testcases for OpenMP belong into libgomp/testsuite/.
> Well, that's a path where I found some Fortran OpenMP stuff, I didn't
> look for other places.
> 
> > That said, I fail to see why such a large testcase is needed, wouldn't a
> > simple
> > ! PR fortran/81841
> > ! { dg-do compile }
> > 
> > block data
> >   implicit none
> >   integer :: int2
> >   real:: flt2
> >   common /c_block_2/ int2, flt2
> >   !$OMP THREADPRIVATE(/c_block_2/)
> >   data int2, flt2 /2, 2.2/
> > end block data
> > 
> > testcase in gfortran.dg/gomp/ be sufficient here?
> That would suffice and is the first testcase I added to PR81841. It was
> suggested I could add runtime tests as well, so I tried.

If you really need a testcase, it would be enough to do something like:
  use omp_lib
  !$omp parallel num_threads(2)
  int2 = omp_get_thread_num ()
  !$omp barrier
  if (int2 != omp_get_thread_num ()) call abort
  !$omp end parallel
or so to ensure it has the threadprivate property by writing something
different to it in each thread and after barrier verifying it has the
expected value in each thread.

Jakub

Re: [OBVIOUS][PATCH] Fix warning for simple-object-elf.c.

2017-09-01 Thread H.J. Lu

On Fri, Sep 1, 2017 at 4:22 AM, Martin Liška  wrote:
> Installed after discussion with Richi.
>
> Martin

I also checked this into binutils.

-- 
H.J.

[PATCH]: PR target/80204 (Darwin macosx-version-min problem)

2017-09-01 Thread Simon Wright

In gcc/config/darwin-driver.c, darwin_find_version_from_kernel() assumes
that the minor version in the Darwin kernel version (16.7.0, => minor
version 7) is equal to the bugfix component of the macOS version, so that
the compiler receives -mmacosx-version-min=10.12.7 and the linker receives
-macosx_version_min 10.12.7.

Unfortunately, Apple don’t apply this algorithm; the macOS version is
actually 10.12.6.

Getting this wrong means that it’s impossible to run an executable from 
within a bundle: Sierra complains "You have macOS 10.12.6. The application
requires macOS 10.12.7 or later".

A workaround would perhaps be to link the executable with
-Wl,-macosx_version_min,`sw_vers -productVersion` (I assume that it’s only
the linker phase that matters?)

I see that Apple’s gcc (Apple LLVM version 8.0.0
(clang-800.0.42.1)) specifies - only at link time -
  -macosx_version_min 10.12.0

This patch does the same.
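
As a rough sketch of the intended behaviour (this is not the code in darwin-driver.c; the helper name and its interface are hypothetical), the kernel major version selects the 10.x release and the last component is always 0. The "major - 4" mapping matches the 16.7.0 => 10.12 example above and is assumed here to hold only for the 10.x series.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: turn a Darwin kernel release such as "16.7.0"
   into a deployment target such as "10.12.0".  Returns 0 on success,
   -1 if the release string cannot be used.  */
static int
macos_min_from_kernel (const char *release, char *buf, size_t len)
{
  char *end;
  long major = strtol (release, &end, 10);

  /* Require a number followed by '.', and a major version that is
     assumed to map onto a 10.x release.  */
  if (end == release || *end != '.' || major < 5 || major > 19)
    return -1;

  /* Deliberately ignore the kernel minor version: always emit ".0".  */
  int n = snprintf (buf, len, "10.%ld.0", major - 4);
  return (n > 0 && (size_t) n < len) ? 0 : -1;
}
```

For a kernel release of "16.7.0" this yields "10.12.0", so an executable built this way is not rejected on 10.12.6.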

gcc/Changelog:

2017-09-01 Simon Wright 

PR target/80204
* config/darwin-driver.c (darwin_find_version_from_kernel): eliminate 
calculation of the
  minor version, always output as 0.



80204-darwin-driver.c.diff
Description: Binary data

[PATCH][GCC][ARM] Dot Product commandline options [Patch (1/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch adds support for the +dotprod extension to ARM.
Dot Product requires Adv.SIMD, so enabling +dotprod also enables
Adv.SIMD.

It is available from ARMv8.2-a and onwards and is enabled by
default on Cortex-A55 and Cortex-A75.

Regtested and bootstrapped on arm-none-eabi and no issues.

Ok for trunk?

gcc/
2017-09-01  Tamar Christina  

* config/arm/arm.h (TARGET_DOTPROD): New.
* config/arm/arm.c (arm_arch_dotprod): New.
(arm_option_reconfigure_globals): Add arm_arch_dotprod.
* config/arm/arm-c.c (__ARM_FEATURE_DOTPROD): New.
* config/arm/arm-cpus.in (cortex-a55, cortex-a75): Enable +dotprod.
(armv8.2-a, cortex-a75.cortex-a55): Likewise.
* config/arm/arm-isa.h (isa_bit_dotprod, ISA_DOTPROD): New.
* config/arm/t-multilib (v8_2_a_simd_variants): Add dotprod.
* doc/invoke.texi (armv8.2-a): Document dotprod

-- 
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index 55472434c3a6e90c5693bbaabd3265f7d968787f..295f03bf8ee02be7c89ed2967d283be206e9f25a 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -73,6 +73,7 @@ arm_cpu_builtins (struct cpp_reader* pfile)
   def_or_undef_macro (pfile, "__ARM_FEATURE_QRDMX", TARGET_NEON_RDMA);
 
   def_or_undef_macro (pfile, "__ARM_FEATURE_CRC32", TARGET_CRC32);
+  def_or_undef_macro (pfile, "__ARM_FEATURE_DOTPROD", TARGET_DOTPROD);
   def_or_undef_macro (pfile, "__ARM_32BIT_STATE", TARGET_32BIT);
 
   cpp_undef (pfile, "__ARM_FEATURE_CMSE");
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index d009a9e18acb093aefe0f9d8d6de49489fc2325c..7707eec5edf36b0cb4339bc52bc45a92b6ea007f 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -357,6 +357,7 @@ begin arch armv8.2-a
  option crypto add FP_ARMv8 CRYPTO
  option nocrypto remove ALL_CRYPTO
  option nofp remove ALL_FP
+ option dotprod add FP_ARMv8 DOTPROD
 end arch armv8.2-a
 
 begin arch armv8-m.base
@@ -1269,9 +1270,10 @@ begin cpu cortex-a55
  cname cortexa55
  tune for cortex-a53
  tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
  fpu neon-fp-armv8
  option crypto add FP_ARMv8 CRYPTO
+ option dotprod add FP_ARMv8 DOTPROD
  option nofp remove ALL_FP
  costs cortex_a53
 end cpu cortex-a55
@@ -1280,9 +1282,10 @@ begin cpu cortex-a75
  cname cortexa75
  tune for cortex-a57
  tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
  fpu neon-fp-armv8
  option crypto add FP_ARMv8 CRYPTO
+ option dotprod add FP_ARMv8 DOTPROD
  costs cortex_a73
 end cpu cortex-a75
 
@@ -1292,9 +1295,10 @@ begin cpu cortex-a75.cortex-a55
  cname cortexa75cortexa55
  tune for cortex-a53
  tune flags LDSCHED
- architecture armv8.2-a+fp16
+ architecture armv8.2-a+fp16+dotprod
  fpu neon-fp-armv8
  option crypto add FP_ARMv8 CRYPTO
+ option dotprod add FP_ARMv8 DOTPROD
  costs cortex_a73
 end cpu cortex-a75.cortex-a55
 
diff --git a/gcc/config/arm/arm-isa.h b/gcc/config/arm/arm-isa.h
index dbd29eaa52f2007498c2aff6263b8b6c3a70e2c2..60a50edf08dd7d3ac9ad46967250f4dcc6b8768b 100644
--- a/gcc/config/arm/arm-isa.h
+++ b/gcc/config/arm/arm-isa.h
@@ -66,6 +66,7 @@ enum isa_feature
 isa_bit_fp_d32,	/* 32 Double precision registers.  */
 isa_bit_crypto,	/* Crypto extension to ARMv8.  */
 isa_bit_fp16,	/* FP16 data processing (half-precision float).  */
+isa_bit_dotprod,	/* Dot Product instructions.  */
 
 /* ISA Quirks (errata?).  Don't forget to add this to the list of
all quirks below.  */
@@ -159,6 +160,7 @@ enum isa_feature
 #define ISA_FP_ARMv8	ISA_FPv5, ISA_FP_D32
 #define ISA_NEON	ISA_FP_D32, isa_bit_neon
 #define ISA_CRYPTO	ISA_NEON, isa_bit_crypto
+#define ISA_DOTPROD	ISA_NEON, isa_bit_dotprod
 
 /* List of all quirk bits to strip out when comparing CPU features with
architectures.  */
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 4f53583cf0219de4329bc64a47a5a42c550ff354..44a95bf7eb2eab8e3cf07ac9cc7aad3d9997b27f 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -210,6 +210,11 @@ extern tree arm_fp16_type_node;
 /* FPU supports ARMv8.1 Adv.SIMD extensions.  */
 #define TARGET_NEON_RDMA (TARGET_NEON && arm_arch8_1)
 
+/* Supports for Dot Product AdvSIMD extensions.  */
+#define TARGET_DOTPROD (TARGET_NEON	\
+			&& bitmap_bit_p (arm_active_target.isa,		\
+	isa_bit_dotprod))
+
 /* FPU supports the floating point FP16 instructions for ARMv8.2 and later.  */
 #define TARGET_VFP_FP16INST \
   (TARGET_32BIT && TARGET_HARD_FLOAT && TARGET_VFP5 && arm_fp16_inst)
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 71379dd5afc4c0dd62fdafd0893d2ad47ae7..486591137f95cfb2e51adb7082f346edf84449de 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -952,6 +952,9 @@ int arm_condexec_masklen = 0;
 /* Nonzero if chip supports the ARMv8 CRC instructions.  */
 int arm_arch_crc = 0;
 
+/* Nonzero if chip supports the AdvSIMD Dot Product instru

[PATCH][GCC][AArch64] Dot Product commandline options [Patch (4/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch adds support for the +dotprod extension to AArch64.
Dot Product requires Adv.SIMD, so enabling +dotprod also enables
Adv.SIMD.

It is available from ARMv8.2-a and onwards and is enabled by
default on Cortex-A55 and Cortex-A75.

Regtested and bootstrapped on aarch64-none-elf and no issues.

Ok for trunk?

gcc/
2017-09-01  Tamar Christina  

* config/aarch64/aarch64.h (AARCH64_FL_DOTPROD): New.
(AARCH64_ISA_DOTPROD, TARGET_DOTPROD): New.
* config/aarch64/aarch64-c.c (aarch64_update_cpp_builtins): Add 
TARGET_DOTPROD.
* config/aarch64/aarch64-option-extensions.def (dotprod): New.
* config/aarch64/aarch64-cores.def (cortex-a55, cortex-a75): Enable 
TARGET_DOTPROD.
(cortex-a75.cortex-a55): Likewise.
* doc/invoke.texi (aarch64-feature-modifiers): Document dotprod.

-- 
diff --git a/gcc/config/aarch64/aarch64-c.c b/gcc/config/aarch64/aarch64-c.c
index 177e638682f9dae3476b76e48a2d96c70d5acbd1..c7d866f3b567bbb55bf2c5152c9d0729fc2eff2c 100644
--- a/gcc/config/aarch64/aarch64-c.c
+++ b/gcc/config/aarch64/aarch64-c.c
@@ -106,6 +106,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
 
 
   aarch64_def_or_undef (TARGET_CRC32, "__ARM_FEATURE_CRC32", pfile);
+  aarch64_def_or_undef (TARGET_DOTPROD, "__ARM_FEATURE_DOTPROD", pfile);
 
   cpp_undef (pfile, "__AARCH64_CMODEL_TINY__");
   cpp_undef (pfile, "__AARCH64_CMODEL_SMALL__");
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 10893324d3fd856ba60247fd1a48c56d0cf2fc39..16e44855872112c81db349e098f932edd52117be 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -83,8 +83,8 @@ AARCH64_CORE("thunderx2t99",  thunderx2t99,  thunderx2t99, 8_1A,  AARCH64_FL_FOR
 /* ARMv8.2-A Architecture Processors.  */
 
 /* ARM ('A') cores. */
-AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa53, 0x41, 0xd05, -1)
-AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, 0xd0a, -1)
+AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
+AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
 
 /* ARMv8-A big.LITTLE implementations.  */
 
@@ -95,6 +95,6 @@ AARCH64_CORE("cortex-a73.cortex-a53",  cortexa73cortexa53, cortexa53, 8A,  AARCH
 
 /* ARM DynamIQ big.LITTLE configurations.  */
 
-AARCH64_CORE("cortex-a75.cortex-a55",  cortexa75cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
+AARCH64_CORE("cortex-a75.cortex-a55",  cortexa75cortexa55, cortexa53, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 0xd05), -1)
 
 #undef AARCH64_CORE
diff --git a/gcc/config/aarch64/aarch64-option-extensions.def b/gcc/config/aarch64/aarch64-option-extensions.def
index a989a2ec23e53f849903503b57a44c2a3e6812be..2c808f0b9ef7ca037239040d6fd0b57c664c12e1 100644
--- a/gcc/config/aarch64/aarch64-option-extensions.def
+++ b/gcc/config/aarch64/aarch64-option-extensions.def
@@ -43,8 +43,8 @@
 AARCH64_OPT_EXTENSION("fp", AARCH64_FL_FP, 0, AARCH64_FL_SIMD | AARCH64_FL_CRYPTO | AARCH64_FL_F16, "fp")
 
 /* Enabling "simd" also enables "fp".
-   Disabling "simd" also disables "crypto".  */
-AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, AARCH64_FL_CRYPTO, "asimd")
+   Disabling "simd" also disables "crypto" and "dotprod".  */
+AARCH64_OPT_EXTENSION("simd", AARCH64_FL_SIMD, AARCH64_FL_FP, AARCH64_FL_CRYPTO | AARCH64_FL_DOTPROD, "asimd")
 
 /* Enabling "crypto" also enables "fp", "simd".
Disabling "crypto" just disables "crypto".  */
@@ -67,4 +67,8 @@ AARCH64_OPT_EXTENSION("rcpc", AARCH64_FL_RCPC, 0, 0, "lrcpc")
Disabling "rdma" just disables "rdma".  */
 AARCH64_OPT_EXTENSION("rdma", AARCH64_FL_RDMA, AARCH64_FL_FP | AARCH64_FL_SIMD, 0, "asimdrdm")
 
+/* Enabling "dotprod" also enables "simd".
+   Disabling "dotprod" only disables "dotprod".  */
+AARCH64_OPT_EXTENSION("dotprod", AARCH64_FL_DOTPROD, AARCH64_FL_SIMD, 0, "asimddp")
+
 #undef AARCH64_OPT_EXTENSION
diff --git a/gcc/config/aarch64/aarch64.h b/gcc/config/aarch64/aarch64.h
index 7f91edb5713d7e8eda2f0a024a0f97b4e111c4b0..a61530e278f9e69dc0fe674d4fc2e58ec975dd21 100644
--- a/gcc/config/aarch64/aarch64.h
+++ b/gcc/config/aarch64/aarch64.h
@@ -151,7 +151,8 @@ extern unsigned aarch64_architecture_version;
 #define AARCH64_FL_F16	  (1 << 9)  /* Has ARMv8.2-A FP16 extensions.  */
 /* ARMv8.3-A architecture extensions.  */
 #define AARCH64_FL_V8_3	  (1 << 10)  /* Has ARMv8.3-A features.  */
-#define AARCH64_FL_RCPC	  (1 << 11)  /* Has support for RCpc model.  */
+#define AARCH64_FL_RCPC   (1 << 11)  /* Has support for RCpc

[PATCH][GCC][Testsuite][ARM][AArch64] Enable Dot Product for generic tests for ARM and AArch64 [Patch (7/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch enables tests for Dot Product vectorization
in gcc for ARM and AArch64.

The ARMv8.2-a Dot Product instructions only support 8-bit
element vectorization.
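
For context, the kind of loop these tests exercise is an 8-bit dot-product reduction. The following scalar sketch is an illustrative example only, not the contents of vect-reduc-dot-s8a.c, and the function name is made up: it multiplies signed 8-bit elements pairwise and accumulates the products in a wider integer.

```c
/* Scalar reference for a signed 8-bit dot product.  Each product is
   computed in int and added to a 32-bit accumulator, which is the
   pattern a vectorizer can map onto a dot-product instruction.  */
static int
dot_s8 (const signed char *a, const signed char *b, int n)
{
  int sum = 0;

  for (int i = 0; i < n; i++)
    sum += a[i] * b[i];
  return sum;
}
```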

Dot product is available from ARMv8.2-a and onwards.

Regtested and bootstrapped on aarch64-none-elf and
arm-none-eabi and no issues.

Ok for trunk?

gcc/testsuite
2017-09-01  Tamar Christina  

* gcc.dg/vect/vect-reduc-dot-s8a.c
(dg-additional-options, dg-require-effective-target): Add +dotprod.
* gcc.dg/vect/vect-reduc-dot-u8a.c
(dg-additional-options, dg-require-effective-target): Add +dotprod.

-- 
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
index dc4f52019d5435edbbc811b73dee0f98ff44c1b1..c36fbcbf4693f59c2ca747aeb2d41dcd0f48f673 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-s8a.c
@@ -1,4 +1,6 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-require-effective-target arm_arch_v8a_ok { target arm*-*-* } } */
 
 #include 
 #include "tree-vect.h"
diff --git a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
index f3cc6c78c25305d91becd585be8949514ebc521c..c449103d8c8ed8d0861c7e9c231558c86d4f1b85 100644
--- a/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
+++ b/gcc/testsuite/gcc.dg/vect/vect-reduc-dot-u8a.c
@@ -1,4 +1,6 @@
 /* { dg-require-effective-target vect_int } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" { target { aarch64*-*-* || arm*-*-* } } } */
+/* { dg-require-effective-target arm_arch_v8a_ok { target arm*-*-* } } */
 
 #include 
 #include "tree-vect.h"

[PATCH][GCC][AArch64] Dot Product NEON intrinsics [Patch (6/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch adds the Adv.SIMD intrinsics for Dot product.

Dot product is available from ARMv8.2-a and onwards.
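
As a reading aid, here is a scalar sketch of what the 64-bit unsigned
intrinsics compute, as I understand the Armv8.2-a UDOT semantics: each
32-bit accumulator lane receives the sum of four byte products.  The
ref_* names and the input values are made up for this illustration and
are not part of arm_neon.h.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of vdot_u32: lane i accumulates the products of
   bytes 4*i .. 4*i+3 of A and B.  */
void
ref_vdot_u32 (uint32_t r[2], const uint8_t a[8], const uint8_t b[8])
{
  for (int lane = 0; lane < 2; lane++)
    for (int j = 0; j < 4; j++)
      r[lane] += (uint32_t) a[4 * lane + j] * b[4 * lane + j];
}

/* Scalar model of vdot_lane_u32: every accumulator lane multiplies its
   own four bytes of A by the same four-byte group of B, selected by
   INDEX (0 or 1 for a 64-bit B).  */
void
ref_vdot_lane_u32 (uint32_t r[2], const uint8_t a[8], const uint8_t b[8],
		   int index)
{
  for (int lane = 0; lane < 2; lane++)
    for (int j = 0; j < 4; j++)
      r[lane] += (uint32_t) a[4 * lane + j] * b[4 * index + j];
}

/* Self-check with made-up inputs.  */
void
check_ref_vdot (void)
{
  const uint8_t a[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
  const uint8_t b[8] = { 1, 1, 1, 1, 2, 2, 2, 2 };

  uint32_t r[2] = { 0, 0 };
  ref_vdot_u32 (r, a, b);
  assert (r[0] == 10 && r[1] == 52);

  uint32_t r0[2] = { 0, 0 };
  ref_vdot_lane_u32 (r0, a, b, 0);
  assert (r0[0] == 10 && r0[1] == 26);

  uint32_t r1[2] = { 0, 0 };
  ref_vdot_lane_u32 (r1, a, b, 1);
  assert (r1[0] == 20 && r1[1] == 52);
}
```

The signed and 128-bit forms follow the same shape with int8_t inputs
or four accumulator lanes respectively.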

Regtested and bootstrapped on aarch64-none-elf with no issues.

Ok for trunk?

gcc/
2017-09-01  Tamar Christina  

* config/aarch64/arm_neon.h (vdot_u32, vdotq_u32, vdot_s32, vdotq_s32): 
New.
(vdot_lane_u32, vdot_laneq_u32, vdotq_lane_u32, vdotq_laneq_u32): New.
(vdot_lane_s32, vdot_laneq_s32, vdotq_lane_s32, vdotq_laneq_s32): New.

gcc/testsuite/
2017-09-01  Tamar Christina  

* gcc.target/aarch64/advsimd-intrinsics/vect-dot-qi.h: New.
* gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vect-dot-s8.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vect-dot-u8.c: New.

-- 
diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
index d7b30b0e5ee6144d543d354ce9978fe9c5d5ae73..96e740f91a7fb01d201c1badf08199a2a76cb483 100644
--- a/gcc/config/aarch64/arm_neon.h
+++ b/gcc/config/aarch64/arm_neon.h
@@ -31541,6 +31541,99 @@ vminnmvq_f16 (float16x8_t __a)
 
 #pragma GCC pop_options
 
+/* AdvSIMD Dot Product intrinsics.  */
+
+#pragma GCC push_options
+#pragma GCC target ("arch=armv8.2-a+dotprod")
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_u32 (uint32x2_t __r, uint8x8_t __a, uint8x8_t __b)
+{
+  return __builtin_aarch64_udotv8qi_ (__r, __a, __b);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
+{
+  return __builtin_aarch64_udotv16qi_ (__r, __a, __b);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_s32 (int32x2_t __r, int8x8_t __a, int8x8_t __b)
+{
+  return __builtin_aarch64_sdotv8qi (__r, __a, __b);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
+{
+  return __builtin_aarch64_sdotv16qi (__r, __a, __b);
+}
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_lane_u32 (uint32x2_t __r, uint8x8_t __a, uint8x8_t __b, const int __index)
+{
+  return __builtin_aarch64_udot_lanev8qi_s (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_laneq_u32 (uint32x2_t __r, uint8x8_t __a, uint8x16_t __b,
+		const int __index)
+{
+  return __builtin_aarch64_udot_laneqv8qi_s (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_lane_u32 (uint32x4_t __r, uint8x16_t __a, uint8x8_t __b,
+		const int __index)
+{
+  return __builtin_aarch64_udot_lanev16qi_s (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline uint32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_laneq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b,
+		 const int __index)
+{
+  return __builtin_aarch64_udot_laneqv16qi_s (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_lane_s32 (int32x2_t __r, int8x8_t __a, int8x8_t __b, const int __index)
+{
+  return __builtin_aarch64_sdot_lanev8qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x2_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdot_laneq_s32 (int32x2_t __r, int8x8_t __a, int8x16_t __b, const int __index)
+{
+  return __builtin_aarch64_sdot_laneqv8qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_lane_s32 (int32x4_t __r, int8x16_t __a, int8x8_t __b, const int __index)
+{
+  return __builtin_aarch64_sdot_lanev16qi (__r, __a, __b, __index);
+}
+
+__extension__ extern __inline int32x4_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vdotq_laneq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b, const int __index)
+{
+  return __builtin_aarch64_sdot_laneqv16qi (__r, __a, __b, __index);
+}
+#pragma GCC pop_options
+
 #undef __aarch64_vget_lane_any
 
 #undef __aarch64_vdup_lane_any
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c
new file mode 100644
index ..f75503e1ef52a215b91538dad243b51d88b99c00
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-compile.c
@@ -0,0 +1,74 @@
+/* { dg-skip-if "can't compile on arm." { arm*-*-* } } */
+/* { dg-do compile } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+
+#include 
+
+/* Unsigned Dot

[PATCH][GCC][ARM][AArch64] Testsuite framework changes and execution tests [Patch (8/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch enables the execution runs for Dot product and also
adds the feature tests.

The ARMv8.2-a Dot Product instructions only support 8-bit
element vectorization.

Dot product is available from ARMv8.2-a and onwards.
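
To show where the expected values in the execution test come from,
here is a scalar sketch of the vdot_s32 case.  The P macro is copied
from the test; model_sdot is a made-up stand-in for the intrinsic so
that the arithmetic can be checked on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Same helper as in the test: four copies of each value.  */
#define P(n1, n2) n1, n1, n1, n1, n2, n2, n2, n2

/* Hypothetical scalar stand-in for vdot_s32: each 32-bit lane
   accumulates four signed byte products.  */
void
model_sdot (int32_t r[2], const int8_t a[8], const int8_t b[8])
{
  for (int lane = 0; lane < 2; lane++)
    for (int j = 0; j < 4; j++)
      r[lane] += (int32_t) a[4 * lane + j] * b[4 * lane + j];
}

/* Inputs P(1,2) and P(-2,-3) give 4 * (1 * -2) and 4 * (2 * -3).  */
void
check_sdot_vectors (void)
{
  const int8_t a[8] = { P (1, 2) };
  const int8_t b[8] = { P (-2, -3) };
  int32_t r[2] = { 0, 0 };
  model_sdot (r, a, b);
  assert (r[0] == -8 && r[1] == -24);
}
```

The unsigned case with P(1,2) and P(2,3) gives 8 and 24 by the same
arithmetic.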

Regtested and bootstrapped on aarch64-none-elf and
arm-none-eabi with no issues.

Ok for trunk?

gcc/testsuite
2017-09-01  Tamar Christina  

* lib/target-supports.exp
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache): New.
(check_effective_target_arm_v8_2a_dotprod_neon_ok): New.
(add_options_for_arm_v8_2a_dotprod_neon): New.
(check_effective_target_arm_v8_2a_dotprod_neon_hw): New.
(check_effective_target_vect_sdot_qi): New.
(check_effective_target_vect_udot_qi): New.
* gcc.target/arm/simd/vdot-exec.c: New.
* gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c: New.
* gcc/doc/sourcebuild.texi: Document arm_v8_2a_dotprod_neon.

-- 
diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
index e6313dc031ef5b2b5a72180bccf1e876812efe48..bb6fe68a460dd6a699a76953e221028a15997001 100644
--- a/gcc/doc/sourcebuild.texi
+++ b/gcc/doc/sourcebuild.texi
@@ -1678,6 +1678,17 @@ ARM target supports executing instructions from ARMv8.2 with the FP16
 extension.  Some multilibs may be incompatible with these options.
 Implies arm_v8_2a_fp16_neon_ok and arm_v8_2a_fp16_scalar_hw.
 
+@item arm_v8_2a_dotprod_neon_ok
+@anchor{arm_v8_2a_dotprod_neon_ok}
+ARM target supports options to generate instructions from ARMv8.2 with
+the Dot Product extension. Some multilibs may be incompatible with these
+options.
+
+@item arm_v8_2a_dotprod_neon_hw
+ARM target supports executing instructions from ARMv8.2 with the Dot
+Product extension. Some multilibs may be incompatible with these options.
+Implies arm_v8_2a_dotprod_neon_ok.
+
 @item arm_prefer_ldrd_strd
 ARM target prefers @code{LDRD} and @code{STRD} instructions over
 @code{LDM} and @code{STM} instructions.
@@ -2269,6 +2280,11 @@ supported by the target; see the
 @ref{arm_v8_2a_fp16_neon_ok,,arm_v8_2a_fp16_neon_ok} effective target
 keyword.
 
+@item arm_v8_2a_dotprod_neon
+Add options for ARMv8.2 with Adv.SIMD Dot Product support, if this is
+supported by the target; see the
+@ref{arm_v8_2a_dotprod_neon_ok} effective target keyword.
+
 @item bind_pic_locally
 Add the target-specific flags needed to enable functions to bind
 locally when using pic/PIC passes in the testsuite.
diff --git a/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
new file mode 100644
index ..96d7f0ebc4fd89e966a17b2d7bb6b17e4b511c67
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/advsimd-intrinsics/vdot-exec.c
@@ -0,0 +1,75 @@
+/* { dg-skip-if "can't compile on arm." { arm*-*-* } } */
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -march=armv8.2-a+dotprod" } */
+/* { dg-require-effective-target arm_v8_2a_dotprod_neon_hw } */
+
+#include 
+
+extern void abort();
+
+#define P(n1,n2) n1,n1,n1,n1,n2,n2,n2,n2
+#define ARR(nm, p, ty, ...) ty nm##_##p = { __VA_ARGS__ }
+#define TEST(t1, t2, t3, f, r1, r2, n1, n2) \
+	ARR(f, x, t1, r1);		\
+	ARR(f, y, t2, r2);		\
+	t3 f##_##r = {0};		\
+	f##_##r = f (f##_##r, f##_##x, f##_##y);  \
+	if (f##_##r[0] != n1 || f##_##r[1] != n2)   \
+	  abort ();
+
+#define TEST_LANE(t1, t2, t3, f, r1, r2, n1, n2, n3, n4) \
+	ARR(f, x, t1, r1);		\
+	ARR(f, y, t2, r2);		\
+	t3 f##_##rx = {0};		\
+	f##_##rx = f (f##_##rx, f##_##x, f##_##y, 0);  \
+	if (f##_##rx[0] != n1 || f##_##rx[1] != n2)   \
+	  abort ();\
+	t3 f##_##rx1 = {0};			\
+	f##_##rx1 = f (f##_##rx1, f##_##x, f##_##y, 1);  \
+	if (f##_##rx1[0] != n3 || f##_##rx1[1] != n4)   \
+	  abort ();
+
+#define Px(n1,n2,n3,n4) P(n1,n2),P(n3,n4)
+#define TEST_LANEQ(t1, t2, t3, f, r1, r2, n1, n2, n3, n4, n5, n6, n7, n8) \
+	ARR(f, x, t1, r1);		\
+	ARR(f, y, t2, r2);		\
+	t3 f##_##rx = {0};		\
+	f##_##rx = f (f##_##rx, f##_##x, f##_##y, 0);  \
+	if (f##_##rx[0] != n1 || f##_##rx[1] != n2)   \
+	  abort ();\
+	t3 f##_##rx1 = {0};			\
+	f##_##rx1 = f (f##_##rx1, f##_##x, f##_##y, 1);  \
+	if (f##_##rx1[0] != n3 || f##_##rx1[1] != n4)   \
+	  abort (); \
+	t3 f##_##rx2 = {0};\
+	f##_##rx2 = f (f##_##rx2, f##_##x, f##_##y, 2);  \
+	if (f##_##rx2[0] != n5 || f##_##rx2[1] != n6)   \
+	  abort ();\
+	t3 f##_##rx3 = {0};			\
+	f##_##rx3 = f (f##_##rx3, f##_##x, f##_##y, 3);  \
+	if (f##_##rx3[0] != n7 || f##_##rx3[1] != n8)   \
+	  abort ();
+
+int
+main()
+{
+  TEST (uint8x8_t, uint8x8_t, uint32x2_t, vdot_u32, P(1,2), P(2,3), 8, 24);
+  TEST (int8x8_t, int8x8_t, int32x2_t, vdot_s32, P(1,2), P(-2,-3), -8, -24);
+
+  TEST (uint8x16_t, uint8x16_t, uint32x4_t, vdotq_u32, P(1,2), P(2,3), 8, 24);
+  TEST (int8x16_t, int8x16_t, int32x4_t, vdotq_s32, P(1,2), P(-2,-3), -8, -24);
+
+  TEST_LANE (uint8x8_t, uint8x8_t, uint32x2_t, vd

[PATCH][GCC][AArch64] Dot Product SIMD patterns [Patch (5/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch adds the instructions for Dot Product to AArch64 along
with the intrinsics and vectorizer pattern.

Armv8.2-a dot product supports 8-bit element values both
signed and unsigned.

Dot product is available from Armv8.2-a and onwards.

Regtested and bootstrapped on aarch64-none-elf with no issues.

Ok for trunk?

gcc/
2017-09-01  Tamar Christina  

* config/aarch64/aarch64-builtins.c
(aarch64_types_quadopu_lane_qualifiers): New.
(TYPES_QUADOPU_LANE): New.
* config/aarch64/aarch64-simd.md (aarch64_dot): New.
(dot_prod, aarch64_dot_lane): New.
(aarch64_dot_laneq): New.
* config/aarch64/aarch64-simd-builtins.def (sdot, udot): New.
(sdot_lane, udot_lane, sdot_laneq, udot_laneq): New.
* config/aarch64/iterators.md (UNSPEC_SDOT, UNSPEC_UDOT): New.
(DOT_MODE, dot_mode, Vdottype, DOTPROD): New.
(sur): Add SDOT and UDOT.

-- 
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index d30009ba441cabc511ddbc821379daae6de09fa2..a1b598c3da29ca791c261ca8a6f918573a818974 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -168,6 +168,11 @@ aarch64_types_quadop_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_none,
   qualifier_none, qualifier_lane_index };
 #define TYPES_QUADOP_LANE (aarch64_types_quadop_lane_qualifiers)
+static enum aarch64_type_qualifiers
+aarch64_types_quadopu_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+  qualifier_unsigned, qualifier_lane_index };
+#define TYPES_QUADOPU_LANE (aarch64_types_quadopu_lane_qualifiers)
 
 static enum aarch64_type_qualifiers
 aarch64_types_binop_imm_p_qualifiers[SIMD_MAX_BUILTIN_ARGS]
diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def b/gcc/config/aarch64/aarch64-simd-builtins.def
index d713d5d8b88837ec6f2dc51188fb252f8d5bc8bd..52d01342372e518b1238ea14097e8f0574e9a605 100644
--- a/gcc/config/aarch64/aarch64-simd-builtins.def
+++ b/gcc/config/aarch64/aarch64-simd-builtins.def
@@ -205,6 +205,14 @@
   BUILTIN_VSDQ_I_DI (BINOP, srshl, 0)
   BUILTIN_VSDQ_I_DI (BINOP_UUS, urshl, 0)
 
+  /* Implemented by aarch64_{_lane}{q}.  */
+  BUILTIN_VB (TERNOP, sdot, 0)
+  BUILTIN_VB (TERNOPU, udot, 0)
+  BUILTIN_VB (QUADOP_LANE, sdot_lane, 0)
+  BUILTIN_VB (QUADOPU_LANE, udot_lane, 0)
+  BUILTIN_VB (QUADOP_LANE, sdot_laneq, 0)
+  BUILTIN_VB (QUADOPU_LANE, udot_laneq, 0)
+
   BUILTIN_VDQ_I (SHIFTIMM, ashr, 3)
   VAR1 (SHIFTIMM, ashr_simd, 0, di)
   BUILTIN_VDQ_I (SHIFTIMM, lshr, 3)
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index f3e084f8778d70c82823b92fa80ff96021ad26db..21d46c84ab317c2d62afdf8c48117886aaf483b0 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -386,6 +386,87 @@
 }
 )
 
+;; These instructions map to the __builtins for the Dot Product operations.
+(define_insn "aarch64_dot"
+  [(set (match_operand:VS 0 "register_operand" "=w")
+	(unspec:VS [(match_operand:VS 1 "register_operand" "0")
+		(match_operand: 2 "register_operand" "w")
+		(match_operand: 3 "register_operand" "w")]
+		DOTPROD))]
+  "TARGET_DOTPROD"
+  "dot\\t%0., %2., %3."
+  [(set_attr "type" "neon_dot")]
+)
+
+;; These expands map to the Dot Product optab the vectorizer checks for.
+;; The auto-vectorizer expects a dot product builtin that also does an
+;; accumulation into the provided register.
+;; Given the following pattern
+;;
+;; for (i=0; idot_prod"
+  [(set (match_operand:VS 0 "register_operand")
+	(unspec:VS [(match_operand: 1 "register_operand")
+		(match_operand: 2 "register_operand")
+		(match_operand:VS 3 "register_operand")]
+		DOTPROD))]
+  "TARGET_DOTPROD"
+{
+  emit_insn (
+gen_aarch64_dot (operands[3], operands[3], operands[1],
+operands[2]));
+  emit_insn (gen_rtx_SET (operands[0], operands[3]));
+  DONE;
+})
+
+;; These instructions map to the __builtins for the Dot Product
+;; indexed operations.
+(define_insn "aarch64_dot_lane"
+  [(set (match_operand:VS 0 "register_operand" "=w")
+	(unspec:VS [(match_operand:VS 1 "register_operand" "0")
+		(match_operand: 2 "register_operand" "w")
+		(match_operand:V8QI 3 "register_operand" "")
+		(match_operand:SI 4 "immediate_operand" "i")]
+		DOTPROD))]
+  "TARGET_DOTPROD"
+  {
+operands[4]
+  = GEN_INT (ENDIAN_LANE_N (V8QImode, INTVAL (operands[4])));
+return "dot\\t%0., %2., %3.4b[%4]";
+  }
+  [(set_attr "type" "neon_dot")]
+)
+
+(define_insn "aarch64_dot_laneq"
+  [(set (match_operand:VS 0 "register_operand" "=w")
+	(unspec:VS [(match_operand:VS 1 "register_operand" "0")
+		(match_operand: 2 "register_operand" "w")
+		(match_operand:V16QI 3 "register_operand" "")
+		(match_operand:SI 4 "immediate_operand" "i")]
+		DOTPROD))]
+  "TARGET_DOTPROD"
+  {
+operands[4]
+  = GEN_INT (ENDIAN_LANE_N (V16QImode,

[PATCH] fix matching of vxworks7 in libgcc's configuration

2017-09-01 Thread Olivier Hainque


Part of the VxWorks 7 configuration specificities is triggered by this piece
in libgcc/config.host:

  # Common parts for widely ported systems.
  case ${host} in
  ...
  *-*-vxworks7)
tmake_file=t-vxworks7
;;

For the powerpc family of targets, this should apply to SPE targets as well.

However, these are typically canonicalized as *-vxworks7spe and the case filter
above fails to match that.

This causes a build failure of libgcc when configured for e500v2-wrs-vxworks,
complaining that semLib.h could not be found while compiling vxlib-tls.c,
despite VSB_DIR being set and the sequence of -I that should allow finding it
in t-vxworks7.

Indeed, we mistakenly use t-vxworks instead of t-vxworks7, and the former
relies on a different environment variable to locate the header files (as
expected for versions of VxWorks prior to 7).

The attached patch fixes this by adding a wildcard after vxworks7 in the
case filter quoted above.
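
The effect of the wildcard can be checked outside of configure.
config.host is a shell script, so the sketch below is only an analogy:
it uses POSIX fnmatch, which follows the same glob notation for
patterns this simple.  The two host names are hypothetical examples,
not taken from a real configure run.

```c
#include <assert.h>
#include <fnmatch.h>

/* Return nonzero when HOST matches the glob PATTERN in full.  */
int
host_matches (const char *pattern, const char *host)
{
  return fnmatch (pattern, host, 0) == 0;
}

void
check_vxworks7_patterns (void)
{
  /* Hypothetical canonical names, for illustration only.  */
  const char *plain = "powerpc-wrs-vxworks7";
  const char *spe = "powerpc-wrs-vxworks7spe";

  /* Old filter: the trailing "spe" prevents a match.  */
  assert (host_matches ("*-*-vxworks7", plain));
  assert (!host_matches ("*-*-vxworks7", spe));

  /* Widened filter: both names match.  */
  assert (host_matches ("*-*-vxworks7*", plain));
  assert (host_matches ("*-*-vxworks7*", spe));
}
```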

Committing to mainline, after checking that it has the intended effect and
lets the build proceed further.

Olivier

2017-09-01  Olivier Hainque  

libgcc/
* config.host (*-*-vxworks7): Widen scope to vxworks7*.



libgcc-vx7.diff
Description: Binary data

[PATCH][GCC][ARM] Dot Product NEON patterns [Patch (2/8)]

2017-09-01 Thread Tamar Christina

Hi All,

This patch adds the instructions for Dot Product to ARM along
with the intrinsics and vectorizer pattern.

Armv8.2-a dot product supports 8-bit element values both
signed and unsigned.

Dot product is available from Armv8.2-a and onwards.

Regtested and bootstrapped on arm-none-eabi with no issues.

Ok for trunk?

gcc/
2017-09-01  Tamar Christina  

* config/arm/arm-builtins.c (arm_unsigned_uternop_qualifiers): New.
(UTERNOP_QUALIFIERS, arm_umac_lane_qualifiers, UMAC_LANE_QUALIFIERS): 
New.
* config/arm/arm_neon_builtins.def (sdot, udot, sdot_lane, udot_lane): 
New.
* config/arm/iterators.md (DOTPROD, DOT_MODE, dot_mode): New.
(UNSPEC_DOT_S, UNSPEC_DOT_U, opsuffix): New.
* config/arm/neon.md (neon_dot): New.
(neon_dot_lane, dot_prod): New.
* config/arm/types.md (neon_dot, neon_dot_q): New.
* config/arm/unspecs.md (UNSPEC_DOT_S, UNSPEC_DOT_U): New.

-- 
diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 7504ed581c63a657a0dff48442633704bd252b2e..467596c1dbefe62cc92a0ffb8a03ecaf950f3701 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -105,6 +105,13 @@ arm_ternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   = { qualifier_none, qualifier_none, qualifier_none, qualifier_none };
 #define TERNOP_QUALIFIERS (arm_ternop_qualifiers)
 
+/* unsigned T (unsigned T, unsigned T, unsigned T).  */
+static enum arm_type_qualifiers
+arm_unsigned_uternop_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+  qualifier_unsigned };
+#define UTERNOP_QUALIFIERS (arm_unsigned_uternop_qualifiers)
+
 /* T (T, immediate).  */
 static enum arm_type_qualifiers
 arm_binop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
@@ -131,6 +138,13 @@ arm_mac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
   qualifier_none, qualifier_lane_index };
 #define MAC_LANE_QUALIFIERS (arm_mac_lane_qualifiers)
 
+/* unsigned T (unsigned T, unsigned T, unsigned T, lane index).  */
+static enum arm_type_qualifiers
+arm_umac_lane_qualifiers[SIMD_MAX_BUILTIN_ARGS]
+  = { qualifier_unsigned, qualifier_unsigned, qualifier_unsigned,
+  qualifier_unsigned, qualifier_lane_index };
+#define UMAC_LANE_QUALIFIERS (arm_umac_lane_qualifiers)
+
 /* T (T, T, immediate).  */
 static enum arm_type_qualifiers
 arm_ternop_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
diff --git a/gcc/config/arm/arm_neon_builtins.def b/gcc/config/arm/arm_neon_builtins.def
index 07f0368343a0c940c1cc1848d31f28a47a587b6f..982eec810dafb5ec955273099853f8842020d104 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -331,3 +331,7 @@ VAR11 (STORE1, vst4,
 	v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf)
 VAR9 (STORE1LANE, vst4_lane,
 	v8qi, v4hi, v4hf, v2si, v2sf, v8hi, v8hf, v4si, v4sf)
+VAR2 (TERNOP, sdot, v8qi, v16qi)
+VAR2 (UTERNOP, udot, v8qi, v16qi)
+VAR2 (MAC_LANE, sdot_lane, v8qi, v16qi)
+VAR2 (UMAC_LANE, udot_lane, v8qi, v16qi)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index 7acbaf1bb40a4f270e75968804546508f7839e49..139e09fd929e17216ad9383505f1453a73d071fb 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -410,6 +410,8 @@
 
 (define_int_iterator VFM_LANE_AS [UNSPEC_VFMA_LANE UNSPEC_VFMS_LANE])
 
+(define_int_iterator DOTPROD [UNSPEC_DOT_S UNSPEC_DOT_U])
+
 ;;
 ;; Mode attributes
 ;;
@@ -720,6 +722,10 @@
 
 (define_mode_attr pf [(V8QI "p") (V16QI "p") (V2SF "f") (V4SF "f")])
 
+;; Mapping attribute for Dot Product input modes based on result mode.
+(define_mode_attr DOT_MODE [(V2SI "V8QI") (V4SI "V16QI")])
+(define_mode_attr dot_mode [(V2SI "v8qi") (V4SI "v16qi")])
+
 ;;
 ;; Code attributes
 ;;
@@ -816,6 +822,7 @@
   (UNSPEC_VSRA_S_N "s") (UNSPEC_VSRA_U_N "u")
   (UNSPEC_VRSRA_S_N "s") (UNSPEC_VRSRA_U_N "u")
   (UNSPEC_VCVTH_S "s") (UNSPEC_VCVTH_U "u")
+  (UNSPEC_DOT_S "s") (UNSPEC_DOT_U "u")
 ])
 
 (define_int_attr vcvth_op
@@ -1003,3 +1010,6 @@
 
 (define_int_attr mrrc [(VUNSPEC_MRRC "mrrc") (VUNSPEC_MRRC2 "mrrc2")])
 (define_int_attr MRRC [(VUNSPEC_MRRC "MRRC") (VUNSPEC_MRRC2 "MRRC2")])
+
+(define_int_attr opsuffix [(UNSPEC_DOT_S "s8")
+			   (UNSPEC_DOT_U "u8")])
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index 45b3bd18052dd4a33e4b9c10f3ca2ea7e2eed5ce..4c919f53757ada3ca40609b8244ed3b403be3329 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -3044,6 +3044,71 @@
   DONE;
 })
 
+;; These instructions map to the __builtins for the Dot Product operations.
+(define_insn "neon_dot"
+  [(set (match_operand:VCVTI 0 "register_operand" "=w")
+	(unspec:VCVTI [(match_operand:VCVTI 1 "r

Re: Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread dbroemmel

> If you really need a testcase, it would be enough to do something like:
>   use omp_lib
>   !$omp parallel num_threads(2)
>   int2 = omp_get_thread_num ()
>   !$omp barrier
>   if (int2 != omp_get_thread_num ()) call abort
>   !$omp end parallel
> or so to ensure it has the threadprivate property by writing something
> different to it in each thread and after barrier verifying it has the
> expected value in each thread.
I'm more than fine with the short compile-only testcase. I'm pretty sure
my largish runtime test doesn't get near covering all relevant aspects
of the THREADPRIVATE directive for common blocks. Also, the fix is for
this reject-valid parsing error, so not really to do with anything else,
so perhaps shouldn't test anything else?

Dirk

Re: C++ PATCH for c++/80767, unnecessary instantiation of generic lambda

2017-09-01 Thread David Edelsohn

On Thu, Aug 31, 2017 at 1:16 PM, Jason Merrill  wrote:
> On Thu, Aug 31, 2017 at 9:37 AM, David Edelsohn  wrote:
>> Jason,
>>
>> The recent patch for 82030
>>
>> PR c++/82030 - ICE inheriting from multiple lambdas
>>
>> PR c++/80767
>> * call.c (compare_ics): Handle null candidate.
>>
>> causes bootstrap to fail for AIX due to a segfault ICE building
>> libstdc++ with stage1 compiler.
>
> Hmm, strange.  Testcase?

I cannot reproduce the bootstrap failure with your patch, so I cannot
explain it. I hope that it is not the beginning of a hardware failure
in the GNU Farm AIX system.

- David

Re: [PATCH] Add UBSAN_{PTR,BOUNDS} folding (PR sanitizer/81981)

2017-09-01 Thread Jakub Jelinek

On Fri, Sep 01, 2017 at 02:32:43PM +0200, Richard Biener wrote:
> On September 1, 2017 1:16:54 PM GMT+02:00, Jakub Jelinek  
> wrote:
> >Hi!
> >
> >This patch fixes the following testcase by folding some ubsan internal
> >fns
> >we'd either remove anyway during sanopt, or lower into if (cond)
> >do_something during sanopt where cond would be always false.
> >
> >Additionally, I've tried to clean up a little bit IFN_UBSAN_OBJECT_SIZE
> >handling by using variables for the call arguments that make it clear
> >what the arguments are.
> >
> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> I think there's a helper for replace - with-nop. 

Can't find it.
gimplify_and_update_call_from_tree has to add it, but I'd need
to call it with something that gimplifies into empty sequence (void_node?)
and it would still go through 
push_gimplify_context/gimplify_and_add/pop_gimplify_context
etc., which looks quite expensive.

> >2017-09-01  Jakub Jelinek  
> >
> > PR sanitizer/81981
> > * gimple-fold.c (gimple_fold_call): Optimize away useless UBSAN_PTR
> > and UBSAN_BOUNDS internal calls.  Clean up IFN_UBSAN_OBJECT_SIZE
> > handling.
> >
> > * gcc.dg/ubsan/pr81981.c: New test.
> >
> >--- gcc/gimple-fold.c.jj 2017-08-10 02:31:21.0 +0200
> >+++ gcc/gimple-fold.c2017-08-29 18:50:49.993673432 +0200
> >@@ -3938,11 +3938,23 @@ gimple_fold_call (gimple_stmt_iterator *
> > gimple_call_arg (stmt, 2));
> >   break;
> > case IFN_UBSAN_OBJECT_SIZE:
> >-  if (integer_all_onesp (gimple_call_arg (stmt, 2))
> >-  || (TREE_CODE (gimple_call_arg (stmt, 1)) == INTEGER_CST
> >-  && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST
> >-  && tree_int_cst_le (gimple_call_arg (stmt, 1),
> >-  gimple_call_arg (stmt, 2
> >+  {
> >+tree offset = gimple_call_arg (stmt, 1);
> >+tree objsize = gimple_call_arg (stmt, 2);
> >+if (integer_all_onesp (objsize)
> >+|| (TREE_CODE (offset) == INTEGER_CST
> >+&& TREE_CODE (objsize) == INTEGER_CST
> >+&& tree_int_cst_le (offset, objsize)))
> >+  {
> >+gsi_replace (gsi, gimple_build_nop (), false);
> >+unlink_stmt_vdef (stmt);
> >+release_defs (stmt);
> >+return true;
> >+  }
> >+  }
> >+  break;
> >+case IFN_UBSAN_PTR:
> >+  if (integer_zerop (gimple_call_arg (stmt, 1)))
> > {
> >   gsi_replace (gsi, gimple_build_nop (), false);
> >   unlink_stmt_vdef (stmt);
> >@@ -3950,6 +3962,25 @@ gimple_fold_call (gimple_stmt_iterator *
> >   return true;
> > }
> >   break;
> >+case IFN_UBSAN_BOUNDS:
> >+  {
> >+tree index = gimple_call_arg (stmt, 1);
> >+tree bound = gimple_call_arg (stmt, 2);
> >+if (TREE_CODE (index) == INTEGER_CST
> >+&& TREE_CODE (bound) == INTEGER_CST)
> >+  {
> >+index = fold_convert (TREE_TYPE (bound), index);
> >+if (TREE_CODE (index) == INTEGER_CST
> >+&& tree_int_cst_le (index, bound))
> >+  {
> >+gsi_replace (gsi, gimple_build_nop (), false);
> >+unlink_stmt_vdef (stmt);
> >+release_defs (stmt);
> >+return true;
> >+  }
> >+  }
> >+  }
> >+  break;
> > case IFN_GOACC_DIM_SIZE:
> > case IFN_GOACC_DIM_POS:
> >   result = fold_internal_goacc_dim (stmt);
> >--- gcc/testsuite/gcc.dg/ubsan/pr81981.c.jj  2017-08-29
> >18:54:33.826107761 +0200
> >+++ gcc/testsuite/gcc.dg/ubsan/pr81981.c 2017-08-29 18:55:36.721386827
> >+0200
> >@@ -0,0 +1,21 @@
> >+/* PR sanitizer/81981 */
> >+/* { dg-do compile } */
> >+/* { dg-options "-O2 -Wmaybe-uninitialized -fsanitize=undefined
> >-ffat-lto-objects" } */
> >+
> >+int v;
> >+
> >+int
> >+foo (int i)
> >+{
> >+  int t[1], u[1];
> >+  int n = 0;
> >+
> >+  if (i)
> >+{
> >+  t[n] = i;
> >+  u[0] = i;
> >+}
> >+
> >+  v = u[0]; /* { dg-warning "may be used uninitialized in this
> >function" } */
> >+  return t[0];  /* { dg-warning "may be used uninitialized in 
> >this
> >function" } */
> >+}
> >
> > Jakub

Jakub

Re: Patch for [Bug fortran/81841] [5/6/7/8 Regression] THREADPRIVATE (OpenMP) wrongly rejected in BLOCK DATA

2017-09-01 Thread Jakub Jelinek

On Fri, Sep 01, 2017 at 03:47:10PM +0200, dbroemmel wrote:
> > If you really need a testcase, it would be enough to do something like:
> >   use omp_lib
> >   !$omp parallel num_threads(2)
> >   int2 = omp_get_thread_num ()
> >   !$omp barrier
> >   if (int2 != omp_get_thread_num ()) call abort
> >   !$omp end parallel
> > or so to ensure it has the threadprivate property by writing something
> > different to it in each thread and after barrier verifying it has the
> > expected value in each thread.
> I'm more than fine with the short compile-only testcase. I'm pretty sure
> my largish runtime test doesn't get near covering all relevant aspects
> of the THREADPRIVATE directive for common blocks. Also, the fix is for
> this reject-valid parsing error, so not really to do with anything else,
> so perhaps shouldn't test anything else?

Yes, I said initially a compile time testcase is just fine for me.

Jakub

PR82045: Avoid passing machine modes through "..."

2017-09-01 Thread Richard Sandiford

PR82045 is about a bootstrap failure on sparc-sun-solaris2.11.
The problem was that we were passing the new machine_mode wrapper
classes through "..."  to emit_library_call(_value), which then
read them back as ints instead.

The simplest fix seemed to be replace "..." with an array of
rtx_mode_ts, then provide wrappers for the common cases.  This
bulks out rtl.h a bit, but it does make things a bit more typesafe.

AFAICT this is the only place where we passed machine_modes this way.
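
The shape of the change can be sketched in plain C.  GCC itself is C++
and the real interface uses rtx_mode_t; every name below (struct
arg_mode, emit_call_1 and the wrappers) is made up for illustration,
and the arithmetic in the worker is only a placeholder for the real
work.

```c
#include <assert.h>

/* Hypothetical stand-in for an (operand, mode) pair.  */
struct arg_mode
{
  long value;
  int mode;
};

/* Worker: takes an explicit array, so each element keeps its declared
   type instead of being passed through "...".  */
long
emit_call_1 (int nargs, const struct arg_mode *args)
{
  long sum = 0;
  for (int i = 0; i < nargs; i++)
    sum += args[i].value * args[i].mode;  /* placeholder computation */
  return sum;
}

/* Thin wrappers for the common argument counts.  */
long
emit_call_one (long v1, int m1)
{
  struct arg_mode args[] = { { v1, m1 } };
  return emit_call_1 (1, args);
}

long
emit_call_two (long v1, int m1, long v2, int m2)
{
  struct arg_mode args[] = { { v1, m1 }, { v2, m2 } };
  return emit_call_1 (2, args);
}

void
check_emit_call (void)
{
  assert (emit_call_one (3, 2) == 6);
  assert (emit_call_two (3, 2, 4, 5) == 26);
}
```

With this layout the callee never has to guess the type of an
argument, which is the property the varargs version lacked.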

Tested so far on aarch64-linux-gnu, powerpc64le-linux-gnu and
x86_64-linux-gnu.  Rainer also tested an earlier version on
sparc-sun-solaris2.11 (thanks).  Some multi-target testing is
still in progress.  OK to install if the remaining tests pass?

Richard


2017-09-01  Richard Sandiford  

gcc/
PR bootstrap/82045
* rtl.h (emit_library_call_value_1): Declare.
(emit_library_call): Replace declaration with a series of overloads.
Remove the parameter count argument.
(emit_library_call_value): Likewise.
* calls.c (emit_library_call_value_1): Make global.  Replace varargs
with an "rtx_mode_t *".
(emit_library_call_value): Delete.
(emit_library_call): Likewise.
* asan.c (asan_emit_stack_protection): Update calls accordingly.
(asan_emit_allocas_unpoison): Likewise.
* builtins.c (expand_builtin_powi): Likewise.
(expand_asan_emit_allocas_unpoison): Likewise.
* cfgexpand.c (expand_main_function): Likewise.
* config/aarch64/aarch64.c (aarch64_trampoline_init): Likewise.
* config/aarch64/aarch64.h (PROFILE_HOOK): Likewise.
* config/alpha/alpha.c (alpha_trampoline_init): Likewise.
* config/arm/arm.c (arm_trampoline_init): Likewise.
(arm_call_tls_get_addr): Likewise.
(arm_expand_divmod_libfunc): Likewise.
* config/bfin/bfin.md (umulsi3_highpart): Likewise.
(smulsi3_highpart): Likewise.
* config/c6x/c6x.c (c6x_initialize_trampoline): Likewise.
(c6x_expand_compare): Likewise.
(c6x_expand_movmem): Likewise.
* config/frv/frv.c (frv_trampoline_init): Likewise.
* config/i386/i386.c (ix86_trampoline_init): Likewise.
(ix86_expand_divmod_libfunc): Likewise.
* config/ia64/ia64.c (ia64_expand_tls_address): Likewise.
(ia64_expand_compare): Likewise.
(ia64_profile_hook): Likewise.
* config/ia64/ia64.md (save_stack_nonlocal): Likewise.
(nonlocal_goto): Likewise.
(restore_stack_nonlocal): Likewise.
* config/m32r/m32r.c (block_move_call): Likewise.
(m32r_trampoline_init): Likewise.
* config/m68k/linux.h (FINALIZE_TRAMPOLINE): Likewise.
* config/m68k/m68k.c (m68k_call_tls_get_addr): Likewise.
(m68k_call_m68k_read_tp): Likewise.
* config/microblaze/microblaze.c (microblaze_call_tls_get_addr)
(microblaze_expand_divide): Likewise.
* config/mips/mips.h (mips_args): Likewise.
* config/mips/sdemtk.h (mips_sync_icache): Likewise.
(MIPS_ICACHE_SYNC): Likewise.
* config/nios2/nios2.c (nios2_emit_expensive_div): Likewise.
(nios2_trampoline_init): Likewise.
* config/pa/pa.c (hppa_tls_call): Likewise.
(pa_trampoline_init): Likewise.
* config/pa/pa.md (canonicalize_funcptr_for_compare): Likewise.
* config/powerpcspe/powerpcspe.c (rs6000_legitimize_tls_address)
(expand_strn_compare): Likewise.
(rs6000_generate_compare): Likewise.
(rs6000_expand_float128_convert): Likewise.
(output_profile_hook): Likewise.
(rs6000_trampoline_init): Likewise.
* config/powerpcspe/powerpcspe.md (neg2): Likewise.
* config/riscv/riscv.h (PROFILE_HOOK): Likewise.
* config/rs6000/rs6000-string.c (expand_strn_compare): Likewise.
* config/rs6000/rs6000.c (rs6000_legitimize_tls_address): Likewise.
(rs6000_generate_compare): Likewise.
(rs6000_expand_float128_convert): Likewise.
(output_profile_hook): Likewise.
(rs6000_trampoline_init): Likewise.
* config/rs6000/rs6000.md (neg2): Likewise.
* config/sh/sh.c (sh_trampoline_init): Likewise.
* config/sparc/sparc.c (emit_soft_tfmode_libcall): Likewise.
(sparc_emit_float_lib_cmp): Likewise.
(sparc32_initialize_trampoline): Likewise.
(sparc64_initialize_trampoline): Likewise.
(sparc_profile_hook): Likewise.
* config/spu/spu.c (ea_load_store): Likewise.
* config/spu/spu.md (floatunssidf2): Likewise.
* config/tilegx/tilegx.c (tilegx_trampoline_init): Likewise.
* config/tilepro/tilepro.c (tilepro_trampoline_init): Likewise.
* config/visium/visium.c (expand_block_move_4): Likewise.
(expand_block_move_2): Likewise.
(expand_block_move_1): Likewise.
(expand_block_set_4): Likewise.
(expand_block_set_2): Likewise.
(expand_block_set_1): Li

Re: [PING #2] [PATCH] enhance -Wrestrict to handle string built-ins (PR 78918)

2017-09-01 Thread Jeff Law

On 08/28/2017 06:27 PM, Martin Sebor wrote:
>> Correct.  I wound my way through this mess a while back.  Essentially
>> Red Hat had a customer with code that had overlapped memcpy arguments.
>> We had them use the memstomp interposition library to start tracking
>> these problems down.
>>
>> One of the things that popped up was structure/class copies which were
>> implemented via calls to memcpy.  In the case of self assignment, the
>> interposition library would note the overlap and (rightly IMHO) complain.
> 
> Is this bug 32667?  I'm not having any luck reproducing it with
> any of the test cases there and varying struct sizes, or with
> the test case in the duplicate bug 65029 I filed for the same
> thing last year.  It would be nice to have a test case.
It certainly seems to be in the same class as 32667.  I'm not sure if
it's exactly the same root cause in terms of the front end interactions,
but the net result is the same.

I just reviewed the discussion in my outbox -- I didn't see sample code
or a reference to 32667, but I'm only looking at one side of the
conversation.


> 
>> One could argue that GCC should emit memmove by default for structure
>> assignments, only using memcpy when it knows it's not doing self
>> assignment (which may be hard to determine).  Furthermore, GCC should
>> eliminate self structure/class assignment.
> 
> If it's still a problem emitting memmove seems like the right
> thing to do.  From what I've read the performance advantage of
> memcpy over memmove seems debatable at best.  Most performance
> sensitive code avoids making copies of very large objects so
> the only code that can be impacted doesn't care about efficiency
> quite so much.  For small enough objects, inlining the copy as
> GCC already does would obviate the efficiency concern altogether.
I personally feel we reached the point of diminishing returns years ago
--  in attempts to inline/unroll the copies in GCC, in terms of glibc's
copier implementations and in the gain of using memcpy over memmove vs
the pain of folks that can't be bothered to do it right WRT overlapped
copies.

But that's a bit of a digression.  The issue at hand is structure
assignments that GCC is turning into memcpys.  That's a much smaller
subset of calls to memcpy and more likely to not be a performance issue
to change.


>> A self-copy should just be folded away.  It's no different than x = x on
>> scalars except that we've dropped it to a memcpy in the IL.  Doing so
>> makes the code more efficient and removes false positives from tools
>> like the memstomp interposition library, making those tools more useful.
> 
> It's possible to do in simple cases but not in general.  I agree
> that in the general case when overlap is possible the only safe
> solution, short of actually testing for it at runtime, is to call
> memmove.
So let's zap those where we know it's a self copy.  I'd look favorably on
a patch that uses memmove on the others, but I won't immediately approve
it without wider buy-in as I expect it could well be controversial.
These should be separate patches as the former should go forward
immediately.

Jeff
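
To make the two proposals above concrete, here is a minimal sketch of how a structure assignment could be lowered: drop the copy when source and destination are known to be the same object, and fall back to memmove when overlap cannot be ruled out.  The type struct big and the helper assign_struct are hypothetical names for illustration only; this is not GCC code and not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical aggregate, large enough that a compiler would likely
   copy it with a library call rather than inline moves.  */
struct big { int v[64]; };

/* Hypothetical helper: copy *SRC to *DST.  Returns 0 when the copy was
   elided as a self-assignment and 1 when bytes were actually moved.  */
static int
assign_struct (struct big *dst, const struct big *src)
{
  assert (dst != NULL && src != NULL);

  /* A self-copy is a no-op, just like x = x on scalars.  */
  if (dst == src)
    return 0;

  /* Overlap cannot be ruled out here, so memmove is the safe choice;
     memcpy would be undefined for overlapping objects.  */
  memmove (dst, src, sizeof *dst);
  return 1;
}
```

With this shape an interposition tool never sees an overlapping memcpy for `*p = *q` when p == q, which is the false positive discussed above.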

[PATCH][committed] Always initialize hash table elements

2017-09-01 Thread Jeff Law


The recently added code to try and optimize certain binary operations
when both operands were equal had a subtle bug.

Specifically, when that optimization applied, it failed to initialize
the hash table element returned from the failed lookup.  That hash
table element would be kept in the empty/uninitialized state which has
special meaning...

Under the right set of circumstances that could result in items being in
the hash table that we couldn't look up (we could find them via a
traversal though).

The sequence of events from the BZ testcase looks like this:

Enter object1 into slot X
Enter object2 into slot Y (after rehashing due to conflict)

Y must be < X to trigger the bug because the next step requires a
reallocation of the table.  When we reallocate we walk through the old
table from first to last and insert the objects into the new table.

That results in object2 going into slot X and object1 going into slot Y
in the new table.

Then we remove object2 from the hash table.  This leaves slot X in a
deleted state.

Then we look up object3 and get back slot X, which will be put into an
empty state.  Since we didn't get a hit in the table, we try to look up
an alternate form that would allow us to prove object3 has a constant
value.  That succeeds and we never re-initialize slot X, leaving it in
the empty state.

The empty state is important as it also means uninitialized.  The
generic bits of hash-table.h (reasonably) assume that a slot in the
empty/uninitialized state implies that no other objects with a
conflicting hash are in the table.

So when we try to look up object1 in the table, it misses because slot X
is uninitialized/empty -- which triggers the assert as we know object1
must be in the hash table.

[ Yes, I could almost certainly trigger this in a simpler way. ]

The fix is to ensure we initialize the slot properly after a miss, but
when we are able to prove the expression has a constant value.
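
The failure mode can be reproduced in a toy open-addressing table.  The sketch below is an illustration under stated assumptions, not the code in hash-table.h: the names (struct table, lookup, find_slot_for_insert, scenario) are hypothetical, every key hashes to slot 0 to force collisions, and the reallocation step is skipped by inserting object2 before object1 directly.

```c
#include <assert.h>
#include <stddef.h>

enum { NSLOTS = 8 };
enum state { EMPTY, DELETED, FULL };

struct slot { enum state st; int key; };
struct table { struct slot s[NSLOTS]; };

/* Toy hash: every key collides, so probing always starts at slot 0.  */
static size_t
hash (int key)
{
  (void) key;
  return 0;
}

/* Return the slot holding KEY, or NULL.  Probing stops at the first
   EMPTY slot, assuming nothing with a conflicting hash lies beyond.  */
static struct slot *
lookup (struct table *t, int key)
{
  size_t i = hash (key);
  for (size_t n = 0; n < NSLOTS; n++, i = (i + 1) % NSLOTS)
    {
      if (t->s[i].st == EMPTY)
        return NULL;
      if (t->s[i].st == FULL && t->s[i].key == key)
        return &t->s[i];
    }
  return NULL;
}

/* Return a slot in which KEY may be stored.  A reused DELETED slot is
   handed back marked EMPTY; the caller is expected to fill it.  */
static struct slot *
find_slot_for_insert (struct table *t, int key)
{
  struct slot *first_deleted = NULL;
  size_t i = hash (key);
  for (size_t n = 0; n < NSLOTS; n++, i = (i + 1) % NSLOTS)
    {
      struct slot *p = &t->s[i];
      if (p->st == EMPTY)
        {
          if (first_deleted == NULL)
            return p;
          break;
        }
      if (p->st == DELETED)
        {
          if (first_deleted == NULL)
            first_deleted = p;
        }
      else if (p->key == key)
        return p;
    }
  if (first_deleted != NULL)
    first_deleted->st = EMPTY;
  return first_deleted;
}

static void
fill (struct slot *p, int key)
{
  assert (p != NULL);
  p->st = FULL;
  p->key = key;
}

/* Replay the sequence of events.  Returns nonzero if object1 (key 1)
   can still be found at the end.  */
static int
scenario (int fill_after_miss)
{
  struct table t = { { { EMPTY, 0 } } };
  fill (find_slot_for_insert (&t, 2), 2);   /* object2 lands in slot 0 */
  fill (find_slot_for_insert (&t, 1), 1);   /* object1 lands in slot 1 */
  lookup (&t, 2)->st = DELETED;             /* remove object2 */
  struct slot *p = find_slot_for_insert (&t, 3);  /* slot 0, now EMPTY */
  if (fill_after_miss)
    fill (p, 3);                            /* the fix: always initialize */
  return lookup (&t, 1) != NULL;
}
```

Calling scenario (0) leaves slot 0 empty, so the probe for object1 stops before reaching slot 1; scenario (1) fills the slot and the lookup succeeds.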

Bootstrapped and regression tested on x86_64.  Installed on the trunk.

Jeff
commit 85f11a8902c4f7ec06111b197d37b26e40be6e0a
Author: law 
Date:   Fri Sep 1 15:32:15 2017 +

PR tree-optimization/82052
* tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr):
Always initialize the returned slot after a hash table miss
when INSERT is true.

PR tree-optimization/82052
* gcc.c-torture/compile/pr82052.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@251600 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 60824ea8a2e..f9e49e0cc94 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2017-09-01  Jeff Law  
+
+   PR tree-optimization/82052
+   * tree-ssa-scopedtables.c (avail_exprs_stack::lookup_avail_expr):
+   Always initialize the returned slot after a hash table miss
+   when INSERT is true.
+
 2017-09-01  Alexander Monakov  
 
* config/s390/s390.md (mem_signal_fence): Remove.
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index b3045634838..9ace97b5b25 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2017-09-01  Jeff Law  
+
+   PR tree-optimization/82052
+   * gcc.c-torture/compile/pr82052.c: New test.
+
 2017-09-01  Jakub Jelinek  
 
PR sanitizer/81923
diff --git a/gcc/testsuite/gcc.c-torture/compile/pr82052.c 
b/gcc/testsuite/gcc.c-torture/compile/pr82052.c
new file mode 100644
index 000..44419855745
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/compile/pr82052.c
@@ -0,0 +1,391 @@
+typedef unsigned char uint8_t;
+typedef unsigned short uint16_t;
+typedef unsigned uint32_t;
+char a, t9, t22;
+uint32_t c[56];
+uint32_t d, t, t1, t5, t13, t19, t31, t36, t40;
+struct {
+  unsigned f0 : 1;
+  unsigned f7 : 4;
+} l, t3;
+uint16_t g, t17, t29 = 65531, t42 = 1;
+short m, t6, t11, t12, t20, t27 = 1, t34 = 7, t38, t43, s;
+uint8_t p, u, t2, t4, t14, t24, t33, t44 = 50, t90;
+int f, h = 5, n, o = 40211, q, v = 2, w, t7, t8, t10, t15, t16, t18, t21, t25,
+   t26, t28, t30 = 3743393910, t32, t37 = 105423096, t39, t46, t47, t48,
+   t49, t88, t89, x, y;
+char r;
+char t23;
+uint16_t t35[][7][2];
+static uint8_t t41;
+char z[][8][3];
+char fn1(char p1, int p2) { return p1 < 0 ?: p1 >> p2; }
+short fn2() {}
+void fn3(uint8_t p1) { d = d >> 8 ^ c[(d ^ p1) & 5]; }
+void fn4(uint32_t p1, int p2) {
+  int e;
+  uint8_t b = e;
+  d = 8 ^ c[(d ^ b) & 5];
+  fn3(e >> 8);
+  fn3(e >> 6);
+  fn3(e >> 24);
+  if (p2)
+printf(0);
+}
+int fn5(p1) {
+  if (t37)
+for (; t28;)
+  ;
+}
+uint16_t fn6(char p1, int p2) {
+  int k;
+  for (; t32; t32++)
+for (; t32 < 8; t32++)
+  fn4(t23, k);
+}
+uint8_t fn7(p1) { return 1; }
+uint32_t fn8(uint8_t p1, uint32_t p2) {
+  t22 = t44 | 1;
+  t34--;
+  l.f7 = p2;
+  fn4(t18, t88);
+  fn4(t17, t88);
+  fn4(t3.f0, t88);
+  fn4(t16, t88);
+  fn4(t15, t88);
+  fn4(t14, t88);
+  fn4(t13, t88);
+  fn4(t12, t88);
+

Re: [PATCH] Fix x86_64 ICE with -fpie -mcmodel=large (PR target/81766)

2017-09-01 Thread Uros Bizjak

On Fri, Sep 1, 2017 at 1:42 PM, Jakub Jelinek  wrote:
> Hi!
>
> As mentioned in the PR, ix86_init_pic_reg for -mcmodel=large PIC creates
> invalid RTL.  Shrink wrapping managed to work around it by unconditionally
> running find_many_sub_basic_blocks that has been invoked even when the
> prologue or split prologue actually didn't contain anything, but that isn't
> done anymore.
>
> The problem is that we add a label into the sequence that we then insert on
> the single succ edge after ENTRY; but this insertion inserts the label after
> the NOTE_INSN_BASIC_BLOCK, which is invalid, because labels should precede
> that.
>
> Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
> trunk?
>
> 2017-09-01  Jakub Jelinek  
>
> PR target/81766
> * config/i386/i386.c (ix86_init_large_pic_reg): Return label instead 
> of void.
> (ix86_init_pic_reg): Remember label from ix86_init_large_pic_reg, if 
> non-NULL
> and preceded by NOTE_INSN_BASIC_BLOCK, swap the note and label.
>
> * gcc.target/i386/pr81766.c: New test.

OK.

Thanks,
Uros.

> --- gcc/config/i386/i386.c.jj   2017-08-07 18:50:10.0 +0200
> +++ gcc/config/i386/i386.c  2017-08-08 16:01:41.917136120 +0200
> @@ -8829,7 +8829,7 @@ ix86_use_pseudo_pic_reg (void)
>
>  /* Initialize large model PIC register.  */
>
> -static void
> +static rtx_code_label *
>  ix86_init_large_pic_reg (unsigned int tmp_regno)
>  {
>rtx_code_label *label;
> @@ -8846,6 +8846,7 @@ ix86_init_large_pic_reg (unsigned int tm
>emit_insn (gen_set_got_offset_rex64 (tmp_reg, label));
>emit_insn (ix86_gen_add3 (pic_offset_table_rtx,
> pic_offset_table_rtx, tmp_reg));
> +  return label;
>  }
>
>  /* Create and initialize PIC register if required.  */
> @@ -8854,6 +8855,7 @@ ix86_init_pic_reg (void)
>  {
>edge entry_edge;
>rtx_insn *seq;
> +  rtx_code_label *label = NULL;
>
>if (!ix86_use_pseudo_pic_reg ())
>  return;
> @@ -8863,7 +8865,7 @@ ix86_init_pic_reg (void)
>if (TARGET_64BIT)
>  {
>if (ix86_cmodel == CM_LARGE_PIC)
> -   ix86_init_large_pic_reg (R11_REG);
> +   label = ix86_init_large_pic_reg (R11_REG);
>else
> emit_insn (gen_set_got_rex64 (pic_offset_table_rtx));
>  }
> @@ -8887,6 +8889,22 @@ ix86_init_pic_reg (void)
>entry_edge = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
>insert_insn_on_edge (seq, entry_edge);
>commit_one_edge_insertion (entry_edge);
> +
> +  if (label)
> +{
> +  basic_block bb = BLOCK_FOR_INSN (label);
> +  rtx_insn *bb_note = PREV_INSN (label);
> +  /* If the note preceding the label starts a basic block, and the
> +label is a member of the same basic block, interchange the two.  */
> +  if (bb_note != NULL_RTX
> + && NOTE_INSN_BASIC_BLOCK_P (bb_note)
> + && bb != NULL
> + && bb == BLOCK_FOR_INSN (bb_note))
> +   {
> + reorder_insns_nobb (bb_note, bb_note, label);
> + BB_HEAD (bb) = label;
> +   }
> +}
>  }
>
>  /* Initialize a variable CUM of type CUMULATIVE_ARGS
> --- gcc/testsuite/gcc.target/i386/pr81766.c.jj  2017-08-08 16:10:04.299459808 
> +0200
> +++ gcc/testsuite/gcc.target/i386/pr81766.c 2017-08-08 16:09:28.0 
> +0200
> @@ -0,0 +1,9 @@
> +/* PR target/81766 */
> +/* { dg-do compile { target { pie && lp64 } } } */
> +/* { dg-options "-O2 -fpie -mcmodel=large" } */
> +
> +int
> +main ()
> +{
> +  return 0;
> +}
>
> Jakub

Re: Add support to trace comparison instructions and switch statements

2017-09-01 Thread Jakub Jelinek

On Fri, Jul 21, 2017 at 01:38:17PM +0800, 吴潍浠(此彼) wrote:
> Hi Jeff
> 
> I have signed the copyright assignment, and used the name 'Wish Wu' .
> Should I send you a copy of my assignment ?
> 
> The attachment is my new patch with small changes. 
> Codes are checked by ./contrib/check_GNU_style.sh, except some special files.

Please provide a ChangeLog entry, you can use ./contrib/mklog as a start.

@@ -975,6 +974,10 @@ fsanitize=
 Common Driver Report Joined
 Select what to sanitize.
 
+fsanitize-coverage=
+Common Driver Report Joined
+Select what to coverage sanitize.
+

Why Driver?  The reason fsanitize= needs it is that say for
-fsanitize=address we add libraries in the driver, etc., but that
isn't the case for the coverage, right?

--- gcc/flag-types.h(revision 250199)
+++ gcc/flag-types.h(working copy)
@@ -250,6 +250,14 @@ enum sanitize_code {
  | SANITIZE_BOUNDS_STRICT
 };
 
+/* Different trace modes.  */
+enum sanitize_coverage_code {
+  /* Trace PC.  */
+  SANITIZE_COV_TRACE_PC = 1UL << 0,
+  /* Trace Compare.  */
+  SANITIZE_COV_TRACE_CMP = 1UL << 1
+};

No need for UL suffixes, the reason sanitize_code uses them is
that it includes 1 << 16 and above and might be included even in target code
(for host we require 32-bit integers, for target code it might be just
16-bit).

--- gcc/opts.c  (revision 250199)
+++ gcc/opts.c  (working copy)
@@ -1519,6 +1519,17 @@ const struct sanitizer_opts_s sanitizer_opts[] =
   { NULL, 0U, 0UL, false }
 };
 
+/* -f{,no-}sanitize-coverage= suboptions.  */
+const struct sanitizer_opts_s coverage_sanitizer_opts[] =
+{
+#define SANITIZER_OPT(name, flags, recover) \
+{ #name, flags, sizeof #name - 1, recover }
+  SANITIZER_OPT (trace-pc, SANITIZE_COV_TRACE_PC, false),
+  SANITIZER_OPT (trace-cmp, SANITIZE_COV_TRACE_CMP, false),
+#undef SANITIZER_OPT
+  { NULL, 0U, 0UL, false }

No need to have the recover argument for the macro, just add false to it
(unless you want to use a different struct type that wouldn't even include
that member).

+/* Given ARG, an unrecognized coverage sanitizer option, return the best
+   matching coverage sanitizer option, or NULL if there isn't one.  */
+
+static const char *
+get_closest_coverage_sanitizer_option (const string_fragment &arg)
+{
+  best_match  bm (arg);
+  for (int i = 0; coverage_sanitizer_opts[i].name != NULL; ++i)
+{
+  bm.consider (coverage_sanitizer_opts[i].name);
+}

Body which contains just one line shouldn't be wrapped in {}s, just use
  for (int i = 0; coverage_sanitizer_opts[i].name != NULL; ++i)
bm.consider (coverage_sanitizer_opts[i].name);

+unsigned int
+parse_coverage_sanitizer_options (const char *p, location_t loc,
+unsigned int flags, int value, bool complain)

Wrong formatting, unsigned int should go below const char *, like:

parse_coverage_sanitizer_options (const char *p, location_t loc,
  unsigned int flags, int value, bool complain)

+{
+  while (*p != 0)
+{
+  size_t len, i;
+  bool found = false;
+  const char *comma = strchr (p, ',');
+
+  if (comma == NULL)
+   len = strlen (p);
+  else
+   len = comma - p;
+  if (len == 0)
+   {
+ p = comma + 1;
+ continue;
+   }
+
+  /* Check to see if the string matches an option class name.  */
+  for (i = 0; coverage_sanitizer_opts[i].name != NULL; ++i)
+   if (len == coverage_sanitizer_opts[i].len
+   && memcmp (p, coverage_sanitizer_opts[i].name, len) == 0)
+ {
+   if (value)
+ flags |= coverage_sanitizer_opts[i].flag;
+   else
+ flags &= ~coverage_sanitizer_opts[i].flag;
+   found = true;
+   break;
+ }
+
+  if (! found && complain)
+   {
+ const char *hint
+   = get_closest_coverage_sanitizer_option (string_fragment (p, len));
+
+ if (hint)
+   error_at (loc,
+ "unrecognized argument to "
+ "-f%ssanitize-coverage= option: %q.*s;"
+ " did you mean %qs?",
+ value ? "" : "no-",
+ (int) len, p, hint);
+ else
+   error_at (loc,
+ "unrecognized argument to "
+ "-f%ssanitize-coverage= option: %q.*s",
+ value ? "" : "no-",
+ (int) len, p);
+   }
+
+  if (comma == NULL)
+   break;
+  p = comma + 1;
+}
+  return flags;
+}

Though, looking at the above, it sounds like there is just way too much
duplication.  So, maybe better just use the parse_sanitizer_options
and get_closest_sanitizer_option functions for all of
-f{,no-}sanitize{,-recover,-coverage}= , add
  const struct sanitizer_opts_s *opts = sanitizer_opts;
  if (code == OPT_fsanitize_coverage_)
opts = coverage_sanitizer_opts;
early in both functions, deal with the diagnostics (to print "-coverage
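
A rough, standalone sketch of the suggested deduplication: one parser that picks its option table from a selector instead of a second copy of the loop.  Everything here is a simplified assumption for illustration (struct opt, enum which, parse_list, the two tiny tables and their flag values); it is not the code in opts.c and it omits the diagnostics and spelling hints.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, cut-down option table entry.  */
struct opt { const char *name; unsigned flag; };

static const struct opt sanitizer_table[] = {
  { "address", 1u << 0 },
  { "undefined", 1u << 1 },
  { NULL, 0 }
};

static const struct opt coverage_table[] = {
  { "trace-pc", 1u << 0 },
  { "trace-cmp", 1u << 1 },
  { NULL, 0 }
};

/* Hypothetical selector standing in for the option code.  */
enum which { WHICH_SANITIZE, WHICH_COVERAGE };

/* Parse the comma-separated list P.  Each recognized name sets its flag
   in FLAGS when VALUE is nonzero and clears it otherwise.  Unrecognized
   names are counted in *UNKNOWN when UNKNOWN is non-NULL.  */
static unsigned
parse_list (const char *p, enum which which, unsigned flags, int value,
            int *unknown)
{
  /* The only table-specific step: choose the table once, up front.  */
  const struct opt *opts = sanitizer_table;
  if (which == WHICH_COVERAGE)
    opts = coverage_table;

  assert (p != NULL);
  while (*p != '\0')
    {
      const char *comma = strchr (p, ',');
      size_t len = comma ? (size_t) (comma - p) : strlen (p);

      if (len != 0)
        {
          const struct opt *o;
          for (o = opts; o->name != NULL; o++)
            if (strlen (o->name) == len && memcmp (p, o->name, len) == 0)
              break;

          if (o->name == NULL)
            {
              if (unknown != NULL)
                ++*unknown;
            }
          else if (value)
            flags |= o->flag;
          else
            flags &= ~o->flag;
        }

      if (comma == NULL)
        break;
      p = comma + 1;
    }
  return flags;
}
```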

Re: Fix inconsistent section flags

2017-09-01 Thread Jeff Law

On 08/29/2017 09:09 AM, Joerg Sonnenberger wrote:
> On Mon, Aug 28, 2017 at 11:42:53AM -0600, Jeff Law wrote:
>> Does the attached (untested) patch work for you?
> 
> Works for the cases I care about, yes.
Thanks.  Here's what I'm committing after the usual bootstrap &
regression testing cycle on x86.  This includes a regression test for
the testsuite.

Jeff
commit 371072bf395be11f36ef31bb3cfec06bbfc58597
Author: law 
Date:   Fri Sep 1 16:26:00 2017 +

* varasm.c (bss_initializer_p): Do not put constants into .bss
(categorize_decl_for_section): Handle bss_initializer_p returning
false when DECL_INITIAL is NULL.

* gcc.target/i386/const-in-bss.c: New test.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@251602 
138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index e71380e5c9d..f9d9eb74a3a 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2017-09-01  Joerg Sonnenberger  
+   Jeff Law  
+
+   * varasm.c (bss_initializer_p): Do not put constants into .bss
+   (categorize_decl_for_section): Handle bss_initializer_p returning
+   false when DECL_INITIAL is NULL.
+
 2017-09-01  Andreas Krebbel  
 
PR target/82012
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index fe6e4301bff..e82751438d6 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -5,6 +5,8 @@
 
 2017-09-01  Jeff Law  
 
+   * gcc.target/i386/const-in-bss.c: New test.
+
PR tree-optimization/82052
* gcc.c-torture/compile/pr82052.c: New test.
 
diff --git a/gcc/testsuite/gcc.target/i386/const-in-bss.c 
b/gcc/testsuite/gcc.target/i386/const-in-bss.c
new file mode 100644
index 000..c70aa0bcb4e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/const-in-bss.c
@@ -0,0 +1,6 @@
+/* { dg-do compile { target *-*-linux* } } */
+
+__attribute__((section("readonly1"))) const int foo1c;
+
+/* { dg-final { scan-assembler "readonly1,\"a\"" } } */
+/* { dg-final { scan-assembler-not "readonly1,\"aw\"" } } */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index e5393377a43..d38d2c2721b 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -976,16 +976,16 @@ decode_reg_name (const char *name)
 bool
 bss_initializer_p (const_tree decl)
 {
-  return (DECL_INITIAL (decl) == NULL
- /* In LTO we have no errors in program; error_mark_node is used
-to mark offlined constructors.  */
- || (DECL_INITIAL (decl) == error_mark_node
- && !in_lto_p)
- || (flag_zero_initialized_in_bss
- /* Leave constant zeroes in .rodata so they
-can be shared.  */
- && !TREE_READONLY (decl)
- && initializer_zerop (DECL_INITIAL (decl))));
+  /* Do not put constants into the .bss section, they belong in a readonly
+ section.  */
+  return (!TREE_READONLY (decl)
+ && (DECL_INITIAL (decl) == NULL
+ /* In LTO we have no errors in program; error_mark_node is used
+to mark offlined constructors.  */
+ || (DECL_INITIAL (decl) == error_mark_node
+ && !in_lto_p)
+ || (flag_zero_initialized_in_bss
+ && initializer_zerop (DECL_INITIAL (decl)))));
 }
 
 /* Compute the alignment of variable specified by DECL.
@@ -6517,7 +6517,8 @@ categorize_decl_for_section (const_tree decl, int reloc)
ret = SECCAT_BSS;
   else if (! TREE_READONLY (decl)
   || TREE_SIDE_EFFECTS (decl)
-  || ! TREE_CONSTANT (DECL_INITIAL (decl)))
+  || (DECL_INITIAL (decl)
+  && ! TREE_CONSTANT (DECL_INITIAL (decl))))
{
  /* Here the reloc_rw_mask is not testing whether the section should
 be read-only or not, but whether the dynamic link will have to
@@ -6537,7 +6538,8 @@ categorize_decl_for_section (const_tree decl, int reloc)
   location.  -fmerge-all-constants allows even that (at the
   expense of not conforming).  */
ret = SECCAT_RODATA;
-  else if (TREE_CODE (DECL_INITIAL (decl)) == STRING_CST)
+  else if (DECL_INITIAL (decl)
+  && TREE_CODE (DECL_INITIAL (decl)) == STRING_CST)
ret = SECCAT_RODATA_MERGE_STR_INIT;
   else
ret = SECCAT_RODATA_MERGE_CONST;
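
The effect of the bss_initializer_p change can be seen in a small boolean model.  This is only a sketch: struct toy_decl and both functions are hypothetical stand-ins for the tree accessors, and the error_mark_node / in_lto_p case is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the properties of a decl that matter here.  */
struct toy_decl
{
  bool readonly;          /* models TREE_READONLY (decl) */
  bool has_initializer;   /* models DECL_INITIAL (decl) != NULL */
  bool init_is_zero;      /* models initializer_zerop (DECL_INITIAL (decl)) */
};

/* Model of the predicate before the patch: an uninitialized decl goes
   to .bss even when it is const.  */
static bool
old_bss_p (const struct toy_decl *d, bool zero_in_bss)
{
  assert (d != NULL);
  return (!d->has_initializer
          || (zero_in_bss && !d->readonly && d->init_is_zero));
}

/* Model of the predicate after the patch: a read-only decl is never a
   .bss candidate, whether or not it has an initializer.  */
static bool
new_bss_p (const struct toy_decl *d, bool zero_in_bss)
{
  assert (d != NULL);
  return (!d->readonly
          && (!d->has_initializer
              || (zero_in_bss && d->init_is_zero)));
}
```

For a decl shaped like foo1c in the new test (const, no initializer) the old model answers true and the new one false.  That is also why categorize_decl_for_section now has to cope with a NULL DECL_INITIAL on the read-only path.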

Fix excess precision handling of compound assignments (PR c/82071)

2017-09-01 Thread Joseph Myers

PR c/82071 reports how compound assignment operators such as += handle
excess precision inconsistently with the same operation done with a
plain assignment and binary operator.

There were (at least) two problems with how compound assignments
handled excess precision.  The EXCESS_PRECISION_EXPR for an argument
with excess precision was removed too early, resulting in
build_binary_op being called with an rhs operand whose type reflected
the evaluation format, so not having sufficient information to achieve
the intended semantics in all cases, and then the code called
c_fully_fold on the results of build_binary_op without allowing for
the possibility of an EXCESS_PRECISION_EXPR as the result, so leading
to double rounding of the result (first to its semantic type, then to
the type of the LHS of the assignment) instead of the intended single
rounding.

This patch fixes those problems by keeping EXCESS_PRECISION_EXPRs
further through build_modify_expr (and build_atomic_assign which it
calls) and only removing them locally where appropriate.

Note that while this patch should achieve *consistency*, that's
consistency with the understanding of C99 semantics that I originally
intended to implement.  For the particular case in the testcase, C11
semantics (from N1531) differ from that understanding of C99
semantics, in that an implicit conversion of an integer to floating
point can have excess precision.  I intend to implement those C11
semantics separately (conditional on flag_isoc11) (which will also
mean that building conditional expressions can produce a result with
excess precision even when the arguments lack excess precision, where
previously it could not), and not to close the bug until that is also
done.

Tested for x86_64-pc-linux-gnu.  Applied to mainline.
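
As an illustration of the consistency in question (this is not the new excess-precision-7.c test, whose contents are not shown here), the two spellings below should produce the same value for any non-NaN inputs.  The function names are hypothetical; the volatile temporaries are only there to keep the operations from being folded at compile time.  On a target whose FLT_EVAL_METHOD is 0 the check is trivially true; the interesting case is a target such as 32-bit x87 that evaluates float arithmetic in a wider format.

```c
#include <assert.h>

/* Compound assignment spelling.  */
static float
compound (float f, int i)
{
  volatile float r = f;
  r *= i;
  return r;
}

/* Plain assignment plus binary operator spelling.  */
static float
plain (float f, int i)
{
  volatile float r = f;
  r = r * i;
  return r;
}

/* Abort if the two spellings disagree.  A disagreement would be the
   kind of inconsistency the PR describes.  */
static void
check_consistent (float f, int i)
{
  assert (compound (f, i) == plain (f, i));
}
```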

gcc/c:
2017-09-01  Joseph Myers  

PR c/82071
* c-typeck.c (build_atomic_assign): Handle argument with excess
precision.  Ensure any EXCESS_PRECISION_EXPR is present in
argument passed to build_binary_op and convert_for_assignment but
not for call to c_fully_fold.
(build_modify_expr): Do not remove EXCESS_PRECISION_EXPR early.
Ensure build_binary_op is called with argument with original
semantic type.  Avoid calling c_fully_fold with an
EXCESS_PRECISION_EXPR from build_binary_op.

gcc/testsuite:
2017-09-01  Joseph Myers  

PR c/82071
* gcc.target/i386/excess-precision-7.c: New test.

Index: gcc/c/c-typeck.c
===
--- gcc/c/c-typeck.c(revision 251561)
+++ gcc/c/c-typeck.c(working copy)
@@ -3919,7 +3919,9 @@ build_atomic_assign (location_t loc, tree lhs, enu
   tree lhs_type = TREE_TYPE (lhs);
   tree lhs_addr = build_unary_op (loc, ADDR_EXPR, lhs, false);
   tree seq_cst = build_int_cst (integer_type_node, MEMMODEL_SEQ_CST);
-  tree rhs_type = TREE_TYPE (rhs);
+  tree rhs_semantic_type = TREE_TYPE (rhs);
+  tree nonatomic_rhs_semantic_type;
+  tree rhs_type;
 
   gcc_assert (TYPE_ATOMIC (lhs_type));
 
@@ -3933,6 +3935,15 @@ build_atomic_assign (location_t loc, tree lhs, enu
  with a loop.  */
   compound_stmt = c_begin_compound_stmt (false);
 
+  /* Remove any excess precision (which is only present here in the
+ case of compound assignments).  */
+  if (TREE_CODE (rhs) == EXCESS_PRECISION_EXPR)
+{
+  gcc_assert (modifycode != NOP_EXPR);
+  rhs = TREE_OPERAND (rhs, 0);
+}
+  rhs_type = TREE_TYPE (rhs);
+
   /* Fold the RHS if it hasn't already been folded.  */
   if (modifycode != NOP_EXPR)
 rhs = c_fully_fold (rhs, false, NULL);
@@ -3941,6 +3952,8 @@ build_atomic_assign (location_t loc, tree lhs, enu
  the VAL temp variable to hold the RHS.  */
   nonatomic_lhs_type = build_qualified_type (lhs_type, TYPE_UNQUALIFIED);
   nonatomic_rhs_type = build_qualified_type (rhs_type, TYPE_UNQUALIFIED);
+  nonatomic_rhs_semantic_type = build_qualified_type (rhs_semantic_type,
+ TYPE_UNQUALIFIED);
   val = create_tmp_var_raw (nonatomic_rhs_type);
   TREE_ADDRESSABLE (val) = 1;
   TREE_NO_WARNING (val) = 1;
@@ -4100,8 +4113,17 @@ cas_loop:
   add_stmt (loop_label);
 
   /* newval = old + val;  */
+  if (rhs_type != rhs_semantic_type)
+val = build1 (EXCESS_PRECISION_EXPR, nonatomic_rhs_semantic_type, val);
   rhs = build_binary_op (loc, modifycode, old, val, true);
-  rhs = c_fully_fold (rhs, false, NULL);
+  if (TREE_CODE (rhs) == EXCESS_PRECISION_EXPR)
+{
+  tree eptype = TREE_TYPE (rhs);
+  rhs = c_fully_fold (TREE_OPERAND (rhs, 0), false, NULL);
+  rhs = build1 (EXCESS_PRECISION_EXPR, eptype, rhs);
+}
+  else
+rhs = c_fully_fold (rhs, false, NULL);
   rhs = convert_for_assignment (loc, UNKNOWN_LOCATION, nonatomic_lhs_type,
rhs, NULL_TREE, ic_assign, false, NULL_TREE,
NULL_TREE, 0);
@@ -5727,7 +5749,6 @@ build_modify_expr (location_t l

[C++ PATCH] METHOD_VEC cleanup

2017-09-01 Thread Nathan Sidwell

This patch reorganizes the processing of METHOD_VEC during class 
definition.  I remove a bunch of places where we check for non-NULL 
METHOD_VEC -- now there are no magic slots, it doesn't matter if the 
vector is NULL.  Some processing moves out of finish_struct_methods, 
leaving it to just sort the method vector.


This change exposes a glaring inconsistency between template and 
non-template or instantiation class definitions.  The former never calls 
set_class_bindings, so field search will always be the slow TYPE_FIELDS 
iteration.  I have a patch for that and I'm curious about whether it'll 
make a change in practice (if there are fewer than 8 fields, we never 
create the FIELD_VEC).


applied to trunk.

nathan
--
Nathan Sidwell
2017-09-01  Nathan Sidwell  

	* class.c (finish_struct_methods): Don't clear DECL_IN_AGGR_P here.
	Don't call maybe_warn_about_overly_private_class here.
	(warn_hidden): Cleanup declarations and comments.
	(type_has_user_provided_constructor): No need to check
	CLASSTYPE_METHOD_VEC.
	(type_has_user_provided_or_explicit_constructor): Likewise.
	(classtype_has_move_assign_or_move_ctor_p): Likewise.
	(check_bases_and_members): Don't call finish_struct_methods here.
	(finish_struct_1): Call finish_struct_methods and
	set_class_bindings immediately after layout.  Clear DECL_IN_AGGR_P
	here.
	(finish_struct): For templates process USING_DECLS and clear
	DECL_IN_AGGR_P before calling finish_struct_methods. Call
	maybe_warn_about_overly_private_class here.

Index: class.c
===
--- class.c	(revision 251592)
+++ class.c	(working copy)
@@ -2313,15 +2313,6 @@ finish_struct_methods (tree t)
   if (!method_vec)
 return;
 
-  /* Clear DECL_IN_AGGR_P for all functions.  */
-  for (tree fn = TYPE_FIELDS (t); fn; fn = DECL_CHAIN (fn))
-if (DECL_DECLARES_FUNCTION_P (fn))
-  DECL_IN_AGGR_P (fn) = false;
-
-  /* Issue warnings about private constructors and such.  If there are
- no methods, then some public defaults are generated.  */
-  maybe_warn_about_overly_private_class (t);
-
   qsort (method_vec->address (), method_vec->length (),
 	 sizeof (tree), method_name_cmp);
 }
@@ -2976,16 +2967,12 @@ warn_hidden (tree t)
   /* We go through each separately named virtual function.  */
   for (int i = 0; vec_safe_iterate (method_vec, i, &fns); ++i)
 {
-  tree fndecl;
+  tree name = OVL_NAME (fns);
+  auto_vec base_fndecls;
   tree base_binfo;
   tree binfo;
   int j;
 
-  /* All functions in this slot in the CLASSTYPE_METHOD_VEC will
-	 have the same name.  Figure out what name that is.  */
-  tree name = OVL_NAME (fns);
-  /* There are no possibly hidden functions yet.  */
-  auto_vec base_fndecls;
   /* Iterate through all of the base classes looking for possibly
 	 hidden functions.  */
   for (binfo = TYPE_BINFO (t), j = 0;
@@ -3002,14 +2989,14 @@ warn_hidden (tree t)
   /* Remove any overridden functions.  */
   for (ovl_iterator iter (fns); iter; ++iter)
 	{
-	  fndecl = *iter;
+	  tree fndecl = *iter;
 	  if (TREE_CODE (fndecl) == FUNCTION_DECL
 	  && DECL_VINDEX (fndecl))
 	{
-		/* If the method from the base class has the same
-		   signature as the method from the derived class, it
-		   has been overridden.  */
-		for (size_t k = 0; k < base_fndecls.length (); k++)
+	  /* If the method from the base class has the same
+		 signature as the method from the derived class, it
+		 has been overridden.  */
+	  for (size_t k = 0; k < base_fndecls.length (); k++)
 		if (base_fndecls[k]
 		&& same_signature_p (fndecl, base_fndecls[k]))
 		  base_fndecls[k] = NULL_TREE;
@@ -4892,11 +4879,6 @@ adjust_clone_args (tree decl)
 static void
 clone_constructors_and_destructors (tree t)
 {
-  /* If for some reason we don't have a CLASSTYPE_METHOD_VEC, we bail
- out now.  */
-  if (!CLASSTYPE_METHOD_VEC (t))
-return;
-
   /* While constructors can be via a using declaration, at this point
  we no longer need to know that.  */
   for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
@@ -5134,10 +5116,6 @@ type_has_user_provided_constructor (tree
   if (!TYPE_HAS_USER_CONSTRUCTOR (t))
 return false;
 
-  /* This can happen in error cases; avoid crashing.  */
-  if (!CLASSTYPE_METHOD_VEC (t))
-return false;
-
   for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
 if (user_provided_p (*iter))
   return true;
@@ -5156,10 +5134,6 @@ type_has_user_provided_or_explicit_const
   if (!TYPE_HAS_USER_CONSTRUCTOR (t))
 return false;
 
-  /* This can happen in error cases; avoid crashing.  */
-  if (!CLASSTYPE_METHOD_VEC (t))
-return false;
-
   for (ovl_iterator iter (CLASSTYPE_CONSTRUCTORS (t)); iter; ++iter)
 {
   tree fn = *iter;
@@ -5348,9 +5322,6 @@ classtype_has_move_assign_or_move_ctor_p
 	  || (!CLASSTYPE_LAZY_MOVE_CTOR (t)
 		  && !CLASSTYPE_LAZY_MOVE_ASSIGN (t)));
 
-  if (!CLASSTY

Re: c-family PATCH to improve -Wtautological-compare (PR c/81783)

2017-09-01 Thread Jeff Law

On 08/16/2017 08:29 AM, Marek Polacek wrote:
> This patch improves -Wtautological-compare so that it also detects
> bitwise comparisons involving & and | that are always true or false, e.g.
> 
>   if ((a & 16) == 10)
> return 1;
> 
> can never be true.  Note that e.g. "(a & 9) == 8" is *not* always false
> or true.
> 
> I think it's pretty straightforward with one snag: we shouldn't warn if
> the constant part of the bitwise operation comes from a macro, but currently
> that's not possible, so I XFAILed this in the new test.
> 
> This has found one issue in the GCC codebase and it's a genuine bug:
> .  
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> 
> 2017-08-16  Marek Polacek  
> 
>   PR c/81783
>   * c-warn.c (warn_tautological_bitwise_comparison): New function.
>   (warn_tautological_cmp): Call it.
> 
>   * doc/invoke.texi: Update -Wtautological-compare documentation.
> 
>   * c-c++-common/Wtautological-compare-5.c: New test.
OK.
jeff
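
For reference, the arithmetic behind the two examples in the quoted description can be written down directly.  This is a sketch of the underlying bit rule, not the implementation of warn_tautological_bitwise_comparison (which is not shown here); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* (a & mask) == cst can hold for some a exactly when cst has no bit
   set outside mask.  */
static bool
and_eq_possible (uint32_t mask, uint32_t cst)
{
  return (cst & ~mask) == 0;
}

/* (a | mask) == cst can hold for some a exactly when every bit of mask
   is also set in cst.  */
static bool
or_eq_possible (uint32_t mask, uint32_t cst)
{
  return (mask & ~cst) == 0;
}

/* Cross-check both rules by brute force over all 8-bit values.  */
static void
self_check (void)
{
  for (unsigned mask = 0; mask < 256; mask++)
    for (unsigned cst = 0; cst < 256; cst++)
      {
        bool and_hit = false, or_hit = false;
        for (unsigned a = 0; a < 256; a++)
          {
            and_hit |= (a & mask) == cst;
            or_hit |= (a | mask) == cst;
          }
        assert (and_hit == and_eq_possible (mask, cst));
        assert (or_hit == or_eq_possible (mask, cst));
      }
}
```

With these, and_eq_possible (16, 10) is false, matching "(a & 16) == 10 can never be true", while and_eq_possible (9, 8) is true, so "(a & 9) == 8" depends on a and should not be flagged.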

Re: [PATCH] Write dependency information (-M*) even if there are errors

2017-09-01 Thread Jeff Law

On 08/13/2017 09:25 AM, Boris Kolpackov wrote:
> Hi,
> 
> I've been instructed to resend this patch from the gcc mailing list:
> 
> Currently GCC does not write extracted header dependency information
> if there are errors. However, this can be useful when dealing with
> outdated generated headers that trigger errors which would have been
> resolved if we could update it. A concrete example in our case is a
> version check with #error.
> 
> The included (trivial) patch changes this behavior. Note also that
> this is how Clang already behaves. I've tested the patch in build2
> and everything works well (i.e., no invalid dependency output in the
> face of various preprocessor errors such as #error, stray #else, etc).
> 
> While I don't foresee any backwards-compatibility issues with such
> an unconditional change (after all, the compiler still exists with
> an error status), if there are concerns, I could re-do it via an
> option (e.g., -ME, analogous to -MG).
> 
> Joseph Myers  writes:
> 
>> I suppose a question for the present proposal would be making sure any 
>> dependencies generated in this case do not include dependencies on files 
>> that don't exist (so #include "some-misspelling.h" doesn't create any sort 
>> of dependency on such a header).
> Good point. I've tested this and I believe everything is in order:
> unless -MG is specified, a non-existent header is treated as a fatal
> error so we don't even get to writing the dependency info. And if -MG
> is specified, then there is no error and we get the missing header in
> the dependency output, as requested.
> 
> P.S. I have the paperwork necessary to contribute on file with FSF.
> 
> Thanks,
> Boris
> 
> 
> deps-on-error.diff
> 
> 
> Index: gcc/c-family/ChangeLog
> ===
> --- gcc/c-family/ChangeLog(revision 250514)
> +++ gcc/c-family/ChangeLog(working copy)
> @@ -1,3 +1,8 @@
> +2017-08-06  Boris Kolpackov 
> +
> + * c-opts.c (c_common_finish): Write dependency information even if
> + there are errors.
Thanks.  I've installed this on the trunk.  Sorry for the delays.

jeff

Re: [PATCH 01/17] Add param-type-mismatch.c/C testcases as a baseline

2017-09-01 Thread Jeff Law

On 07/24/2017 02:04 PM, David Malcolm wrote:
> Amongst other things, this patch kit aims to improve our handling
> of mismatched types within function calls.
Certainly a worthy goal.  It's silly that someone trying to parse the
diagnostic we currently give has to manually walk the arguments and
figure out where the mismatch is.  Underlining the mismatched parameter
at the call site and in the declaration would be a huge step forward.

I'd support getting these tests into the testsuite independent of the
BLT work.  They may need to be xfailed for a period of time.

If you wanted to go forward on that, be my guest.

jeff

Re: [PATCH 11/17] Add JSON implementation

2017-09-01 Thread Jeff Law

On 07/24/2017 02:05 PM, David Malcolm wrote:
> This patch is the first part of the Language Server Protocol
> server proof-of-concept.
> 
> It adds support to gcc for reading and writing JSON,
> based on DOM-like trees of "json::value" instances.
> 
> gcc/ChangeLog:
>   * Makefile.in (OBJS): Add json.o.
>   * json.c: New file.
>   * json.h: New file.
>   * selftest-run-tests.c (selftest::run_tests): Call json_c_tests.
>   * selftest.h (selftest::json_c_tests): New decl.
Didn't someone else already raise the issue of why not just use an
existing JSON implementation?  I realize we dislike too many external
dependencies.  But I also dislike recreating the wheel :-)  These days I
probably dislike the latter more than the former.

I'm not sure if I'm sold on the LSP idea in the kind of timeframes we're
looking at for gcc-8.

jeff

Re: [PATCH 12/17] Add server.h and server.c

2017-09-01 Thread Jeff Law

On 07/24/2017 02:05 PM, David Malcolm wrote:
> This patch adds a "server" abstract base class for listening
> on a port, for use by the later patches to implement an LSP server.
> 
> It's largely adapted from examples in glibc's docs.  I suspect that
> I've introduced platform-specific assumptions (and that it may need some
> extra configure tests for the extra functionality), but this part of
> the kit is just a proof-of-concept.
> 
> gcc/ChangeLog:
>   * Makefile.in (OBJS): Add server.o.
>   * server.c: New file.
>   * server.h: New file.
And this is where I start to get scared :-)  Once folks can start
interacting with GCC over a TCP connection we have to start thinking
much harder about the security ramifications of various hunks of code.

If we end up going down this path at some point, I'd really like to look
for ways to leverage existing code that's already being used in the wild
and hopefully has been through real security audits.

Jeff

Re: [PATCH] Add UBSAN_{PTR,BOUNDS} folding (PR sanitizer/81981)

2017-09-01 Thread Richard Biener

On September 1, 2017 3:53:28 PM GMT+02:00, Jakub Jelinek wrote:
>On Fri, Sep 01, 2017 at 02:32:43PM +0200, Richard Biener wrote:
>> On September 1, 2017 1:16:54 PM GMT+02:00, Jakub Jelinek wrote:
>> >Hi!
>> >
>> >This patch fixes the following testcase by folding some ubsan internal
>> >fns we'd either remove anyway during sanopt, or lower into if (cond)
>> >do_something during sanopt where cond would be always false.
>> >
>> >Additionally, I've tried to clean up a little bit IFN_UBSAN_OBJECT_SIZE
>> >handling by using variables for the call arguments that make it clear
>> >what the arguments are.
>> >
>> >Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> 
>> I think there's a helper for replace - with-nop. 
>
>Can't find it.
>gimplify_and_update_call_from_tree has to add it, but I'd need
>to call it with something that gimplifies into empty sequence
>(void_node?)
>and it would still go through
>push_gimplify_context/gimplify_and_add/pop_gimplify_context
>etc., which looks quite expensive.

OK, I thought we have one. Can you add a helper for it please? replace_with_nop 
or so? I thought there's maybe replace_with_value which handles null lhs by 
replacing with nop. (can't check, writing from phone) 

Richard. 

>> >2017-09-01  Jakub Jelinek  
>> >
>> >PR sanitizer/81981
>> >* gimple-fold.c (gimple_fold_call): Optimize away useless UBSAN_PTR
>> >and UBSAN_BOUNDS internal calls.  Clean up IFN_UBSAN_OBJECT_SIZE
>> >handling.
>> >
>> >* gcc.dg/ubsan/pr81981.c: New test.
>> >
>> >--- gcc/gimple-fold.c.jj2017-08-10 02:31:21.0 +0200
>> >+++ gcc/gimple-fold.c   2017-08-29 18:50:49.993673432 +0200
>> >@@ -3938,11 +3938,23 @@ gimple_fold_call (gimple_stmt_iterator *
>> >gimple_call_arg (stmt, 2));
>> >  break;
>> >case IFN_UBSAN_OBJECT_SIZE:
>> >- if (integer_all_onesp (gimple_call_arg (stmt, 2))
>> >- || (TREE_CODE (gimple_call_arg (stmt, 1)) == INTEGER_CST
>> >- && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST
>> >- && tree_int_cst_le (gimple_call_arg (stmt, 1),
>> >- gimple_call_arg (stmt, 2
>> >+ {
>> >+   tree offset = gimple_call_arg (stmt, 1);
>> >+   tree objsize = gimple_call_arg (stmt, 2);
>> >+   if (integer_all_onesp (objsize)
>> >+   || (TREE_CODE (offset) == INTEGER_CST
>> >+   && TREE_CODE (objsize) == INTEGER_CST
>> >+   && tree_int_cst_le (offset, objsize)))
>> >+ {
>> >+   gsi_replace (gsi, gimple_build_nop (), false);
>> >+   unlink_stmt_vdef (stmt);
>> >+   release_defs (stmt);
>> >+   return true;
>> >+ }
>> >+ }
>> >+ break;
>> >+   case IFN_UBSAN_PTR:
>> >+ if (integer_zerop (gimple_call_arg (stmt, 1)))
>> >{
>> >  gsi_replace (gsi, gimple_build_nop (), false);
>> >  unlink_stmt_vdef (stmt);
>> >@@ -3950,6 +3962,25 @@ gimple_fold_call (gimple_stmt_iterator *
>> >  return true;
>> >}
>> >  break;
>> >+   case IFN_UBSAN_BOUNDS:
>> >+ {
>> >+   tree index = gimple_call_arg (stmt, 1);
>> >+   tree bound = gimple_call_arg (stmt, 2);
>> >+   if (TREE_CODE (index) == INTEGER_CST
>> >+   && TREE_CODE (bound) == INTEGER_CST)
>> >+ {
>> >+   index = fold_convert (TREE_TYPE (bound), index);
>> >+   if (TREE_CODE (index) == INTEGER_CST
>> >+   && tree_int_cst_le (index, bound))
>> >+ {
>> >+   gsi_replace (gsi, gimple_build_nop (), false);
>> >+   unlink_stmt_vdef (stmt);
>> >+   release_defs (stmt);
>> >+   return true;
>> >+ }
>> >+ }
>> >+ }
>> >+ break;
>> >case IFN_GOACC_DIM_SIZE:
>> >case IFN_GOACC_DIM_POS:
>> >  result = fold_internal_goacc_dim (stmt);
>> >--- gcc/testsuite/gcc.dg/ubsan/pr81981.c.jj 2017-08-29 18:54:33.826107761 +0200
>> >+++ gcc/testsuite/gcc.dg/ubsan/pr81981.c 2017-08-29 18:55:36.721386827 +0200
>> >@@ -0,0 +1,21 @@
>> >+/* PR sanitizer/81981 */
>> >+/* { dg-do compile } */
>> >+/* { dg-options "-O2 -Wmaybe-uninitialized -fsanitize=undefined -ffat-lto-objects" } */
>> >+
>> >+int v;
>> >+
>> >+int
>> >+foo (int i)
>> >+{
>> >+  int t[1], u[1];
>> >+  int n = 0;
>> >+
>> >+  if (i)
>> >+{
>> >+  t[n] = i;
>> >+  u[0] = i;
>> >+}
>> >+
>> >+  v = u[0]; /* { dg-warning "may be used uninitialized in this function" } */
>> >+  return t[0]; /* { dg-warning "may be used uninitialized in this function" } */
>> >+}
>> >
>> >Jakub
>
>   Jakub
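
For reference, the folding condition in the IFN_UBSAN_BOUNDS hunk quoted
above can be modelled outside the compiler roughly as follows.  This is
only a sketch: "struct operand" and "bounds_check_is_redundant" are
hypothetical names, plain long values stand in for INTEGER_CST trees, and
the fold_convert of the index to the bound's type is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a call argument: either a compile-time
   constant with a known value, or something only known at run time.  */
struct operand
{
  bool is_constant;
  long value;
};

/* Mirrors the condition in the quoted hunk: the check can be dropped
   only when both operands are constants and index <= bound.  */
static bool
bounds_check_is_redundant (struct operand index, struct operand bound)
{
  return index.is_constant
	 && bound.is_constant
	 && index.value <= bound.value;
}
```

A non-constant index always keeps its check in this model, which is the
case the run-time instrumentation exists for.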

Re: [PATCH 02/17] diagnostics: support prefixes within diagnostic_show_locus

2017-09-01 Thread Jeff Law

On 07/24/2017 02:04 PM, David Malcolm wrote:
> This patch reworks diagnostic_show_locus so that it respects
> the prefix of the pretty_printer.
> 
> This is used later in the patch kit for embedding source dumps
> within a hierarchical tree-like dump, by using the ASCII art for the
> tree as the prefix.
> 
> gcc/c-family/ChangeLog:
>   * c-opts.c (c_diagnostic_finalizer): Move call to
>   pp_destroy_prefix to before call to diagnostic_show_locus.
> 
> gcc/ChangeLog:
>   * diagnostic-show-locus.c (layout::print_source_line):
>   Call pp_emit_prefix.
>   (layout::should_print_annotation_line_p): Likewise.
>   (layout::print_leading_fixits): Likewise.
>   (layout::print_trailing_fixits): Likewise, for first correction.
>   (layout::print_trailing_fixits): Replace call to move_to_column
>   with a call to print_newline, to avoid emitting a prefix after
>   last line.
>   (layout::move_to_column): Call pp_emit_prefix.
>   (layout::show_ruler): Likewise.
>   (diagnostic_show_locus): Remove save/restore of prefix.
>   (selftest::test_diagnostic_context::test_diagnostic_context): Set
>   prefixing rule to DIAGNOSTICS_SHOW_PREFIX_NEVER.
>   (selftest::test_one_liner_fixit_remove): Verify the interaction of
>   pp_set_prefix with rulers and fix-it hints.
>   * diagnostic.c (default_diagnostic_start_span_fn): Don't modify
>   the prefix.
> 
> gcc/testsuite/ChangeLog:
>   * gcc.dg/plugin/diagnostic_plugin_test_show_locus.c
>   (custom_diagnostic_finalizer): Reset pp_indentation of printer.
I don't see anything concerning here.  I think it's mostly a matter of
running with this when you need it.

jeff

Re: [PATCH 03/17] Core of BLT implementation

2017-09-01 Thread Jeff Law

On 07/24/2017 02:05 PM, David Malcolm wrote:
> This patch implements the core of the new "blt" type: an optional
> "on-the-side" hierarchical recording of source ranges, associated
> with grammar productions, with a sparse mapping to our regular
> "tree" type.
So I think one of the big questions here is whether or not BLT hits the
right compromise between the two syntax trees to meet the needs you
envision.  It's an area I'm not really versed in, so I'm probably not a
good judge of whether or not we're on the right track -- though it does
seem to my naive eyes that you need to have most of the tree to do
something like refactoring tools.

> 
> Caveats:
> * the name is a placeholder (see the comment in blt.h)
> * I'm ignoring memory management for now (e.g. how do these get freed?
>   possibly a custom arena, or an obstack within a blt_context or
>   somesuch)
Yea, management is a real question.  It obviously depends on the use
case -- ie, are the use cases such that we expect to work on a function
level, then move on to the next function and so-on.  Or is it the case
that we are likely to need to build the BLT for function A, then analyze
some other code in function B, then go back and issue fixit hints for
function A?  ie, what are the expected lifetimes of a BLT.

My understanding is you can have nodes in the BLT pointing to trees and
vice-versa.  This may have implications on memory management :-)

Jeff

Re: [PATCH 00/22] RFC: integrated 3rd-party static analysis support

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> This patch kit clearly isn't ready yet as-is (see e.g. the
> "known unknowns" below), but I'm posting it now in the hope of
> getting early feedback.
[ ... ]

> 
> 
> Statement of the problem
> 
> 
> Static analysis is IMHO done too late, if at all: static analysis tools are
> run as an optional extra, "on the side", rather than in developers' normal
> workflow, with some kind of "override the compiler and do extra work" hook,
> which may preclude running more than one analyzer at once.  Analysis results
> are reviewed (if at all) in some kind of on-the-side tool, rather than when
> the code is being edited, or patches being prepared.
I'm sure you know my opinions on this stuff.  But for the benefit of the
rest of our readers, I agree, 100% totally on all of this.

For checkers to really be effective, they have to be part of the
standard workflow that we use every day.  Anything else is ultimately a
losing battle.  That's in large part why I continue to support improving
GCC's ability to emit high quality useful warnings about likely
programming errors.

So this raises one very high level question.  By providing this
capability do we undermine further development of GCC's own analysis
capabilities or does it merely allow that development to move to its
most natural place (gcc, llvm/clang, smatch, cppcheck, whatever)
allowing each tool to focus on what it does best?


> 
> It would also be good to tag binaries with information on what analyzers
> were run, what options they were invoked with, etc.
> Potentially have "dump_file" information from optimization passes stored
> in the metadata also.   Have a tool to query all of this.
So as you know this is a real area of interest for Red Hat.  Nick has
been playing in this space with his binary annotation project.  How are
these likely to interact with each other?

[ ... ]

> 
> 
> Known unknowns
> ==
> 
> How does one suppress a specific false-positive site?
> Do we need a pragma for it?  (though pragmas ought to already affect some of
> the underlying checkers...)
I'm always conflicted on this kind of suppression/marking.  You can
easily end up with a boatload of unmaintainable markers.  But without
them you've got a firehose of useless information.  Sigh.


> 
> 
> Dependencies
> 
> 
> The "checkers" subdirectory uses Python 2 or 3, and has a few Python
> dependencies, including "firehose" and "gccinvocation".
I'm not sure if there's general buy-in on firehose.  Not sure about
gccinvocation.  So these may need revisiting.

But we certainly need a way to suck in and present information to the
developers.  I'd prefer to re-use existing concepts and code, so JSON
may be the way to go for the interchange format.




Jeff

[PATCH] Disable type demotion for sanitizer (PR sanitizer/82072)

2017-09-01 Thread Marek Polacek

Here, do_narrow and convert_to_integer_1 are demoting signed types to unsigned,
e.g. for
  i = i - lmin
where i is int and lmin is long int, so what we should produce is
  i = (int) ((long int) i - lmin)
but instead it produces
  i = (int) ((unsigned int) i - (unsigned int) lmin);
which hides the overflow.  Similarly for NEGATE_EXPR.  This patch prevents
such demoting when the sanitizer is on.

There still might be a similar issue with division or shifting, but I couldn't
trigger that.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2017-09-01  Marek Polacek  

PR sanitizer/82072
* convert.c (do_narrow): When sanitizing signed integer overflows,
bail out for signed types.
(convert_to_integer_1) : Likewise.

* c-c++-common/ubsan/pr82072.c: New test.

diff --git gcc/convert.c gcc/convert.c
index 22152cae79b..139d790fd98 100644
--- gcc/convert.c
+++ gcc/convert.c
@@ -434,6 +434,13 @@ do_narrow (location_t loc,
 typex = lang_hooks.types.type_for_size (TYPE_PRECISION (typex),
TYPE_UNSIGNED (typex));
 
+  /* The type demotion below might cause doing unsigned arithmetic
+ instead of signed, and thus hide overflow bugs.  */
+  if ((ex_form == PLUS_EXPR || ex_form == MINUS_EXPR)
+  && !TYPE_UNSIGNED (typex)
+  && sanitize_flags_p (SANITIZE_SI_OVERFLOW))
+return NULL_TREE;
+
   /* But now perhaps TYPEX is as wide as INPREC.
  In that case, do nothing special here.
  (Otherwise would recurse infinitely in convert.  */
@@ -895,7 +902,12 @@ convert_to_integer_1 (tree type, tree expr, bool dofold)
TYPE_UNSIGNED (typex));
 
  if (!TYPE_UNSIGNED (typex))
-   typex = unsigned_type_for (typex);
+   {
+ /* Using unsigned arithmetic may hide overflow bugs.  */
+ if (sanitize_flags_p (SANITIZE_SI_OVERFLOW))
+   break;
+ typex = unsigned_type_for (typex);
+   }
  return convert (type,
  fold_build1 (ex_form, typex,
   convert (typex,
diff --git gcc/testsuite/c-c++-common/ubsan/pr82072.c gcc/testsuite/c-c++-common/ubsan/pr82072.c
index e69de29bb2d..d5683406b14 100644
--- gcc/testsuite/c-c++-common/ubsan/pr82072.c
+++ gcc/testsuite/c-c++-common/ubsan/pr82072.c
@@ -0,0 +1,19 @@
+/* PR sanitizer/82072 */
+/* { dg-do run } */
+/* { dg-options "-fsanitize=signed-integer-overflow" } */
+
+int
+main ()
+{
+  long long l = -__LONG_LONG_MAX__ - 1;
+  int i = 0;
+  asm volatile ("" : "+r" (i));
+  i -= l;
+  asm volatile ("" : "+r" (i));
+  i = -l;
+  asm volatile ("" : "+r" (i));
+  return 0;
+}
+
+/* { dg-output "signed integer overflow: 0 - -9223372036854775808 cannot be represented in type 'long long int'\[^\n\r]*(\n|\r\n|\r)" } */
+/* { dg-output "\[^\n\r]*negation of -9223372036854775808 cannot be represented in type 'long long int'\[^\n\r]*; cast to an unsigned type to negate this value to itself" } */

Marek
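
To make the difference between the two forms concrete, here is a small
sketch outside the compiler.  It is not part of the patch: the function
names are hypothetical, long long is used for the wide type as in the new
test, and __builtin_sub_overflow (a GCC/Clang builtin) merely stands in
for the check the sanitizer would perform on the wide signed subtraction.

```c
#include <limits.h>
#include <stdbool.h>

/* Wide form, i = (int) ((long long) i - l): the subtraction is signed
   and done in the wide type, so an overflow there is observable.  */
static bool
wide_subtraction_overflows (int i, long long l)
{
  long long result;
  return __builtin_sub_overflow ((long long) i, l, &result);
}

/* Demoted form, i = (int) ((unsigned) i - (unsigned) l): unsigned
   arithmetic wraps silently, so there is nothing left to report.  */
static unsigned int
demoted_subtraction (int i, long long l)
{
  return (unsigned int) i - (unsigned int) l;
}
```

For i = 0 and l = LLONG_MIN the wide form overflows, while the demoted
form quietly yields 0 because the low 32 bits of LLONG_MIN are all zero.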

Re: [PATCH 01/22] Expose assert_loceq outside of input.c; add ASSERT_LOCEQ

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> gcc/ChangeLog:
>   * input.c: Include "selftest-input.h".
>   (selftest::assert_loceq): Remove "static".  Add "report_loc" param
>   and update assertions to use it.
>   (selftest::test_accessing_ordinary_linemaps): Use ASSERT_LOCEQ
>   rather than assert_loceq.
>   (selftest::test_builtins): Likewise.
>   * selftest-input.h: New file.
No concerns here.  IMHO this is all probably within an area that I think
you could argue for self-approval.

jeff

Re: [PATCH 02/22] libcpp: add linemap_position_for_file_line_and_column

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> gcc/ChangeLog:
>   * input.c (selftest::test_making_arbitrary_locations): New function.
>   (selftest::input_c_tests): Call it.
> 
> libcpp/ChangeLog:
>   * include/line-map.h (linemap_position_for_file_line_and_column):
>   New decl.
>   * line-map.c (linemap_position_for_file_line_and_column): New
>   function.
Similarly.  No concerns.  Go with it when it makes sense.

jeff

Re: [PATCH 03/22] Add JSON implementation

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> This patch adds support to gcc for reading and writing JSON,
> based on DOM-like trees of json::value instances.
> 
> gcc/ChangeLog:
>   * Makefile.in (OBJS): Add json.o.
>   * json.cc: New file.
>   * json.h: New file.
>   * selftest-run-tests.c (selftest::run_tests): Call json_cc_tests.
>   * selftest.h (selftest::json_cc_tests): New decl.
Any chance we can re-use an implementation?  I don't see a lot of value
in providing our own JSON bits.  Is JSON inherently easier/better in
some way over XML or other formats we could be using for the interchange
of data?


jeff

[C++ PATCH] missing set_class_binding call

2017-09-01 Thread Nathan Sidwell

This patch adds a set_class_bindings call that was missing from the 
template class definition case.


Brief experimentation shows undetectable difference in performance.

Applied to trunk.

nathan
--
Nathan Sidwell
2017-09-01  Nathan Sidwell  

	* class.c (finish_struct): Call set_class_bindings for the
	template case too.

Index: class.c
===
--- class.c	(revision 251604)
+++ class.c	(working copy)
@@ -7188,6 +7188,7 @@ finish_struct (tree t, tree attributes)
   /* COMPLETE_TYPE_P is now true.  */
 
   finish_struct_methods (t);
+  set_class_bindings (t, TYPE_FIELDS (t));
 
   /* We need to emit an error message if this type was used as a parameter
 	 and it is an abstract type, even if it is a template. We construct

Re: [PATCH 06/22] Makefile.in: hack in -lpthread

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> The checker.cc patch later in the kit can optionally make use of pthread
> if available.
> 
> Doing it properly would involve some configure checks; this patch simply
> hacks in -lpthread into LIB unconditionally for now.
> 
> gcc/ChangeLog:
>   * Makefile.in (LIB): Hack in -lpthread.
Obviously this would need to be improved before it could move forward.

What's the advantage of firing off a thread vs the usual fork/exec
model?  fork/exec is something we know how to deal with across all the
hosts GCC supports.

jeff

Re: [PATCH 07/22] Add minimal version of Nick Clifton's annobin code

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> This patch provides a way to "watermark" binaries with
> metadata.  It's used later in the patch kit to watermark
> binaries with static analysis results and metadata.
> 
> See:
>   https://fedoraproject.org/wiki/Toolchain/Watermark
> 
> Note: this is a version of Nick Clifton's "annobin" gcc plugin:
>   https://nickc.fedorapeople.org/
> heavily hacked up by me:
> * removed everything (including plugin support) not needed by
>   later patches in the kit
> * rewritten as an API, rather than as a plugin
> * removed annobin_inform (..., "ICE: ...") calls in favor of
>   gcc_assert.
> * line-wrapped
> * added an annobin_ensure_init to initialize annobin_is_64bit.
> * added #ifndef guard to annobin.h
> 
> It includes the commits:
> * Remove size limit on string passed to annobin_output_string_note
> * Version 2 of spec: Add a GA prefix to all names
So we'd really like to have just one annobin that handles Nick's needs
as well as yours.  Having two just seems silly.

The idea behind using a plugin was to allow annobin to continue
development/releases independent of the trunk of GCC.  Ideally we'd be
able to re-use new versions of annobin in older versions of GCC that
have a fairly long lifecycle.

I think you and Nick need to coordinate in this space.

jeff

Re: [PATCH 05/22] diagnostic.c/h: add support for external tools

2017-09-01 Thread Jeff Law

On 08/04/2017 04:04 PM, David Malcolm wrote:
> This patch adds fields "external_tool" and "external_test_id"
> to diagnostic_info, allowing for diagnostics to be marked as
> coming from a 3rd-party tool.
> 
> Instead of printing the pertinent warning flag e.g.:
> 
>   foo.c:10:1: something is wrong [-Wpointer-arith]
> 
> the tool "ID" and (optionally) test ID is printed e.g.:
> 
>   foo.c:10:1: something is wrong [cppcheck:memleak]
> 
> gcc/ChangeLog:
>   * diagnostic-show-locus.c: Include "selftest-diagnostic.h".
>   (class selftest::test_diagnostic_context): Move to
>   selftest-diagnostic.h.
>   * diagnostic.c: Include "selftest-diagnostic.h".
>   (diagnostic_info::diagnostic_info): New ctor.
>   (print_option_information): Handle external_tool and
>   external_test_id fields of diagnostic_info.
>   (diagnostic_report_diagnostic): Assert that diagnostic->kind is
>   not DK_UNSPECIFIED.
>   (selftest::dummy_option_name_cb): New function.
>   (selftest::assert_option_information): New function.
>   (selftest::test_print_option_information): New function.
>   (selftest::diagnostic_c_tests): Call
>   selftest::test_print_option_information.
>   * diagnostic.h (struct diagnostic_info): Add default ctor,
>   along with new fields "external_tool" and "external_test_id".
>   * selftest-diagnostic.h: New file.
Seems fairly reasonable and probably in the realm of self-approvable.

jeff
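
As an illustration of the bracketed suffix described in the quoted patch,
a simplified model might look like the following.  This is a sketch, not
the code in diagnostic.c: format_option_info is a hypothetical name, and
printing just "[tool]" when no test ID is given is an assumption based on
the test ID being described as optional.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the suffix printed after a diagnostic.
   A diagnostic from an external tool shows "[tool:test-id]" (or
   "[tool]" without a test ID); otherwise the warning flag is shown.  */
static int
format_option_info (char *buf, size_t size, const char *option_name,
		    const char *external_tool, const char *external_test_id)
{
  if (external_tool && external_test_id)
    return snprintf (buf, size, "[%s:%s]", external_tool, external_test_id);
  if (external_tool)
    return snprintf (buf, size, "[%s]", external_tool);
  if (option_name)
    return snprintf (buf, size, "[%s]", option_name);
  if (size > 0)
    buf[0] = '\0';
  return 0;
}
```

Called with option_name "-Wpointer-arith" and no tool, the buffer holds
"[-Wpointer-arith]"; called with tool "cppcheck" and test ID "memleak",
it holds "[cppcheck:memleak]", matching the two examples in the patch
description.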

[PATCH v2] [libcc1] Rename C{,P}_COMPILER_NAME and remove triplet from them

2017-09-01 Thread Sergio Durigan Junior

On Wednesday, August 23 2017, Pedro Alves wrote:

> On 08/23/2017 05:17 AM, Sergio Durigan Junior wrote:
>> Hi there,
>> 
>> This is a series of two patches, one for GDB and one for GCC, which aims
>> to improve the detection and handling of triplets present on compiler
>> names.  The motivation for this series was mostly the fact that GDB's
>> "compile" command is broken on Debian unstable, as can be seen here:
>> 
>>   
>> 
>> The reason for the failure is the fact that Debian compiles GCC using
>> the --program-{prefix,suffix} options from configure in order to name
>> the compiler using the full triplet (i.e., Debian's GCC is not merely
>> named "gcc", but e.g. "x86_64-linux-gnu-gcc-7"), which end up naming the
>> C_COMPILER_NAME and CP_COMPILER_NAME defines with the specified prefix
>> and suffix.  Therefore, the regexp being used to match the compiler name
>> is wrong because it doesn't take into account the fact that the defines
>> may already contain the triplets.
>
> As discussed on IRC, I think the problem is that C_COMPILER_NAME
> in libcc1 includes the full triplet in the first place.  I think
> that it shouldn't.  I think that C_COMPILER_NAME should always
> be "gcc".
>
> The problem is in bootstrapping code, before there's a plugin
> yet -- i.e., in the code that libcc1 uses to find the compiler (which
> then loads a plugin that libcc1 talks with).
>
> Please bear with me while I lay down my rationale, so that we're
> on the same page.
>
> C_COMPILER_NAME seems to include the prefix currently in an attempt
> to support cross debugging, or more generically, --enable-targets=all
> in gdb, but the whole thing doesn't really work as intended if
> C_COMPILER_NAME already includes a target prefix.
>
> IIUC the libcc1/plugin design, a single "libcc1.so" (what gdb loads,
> not the libcc1plugin compiler plugin) should work with any compiler in
> the PATH, in case you have several in the system.  E.g., one for
> each arch.
>
> Let me expand.
>
> The idea is that gdb always dlopens "libcc1.so", by that name exactly.
> Usually that'll open the libcc1.so installed in the system, e.g.,
> "/usr/lib64/libcc1.so", which for convenience was originally built from the
> same source tree as the system's compiler was built.  You could force gdb to
> load some other libcc1.so, e.g., by tweaking LD_LIBRARY_PATH of course,
> but you shouldn't need to.
>
> libcc1.so is responsible for finding a compiler that targets the
> architecture of the inferior that the user is debugging in gdb.
> E.g., say you're cross debugging for arm-none-eabi, on a
> x86-64 Fedora host.  GDB knows the target inferior's architecture, and passes
> down to (the system) libcc1 a triplet regex like "arm*-*eabi*" or
> similar.  libcc1 appends "-" + C_COMPILER_NAME to that regex,
> generating something like "arm*-*eabi*-gcc", and then looks for binaries
> in PATH that match that regex.  When one is found, e.g., "arm-none-eabi-gcc",
> libcc1 forks/execs that compiler, passing it "-fplugin=libcc1plugin".
> libcc1 then communicates with that compiler's libcc1plugin plugin
> via a socket.
>
> In this scheme, "libcc1.so", the library that gdb loads, has no
> target-specific logic at all.  It should work with any compiler
> in the system, for any target/arch.  All it does is marshall the gcc/gdb
> interface between the gcc plugin and gdb, it is not linked against gcc.
> That boundary is versioned, and ABI-stable.  So as long as the
> libcc1.so that gdb loads understands the same API version of the gcc/gdb
> interface API as gdb understands, it all should work.  (The APIs
> are always extended keeping backward compatibility.)
>
> So in this scheme, having the "C_COMPILER_NAME" macro in libcc1
> include the target prefix for the --target that the plugin that
> libcc1 is built along with, seems to serve no real purpose, AFAICT.
> It's just getting in the way.
>
> I.e., something like:
>
>   "$gdb_specified_triplet_re" + "-" + C_COMPILER_NAME
>
> works if C_COMPILER_NAME is exactly "gcc", but not if C_COMPILER_NAME is already:
>
>   "$whatever_triplet_libcc1_happened_to_be_built_with" + "-gcc"
>
> because we end up with:
>
>   "$gdb_specified_triplet_re" + "-" + "$whatever_triplet_libcc1_happened_to_be_built_with" + "-gcc"
>
> which is the problem case.
>
> In sum, I think the libcc1.so (not the plugin) should _not_ have baked
> in target awareness, and thus C_COMPILER_NAME should always be "gcc", and
> then libcc1's regex should be adjusted to also tolerate a suffix in
> the final compiler binary name regex.
>
> WDYT?
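
The concatenation problem described above can be reproduced with a small
sketch using POSIX regular expressions.  This is not libcc1's actual
lookup code: compiler_matches is a hypothetical helper, the triplet
patterns are written as proper regexes rather than the loose
"arm*-*eabi*" notation, and the suffix tolerance proposed above is not
modelled.

```c
#include <regex.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical model of the lookup: build "^<triplet_re>-<name>$" and
   test one candidate binary name from PATH against it.  */
static bool
compiler_matches (const char *triplet_re, const char *compiler_name,
		  const char *candidate)
{
  char pattern[256];
  regex_t re;
  bool matched;
  int n;

  n = snprintf (pattern, sizeof pattern, "^%s-%s$", triplet_re,
		compiler_name);
  if (n < 0 || (size_t) n >= sizeof pattern)
    return false;	/* Pattern did not fit in the buffer.  */
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return false;	/* Pattern was not a valid regex.  */
  matched = regexec (&re, candidate, 0, NULL, 0) == 0;
  regfree (&re);
  return matched;
}
```

With compiler_name "gcc", the triplet regex "arm.*-.*eabi.*" matches
"arm-none-eabi-gcc".  With compiler_name "x86_64-linux-gnu-gcc-7", as on
Debian, the triplet ends up in the pattern twice and even the native
"x86_64-linux-gnu-gcc-7" binary is no longer found.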

As I replied before, I agree with Pedro's rationale here and his idea
actually makes my initial patch much simpler.  By renaming
C_COMPILER_NAME (and the new CP_COMPILER_NAME) to just "gcc" (or "g++"),
the Debian bug I was fixing is solved and we don't have to bother with
breaking compatibility with older gdb's packaged by the distros, because
they will also keep wor

PING: [Updated, PATCH] i386: Avoid stack realignment if possible

2017-09-01 Thread H.J. Lu

On Sun, Aug 13, 2017 at 3:02 PM, H.J. Lu  wrote:
> On Mon, Aug 07, 2017 at 08:58:49AM -0700, H.J. Lu wrote:
>> On Tue, Jul 25, 2017 at 7:54 AM, Uros Bizjak  wrote:
>> > On Tue, Jul 25, 2017 at 3:52 PM, H.J. Lu  wrote:
>> >> On Fri, Jul 14, 2017 at 4:46 AM, H.J. Lu  wrote:
>> >>> On Fri, Jul 7, 2017 at 5:56 PM, H.J. Lu  wrote:
>>  On Fri, Jul 07, 2017 at 09:58:42AM -0700, H.J. Lu wrote:
>> > On Fri, Dec 20, 2013 at 8:06 AM, Jakub Jelinek wrote:
>> > > Hi!
>> > >
>> > > Honza recently changed the i?86 backend, so that it often doesn't
>> > > do -maccumulate-outgoing-args by default on x86_64.
>> > > Unfortunately, on some of the here included testcases this regressed
>> > > quite a bit the generated code.  As AVX vectors are used, the dynamic
>> > > realignment code needs to assume e.g. that some of them will need to
>> > > be spilled, and for -mno-accumulate-outgoing-args the code needs to set
>> > > need_drap early as well.  But when emitting the prologue/epilogue,
>> > > if need_drap is set, we don't perform the optimization for leaf
>> > > functions which have zero size stack frame, thus we end up with
>> > > uselessly doing dynamic stack realignment, setting up DRAP that nothing
>> > > uses and later on restore everything back.
>> > >
>> > > This patch improves it, if the DRAP register isn't live at the start
>> > > of entry bb successor and we aren't going to realign the stack, we don't
>> > > need DRAP at all, and even if we need DRAP register, that can't be
>> > > the sole reason for doing stack realignment, the prologue code is able
>> > > to set up DRAP even without dynamic stack realignment.
>> > >
>> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>> > >
>> > > 2013-12-20  Jakub Jelinek  
>> > >
>> > > PR target/59501
>> > > * config/i386/i386.c (ix86_save_reg): Don't return true for
>> > > drap_reg if !crtl->stack_realign_needed.
>> > > (ix86_finalize_stack_realign_flags): If drap_reg isn't live
>> > > on entry and stack_realign_needed will be false, clear drap_reg and
>> > > need_drap.  Optimize leaf functions that don't need stack frame even if
>> > > crtl->need_drap.
>> > >
>> > > * gcc.target/i386/pr59501-1.c: New test.
>> > > * gcc.target/i386/pr59501-1a.c: New test.
>> > > * gcc.target/i386/pr59501-2.c: New test.
>> > > * gcc.target/i386/pr59501-2a.c: New test.
>> > > * gcc.target/i386/pr59501-3.c: New test.
>> > > * gcc.target/i386/pr59501-3a.c: New test.
>> > > * gcc.target/i386/pr59501-4.c: New test.
>> > > * gcc.target/i386/pr59501-4a.c: New test.
>> > > * gcc.target/i386/pr59501-5.c: New test.
>> > > * gcc.target/i386/pr59501-6.c: New test.
>> >
>> > LGTM, assuming Jakub is OK with the patch.
>> >
>> > Thanks,
>> > Uros.
>>
>> Jakub, can you take a look at this:
>>
>> https://gcc.gnu.org/ml/gcc-patches/2017-07/msg00400.html
>>
>
> Here is the updated patch to fix
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81769
>
> OK for trunk?
>
> Thanks.
>
> H.J.
> ---
> ix86_finalize_stack_frame_flags has been extended to eliminate frame
> pointer when the new stack frame isn't needed with and without
> -maccumulate-outgoing-args as well as -fomit-frame-pointer.  Since stack
> access with larger alignment may be optimized out, to decide if stack
> realignment is needed, we need to not only check for stack frame access,
> but also verify the alignment of stack frame access.  Since alignment of
> memory access via arg_pointer is set up by caller, not by callee, we
> should find the maximum stack alignment from the stack frame access
> instructions via stack pointer and frame pointer to avoid stack
> realignment when stack alignment needed is less than incoming stack
> boundary.
>
> gcc/
>
> PR target/59501
> PR target/81624
> PR target/81769
> * config/i386/i386.c (ix86_finalize_stack_frame_flags): Don't
> realign stack if stack alignment needed is less than incoming
> stack boundary.
>
> gcc/testsuite/
>
> PR target/59501
> PR target/81624
> PR target/81769
> * gcc.target/i386/pr59501-4a.c: Remove xfail.
> * gcc.target/i386/pr81769-1a.c: New test.
> * gcc.target/i386/pr81769-1b.c: Likewise.
> * gcc.target/i386/pr81769-2.c: Likewise.
> ---
>  gcc/config/i386/i386.c                     | 143 ++---
>  gcc/testsuite/gcc.target/i386/pr59501-4a.c |   2 +-
>  gcc/testsuite/gcc.target/i386/pr81769-1a.c |  21 +
>  gcc/testsuite/gcc.target/i386/pr81769-1b.c |   7 ++
>  gcc/testsuite/gcc.target/i386/pr81769-2.c  |  21 ++

[C++ PATCH] move METHOD_VEC handling

2017-09-01 Thread Nathan Sidwell

This patch moves the CLASSTYPE_METHOD_VEC sorting functions from class.c 
to name-lookup.c.  It also merges the initial sort call into 
set_class_bindings, leaving that function in control of arranging the 
post-definition lookup data objects.


Applied to trunk.

nathan
--
Nathan Sidwell
2017-09-01  Nathan Sidwell  

	* cp-tree.h (resort_type_method_vec): Move declaration to ...
	* name-lookup.h (resort_type_method_vec): ... here.
	(set_class_bindings): Lose 2nd arg.
	* class.c (finish_struct_1, finish_struct): Adjust
	set_class_bindings call.  Don't call finish_struct_methods.
	(resort_data, method_name_cmp, resort_method_name_cmp,
	resort_type_method_vec, finish_struct_methods): Move to ...
	* name-lookup.c (resort_data, method_name_cmp,
	resort_method_name_cmp, resort_type_method_vec): ... here.
	(set_class_bindings): Lose fields arg.  Swallow finish_struct_methods.

Index: class.c
===
--- class.c	(revision 251608)
+++ class.c	(working copy)
@@ -129,10 +129,7 @@ static void handle_using_decl (tree, tre
 static tree dfs_modify_vtables (tree, void *);
 static tree modify_all_vtables (tree, tree);
 static void determine_primary_bases (tree);
-static void finish_struct_methods (tree);
 static void maybe_warn_about_overly_private_class (tree);
-static int method_name_cmp (const void *, const void *);
-static int resort_method_name_cmp (const void *, const void *);
 static void add_implicitly_declared_members (tree, tree*, int, int);
 static tree fixed_type_or_null (tree, int *, int *);
 static tree build_simple_base_path (tree expr, tree binfo);
@@ -2247,76 +2244,6 @@ maybe_warn_about_overly_private_class (t
 }
 }
 
-static struct {
-  gt_pointer_operator new_value;
-  void *cookie;
-} resort_data;
-
-/* Comparison function to compare two TYPE_METHOD_VEC entries by name.  */
-
-static int
-method_name_cmp (const void* m1_p, const void* m2_p)
-{
-  const tree *const m1 = (const tree *) m1_p;
-  const tree *const m2 = (const tree *) m2_p;
-
-  if (OVL_NAME (*m1) < OVL_NAME (*m2))
-return -1;
-  return 1;
-}
-
-/* This routine compares two fields like method_name_cmp but using the
-   pointer operator in resort_field_decl_data.  */
-
-static int
-resort_method_name_cmp (const void* m1_p, const void* m2_p)
-{
-  const tree *const m1 = (const tree *) m1_p;
-  const tree *const m2 = (const tree *) m2_p;
-
-  tree n1 = OVL_NAME (*m1);
-  tree n2 = OVL_NAME (*m2);
-  resort_data.new_value (&n1, resort_data.cookie);
-  resort_data.new_value (&n2, resort_data.cookie);
-  if (n1 < n2)
-return -1;
-  return 1;
-}
-
-/* Resort TYPE_METHOD_VEC because pointers have been reordered.  */
-
-void
-resort_type_method_vec (void* obj,
-			void* /*orig_obj*/,
-			gt_pointer_operator new_value,
-			void* cookie)
-{
-  if (vec<tree, va_gc> *method_vec = (vec<tree, va_gc> *) obj)
-{
-  resort_data.new_value = new_value;
-  resort_data.cookie = cookie;
-  qsort (method_vec->address (), method_vec->length (), sizeof (tree),
-	 resort_method_name_cmp);
-}
-}
-
-/* Warn about duplicate methods in fn_fields.
-
-   Sort methods that are not special (i.e., constructors, destructors,
-   and type conversion operators) so that we can find them faster in
-   search.  */
-
-static void
-finish_struct_methods (tree t)
-{
-  vec<tree, va_gc> *method_vec = CLASSTYPE_METHOD_VEC (t);
-  if (!method_vec)
-return;
-
-  qsort (method_vec->address (), method_vec->length (),
-	 sizeof (tree), method_name_cmp);
-}
-
 /* Make BINFO's vtable have N entries, including RTTI entries,
vbase and vcall offsets, etc.  Set its type and call the back end
to lay it out.  */
@@ -6966,8 +6893,7 @@ finish_struct_1 (tree t)
   layout_class_type (t, &virtuals);
   /* COMPLETE_TYPE_P is now true.  */
 
-  finish_struct_methods (t);
-  set_class_bindings (t, TYPE_FIELDS (t));
+  set_class_bindings (t);
 
   if (CLASSTYPE_AS_BASE (t) != t)
 /* We use the base type for trivial assignments, and hence it
@@ -7187,8 +7113,7 @@ finish_struct (tree t, tree attributes)
   TYPE_SIZE_UNIT (t) = size_zero_node;
   /* COMPLETE_TYPE_P is now true.  */
 
-  finish_struct_methods (t);
-  set_class_bindings (t, TYPE_FIELDS (t));
+  set_class_bindings (t);
 
   /* We need to emit an error message if this type was used as a parameter
 	 and it is an abstract type, even if it is a template. We construct
Index: cp-tree.h
===
--- cp-tree.h	(revision 251592)
+++ cp-tree.h	(working copy)
@@ -5952,8 +5952,6 @@ extern tree convert_to_base_statically
 extern tree build_vtbl_ref			(tree, tree);
 extern tree build_vfn_ref			(tree, tree);
 extern tree get_vtable_decl			(tree, int);
-extern void resort_type_method_vec		(void *, void *,
-		 gt_pointer_operator, void *);
 extern bool add_method(tree, tree, bool);
 extern tree declared_access			(tree);
 extern tree currently_open_class		(tree);
Index: name-lookup.c

backwards threader cleanups

2017-09-01 Thread Aldy Hernandez

Hi.

Attached are misc cleanups to tree-ssa-threadbackwards.c and friends.
The main gist of the patch is making the path vectors live in the
heap, not GC.  But I also cleaned up some comments to reflect reality,
and renamed VAR_BB which could use a more meaningful name.  Finally, I
abstracted some common code to
register_jump_thread_path_if_profitable() in preparation for some
upcoming work by me :).

Tested on x86-64 Linux.

OK?


curr
Description: Binary data

Re: [PATCH] libquadmath/81848, allow PowerPC to enable libquadmath

2017-09-01 Thread Michael Meissner

On Wed, Aug 30, 2017 at 01:11:51AM +0200, Jakub Jelinek wrote:
> On Tue, Aug 15, 2017 at 11:06:01PM -0400, Michael Meissner wrote:
> > 2017-08-15  Michael Meissner  
> > 
> > PR libquadmath/81848
> > * configure.ac (powerpc*-linux*): Use attribute mode KC to create
> > complex __float128 on PowerPC instead of attribute mode TC.
> > * quadmth.h (__complex128): Likewise.
> 
> quadmath.h ?

Yes, thanks.

> > * configure: Regenerate.
> > * math/cbrtq.c (CBRT2): Use __float128 not long double.
> > (CBRT4): Likewise.
> > (CBRT2I): Likewise.
> > (CBRT4I): Likewise.
> > * math/j0q.c (U0): Likewise.
> > * math/sqrtq.c (sqrtq): Don't depend on implicit conversion
> > between __float128, instead explicitly convert the __float128
> > value to long double because the PowerPC does not allow __float128
> > and long double in the same expression.
> 
> Does the Q suffix on ppc* imply __float128 like on x86_64 etc.?

Yes.

> > --- libquadmath/math/sqrtq.c(revision 251097)
> > +++ libquadmath/math/sqrtq.c(working copy)
> > @@ -31,15 +31,18 @@ sqrtq (const __float128 x)
> >  return y;
> >}
> >  
> > -#ifdef HAVE_SQRTL
> > -  if (x <= LDBL_MAX && x >= LDBL_MIN)
> > +#if defined(HAVE_SQRTL)
> 
> Why the #ifdef -> #if defined change?  That looks unnecessary.

I'll change it back.

> >{
> > -/* Use long double result as starting point.  */
> > -y = sqrtl ((long double) x);
> > +long double xl = (long double)x;
> 
> Please add a space after (long double)

Ok.

> > +if (xl <= LDBL_MAX && xl >= LDBL_MIN)
> > +  {
> > +   /* Use long double result as starting point.  */
> > +   y = sqrtl (xl);
> >  
> > -/* One Newton iteration.  */
> > -y -= 0.5q * (y - x / y);
> > -return y;
> > +   /* One Newton iteration.  */
> > +   y -= 0.5q * (y - x / y);
> > +   return y;
> > +  }
> >}
> >  #endif
> >  
> 
> Otherwise LGTM.

Thanks.
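
For readers following the sqrtq hunk above, here is a minimal sketch of the
shape of the revised fast path: convert once to the narrower type,
range-check the converted value, then refine the narrow estimate with one
Newton iteration.  It uses long double and double as stand-ins for
__float128 and long double, and the helper names are made up for
illustration; it is not the libquadmath code.

```c
#include <assert.h>
#include <float.h>

/* Refine an estimate Y of sqrt (X) with one Newton iteration.  This is
   the same update as in the quoted patch: y -= 0.5 * (y - x / y).  */
static long double
newton_sqrt_step (long double x, long double y)
{
  assert (y != 0.0L);
  return y - 0.5L * (y - x / y);
}

/* Return nonzero if X, converted explicitly to the narrower type, lands
   in that type's normal positive range.  The comparison is done on the
   converted value, so the wide and narrow types never meet in one
   expression.  (Hypothetical helper, not part of the patch.)  */
static int
fits_narrow_range (long double x)
{
  double xd = (double) x;
  return xd <= DBL_MAX && xd >= DBL_MIN;
}
```

A caller would first check fits_narrow_range (x), take the narrow-type
square root of the converted value as the starting point, and pass that
to newton_sqrt_step.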

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

[PATCH] Add UBSAN_{PTR,BOUNDS} folding (PR sanitizer/81981, take 2)

2017-09-01 Thread Jakub Jelinek

On Fri, Sep 01, 2017 at 07:10:51PM +0200, Richard Biener wrote:
> OK, I thought we have one.  Can you add a helper for it please? 
> replace_with_nop or so?  I thought there's maybe replace_with_value which
> handles null lhs by replacing with nop.  (can't check, writing from phone)

Actually, you're right, replace_call_with_value does the right thing
when called on call without lhs (all these internal fns don't have lhs),
and NULL_TREE val ensures we'd ICE if that ever wasn't the case.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2017-09-01  Jakub Jelinek  

PR sanitizer/81981
* gimple-fold.c (gimple_fold_call): Optimize away useless UBSAN_PTR
and UBSAN_BOUNDS internal calls.  Clean up IFN_UBSAN_OBJECT_SIZE
handling.  Use replace_call_with_value with NULL instead of
gsi_replace, unlink_stmt_vdef and release_defs.

* gcc.dg/ubsan/pr81981.c: New test.

--- gcc/gimple-fold.c.jj2017-09-01 09:26:37.054748039 +0200
+++ gcc/gimple-fold.c   2017-09-01 19:37:03.283795450 +0200
@@ -3936,18 +3936,43 @@ gimple_fold_call (gimple_stmt_iterator *
gimple_call_arg (stmt, 2));
  break;
case IFN_UBSAN_OBJECT_SIZE:
- if (integer_all_onesp (gimple_call_arg (stmt, 2))
- || (TREE_CODE (gimple_call_arg (stmt, 1)) == INTEGER_CST
- && TREE_CODE (gimple_call_arg (stmt, 2)) == INTEGER_CST
- && tree_int_cst_le (gimple_call_arg (stmt, 1),
- gimple_call_arg (stmt, 2))))
+ {
+   tree offset = gimple_call_arg (stmt, 1);
+   tree objsize = gimple_call_arg (stmt, 2);
+   if (integer_all_onesp (objsize)
+   || (TREE_CODE (offset) == INTEGER_CST
+   && TREE_CODE (objsize) == INTEGER_CST
+   && tree_int_cst_le (offset, objsize)))
+ {
+   replace_call_with_value (gsi, NULL_TREE);
+   return true;
+ }
+ }
+ break;
+   case IFN_UBSAN_PTR:
+ if (integer_zerop (gimple_call_arg (stmt, 1)))
{
- gsi_replace (gsi, gimple_build_nop (), false);
- unlink_stmt_vdef (stmt);
- release_defs (stmt);
+ replace_call_with_value (gsi, NULL_TREE);
  return true;
}
  break;
+   case IFN_UBSAN_BOUNDS:
+ {
+   tree index = gimple_call_arg (stmt, 1);
+   tree bound = gimple_call_arg (stmt, 2);
+   if (TREE_CODE (index) == INTEGER_CST
+   && TREE_CODE (bound) == INTEGER_CST)
+ {
+   index = fold_convert (TREE_TYPE (bound), index);
+   if (TREE_CODE (index) == INTEGER_CST
+   && tree_int_cst_le (index, bound))
+ {
+   replace_call_with_value (gsi, NULL_TREE);
+   return true;
+ }
+ }
+ }
+ break;
case IFN_GOACC_DIM_SIZE:
case IFN_GOACC_DIM_POS:
  result = fold_internal_goacc_dim (stmt);
--- gcc/testsuite/gcc.dg/ubsan/pr81981.c.jj 2017-09-01 19:35:37.555782465 +0200
+++ gcc/testsuite/gcc.dg/ubsan/pr81981.c 2017-09-01 19:35:37.555782465 +0200
@@ -0,0 +1,21 @@
+/* PR sanitizer/81981 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wmaybe-uninitialized -fsanitize=undefined -ffat-lto-objects" } */
+
+int v;
+
+int
+foo (int i)
+{
+  int t[1], u[1];
+  int n = 0;
+
+  if (i)
+{
+  t[n] = i;
+  u[0] = i;
+}
+
+  v = u[0]; /* { dg-warning "may be used uninitialized in this function" } */
+  return t[0]; /* { dg-warning "may be used uninitialized in this function" } */
+}


Jakub
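
As a rough illustration of the three folding conditions in the patch
above, here is a toy model in plain C.  The struct and function names are
invented for this sketch; it ignores the type conversion and signedness
handling that the real code gets from fold_convert and tree_int_cst_le,
and it is not the gimple-fold.c implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a call argument: either a known constant or not.  */
struct arg
{
  bool is_const;
  uint64_t value;
};

/* UBSAN_PTR: nothing to check when the offset is a constant zero.  */
static bool
ptr_check_is_useless (struct arg offset)
{
  return offset.is_const && offset.value == 0;
}

/* UBSAN_BOUNDS: useless when index and bound are both constants and
   the index does not exceed the bound.  */
static bool
bounds_check_is_useless (struct arg index, struct arg bound)
{
  return index.is_const && bound.is_const && index.value <= bound.value;
}

/* UBSAN_OBJECT_SIZE: useless when the object size is all ones (size
   unknown) or when a constant offset fits in a constant size.  */
static bool
object_size_check_is_useless (struct arg offset, struct arg objsize)
{
  if (objsize.is_const && objsize.value == UINT64_MAX)
    return true;
  return offset.is_const && objsize.is_const
	 && offset.value <= objsize.value;
}
```

In the new test, t[n] with n == 0 is the kind of access where index and
bound are both constants, so the bounds call can go away.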

[PATCH] Fix a pasto in lra-remat.c (reg_overlap_for_remat_p)

2017-09-01 Thread Jakub Jelinek

Hi!

This is something that has been reported privately to me.
This is in code introduced in
https://gcc.gnu.org/ml/gcc-patches/2016-03/msg00660.html
and looks indeed like a pasto to me, before the loop there is
a very similar set of stmts without the 2 suffix, and unless
regno being a hard reg (after possible reg_renumber) implies
that regno2 (after possible reg_renumber) is a hard reg, if
unlucky we might access out of bounds.

I don't have a testcase for this, but have bootstrapped/regtested
it on x86_64-linux and i686-linux, ok for trunk?

2017-09-01  Jakub Jelinek  

* lra-remat.c (reg_overlap_for_remat_p): Fix a pasto.

--- gcc/lra-remat.c.jj  2017-05-25 10:37:00.0 +0200
+++ gcc/lra-remat.c 2017-09-01 19:42:07.615291583 +0200
@@ -684,7 +684,7 @@ reg_overlap_for_remat_p (lra_insn_reg *r
 
if (regno2 >= FIRST_PSEUDO_REGISTER && reg_renumber[regno2] >= 0)
  regno2 = reg_renumber[regno2];
-   if (regno >= FIRST_PSEUDO_REGISTER)
+   if (regno2 >= FIRST_PSEUDO_REGISTER)
  nregs2 = 1;
else
  nregs2 = hard_regno_nregs[regno2][reg->biggest_mode];

Jakub
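
To make the hazard concrete, here is a small self-contained model of the
fixed logic.  The table sizes, contents and names are hypothetical and
only stand in for hard_regno_nregs and reg_renumber.  The point is that
the guard has to test the same register number that later indexes the
hard-register table.

```c
#include <assert.h>

#define FIRST_PSEUDO 4		/* hypothetical: hard regs are 0..3 */
#define MAX_REGNO 8		/* hypothetical: regnos are 0..7 */

/* Hypothetical size table, valid for hard registers only.  */
static const int hard_nregs[FIRST_PSEUDO] = { 1, 1, 2, 2 };

/* Hypothetical renumbering, indexed by regno; -1 means the pseudo
   got no hard register.  */
static const int renumber[MAX_REGNO] = { -1, -1, -1, -1, 2, -1, 0, -1 };

static int
nregs_for (int regno)
{
  assert (regno >= 0 && regno < MAX_REGNO);
  if (regno >= FIRST_PSEUDO && renumber[regno] >= 0)
    regno = renumber[regno];
  /* Guard on the value that is about to be used as the index.  Testing
     some other register number here, as the pasto did, would let a
     pseudo that stayed a pseudo index past the end of the table.  */
  if (regno >= FIRST_PSEUDO)
    return 1;
  return hard_nregs[regno];
}
```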

[PATCH] Fix gdbhooks.py

2017-09-01 Thread Jakub Jelinek

Hi!

I'm seeing on the trunk errors like:
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "../../gcc/gdbhooks.py", line 441
if name == 'E_VOIDmode':
   ^
TabError: inconsistent use of tabs and spaces in indentation
.gdbinit:14: Error in sourced command file:
Error while executing Python code.

(with gdb 7.12.1).

The following patch fixes that, bootstrapped/regtested on x86_64-linux and
i686-linux, ok for trunk?

2017-09-01  Jakub Jelinek  

* gdbhooks.py (OptMachineModePrinter.to_string): Use 8 spaces
instead of tab.

--- gcc/gdbhooks.py.jj  2017-09-01 09:26:27.0 +0200
+++ gcc/gdbhooks.py 2017-09-01 20:04:04.135133016 +0200
@@ -438,8 +438,8 @@ class OptMachineModePrinter:
 
 def to_string (self):
 name = str(self.gdbval['m_mode'])
-   if name == 'E_VOIDmode':
-   return ''
+if name == 'E_VOIDmode':
+return ''
 return name[2:] if name.startswith('E_') else name
 
 ##


Jakub

[PATCH]: PR target/80204 (Darwin macosx-version-min problem)

2017-09-01 Thread Simon Wright

In gcc/config/darwin-driver.c, darwin_find_version_from_kernel() assumes
that the minor version in the Darwin kernel version (16.7.0, => minor
version 7) is equal to the bugfix component of the macOS version, so that
the compiler receives -mmacosx-version-min=10.12.7 and the linker receives
-macosx_version_min 10.12.7.

Unfortunately, Apple don’t apply this algorithm; the macOS version is
actually 10.12.6.

Getting this wrong means that it’s impossible to run an executable from 
within a bundle: Sierra complains "You have macOS 10.12.6. The application
requires macOS 10.12.7 or later".

A workaround would perhaps be to link the executable with
-Wl,-macosx_version_min,`sw_vers -productVersion` (I assume that it’s only
the linker phase that matters?)

I see that Apple’s gcc (Apple LLVM version 8.0.0
(clang-800.0.42.1)) specifies - only at link time -
 -macosx_version_min 10.12.0

This patch does the same.

gcc/ChangeLog:

2017-09-01 Simon Wright 

PR target/80204
* config/darwin-driver.c (darwin_find_version_from_kernel): Eliminate
calculation of the minor version, always output as 0.
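
A minimal sketch of the mapping described above, for illustration only.
The struct and function names are invented, and the "Darwin major minus 4"
rule is an assumption that matches the 16.7.0 / 10.12.x example in this
message; it is not a copy of the darwin-driver.c code.

```c
#include <assert.h>

struct macos_version
{
  int major;
  int minor;
  int patch;
};

/* Derive the minimum macOS version from a Darwin kernel version.
   The kernel minor number is deliberately ignored and the last
   component is always 0, since it does not track the macOS bugfix
   number.  */
static struct macos_version
macos_min_version_for_kernel (int darwin_major, int darwin_minor)
{
  (void) darwin_minor;
  assert (darwin_major >= 4);
  struct macos_version v = { 10, darwin_major - 4, 0 };
  return v;
}
```

With kernel 16.7.0 this yields 10.12.0 rather than 10.12.7.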



80204-darwin-driver.c.diff
Description: Binary data

Re: backwards threader cleanups

2017-09-01 Thread Jeff Law

On 09/01/2017 02:18 PM, Aldy Hernandez wrote:
> Hi.
> 
> Attached are misc cleanups to tree-ssa-threadbackwards.c and friends.
> The main gist of the patch is making the path vectors live in the
> heap, not GC.  But I also cleaned up some comments to reflect reality,
> and renamed VAR_BB which could use a more meaningful name.  Finally, I
> abstracted some common code to
> register_jump_thread_path_if_profitable() in preparation for some
> upcoming work by me :).
> 
> Tested on x86-64 Linux.
It looks like you dropped a level of indirection for path in
profitable_jump_thread_path and perhaps others that push blocks onto the
path?   Does your change from having the vectors in the GC space to the
heap also change them from embeddable vectors to a space efficient
vector?  It has to for this change to be safe.

See the discussion in vec.h

I don't recall any inherent reason we use the embedded vector layout.
It's the default which is probably why it's used here more than anything.

Otherwise the change looks good.  I think you just need to make sure
you're not using the embedded layout.  Which I think is just a different
type when you declare the vec.



Jeff
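
For anyone who has not read vec.h recently, here is a simplified C model
of why the level of indirection matters.  It is only a sketch with
made-up names, not GCC's vec implementation: an "embedded" vector keeps
its header and elements in one allocation, so growing it can move the
whole object and the caller's pointer has to be updated; a pointer-style
handle hides that behind a stable object.

```c
#include <assert.h>
#include <stdlib.h>

/* Embedded layout: header and elements share one allocation.  */
struct evec
{
  size_t len;
  size_t cap;
  int data[];
};

/* Growing may move the object, so the push needs the address of the
   caller's pointer.  */
static void
evec_push (struct evec **vp, int x)
{
  struct evec *v = *vp;
  size_t len = v ? v->len : 0;
  size_t cap = v ? v->cap : 0;
  if (len == cap)
    {
      cap = cap ? cap * 2 : 4;
      v = realloc (v, sizeof *v + cap * sizeof v->data[0]);
      assert (v != NULL);
      v->len = len;
      v->cap = cap;
      *vp = v;
    }
  v->data[v->len++] = x;
}

/* Pointer layout: a small handle that owns the embedded vector.  The
   handle itself never moves, so it can be passed by plain reference.  */
struct pvec
{
  struct evec *impl;
};

static void
pvec_push (struct pvec *v, int x)
{
  evec_push (&v->impl, x);
}

static size_t
pvec_length (const struct pvec *v)
{
  return v->impl ? v->impl->len : 0;
}

static void
pvec_release (struct pvec *v)
{
  free (v->impl);
  v->impl = NULL;
}
```

Passing a struct evec * by value to something that pushes would leave the
caller holding a stale pointer after a reallocation, which is the kind of
problem the question above is about.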

Re: [PATCH][compare-elim] Merge zero-comparisons with normal ops

2017-09-01 Thread Segher Boessenkool

Hi!

On Tue, Aug 29, 2017 at 09:39:06AM +0100, Kyrill Tkachov wrote:
> On 28/08/17 19:26, Jeff Law wrote:
> >On 08/10/2017 03:14 PM, Michael Collison wrote:
> >>One issue that we keep encountering on aarch64 is GCC not making good use 
> >>of the flag-setting arithmetic instructions
> >>like ADDS, SUBS, ANDS etc. that perform an arithmetic operation and 
> >>compare the result against zero.
> >>They are represented in a fairly standard way in the backend as PARALLEL 
> >>patterns:
> >>(parallel [(set (reg x1) (op (reg x2) (reg x3)))
> >>(set (reg cc) (compare (op (reg x2) (reg x3)) (const_int 
> >>0)))])

That is incorrect: the compare has to come first.  From md.texi:

@cindex @code{compare}, canonicalization of
[ ... ]

@item
For instructions that inherently set a condition code register, the
@code{compare} operator is always written as the first RTL expression of
the @code{parallel} instruction pattern.  For example,
[ ... ]

aarch64.md seems to do this correctly, fwiw.

> >>GCC isn't forming these from separate arithmetic and comparison 
> >>instructions as aggressively as it could.
> >>A particular pain point is when the result of the arithmetic insn is used 
> >>before the comparison instruction.
> >>The testcase in this patch is one such example where we have:
> >>(insn 7 35 33 2 (set (reg/v:SI 0 x0 [orig:73  ] [73])
> >> (plus:SI (reg:SI 0 x0 [ x ])
> >> (reg:SI 1 x1 [ y ]))) "comb.c":3 95 {*addsi3_aarch64}
> >>  (nil))
> >>(insn 33 7 34 2 (set (reg:SI 1 x1 [77])
> >> (plus:SI (reg/v:SI 0 x0 [orig:73  ] [73])
> >> (const_int 2 [0x2]))) "comb.c":4 95 {*addsi3_aarch64}
> >>  (nil))
> >>(insn 34 33 17 2 (set (reg:CC 66 cc)
> >> (compare:CC (reg/v:SI 0 x0 [orig:73  ] [73])
> >> (const_int 0 [0]))) "comb.c":4 391 {cmpsi}
> >>  (nil))
> >>
> >>This scares combine away as x0 is used in insn 33 as well as the 
> >>comparison in insn 34.
> >>I think the compare-elim pass can help us here.
> >Is it the multiple use or the hard register that combine doesn't
> >appreciate.  The latter would definitely steer us towards compare-elim.
> 
> It's the multiple use IIRC.

Multiple use, and multiple set (of x1), and more complications...

7+33 won't combine to an existing insn.

7+34 will not even be tried (insn 33 is the first use of x0, not insn 34).
But it cannot work anyway, since x1 in insn 7 is clobbered in insn 33, so
7 cannot be merged into 34.

7+33+34 results in a parallel of a compare with the same invalid insn
as in the 7+33 case.  Combine would try to split it to two insns again,
except it already has two insns (the arith and the compare).  It does
not see that when it splits the insn it can combine the first half with
the compare.

What would be needed is pulling insn 34 before insn 33 (which is fine, no
conflicts there), and then we could combine 7+34 just fine.  But combine
tries to be linear complexity, and it really cannot change insns around
anyway.

Segher
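
For orientation, source of roughly the following shape leads to the
situation discussed above: the sum is used by a second addition before it
is compared against zero.  This is a hypothetical function written for
illustration, not the testcase from the patch, and whether a given GCC
version produces exactly the quoted RTL for it has not been checked.

```c
/* Hypothetical example; the insn numbers refer to the quoted dump.  */
int
add_then_test (int x, int y)
{
  x += y;		/* like insn 7: x0 = x0 + x1 */
  int t = x + 2;	/* like insn 33: first use of the sum */
  if (x == 0)		/* like insn 34: compare the sum with zero */
    return t;
  return x;
}
```

If the compare could be placed directly after the first addition, the two
would be candidates for a single flag-setting add.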

Re: backwards threader cleanups

2017-09-01 Thread Trevor Saunders

On Fri, Sep 01, 2017 at 04:18:20PM -0400, Aldy Hernandez wrote:
> Hi.
> 
> Attached are misc cleanups to tree-ssa-threadbackwards.c and friends.
> The main gist of the patch is making the path vectors live in the
> heap, not GC.  But I also cleaned up some comments to reflect reality,
> and renamed VAR_BB which could use a more meaningful name.  Finally, I
> abstracted some common code to
> register_jump_thread_path_if_profitable() in preparation for some
> upcoming work by me :).
> 
> Tested on x86-64 Linux.
> 
> OK?

 check_subpath_and_update_thread_path (basic_block last_bb, basic_block new_bb,
  hash_set *visited_bbs,
- vec *&path,
+ vec &path,
  int *next_path_length)
 {
   edge e;
   int e_count = 0;
   edge_iterator ei;
-  vec *next_path;
-  vec_alloc (next_path, 10);
+  vec next_path = vNULL;

I believe all paths out of this scope release next_path, so it can be
said the scope owns this vector and you could use auto_vec here.

@@ -776,9 +786,8 @@ find_jump_threads_backwards (basic_block bb, bool speed_p)
   if (!name || TREE_CODE (name) != SSA_NAME)
 return;
 
-  vec *bb_path;
-  vec_alloc (bb_path, 10);
-  vec_safe_push (bb_path, bb);
+  vec bb_path = vNULL;

same holds here I think.

+  bb_path.safe_push (bb);
   hash_set *visited_bbs = new hash_set;

   btw I don't see why this hash table can't be just a hash_table<> and
   live on the stack.

thanks

Trev

Re: [PATCH 00/22] RFC: integrated 3rd-party static analysis support

2017-09-01 Thread Trevor Saunders

On Fri, Sep 01, 2017 at 11:46:41AM -0600, Jeff Law wrote:
> On 08/04/2017 04:04 PM, David Malcolm wrote:
> > This patch kit clearly isn't ready yet as-is (see e.g. the
> > "known unknowns" below), but I'm posting it now in the hope of
> > getting early feedback.
> [ ... ]
> 
> > 
> > 
> > Statement of the problem
> > 
> > 
> > Static analysis is IMHO done too late, if at all: static analysis tools are 
> > run
> > as an optional extra, "on the side", rather than in developers' normal
> > workflow, with some kind of "override the compiler and do extra work" hook,
> > which may preclude running more than one analyzer at once.  Analysis results
> > are reviewed (if at all) in some kind of on-the-side tool, rather than when 
> > the
> > code is being edited, or patches being prepared.
> I'm sure you know my opinions on this stuff.  But for the benefit of the
> rest of our readers, I agree, 100% totally on all of this.

I more or less agree, though I think arrangements where it's run by build
bots and bustage is treated like breaking the build can be reasonable.

> For checkers to really be effective, they have to be part of the
> standard workflow that we use every day.  Anything else is ultimately a
> losing battle.  That's in large part why I continue to support improving
> GCC's ability to emit high quality useful warnings about likely
> programming errors.

I find myself also wondering how much can be done with generic checkers
as opposed to ones specific to a particular project and its style /
idioms.  If non-generic checkers are important then it should be
important to make it easy for people to write their own checkers.

> So this raises one very high level question.  By providing this
> capability do we undermine further development of GCC's own analysis
> capabilities or does it merely allow that development to move to its
> most natural place (gcc, llvm/clang, smatch, cppcheck, whatever)
> allowing each tool to focus on what it does best?

Given there is already a lot of work going on around llvm/clang checkers
it seems reasonable to expect there would be less incentive to improve
gcc in that area if you could just use the clang based checkers from
gcc.  Of course whether that is a good or bad thing is another question.

Most of the static checkers support being invoked on a file or as if
they were a compiler. So it should be easy to write a wrapper script
that runs the checker and then the compiler if you wish, or have a
build target that does all the static checking.  So it's not clear to me how
much users really get out of this work.  In part I wonder if this is
more a social problem to solve than a technical one.

> > Known unknowns
> > ==
> > 
> > How does one suppress a specific false-positive site?
> > Do we need a pragma for it?  (though pragmas ought to already affect some of
> > the underlying checkers...)
> I'm always conflicted on this kind of suppression/marking.  You can
> easily end up with a boatload of unmaintainable markers.  But without
> them you've got a firehose of useless information.  Sigh.

The other question would be if multiple checkers share a false positive,
how do you nicely disable all of them.  Having 3 different comments /
pragmas to disable the same thing sounds unpleasant.

thanks

Trev

Re: [PATCH][RFA/RFC] Stack clash mitigation patch 06/08 - V3

2017-09-01 Thread Jeff Law

On 08/29/2017 05:14 PM, Segher Boessenkool wrote:
> Hi Jeff,
> 
> Sorry for the delay in reviewing this.  It mostly looks good :-)
Thanks.  No worries about the delay.  Your input definitely helped move
the target independent stuff to a better place.  And frankly I
wanted/needed some time away from this stuff.

> 
> On Sun, Jul 30, 2017 at 11:45:16PM -0600, Jeff Law wrote:
>>
>> This contains the PPC bits for stack clash protection.
>>
>> Changes since V2:
>>
>> Exploits inlined/unrolled probes and rotated loops for the dynamic area.
>>  Some trivial simplifications.  It also uses the new params to control
>> if probes are needed and how often to probe.
> 
> 
>>  * config/rs6000/rs6000-protos.h (output_probe_stack_range): Update
>>  prototype for new argument.
>>  * config/rs6000/rs6000.c (wrap_frame_mem): New function extracted
>>  from rs6000_emit_allocate_stack.
>>  (handle_residual): New function. 
> 
> Trailing space.
Fixed, along with the other ChangeLog nits you pointed out.



> 
> 
>> +/* INSN allocates SIZE bytes on the stack (STACK_REG) using a store
>> +   with update style insn.
>> +
>> +   Set INSN's alias set/attributes and add suitable flags and notes
>> +   for the dwarf CFI machinery.  */
>> +static void
>> +wrap_frame_mem (rtx insn, rtx stack_reg, HOST_WIDE_INT size)
> 
> This isn't such a great name...  Something with "frame allocate"?  And
> the "wrap" part is meaningless...  "set_attrs_for_frame_allocation"?
> Or maybe "stack allocation" is better.
The name is terrible.  I shouldn't have let that get through.  I think
set_attrs_for_stack_allocation is better given the current behavior of
the function, but as you note some further refinement would probably
help here and would result in a different name.


> 
> Actually, everywhere it is used it has a Pmode == SImode wart before
> it, to emit the right update_stack insn...  So fold that into this
> function, name it rs6000_emit_allocate_stack_1?
Agreed.  That seems like a nice cleanup.  Look at the call from
rs6000_emit_allocate_stack.  Instead of Pmode == SImode, it tests
TARGET_32BIT.  But I think that one is still safe to convert
based on how we set Pmode in rs6000_option_override_internal.


> 
>> +/* Allocate ROUNDED_SIZE - ORIG_SIZE bytes on the stack, storing
>> +   ORIG_SP into *sp after the allocation.
> 
> This is a bit misleading: it has to store it at the _same time_ as doing
> any change to r1.  Or, more precisely, it has to ensure r1 points at a
> valid stack backchain at all times.  Since the red zone is pretty small
> (if it exists at all, it doesn't on all ABIs) you need atomic store-with-
> update in most cases.
Yea, I was imprecise in my language.
> 
>> +static rtx_insn *
>> +handle_residual (HOST_WIDE_INT orig_size,
>> + HOST_WIDE_INT rounded_size,
>> + rtx orig_sp)
> 
> Not a great name either.  Since this is only used once, and it is hard
> to make a good name for it, and it is only a handful of statements, just
> inline it?
It was originally used multiple times, but refactoring resulted in a
single use.  I'll just inline it -- it's trivial after we revamp
wrap_frame_mem.



> 
>> +/* Allocate ORIG_SIZE bytes on the stack and probe the newly
>> +   allocated space every STACK_CLASH_PROTECTION_PROBE_INTERVAL bytes.
>> +
>> +   COPY_REG, if non-null, should contain a copy of the original
>> +   stack pointer modified by COPY_OFF at exit from this function.
>> +
>> +   This is subtly different than the Ada probing in that it tries hard to
>> +   prevent attacks that jump the stack guard.  Thus it is never allowed to
>> +   allocate more than STACK_CLASH_PROTECTION_PROBE_INTERVAL bytes of stack
>> +   space without a suitable probe.  */
> 
> Please handle the COPY_OFF part in the (single) caller of this, that
> will simplify things a little I think?
Sure.  I thought I had it that way at one point.
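
As a rough model of the ROUNDED_SIZE / ORIG_SIZE split and the probing
rule in the quoted comments, here is some arithmetic in plain C.  The
names are illustrative, not taken from the patch, and it assumes the
probe interval is a power of two and that each interval-sized step is
allocated and probed together.

```c
#include <assert.h>

struct probe_plan
{
  long long rounded_size;	/* part allocated in interval-sized steps */
  long long residual;		/* what is left over, below one interval */
  long long full_probes;	/* number of interval-sized steps */
};

static struct probe_plan
plan_stack_probes (long long orig_size, long long probe_interval)
{
  assert (orig_size >= 0);
  assert (probe_interval > 0
	  && (probe_interval & (probe_interval - 1)) == 0);

  struct probe_plan p;
  /* Round down to a multiple of the interval.  */
  p.rounded_size = orig_size & -probe_interval;
  p.residual = orig_size - p.rounded_size;
  p.full_probes = p.rounded_size / probe_interval;
  return p;
}
```

For a 10000 byte frame and a 4096 byte interval this gives two full
steps and an 1808 byte residual, so no single step moves the stack
pointer by more than one interval.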


> 
> You don't describe what the return value here is (but neither does
> rs6000_emit_allocate_stack); it is the single instruction that actually
> changes the stack pointer?  For the split stack support?  (Does the stack
> clash code actually work with split stack, has that been tested?)
As far as I was able to ascertain (and essentially duplicate) we return
the insn that allocates the stack, but only if the allocation was
handled by a single insn.  Otherwise we return NULL.

But that was determined largely by guesswork.  It interacts with split
stack support, but I'm not entirely sure what it's meant to do.  If you have
insights here, I'll happily add comments to the routines -- when I was
poking at this stuff I was rather distressed to not have any real
guidance on what the return value should be.

I have not tested stack clash and split stacks. ISTM they ought to be
able to work together, but I haven't thought real deep about it.



> 
> Maybe we should keep track of that sp_adjust insn in a global, or in
> machine_function, or even in the stack info struct.
Maybe.  It's cert
