Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread David Malcolm
On Wed, 2016-02-10 at 15:15 +0100, Michael Matz wrote:
> Hi,
> 
> On Wed, 10 Feb 2016, Richard Biener wrote:
> 
> > > The problem is that TV_PHASE_DBGINFO is now nested within 
> > > TV_PHASE_OPT_GEN, which violates the above mutual exclusivity 
> > > requirement.  Therefore the attached patch simply gets rid of 
> > > TV_PHASE_DBGINFO (as well as of the sibling TV_PHASE_CHECK_DBGINFO
> > > which was already unused).
> > > 
> > > Tested on x86_64-suse-linux, OK for the mainline?
> > 
> > Ok.
> 
> I had this in my tree for a while, asserting that such nesting doesn't
> happen (it asserts that we're always in some phase, and that phases don't
> nest).  Might be a good addition for gcc 7.

> Ciao,
> Michael.
> 
> Index: timevar.c
> ===
> --- timevar.c (revision 232927)
> +++ timevar.c (working copy)
> @@ -325,6 +325,8 @@ timer::push (timevar_id_t timevar)
>push_internal (tv);
>  }
>  
> +static timevar_id_t global_phase;

FWIW I like the idea, but could this be a private field within class
timer, rather than a global?  I moved the global state relating to
timevars into a "class timer" in r223092 (the jit uses this, via a
followup: r226530: in theory different threads can have different timer
instances [1]).

I see that all of the accesses are within timer:: methods.  Maybe
"m_current_phase" or somesuch?
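
For illustration, here is a minimal sketch of what the suggestion could look like: the phase marker becomes per-instance state, so two timer objects cannot disturb each other's phase tracking. The enum is a hypothetical, heavily reduced stand-in for the real timevar list, the accessors in_phase and current_phase are invented for the example, and plain assert stands in for gcc_assert.

```cpp
#include <cassert>

// Hypothetical, reduced set of timing variables.  The real list is much
// longer; only the ordering SETUP <= phase <= FINALIZE matters here.
enum timevar_id_t {
  TV_NONE,
  TV_TOTAL,
  TV_PHASE_SETUP,
  TV_PHASE_PARSING,
  TV_PHASE_OPT_GEN,
  TV_PHASE_FINALIZE
};

class timer {
public:
  timer() : m_current_phase(TV_NONE) {}

  // Start a standalone timevar (TV_TOTAL or one of the phases).
  void start(timevar_id_t timevar) {
    // A phase may only begin while no other phase is active.
    assert(m_current_phase == TV_NONE || m_current_phase == TV_TOTAL);
    m_current_phase = timevar;
  }

  // Stop a standalone timevar; it must be the one currently active.
  void stop(timevar_id_t timevar) {
    assert(m_current_phase == timevar);
    m_current_phase = (timevar == TV_TOTAL) ? TV_NONE : TV_TOTAL;
  }

  // Mirrors the condition the patch asserts in push_internal.
  bool in_phase() const {
    return m_current_phase >= TV_PHASE_SETUP
           && m_current_phase <= TV_PHASE_FINALIZE;
  }

  timevar_id_t current_phase() const { return m_current_phase; }

private:
  // Per-instance state instead of a file-scope global.
  timevar_id_t m_current_phase;
};
```

With this layout, starting TV_PHASE_PARSING on one instance leaves a second instance at TV_NONE, which is the property a file-scope global cannot provide.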

[1] https://gcc.gnu.org/onlinedocs/jit/topics/performance.html#the-timing-api
and there's a TV_JIT_ACQUIRING_MUTEX which a thread uses for time spent
waiting to acquire the big jit mutex (which guards e.g. the g_timer
singleton pointer).  Within toplev and below, g_timer is the single
"live" instance of class timer, but there can be multiple threads each
with timer instances waiting to call into toplev via
gcc_jit_context_compile, if that makes sense.

>  /* Push TV onto the timing stack, either one of the builtin ones
> for a timevar_id_t, or one provided by client code to libgccjit.  */
>  
> @@ -350,6 +352,8 @@ timer::push_internal (struct timevar_def
>if (m_stack)
>  timevar_accumulate (&m_stack->timevar->elapsed, &m_start_time, &now);
>  
> +  gcc_assert (global_phase >= TV_PHASE_SETUP
> +   && global_phase <= TV_PHASE_FINALIZE);
>/* Reset the start time; from now on, time is attributed to
>   TIMEVAR.  */
>m_start_time = now;
> @@ -432,6 +436,9 @@ timer::start (timevar_id_t timevar)
>  {
>struct timevar_def *tv = &m_timevars[timevar];
>  
> +  gcc_assert (global_phase == TV_NONE || global_phase == TV_TOTAL);
> +  global_phase = timevar;
> +
>/* Mark this timing variable as used.  */
>tv->used = 1;
>  
> @@ -463,6 +470,12 @@ timer::stop (timevar_id_t timevar)
>struct timevar_def *tv = &m_timevars[timevar];
>struct timevar_time_def now;
>  
> +  gcc_assert (global_phase == timevar);
> +  if (timevar == TV_TOTAL)
> +global_phase = TV_NONE;
> +  else
> +global_phase = TV_TOTAL;
> +
>/* TIMEVAR must have been started via timevar_start.  */
>gcc_assert (tv->standalone);
>tv->standalone = 0; /* Enable a restart.  */


Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread Michael Matz
Hi,

On Wed, 10 Feb 2016, David Malcolm wrote:

> > +static timevar_id_t global_phase;
> 
> FWIW I like the idea, but could this be a private field within class
> timer, rather than a global?

Sure, consider the patch amended accordingly.


Ciao,
Michael.


RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
Please find attached the amended patch for FPU instructions.

Ok to apply?


0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch
Description: 0001-ARC-Add-single-double-IEEE-precission-FPU-support.patch


Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 02:35 PM, Richard Biener wrote:


Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,8 @@ noce_try_store_flag_constants (struct no
&& CONST_INT_P (XEXP (a, 1))
&& CONST_INT_P (XEXP (b, 1))
&& rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  && (REG_P (XEXP (a, 0))
+ || ! reg_mentioned_p (if_info->x, XEXP (a, 0)))


I guess that would also work. Could maybe use a brief comment.


Bernd


Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 02:50 PM, Richard Biener wrote:

On Wed, 10 Feb 2016, Bernd Schmidt wrote:


On 02/10/2016 02:35 PM, Richard Biener wrote:


Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,8 @@ noce_try_store_flag_constants (struct no
 && CONST_INT_P (XEXP (a, 1))
 && CONST_INT_P (XEXP (b, 1))
 && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  && (REG_P (XEXP (a, 0))
+ || ! reg_mentioned_p (if_info->x, XEXP (a, 0)))


I guess that would also work. Could maybe use a brief comment.


Ok.  I'm testing that.  I wonder if we need to use reg_overlap_mentioned_p
here (hard-reg pairs?) or if reg_mentioned_p is safe.


Let's go with reg_overlap_mentioned_p. I kind of forgot about that once 
I thought of possible issues with emitting a move :-(
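
To illustrate the hard-reg-pair concern, here is a toy model, not GCC's implementation: a value in a wide mode may occupy several consecutive hard registers, so comparing only the starting register number can miss a clobber. The HardReg type and both helper functions are hypothetical names made up for this sketch.

```cpp
#include <vector>

// Toy stand-in for a hard register reference: first register number and
// how many consecutive registers the value occupies.
struct HardReg {
  unsigned regno;
  unsigned nregs;
};

// Exact-match check: only sees a use that starts at the same register.
bool exact_reg_mentioned(const HardReg &x, const std::vector<HardReg> &expr) {
  for (const HardReg &r : expr)
    if (r.regno == x.regno)
      return true;
  return false;
}

// Overlap check: sees any use whose register range intersects X's range.
bool overlapping_reg_mentioned(const HardReg &x,
                               const std::vector<HardReg> &expr) {
  for (const HardReg &r : expr)
    if (x.regno < r.regno + r.nregs && r.regno < x.regno + x.nregs)
      return true;
  return false;
}
```

In this model, a destination in register 1 and an expression using a two-register value starting at register 0 are reported as unrelated by the exact check but as conflicting by the overlap check, which is the safer answer.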



Bernd


Re: [PATCH PR68021]Set ratio to 1 when computing the value of biv cand by itself

2016-02-10 Thread Bin.Cheng
On Wed, Feb 10, 2016 at 1:27 PM, Richard Biener
 wrote:
>
> On Wed, Feb 10, 2016 at 12:34 PM, Bin Cheng  wrote:
> > Hi,
> > This is another way to fix PR68021, and I think it's the least intrusive 
> > way.  The issue is triggered in a special case in which cand is an original 
> > biv, and use denotes the value of the biv itself.  In this case, the use is 
> > added specifically for the original biv, as a result, get_computation_aff 
> > isn't called for the  pair before rewriting the use.  It is 
> > possible that constant_multiple_of/operand_equal_p could fail because of 
> > inconsistent fold behavior.  The fold behavior is fully described in 
> > PR68021.
> > This patch fixes IVOPT part of issue by setting ratio to 1, because it is 
> > known that the use has the value of the biv cand.
> >
> > Bootstrap and test on x86_64 and aarch64.  Is it OK if no failures?
>
> Ok, but please add a comment why we have this special-casing instead
> of relying on constant_multiple_of.
Done, patch applied at revision 233269.

Thanks,
bin
>
>
> Thanks,
> Richard.
>
> > Thanks,
> > bin
> >
> > 2016-02-09  Bin Cheng  
> >
> > PR tree-optimization/68021
> > * tree-ssa-loop-ivopts.c (get_computation_aff): Set ratio to 1
> > when computing the value of biv cand by itself.
> >
> > gcc/testsuite/ChangeLog
> > 2016-02-09  Bin Cheng  
> >
> > PR tree-optimization/68021
> > * gcc.dg/tree-ssa/pr68021.c: New test.


Re: libgcc: On AIX, increase chances to find landing pads for exceptions

2016-02-10 Thread David Edelsohn
On Wed, Feb 10, 2016 at 1:52 AM, Michael Haubenwallner
 wrote:
>
> On 02/08/2016 02:59 PM, David Edelsohn wrote:
>> Runtime linking is disabled by default on AIX, and I disabled it for 
>> libstdc++.
>
> For large applications mainly developed on/for Linux I do prefer/need
> runtime linking even on AIX. Still I do believe there is no AIX-based
> reason to leave runtime linking disabled, but build-/linktime issues
> instead that cause things to fail with runtime linking enabled.

What do you mean by the term "runtime linking"?  Runtime linking means
runtime overloading of symbols -- preloading -- not dynamic linking
and loading.  dlopen does not require runtime linking.  There also are
issues with searching for shared objects with .a or .so file
extension, but that can be addressed separately.

Runtime linking causes every global, exported function call to be
invoked through indirect glue code.  And each function must be
inserted into the TOC.  The indirect call overhead is very expensive,
and potential TOC overflow can cause even more performance
degradation.

Your statement of no AIX-based reason to leave runtime linking
disabled is fundamentally flawed.

>
>> There are two remaining issues:
>>
>> 1) FDEs with overlapping ranges causing problems with exceptions.  I'm
>> not sure of the best way to work around this.  Your patch is one
>> possible solution.
>
> This patch is not meant as a final solution, but to improve current
> situation with broken build systems exporting even _GLOBAL__ symbols.
> I'm about to prepare another libtool patch to fix that one.
>
>> 2) AIX linker garbage collection conflicting with scanning for
>> symbols.  collect2 scanning needs to better emulate SVR4 linker
>> semantics for object files and archives.
>
> Probably collect2 should filter the symbol list originating in either
> an explicit -bexport:file or the -bexpall/-bexpfull flags and pass the
> resulting symbol list as explicit -bexport:file only to the AIX linker?

-bexpall and -bexpfull cause numerous problems by re-exporting symbols.

All of the suggestions will produce programs that function, but have
severe performance impacts and unintended consequences that you seem
to be ignoring.

- David

>
> /haubi/
>
>>
>> Thanks, David
>>
>>
>> On Mon, Feb 8, 2016 at 7:14 AM, Michael Haubenwallner
>>  wrote:
>>> Hi David,
>>>
>>> still experiencing exception-not-caught problems with gcc-4.2.4 on AIX
>>> leads me to some patch proposed in http://gcc.gnu.org/PR13878 back in
>>> 2004 already, ought to be fixed by some different commit since 3.4.0.
>>>
>>> As long as build systems (even libtool right now) on AIX do export these
>>> _GLOBAL__* symbols from shared libraries, overlapping frame-base address
>>> ranges may become registered, even if newer gcc (seen with 4.8) does name
>>> the FDE symbols more complex to reduce these chances.
>>>
>>> But still, just think of linking some static library into multiple shared
>>> libraries and/or the main executable. Or sometimes there is just need for
>>> some hackery to override a shared object's implementation detail and rely
>>> on runtime linking to do the override at runtime.
>>>
>>> Agreed both is "wrong" to some degree, but the larger an application is,
>>> the higher is the chance for this to happen.
>>>
>>> Thoughts?
>>>
>>> Thanks!
>>> /haubi/


Re: [PATCH PR68021]Set ratio to 1 when computing the value of biv cand by itself

2016-02-10 Thread Richard Biener
On Wed, Feb 10, 2016 at 12:34 PM, Bin Cheng  wrote:
> Hi,
> This is another way to fix PR68021, and I think it's the least intrusive way. 
>  The issue is triggered in a special case in which cand is an original biv, 
> and use denotes the value of the biv itself.  In this case, the use is added 
> specifically for the original biv, as a result, get_computation_aff isn't 
> called for the  pair before rewriting the use.  It is possible 
> that constant_multiple_of/operand_equal_p could fail because of inconsistent 
> fold behavior.  The fold behavior is fully described in PR68021.
> This patch fixes IVOPT part of issue by setting ratio to 1, because it is 
> known that the use has the value of the biv cand.
>
> Bootstrap and test on x86_64 and aarch64.  Is it OK if no failures?

Ok, but please add a comment why we have this special-casing instead
of relying on constant_multiple_of.

Thanks,
Richard.

> Thanks,
> bin
>
> 2016-02-09  Bin Cheng  
>
> PR tree-optimization/68021
> * tree-ssa-loop-ivopts.c (get_computation_aff): Set ratio to 1
> when computing the value of biv cand by itself.
>
> gcc/testsuite/ChangeLog
> 2016-02-09  Bin Cheng  
>
> PR tree-optimization/68021
> * gcc.dg/tree-ssa/pr68021.c: New test.


libgcc patch committed: Align stack in __stack_split_initialize

2016-02-10 Thread Ian Lance Taylor
PR 68562 points out that the x86 stack is misaligned when
__stack_split_initialize calls __generic_morestack_set_initial_sp,
causing crashes with the trunk glibc.  This patch fixes the problem by
aligning the stack.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline and gcc 5 branch.
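
The arithmetic behind the fix can be checked with a small sketch. It assumes the usual convention that the stack pointer is 16-byte aligned immediately before a call instruction, so that on entry to __stack_split_initialize it is 12 mod 16 on i386 (4-byte return address) and 8 mod 16 on x86_64 (8-byte return address). The helper name sp_mod16_at_call is made up for this example.

```cpp
// Stack pointer modulo 16 at the inner call, given its value modulo 16 on
// function entry and the number of bytes the function has moved the stack
// pointer down (pushes and explicit subtractions) before that call.
constexpr unsigned sp_mod16_at_call(unsigned entry_mod16,
                                    unsigned bytes_below_entry) {
  return (entry_mod16 + 16u - (bytes_below_entry % 16u)) % 16u;
}

// i386, before the patch: two pushl instructions, 8 bytes in total.
static_assert(sp_mod16_at_call(12, 8) == 4, "misaligned by 4");
// i386, after the patch: subl $4 plus the two pushl, 12 bytes in total.
static_assert(sp_mod16_at_call(12, 12) == 0, "aligned");
// x86_64, before the patch: arguments go in registers, nothing is pushed.
static_assert(sp_mod16_at_call(8, 0) == 8, "misaligned by 8");
// x86_64, after the patch: subq $8.
static_assert(sp_mod16_at_call(8, 8) == 0, "aligned");
```

This also shows why the i386 cleanup changes from 8 to 12 bytes and why the x86_64 path gains a matching addq $8 before ret.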

Ian

2016-02-10  Ian Lance Taylor  

PR go/68562
* config/i386/morestack.S (__stack_split_initialize): Align
stack.
Index: config/i386/morestack.S
===
--- config/i386/morestack.S (revision 233268)
+++ config/i386/morestack.S (working copy)
@@ -732,6 +732,7 @@ __stack_split_initialize:
 
 	leal	-16000(%esp),%eax	# We should have at least 16K.
 	movl	%eax,%gs:0x30
+	subl	$4,%esp			# Align stack.
 	pushl	$16000
 	pushl	%esp
 #ifdef __PIC__
@@ -739,13 +740,14 @@ __stack_split_initialize:
 #else
 	call	__generic_morestack_set_initial_sp
 #endif
-	addl	$8,%esp
+	addl	$12,%esp
 	ret
 
 #else /* defined(__x86_64__) */
 
 	leaq	-16000(%rsp),%rax	# We should have at least 16K.
 	X86_64_SAVE_NEW_STACK_BOUNDARY (ax)
+	subq	$8,%rsp			# Align stack.
 	movq	%rsp,%rdi
 	movq	$16000,%rsi
 #ifdef __PIC__
@@ -753,6 +755,7 @@ __stack_split_initialize:
 #else
 	call	__generic_morestack_set_initial_sp
 #endif
+	addq	$8,%rsp
 	ret
 
 #endif /* defined(__x86_64__) */


Small C++ PATCH for debug_tree(cp_expr)

2016-02-10 Thread Jason Merrill
David's cp_expr change made use of the 'pt' gdb macro less convenient in 
some cases, so I'm adding a debug_tree overload to handle it.
commit 48d6e68c6b43b11d4f0bd696f19b490358954937
Author: Jason Merrill 
Date:   Mon Feb 8 23:46:14 2016 -0500

	* ptree.c (debug_tree): Implement for cp_expr.

diff --git a/gcc/cp/ptree.c b/gcc/cp/ptree.c
index 2c8ff99..8266e83 100644
--- a/gcc/cp/ptree.c
+++ b/gcc/cp/ptree.c
@@ -287,3 +287,11 @@ cxx_print_xnode (FILE *file, tree node, int indent)
   break;
 }
 }
+
+/* Print the node NODE on standard error, for debugging.  */
+
+DEBUG_FUNCTION void
+debug_tree (cp_expr node)
+{
+  debug_tree (node.get_value());
+}
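
The pattern in this patch, an overload that unwraps the wrapper and forwards to the existing function, can be sketched in isolation. Everything below is a simplified stand-in: tree_node, the reduced cp_expr class and the last_debugged_code variable are hypothetical and only mimic the shape of the real types.

```cpp
#include <cstdio>

// Simplified stand-ins for GCC's tree and cp_expr types.
struct tree_node { int code; };
typedef tree_node *tree;

class cp_expr {
public:
  cp_expr(tree value) : m_value(value) {}
  tree get_value() const { return m_value; }
private:
  tree m_value;
};

// Records what was last printed so the forwarding can be observed.
int last_debugged_code = -1;

// Existing entry point taking a plain tree.
void debug_tree(tree node) {
  last_debugged_code = node ? node->code : -1;
  std::fprintf(stderr, "<tree code %d>\n", last_debugged_code);
}

// New overload: unwrap the cp_expr and forward to the tree version.
void debug_tree(cp_expr node) {
  debug_tree(node.get_value());
}
```

A debugger that resolves calls by exact argument type can then accept either kind of value without the user having to unwrap it by hand.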


Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Richard Biener
On Wed, 10 Feb 2016, Bernd Schmidt wrote:

> On 02/10/2016 02:04 PM, Richard Biener wrote:
> > where noce_try_store_flag_constants identifies
> > 
> > (plus:SI (reg/v:SI 160 [ mod_tlen ])
> >   (reg/v:SI 224 [  ]))
> > 
> > as "common" and then tries to detect the case where setting the
> > result would clobber that value.  It doesn't seem to expect
> > anything else than regs that can be equal to the destination though
> > which is clearly an oversight.
> 
> > /* If we have x := test ? x + 3 : x + 4 then move the original
> >  x out of the way while we store flags.  */
> > -  if (common && rtx_equal_p (common, if_info->x))
> > +  if (common && reg_mentioned_p (if_info->x, common))
> > {
> > - common = gen_reg_rtx (mode);
> > - noce_emit_move_insn (common, if_info->x);
> > + rtx tem = gen_reg_rtx (mode);
> > + noce_emit_move_insn (tem, common);
> > + common = tem;
> > }
> 
> I'm not so sure noce_emit_move_insn will reliably handle an arbitrary
> expression. I think a more conservative fix would be to disable this transform
> if common is not a reg.

I also wondered about this but then noce_emit_move_insn is quite elaborate
(calling into expanders eventually).

But if you prefer I can instead test the following

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,7 @@ noce_try_store_flag_constants (struct no
   && CONST_INT_P (XEXP (a, 1))
   && CONST_INT_P (XEXP (b, 1))
   && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  && REG_P (XEXP (a, 0))
   && if_info->branch_cost >= 2)
 {
   common = XEXP (a, 0);

Thanks,
Richard.


Re: Patch ping

2016-02-10 Thread Richard Biener
On Wed, 10 Feb 2016, Jakub Jelinek wrote:

> Hi!
> 
> I'd like to ping a P1 patch:
> PR ipa/69241, PR c++/69649
>   https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00192.html

Ok.

Thanks,
Richard.

>   Jakub
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: C++ PATCH for c++/69657 (abs not inlined)

2016-02-10 Thread Jason Merrill

On 02/09/2016 03:29 PM, Rainer Orth wrote:

This patch broke Solaris bootstrap (seen on i386-pc-solaris2.12):


Fixed by pruning hidden names from the lookup result in more places.

Jason


commit 09bb9e19d3f284c2f9d8bf57959c99334363c3c9
Author: Jason Merrill 
Date:   Tue Feb 9 16:20:46 2016 -0500

	PR c++/69657

	* name-lookup.c (ambiguous_decl): Call remove_hidden_names.
	(lookup_name_real_1): Likewise.
	(remove_hidden_names): Handle non-functions too.

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 227d6f2..8d6e75a 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -4221,9 +4221,9 @@ ambiguous_decl (struct scope_binding *old, cxx_binding *new_binding, int flags)
   val = new_binding->value;
   if (val)
 {
-  if (hidden_name_p (val) && !(flags & LOOKUP_HIDDEN))
-	val = NULL_TREE;
-  else
+  if (!(flags & LOOKUP_HIDDEN))
+	val = remove_hidden_names (val);
+  if (val)
 	switch (TREE_CODE (val))
 	  {
 	  case TEMPLATE_DECL:
@@ -4353,7 +4353,7 @@ hidden_name_p (tree val)
   return false;
 }
 
-/* Remove any hidden friend functions from a possibly overloaded set
+/* Remove any hidden declarations from a possibly overloaded set
of functions.  */
 
 tree
@@ -4362,7 +4362,7 @@ remove_hidden_names (tree fns)
   if (!fns)
 return fns;
 
-  if (TREE_CODE (fns) == FUNCTION_DECL && hidden_name_p (fns))
+  if (DECL_P (fns) && hidden_name_p (fns))
 fns = NULL_TREE;
   else if (TREE_CODE (fns) == OVERLOAD)
 {
@@ -4931,6 +4931,10 @@ lookup_name_real_1 (tree name, int prefer_type, int nonclass, bool block_p,
   if (!val)
 val = unqualified_namespace_lookup (name, flags);
 
+  /* Anticipated built-ins and friends aren't found by normal lookup.  */
+  if (val && !(flags & LOOKUP_HIDDEN))
+val = remove_hidden_names (val);
+
   /* If we have a single function from a using decl, pull it out.  */
   if (val && TREE_CODE (val) == OVERLOAD && !really_overloaded_fn (val))
 val = OVL_FUNCTION (val);
diff --git a/gcc/testsuite/g++.dg/lookup/builtin7.C b/gcc/testsuite/g++.dg/lookup/builtin7.C
new file mode 100644
index 000..c612d8c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/builtin7.C
@@ -0,0 +1,14 @@
+// PR c++/69657
+
+typedef unsigned int size_t;
+namespace std {
+extern void *calloc(size_t, size_t);
+}
+using std::calloc;
+int
+main ()
+{
+  char *(*pfn) = (char *(*)) calloc ;
+  (bool)
+  (bool)&::calloc;
+}


Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 12:49 PM, Thomas Schwinge wrote:

Hi!

Ping.


I think this has to be considered after gcc-6. In general, what's the 
state of OpenACC these days?


I'm slightly confused by the interface between offloaded code and 
libgomp. It looks like you're collecting avoid-offloading flags 
per-function, but then when things get registered, it seems like a 
per-image flag. Is that right? It seems like too large a hammer.



+ bool avoid_offloading_p = true;
+ for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
+   {
+ if (dims[ix] > 1)
+   {
+ avoid_offloading_p = false;
+ break;
+   }
+   }


Avoid unnecessary braces.
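
As a sketch of the requested style, the loop from the quoted hunk can be written without the outer pair of braces, since the for body is a single if statement. The wrapper function compute_avoid_offloading_p and the value 3 for GOMP_DIM_MAX are assumptions made so the fragment stands alone.

```cpp
// Assumed value for this sketch (gang, worker, vector).
const unsigned GOMP_DIM_MAX = 3;

// Hypothetical wrapper around the loop from the patch: offloading is
// avoided only if no dimension is partitioned more than one way.
bool compute_avoid_offloading_p(const int *dims) {
  bool avoid_offloading_p = true;
  for (unsigned ix = 0; ix != GOMP_DIM_MAX; ix++)
    if (dims[ix] > 1)
      {
        avoid_offloading_p = false;
        break;
      }
  return avoid_offloading_p;
}
```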


+  executable directqives be used, or runtime library calls be


Typo.


Bernd


Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 02:33 PM, Richard Biener wrote:


But if you prefer I can instead test the following

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,7 @@ noce_try_store_flag_constants (struct no
&& CONST_INT_P (XEXP (a, 1))
&& CONST_INT_P (XEXP (b, 1))
&& rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  && REG_P (XEXP (a, 0))
&& if_info->branch_cost >= 2)


I think I prefer this, but actually maybe OBJECT_P is safe enough.


Bernd



Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Richard Biener
On Wed, 10 Feb 2016, Richard Biener wrote:

> On Wed, 10 Feb 2016, Bernd Schmidt wrote:
> 
> > On 02/10/2016 02:04 PM, Richard Biener wrote:
> > > where noce_try_store_flag_constants identifies
> > > 
> > > (plus:SI (reg/v:SI 160 [ mod_tlen ])
> > >   (reg/v:SI 224 [  ]))
> > > 
> > > as "common" and then tries to detect the case where setting the
> > > result would clobber that value.  It doesn't seem to expect
> > > anything else than regs that can be equal to the destination though
> > > which is clearly an oversight.
> > 
> > > /* If we have x := test ? x + 3 : x + 4 then move the original
> > >x out of the way while we store flags.  */
> > > -  if (common && rtx_equal_p (common, if_info->x))
> > > +  if (common && reg_mentioned_p (if_info->x, common))
> > >   {
> > > -   common = gen_reg_rtx (mode);
> > > -   noce_emit_move_insn (common, if_info->x);
> > > +   rtx tem = gen_reg_rtx (mode);
> > > +   noce_emit_move_insn (tem, common);
> > > +   common = tem;
> > >   }
> > 
> > I'm not so sure noce_emit_move_insn will reliably handle an arbitrary
> > expression. I think a more conservative fix would be to disable this 
> > transform
> > if common is not a reg.
> 
> I also wondered about this but then noce_emit_move_insn is quite elaborate
> (calling into expanders eventually).
> 
> But if you prefer I can instead test the following
> 
> Index: gcc/ifcvt.c
> ===
> --- gcc/ifcvt.c (revision 233262)
> +++ gcc/ifcvt.c (working copy)
> @@ -1274,7 +1274,7 @@ noce_try_store_flag_constants (struct no
>&& CONST_INT_P (XEXP (a, 1))
>&& CONST_INT_P (XEXP (b, 1))
>&& rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
> -  && noce_operand_ok (XEXP (a, 0))
> +  && REG_P (XEXP (a, 0))
>&& if_info->branch_cost >= 2)
>  {
>common = XEXP (a, 0);

Or less aggressive

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,8 @@ noce_try_store_flag_constants (struct no
   && CONST_INT_P (XEXP (a, 1))
   && CONST_INT_P (XEXP (b, 1))
   && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  && (REG_P (XEXP (a, 0))
+ || ! reg_mentioned_p (if_info->x, XEXP (a, 0)))
   && if_info->branch_cost >= 2)
 {
   common = XEXP (a, 0);



Patch ping

2016-02-10 Thread Jakub Jelinek
Hi!

I'd like to ping a P1 patch:
PR ipa/69241, PR c++/69649
  https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00192.html

Jakub


Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 03:39 PM, Thomas Schwinge wrote:


Yes, we need a hammer that big: we have to ensure consistency between
data regions on the device and code offloading to the device, as
otherwise we'll very easily run into inconsistencies, because of the
non-shared memory.  In the general case, it's "all or nothing": you
either have to offload all kernels or none of them.


That's unfortunately not the impression I got from the earlier 
discussion, and this seems to imply that one unprofitable kernel would 
disable all the others - IMO this is not acceptable. There need to be 
more compiler smarts to figure out whether a kernel is a valid candidate 
for skipping the offloading.



Bernd


Re: Use plain -fopenacc to enable OpenACC kernels processing

2016-02-10 Thread Thomas Schwinge
Hi!

Will this patch be acceptable for GCC trunk in the current development
stage?  In its current incarnation, this patch depends on my
'Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid
offloading"' patch,
,
which Bernd suggested "has to be considered after gcc-6".  So, I'll have
to re-work this patch here, hence I'm first checking if it generally
meets approval?

On Fri, 5 Feb 2016 13:06:17 +0100, I wrote:
> On Mon, 9 Nov 2015 18:39:19 +0100, Tom de Vries  
> wrote:
> > On 09/11/15 16:35, Tom de Vries wrote:
> > > this patch series for stage1 trunk adds support to:
> > > - parallelize oacc kernels regions using parloops, and
> > > - map the loops onto the oacc gang dimension.
> 
> > Atm, the parallelization behaviour for the kernels region is controlled 
> > by flag_tree_parallelize_loops, which is also used to control generic 
> > auto-parallelization by autopar using omp. That is not ideal, and we may 
> > want a separate flag (or param) to control the behaviour for oacc 
> > kernels, f.i. -foacc-kernels-gang-parallelize=. I'm open to suggestions.
> 
> I suggest to use plain -fopenacc to enable OpenACC kernels processing
> (which just makes sense, I hope) ;-) and have later processing stages
> determine the actual parametrization (currently: number of gangs) (that
> is, Nathan's recent "Default compute dimensions" patches).
> 
> The code changes are simple enough; OK for trunk?  (This patch depends on
> my 'Un-parallelized OpenACC kernels constructs with nvptx offloading:
> "avoid offloading"' pending review,
> .)
> 
> Originally, I want to use:
> 
> OMP_CLAUSE_NUM_GANGS_EXPR (clause) = build_int_cst (integer_type_node, 
> n_threads == 0 ? -1 : n_threads);
> 
> ... to store -1 "have the compiler decide" (instead of now 0 "have the
> run-time decide", which might prevent some code optimizations, as I
> understand it) for the n_threads == 0 case, but it seems that for an
> offloaded OpenACC kernels region, gcc/omp-low.c:oacc_validate_dims is
> called with the parameter "used" set to 0 instead of "gang", and then the
> "Default anything left to 1 or a partitioned default" logic will default
> dims["gang"] to oacc_min_dims["gang"] (that is, 1) instead of the
> oacc_default_dims["gang"] (that is, 32).  Nathan, does that smell like a
> bug (and could you look into that)?
> 
> diff --git gcc/tree-parloops.c gcc/tree-parloops.c
> index 139e38c..e498e5b 100644
> --- gcc/tree-parloops.c
> +++ gcc/tree-parloops.c
> @@ -2016,7 +2016,8 @@ transform_to_exit_first_loop (struct loop *loop,
>  /* Create the parallel constructs for LOOP as described in gen_parallel_loop.
> LOOP_FN and DATA are the arguments of GIMPLE_OMP_PARALLEL.
> NEW_DATA is the variable that should be initialized from the argument
> -   of LOOP_FN.  N_THREADS is the requested number of threads.  */
> +   of LOOP_FN.  N_THREADS is the requested number of threads, which can be 0 if
> +   that number is to be determined later.  */
>  
>  static void
>  create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
> @@ -2049,6 +2050,7 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>basic_block paral_bb = single_pred (bb);
>gsi = gsi_last_bb (paral_bb);
>  
> +  gcc_checking_assert (n_threads != 0);
>t = build_omp_clause (loc, OMP_CLAUSE_NUM_THREADS);
>OMP_CLAUSE_NUM_THREADS_EXPR (t)
>   = build_int_cst (integer_type_node, n_threads);
> @@ -2221,7 +2223,8 @@ create_parallel_loop (struct loop *loop, tree loop_fn, tree data,
>  }
>  
>  /* Generates code to execute the iterations of LOOP in N_THREADS
> -   threads in parallel.
> +   threads in parallel, which can be 0 if that number is to be determined
> +   later.
>  
> NITER describes number of iterations of LOOP.
> REDUCTION_LIST describes the reductions existent in the LOOP.  */
> @@ -2318,6 +2321,7 @@ gen_parallel_loop (struct loop *loop,
>else
>   m_p_thread=MIN_PER_THREAD;
>  
> +  gcc_checking_assert (n_threads != 0);
>many_iterations_cond =
>   fold_build2 (GE_EXPR, boolean_type_node,
>nit, build_int_cst (type, m_p_thread * n_threads));
> @@ -3177,7 +3181,7 @@ oacc_entry_exit_ok (struct loop *loop,
>  static bool
>  parallelize_loops (bool oacc_kernels_p)
>  {
> -  unsigned n_threads = flag_tree_parallelize_loops;
> +  unsigned n_threads;
>bool changed = false;
>struct loop *loop;
>struct loop *skip_loop = NULL;
> @@ -3199,6 +3203,13 @@ parallelize_loops (bool oacc_kernels_p)
>if (cfun->has_nonlocal_label)
>  return false;
>  
> +  /* For OpenACC kernels, n_threads will be determined later; otherwise, it's
> + the argument to -ftree-parallelize-loops.  */
> +  if (oacc_kernels_p)
> +

Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 02:04 PM, Richard Biener wrote:

where noce_try_store_flag_constants identifies

(plus:SI (reg/v:SI 160 [ mod_tlen ])
  (reg/v:SI 224 [  ]))

as "common" and then tries to detect the case where setting the
result would clobber that value.  It doesn't seem to expect
anything else than regs that can be equal to the destination though
which is clearly an oversight.



/* If we have x := test ? x + 3 : x + 4 then move the original
 x out of the way while we store flags.  */
-  if (common && rtx_equal_p (common, if_info->x))
+  if (common && reg_mentioned_p (if_info->x, common))
{
- common = gen_reg_rtx (mode);
- noce_emit_move_insn (common, if_info->x);
+ rtx tem = gen_reg_rtx (mode);
+ noce_emit_move_insn (tem, common);
+ common = tem;
}


I'm not so sure noce_emit_move_insn will reliably handle an arbitrary 
expression. I think a more conservative fix would be to disable this 
transform if common is not a reg.
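
The hazard under discussion can be modelled outside of RTL. The sketch below is a plain C++ analogy, not the ifcvt code: x plays the role of if_info->x, the sum x + y plays the role of a non-register "common" operand that mentions x, and the function names are made up.

```cpp
// What the source computes: x := test ? (x + y) + 3 : (x + y) + 4.
int reference(int x, int y, bool test) {
  return test ? (x + y) + 3 : (x + y) + 4;
}

// Branchless form that stores the flag into x before evaluating the
// common operand.  Because common mentions x, it sees the clobbered value.
int flag_first(int x, int y, bool test) {
  x = test ? 1 : 0;    // store flag into the destination
  int common = x + y;  // too late: x no longer holds the original value
  return common + 4 - x;
}

// Same sequence, but the common operand is copied to a temporary first,
// which is what moving "common" out of the way amounts to.
int temp_first(int x, int y, bool test) {
  int tem = x + y;     // evaluate common while x is still intact
  x = test ? 1 : 0;    // store flag into the destination
  return tem + 4 - x;
}
```

For x = 10, y = 5 and a true condition, reference and temp_first both give 18 while flag_first gives 9. Restricting the transform to the case where the common operand is a plain register sidesteps the question of whether an arbitrary expression can be copied reliably.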



Bernd


Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread Michael Matz
Hi,

On Wed, 10 Feb 2016, Richard Biener wrote:

> > The problem is that TV_PHASE_DBGINFO is now nested within 
> > TV_PHASE_OPT_GEN, which violates the above mutual exclusivity 
> > requirement.  Therefore the attached patch simply gets rid of 
> > TV_PHASE_DBGINFO (as well as of the sibling TV_PHASE_CHECK_DBGINFO 
> > which was already unused).
> >
> > Tested on x86_64-suse-linux, OK for the mainline?
> 
> Ok.

I had this in my tree for a while, asserting that such nesting doesn't 
happen (it asserts that we're always in some phase, and that phases don't 
nest).  Might be a good addition for gcc 7.


Ciao,
Michael.

Index: timevar.c
===
--- timevar.c   (revision 232927)
+++ timevar.c   (working copy)
@@ -325,6 +325,8 @@ timer::push (timevar_id_t timevar)
   push_internal (tv);
 }
 
+static timevar_id_t global_phase;
+
 /* Push TV onto the timing stack, either one of the builtin ones
for a timevar_id_t, or one provided by client code to libgccjit.  */
 
@@ -350,6 +352,8 @@ timer::push_internal (struct timevar_def
   if (m_stack)
     timevar_accumulate (&m_stack->timevar->elapsed, &m_start_time, &now);
 
+  gcc_assert (global_phase >= TV_PHASE_SETUP
+ && global_phase <= TV_PHASE_FINALIZE);
   /* Reset the start time; from now on, time is attributed to
  TIMEVAR.  */
   m_start_time = now;
@@ -432,6 +436,9 @@ timer::start (timevar_id_t timevar)
 {
   struct timevar_def *tv = &m_timevars[timevar];
 
+  gcc_assert (global_phase == TV_NONE || global_phase == TV_TOTAL);
+  global_phase = timevar;
+
   /* Mark this timing variable as used.  */
   tv->used = 1;
 
@@ -463,6 +470,12 @@ timer::stop (timevar_id_t timevar)
   struct timevar_def *tv = &m_timevars[timevar];
   struct timevar_time_def now;
 
+  gcc_assert (global_phase == timevar);
+  if (timevar == TV_TOTAL)
+global_phase = TV_NONE;
+  else
+global_phase = TV_TOTAL;
+
   /* TIMEVAR must have been started via timevar_start.  */
   gcc_assert (tv->standalone);
   tv->standalone = 0; /* Enable a restart.  */


C++ PATCH for c++/10200 (DR 141, template name lookup after ->)

2016-02-10 Thread Jason Merrill
After . or ->, when we see a name followed by < we look for a template 
name, first in the scope of the object expression and then in the 
enclosing scope.  DR 141 clarified that when we look in the enclosing 
scope, we only consider class templates, since there's no way a 
non-member function template could be correct in that situation.


When I fixed that, I found that we were failing to do the lookup in the 
object scope in the case where that scope is the current instantiation, 
so I needed to fix that as well.
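
A small example of the first rule, written for illustration and assuming a compiler that implements DR 141: the object expression is dependent, so the member name cannot be looked up in its class at parse time, and the only declaration visible in the enclosing scope is a non-member function template. The names end, Range and tnegative are chosen for this sketch.

```cpp
// A non-member function template that shares its name with the data
// member used below.
template <class Tp> inline void end(Tp) {}

struct Range { int end; };

template <typename T>
bool tnegative(const T &t) {
  // The lookup of "end" in the enclosing scope finds only a function
  // template, which is not considered after "." or "->".  The "<" is
  // therefore the less-than operator, not the start of a template
  // argument list.
  return t.end < 0;
}
```

Had the enclosing scope declared a class template named end instead, the name would have been treated as a template name and the "<" as the start of its argument list.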


Tested x86_64-pc-linux-gnu, applying to trunk.
commit b3a4e73d776023398007c3d41bf957cee1792683
Author: Jason Merrill 
Date:   Mon Feb 8 23:46:24 2016 -0500

	PR c++/10200

	* parser.c (cp_parser_lookup_name): When looking for a template
	after . or ->, only consider class templates.
	(cp_parser_postfix_dot_deref_expression): Handle the current
	instantiation.  Remember a dependent object expression.
	* typeck2.c (build_x_arrow): Handle the current instantiation.

diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 6f47edf..07d1821 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -7184,8 +7184,16 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
   if (token_type == CPP_DEREF)
 postfix_expression = build_x_arrow (location, postfix_expression,
 	tf_warning_or_error);
-  /* Check to see whether or not the expression is type-dependent.  */
-  dependent_p = type_dependent_expression_p (postfix_expression);
+  /* According to the standard, no expression should ever have
+ reference type.  Unfortunately, we do not currently match
+ the standard in this respect in that our internal representation
+ of an expression may have reference type even when the standard
+ says it does not.  Therefore, we have to manually obtain the
+ underlying type here.  */
+  scope = non_reference (TREE_TYPE (postfix_expression));
+  /* Check to see whether or not the expression is type-dependent and
+ not the current instantiation.  */
+  dependent_p = !scope || dependent_scope_p (scope);
   /* The identifier following the `->' or `.' is not qualified.  */
   parser->scope = NULL_TREE;
   parser->qualifying_scope = NULL_TREE;
@@ -7194,16 +7202,8 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
 
   /* Enter the scope corresponding to the type of the object
  given by the POSTFIX_EXPRESSION.  */
-  if (!dependent_p && TREE_TYPE (postfix_expression) != NULL_TREE)
+  if (!dependent_p)
 {
-  scope = TREE_TYPE (postfix_expression);
-  /* According to the standard, no expression should ever have
-	 reference type.  Unfortunately, we do not currently match
-	 the standard in this respect in that our internal representation
-	 of an expression may have reference type even when the standard
-	 says it does not.  Therefore, we have to manually obtain the
-	 underlying type here.  */
-  scope = non_reference (scope);
   /* The type of the POSTFIX_EXPRESSION must be complete.  */
   if (scope == unknown_type_node)
 	{
@@ -7215,7 +7215,10 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
 	 required to be of complete type for purposes of class member
 	 access (5.2.5) outside the member function body.  */
   else if (postfix_expression != current_class_ref
-	   && !(processing_template_decl && scope == current_class_type))
+	   && !(processing_template_decl
+		&& current_class_type
+		&& (same_type_ignoring_top_level_qualifiers_p
+			(scope, current_class_type))))
 	scope = complete_type_or_else (scope, NULL_TREE);
   /* Let the name lookup machinery know that we are processing a
 	 class member access expression.  */
@@ -7231,6 +7234,10 @@ cp_parser_postfix_dot_deref_expression (cp_parser *parser,
   if (scope == error_mark_node)
 	postfix_expression = error_mark_node;
 }
+  else
+/* Tell cp_parser_lookup_name that there was an object, even though it's
+   type-dependent.  */
+parser->context->object_type = unknown_type_node;
 
   /* Assume this expression is not a pseudo-destructor access.  */
   pseudo_destructor_p = false;
@@ -24720,10 +24727,15 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
 	decl = NULL_TREE;
 
   if (!decl)
-	/* Look it up in the enclosing context.  */
-	decl = lookup_name_real (name, tag_type != none_type,
+	/* Look it up in the enclosing context.  DR 141: When looking for a
+	   template-name after -> or ., only consider class templates.  */
+	decl = lookup_name_real (name, tag_type != none_type || is_template,
  /*nonclass=*/0,
  /*block_p=*/true, is_namespace, 0);
+  if (object_type == unknown_type_node)
+	/* The object is type-dependent, so we can't look anything up; we used
+	   this to get the DR 141 behavior.  */
+	object_type = NULL_TREE;
   parser->object_scope = object_type;
   parser->qualifying_scope = NULL_TREE;
 }
diff --git a/gcc/cp/typeck2.c b/gcc/cp/typeck2.c
index 

Re: [PATCH PR69652, Regression]

2016-02-10 Thread Richard Biener
On Wed, Feb 10, 2016 at 11:26 AM, Yuri Rumyantsev  wrote:
> Thanks Richard for your comments.
> I changed the algorithm to remove dead scalar statements as you proposed.
>
> Bootstrap and regression testing did not show any new failures on x86-64.
> Is it OK for trunk?

Ok.

Thanks,
Richard.

> Changelog:
> 2016-02-10  Yuri Rumyantsev  
>
> PR tree-optimization/69652
> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
> to nested loop, did source re-formatting, skip debug statements,
> add check on statement with volatile operand, remove dead scalar
> statements.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/torture/pr69652.c: New test.
>
>
> 2016-02-09 15:33 GMT+03:00 Richard Biener :
>> On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev  wrote:
>>> Hi All,
>>>
>>> Here is updated patch - I came back to move call statements also since
>>>  masked loads are presented by internal call. I also assume that for
>>> the following simple loop
>>>   for (i = 0; i < n; i++)
>>>     if (b1[i])
>>>       a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]);
>>> motion must be done for all vector statements in semi-hammock including 
>>> SQRT.
>>>
>>> Bootstrap and regression testing did not show any new failures.
>>> Is it OK for trunk?
>>
>> The patch is incredibly hard to parse due to the re-indenting.  Please
>> consider sending
>> diffs with -b.
>>
>> This issue exposes that you are moving (masked) stores across loads without
>> checking aliasing.  In the specific case those loads are dead and thus
>> this is safe
>> but in general I thought we were checking that we are using the same VUSE
>> during the sinking operation.
>>
>> Thus, I'd rather have
>>
>> + /* Check that LHS does not have uses outside of STORE_BB.  */
>> + res = true;
>> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
>> +   {
>> +     gimple *use_stmt;
>> +     use_stmt = USE_STMT (use_p);
>> +     if (is_gimple_debug (use_stmt))
>> +       continue;
>> +     if (gimple_bb (use_stmt) != store_bb)
>> +       {
>> +         res = false;
>> +         break;
>> +       }
>> +   }
>>
>> also check for the dead code case and DCE those stmts here.  Like so:
>>
>>   if (has_zero_uses (lhs))
>>     {
>>       gsi_remove (&gsi_from, true);
>>       continue;
>>     }
>>
>> before the above loop.
>>
>> Richard.
>>
>>> ChangeLog:
>>>
>>> 2016-02-05  Yuri Rumyantsev  
>>>
>>> PR tree-optimization/69652
>>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
>>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
>>> skipped scalar statements, introduce variable LAST_VUSE to keep
>>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
>>> beginning of current masked store processing, did source re-formatting,
>>> skip parsing of debug gimples, stop processing if a gimple with
>>> volatile operand has been encountered, save scalar statement
>>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
>>> iterator, change vuse of all saved scalar statements to LAST_VUSE if
>>> it makes sense.
>>>
>>> gcc/testsuite/ChangeLog:
>>> * gcc.dg/torture/pr69652.c: New test.
>>>
>>> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek :
>>>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote:
> Here is a patch that cures the issues with incorrect vuse for scalar
> statements during code motion, i.e. if vuse of scalar statement is
> vdef of masked store which has been sunk to new basic block, we must
> fix it up.  The patch also fixed almost all remarks pointed out by
> Jakub.
>
> Bootstrapping and regression testing on x86-64 did not show any new 
> failures.
> Is it OK for trunk?
>
> ChangeLog:
> 2016-02-04  Yuri Rumyantsev  
>
> PR tree-optimization/69652
> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
> skipped scalar statements, introduce variable LAST_VUSE that has
> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
> beginning of current masked store processing, did source re-formatting,
> skip parsing of debug gimples, stop processing when call or gimple
> with volatile operand have been encountered, save scalar statement
> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
> iterator, change vuse of all saved scalar statements to LAST_VUSE if
> it makes sense.
>
> gcc/testsuite/ChangeLog:
> * gcc.dg/torture/pr69652.c: New test.

>>>> Your mailer breaks ChangeLog formatting, so it is hard to check the
>>>> formatting of the ChangeLog entry.

 diff 

Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Richard Biener
On Wed, 10 Feb 2016, Bernd Schmidt wrote:

> On 02/10/2016 02:50 PM, Richard Biener wrote:
> > On Wed, 10 Feb 2016, Bernd Schmidt wrote:
> > 
> > > On 02/10/2016 02:35 PM, Richard Biener wrote:
> > > 
> > > > Index: gcc/ifcvt.c
> > > > ===
> > > > --- gcc/ifcvt.c (revision 233262)
> > > > +++ gcc/ifcvt.c (working copy)
> > > > @@ -1274,7 +1274,8 @@ noce_try_store_flag_constants (struct no
> > > >  && CONST_INT_P (XEXP (a, 1))
> > > >  && CONST_INT_P (XEXP (b, 1))
> > > >  && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
> > > > -  && noce_operand_ok (XEXP (a, 0))
> > > > +  && (REG_P (XEXP (a, 0))
> > > > + || ! reg_mentioned_p (if_info->x, XEXP (a, 0)))
> > > 
> > > I guess that would also work. Could maybe use a brief comment.
> > 
> > Ok.  I'm testing that.  I wonder if we need to use reg_overlap_mentioned_p
> > here (hard-reg pairs?) or if reg_mentioned_p is safe.
> 
> Let's go with reg_overlap_mentioned_p. I kind of forgot about that once I
> thought of possible issues with emitting a move :-(

Ok, the following is in testing now.

Ok?

Thanks,
Richard.

2016-02-10  Richard Biener  

PR rtl-optimization/69291
* ifcvt.c (noce_try_store_flag_constants): Do not allow
subexpressions affected by changing the result.

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1274,7 +1274,10 @@ noce_try_store_flag_constants (struct no
   && CONST_INT_P (XEXP (a, 1))
   && CONST_INT_P (XEXP (b, 1))
   && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
-  && noce_operand_ok (XEXP (a, 0))
+  /* Allow expressions that are not using the result or plain
+ registers where we handle overlap below.  */
+  && (REG_P (XEXP (a, 0))
+ || ! reg_overlap_mentioned_p (if_info->x, XEXP (a, 0)))
   && if_info->branch_cost >= 2)
 {
   common = XEXP (a, 0);
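
The hazard the new condition guards against can be shown with a toy model in plain C++. This is a sketch under stated assumptions, not RTL and not the ifcvt.c code: the expression x * 2 is an arbitrary stand-in for a non-register operand that mentions the result, and the function names are hypothetical.

```cpp
#include <cassert>

// What the branchy source computes:
//   x = c ? (x * 2) + 3 : (x * 2) + 7;
inline int branchy (int x, bool c)
{
  if (c)
    x = (x * 2) + 3;
  else
    x = (x * 2) + 7;
  return x;
}

// Unsafe branch-free form: the selected constant is stored into x
// first, so the common subexpression (x * 2) then reads the clobbered
// value instead of the original one.
inline int clobbering (int x, bool c)
{
  x = c ? 3 : 7;
  x = (x * 2) + x;
  return x;
}

// Safe form: evaluate the common subexpression before x is overwritten.
inline int with_temporary (int x, bool c)
{
  const int common = x * 2;
  x = c ? 3 : 7;
  return common + x;
}
```

With x = 5 and c true, branchy and with_temporary both give 13 while clobbering gives 9, which is the kind of wrong result that refusing operands mentioning the result avoids.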


[PATCH, testsuite]: Add -mcpu=ev4 for alpha for gcc.dg/tree-ssa/sra-{18,19}.c

2016-02-10 Thread Uros Bizjak
Hello!

Due to various move cost functions, the expected transformation is
triggered only for ev4.

2016-02-10  Uros Bizjak  

* gcc.dg/tree-ssa/sra-17.c: Add -mcpu=ev4 for target alpha*-*-*.
* gcc.dg/tree-ssa/sra-18.c: Ditto.

Tested on alphaev6-linux-gnu and committed to mainline SVN.

Uros.
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c
index 3aba470..a66344b 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-17.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { aarch64*-*-* alpha*-*-* arm*-*-* hppa*-*-* powerpc*-*-* s390*-*-* } } } */
 /* { dg-options "-O2 -fdump-tree-esra --param sra-max-scalarization-size-Ospeed=32" } */
+/* { dg-additional-options "-mcpu=ev4" { target alpha*-*-* } } */
 
 extern void abort (void);
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c b/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c
index 50835c0..47fa204 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/sra-18.c
@@ -1,5 +1,6 @@
 /* { dg-do run { target { aarch64*-*-* alpha*-*-* arm*-*-* hppa*-*-* powerpc*-*-* s390*-*-* } } } */
 /* { dg-options "-O2 -fdump-tree-esra --param sra-max-scalarization-size-Ospeed=32" } */
+/* { dg-additional-options "-mcpu=ev4" { target alpha*-*-* } } */
 
 extern void abort (void);
 struct foo { long x; };


Re: [PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Richard Biener
On Wed, 10 Feb 2016, Bernd Schmidt wrote:

> On 02/10/2016 02:35 PM, Richard Biener wrote:
> 
> > Index: gcc/ifcvt.c
> > ===
> > --- gcc/ifcvt.c (revision 233262)
> > +++ gcc/ifcvt.c (working copy)
> > @@ -1274,7 +1274,8 @@ noce_try_store_flag_constants (struct no
> > && CONST_INT_P (XEXP (a, 1))
> > && CONST_INT_P (XEXP (b, 1))
> > && rtx_equal_p (XEXP (a, 0), XEXP (b, 0))
> > -  && noce_operand_ok (XEXP (a, 0))
> > +  && (REG_P (XEXP (a, 0))
> > + || ! reg_mentioned_p (if_info->x, XEXP (a, 0)))
> 
> I guess that would also work. Could maybe use a brief comment.

Ok.  I'm testing that.  I wonder if we need to use reg_overlap_mentioned_p
here (hard-reg pairs?) or if reg_mentioned_p is safe.

I'm not too much into RTL ...

Thanks,
Richard.


Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Thomas Schwinge
Hi!

On Wed, 10 Feb 2016 14:25:50 +0100, Bernd Schmidt  wrote:
> On 02/10/2016 12:49 PM, Thomas Schwinge wrote:
> > [...]
> 
> I think this has to be considered after gcc-6.

Hmm, I see.


> In general, what's the 
> state of OpenACC these days?

Much improved compared to GCC 5.  :-) Anything specific you'd like me to
elaborate on?   should be fairly
accurate.


> I'm slightly confused by the interface between offloaded code and 
> libgomp. It looks like you're collecting avoid-offloading flags 
> per-function, but then when things get registered, it seems like a 
> per-image flag.

(Per-image flag that affects all offloading for a given offloading type,
even.)

> Is that right? It seems like too large a hammer.

Yes, we need a hammer that big: we have to ensure consistency between
data regions on the device and code offloading to the device, as
otherwise we'll very easily run into inconsistencies, because of the
non-shared memory.  In the general case, it's "all or nothing": you
either have to offload all kernels or none of them.
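
A toy model of why mixing host and device execution breaks with non-shared memory is sketched below. It is purely illustrative and makes no claim about how libgomp tracks data: data_region, where, scale and sum are hypothetical names, and the two vectors simply stand for the host and device copies of one array inside a data region.

```cpp
#include <cassert>
#include <vector>

// The host and the device each hold their own copy of the array.
struct data_region
{
  std::vector<int> host;
  std::vector<int> device;

  // Models the copy to the device at data region entry.
  explicit data_region (const std::vector<int> &init)
    : host (init), device (init) {}
};

enum class where { host, device };

// Kernel 1: double every element of whichever copy it runs on.
inline void scale (data_region &r, where w)
{
  std::vector<int> &v = (w == where::device) ? r.device : r.host;
  for (int &e : v)
    e *= 2;
}

// Kernel 2: sum the elements of whichever copy it runs on.
inline int sum (const data_region &r, where w)
{
  const std::vector<int> &v = (w == where::device) ? r.device : r.host;
  int total = 0;
  for (int e : v)
    total += e;
  return total;
}
```

If scale runs on the device but sum falls back to the host, sum sees the stale host copy; running both in the same place gives the expected result.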


> >> [...]
> 
> Avoid unnecessary braces.
> 
> >> [...]
> 
> Typo.

Thanks for the review; fixed.


Grüße
 Thomas


[wwwdocs] New bits to porting_to

2016-02-10 Thread Marek Polacek
Some minor issues I noticed.

Ok?

Index: porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.7
diff -u -r1.7 porting_to.html
--- porting_to.html 9 Feb 2016 21:06:32 -   1.7
+++ porting_to.html 10 Feb 2016 16:45:37 -
@@ -220,6 +220,28 @@
 the C++ standard library.
 
 
+Call of overloaded 'abs(unsigned int&)' is ambiguous
+
+
+The additional overloads can cause the compiler to reject invalid code that
+was accepted before.  An example of such code is the below:
+
+
+
+#include <stdlib.h>
+
+int
+foo (unsigned x)
+{
+  abs (x);
+}
+
+
+
+Since calling abs() on an unsigned value doesn't make sense,
+this code will become explicitly invalid as per discussion in the LWG.
+
+
 Optimizations remove null pointer checks for this
 
 
@@ -239,6 +261,15 @@
 pointers, not only those involving this.
 
 
+Deprecation of std::auto_ptr
+
+
+The std::auto_ptr template class was deprecated in C++11, so GCC
+now warns about its usage.  This warning can be suppressed with the
+-Wno-deprecated-declarations command-line option, though we advise
+to port the code to use C++11's std::unique_ptr instead.
+
+
 -Wmisleading-indentation
 
 A new warning -Wmisleading-indentation was added
@@ -303,8 +334,24 @@
 the indentation of the source was fixed instead.
 
 
-Links
+Enhanced -Wnonnull
+
+The -Wnonnull warning has been improved so that it also warns
+about comparing parameters declared as nonnull with NULL.  For
+example, the compiler will warn about the following code:
+
 
+
+__attribute__((nonnull)) void
+foo (void *p)
+{
+  if (p == NULL)
+abort ();
+  // ...
+}
+
+
+Links
 
 
 

Marek
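
For readers porting such code, one way to make the intent explicit is sketched below. This is an illustration only and not part of the proposed porting_to.html text; the function names are hypothetical, and the cast variant assumes the unsigned bits were really meant as a signed quantity.

```cpp
#include <cassert>
#include <cstdlib>

// If the value is genuinely unsigned, abs is a no-op and the call can
// simply be dropped.
inline unsigned magnitude (unsigned x)
{
  return x;
}

// If the bits were meant as a signed quantity, say so with a cast.
// Converting an out-of-range unsigned value to int is
// implementation-defined before C++20.
inline int magnitude_as_signed (unsigned x)
{
  return std::abs (static_cast<int> (x));
}
```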


Re: [wwwdocs] New bits to porting_to

2016-02-10 Thread Jonathan Wakely

On 10/02/16 17:46 +0100, Marek Polacek wrote:

+int
+foo (unsigned x)
+{
+  abs (x);


Let's make this "return abs (x);" so we don't have a missing return.


+The std::auto_ptr template class was deprecated in C++11, so GCC


s/template class/class template/



Re: [PATCH] PR driver/69265: improved suggestions for various misspelled options

2016-02-10 Thread Bernd Schmidt

On 02/09/2016 09:44 PM, David Malcolm wrote:

This is a bug in a new feature, so it isn't a regression as such, but
it's fairly visible, and I believe the fix is relatively low-risk
(error-handling of typos of command-line options).

This also now covers PR driver/69453 (and its duplicate PR
driver/69642), so people *are* running into this.


I think the patch looks reasonable (I expect it needs slight adjustment 
after an earlier sanitizer options change). Whether it's OK or not at 
this stage is something I think I'll want to ask a RM. My inclination 
would be yes.


A small improvement might be calculating the candidates array only once 
when making the first suggestion and not freeing it. BTW, I've also run 
into a case of an unhelpful suggestion:


./cc1 ~/hw.c -fno-if-convert
cc1: error: unrecognized command line option ‘-fno-if-convert’

which should instead suggest fno-if-conversion.


Bernd
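
For reference, ranking candidates by edit distance would pick the expected option in this case. The code below is a minimal sketch assuming a plain Levenshtein distance; edit_distance and closest_option are hypothetical helpers and this is not the code from the patch under review.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Classic Levenshtein distance: insertions, deletions and
// substitutions each cost 1.  Uses two rolling rows.
inline std::size_t edit_distance (const std::string &a, const std::string &b)
{
  std::vector<std::size_t> prev (b.size () + 1), cur (b.size () + 1);
  for (std::size_t j = 0; j <= b.size (); ++j)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size (); ++i)
    {
      cur[0] = i;
      for (std::size_t j = 1; j <= b.size (); ++j)
        {
          const std::size_t subst
            = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
          cur[j] = std::min ({subst, prev[j] + 1, cur[j - 1] + 1});
        }
      prev.swap (cur);
    }
  return prev[b.size ()];
}

// Return the candidate closest to TYPED, or an empty string if there
// are no candidates.  Ties go to the earliest candidate.
inline std::string closest_option (const std::string &typed,
                                   const std::vector<std::string> &candidates)
{
  const std::string *best = nullptr;
  std::size_t best_dist = 0;
  for (const std::string &c : candidates)
    {
      const std::size_t d = edit_distance (typed, c);
      if (!best || d < best_dist)
        {
          best = &c;
          best_dist = d;
        }
    }
  return best ? *best : std::string ();
}
```

Because the candidate list is passed in by reference, a caller could build it once and reuse it across suggestions, in the spirit of the improvement mentioned above.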


Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Thomas Schwinge
Hi!

On Wed, 10 Feb 2016 17:37:30 +0100, Bernd Schmidt  wrote:
> On 02/10/2016 05:23 PM, Thomas Schwinge wrote:
> > Why?  A user of GCC has no intrinsic interest in getting OpenACC kernels
> > constructs' code offloaded; the user wants his code to execute as fast as
> > possible.
> >
> > If you consider the whole of OpenACC kernels code offloading as a
> > compiler optimization, then it's fine for GCC to abort this
> > "optimization" if it's reasonably clear that this transformation (code
> > offloading) will not be profitable -- just like what GCC does with other
> > possible code optimizations/transformations.
> 
> Yes, but if a single kernel (which might not even get executed at 
> run-time) can inhibit offloading for the whole program, then we're not 
> making an intelligent decision, and IMO violating user expectations. 

Sure, I agree it's a pretty "rough-grained" decision.  (Owed to the
non-shared-memory offloading architecture -- shared-memory offloading
indeed can make such decisions case by case.)

> IIUC it's also disabling offloading for parallels rather than just 
> kernels, which we previously said shouldn't happen.

Ah, you're talking about mixed OpenACC parallel/kernels codes -- I
understood the earlier discussion to apply to parallel-only codes, where
the "avoid offloading" flag will never be set.  In mixed parallel/kernels
code with one un-parallelized kernels construct, offloading would also
(have to be) disabled for the parallel constructs (for the same data
consistency reasons explained before).  The majority of codes I've seen
use either parallel or kernels constructs, typically not both.

> > As I've said before,
> > profiling the execution times of several real-world codes has shown that
> > under the assumption that parloops fails to parallelize one kernel (one
> > out of possibly many), this one kernel has always been a "hot spot", and
> > avoiding offloading in this case has always helped prevent performance
> > degradation below host-fallback performance.
> 
> IMO a warning for the specific kernel that's problematic would be better 

That's something Tom suggested,
,
and which motivated my patch, in going one step further:

> so that users can selectively apply -fopenacc to files where it is 
> profitable.

This puts it into the hands of the user to selectively mark kernels
constructs as suitable for GCC's current parloops processing (for
example, by disabling OpenACC/offloading on a per-file basis) -- which is
something we wanted to avoid, given the idea that in the future, GCC will
improve, and will be able to handle kernels constructs better, and the
user would then have to re-visit/un-do their earlier changes with each
GCC release, instead of just recompiling their code.

> > It's of course unfortunate that we have to disable our offloading
> > machinery for a lot of codes using OpenACC kernels, but given the current
> > state of OpenACC kernels parallelization analysis (parloops), doing so is
> > still profitable for a user, compared to regressed performance with
> > single-threaded offloaded execution.
> 
> How often does this occur on real-world code?

Quite a lot for code using the kernels construct, as discussed before,
given that parloops fails to handle a lot of constructs in real-world
code.

> Will we end up supporting 
> OpenACC by not doing offloading at all in the usual case?

This whole discussion does not at all apply to the body of OpenACC code
using the parallel instead of the kernels construct, which will be
parallelized/offloaded just fine.

> The way you 
> describe it, it sounds like we should recommend that -fopenacc not be 
> used in gcc-6 and restore the previous invoke.texi language that marks 
> it as experimental.

Huh?  Like, at random, discouraging users from using GCC's SIMD
vectorizer just because that one fails to vectorize some code that it
could/should vectorize?  (Of course, I'm well aware that GCC's SIMD
vectorizer is much more mature than the OpenACC kernels/parloops
handling; it's seen many more years of development.)

Certainly we should document that there is still a lot of room for
improvement in OpenACC kernels handling (just like it's the case for a
lot of other generic compiler optimizations) -- and we're doing exactly
that on .  I don't follow how that
translates to discouraging use of -fopenacc however?


Grüße
 Thomas


Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 05:23 PM, Thomas Schwinge wrote:

Why?  A user of GCC has no intrinsic interest in getting OpenACC kernels
constructs' code offloaded; the user wants his code to execute as fast as
possible.

If you consider the whole of OpenACC kernels code offloading as a
compiler optimization, then it's fine for GCC to abort this
"optimization" if it's reasonably clear that this transformation (code
offloading) will not be profitable -- just like what GCC does with other
possible code optimizations/transformations.


Yes, but if a single kernel (which might not even get executed at 
run-time) can inhibit offloading for the whole program, then we're not 
making an intelligent decision, and IMO violating user expectations. 
IIUC it's also disabling offloading for parallels rather than just 
kernels, which we previously said shouldn't happen.



As I've said before,
profiling the execution times of several real-world codes has shown that
under the assumption that parloops fails to parallelize one kernel (one
out of possibly many), this one kernel has always been a "hot spot", and
avoiding offloading in this case has always helped prevent performance
degradation below host-fallback performance.


IMO a warning for the specific kernel that's problematic would be better 
so that users can selectively apply -fopenacc to files where it is 
profitable.



It's of course unfortunate that we have to disable our offloading
machinery for a lot of codes using OpenACC kernels, but given the current
state of OpenACC kernels parallelization analysis (parloops), doing so is
still profitable for a user, compared to regressed performance with
single-threaded offloaded execution.


How often does this occur on real-world code? Will we end up supporting 
OpenACC by not doing offloading at all in the usual case? The way you 
describe it, it sounds like we should recommend that -fopenacc not be 
used in gcc-6 and restore the previous invoke.texi language that marks 
it as experimental.



Bernd


Re: [wwwdocs] New bits to porting_to

2016-02-10 Thread Marek Polacek
On Wed, Feb 10, 2016 at 05:14:40PM +, Jonathan Wakely wrote:
> On 10/02/16 17:46 +0100, Marek Polacek wrote:
> >+int
> >+foo (unsigned x)
> >+{
> >+  abs (x);
> 
> Let's make this "return abs (x);" so we don't have a missing return.
 
Ok.

> >+The std::auto_ptr template class was deprecated in C++11, so GCC
> 
> s/template class/class template/

Fixed.

I'll commit this then:

Index: porting_to.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-6/porting_to.html,v
retrieving revision 1.7
diff -u -r1.7 porting_to.html
--- porting_to.html 9 Feb 2016 21:06:32 -   1.7
+++ porting_to.html 10 Feb 2016 17:17:49 -
@@ -220,6 +220,27 @@
 the C++ standard library.
 
 
+Call of overloaded 'abs(unsigned int&)' is ambiguous
+
+
+The additional overloads can cause the compiler to reject invalid code that
+was accepted before.  An example of such code is the below:
+
+
+
+#include <stdlib.h>
+int
+foo (unsigned x)
+{
+  return abs (x);
+}
+
+
+
+Since calling abs() on an unsigned value doesn't make sense,
+this code will become explicitly invalid as per discussion in the LWG.
+
+
 Optimizations remove null pointer checks for this
 
 
@@ -239,6 +260,15 @@
 pointers, not only those involving this.
 
 
+Deprecation of std::auto_ptr
+
+
+The std::auto_ptr class template was deprecated in C++11, so GCC
+now warns about its usage.  This warning can be suppressed with the
+-Wno-deprecated-declarations command-line option, though we advise
+to port the code to use C++11's std::unique_ptr instead.
+
+
 -Wmisleading-indentation
 
 A new warning -Wmisleading-indentation was added
@@ -303,8 +333,24 @@
 the indentation of the source was fixed instead.
 
 
-Links
+Enhanced -Wnonnull
+
+The -Wnonnull warning has been improved so that it also warns
+about comparing parameters declared as nonnull with NULL.  For
+example, the compiler will warn about the following code:
+
 
+
+__attribute__((nonnull)) void
+foo (void *p)
+{
+  if (p == NULL)
+abort ();
+  // ...
+}
+
+
+Links
 
 
 

Marek
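
As a rough illustration of the std::auto_ptr advice (a sketch only, not part of the committed page; widget, make_widget and pass_along are hypothetical names), porting mostly means switching the type and making ownership transfers explicit:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct widget
{
  int value;
};

// Before (deprecated):  std::auto_ptr<widget> make_widget (int v);
inline std::unique_ptr<widget> make_widget (int v)
{
  return std::unique_ptr<widget> (new widget{v});
}

// auto_ptr silently transferred ownership on copy; unique_ptr is
// move-only, so callers must hand the pointer over with std::move.
inline std::unique_ptr<widget> pass_along (std::unique_ptr<widget> w)
{
  return w;
}
```

A call then reads pass_along (std::move (p)), which leaves p empty and makes the transfer visible at the call site.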


Re: [PATCH][RFC][Offloading] Fix PR68463

2016-02-10 Thread Ilya Verbin
Hi!

On Tue, Jan 19, 2016 at 16:32:13 +0300, Ilya Verbin wrote:
> On Tue, Jan 19, 2016 at 10:36:28 +0100, Jakub Jelinek wrote:
> > On Tue, Jan 19, 2016 at 09:57:01AM +0100, Richard Biener wrote:
> > > On Mon, 18 Jan 2016, Ilya Verbin wrote:
> > > > On Fri, Jan 15, 2016 at 09:15:01 +0100, Richard Biener wrote:
> > > > > On Fri, 15 Jan 2016, Ilya Verbin wrote:
> > > > > > II) The __offload_func_table, __offload_funcs_end, __offload_var_table,
> > > > > > __offload_vars_end are now provided by the linker script, instead of
> > > > > > crtoffload{begin,end}.o, this allows to surround all offload objects, even
> > > > > > those that are not claimed by lto-plugin.
> > > > > > Unfortunately it works only with ld, but doesn't work with gold, because
> > > > > > https://sourceware.org/bugzilla/show_bug.cgi?id=15373
> > > > > > Any thoughts how to enable this linker script for gold?
> > > > > 
> > > > > The easiest way would probably to add this handling to the default
> > > > > "linker script" in gold.  I don't see an easy way around requiring
> > > > > changes to gold here - maybe dumping the default linker script from
> > > > > bfd and injecting the rules with some scripting so you have a complete
> > > > > script.  Though likely gold won't grok that result.
> > > > > 
> > > > > Really a question for Ian though.
> > > > 
> > > > Or the gcc driver can add crtoffload{begin,end}.o, but the problem is that it
> > > > can't determine whether the program contains offloading or not.  So it can add
> > > > them to all -fopenmp/-fopenacc programs, if the compiler was configured with
> > > > --enable-offload-targets=...  The overhead would be about 340 bytes for
> > > > binaries which don't use offloading.  Is this acceptable?  (Jakub?)
> > > 
> > > Can lto-wrapper add them as plugin outputs?  Or does that wreck ordering?
> 
> Currently it's implemented this way, but it will not work after my patch,
> because e.g. offload-without-lto.o and offload-with-lto.o will be linked in
> this order:
> offload-without-lto.o, crtoffloadbegin.o, offload-with-lto.o, crtoffloadend.o
> ^
> (will be not claimed by the plugin)
> 
> But we need this one:
> crtoffloadbegin.o, offload-without-lto.o, offload-with-lto.o, crtoffloadend.o
> 
> > Yeah, if that would work, it would be certainly appreciated, one thing is
> > wasting .text space and relocations in all -fopenmp programs (for -fopenacc
> > programs one kind of assumes there will be some offloading in there),
> > another one some extra constructor/destructor or what that would be even
> > worse.
> 
> They contain only 5 symbols, without constructors/destructors.

This patch adds crtoffload{begin,end}.o to all -fopenmp programs, if they exist.
I couldn't think of a better solution...
Tested using the testcase from the previous mail, e.g.:

$ gcc -DNUM=1 -c -fopenmp test.c -o obj1.o
$ gcc -DNUM=2 -c -fopenmp test.c -o obj2.o
$ gcc -DNUM=3 -c -fopenmp test.c -o obj3.o
$ gcc -DNUM=4 -c -fopenmp test.c -o obj4.o -flto
$ gcc -DNUM=5 -c -fopenmp test.c -o obj5.o
$ gcc -DNUM=6 -c -fopenmp test.c -o obj6.o -flto
$ gcc -DNUM=7 -c -fopenmp test.c -o obj7.o
$ gcc-ar -cvq libtest.a obj3.o obj4.o obj5.o
$ gcc -fopenmp main.c obj1.o obj2.o libtest.a obj6.o obj7.o

And other combinations.


gcc/
PR driver/68463
* config/gnu-user.h (GNU_USER_TARGET_STARTFILE_SPEC): Add
crtoffloadbegin.o for -fopenacc/-fopenmp if it exists.
(GNU_USER_TARGET_ENDFILE_SPEC): Add crtoffloadend.o for
-fopenacc/-fopenmp if it exists.
* lto-wrapper.c (offloadbegin, offloadend): Remove static vars.
(offload_objects_file_name): New static var.
(tool_cleanup): Remove offload_objects_file_name file.
(copy_file): Remove function.
(find_offloadbeginend): Remove function.
(run_gcc): Remove offload_argc and offload_argv.
Get offload_objects_file_name from -foffload-objects=... option.
Read names of object files with offload from this file, pass them to
compile_images_for_offload_targets.  Don't call find_offloadbeginend and
don't pass offloadbegin and offloadend to the linker.  Don't pass
offload non-LTO files to the linker, because now they're not claimed.
lto-plugin/
PR driver/68463
* lto-plugin.c (struct plugin_offload_file): New.
(offload_files): Change type.
(offload_files_last, offload_files_last_obj): New.
(offload_files_last_lto): New.
(free_2): Adjust accordingly.
(all_symbols_read_handler): Don't add offload files to lto_arg_ptr.
Don't call free_1 for offload_files.  Write names of object files with
offloading to the temporary file.  Add new option to lto_arg_ptr.
(claim_file_handler): Don't claim file if it contains offload sections
without LTO sections.  If it contains offload sections, add to the list.


diff --git 

[patch] c++/61198 backport to gcc-4.9

2016-02-10 Thread Jonathan Wakely

I wanted to backport r232232 (aka 2edb91b1) which helps with the
compile-time regression tracked by PR60976. On the gcc-4.9 branch
it produces lots of ICEs due to PR61198, which was only fixed for
gcc-5.

This backports the PR61198 fix to the 4.9 branch, which resolves the
ICEs I'm seeing and so would let me also backport the libstdc++
change.

Is this safe for 4.9?

Tested powerpc64le-linux, no new failures.

commit d98a90afa73cecd8c62a93381cde2077471753f2
Author: Jonathan Wakely 
Date:   Wed Feb 10 12:41:26 2016 +

Backport PR c++/61198 fix

gcc:
	2014-12-19  Kai Tietz  

	PR c++/61198
	* pt.c (most_general_template): Don't break for template-alias.

gcc/testsuite:

	2014-12-19  Kai Tietz  
		Paolo Carlini  

	PR c++/61198
	* g++.dg/cpp0x/alias-decl-45.C: New file.

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 60e9671..7485b95 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18873,6 +18873,7 @@ most_general_template (tree decl)
 	break;
 
   if (CLASS_TYPE_P (TREE_TYPE (decl))
+	  && !TYPE_DECL_ALIAS_P (TYPE_NAME (TREE_TYPE (decl)))
 	  && CLASSTYPE_TEMPLATE_SPECIALIZATION (TREE_TYPE (decl)))
 	break;
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/alias-decl-45.C b/gcc/testsuite/g++.dg/cpp0x/alias-decl-45.C
new file mode 100644
index 000..e3434f5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/alias-decl-45.C
@@ -0,0 +1,24 @@
+// PR c++/61198
+// { dg-do compile { target c++11 } }
+
+template<int herp, typename derp_t>
+struct broken
+{
+	template<typename target_t>
+	using rebind = broken<herp, target_t>;
+};
+
+template<typename derp_t>
+struct broken<2, derp_t>
+{
+	template<typename target_t>
+	using rebind = broken<2, target_t>;
+};
+
+int main(int argc, char **argv)
+{		
+	broken<2, float>::rebind<double> u;
+
+	return 0;
+}
+


Re: [Patch, fortran, pr67451, v1] [5/6 Regression] ICE with sourced allocation from coarray

2016-02-10 Thread Dominique d'Humières
> Hi all,
> 
> unfortunately my last patch for pr67451 was not perfect and introduced
> regressions occurring on s390(x) and with the sanitizer.  These were
> caused because, when taking the array specs from the source=-expression,
> its attributes, like coarray state and so on, were also taken from there.
> This additionally added a corank to local objects to allocate that were
> not coarrays, overwriting data in the array handle.  The attached patch
> fixes both issues.
>
> The patch for gcc-5 is not affected, because in gcc-5 the feature of
> taking the array spec from the source=-expression is not implemented.
>
> Bootstrapped and regtested ok on x86_64-linux-gnu/F23.
>
>  Ok for trunk?
>
> Regards,
> Andre

The patch fixes the two issues I saw on x86_64-apple-darwin15.

Thanks.

Dominique



Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Thomas Schwinge
Hi!

On Wed, 10 Feb 2016 16:27:40 +0100, Bernd Schmidt  wrote:
> On 02/10/2016 03:39 PM, Thomas Schwinge wrote:
> 
> > Yes, we need a hammer that big: we have to ensure consistency between
> > data regions on the device and code offloading to the device, as
> > otherwise we'll very easily run into inconsistencies, because of the
> > non-shared memory.  In the general case, it's "all or nothing": you
> > either have to offload all kernels or none of them.
> 
> That's unfortunately not the impression I got from the earlier 
> discussion

:-(

> and this seems to imply that one unprofitable kernel would 
> disable all the others

Correct.

> - IMO this is not acceptable.

Why?  A user of GCC has no intrinsic interest in getting OpenACC kernels
constructs' code offloaded; the user wants his code to execute as fast as
possible.

If you consider the whole of OpenACC kernels code offloading as a
compiler optimization, then it's fine for GCC to abort this
"optimization" if it's reasonably clear that this transformation (code
offloading) will not be profitable -- just like what GCC does with other
possible code optimizations/transformations.  As I've said before,
profiling the execution times of several real-world codes has shown that
under the assumption that parloops fails to parallelize one kernel (one
out of possibly many), this one kernel has always been a "hot spot", and
avoiding offloading in this case has always helped prevent performance
degradation below host-fallback performance.

It's of course unfortunate that we have to disable our offloading
machinery for a lot of codes using OpenACC kernels, but given the current
state of OpenACC kernels parallelization analysis (parloops), doing so is
still profitable for a user, compared to regressed performance with
single-threaded offloaded execution.

Of course...

> There need to be 
> more compiler smarts to figure out whether a kernel is a valid candidate 
> for skipping the offloading.

... that would be better, obviously.  But, I suggest we work on that
incrementally, after fixing the performance regression with my "avoid
offloading" patch.

I have difficulties coming up with an algorithm/parametrization to have
the compiler/runtime decide whether offloading will be profitable given
input parameters such as a ratio of parallelized/single-threaded kernels.
So I'm all ears to suggestions in that regard.  Consider: if we encounter
a single-threaded kernel, the compiler (parloops) has just given up
"understanding" the user's code.  And again, implementing such heuristics
to me sounds like incremental follow-up projects, quite possibly in
combination with generally improving OpenACC kernels handling/parloops.


Grüße
 Thomas
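
The "all or nothing" rule described above can be sketched as follows.  This is only an illustration of the decision, not the code in the "avoid offloading" patch; kernel_info and should_offload are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical summary of one OpenACC kernels region after parloops ran.
struct kernel_info
{
  bool parallelized;  // false: would run single-threaded on the device
};

// "All or nothing": offload only if no kernel was left un-parallelized,
// so data regions and offloaded code stay consistent with each other.
inline bool
should_offload (const std::vector<kernel_info> &kernels)
{
  return std::all_of (kernels.begin (), kernels.end (),
                      [] (const kernel_info &k) { return k.parallelized; });
}
```

A finer-grained heuristic, such as a threshold on the ratio of parallelized to single-threaded kernels, would replace the std::all_of test; as said above, it is unclear what a good parametrization would be.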


Re: [PATCH, PR67709 ] Don't call call_cgraph_insertion_hooks in simd_clone_create

2016-02-10 Thread Dominique d'Humières
The patch fixes the PR on x86_64-apple-darwin15.

> OK for stage1 trunk?

What is the reason to delay the fix for a couple of months?

Thanks for working on the issue.

Dominique



Re: [PATCH, PR67709 ] Don't call call_cgraph_insertion_hooks in simd_clone_create

2016-02-10 Thread Tom de Vries

On 10/02/16 17:55, Dominique d'Humières wrote:
> The patch fixes the PR on x86_64-apple-darwin15.
>
> > OK for stage1 trunk?
>
> What is the reason to delay the fix for a couple of months?


We're in stage4 (  https://gcc.gnu.org/ml/gcc/2016-01/msg00168.html ):
...
trunk is in regression and documentation fixes stage now.  This means 
any new features or fixes for bugs that are not regressions have to wait 
for GCC 7 now.

...

This patch doesn't fix a regression. So AFAIU, this patch belongs in stage1.

Jakub opened up the possibility here ( 
https://gcc.gnu.org/ml/gcc-patches/2016-02/msg00545.html ) to accept it 
for stage4, depending on feedback from Honza and Richi.


Thanks,
- Tom




Re: [PATCH, stage1] Better error recovery for merge-conflict markers

2016-02-10 Thread David Malcolm
On Mon, 2016-02-08 at 10:07 +0100, Bert Wesarg wrote:
> David,
> 
> On Thu, Apr 9, 2015 at 10:29 AM, Bert Wesarg <
> bert.wes...@googlemail.com> wrote:
> > Hi David,
> > 
> > > Various tools that operate on source code files will inject
> > > markers
> > > into them when an unfixable conflict occurs in a merger.
> > > 
> > > There appears to be no blessed standard for these conflict
> > > markers,
> > > but an ad-hoc convention is for 7 '<' , '=', or '>' characters at
> > > the start of a line, followed optionally by a space and optional
> > > text
> > > 
> > > e.g.:
> > > > <<<<<<< HEAD
> > > > extern int some_var;
> > > > =======
> > > > extern short some_var;
> > > > >>>>>>> Some other branch
> > > 
> > > This convention is followed by GNU patch:
> > >   http://git.savannah.gnu.org/cgit/patch.git/tree/src/merge.c
> > > by git:
> > > 
> > > http://git.kernel.org/cgit/git/git.git/tree/Documentation/merge-c
> > > onfig.txt
> > > and by various other tools.
> > 
> > 
> > if you read both of these tools carefully, you will notice an
> > alternative
> > conflict style (named 'diff3' in both of them), that includes a
> > third
> > section, the common pre-image. Here is an example:
> > 
> > > <<<<<<< HEAD
> > > extern int some_var;
> > > ||||||| merge base
> > > extern int var;
> > > =======
> > > extern short var;
> > > >>>>>>> Some other branch
> > 
> > 
> > Additionally, git supports a custom conflict-marker-size to change
> > the
> > default of 7 on a per file name (the conflict-marker-size
> > attribute). So it
> > may be worthwhile to support other sizes than 7 in this patch too.
> 
> you never commented on my mail, but I saw this now in trunk. I can
> only repeat myself here.

Thanks.  FWIW I did read your mail, and it was a factor in me proposing
this alternate implementation which would be more flexible:
https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01515.html
but that implementation had the drawback of not working with .i files
from -save-temps, so we went with the original approach.
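
To make the marker convention discussed in this thread concrete, here is a small standalone sketch of a line classifier with a configurable marker size.  It is not the implementation that went into trunk; is_conflict_marker and its parameters are hypothetical, and accepting '|' for the diff3 style is an assumption made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Returns true if LINE starts with exactly MARKER_SIZE copies of one of
// '<', '=', '>' (or '|', as used by the diff3 style), followed either by
// end of line or by a space.  MARKER_SIZE defaults to the conventional 7.
inline bool
is_conflict_marker (const std::string &line, std::size_t marker_size = 7)
{
  if (marker_size == 0 || line.size () < marker_size)
    return false;
  const char c = line[0];
  if (c != '<' && c != '=' && c != '>' && c != '|')
    return false;
  for (std::size_t i = 1; i < marker_size; ++i)
    if (line[i] != c)
      return false;
  // A longer run of the same character is not a marker of this size.
  return line.size () == marker_size || line[marker_size] == ' ';
}
```

Passing a different marker_size would correspond to git's per-file conflict-marker-size attribute mentioned above.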



C++ PATCH for c++/68926 (SFINAE failure with template-id)

2016-02-10 Thread Jason Merrill

A standard SFINAE problem of not passing 'complain' down far enough.

Tested x86_64-pc-linux-gnu, applying to trunk, 5, 4.9.
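
As background on the pattern being fixed, here is a toy model of passing a "complain" flag down through a helper.  The types and functions are stand-ins invented for illustration; they are not GCC's tsubst_flags_t, resolve_nondeduced_context or standard_conversion.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-ins for the "quiet" and "report errors" modes.
enum complain_flags { complain_none = 0, complain_error = 1 };

// Innermost helper: records a diagnostic only if the caller asked for it.
inline bool
resolve (bool resolvable, complain_flags complain,
         std::vector<std::string> &diags)
{
  if (resolvable)
    return true;
  if (complain & complain_error)
    diags.push_back ("cannot resolve expression");
  return false;
}

// Intermediate layer: must forward COMPLAIN instead of letting the helper
// assume "report errors", otherwise a failure during SFINAE turns into a
// hard error rather than a quiet deduction failure.
inline bool
convert (bool resolvable, complain_flags complain,
         std::vector<std::string> &diags)
{
  return resolve (resolvable, complain, diags);
}
```

With complain_none the failure is reported only through the return value, which is what a SFINAE context needs.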
commit ca7de783e98c7b1163be0eefacca762e8aa3e685
Author: Jason Merrill 
Date:   Wed Feb 10 12:45:55 2016 -0500

	PR c++/68926

	* pt.c (resolve_nondeduced_context): Add complain parm.
	(do_auto_deduction): Pass it.
	* cvt.c (convert_to_void): Likewise.
	* decl.c (cp_finish_decl): Likewise.
	* init.c (build_new): Likewise.
	* rtti.c (get_tinfo_decl_dynamic): Likewise.
	* semantics.c (finish_decltype_type): Likewise.
	* typeck.c (decay_conversion): Likewise.
	* cp-tree.h: Adjust declaration.
	* call.c (standard_conversion): Add complain parm, pass it along.
	(implicit_conversion): Pass it.

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index ce87be7..cb71176 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -190,7 +190,6 @@ static struct z_candidate *add_function_candidate
 	 tree, int, tsubst_flags_t);
 static conversion *implicit_conversion (tree, tree, tree, bool, int,
 	tsubst_flags_t);
-static conversion *standard_conversion (tree, tree, tree, bool, int);
 static conversion *reference_binding (tree, tree, tree, bool, int,
   tsubst_flags_t);
 static conversion *build_conv (conversion_kind, tree, conversion *);
@@ -1080,7 +1079,7 @@ strip_top_quals (tree t)
 
 static conversion *
 standard_conversion (tree to, tree from, tree expr, bool c_cast_p,
-		 int flags)
+		 int flags, tsubst_flags_t complain)
 {
   enum tree_code fcode, tcode;
   conversion *conv;
@@ -1110,7 +1109,7 @@ standard_conversion (tree to, tree from, tree expr, bool c_cast_p,
   else if (TREE_CODE (to) == BOOLEAN_TYPE)
 	{
 	  /* Necessary for eg, TEMPLATE_ID_EXPRs (c++/50961).  */
-	  expr = resolve_nondeduced_context (expr);
+	  expr = resolve_nondeduced_context (expr, complain);
 	  from = TREE_TYPE (expr);
 	}
 }
@@ -1149,7 +1148,8 @@ standard_conversion (tree to, tree from, tree expr, bool c_cast_p,
 	 the standard conversion sequence to perform componentwise
 	 conversion.  */
   conversion *part_conv = standard_conversion
-	(TREE_TYPE (to), TREE_TYPE (from), NULL_TREE, c_cast_p, flags);
+	(TREE_TYPE (to), TREE_TYPE (from), NULL_TREE, c_cast_p, flags,
+	 complain);
 
   if (part_conv)
 	{
@@ -1799,7 +1799,7 @@ implicit_conversion (tree to, tree from, tree expr, bool c_cast_p,
   if (TREE_CODE (to) == REFERENCE_TYPE)
 conv = reference_binding (to, from, expr, c_cast_p, flags, complain);
   else
-conv = standard_conversion (to, from, expr, c_cast_p, flags);
+conv = standard_conversion (to, from, expr, c_cast_p, flags, complain);
 
   if (conv)
 return conv;
diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index ead017e..3b91089 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -6180,7 +6180,7 @@ extern tree get_template_parms_at_level (tree, int);
 extern tree get_template_innermost_arguments	(const_tree);
 extern tree get_template_argument_pack_elems	(const_tree);
 extern tree get_function_template_decl		(const_tree);
-extern tree resolve_nondeduced_context		(tree);
+extern tree resolve_nondeduced_context		(tree, tsubst_flags_t);
 extern hashval_t iterative_hash_template_arg (tree arg, hashval_t val);
 extern tree coerce_template_parms   (tree, tree, tree);
 extern tree coerce_template_parms   (tree, tree, tree, tsubst_flags_t);
diff --git a/gcc/cp/cvt.c b/gcc/cp/cvt.c
index 60362fd..0d1048c 100644
--- a/gcc/cp/cvt.c
+++ b/gcc/cp/cvt.c
@@ -1253,7 +1253,7 @@ convert_to_void (tree expr, impl_conv_void implicit, tsubst_flags_t complain)
 
 default:;
 }
-  expr = resolve_nondeduced_context (expr);
+  expr = resolve_nondeduced_context (expr, complain);
   {
 tree probe = expr;
 
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 11f7ce6..09bd512 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -6575,7 +6575,7 @@ cp_finish_decl (tree decl, tree init, bool init_const_expr_p,
   if (TREE_CODE (d_init) == TREE_LIST)
 	d_init = build_x_compound_expr_from_list (d_init, ELK_INIT,
 		  tf_warning_or_error);
-  d_init = resolve_nondeduced_context (d_init);
+  d_init = resolve_nondeduced_context (d_init, tf_warning_or_error);
   type = TREE_TYPE (decl) = do_auto_deduction (type, d_init,
 		   auto_node,
tf_warning_or_error,
diff --git a/gcc/cp/init.c b/gcc/cp/init.c
index cb2e852..338f85e 100644
--- a/gcc/cp/init.c
+++ b/gcc/cp/init.c
@@ -3364,7 +3364,7 @@ build_new (vec **placement, tree type, tree nelts,
   if (auto_node)
 	{
 	  tree d_init = (**init)[0];
-	  d_init = resolve_nondeduced_context (d_init);
+	  d_init = resolve_nondeduced_context (d_init, complain);
 	  type = do_auto_deduction (type, d_init, auto_node);
 	}
 }
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 76a6019..a215aa7 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -18575,7 +18575,7 @@ 

Re: [PATCH] Fix PR c++/69098 (bogus errors with static data member template)

2016-02-10 Thread Patrick Palka
On Wed, Feb 10, 2016 at 2:16 PM, Patrick Palka  wrote:
> tsubst_qualified_id() is currently not prepared to handle a SCOPED_REF
> whose RHS is a variable template.  r226642 made this deficiency more
> obvious by marking all variable templates as dependent (thus forcing
> them to be wrapped in a SCOPED_REF) but before that it was also possible
> to trigger a bogus error if the scope of the variable template was
> dependent (e.g. foo2 in the test case 69098-2.C fails to compile even
> before r226642, whereas foo1 fails to compile only after r226642).
>
> Further, check_template_keyword() is currently not prepared to handle
> variable templates as well.  And again r226642 helped to expose this
> issue but it was already possible to trigger before that (e.g. foo4
> always failed to compile whereas foo3 only fails after r226642).

Err, sorry, disregard this last sentence, including the parenthetical.
Both tsubst_qualified_id() and check_template_keyword() must be fixed
in order for foo2 to compile.  For foo1 to compile, only
tsubst_qualified_id() must be fixed.  foo3 and foo4 are just examples
for which an earlier iteration of this patch caused an ICE.


[PATCH] Fix PR c++/69098 (bogus errors with static data member template)

2016-02-10 Thread Patrick Palka
tsubst_qualified_id() is currently not prepared to handle a SCOPED_REF
whose RHS is a variable template.  r226642 made this deficiency more
obvious by marking all variable templates as dependent (thus forcing
them to be wrapped in a SCOPED_REF) but before that it was also possible
to trigger a bogus error if the scope of the variable template was
dependent (e.g. foo2 in the test case 69098-2.C fails to compile even
before r226642, whereas foo1 fails to compile only after r226642).

Further, check_template_keyword() is currently not prepared to handle
variable templates as well.  And again r226642 helped to expose this
issue but it was already possible to trigger before that (e.g. foo4
always failed to compile whereas foo3 only fails after r226642).

This patch makes tsubst_qualified_id() and check_template_keyword()
handle variable templates accordingly.  The changes in
check_template_keyword() are fairly straightforward, and in
tsubst_qualified_id() I just copied the way variable templates are
handled in tsubst_copy_and_build [TEMPLATE_ID_EXPR].

Bootstrap + regtest in progress on x86_64-pc-linux-gnu, Ok to commit
after testing?
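
For illustration, here is a minimal sketch of the two kinds of qualified-id involved: a static data member template named through a non-dependent scope and through a dependent scope.  The names (traits, size_of, direct_lookup, dependent_lookup) are hypothetical and are not the ones used in the new test cases.

```cpp
#include <cassert>

struct traits
{
  // Declaration of a static data member template (C++14).
  template <typename T>
  static const int size_of;
};

// Out-of-class definition of the member template.
template <typename T>
const int traits::size_of = static_cast<int> (sizeof (T));

// Non-dependent scope: the variable template is named directly.
inline int
direct_lookup ()
{
  return traits::size_of<char>;
}

// Dependent scope: the 'template' keyword is needed, so the name is
// resolved only when Scope is substituted.
template <typename Scope>
int
dependent_lookup ()
{
  return Scope::template size_of<char>;
}
```

Both functions are expected to return 1 (sizeof (char)) with a compiler that handles variable templates in qualified-ids.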

gcc/cp/ChangeLog:

PR c++/69098
* pt.c (tsubst_qualified_id): Consider that EXPR might
be a variable template.
* typeck.c (check_template_keyword): Don't emit an error
if DECL is a variable template.

gcc/testsuite/ChangeLog:

PR c++/69098
* g++.dg/cpp1y/69098.C: New test.
* g++.dg/cpp1y/69098-2.C: New test.
---
 gcc/cp/pt.c  | 15 -
 gcc/cp/typeck.c  | 10 -
 gcc/testsuite/g++.dg/cpp1y/69098-2.C | 37 +++
 gcc/testsuite/g++.dg/cpp1y/69098.C   | 43 
 4 files changed, 103 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/69098-2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/69098.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 725adba..6780a98 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -13726,7 +13726,20 @@ tsubst_qualified_id (tree qualified_id, tree args,
 }
 
   if (is_template)
-expr = lookup_template_function (expr, template_args);
+{
+  if (variable_template_p (expr))
+   {
+ expr = lookup_template_variable (expr, template_args);
+ if (!any_dependent_template_arguments_p (template_args))
+   {
+ expr = finish_template_variable (expr, complain);
+ mark_used (expr);
+   }
+ expr = convert_from_reference (expr);
+   }
+  else
+   expr = lookup_template_function (expr, template_args);
+}
 
   if (expr == error_mark_node && complain & tf_error)
 qualified_name_lookup_error (scope, TREE_OPERAND (qualified_id, 1),
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index d2c23f4..959dc5a 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -2601,7 +2601,15 @@ check_template_keyword (tree decl)
   if (TREE_CODE (decl) != TEMPLATE_DECL
   && TREE_CODE (decl) != TEMPLATE_ID_EXPR)
 {
-  if (!is_overloaded_fn (decl))
+  if (VAR_P (decl))
+   {
+ if (DECL_USE_TEMPLATE (decl)
+ && PRIMARY_TEMPLATE_P (DECL_TI_TEMPLATE (decl)))
+   ;
+ else
+   permerror (input_location, "%qD is not a template", decl);
+   }
+  else if (!is_overloaded_fn (decl))
permerror (input_location, "%qD is not a template", decl);
   else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/69098-2.C b/gcc/testsuite/g++.dg/cpp1y/69098-2.C
new file mode 100644
index 000..2e968bb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/69098-2.C
@@ -0,0 +1,37 @@
+// PR c++/69098
+// { dg-do compile { target c++14 } }
+
+struct A
+{
+  template 
+  static void *pf;
+};
+
+template 
+bool foo1 () {
+  return A::pf;
+}
+
+template 
+bool foo2 () {
+  return B::template pf;
+}
+
+template 
+bool foo3 () {
+  return ::pf;
+}
+
+template 
+bool foo4 () {
+  return ::template pf;
+}
+
+
+void bar () {
+  foo1();
+  foo2();
+  foo3();
+  foo4();
+}
+
diff --git a/gcc/testsuite/g++.dg/cpp1y/69098.C b/gcc/testsuite/g++.dg/cpp1y/69098.C
new file mode 100644
index 000..afc4294
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/69098.C
@@ -0,0 +1,43 @@
+// PR c++/69098
+// { dg-do compile { target c++14 } }
+
+template struct SpecPerType;
+
+class Specializer
+{
+public:
+template void MbrFnTempl() //Must be a template
+   {
+   }
+   template struct InnerClassTempl
+   {  //Had to be a template whenever I tested for it
+   static void InnerMemberFn();
+   };
+
+   void Trigger()
+   {
+   InnerClassTempl<0u>::InnerMemberFn();
+   }
+};
+
+template<> struct SpecPerType
+{
+   using FnType = void (Specializer::*)();
+template static constexpr FnType SpecMbrFnPtr =
+::template MbrFnTempl;
+};
+
+template constexpr SpecPerType::FnType
+SpecPerType::SpecMbrFnPtr; 

Re: Warning location fix, PR c++/69733

2016-02-10 Thread Jakub Jelinek
On Wed, Feb 10, 2016 at 08:26:42PM +0100, Bernd Schmidt wrote:
> This PR notes that in this warning:
> const.ii:5:25: warning: type qualifiers ignored on function return type
> [-Wignored-qualifiers]
> const double value() const {return val;}
>  ^
> 
> we are pointing at the wrong qualifier. Below I'm attaching a patch that
> makes it point at the first qualifier of the return type (or the return type
> in case it's a typedef with qualifiers) instead. However, it turns out this
> is not consistent with the C frontend, which points at the function name for
> this warning.
> 
> I'm guessing we want to be consistent between frontends, and I also have a
> similar patch for C. Before I finalize it all with testcases and everything
> - which behaviour is desired?

Just a nit from compile time POV, wouldn't it be better to compute loc only
inside of the if (SCALAR_TYPE_P (type) || VOID_TYPE_P (type)) block, so that
it is not computed when it is not needed?

> --- gcc/cp/decl.c (revision 233217)
> +++ gcc/cp/decl.c (working copy)
> @@ -10009,8 +10009,14 @@ grokdeclarator (const cp_declarator *dec
>  
>   if (type_quals != TYPE_UNQUALIFIED)
> {
> + location_t loc;
> + loc = smallest_type_quals_location (type_quals,
> + declspecs->locations);
> + if (loc == UNKNOWN_LOCATION)
> +   loc = declspecs->locations[ds_type_spec];
>   if (SCALAR_TYPE_P (type) || VOID_TYPE_P (type))
> -   warning (OPT_Wignored_qualifiers,
> +   warning_at (loc,
> +   OPT_Wignored_qualifiers,
>  "type qualifiers ignored on function return type");
>   /* We now know that the TYPE_QUALS don't apply to the
>  decl, but to its return type.  */

Jakub
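
A standalone sketch of the suggested restructuring, with the location lookup moved inside the branch that actually warns.  Everything here is a stand-in (location, find_qualifier_location, maybe_warn_ignored_quals); it models the control flow only and does not use GCC's API.

```cpp
#include <cassert>

// Stand-ins for location_t and UNKNOWN_LOCATION.
typedef unsigned int location;
static const location unknown_location = 0;

// Counts how often the lookup runs, to show it is skipped when no
// warning will be issued.
static int lookups = 0;

// Stand-in for the qualifier location lookup.
static location
find_qualifier_location (location first_qual)
{
  ++lookups;
  return first_qual;
}

// Returns the location to warn at, or unknown_location if there is
// nothing to warn about.  The lookup happens only on the warning path.
static location
maybe_warn_ignored_quals (bool has_quals, bool scalar_or_void,
                          location first_qual, location type_spec)
{
  if (!has_quals || !scalar_or_void)
    return unknown_location;
  location loc = find_qualifier_location (first_qual);
  if (loc == unknown_location)
    loc = type_spec;  // fall back to the type specifier's location
  return loc;
}
```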


libgo patch committed: Support pkg-config in go tool

2016-02-10 Thread Ian Lance Taylor
Michael Hudson-Doyle has written a patch to support pkg-config for
gccgo in the go tool.  The patch has not been committed to the master
sources, because they are in a freeze for the Go 1.6 release.  I've
decided to pick it up and commit it to the gccgo sources, so that it
is available for the GCC 6 release.  This fixes GCC PR 66904.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 233260)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-4cec4c5db5b054c5536ec5c50ee7aebec83563bc
+28a9dfbc3cda0bf7fd4f3fb1506c547f6cdf41a5
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/cmd/go/build.go
===
--- libgo/go/cmd/go/build.go(revision 233110)
+++ libgo/go/cmd/go/build.go(working copy)
@@ -1445,6 +1445,9 @@ func (b *builder) build(a *action) (err
if err != nil {
return err
}
+   if _, ok := buildToolchain.(gccgoToolchain); ok {
+   cgoObjects = append(cgoObjects, filepath.Join(a.objdir, "_cgo_flags"))
+   }
cgoObjects = append(cgoObjects, outObj...)
gofiles = append(gofiles, outGo...)
}
@@ -2620,12 +2623,64 @@ func (tools gccgoToolchain) ld(b *builde
cxx := len(root.p.CXXFiles) > 0 || len(root.p.SwigCXXFiles) > 0
objc := len(root.p.MFiles) > 0
 
+   readCgoFlags := func(flagsFile string) error {
+   flags, err := ioutil.ReadFile(flagsFile)
+   if err != nil {
+   return err
+   }
+   for _, line := range strings.Split(string(flags), "\n") {
+   if strings.HasPrefix(line, "_CGO_LDFLAGS=") {
+   cgoldflags = append(cgoldflags, strings.Fields(line[13:])...)
+   }
+   }
+   return nil
+   }
+
+   readAndRemoveCgoFlags := func(archive string) (string, error) {
+   newa, err := ioutil.TempFile(b.work, filepath.Base(archive))
+   if err != nil {
+   return "", err
+   }
+   olda, err := os.Open(archive)
+   if err != nil {
+   return "", err
+   }
+   _, err = io.Copy(newa, olda)
+   if err != nil {
+   return "", err
+   }
+   err = olda.Close()
+   if err != nil {
+   return "", err
+   }
+   err = newa.Close()
+   if err != nil {
+   return "", err
+   }
+
+   newarchive := newa.Name()
+   err = b.run(b.work, root.p.ImportPath, nil, "ar", "x", newarchive, "_cgo_flags")
+   if err != nil {
+   return "", err
+   }
+   err = b.run(".", root.p.ImportPath, nil, "ar", "d", newarchive, "_cgo_flags")
+   if err != nil {
+   return "", err
+   }
+   err = readCgoFlags(filepath.Join(b.work, "_cgo_flags"))
+   if err != nil {
+   return "", err
+   }
+   return newarchive, nil
+   }
+
actionsSeen := make(map[*action]bool)
// Make a pre-order depth-first traversal of the action graph, taking note of
// whether a shared library action has been seen on the way to an action (the
// construction of the graph means that if any path to a node passes through
// a shared library action, they all do).
var walk func(a *action, seenShlib bool)
+   var err error
walk = func(a *action, seenShlib bool) {
if actionsSeen[a] {
return
@@ -2644,16 +2699,23 @@ func (tools gccgoToolchain) ld(b *builde
// doesn't work.
if !apackagesSeen[a.p] {
apackagesSeen[a.p] = true
+   target := a.target
+   if len(a.p.CgoFiles) > 0 {
+   target, err = readAndRemoveCgoFlags(target)
+   if err != nil {
+   return
+   }
+   }
if a.p.fake && a.p.external {
// external _tests, if present must come before
// internal _tests. Store these on a separate list
// and place them 

gcc-5-branch backports

2016-02-10 Thread Jakub Jelinek
Hi!

I've committed following backports of my trunk commits to gcc-5-branch
after bootstrapping/regtesting them on x86_64-linux and i686-linux.

Jakub
2016-02-10  Jakub Jelinek  

Backported from mainline
2015-12-03  Jakub Jelinek  

PR preprocessor/57580
* c-ppoutput.c (print): Change printed field to bool.
Move src_file last for smaller padding.
(init_pp_output): Set print.printed to false instead of 0.
(scan_translation_unit): Fix up formatting.  Set print.printed
to true after printing something other than newline.
(scan_translation_unit_trad): Set print.printed to true instead of 1.
(maybe_print_line_1): Set print.printed to false instead of 0.
(print_line_1): Likewise.
(do_line_change): Set print.printed to true instead of 1.
(cb_define, dump_queued_macros, cb_include, cb_def_pragma,
dump_macro): Set print.printed to false after printing newline.

* c-c++-common/cpp/pr57580.c: New test.
* c-c++-common/gomp/pr57580.c: New test.

--- gcc/c-family/c-ppoutput.c   (revision 231212)
+++ gcc/c-family/c-ppoutput.c   (revision 231213)
@@ -31,11 +31,11 @@ static struct
   const cpp_token *prev;   /* Previous token.  */
   const cpp_token *source; /* Source token for spacing.  */
   int src_line;/* Line number currently being written.  */
-  unsigned char printed;   /* Nonzero if something output at line.  */
+  bool printed;/* True if something output at line.  */
   bool first_time; /* pp_file_change hasn't been called yet.  */
-  const char *src_file;/* Current source file.  */
   bool prev_was_system_token;  /* True if the previous token was a
   system token.*/
+  const char *src_file;/* Current source file.  */
 } print;
 
 /* Defined and undefined macros being queued for output with -dU at
@@ -153,7 +153,7 @@ init_pp_output (FILE *out_stream)
 
   /* Initialize the print structure.  */
   print.src_line = 1;
-  print.printed = 0;
+  print.printed = false;
   print.prev = 0;
   print.outf = out_stream;
   print.first_time = 1;
@@ -206,12 +206,16 @@ scan_translation_unit (cpp_reader *pfile
{
  line_marker_emitted = do_line_change (pfile, token, loc, false);
  putc (' ', print.outf);
+ print.printed = true;
}
  else if (print.source->flags & PREV_WHITE
   || (print.prev
   && cpp_avoid_paste (pfile, print.prev, token))
   || (print.prev == NULL && token->type == CPP_HASH))
-   putc (' ', print.outf);
+   {
+ putc (' ', print.outf);
+ print.printed = true;
+   }
}
   else if (token->flags & PREV_WHITE)
{
@@ -222,6 +226,7 @@ scan_translation_unit (cpp_reader *pfile
  && !in_pragma)
line_marker_emitted = do_line_change (pfile, token, loc, false);
  putc (' ', print.outf);
+ print.printed = true;
}
 
   avoid_paste = false;
@@ -239,7 +244,7 @@ scan_translation_unit (cpp_reader *pfile
fprintf (print.outf, "%s %s", space, name);
  else
fprintf (print.outf, "%s", name);
- print.printed = 1;
+ print.printed = true;
  in_pragma = true;
}
   else if (token->type == CPP_PRAGMA_EOL)
@@ -250,23 +255,23 @@ scan_translation_unit (cpp_reader *pfile
   else
{
  if (cpp_get_options (parse_in)->debug)
- linemap_dump_location (line_table, token->src_loc,
-print.outf);
+   linemap_dump_location (line_table, token->src_loc, print.outf);
 
  if (do_line_adjustments
  && !in_pragma
  && !line_marker_emitted
- && print.prev_was_system_token != !!in_system_header_at(loc)
+ && print.prev_was_system_token != !!in_system_header_at (loc)
  && !is_location_from_builtin_token (loc))
/* The system-ness of this token is different from the one
   of the previous token.  Let's emit a line change to
   mark the new system-ness before we emit the token.  */
{
  do_line_change (pfile, token, loc, false);
- print.prev_was_system_token = !!in_system_header_at(loc);
+ print.prev_was_system_token = !!in_system_header_at (loc);
}
  cpp_output_token (token, print.outf);
  line_marker_emitted = false;
+ print.printed = true;
}
 
   /* CPP_COMMENT tokens and raw-string literal tokens can
@@ -316,7 +321,7 @@ scan_translation_unit_trad (cpp_reader *
   size_t len = pfile->out.cur - pfile->out.base;
   maybe_print_line (pfile->out.first_line);
   fwrite 

Warning location fix, PR c++/69733

2016-02-10 Thread Bernd Schmidt

This PR notes that in this warning:
const.ii:5:25: warning: type qualifiers ignored on function return type 
[-Wignored-qualifiers]

const double value() const {return val;}
 ^

we are pointing at the wrong qualifier. Below I'm attaching a patch that 
makes it point at the first qualifier of the return type (or the return 
type in case it's a typedef with qualifiers) instead. However, it turns 
out this is not consistent with the C frontend, which points at the 
function name for this warning.


I'm guessing we want to be consistent between frontends, and I also have 
a similar patch for C. Before I finalize it all with testcases and 
everything - which behaviour is desired?



Bernd
Index: gcc/cp/decl.c
===
--- gcc/cp/decl.c	(revision 233217)
+++ gcc/cp/decl.c	(working copy)
@@ -10009,8 +10009,14 @@ grokdeclarator (const cp_declarator *dec
 
 	if (type_quals != TYPE_UNQUALIFIED)
 	  {
+		location_t loc;
+		loc = smallest_type_quals_location (type_quals,
+		declspecs->locations);
+		if (loc == UNKNOWN_LOCATION)
+		  loc = declspecs->locations[ds_type_spec];
 		if (SCALAR_TYPE_P (type) || VOID_TYPE_P (type))
-		  warning (OPT_Wignored_qualifiers,
+		  warning_at (loc,
+			  OPT_Wignored_qualifiers,
 			   "type qualifiers ignored on function return type");
 		/* We now know that the TYPE_QUALS don't apply to the
 		   decl, but to its return type.  */



Re: [PATCH][cilkplus] fix c++ implicit conversions with cilk_spawn (PR/69024, PR/68997)

2016-02-10 Thread Jeff Law

On 01/20/2016 10:57 AM, Ryan Burn wrote:
> This patch follows on from
> https://gcc.gnu.org/ml/gcc-patches/2015-12/msg02142.html
>
> As discussed, it creates a separate function
> cilk_cp_detect_spawn_and_unwrap in gcc/cp to handle processing
> cilk_spawn expressions for c++ and adds support for implicit
> constructor and type conversions.
>
> Bootstrapped and regression tested on x86_64-linux.

FYI, Just saw your assignment fly by.  I'll try to get a close look at 
this patch shortly.


jeff



Patch to fix PR69148

2016-02-10 Thread Vladimir Makarov

  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69148

  The patch was successfully bootstrapped and tested on x86/x86-64 and 
s390x.


  Committed as rev. 233283

I'll wait for a few days before committing it into gcc-5 branch.


Index: ChangeLog
===
--- ChangeLog	(revision 233282)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2016-02-10  Vladimir Makarov  
+
+	PR target/69148
+	* lra-constraints.c (curr_insn_transform): Find in/out operands
+	for secondary memory moves.  Update dups.
+
 2016-02-10  Yuri Rumyantsev  
 
 	PR tree-optimization/69652
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 233282)
+++ lra-constraints.c	(working copy)
@@ -3559,14 +3559,26 @@ curr_insn_transform (bool check_only_p)
 
   if (use_sec_mem_p)
 {
+  int in = -1, out = -1;
   rtx new_reg, src, dest, rld;
   machine_mode sec_mode, rld_mode;
 
-  lra_assert (sec_mem_p);
-  lra_assert (curr_static_id->operand[0].type == OP_OUT
-		  && curr_static_id->operand[1].type == OP_IN);
-  dest = *curr_id->operand_loc[0];
-  src = *curr_id->operand_loc[1];
+  lra_assert (curr_insn_set != NULL_RTX && sec_mem_p);
+  dest = SET_DEST (curr_insn_set);
+  src = SET_SRC (curr_insn_set);
+  for (i = 0; i < n_operands; i++)
+	if (*curr_id->operand_loc[i] == dest)
+	  out = i;
+	else if (*curr_id->operand_loc[i] == src)
+	  in = i;
+  for (i = 0; i < curr_static_id->n_dups; i++)
+	if (out < 0 && *curr_id->dup_loc[i] == dest)
+	  out = curr_static_id->dup_num[i];
+	else if (in < 0 && *curr_id->dup_loc[i] == src)
+	  in = curr_static_id->dup_num[i];
+  lra_assert (out >= 0 && in >= 0
+		  && curr_static_id->operand[out].type == OP_OUT
+		  && curr_static_id->operand[in].type == OP_IN);
   rld = (GET_MODE_SIZE (GET_MODE (dest)) <= GET_MODE_SIZE (GET_MODE (src))
 	 ? dest : src);
   rld_mode = GET_MODE (rld);
@@ -3599,14 +3611,16 @@ curr_insn_transform (bool check_only_p)
 	}
   else if (dest == rld)
 {
-	  *curr_id->operand_loc[0] = new_reg;
+	  *curr_id->operand_loc[out] = new_reg;
+	  lra_update_dup (curr_id, out);
 	  after = emit_spill_move (false, new_reg, dest);
 	  lra_process_new_insns (curr_insn, NULL, after,
  "Inserting the sec. move");
 	}
   else
 	{
-	  *curr_id->operand_loc[1] = new_reg;
+	  *curr_id->operand_loc[in] = new_reg;
+	  lra_update_dup (curr_id, in);
 	  /* See comments above.  */
 	  push_to_sequence (before);
 	  before = emit_spill_move (true, new_reg, src);
Index: testsuite/ChangeLog
===
--- testsuite/ChangeLog	(revision 233282)
+++ testsuite/ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2016-02-10  Vladimir Makarov  
+
+	PR target/69148
+	* gcc.target/s390/pr69148.c: New.
+
 2016-02-10  Yuri Rumyantsev  
 
 	PR tree-optimization/69652
Index: testsuite/gcc.target/s390/pr69148.c
===
--- testsuite/gcc.target/s390/pr69148.c	(revision 0)
+++ testsuite/gcc.target/s390/pr69148.c	(working copy)
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O -march=z196 -m64 -w" } */
+union U { int r; float f; };
+struct A {
+  int a;
+union U b[64];
+};
+double foo (double);
+
+void
+bar (struct A *z, int x)
+{
+  union U y;
+  y.f = foo (z->b[x].f);
+  z->a = y.r ? 4 : y.r;
+}


Re: [PR69315] enable finish_function to recurse for constexpr functions

2016-02-10 Thread Alexandre Oliva
On Jan 26, 2016, Alexandre Oliva  wrote:

> We don't want finish_function to be called recursively from mark_used.
> However, it's desirable and necessary to call itself recursively when
> performing delayed folding, because that may have to instantiate and
> evaluate constexpr template functions.

> So, arrange for finish_function to accept being called recursively
> during delayed folding, save and restore the controlling variables,
> and process the deferred mark_used calls only when the outermost call
> completes.

> Regstrapped on x86_64-linux-gnu and i686-linux-gnu.  Ok to install?

Ping?

https://gcc.gnu.org/ml/gcc-patches/2016-01/msg02010.html
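
For readers who do not have the patch at hand, the general shape of "allow recursion, but flush deferred work only when the outermost call completes" can be sketched as below. This is only an illustration under stated assumptions: `Finisher`, `defer` and `finish` are hypothetical names, not GCC internals, and a depth counter stands in for the save/restore of controlling variables that the patch description mentions.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Hypothetical state; none of these names correspond to GCC internals.
struct Finisher {
  int depth = 0;                                // current nesting level
  std::vector<std::function<void()>> deferred;  // work postponed until the end
  std::vector<int> log;                         // records the order work ran in

  // Queue work that must not run while a finish() call is in progress.
  void defer(std::function<void()> fn) { deferred.push_back(std::move(fn)); }

  // BODY may call finish() again.  Deferred work is flushed only when the
  // outermost call completes.
  void finish(const std::function<void()> &body) {
    ++depth;
    body();
    --depth;
    if (depth != 0)
      return;  // still nested: leave the queue for the outermost call
    // Flushing may queue more work, so loop until the queue is empty.
    while (!deferred.empty()) {
      std::vector<std::function<void()>> batch;
      batch.swap(deferred);
      for (auto &fn : batch)
        fn();
    }
  }
};
```

A nested `finish` call made from inside `body` returns without running anything from the queue; the queued callbacks run in order once the outer call unwinds. The sketch is not exception-safe, which a real implementation would need to consider.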

-- 
Alexandre Oliva, freedom fighter    http://FSFLA.org/~lxoliva/
You must be the change you wish to see in the world. -- Gandhi
Be Free! -- http://FSFLA.org/   FSF Latin America board member
Free Software Evangelist|Red Hat Brasil GNU Toolchain Engineer


Re: [PATCH] Fix PR c++/69098 (bogus errors with static data member template)

2016-02-10 Thread Jason Merrill

On 02/10/2016 02:16 PM, Patrick Palka wrote:

> ...in tsubst_qualified_id() I just copied the way variable templates are
> handled in tsubst_copy_and_build [TEMPLATE_ID_EXPR].


Let's factor that code out into another function rather than copy it.

Jason



Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 06:37 PM, Thomas Schwinge wrote:

> On Wed, 10 Feb 2016 17:37:30 +0100, Bernd Schmidt wrote:
>> IIUC it's also disabling offloading for parallels rather than just
>> kernels, which we previously said shouldn't happen.
>
> Ah, you're talking about mixed OpenACC parallel/kernels codes -- I
> understood the earlier discussion to apply to parallel-only codes, where
> the "avoid offloading" flag will never be set.  In mixed parallel/kernels
> code with one un-parallelized kernels construct, offloading would also
> (have to be) disabled for the parallel constructs (for the same data
> consistency reasons explained before).  The majority of codes I've seen
> use either parallel or kernels constructs, typically not both.

That's not something I'd want to hard-code into the compiler however.
Don't know how Jakub feels but to me this approach is way too
coarse-grained.

> Huh?  Like, at random, discouraging users from using GCC's SIMD
> vectorizer just because that one fails to vectorize some code that it
> could/should vectorize?  (Of course, I'm well aware that GCC's SIMD
> vectorizer is much more mature than the OpenACC kernels/parloops
> handling; it's seen many more years of development.)

Your description sounded like it's not merely failing to optimize, but
actively hurting performance for a large selection of real world codes.
If I understood that correctly, we need to document this in the manual.



Bernd


Fix incomplete initialization of declspecs

2016-02-10 Thread Bernd Schmidt
I've noticed that build_null_declspecs fails to clear out all location 
information since it's missing a multiplication with the type size. The 
simplest fix seems to be to just clear the entire structure.
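
For illustration only (a minimal sketch, not the code from c-decl.c): memset takes a byte count, so passing an element count clears only part of an array whose elements are wider than one byte. The names `Specs`, `kNumLocations`, `clear_buggy` and `clear_fixed` are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a struct holding an array of multi-byte entries.
constexpr std::size_t kNumLocations = 8;

struct Specs {
  unsigned int locations[kNumLocations];
  int align_log;
};

// Buggy: the third argument is a byte count, but an element count is passed,
// so only the first kNumLocations bytes of the array are cleared.
inline void clear_buggy(Specs *s) {
  std::memset(s->locations, 0, kNumLocations);
}

// Fixed: clear the whole object, then set the fields whose default is nonzero.
inline void clear_fixed(Specs *s) {
  std::memset(s, 0, sizeof *s);
  s->align_log = -1;
}

// Fill the array with a recognizable nonzero pattern.
inline void poison(Specs *s) {
  for (unsigned int &v : s->locations)
    v = 0xffffffffu;
  s->align_log = 0;
}

// Count how many entries of the array are still nonzero.
inline std::size_t nonzero_locations(const Specs &s) {
  std::size_t n = 0;
  for (unsigned int v : s.locations)
    if (v != 0)
      ++n;
  return n;
}
```

On a platform where unsigned int is four bytes, `clear_buggy` zeroes only the first two of the eight entries, while `clear_fixed` zeroes all of them. Clearing the whole object also removes the need to list every zero-initialized field by hand.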


Bootstrapping & testing now on x86_64-linux, ok if that succeeds? 
Depending on what we do for 69733, this might be a prerequisite patch.



Bernd
	* c-decl.c (build_null_declspecs): Zero the entire struct.

Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c	(revision 233217)
+++ gcc/c/c-decl.c	(working copy)
@@ -9460,38 +9486,12 @@ struct c_declspecs *
 build_null_declspecs (void)
 {
   struct c_declspecs *ret = XOBNEW (_obstack, struct c_declspecs);
-  memset (>locations, 0, cdw_number_of_elements);
-  ret->type = 0;
-  ret->expr = 0;
-  ret->decl_attr = 0;
-  ret->attrs = 0;
+  memset (ret, 0, sizeof *ret);
   ret->align_log = -1;
   ret->typespec_word = cts_none;
   ret->storage_class = csc_none;
   ret->expr_const_operands = true;
-  ret->declspecs_seen_p = false;
   ret->typespec_kind = ctsk_none;
-  ret->non_sc_seen_p = false;
-  ret->typedef_p = false;
-  ret->explicit_signed_p = false;
-  ret->deprecated_p = false;
-  ret->default_int_p = false;
-  ret->long_p = false;
-  ret->long_long_p = false;
-  ret->short_p = false;
-  ret->signed_p = false;
-  ret->unsigned_p = false;
-  ret->complex_p = false;
-  ret->inline_p = false;
-  ret->noreturn_p = false;
-  ret->thread_p = false;
-  ret->thread_gnu_p = false;
-  ret->const_p = false;
-  ret->volatile_p = false;
-  ret->atomic_p = false;
-  ret->restrict_p = false;
-  ret->saturating_p = false;
-  ret->alignas_p = false;
   ret->address_space = ADDR_SPACE_GENERIC;
   return ret;
 }



[PATCH] PR driver/69265 and 69453: improved suggestions for various misspelled options

2016-02-10 Thread David Malcolm
On Wed, 2016-02-10 at 17:25 +0100, Bernd Schmidt wrote:
> On 02/09/2016 09:44 PM, David Malcolm wrote:
> > This is a bug in a new feature, so it isn't a regression as such,
> > but
> > it's fairly visible, and I believe the fix is relatively low-risk
> > (error-handling of typos of command-line options).
> >
> > This also now covers PR driver/69453 (and its duplicate PR
> > driver/69642), so people *are* running into this.
>
> I think the patch looks reasonable (I expect it needs slight
> adjustment
> after an earlier sanitizer options change).

It did (r232826 was the change in question).

> Whether it's OK or not at
> this stage is something I think I'll want to ask a RM. My inclination
> would be yes.
>
> A small improvement might be calculating the candidates array only
> once
> when making the first suggestion and not freeing it.

Done.

> BTW, I've also run
> into a case of an unhelpful suggestion:
>
> ./cc1 ~/hw.c -fno-if-convert
> cc1: error: unrecognized command line option ‘-fno-if-convert’
>
> which should instead suggest fno-if-conversion.

It actually handled that, with the patch.  I've added a test case
for that (and for the other cases mentioned in PR driver/69453 and its
dup).

Changes in this version:
* rebased to today's trunk (in particular, r232826 reworked the sanitizer
  options)
* added testcases for PR driver/69453 "-Wno-", including the one cited by
  Bernd.
* only build the list of candidates once (the first time there's an unrecognized
  option), storing it in a new field of class driver
* add hint for options taking arguments, adding the various enum values
  to the candidate strings, so e.g.:
-tls-model=global-dynamic
  can be corrected to
-ftls-model=global-dynamic
  whereas previously no suggestion was offered

Successfully bootstrapped on x86_64-pc-linux-gnu;
adds 16 PASS results to gcc.sum.

Is this OK for trunk in stage 4? (it's not a regression, but as noted
before it's somewhat user-visible and relatively low-risk, I believe).

Dave

Blurb from original patch follows:
As of r230285 (b279775faf3c56b554ecd38159b70ea7f2d37e0b; PR driver/67613)
the driver provides suggestions for misspelled options.

This works well for some options e.g.

 $ gcc -static-libfortran test.f95
 gcc: error: unrecognized command line option '-static-libfortran';
 did you mean '-static-libgfortran'?

but as reported in PR driver/69265 it can generate poor suggestions:

 $ c++ -sanitize=address foo.cc
 c++: error: unrecognized command line option ‘-sanitize=address’;
 did you mean ‘-Wframe-address’?

The root cause is that the current implementation only considers
cl_options[].opt_text, and has no knowledge of the arguments to
-fsanitize (and hence doesn't consider the "address" text when
computing edit distances).

It also fails to consider the alternate ways of spelling options
e.g. "-Wno-" vs "-W".

The following patch addresses these issues by building a vec of
candidates from cl_options[].opt_text, rather than just using
the latter.
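
As a rough illustration of why expanding the candidate list helps (a sketch under assumptions, not the code in spellcheck.c or gcc.c): once each option/argument combination is a candidate in its own right, a plain edit-distance search can find it. The helper names `edit_distance` and `closest_candidate` are hypothetical, and the distance used here is ordinary Levenshtein with no cutoff heuristics.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Plain Levenshtein distance: insert, delete and substitute each cost 1.
inline std::size_t edit_distance(const std::string &a, const std::string &b) {
  std::vector<std::size_t> prev(b.size() + 1), cur(b.size() + 1);
  for (std::size_t j = 0; j <= b.size(); ++j)
    prev[j] = j;
  for (std::size_t i = 1; i <= a.size(); ++i) {
    cur[0] = i;
    for (std::size_t j = 1; j <= b.size(); ++j) {
      std::size_t subst = prev[j - 1] + (a[i - 1] == b[j - 1] ? 0 : 1);
      cur[j] = std::min({prev[j] + 1, cur[j - 1] + 1, subst});
    }
    std::swap(prev, cur);
  }
  return prev[b.size()];
}

// Return the candidate nearest to TARGET, or an empty string if CANDIDATES
// is empty.  Ties go to the earliest candidate.
inline std::string closest_candidate(const std::string &target,
                                     const std::vector<std::string> &candidates) {
  std::string best;
  std::size_t best_dist = 0;
  bool have_best = false;
  for (const std::string &c : candidates) {
    std::size_t d = edit_distance(target, c);
    if (!have_best || d < best_dist) {
      best = c;
      best_dist = d;
      have_best = true;
    }
  }
  return best;
}
```

With "-fsanitize=address" present as a candidate, the misspelling "-sanitize=address" is one insertion away from it, so it wins over unrelated options such as "-Wframe-address".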

gcc/ChangeLog:
PR driver/69265
PR driver/69453
* gcc.c (driver::driver): Initialize m_option_suggestions.
(driver::~driver): Clean up m_option_suggestions.
(suggest_option): Convert to...
(driver::suggest_option): ...this, and split out into
driver::build_option_suggestions and find_closest_string.
(driver::build_option_suggestions): New function, from
first half of suggest_option.  Special-case
OPT_fsanitize_ and OPT_fsanitize_recover_, making use of
the sanitizer_opts array.  For options of enum types, add the
various enum values to the candidate strings.
(driver::handle_unrecognized_options): Remove "const".
* gcc.h (driver::handle_unrecognized_options): Likewise.
(driver::build_option_suggestions): New decl.
(driver::suggest_option): New decl.
(driver::m_option_suggestions): New field.
* opts-common.c (add_misspelling_candidates): New function.
* opts.c (sanitizer_opts): Remove decl of struct sanitizer_opts_s
and make non-static.
* opts.h (sanitizer_opts): New array decl.
(add_misspelling_candidates): New function decl.
* spellcheck.c (find_closest_string): New function.
* spellcheck.h (find_closest_string): New function decl.

gcc/testsuite/ChangeLog:
PR driver/69265
PR driver/69453
* gcc.dg/spellcheck-options-3.c: New test case.
* gcc.dg/spellcheck-options-4.c: New test case.
* gcc.dg/spellcheck-options-5.c: New test case.
* gcc.dg/spellcheck-options-6.c: New test case.
* gcc.dg/spellcheck-options-7.c: New test case.
* gcc.dg/spellcheck-options-8.c: New test case.
* gcc.dg/spellcheck-options-9.c: New test case.
* gcc.dg/spellcheck-options-10.c: New test case.
---
 gcc/gcc.c| 112 ---
 gcc/gcc.h 

Re: Fix incomplete initialization of declspecs

2016-02-10 Thread Joseph Myers
On Wed, 10 Feb 2016, Bernd Schmidt wrote:

> I've noticed that build_null_declspecs fails to clear out all location
> information since it's missing a multiplication with the type size. The
> simplest fix seems to be to just clear the entire structure.
> 
> Bootstrapping & testing now on x86_64-linux, ok if that succeeds? Depending on
> what we do for 69733, this might be a prerequisite patch.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH] Fix PR c++/69098 (bogus errors with static data member template)

2016-02-10 Thread Patrick Palka

On Wed, 10 Feb 2016, Jason Merrill wrote:


> On 02/10/2016 02:16 PM, Patrick Palka wrote:
>> ...in tsubst_qualified_id() I just copied the way variable templates are
>> handled in tsubst_copy_and_build [TEMPLATE_ID_EXPR].
>
> Let's factor that code out into another function rather than copy it.


Done.  Does this look OK?

-- >8 --

gcc/cp/ChangeLog:

PR c++/69098
* pt.c (lookup_and_finish_template_variable): New function,
extracted from ...
(tsubst_copy_and_build) [TEMPLATE_ID_EXPR]: ... here.
(tsubst_qualified_id): Consider that EXPR might be a variable
template.
* typeck.c (check_template_keyword): Don't emit an error
if DECL is a variable template.

gcc/testsuite/ChangeLog:

PR c++/69098
* g++.dg/cpp1y/69098.C: New test.
* g++.dg/cpp1y/69098-2.C: New test.
---
 gcc/cp/pt.c  | 36 +-
 gcc/cp/typeck.c  | 10 -
 gcc/testsuite/g++.dg/cpp1y/69098-2.C | 37 +++
 gcc/testsuite/g++.dg/cpp1y/69098.C   | 43 
 4 files changed, 115 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/69098-2.C
 create mode 100644 gcc/testsuite/g++.dg/cpp1y/69098.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index eea3834..6776c74 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -8698,6 +8698,24 @@ finish_template_variable (tree var, tsubst_flags_t complain)

   return instantiate_template (templ, arglist, complain);
 }
+
+/* Construct a TEMPLATE_ID_EXPR for the given variable template TEMPL having
+   TARGS template args, and instantiate it if it's not dependent.  */
+
+static tree
+lookup_and_finish_template_variable (tree templ, tree targs,
+tsubst_flags_t complain)
+{
+  templ = lookup_template_variable (templ, targs);
+  if (!any_dependent_template_arguments_p (targs))
+{
+  templ = finish_template_variable (templ, complain);
+  mark_used (templ);
+}
+
+  return convert_from_reference (templ);
+}
+

 struct pair_fn_data
 {
@@ -13726,7 +13744,13 @@ tsubst_qualified_id (tree qualified_id, tree args,
 }

   if (is_template)
-expr = lookup_template_function (expr, template_args);
+{
+  if (variable_template_p (expr))
+   expr = lookup_and_finish_template_variable (expr, template_args,
+   complain);
+  else
+   expr = lookup_template_function (expr, template_args);
+}

   if (expr == error_mark_node && complain & tf_error)
 qualified_name_lookup_error (scope, TREE_OPERAND (qualified_id, 1),
@@ -15906,15 +15930,7 @@ tsubst_copy_and_build (tree t,
  return error_mark_node;

if (variable_template_p (templ))
- {
-   templ = lookup_template_variable (templ, targs);
-   if (!any_dependent_template_arguments_p (targs))
- {
-   templ = finish_template_variable (templ, complain);
-   mark_used (templ);
- }
-   RETURN (convert_from_reference (templ));
- }
+ RETURN (lookup_and_finish_template_variable (templ, targs, complain));

if (TREE_CODE (templ) == COMPONENT_REF)
  {
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index c9fa112..fb2a2c4 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -2601,7 +2601,15 @@ check_template_keyword (tree decl)
   if (TREE_CODE (decl) != TEMPLATE_DECL
   && TREE_CODE (decl) != TEMPLATE_ID_EXPR)
 {
-  if (!is_overloaded_fn (decl))
+  if (VAR_P (decl))
+   {
+ if (DECL_USE_TEMPLATE (decl)
+ && PRIMARY_TEMPLATE_P (DECL_TI_TEMPLATE (decl)))
+   ;
+ else
+   permerror (input_location, "%qD is not a template", decl);
+   }
+  else if (!is_overloaded_fn (decl))
permerror (input_location, "%qD is not a template", decl);
   else
{
diff --git a/gcc/testsuite/g++.dg/cpp1y/69098-2.C b/gcc/testsuite/g++.dg/cpp1y/69098-2.C
new file mode 100644
index 000..2e968bb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/69098-2.C
@@ -0,0 +1,37 @@
+// PR c++/69098
+// { dg-do compile { target c++14 } }
+
+struct A
+{
+  template 
+  static void *pf;
+};
+
+template 
+bool foo1 () {
+  return A::pf;
+}
+
+template 
+bool foo2 () {
+  return B::template pf;
+}
+
+template 
+bool foo3 () {
+  return ::pf;
+}
+
+template 
+bool foo4 () {
+  return ::template pf;
+}
+
+
+void bar () {
+  foo1();
+  foo2();
+  foo3();
+  foo4();
+}
+
diff --git a/gcc/testsuite/g++.dg/cpp1y/69098.C b/gcc/testsuite/g++.dg/cpp1y/69098.C
new file mode 100644
index 000..afc4294
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/69098.C
@@ -0,0 +1,43 @@
+// PR c++/69098
+// { dg-do compile { target c++14 } }
+
+template struct SpecPerType;
+
+class Specializer
+{
+public:
+template void MbrFnTempl() //Must be a template
+   {
+   }
+   template struct 

[PATCH], PR 68404 patch #2 (disable power8/power9 fusion on PowerPC)

2016-02-10 Thread Michael Meissner
This patch disables -mcpu=power8/-mtune=power8 from setting -mpower8-fusion and
-mcpu=power9/-mtune=power9 from setting -mpower9-fusion.  I will look at the
earlyclobber that Bernd Schmidt mentioned, but for now it may be safest to just
disable it for GCC 6.0.

I built it on a little endian power8 system, and there were no regressions.  Is
it ok to install?

[gcc]
2016-02-10  Michael Meissner  

PR target/68404
* config/rs6000/predicates.md (fusion_gpr_addis): Revert
2016-02-09 change.

* config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Do not set
power8/power9 fusion by default.
(ISA_3_0_MASKS_SERVER): Likewise.

* config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
code setting -mpower8-fusion if -mtune=power8 and -mpower9-fusion
if -mtune=power9.

* doc/invoke.texi (RS/6000 and PowerPC Options): Document that
-mpower8-fusion and -mpower9-fusion are not set by default.

[gcc/testsuites]
2016-02-10  Michael Meissner  

PR target/68404
* gcc.target/powerpc/fusion.c: Do not assume that -mtune=power8
sets -mpower8-fusion or -mtune=power9 sets -mpower9-fusion.
* gcc.target/powerpc/fusion2.c: Likewise.
* gcc.target/powerpc/fusion3.c: Likewise.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797



Re: [PATCH], PR 68404 patch #2 (disable power8/power9 fusion on PowerPC)

2016-02-10 Thread Jakub Jelinek
On Wed, Feb 10, 2016 at 05:42:17PM -0500, Michael Meissner wrote:
> This patch disables -mcpu=power8/-mtune=power8 from setting -mpower8-fusion 
> and
> -mcpu=power9/-mtune=power9 from setting -mpower9-fusion.  I will look at the
> earlyclobber that Bernd Schmidt mentioned, but for now it may be safest to 
> just
> disable it for GCC 6.0.
> 
> I built it on a little endian power8 system, and there were no regressions.  
> Is
> it ok to install?

Doesn't this mean the bug is still there, just not enabled unless
-mpower[89]-fusion (ok, perhaps mitigated by the previous workaround patch)?
Wouldn't it be better to just forcefully clear the options (and thus ignore
them) for the time being if they are known to be broken?

> [gcc]
> 2016-02-10  Michael Meissner  
> 
>   PR target/68404
>   * config/rs6000/predicates.md (fusion_gpr_addis): Revert
>   2016-02-09 change.
> 
>   * config/rs6000/rs6000-cpus.def (ISA_2_7_MASKS_SERVER): Do not set
>   power8/power9 fusion by default.
>   (ISA_3_0_MASKS_SERVER): Likewise.
> 
>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Remove
>   code setting -mpower8-fusion if -mtune=power8 and -mpower9-fusion
>   if -mtune=power9.
> 
>   * doc/invoke.texi (RS/6000 and PowerPC Options): Document that
>   -mpower8-fusion and -mpower9-fusion are not set by default.
> 
> [gcc/testsuites]
> 2016-02-10  Michael Meissner  
> 
>   PR target/68404
>   * gcc.target/powerpc/fusion.c: Do not assume that -mtune=power8
>   sets -mpower8-fusion or -mtune=power9 sets -mpower9-fusion.
>   * gcc.target/powerpc/fusion2.c: Likewise.
>   * gcc.target/powerpc/fusion3.c: Likewise.
> 
> -- 
> Michael Meissner, IBM
> IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
> email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797

Jakub


Re: [PATCH] Fix PR c++/69098 (bogus errors with static data member template)

2016-02-10 Thread Jason Merrill

OK.

Jason


[PATCH] Fix unnecessary -Wmaybe-uninitialized false positive (PR target/65313)

2016-02-10 Thread Jakub Jelinek
Hi!

During profiledbootstrap on ppc64 I've noticed a -Wmaybe-uninitialized
warning in vect_schedule_slp_instance, when built with -fprofile-generate.
While it is clearly a false positive, IMHO it is completely unnecessary
to use here two variables, one uninitialized, another bool whether
it is initialized.  In valid code gimple_assign_rhs_code should not
return ERROR_MARK, so we can use ocode == ERROR_MARK for the allsame
case and ocode != ERROR_MARK for !allsame.
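
The idea of letting an impossible value double as the "not set" state can be sketched in isolation as follows. This is a simplified stand-in, not the vectorizer code: the `Code` enum and `other_code` are hypothetical, with `kErrorMark` playing the role that ERROR_MARK plays in the patch.

```cpp
#include <vector>

// Hypothetical operation codes; kErrorMark is a value that never occurs in
// valid input, so it can also mean "no differing code was seen".
enum Code { kErrorMark, kPlus, kMinus, kMult };

// Return the last code that differs from the first element's code, or
// kErrorMark if every element has the same code (or the input is empty).
// No separate "all same" flag is needed, and the result is always
// initialized, so the compiler has nothing to warn about.
inline Code other_code(const std::vector<Code> &codes) {
  if (codes.empty())
    return kErrorMark;
  Code code0 = codes[0];
  Code ocode = kErrorMark;
  for (Code c : codes)
    if (c != code0)
      ocode = c;
  return ocode;
}
```

A caller then tests `other_code (...) != kErrorMark` where it previously would have tested a separate boolean.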

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-10  Jakub Jelinek  

PR target/65313
* tree-vect-slp.c (vect_schedule_slp_instance): Avoid
-Wmaybe-uninitialized warning.

--- gcc/tree-vect-slp.c.jj  2016-01-21 13:54:19.0 +0100
+++ gcc/tree-vect-slp.c 2016-02-09 13:40:30.280769470 +0100
@@ -3568,20 +3568,18 @@ vect_schedule_slp_instance (slp_tree nod
   if (SLP_TREE_TWO_OPERATORS (node))
 {
   enum tree_code code0 = gimple_assign_rhs_code (stmt);
-  enum tree_code ocode;
+  enum tree_code ocode = ERROR_MARK;
   gimple *ostmt;
   unsigned char *mask = XALLOCAVEC (unsigned char, group_size);
-  bool allsame = true;
   FOR_EACH_VEC_ELT (SLP_TREE_SCALAR_STMTS (node), i, ostmt)
if (gimple_assign_rhs_code (ostmt) != code0)
  {
mask[i] = 1;
-   allsame = false;
ocode = gimple_assign_rhs_code (ostmt);
  }
else
  mask[i] = 0;
-  if (!allsame)
+  if (ocode != ERROR_MARK)
{
  vec v0;
  vec v1;

Jakub


[PATCH] Fix another ipa-split caused ICE (PR ipa/69241)

2016-02-10 Thread Jakub Jelinek
Hi!

Markus has pointed me to a reduced testcase which still ICEs even with the
PR69241 fix.  In that case the function with TREE_ADDRESSABLE return type
does not return at all (and -Wreturn-type properly diagnoses it).
For that case the following patch just forces the lhs on the *.part.*
call, so that we don't ICE in assign_temp.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2016-02-10  Jakub Jelinek  

PR ipa/69241
* ipa-split.c (split_function): If split part returns TREE_ADDRESSABLE
type by reference, force lhs on the call.

* g++.dg/ipa/pr69241-4.C: New test.

--- gcc/ipa-split.c.jj  2016-02-10 16:05:37.0 +0100
+++ gcc/ipa-split.c 2016-02-10 17:18:12.553061670 +0100
@@ -1589,7 +1589,20 @@ split_function (basic_block return_bb, s
}
}
  else
-   gsi_insert_after (, call, GSI_NEW_STMT);
+   {
+ /* Force a lhs if the split part has to return a value.  */
+ if (split_point->split_part_set_retval
+ && DECL_BY_REFERENCE (DECL_RESULT (current_function_decl)))
+   {
+ retval = DECL_RESULT (current_function_decl);
+ if (TREE_ADDRESSABLE (TREE_TYPE (TREE_TYPE (retval))))
+   {
+ retval = get_or_create_ssa_default_def (cfun, retval);
+ gimple_call_set_lhs (call, build_simple_mem_ref (retval));
+   }
+   }
+ gsi_insert_after (, call, GSI_NEW_STMT);
+   }
  if (tsan_func_exit_call)
gsi_insert_after (, tsan_func_exit_call, GSI_NEW_STMT);
}
--- gcc/testsuite/g++.dg/ipa/pr69241-4.C.jj 2016-02-10 17:22:03.977866326 +0100
+++ gcc/testsuite/g++.dg/ipa/pr69241-4.C    2016-02-10 17:22:00.073920229 +0100
@@ -0,0 +1,55 @@
+// PR ipa/69241
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2 -Wno-return-type" }
+
+template  class A;
+struct B {
+  using pointer = int *;
+};
+template > class basic_string {
+  long _M_string_length;
+  enum { _S_local_capacity = 15 } _M_local_buf[_S_local_capacity];
+  B::pointer _M_local_data;
+
+public:
+  ~basic_string();
+};
+template 
+int operator<<(_Traits, basic_string<_CharT, _Alloc>);
+class C {
+  basic_string _M_string;
+};
+class D {
+  C _M_stringbuf;
+};
+class F {
+  int stream;
+  D stream_;
+};
+class G {
+public:
+  void operator&(int);
+};
+class H {
+public:
+  H(unsigned);
+  H(H &&);
+  bool m_fn1();
+};
+class I {
+  void m_fn2(const int &&);
+  static H m_fn3(const int &);
+};
+template  void Bind(Functor);
+class J {
+public:
+  static basic_string m_fn4();
+};
+int a;
+void I::m_fn2(const int &&) { Bind(m_fn3); }
+H I::m_fn3(const int &) {
+  !false ? (void)0 : G() & F() << J::m_fn4();
+  H b(a);
+  if (b.m_fn1())
+F();
+}

Jakub


[C++ Patch] PR 68726 ("ice: tree check: expected tree_vec, have error_mark in comp_template_args_with_info, at cp/pt.c:7890")

2016-02-10 Thread Paolo Carlini

Hi,

it turns out that this small ICE on invalid is actually a regression: I can
only reproduce it on trunk. The ICE occurs immediately when,
during error recovery, comp_template_args gets an error_mark_node as its
second argument. Tested x86_64-linux.


Thanks,
Paolo.

P.S. Second try: the first time my message was rejected as spam!?!

///


/cp
2016-02-10  Paolo Carlini  

PR c++/68726
* pt.c (lookup_template_class_1): Check tsubst return value for
error_mark_node.

/testsuite
2016-02-10  Paolo Carlini  

PR c++/68726
* g++.dg/cpp0x/pr68726.C: New.
Index: cp/pt.c
===
--- cp/pt.c (revision 233308)
+++ cp/pt.c (working copy)
@@ -8547,6 +8547,8 @@ lookup_template_class_1 (tree d1, tree arglist, tr
arglist, complain, NULL_TREE);
  --processing_template_decl;
  TREE_VEC_LENGTH (arglist)++;
+ if (partial_inst_args == error_mark_node)
+   return error_mark_node;
  use_partial_inst_tmpl =
/*...and we must not be looking at the partial instantiation
 itself. */
Index: testsuite/g++.dg/cpp0x/pr68726.C
===
--- testsuite/g++.dg/cpp0x/pr68726.C(revision 0)
+++ testsuite/g++.dg/cpp0x/pr68726.C(working copy)
@@ -0,0 +1,7 @@
+// { dg-do compile { target c++11 } }
+
+template  struct A {
+  template  struct __construct_helper;  // { dg-error "expected" }
+  template 
+  using __has_construct typename __construct_helper<_Args...>::type;  // { dg-error "expected" }
+} struct : A {  // { dg-error "expected" }


Re: C++ PATCH for c++/10200 (DR 141, template name lookup after ->)

2016-02-10 Thread Markus Trippelsdorf
On 2016.02.10 at 10:30 -0500, Jason Merrill wrote:
> After . or ->, when we see a name followed by < we look for a template name,
> first in the scope of the object expression and then in the enclosing scope.
> DR 141 clarified that when we look in the enclosing scope, we only consider
> class templates, since there's no way a non-member function template could
> be correct in that situation.
> 
> When I fixed that, I found that we were failing to do the lookup in the
> object scope in the case where that scope is the current instantiation, so I
> needed to fix that as well.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

This commit causes: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69753

-- 
Markus


AW: AW: Wonly-top-basic-asm

2016-02-10 Thread Bernd Edlinger
On 11.2.2016, David Wohlferd wrote:
>
> Since no one expressed any objections, I have renamed the option from
> -Wonly-top-basic-asm to -Wbasic-asm-in-function.  This more clearly
> conveys what the option does (give a warning if you find basic asm in a
> function).
> 

why not simply -Wbasic-asm ?

> I believe the attached patch addresses all the other outstanding comments.
> 

Indentation wrong here. The whole block must be indented by 2 spaces.

>   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN) && !is_goto)
>+  {
>+/* Warn on basic asm used inside of functions,
>+   EXCEPT when in naked functions.  Also allow asm (""). */

Comments should end with dot space space */

>+if (warn_basic_asm_in_function && TREE_STRING_LENGTH (str) != 1)
>+  if (lookup_attribute ("naked",
>+  DECL_ATTRIBUTES (current_function_decl))

the DECL_ATTRIBUTES should be at the same column as the "naked".

>+== NULL_TREE)
>+  warning_at (asm_loc, OPT_Wbasic_asm_in_function,
>+  "asm statement in function does not use extended syntax");
>+
> goto done_asm;
>+  }

>@@ -18199,6 +18201,17 @@
> /* If the extended syntax was not used, mark the ASM_EXPR.  */
> if (!extended_p)
>   {
>+/* Warn on basic asm used inside of functions,
>+   EXCEPT when in naked functions.  Also allow asm (""). */

Comments should end with dot space space */

>+if (warn_basic_asm_in_function
>+&& TREE_STRING_LENGTH (string) != 1)
>+  if (lookup_attribute ("naked",
>+   DECL_ATTRIBUTES (current_function_decl))

the DECL_ATTRIBUTES should be at the same column as the "naked".

> ChangeLog:
> 2016-02-10  David Wohlferd  
> 
> * doc/extend.texi: Doc basic asm behavior and new
> -Wbasic-asm-in-function option.
>  * doc/invoke.texi: Doc new -Wbasic-asm-in-function option.
>  * c-family/c.opt: Define -Wbasic-asm-in-function.
>  * c/c-parser.c: Implement -Wbasic-asm-in-function for C.
>  * cp/parser.c: Implement -Wbasic-asm-in-function for c++.

C++, isn't it always upper case?

>  * testsuite/c-c++-common/Wbasic-asm-in-function.c: New tests for
>  -Wbasic-asm-in-function.
>  * testsuite/c-c++-common/Wbasic-asm-in-function-2.c: Ditto.
> 

ChangeLog lines begin with TAB.

Please split the ChangeLog, there are separate ChangeLogs
at gcc/ChangeLog (doc changes go in there)
gcc/c/ChangeLog, gcc/cp/ChangeLog, gcc/c-family/ChangeLog
and gcc/testsuite/ChangeLog, the respective ChangeLog entries
use relative file names.

Please add the function name where you changed in brackets.

For instance:
* parser.c (cp_parser_asm_definition): Implement -Wbasic-asm-in-function.


Thanks
Bernd.

> While I have a release on file with FSF, I don't have write access to SVN.
> 
> dw


Re: [PATCH PR69052]Check if loop inv can be propagated into mem ref with additional addr expr canonicalization

2016-02-10 Thread Jeff Law

On 02/09/2016 04:08 AM, Bin Cheng wrote:

Hi,
When counting cost for loop inv, GCC checks if a loop inv can be propagated 
into its use site (a memory reference).  If it cannot be propagated, we 
increase its cost so that it's expensive enough to be hoisted out of loop.  
Currently we simply replace loop inv register in the use site with its 
definition expression, then call validate_changes to check if the result insn 
is valid.  This is weak because validate_changes doesn't take canonicalization 
into consideration.  Given below example:

   Loop inv def:
69: r149:SI=r87:SI+const(unspec[`'] 1)
   REG_DEAD r87:SI
   Loop inv use:
70: r150:SI=[r90:SI*0x4+r149:SI]
   REG_DEAD r149:SI

The address expression after propagation is "r90 * 0x4 + (r87 + const(unspec[`']))".  
Function validate_changes simply returns false to it.  As a matter of fact, the propagation is 
feasible if we canonicalize address expression into the form like "(r90 * 0x4 + r87) + 
const(unspec[`'])".

This patch fixes the problem by canonicalizing address expression and verifying 
if the new addr is valid.  The canonicalization follows GCC insn 
canonicalization rules.  The test case from bugzilla PR is also included.
As for the canonicalize_address interface, there is another 
canonicalize_address in fwprop.c which only changes shift into mult.  I think 
it would be good to factor out a common RTL interface in GCC, but that's stage1 
work.


Also note there's bits in combine that will canonicalize appropriate 
shifts into mults.  Clearly there's a need for some generalized routines 
to take a fairly generic address and perform canonicalizations and 
simplifications on it.



Bootstrap and test on x86_64 and AArch64.  Is it OK?

Thanks,
bin

2016-02-09  Bin Cheng

PR tree-optimization/69052
* loop-invariant.c (canonicalize_address): New function.
(inv_can_prop_to_addr_use): Check validity of address expression
which is canonicalized by above function.

gcc/testsuite/ChangeLog
2016-02-09  Bin Cheng

PR tree-optimization/69052
* gcc.target/i386/pr69052.c: New test.


pr69052-20160204.txt


diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index 707f044..157e273 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -754,6 +754,74 @@ create_new_invariant (struct def *def, rtx_insn *insn, bitmap depends_on,
return inv;
  }

+/* Returns a canonical version address for X.  It identifies
+   addr expr in the form of A + B + C.  Following instruction
+   canonicalization rules, MULT operand is moved to the front,
+   CONST operand is moved to the end; also PLUS operators are
+   chained to the left.  */
+
+static rtx
+canonicalize_address (rtx x)
+{
+  rtx op0, op1, op2;
+  machine_mode mode = GET_MODE (x);
+  enum rtx_code code = GET_CODE (x);
+
+  if (code != PLUS)
+return x;
+
+  /* Extract operands from A + B (+ C).  */
+  if (GET_CODE (XEXP (x, 0)) == PLUS)
+{
+  op0 = XEXP (XEXP (x, 0), 0);
+  op1 = XEXP (XEXP (x, 0), 1);
+  op2 = XEXP (x, 1);
+}
+  else if (GET_CODE (XEXP (x, 1)) == PLUS)
+{
+  op0 = XEXP (x, 0);
+  op1 = XEXP (XEXP (x, 1), 0);
+  op2 = XEXP (XEXP (x, 1), 1);
+}
+  else
+{
+  op0 = XEXP (x, 0);
+  op1 = XEXP (x, 1);
+  op2 = NULL_RTX;
+}
+
+  /* Move MULT operand to the front.  */
+  if (!REG_P (op1) && !CONST_INT_P (op1))
+std::swap (op0, op1);

This feels a bit hack-ish in the sense that you already know the form of 
the RTL you're expecting and just assume that you'll be given something 
of that form, but no more complex.


ISTM you're better off walking the whole rtx, recording the tidbits as 
you go into a vec.  If you see something unexpected during that walk, 
you punt canonicalization of the whole expression.


You then sort the vec.  You want to move things like MULT to the start 
and all the constants to the end I think.


You then do simplifications, particularly on the constants, but there 
may be something useful to do with MULT terms as well.  You could also 
arrange to rewrite ASHIFTs into MULTs at this stage.


Then you generate a new equivalent expression from the simplified 
operands in the vec.


You might look at tree-ssa-reassoc for ideas on implementation details.

Initially just use it in the LICM code, but I think given that kind of 
structure it'd be generally useful to replace bits of combine and fwprop.


If your contention is that only a few forms really matter, then I'd like 
to see those forms spelled out better in the comment and some kind of 
checking that we have reasonable incoming RTL.
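
For illustration only, the walk / sort / simplify / rebuild sequence 
described above could be sketched roughly as follows.  This is a 
standalone toy model, not GCC code: Kind, Term, rank and 
canonicalize_sum are hypothetical names, and a real implementation 
would operate on rtx operands rather than on this simplified struct.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <vector>

// Toy stand-in for an RTL operand; this is NOT GCC's rtx (assumption).
enum class Kind { Mult, Reg, Const, Other };

struct Term {
  Kind kind;
  long value;  // register number, constant value, or opaque id for Mult
};

// Sort rank: MULT terms first, then registers, constants last.
static int rank(Kind k) {
  switch (k) {
    case Kind::Mult: return 0;
    case Kind::Reg: return 1;
    case Kind::Const: return 2;
    default: return 3;
  }
}

// Canonicalize an already flattened sum of terms.  Returns nullopt
// (punt) if any unexpected term is present, so the caller can keep the
// original expression untouched.
std::optional<std::vector<Term>> canonicalize_sum(std::vector<Term> terms) {
  for (const Term &t : terms)
    if (t.kind == Kind::Other)
      return std::nullopt;

  // stable_sort keeps the relative order of operands of equal rank.
  std::stable_sort(terms.begin(), terms.end(),
                   [](const Term &a, const Term &b) {
                     return rank(a.kind) < rank(b.kind);
                   });

  // Simplify: fold all trailing constants into a single one.
  long sum = 0;
  bool saw_const = false;
  while (!terms.empty() && terms.back().kind == Kind::Const) {
    sum += terms.back().value;
    terms.pop_back();
    saw_const = true;
  }
  // Drop a zero constant unless it is the only term left.
  if (saw_const && (sum != 0 || terms.empty()))
    terms.push_back({Kind::Const, sum});
  return terms;
}
```

Given the terms const 4, reg, mult, const 8, the sketch yields mult, 
reg, const 12; rebuilding a left-chained PLUS from the resulting vector 
is left out here.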




+
+  /* Move CONST operand to the end.  */
+  if (CONST_INT_P (op0))
+std::swap (op0, op1);
You might want to check CONSTANT_P here.  Maybe it doesn't matter in 
practice, but things like (plus (plus (symbol-ref) (const_int) const_int))


That also gives you a fighting chance at extending this to 

Re: TR29124 C++ Special Maths - Make pull functions into global namespace.

2016-02-10 Thread Mike Stump
I’m seeing:

/home/mrs/work1/gcc/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
 In function 'void test(const testcase_riemann_zeta (&)[Num], Tp)':
/home/mrs/work1/gcc/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:285:15:
 error: 'riemann_zeta' is not a member of 'std'
compiler exited with status 1
output is:
/home/mrs/work1/gcc/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:
 In function 'void test(const testcase_riemann_zeta (&)[Num], Tp)':
/home/mrs/work1/gcc/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:285:15:
 error: 'riemann_zeta' is not a member of 'std'

FAIL: special_functions/18_riemann_zeta/check_value.cc (test for excess errors)
Excess errors:
/home/mrs/work1/gcc/libstdc++-v3/testsuite/special_functions/18_riemann_zeta/check_value.cc:285:15:
 error: 'riemann_zeta' is not a member of 'std'

UNRESOLVED: special_functions/18_riemann_zeta/check_value.cc compilation failed 
to produce executable
extra_tool_flags are:
 -D__STDCPP_WANT_MATH_SPEC_FUNCS__

on a recent trunk.  This is a typical newlib port.  Not sure if this is a real 
bug or not, but thought I’d forward it along.

RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
> > In the expand:
> > 18: cc:CC_FPU=cmp(r159:DF,r162:DF)
> > 19: r163:SI=cc:CC_FPU<0
> > 20: r161:QI=r163:SI#0
> > 21: r153:SI=zero_extend(r161:QI)
> > 22: cc:CC_ZN=cmp(r153:SI,0)
> > 23: pc={(cc:CC_ZN!=0)?L28:pc}
> >
> > Then after combine we get this:
> > 18: cc:CC_FPU=cmp(r2:DF,r4:DF)
> >REG_DEAD r4:DF
> >REG_DEAD r2:DF
> > 23: pc={(cc:CC_ZN<0)?L28:pc}
> >REG_DEAD cc:CC_ZN
> >REG_BR_PROB 6102
> 
> That sounds like a bug.  Have you looked more closely what's going on?

fwprop1 collapses insn 20 into insn 21, which is no surprise.  The combiner 
then first merges insns 19 and 21 into insn 21 (this seems sane), and next 
combines the resulting insn 21 into insn 22.  Finally, insn 22 changes the 
condition of the jump (insn 23).
The last steps are a bit too aggressive, but I can make sense of them. 
In practice, insn 22 tells the combiner how to change a CC_FPU mode into a 
CC_ZN mode, resulting in the modification of insn 21 into insn 23.  However, I 
cannot understand why the combiner chooses CC_ZN instead of CC_FPU.


[patch] Fix timevar internal consistency failure

2016-02-10 Thread Eric Botcazou
Hi,

I just ran into a timevar internal consistency failure with -ftime-report:

Execution times (seconds)
 phase setup             :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall     114 kB ( 0%) ggc
 phase parsing           :   0.29 ( 7%) usr   0.01 ( 9%) sys   0.30 ( 7%) wall
[...]
 repair loop structures  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall       0 kB ( 0%) ggc
 TOTAL                   :   4.06             0.11             4.18              58199 kB
Extra diagnostic checks enabled; compiler may run slowly.
Configure with --enable-checking=release to disable checks.
Timing error: total of phase timers exceeds total time.
ggc_mem 59609520 > 59596208

This only happens with -g and was probably introduced by the merge of the 
early-debug branch.  timevar.def reads:

/* The compiler phases.

   These must be mutually exclusive, and the NAME field must begin
   with "phase".

   Also, their sum must be within a millionth of the total time (see
   validate_phases).  */

The problem is that TV_PHASE_DBGINFO is now nested within TV_PHASE_OPT_GEN, 
which violates the above mutual exclusivity requirement.  Therefore the 
attached patch simply gets rid of TV_PHASE_DBGINFO (as well as of the sibling 
TV_PHASE_CHECK_DBGINFO which was already unused).
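
As a rough illustration of why nesting breaks the check, here is a toy 
model of phase accounting.  It is not GCC's timevar implementation: 
PhaseClock and its methods are hypothetical, and integer ticks stand in 
for whatever quantity is being accumulated (time or GGC memory).

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy model of per-phase accounting; NOT GCC's timevar code (assumption).
class PhaseClock {
 public:
  // Advance the overall counter by some amount of work.
  void advance(long ticks) { now_ += ticks; }

  void start(const std::string &phase) { started_[phase] = now_; }

  void stop(const std::string &phase) {
    elapsed_[phase] += now_ - started_.at(phase);
    started_.erase(phase);
  }

  // True when the per-phase totals do not exceed the overall total.
  bool phases_consistent() const {
    long sum = 0;
    for (const auto &entry : elapsed_)
      sum += entry.second;
    return sum <= now_;
  }

 private:
  long now_ = 0;
  std::map<std::string, long> started_;  // phases currently running
  std::map<std::string, long> elapsed_;  // accumulated per phase
};
```

With two phases run back to back the totals add up.  If "phase debug 
info" is started while "phase opt and generate" is still running, the 
inner interval is charged to both phases, so the phase sum exceeds the 
overall total and the consistency check fails.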

Tested on x86_64-suse-linux, OK for the mainline?


2016-02-10  Eric Botcazou  

* timevar.def (TV_PHASE_DBGINFO): Delete.
(TV_PHASE_CHECK_DBGINFO): Likewise.
* varpool.c (varpool_node::assemble_decl): Do not change timevar.

-- 
Eric Botcazou

Index: varpool.c
===
--- varpool.c	(revision 233237)
+++ varpool.c	(working copy)
@@ -586,9 +586,7 @@ varpool_node::assemble_decl (void)
   /* After the parser has generated debugging information, augment
 	 this information with any new location/etc information that may
 	 have become available after the compilation proper.  */
-  timevar_start (TV_PHASE_DBGINFO);
   debug_hooks->late_global_decl (decl);
-  timevar_stop (TV_PHASE_DBGINFO);
   return true;
 }
 
Index: timevar.def
===
--- timevar.def	(revision 233237)
+++ timevar.def	(working copy)
@@ -43,8 +43,6 @@ DEFTIMEVAR (TV_PHASE_PARSING , "
 DEFTIMEVAR (TV_PHASE_DEFERRED, "phase lang. deferred")
 DEFTIMEVAR (TV_PHASE_LATE_PARSING_CLEANUPS, "phase late parsing cleanups")
 DEFTIMEVAR (TV_PHASE_OPT_GEN , "phase opt and generate")
-DEFTIMEVAR (TV_PHASE_DBGINFO , "phase debug info")
-DEFTIMEVAR (TV_PHASE_CHECK_DBGINFO   , "phase check & debug info")
 DEFTIMEVAR (TV_PHASE_LATE_ASM, "phase last asm")
 DEFTIMEVAR (TV_PHASE_STREAM_IN   , "phase stream in")
 DEFTIMEVAR (TV_PHASE_STREAM_OUT  , "phase stream out")


Re: [PATCH][AArch64] Only update assembler .arch directive when necessary

2016-02-10 Thread James Greenhalgh
On Wed, Feb 10, 2016 at 10:32:16AM +, Kyrill Tkachov wrote:
> Hi James,
> 
> On 10/02/16 10:11, James Greenhalgh wrote:
> >On Thu, Feb 04, 2016 at 01:50:31PM +, Kyrill Tkachov wrote:
> >>Hi all,
> >>
> >>As part of the target attributes and pragmas support for GCC 6 I changed the
> >>aarch64 port to emit a .arch assembly directive for each function that
> >>describes the architectural features used by that function.  This is a 
> >>change
> >>from GCC 5 behaviour where we output a single .arch directive at the
> >>beginning of the assembly file corresponding to architectural features given
> >>on the command line.
> >
> >>Bootstrapped and tested on aarch64-none-linux-gnu.  With this patch I 
> >>managed
> >>to build a recent allyesconfig Linux kernel where before the build would 
> >>fail
> >>when assembling the LSE instructions.
> >>
> >>Ok for trunk?
> >One comment, that I'm willing to be convinced on...
> >
> >>Thanks,
> >>Kyrill
> >>
> >>2016-02-04  Kyrylo Tkachov  
> >>
> >> * config/aarch64/aarch64.c (struct aarch64_output_asm_info):
> >> New struct definition.
> >> (aarch64_previous_asm_output): New variable.
> >> (aarch64_declare_function_name): Only output .arch assembler
> >> directive if it will be different from the previously output
> >> directive.
> >> (aarch64_start_file): New function.
> >> (TARGET_ASM_FILE_START): Define.
> >>
> >>2016-02-04  Kyrylo Tkachov  
> >>
> >> * gcc.target/aarch64/assembler_arch_1.c: Add -dA to dg-options.
> >> Delete unneeded -save-temps.
> >> * gcc.target/aarch64/assembler_arch_7.c: Likewise.
> >> * gcc.target/aarch64/target_attr_15.c: Scan assembly for
> >> .arch armv8-a\n.
> >> * gcc.target/aarch64/assembler_arch_1.c: New test.
> >>commit 2df0f24332e316b8d18d4571438f76726a0326e7
> >>Author: Kyrylo Tkachov 
> >>Date:   Wed Jan 27 12:54:54 2016 +
> >>
> >> [AArch64] Only update assembler .arch directive when necessary
> >>
> >>diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> >>index 5ca2ae8..0751440 100644
> >>--- a/gcc/config/aarch64/aarch64.c
> >>+++ b/gcc/config/aarch64/aarch64.c
> >>@@ -11163,6 +11163,17 @@ aarch64_asm_preferred_eh_data_format (int code 
> >>ATTRIBUTE_UNUSED, int global)
> >> return (global ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | type;
> >>  }
> >>+struct aarch64_output_asm_info
> >>+{
> >>+  const struct processor *arch;
> >>+  const struct processor *cpu;
> >>+  unsigned long isa_flags;
> >Why not just keep the last string you printed, and use a string compare
> >to decide whether to print or not? Sure we'll end up doing a bit more
> >work, but the logic becomes simpler to follow and we don't need to pass
> >around another struct...
> 
> I did do it this way to avoid a string comparison (I try to avoid
> manual string manipulations where I can as they're so easy to get wrong)
> though this isn't on any hot path.
> We don't really pass the structure around anywhere, we just keep one
> instance. We'd have to do the same with a string i.e. keep a string
> object around that we'd strcpy (or C++ equivalent) a string to every time
> we wanted to update it, so I thought this approach is cleaner as the
> architecture features are already fully described by a pointer to
> an element in the static constant all_architectures table and an
> unsigned long holding the ISA flags.
> 
> If you insist I can change it to a string, but I personally don't
> think it's worth it.

Had you been working on a C string I probably wouldn't have noticed. But
you're already working with C++ strings in this function, so much of what
you are concerned about is straightforward.

I'd encourage you to try it using idiomatic string manipulation in C++, the
cleanup should be worth it.
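
A minimal sketch of what the string-based variant might look like, 
purely for illustration.  ArchDirectiveWriter and declare are 
hypothetical names, not part of the aarch64 backend, and the sketch 
writes to a std::ostream rather than to the assembler output file.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (assumption): remembers the last directive text
// written and only emits a new .arch line when that text changes.
class ArchDirectiveWriter {
 public:
  explicit ArchDirectiveWriter(std::ostream &out) : out_(out) {}

  // Returns true if a directive was actually written.
  bool declare(const std::string &arch_and_extensions) {
    if (arch_and_extensions == last_)
      return false;  // same as the previous directive, nothing to do
    out_ << "\t.arch " << arch_and_extensions << "\n";
    last_ = arch_and_extensions;
    return true;
  }

 private:
  std::ostream &out_;
  std::string last_;  // text of the most recently emitted directive
};
```

Declaring "armv8-a+lse" twice in a row and then "armv8-a" would emit 
two directives in this model, one per change.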

Thanks,
James



Re: [patch] libstdc++/69116 Constrain std::valarray functions and operators

2016-02-10 Thread Jonathan Wakely

On 22/01/16 21:15 +, Jonathan Wakely wrote:

This is a regression, caused by the front end starting to diagnose the
invalid library instantiations more eagerly. The fix seems simple and
safe, so I plan to backport it to the branches too.


Committed to the branches too.


Re: libgcc: On AIX, increase chances to find landing pads for exceptions

2016-02-10 Thread Michael Haubenwallner

On 02/08/2016 02:59 PM, David Edelsohn wrote:
> Runtime linking is disabled by default on AIX, and I disabled it for 
> libstdc++.

For large applications mainly developed on/for Linux I do prefer/need
runtime linking even on AIX. Still I do believe there is no AIX-based
reason to leave runtime linking disabled, but build-/linktime issues
instead that cause things to fail with runtime linking enabled.

> There are two remaining issues:
> 
> 1) FDEs with overlapping ranges causing problems with exceptions.  I'm
> not sure of the best way to work around this.  Your patch is one
> possible solution.

This patch is not meant as a final solution, but to improve the current
situation with broken build systems exporting even _GLOBAL__ symbols.
I'm about to prepare another libtool patch to fix that one.

> 2) AIX linker garbage collection conflicting with scanning for
> symbols.  collect2 scanning needs to better emulate SVR4 linker
> semantics for object files and archives.

Probably collect2 should filter the symbol list originating in either
an explicit -bexport:file or the -bexpall/-bexpfull flags and pass the
resulting symbol list as explicit -bexport:file only to the AIX linker?

/haubi/

> 
> Thanks, David
> 
> 
> On Mon, Feb 8, 2016 at 7:14 AM, Michael Haubenwallner
>  wrote:
>> Hi David,
>>
>> still experiencing exception-not-caught problems with gcc-4.2.4 on AIX
>> leads me to some patch proposed in http://gcc.gnu.org/PR13878 back in
>> 2004 already, ought to be fixed by some different commit since 3.4.0.
>>
>> As long as build systems (even libtool right now) on AIX do export these
>> _GLOBAL__* symbols from shared libraries, overlapping frame-base address
>> ranges may become registered, even if newer gcc (seen with 4.8) does name
>> the FDE symbols more complex to reduce these chances.
>>
>> But still, just think of linking some static library into multiple shared
>> libraries and/or the main executable. Or sometimes there is just need for
>> some hackery to override a shared object's implementation detail and rely
>> on runtime linking to do the override at runtime.
>>
>> Agreed both is "wrong" to some degree, but the larger an application is,
>> the higher is the chance for this to happen.
>>
>> Thoughts?
>>
>> Thanks!
>> /haubi/


Re: [PATCH PR69652, Regression]

2016-02-10 Thread Yuri Rumyantsev
Thanks Richard for your comments.
I changed the algorithm to remove dead scalar statements as you proposed.

Bootstrap and regression testing did not show any new failures on x86-64.
Is it OK for trunk?

Changelog:
2016-02-10  Yuri Rumyantsev  

PR tree-optimization/69652
* tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
to nested loop, did source re-formatting, skip debug statements,
add check on statement with volatile operand, remove dead scalar
statements.

gcc/testsuite/ChangeLog:
* gcc.dg/torture/pr69652.c: New test.


2016-02-09 15:33 GMT+03:00 Richard Biener :
> On Fri, Feb 5, 2016 at 3:54 PM, Yuri Rumyantsev  wrote:
>> Hi All,
>>
>> Here is updated patch - I came back to move call statements also since
>>  masked loads are presented by internal call. I also assume that for
>> the following simple loop
>>   for (i = 0; i < n; i++)
>> if (b1[i])
>>   a1[i] = sqrtf(a2[i] * a2[i] + a3[i] * a3[i]);
>> motion must be done for all vector statements in semi-hammock including SQRT.
>>
>> Bootstrap and regression testing did not show any new failures.
>> Is it OK for trunk?
>
> The patch is incredibly hard to parse due to the re-indenting.  Please
> consider sending
> diffs with -b.
>
> This issue exposes that you are moving (masked) stores across loads without
> checking aliasing.  In the specific case those loads are dead and thus
> this is safe
> but in general I thought we were checking that we are using the same VUSE
> during the sinking operation.
>
> Thus, I'd rather have
>
> + /* Check that LHS does not have uses outside of STORE_BB.  */
> + res = true;
> + FOR_EACH_IMM_USE_FAST (use_p, imm_iter, lhs)
> +   {
> + gimple *use_stmt;
> + use_stmt = USE_STMT (use_p);
> + if (is_gimple_debug (use_stmt))
> +   continue;
> + if (gimple_bb (use_stmt) != store_bb)
> +   {
> + res = false;
> + break;
> +   }
> +   }
>
> also check for the dead code case and DCE those stmts here.  Like so:
>
>if (has_zero_uses (lhs))
> {
>   gsi_remove (_from, true);
>   continue;
> }
>
> before the above loop.
>
> Richard.
>
>> ChangeLog:
>>
>> 2016-02-05  Yuri Rumyantsev  
>>
>> PR tree-optimization/69652
>> * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
>> to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
>> skipped scalar statements, introduce variable LAST_VUSE to keep
>> vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
>> begining of current masked store processing, did source re-formatting,
>> skip parsing of debug gimples, stop processing if a gimple with
>> volatile operand has been encountered, save scalar statement
>> with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
>> iterator, change vuse of all saved scalar statements to LAST_VUSE if
>> it makes sence.
>>
>> gcc/testsuite/ChangeLog:
>> * gcc.dg/torture/pr69652.c: New test.
>>
>> 2016-02-04 19:40 GMT+03:00 Jakub Jelinek :
>>> On Thu, Feb 04, 2016 at 05:46:27PM +0300, Yuri Rumyantsev wrote:
 Here is a patch that cures the issues with non-correct vuse for scalar
 statements during code motion, i.e. if vuse of scalar statement is
 vdef of masked store which has been sunk to new basic block, we must
 fix it up.  The patch also fixed almost all remarks pointed out by
 Jacub.

 Bootstrapping and regression testing on v86-64 did not show any new 
 failures.
 Is it OK for trunk?

 ChangeLog:
 2016-02-04  Yuri Rumyantsev  

 PR tree-optimization/69652
 * tree-vect-loop.c (optimize_mask_stores): Move declaration of STMT1
 to nested loop, introduce new SCALAR_VUSE vector to keep vuse of all
 skipped scalar statements, introduce variable LAST_VUSE that has
 vuse of LAST_STORE, add assertion that SCALAR_VUSE is empty in the
 begining of current masked store processing, did source re-formatting,
 skip parsing of debug gimples, stop processing when call or gimple
 with volatile operand habe been encountered, save scalar statement
 with vuse in SCALAR_VUSE, skip processing debug statements in IMM_USE
 iterator, change vuse of all saved scalar statements to LAST_VUSE if
 it makes sence.

 gcc/testsuite/ChangeLog:
 * gcc.dg/torture/pr69652.c: New test.
>>>
>>> Your mailer breaks ChangeLog formatting, so it is hard to check the
>>> formatting of the ChangeLog entry.
>>>
>>> diff --git a/gcc/testsuite/gcc.dg/torture/pr69652.c 
>>> b/gcc/testsuite/gcc.dg/torture/pr69652.c
>>> new file mode 100644
>>> index 000..91f30cf
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.dg/torture/pr69652.c
>>> @@ -0,0 +1,14 @@
>>> +/* { dg-do 

Re: [patch] Fix timevar internal consistency failure

2016-02-10 Thread Richard Biener
On Wed, Feb 10, 2016 at 10:55 AM, Eric Botcazou  wrote:
> Hi,
>
> I just ran into a timevar internal consistency failure with -ftime-report:
>
> Execution times (seconds)
>  phase setup :   0.00 ( 0%) usr   0.00 ( 0%) sys   0.01 ( 0%) wall
> 114 kB ( 0%) ggc
>  phase parsing   :   0.29 ( 7%) usr   0.01 ( 9%) sys   0.30 ( 7%) wall
> [...]
> repair loop structures  :   0.01 ( 0%) usr   0.00 ( 0%) sys   0.00 ( 0%) wall
> 0 kB ( 0%) ggc
>  TOTAL :   4.06 0.11 4.18
> 58199 kB
> Extra diagnostic checks enabled; compiler may run slowly.
> Configure with --enable-checking=release to disable checks.
> Timing error: total of phase timers exceeds total time.
> ggc_mem 59609520 > 59596208
>
> This only happens with -g and was probably introduced by the merge of the
> early-debug branch.  timevar.def reads:
>
> /* The compiler phases.
>
>These must be mutually exclusive, and the NAME field must begin
>with "phase".
>
>Also, their sum must be within a millionth of the total time (see
>validate_phases).  */
>
> The problem is that TV_PHASE_DBGINFO is now nested within TV_PHASE_OPT_GEN,
> which violates the above mutual exclusivity requirement.  Therefore the
> attached patch simply gets rid of TV_PHASE_DBGINFO (as well as of the sibling
> TV_PHASE_CHECK_DBGINFO which was already unused).
>
> Tested on x86_64-suse-linux, OK for the mainline?

Ok.

Richard.

>
> 2016-02-10  Eric Botcazou  
>
> * timevar.def (TV_PHASE_DBGINFO): Delete.
> (TV_PHASE_CHECK_DBGINFO): Likewise.
> * varpool.c (varpool_node::assemble_decl): Do not change timevar.
>
> --
> Eric Botcazou


Re: [PATCH][AArch64] Only update assembler .arch directive when necessary

2016-02-10 Thread James Greenhalgh
On Thu, Feb 04, 2016 at 01:50:31PM +, Kyrill Tkachov wrote:
> Hi all,
> 
> As part of the target attributes and pragmas support for GCC 6 I changed the
> aarch64 port to emit a .arch assembly directive for each function that
> describes the architectural features used by that function.  This is a change
> from GCC 5 behaviour where we output a single .arch directive at the
> beginning of the assembly file corresponding to architectural features given
> on the command line.

> Bootstrapped and tested on aarch64-none-linux-gnu.  With this patch I managed
> to build a recent allyesconfig Linux kernel where before the build would fail
> when assembling the LSE instructions.
> 
> Ok for trunk?

One comment, that I'm willing to be convinced on...

> 
> Thanks,
> Kyrill
> 
> 2016-02-04  Kyrylo Tkachov  
> 
> * config/aarch64/aarch64.c (struct aarch64_output_asm_info):
> New struct definition.
> (aarch64_previous_asm_output): New variable.
> (aarch64_declare_function_name): Only output .arch assembler
> directive if it will be different from the previously output
> directive.
> (aarch64_start_file): New function.
> (TARGET_ASM_FILE_START): Define.
> 
> 2016-02-04  Kyrylo Tkachov  
> 
> * gcc.target/aarch64/assembler_arch_1.c: Add -dA to dg-options.
> Delete unneeded -save-temps.
> * gcc.target/aarch64/assembler_arch_7.c: Likewise.
> * gcc.target/aarch64/target_attr_15.c: Scan assembly for
> .arch armv8-a\n.
> * gcc.target/aarch64/assembler_arch_1.c: New test.

> commit 2df0f24332e316b8d18d4571438f76726a0326e7
> Author: Kyrylo Tkachov 
> Date:   Wed Jan 27 12:54:54 2016 +
> 
> [AArch64] Only update assembler .arch directive when necessary
> 
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 5ca2ae8..0751440 100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -11163,6 +11163,17 @@ aarch64_asm_preferred_eh_data_format (int code 
> ATTRIBUTE_UNUSED, int global)
> return (global ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | type;
>  }
>  
> +struct aarch64_output_asm_info
> +{
> +  const struct processor *arch;
> +  const struct processor *cpu;
> +  unsigned long isa_flags;

Why not just keep the last string you printed, and use a string compare
to decide whether to print or not? Sure we'll end up doing a bit more
work, but the logic becomes simpler to follow and we don't need to pass
around another struct...

Thanks,
James




Re: [PATCH][AArch64] Only update assembler .arch directive when necessary

2016-02-10 Thread Kyrill Tkachov

Hi James,

On 10/02/16 10:11, James Greenhalgh wrote:

On Thu, Feb 04, 2016 at 01:50:31PM +, Kyrill Tkachov wrote:

Hi all,

As part of the target attributes and pragmas support for GCC 6 I changed the
aarch64 port to emit a .arch assembly directive for each function that
describes the architectural features used by that function.  This is a change
from GCC 5 behaviour where we output a single .arch directive at the
beginning of the assembly file corresponding to architectural features given
on the command line.



Bootstrapped and tested on aarch64-none-linux-gnu.  With this patch I managed
to build a recent allyesconfig Linux kernel where before the build would fail
when assembling the LSE instructions.

Ok for trunk?

One comment, that I'm willing to be convinced on...


Thanks,
Kyrill

2016-02-04  Kyrylo Tkachov  

 * config/aarch64/aarch64.c (struct aarch64_output_asm_info):
 New struct definition.
 (aarch64_previous_asm_output): New variable.
 (aarch64_declare_function_name): Only output .arch assembler
 directive if it will be different from the previously output
 directive.
 (aarch64_start_file): New function.
 (TARGET_ASM_FILE_START): Define.

2016-02-04  Kyrylo Tkachov  

 * gcc.target/aarch64/assembler_arch_1.c: Add -dA to dg-options.
 Delete unneeded -save-temps.
 * gcc.target/aarch64/assembler_arch_7.c: Likewise.
 * gcc.target/aarch64/target_attr_15.c: Scan assembly for
 .arch armv8-a\n.
 * gcc.target/aarch64/assembler_arch_1.c: New test.
commit 2df0f24332e316b8d18d4571438f76726a0326e7
Author: Kyrylo Tkachov 
Date:   Wed Jan 27 12:54:54 2016 +

 [AArch64] Only update assembler .arch directive when necessary

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 5ca2ae8..0751440 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11163,6 +11163,17 @@ aarch64_asm_preferred_eh_data_format (int code 
ATTRIBUTE_UNUSED, int global)
 return (global ? DW_EH_PE_indirect : 0) | DW_EH_PE_pcrel | type;
  }
  
+struct aarch64_output_asm_info

+{
+  const struct processor *arch;
+  const struct processor *cpu;
+  unsigned long isa_flags;

Why not just keep the last string you printed, and use a string compare
to decide whether to print or not? Sure we'll end up doing a bit more
work, but the logic becomes simpler to follow and we don't need to pass
around another struct...


I did do it this way to avoid a string comparison (I try to avoid
manual string manipulations where I can as they're so easy to get wrong)
though this isn't on any hot path.
We don't really pass the structure around anywhere, we just keep one
instance. We'd have to do the same with a string i.e. keep a string
object around that we'd strcpy (or C++ equivalent) a string to every time
we wanted to update it, so I thought this approach is cleaner as the
architecture features are already fully described by a pointer to
an element in the static constant all_architectures table and an
unsigned long holding the ISA flags.

If you insist I can change it to a string, but I personally don't
think it's worth it.

Thanks,
Kyrill



Thanks,
James






Re: [Patch, fortran, pr67451, v1] [5/6 Regression] ICE with sourced allocation from coarray

2016-02-10 Thread Andre Vehreschild
Hi all,

unfortunately my last patch for pr67451 was not perfect and introduced
regressions occurring on s390(x) and with the sanitizer. These were
caused because, when taking the array specs from the source=-expression,
its attributes, like coarray state and so on, were also taken from
there. This additionally added a corank to local objects to allocate
that were not coarrays, overwriting data in the array handle. The attached
patch fixes both issues.

The patch for gcc-5 is not affected, because in gcc-5 the feature of
taking the array spec from the source=-expression is not implemented.

Bootstrapped and regtested ok on x86_64-linux-gnu/F23.

Ok for trunk?

Regards,
Andre

On Tue, 2 Feb 2016 19:24:46 +0100
Paul Richard Thomas  wrote:

> Hi Andre,
> 
> This looks to be OK for trunk.
> 
> I'll move to the 5-branch patch right away.
> 
> Thanks
> 
> Paul
> 
> On 29 January 2016 at 19:17, Andre Vehreschild  wrote:
> > Hi all,
> >
> > attached is a patch to fix a regression in current gfortran when a
> > coarray is used in the source=-expression of an allocate(). The ICE was
> > caused by the class information, i.e., _vptr and so on, not at the
> > expected place. The patch fixes this.
> >
> > The patch also fixes pr69418, which I will flag as a duplicate in a
> > second.
> >
> > Bootstrapped and regtested ok on x86_64-linux-gnu/F23.
> >
> > Ok for trunk?
> >
> > Backport to gcc-5 is pending, albeit more difficult, because the
> > allocate() implementation on 5 is not as advanced the one in 6.
> >
> > Regards,
> > Andre
> > --
> > Andre Vehreschild * Email: vehre ad gmx dot de  
> 
> 
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 
diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index 2ff2833..649b80f 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -5401,17 +5401,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   if (!retrieve_last_ref (&ref, &prev_ref))
 return false;
 
-  if (ref->u.ar.type == AR_FULL && expr3 != NULL)
-{
-  /* F08:C633: Array shape from expr3.  */
-  ref = expr3->ref;
-
-  /* Find the last reference in the chain.  */
-  if (!retrieve_last_ref (&ref, &prev_ref))
-	return false;
-  alloc_w_e3_arr_spec = true;
-}
-
+  /* Take the allocatable and coarray properties solely from the expr-ref's
+ attributes and not from source=-expression.  */
   if (!prev_ref)
 {
   allocatable = expr->symtree->n.sym->attr.allocatable;
@@ -5428,6 +5419,17 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   if (!dimension)
 gcc_assert (coarray);
 
+  if (ref->u.ar.type == AR_FULL && expr3 != NULL)
+{
+  /* F08:C633: Array shape from expr3.  */
+  ref = expr3->ref;
+
+  /* Find the last reference in the chain.  */
+  if (!retrieve_last_ref (&ref, &prev_ref))
+	return false;
+  alloc_w_e3_arr_spec = true;
+}
+
   /* Figure out the size of the array.  */
   switch (ref->u.ar.type)
 {
@@ -5463,7 +5465,8 @@ gfc_array_allocate (gfc_se * se, gfc_expr * expr, tree status, tree errmsg,
   gfc_init_block (_descriptor_block);
   size = gfc_array_init_size (se->expr, alloc_w_e3_arr_spec ? expr->rank
 			   : ref->u.ar.as->rank,
-			  ref->u.ar.as->corank, , lower, upper,
+			  coarray ? ref->u.ar.as->corank : 0,
+			  , lower, upper,
 			  >pre, _descriptor_block, ,
 			  expr3_elem_size, nelems, expr3, e3_arr_desc,
 			  e3_is_array_constr, expr);
diff --git a/gcc/testsuite/gfortran.dg/coarray_allocate_5.f08 b/gcc/testsuite/gfortran.dg/coarray_allocate_5.f08
new file mode 100644
index 000..feb1bf3
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/coarray_allocate_5.f08
@@ -0,0 +1,32 @@
+! { dg-do run }
+! { dg-options "-fcoarray=lib -lcaf_single -fdump-tree-original" }
+!
+! Contributed by Ian Harvey  
+! Extended by Andre Vehreschild  
+! to test that coarray references in allocate work now
+! PR fortran/67451
+
+  program main
+implicit none
+type foo
+  integer :: bar = 99
+end type
+class(foo), dimension(:), allocatable :: foobar[:]
+class(foo), dimension(:), allocatable :: some_local_object
+allocate(foobar(10)[*])
+
+allocate(some_local_object, source=foobar)
+
+if (.not. allocated(foobar)) call abort()
+if (lbound(foobar, 1) /= 1 .OR. ubound(foobar, 1) /= 10) call abort()
+if (.not. allocated(some_local_object)) call abort()
+if (any(some_local_object(:)%bar /= [99, 99,  99, 99, 99, 99, 99, 99, 99, 99])) call abort()
+
+deallocate(some_local_object)
+deallocate(foobar)
+  end program
+
+! Check that some_local_object is treated as rank-1 array.
+! This failed beforehand, because the coarray attribute of the source=expression
+! was propagated to some_local_object in the allocate.
+! { dg-final { scan-tree-dump-not "some_local_object\._data\.dim\[1\]\.lbound" 

[PATCH] Fixup PR69719 fix

2016-02-10 Thread Richard Biener

The following patch improves the fix for PR69719 after spending another
hour in trying to understand that code (and revisiting the original
patch postings).  The code assumes that dr_a1 is left of dr_a2, which
is not always the case (sorting doesn't guarantee that); when it is not,
we can observe negative differences as seen in the PR.  There
are downstream tests that also rely on that ordering so we'd better
fix the ordering rather than just computing the absolute of the diff.
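
The ordering idea can be shown with a small standalone sketch.  Ref and
ordered_distance are hypothetical stand-ins, not the vectorizer's data
structures; the point is only that the pair is put in order first, so
the unsigned difference is meaningful and later code can rely on the
first reference starting at or before the second.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Toy stand-in for a data reference with a constant offset (assumption).
struct Ref {
  std::int64_t offset;
};

// Orders the pair so that a starts at or before b, then returns the
// non-negative distance between the two offsets.
std::uint64_t ordered_distance(Ref &a, Ref &b) {
  if (b.offset < a.offset)
    std::swap(a, b);
  // Subtract in unsigned arithmetic; the result is exact because
  // b.offset >= a.offset holds at this point.
  return static_cast<std::uint64_t>(b.offset)
         - static_cast<std::uint64_t>(a.offset);
}
```

Unlike taking the absolute value of the difference, this also leaves
the two references swapped into a known order for any downstream tests.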

Now, I still don't get how validity can be guaranteed given that
dr_a1 seg_len might be larger than diff and I don't get how the
segment length of dr_b can play any role in validating either.

But without a testcase it's not the time to kill this code and replace
it by something more obvious (I do have some more obvious solution
though).

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2016-02-10  Richard Biener  

PR tree-optimization/69719
* tree-vect-data-refs.c (vect_prune_runtime_alias_test_list):
Adjust previous fix by ensuring that dr_a1 is left of dr_a2.

Index: gcc/tree-vect-data-refs.c
===
--- gcc/tree-vect-data-refs.c   (revision 233261)
+++ gcc/tree-vect-data-refs.c   (working copy)
@@ -3081,9 +3081,12 @@ vect_prune_runtime_alias_test_list (loop
  || !tree_fits_shwi_p (dr_a2->offset))
continue;
 
+ /* Make sure dr_a1 starts left of dr_a2.  */
+ if (tree_int_cst_lt (dr_a2->offset, dr_a1->offset))
+   std::swap (*dr_a1, *dr_a2);
+
  unsigned HOST_WIDE_INT diff
-   = absu_hwi (tree_to_shwi (dr_a2->offset)
-   - tree_to_shwi (dr_a1->offset));
+   = tree_to_shwi (dr_a2->offset) - tree_to_shwi (dr_a1->offset);
 
 
  /* Now we check if the following condition is satisfied:


Re: [PATCH] PR rtl-optimization/64081: Enable RTL loop unrolling for duplicated exit blocks and back edges.

2016-02-10 Thread Alexander Fomin
Hi,
Here is a quick status update.
(Which comes a bit late due to bisection efforts)

This patch still causes a bootstrap failure on AIX when applied
on top of r219827.
I tried to bisect to the first commit that eliminates the AIX problem - it may
be useful anyway - but my current results seem misleading.
Therefore, I'll continue the investigation.

As far as I understand, it can be checked in during Stage 1 for
GCC 7 at worst.

Thanks,
Alexander

On Sat, Feb 06, 2016 at 12:42:49PM -0700, Jeff Law wrote:
> On 02/06/2016 12:08 PM, David Edelsohn wrote:
> 
> >>Normally I'd say that if it was approved before, then it's still good to go
> >>since there haven't been major conceptual changes in this code since the
> >>patch was originally written and now.
> >>
> >>However, in this instance the patch had been reported to cause problems on
> >>AIX, problems that we can't reproduce now -- which makes me want to be more
> >>cautious.  Was it a problem with the patch, or some other latent issue -- we
> >>don't know at this point.
> >>
> >>So I think the way to go is to apply this patch on top of r219827 where it
> >>caused the AIX failure.  Then bootstrap on aix and determine the root cause
> >>of the AIX bootstrap failure.  If it's this patch, then update the patch
> >>as needed.  If the patch is just exposing a latent bug elsewhere, we should
> >>evaluate whether or not that latent bug has been fixed or not before
> >>applying this fix to the trunk.
> >>
> >>It's considerably more work, but ISTM it's the right thing to do.
> >
> >I'm on the fence about this patch.  I definitely don't think that it
> >should be merged for GCC 6.
> >
> >If the patch were to be proposed during Stage 1 for GCC 7 and had not
> >caused bootstrap problems for AIX, no one would have any question.
> >
> >The problem is we don't know if the patch exposed a latent bug that
> >independently was fixed after the patch was reverted or if the patch
> >still contains a bug that has been rendered latent by another change.
> >
> >Another approach to track down the cause would be to bisect which
> >patch fixed the bootstrap failure if the patch had not been reverted.
> Yes, that would be a good approach as well.  The concern here would be that
> without doing the root cause analysis, bisection may just find a patch which
> made the issue go latent.  To be sure we still have to do some root cause
> analysis.
> 
> Given this fixes a regression, I'm still open to incorporating the patch,
> but we've got to know what went wrong when the patch was previously applied
> and that whatever that problem was got fixed.
> Jeff
> 


[PATCH PR68021]Set ratio to 1 when computing the value of biv cand by itself

2016-02-10 Thread Bin Cheng
Hi,
This is another way to fix PR68021, and I think it's the least intrusive way.  
The issue is triggered in a special case in which cand is an original biv, and 
use denotes the value of the biv itself.  In this case, the use is added 
specifically for the original biv, as a result, get_computation_aff isn't 
called for the <use, cand> pair before rewriting the use.  It is possible that 
constant_multiple_of/operand_equal_q could fail because of inconsistent fold 
behavior.  The fold behavior is fully described in PR68021.
This patch fixes the IVOPT part of the issue by setting ratio to 1, because it is known 
that the use has the value of the biv cand.
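
To illustrate the idea, here is a minimal sketch under simplifying
assumptions. It is not the ivopts code: iv_step, iv_use, iv_cand and
compute_ratio are hypothetical names, steps are plain integers, and a
"known" flag models the case where the symbolic comparison fails. The
special case bypasses the constant-multiple check entirely when the use
is the increment of the original biv.

```cpp
#include <cassert>

// Hypothetical model of a step: KNOWN is false when the symbolic
// comparison cannot be carried out (e.g. inconsistent folding).
struct iv_step { bool known; long value; };

struct iv_use  { iv_step step; int stmt_id; };
struct iv_cand { iv_step step; bool original_biv; int incr_stmt_id; };

// Store USTEP / CSTEP in *RATIO if it is a known integer multiple.
bool
constant_multiple (iv_step ustep, iv_step cstep, long *ratio)
{
  assert (ratio != nullptr);
  if (!ustep.known || !cstep.known
      || cstep.value == 0 || ustep.value % cstep.value != 0)
    return false;
  *ratio = ustep.value / cstep.value;
  return true;
}

// The ratio is 1 by construction when the use is the value of the
// original biv computed by its own increment statement; otherwise
// fall back to the general (possibly failing) check.
bool
compute_ratio (const iv_use &use, const iv_cand &cand, long *ratio)
{
  if (cand.original_biv && cand.incr_stmt_id == use.stmt_id)
    {
      *ratio = 1;
      return true;
    }
  return constant_multiple (use.step, cand.step, ratio);
}
```

In this toy model the special case succeeds even when both steps are
marked unknown, which is the situation the patch is meant to handle.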

Bootstrap and test on x86_64 and aarch64.  Is it OK if no failures?

Thanks,
bin

2016-02-09  Bin Cheng  

PR tree-optimization/68021
* tree-ssa-loop-ivopts.c (get_computation_aff): Set ratio to 1
when computing the value of biv cand by itself.

gcc/testsuite/ChangeLog
2016-02-09  Bin Cheng  

PR tree-optimization/68021
* gcc.dg/tree-ssa/pr68021.c: New test.
diff --git a/gcc/tree-ssa-loop-ivopts.c b/gcc/tree-ssa-loop-ivopts.c
index 4026d28..48facec 100644
--- a/gcc/tree-ssa-loop-ivopts.c
+++ b/gcc/tree-ssa-loop-ivopts.c
@@ -3741,7 +3741,15 @@ get_computation_aff (struct loop *loop,
   var = fold_convert (uutype, var);
 }
 
-  if (!constant_multiple_of (ustep, cstep, &rat))
+  /* Ratio is 1 when computing the value of biv cand by itself.  */
+  if (cand->pos == IP_ORIGINAL && cand->incremented_at == use->stmt)
+{
+  gcc_assert (is_gimple_assign (use->stmt));
+  gcc_assert (use->iv->ssa_name == cand->var_after);
+  gcc_assert (gimple_assign_lhs (use->stmt) == cand->var_after);
+  rat = 1;
+}
+  else if (!constant_multiple_of (ustep, cstep, &rat))
 return false;
 
   /* In case both UBASE and CBASE are shortened to UUTYPE from some common
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr68021.c b/gcc/testsuite/gcc.dg/tree-ssa/pr68021.c
new file mode 100644
index 000..f60b1ff
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr68021.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O3" } */
+
+char a;
+void fn1 (char *p1, int p2, int p3)
+{
+  int i, x;
+  for (i = 0; i < 10; i++)
+{
+  for (x = 0; x < p3; x++)
+   {
+ *p1 = a;
+ p1--;
+   }
+  p1 += p2;
+}
+}


Re: [PATCH] PR rtl-optimization/64081: Enable RTL loop unrolling for duplicated exit blocks and back edges.

2016-02-10 Thread Bernd Schmidt

On 02/10/2016 12:34 PM, Alexander Fomin wrote:


This patch still causes a bootstrap failure on AIX when applied
on top of r219827.
I tried to bisect for the first commit that eliminates the AIX problem - it
may be useful anyway - but my current results seem misleading.
Therefore, I'll continue the investigation.

As far as I understand, it can be checked in during Stage 1 for
GCC 7 at worst.


Only if we understand the bootstrap failure.


Bernd



Re: Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid offloading"

2016-02-10 Thread Thomas Schwinge
Hi!

Ping.

On Thu, 04 Feb 2016 15:47:25 +0100, I wrote:
> Here is the patch re-worked for trunk.  Instead of passing
> -foffload-force in the affected libgomp test cases, I instead chose to
> have them expect the warning.  This way, we're testing more in line to
> what users will be doing, and we'll notice how the OpenACC kernels
> handling improves, when parloops gets able to parallelize more offloaded
> code (and the "avoid offloading" handling will no longer trigger).  OK to
> commit?
> 
> commit acd66946777671486a0f69706b25a3ec5f877306
> Author: Thomas Schwinge 
> Date:   Tue Feb 2 20:41:42 2016 +0100
> 
> Un-parallelized OpenACC kernels constructs with nvptx offloading: "avoid 
> offloading"
> 
>   gcc/
>   * common.opt: Add -foffload-force.
>   * lto-wrapper.c (merge_and_complain, append_compiler_options):
>   Handle it.
>   * doc/invoke.texi: Document it.
>   * config/nvptx/mkoffload.c (struct id_map): Add "flags" member.
>   (record_id): Parse, and set it.
>   (process): Use it.
>   * config/nvptx/nvptx.c (nvptx_attribute_table): Add "omp avoid
>   offloading".
>   (nvptx_record_offload_symbol): Use it.
>   (nvptx_goacc_validate_dims): Set it.
>   libgomp/
>   * libgomp.h (gomp_offload_target_available_p): New function
>   declaration.
>   * target.c (gomp_offload_target_available_p): New function
>   definition.
>   (GOMP_offload_register_ver, GOMP_offload_unregister_ver)
>   (gomp_init_device, gomp_unload_device): Handle and document "avoid
>   offloading" flag ("host_table == NULL").
>   (resolve_device): Document "avoid offloading".
>   * oacc-init.c (resolve_device): Likewise.
>   * libgomp.texi (Enabling OpenACC): Likewise.
>   * testsuite/lib/libgomp.exp
>   (check_effective_target_nvptx_offloading_configured): New proc
>   definition.
>   * testsuite/libgomp.oacc-c-c++-common/avoid-offloading-1.c: New
>   file.
>   * testsuite/libgomp.oacc-c-c++-common/avoid-offloading-2.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/avoid-offloading-3.c:
>   Likewise.
>   * testsuite/libgomp.oacc-fortran/avoid-offloading-1.f: Likewise.
>   * testsuite/libgomp.oacc-fortran/avoid-offloading-2.f: Likewise.
>   * testsuite/libgomp.oacc-fortran/avoid-offloading-3.f: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/abort-3.c: Expect warning.
>   * testsuite/libgomp.oacc-c-c++-common/abort-4.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/deviceptr-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-2.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta-3.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-alias-ipa-pta.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-empty.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-2.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-3.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-4.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-5.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq-6.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-and-seq.c:
>   Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-loop-collapse.c:
>   Likewise.
>   * testsuite/libgomp.oacc-fortran/combined-directives-1.f90:
>   Likewise.
>   * testsuite/libgomp.oacc-fortran/non-scalar-data.f90: Likewise.
> 
>   libgomp/
>   * testsuite/libgomp.oacc-c-c++-common/combined-directives-1.c: Set
>   "-ftree-parallelize-loops=32".
>   * testsuite/libgomp.oacc-c-c++-common/default-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/host_data-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/kernels-1.c: Likewise.
>   * testsuite/libgomp.oacc-c-c++-common/nested-2.c: Likewise.
> ---
>  gcc/common.opt |4 +
>  gcc/config/nvptx/mkoffload.c   |   73 +++-
>  gcc/config/nvptx/nvptx.c   |   42 ++-
>  gcc/doc/invoke.texi|   12 +-
>  gcc/lto-wrapper.c  |2 +
>  libgomp/libgomp.h  |1 +
>  libgomp/libgomp.texi   |8 ++
>  libgomp/oacc-init.c|   19 ++-
>  libgomp/target.c   |  122 
> 
>  

RE: [PATCH] [ARC] Add single/double IEEE precission FPU support.

2016-02-10 Thread Claudiu Zissulescu
> That sounds like a bug.  Have you looked more closely what's going on?

Right, I found it. Forgot to set the C_MODE for CC_FPU* modes in the 
arc_mode_class[]. I will prepare a new patch with the proper handling.

Thanks!


[PATCH] Fix PR69291, RTL if-conversion bug

2016-02-10 Thread Richard Biener

In this case if-conversion sees

 if ()
  (set (reg/v:SI 224 [  ])
(plus:SI (plus:SI (reg/v:SI 160 [ mod_tlen ])
(reg/v:SI 224 [  ]))
(const_int 11 [0xb])
 else
  (set (reg/v:SI 224 [  ])
(plus:SI (plus:SI (reg/v:SI 160 [ mod_tlen ])
(reg/v:SI 224 [  ]))
(const_int 10 [0xa])))

where noce_try_store_flag_constants identifies

(plus:SI (reg/v:SI 160 [ mod_tlen ])
 (reg/v:SI 224 [  ]))

as "common" and then tries to detect the case where setting the
result would clobber that value.  It doesn't seem to expect
anything other than regs that can be equal to the destination though,
which is clearly an oversight.
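
The difference between the old and the new test can be sketched with a
toy expression tree. This is only an illustration, not RTL: expr,
expr_equal and reg_mentioned are hypothetical stand-ins for the roles
played by rtx_equal_p and reg_mentioned_p in the patch below.

```cpp
#include <cassert>

// Hypothetical two-kind expression: a register or a binary plus.
struct expr
{
  enum kind_t { REG, PLUS } kind;
  int regno;		  // Meaningful for REG only.
  const expr *op0, *op1;  // Meaningful for PLUS only.
};

// Structural equality: what the old check tested against the
// destination register.
bool
expr_equal (const expr *a, const expr *b)
{
  if (a->kind != b->kind)
    return false;
  if (a->kind == expr::REG)
    return a->regno == b->regno;
  return expr_equal (a->op0, b->op0) && expr_equal (a->op1, b->op1);
}

// True if register REG appears anywhere inside IN: what the new
// check tests.
bool
reg_mentioned (const expr *reg, const expr *in)
{
  assert (reg->kind == expr::REG);
  if (in->kind == expr::REG)
    return in->regno == reg->regno;
  return reg_mentioned (reg, in->op0) || reg_mentioned (reg, in->op1);
}
```

With common = (plus r160 r224) and destination r224, the equality test
is false while the "mentioned" test is true, so only the latter notices
that storing into r224 would clobber part of the common expression.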

The following patch fixes this.

Bootstrap and regtest running on x86_64-unknown-linux-gnu, I verified
this fixes my observed ruby miscompile on i586-linux.

Ok for trunk?

Thanks,
Richard.

2016-02-10  Richard Biener  

PR rtl-optimization/69291
* ifcvt.c (noce_try_store_flag_constants): Properly handle
common expressions.

Index: gcc/ifcvt.c
===
--- gcc/ifcvt.c (revision 233262)
+++ gcc/ifcvt.c (working copy)
@@ -1381,10 +1381,11 @@ noce_try_store_flag_constants (struct no
 
   /* If we have x := test ? x + 3 : x + 4 then move the original
 x out of the way while we store flags.  */
-  if (common && rtx_equal_p (common, if_info->x))
+  if (common && reg_mentioned_p (if_info->x, common))
{
- common = gen_reg_rtx (mode);
- noce_emit_move_insn (common, if_info->x);
+ rtx tem = gen_reg_rtx (mode);
+ noce_emit_move_insn (tem, common);
+ common = tem;
}
 
   target = noce_emit_store_flag (if_info, if_info->x, reversep, normalize);


Re: [PATCH] s390: Add -fsplit-stack support

2016-02-10 Thread Marcin Kościelnicki

On 04/02/16 13:44, Marcin Kościelnicki wrote:

On 03/02/16 18:27, Ulrich Weigand wrote:

Marcin Kościelnicki wrote:


libgcc/ChangeLog:

* config.host: Use t-stack and t-stack-s390 for s390*-*-linux.
* config/s390/morestack.S: New file.
* config/s390/t-stack-s390: New file.
* generic-morestack.c (__splitstack_find): Add s390-specific code.

gcc/ChangeLog:

* common/config/s390/s390-common.c (s390_supports_split_stack):
New function.
(TARGET_SUPPORTS_SPLIT_STACK): New macro.
* config/s390/s390-protos.h: Add s390_expand_split_stack_prologue.
* config/s390/s390.c (struct machine_function): New field
split_stack_varargs_pointer.
(s390_register_info): Mark r12 as clobbered if it'll be used as temp
in s390_emit_prologue.
(s390_emit_prologue): Use r12 as temp if r1 is taken by split-stack
vararg pointer.
(morestack_ref): New global.
(SPLIT_STACK_AVAILABLE): New macro.
(s390_expand_split_stack_prologue): New function.
(s390_live_on_entry): New function.
(s390_va_start): Use split-stack vararg pointer if appropriate.
(s390_asm_file_end): Emit the split-stack note sections.
(TARGET_EXTRA_LIVE_ON_ENTRY): New macro.
* config/s390/s390.md (UNSPEC_STACK_CHECK): New unspec.
(UNSPECV_SPLIT_STACK_CALL): New unspec.
(UNSPECV_SPLIT_STACK_DATA): New unspec.
(split_stack_prologue): New expand.
(split_stack_space_check): New expand.
(split_stack_data): New insn.
(split_stack_call): New expand.
(split_stack_call_*): New insn.
(split_stack_cond_call): New expand.
(split_stack_cond_call_*): New insn.
---
Changes applied.  Testsuite still running, still works on my simple
tests.

As for common code prerequisites: #3 is no longer needed, and very likely
so is #4 (it fixes problems that I've only seen with ESA mode, and the
testsuite runs just fine without it now).


OK, I see.  The patch is OK for mainline then, assuming testing passes.


Well, testing passes (as in, is no worse than x86 - the testsuite
doesn't really agree with -fsplit-stack in a few places involving
backtraces).  However, there's still the libgo issue to be taken care
of.  For my tests, I patched it up with:
[...]


I see the libgo patch has landed today.  Can we get this pushed?

Marcin Kościelnicki




[PATCH] PR plugins/69758: add params.list to PLUGIN_HEADERS

2016-02-10 Thread David Malcolm
params.h is listed in PLUGIN_HEADERS.  As of r227566 params.h
#includes params.list, but the latter is not in PLUGIN_HEADERS,
leading to compilation failure for plugins that include params.h
e.g. for gcc-python-plugin:

  In file included from gcc-cfg.c:40:0:
  
  /install-dogfood/lib/gcc/x86_64-pc-linux-gnu/6.0.0/plugin/include/params.h:87:23: fatal error: params.list: No such file or directory
   #include "params.list"

The following patch fixes it in the obvious way, by adding
params.list to PLUGIN_HEADERS so that it gets installed.

Successfully bootstrapped on x86_64-pc-linux-gnu.
Verified via "make install" and then verifying the build of
the affected files in gcc-python-plugin.

OK for trunk?

gcc/ChangeLog:
PR plugins/69758
* Makefile.in (PLUGIN_HEADERS): Add params.list.
---
 gcc/Makefile.in | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index aa3c018..6c15830 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -3320,7 +3320,8 @@ PLUGIN_HEADERS = $(TREE_H) $(CONFIG_H) $(SYSTEM_H) coretypes.h $(TM_H) \
   tree-ssa-loop-niter.h tree-ssa-ter.h tree-ssa-threadedge.h \
   tree-ssa-threadupdate.h inchash.h wide-int.h signop.h hash-map.h \
   hash-set.h dominance.h cfg.h cfgrtl.h cfganal.h cfgbuild.h cfgcleanup.h \
-  lcm.h cfgloopmanip.h builtins.def chkp-builtins.def pass-instances.def
+  lcm.h cfgloopmanip.h builtins.def chkp-builtins.def pass-instances.def \
+  params.list
 
 # generate the 'build fragment' b-header-vars
 s-header-vars: Makefile
-- 
1.8.5.3



Re: [PATCH] PR plugins/69758: add params.list to PLUGIN_HEADERS

2016-02-10 Thread Bernd Schmidt

On 02/11/2016 04:29 AM, David Malcolm wrote:

gcc/ChangeLog:
PR plugins/69758
* Makefile.in (PLUGIN_HEADERS): Add params.list.


Ok.


Bernd



Re: AW: Wonly-top-basic-asm

2016-02-10 Thread David Wohlferd
Since no one expressed any objections, I have renamed the option from 
-Wonly-top-basic-asm to -Wbasic-asm-in-function.  This more clearly 
conveys what the option does (give a warning if you find basic asm in a 
function).
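
As a rough illustration of what the proposed option distinguishes, here
is a small sketch. The function names are made up for this example, and
the "nop" mnemonic is an assumption about the target assembler; the
comments describe the behavior proposed in this patch, not that of any
released compiler.

```cpp
#include <cassert>

// Basic asm inside a function: no operands and no clobbers.  This is
// the form the proposed -Wbasic-asm-in-function would diagnose.
void
basic_nop ()
{
  asm ("nop");
}

// An empty basic asm is explicitly exempted by the patch.
void
basic_empty ()
{
  asm ("");
}

// Extended asm: the "+r" operand tells the compiler that X is both
// read and written, so no register is changed behind its back.
int
pass_through (int x)
{
  asm volatile ("" : "+r" (x));
  return x;
}

// Exercise the three forms above.
void
run_examples ()
{
  basic_nop ();
  basic_empty ();
  assert (pass_through (42) == 42);
}
```

Only the first function uses the form the warning is aimed at; the
extended form in pass_through is what the documentation change
recommends in most cases.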


I believe the attached patch addresses all the other outstanding comments.

ChangeLog:
2016-02-10  David Wohlferd  

* doc/extend.texi: Doc basic asm behavior and new
-Wbasic-asm-in-function option.
* doc/invoke.texi: Doc new -Wbasic-asm-in-function option.
* c-family/c.opt: Define -Wbasic-asm-in-function.
* c/c-parser.c: Implement -Wbasic-asm-in-function for C.
* cp/parser.c: Implement -Wbasic-asm-in-function for c++.
* testsuite/c-c++-common/Wbasic-asm-in-function.c: New tests for
-Wbasic-asm-in-function.
* testsuite/c-c++-common/Wbasic-asm-in-function-2.c: Ditto.

While I have a release on file with FSF, I don't have write access to SVN.

dw
Index: gcc/c-family/c.opt
===
--- gcc/c-family/c.opt	(revision 233308)
+++ gcc/c-family/c.opt	(working copy)
@@ -585,6 +585,10 @@
 C++ ObjC++ Var(warn_namespaces) Warning
 Warn on namespace definition.
 
+Wbasic-asm-in-function
+C ObjC ObjC++ C++ Var(warn_basic_asm_in_function) Warning
+Warn on unsafe uses of basic asm.
+
 Wsized-deallocation
 C++ ObjC++ Var(warn_sized_deallocation) Warning EnabledBy(Wextra)
 Warn about missing sized deallocation functions.
Index: gcc/c/c-parser.c
===
--- gcc/c/c-parser.c	(revision 233308)
+++ gcc/c/c-parser.c	(working copy)
@@ -5972,7 +5972,18 @@
   labels = NULL_TREE;
 
   if (c_parser_next_token_is (parser, CPP_CLOSE_PAREN) && !is_goto)
+  {
+/* Warn on basic asm used inside of functions,
+   EXCEPT when in naked functions.  Also allow asm (""). */
+if (warn_basic_asm_in_function && TREE_STRING_LENGTH (str) != 1)
+  if (lookup_attribute ("naked",
+			DECL_ATTRIBUTES (current_function_decl))
+	  == NULL_TREE)
+	warning_at (asm_loc, OPT_Wbasic_asm_in_function,
+		"asm statement in function does not use extended syntax");
+
 goto done_asm;
+  }
 
   /* Parse each colon-delimited section of operands.  */
   nsections = 3 + is_goto;
Index: gcc/cp/parser.c
===
--- gcc/cp/parser.c	(revision 233308)
+++ gcc/cp/parser.c	(working copy)
@@ -18041,6 +18041,8 @@
   bool goto_p = false;
   required_token missing = RT_NONE;
 
+  location_t asm_loc = cp_lexer_peek_token (parser->lexer)->location;
+
   /* Look for the `asm' keyword.  */
   cp_parser_require_keyword (parser, RID_ASM, RT_ASM);
 
@@ -18199,6 +18201,17 @@
 	  /* If the extended syntax was not used, mark the ASM_EXPR.  */
 	  if (!extended_p)
 	{
+	  /* Warn on basic asm used inside of functions,
+		 EXCEPT when in naked functions.  Also allow asm (""). */
+	  if (warn_basic_asm_in_function
+		  && TREE_STRING_LENGTH (string) != 1)
+		if (lookup_attribute ("naked",
+ DECL_ATTRIBUTES (current_function_decl))
+		== NULL_TREE)
+		  warning_at (asm_loc, OPT_Wbasic_asm_in_function,
+			  "asm statement in function does not use extended"
+			  " syntax");
+
 	  tree temp = asm_stmt;
 	  if (TREE_CODE (temp) == CLEANUP_POINT_EXPR)
 		temp = TREE_OPERAND (temp, 0);
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 233308)
+++ gcc/doc/extend.texi	(working copy)
@@ -7458,7 +7458,8 @@
 @end table
 
 @subsubheading Remarks
-Using extended @code{asm} typically produces smaller, safer, and more
+Using extended @code{asm} (@pxref{Extended Asm}) typically produces smaller,
+safer, and more
 efficient code, and in most cases it is a better solution than basic
 @code{asm}.  However, there are two situations where only basic @code{asm}
 can be used:
@@ -7516,11 +7517,51 @@
 Basic @code{asm} provides no
 mechanism to provide different assembler strings for different dialects.
 
-Here is an example of basic @code{asm} for i386:
+Basic @code{asm} statements do not perform an implicit "memory" clobber
+(@pxref{Clobbers}).  Also, there is no implicit clobbering of @emph{any}
+registers, so (other than in @code{naked} functions which follow the ABI
+rules) changed registers must be restored to their original value before
+exiting the @code{asm}.  While this behavior has not always been
+documented, GCC has worked this way since at least v2.95.3.
 
+@strong{Warning:} This "clobber nothing" behavior may be different than how
+other compilers treat basic @code{asm}, since the C standards for the
+@code{asm} statement provide no guidance regarding these semantics.  As a
+result, @code{asm} statements that work correctly on other compilers may not
+work correctly with GCC (and vice versa), even though they both compile
+without error.
+
+Future versions of GCC may change basic @code{asm} to clobber 

Re: [C++ Patch] PR 68726

2016-02-10 Thread Jason Merrill

OK.

Jason