date:20181130

[PATCH] Partial fix for asan on big endian targets (PR sanitizer/88289)

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 12:44:04PM +0100, Martin Liška wrote:
> Ok, I'm sending updated version of the patch. I factored out the shadow memory
> byte emission into a class, it's responsible for underlying flushing and 
> guarantees
> that stores are 4B aligned (when beginning of stack vars is properly aligned
> to ASAN_RED_ZONE_SIZE).
> 
> So far I tested the patch on x86_64-linux-gnu and ppc64le-linux-gnu machine.

The patch broke most of the asan tests on powerpc64-linux unfortunately,
sorry for not catching it during patch review.

The following patch fixes most of them, bootstrapped/regtested on
powerpc64-linux, committed to trunk as obvious.

I'm still seeing some regressions, ICEs like:
FAIL: c-c++-common/asan/clone-test-1.c   -O1  (internal compiler error)
FAIL: c-c++-common/asan/clone-test-1.c   -O1  (test for excess errors)
Excess errors:
during RTL pass: expand
/home/jakub/gcc2/gcc/testsuite/c-c++-common/asan/clone-test-1.c:25:5: internal 
compiler error: in asan_clear_shadow, at asan.c:1181
0x10c52833 asan_clear_shadow
../../gcc/asan.c:1181
0x10c618b7 asan_emit_stack_protection(rtx_def*, rtx_def*, unsigned int, long*, 
tree_node**, int)
../../gcc/asan.c:1676
0x10621fe7 expand_used_vars
../../gcc/cfgexpand.c:2277
0x1062708f execute
../../gcc/cfgexpand.c:6338
That is
  gcc_assert ((len & 3) == 0);
To be debugged later.

2018-12-01  Jakub Jelinek  

PR sanitizer/88289
* asan.c (asan_redzone_buffer::flush_redzone_payload): Fix up
an off-by-one for BYTES_BIG_ENDIAN.

--- gcc/asan.c.jj   2018-11-30 19:59:59.675789930 +0100
+++ gcc/asan.c  2018-11-30 23:19:55.336780260 +0100
@@ -1326,7 +1326,7 @@ asan_redzone_buffer::flush_redzone_paylo
   for (unsigned i = 0; i < RZ_BUFFER_SIZE; i++)
 {
   unsigned char v
-   = m_shadow_bytes[BYTES_BIG_ENDIAN ? RZ_BUFFER_SIZE - i : i];
+   = m_shadow_bytes[BYTES_BIG_ENDIAN ? RZ_BUFFER_SIZE - i - 1 : i];
   val |= (unsigned HOST_WIDE_INT)v << (BITS_PER_UNIT * i);
   if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, "%02x ", v);

Jakub

[doc, rfc] document __builtin_setjmp and __builtin_longjmp

2018-11-30 Thread Sandra Loosemore

I have written a new patch for PR 59039 to address more of the comments 
there, as well as my own complaints about the draft patch attached to 
the issue.  I'd like to get some feedback on this one before I commit it.


-Sandra
2018-11-30  Sandra Loosemore  

	PR c/59039

	gcc/
	* doc/extend.texi (Nonlocal gotos): New section.
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 266688)
+++ gcc/doc/extend.texi	(working copy)
@@ -27,6 +27,7 @@ extensions, accepted by GCC in C90 mode
 * Local Labels::Labels local to a block.
 * Labels as Values::Getting pointers to labels, and computed gotos.
 * Nested Functions::Nested function in GNU C.
+* Nonlocal Gotos::  Nonlocal gotos.
 * Constructing Calls::  Dispatching a call to another function.
 * Typeof::  @code{typeof}: referring to the type of an expression.
 * Conditionals::Omitting the middle operand of a @samp{?:} expression.
@@ -520,6 +521,61 @@ bar (int *array, int offset, int size)
 @}
 @end smallexample
 
+@node Nonlocal Gotos
+@section Nonlocal Gotos
+@cindex nonlocal gotos
+
+GCC provides the built-in functions @code{__builtin_setjmp} and
+@code{__builtin_longjmp} which are similar to, but not interchangeable
+with, the C library functions @code{setjmp} and @code{longjmp}.  
+The built-in versions are used internally by GCC's libraries
+to implement exception handling on some targets.  You should use the 
+standard C library functions declared in @code{} in user code
+instead of the builtins.
+
+@code{__builtin_setjmp} and @code{__builtin_longjmp} use GCC's normal
+mechanisms to save and restore registers using the stack on function
+entry and exit.  The jump buffer argument @var{buf} holds only the
+information needed to restore the stack frame, rather than the entire 
+set of saved register values.  
+
+An important caveat is that GCC arranges to save and restore only
+those registers known to the specific architecture variant being
+compiled for.  This can make @code{__builtin_setjmp} and
+@code{__builtin_longjmp} more efficient than their library
+counterparts in some cases, but it can also cause incorrect and
+mysterious behavior when mixing with code that uses the full register
+set.
+
+You should declare the jump buffer argument @var{buf} to the
+built-in functions as:
+
+@smallexample
+#include 
+intptr_t @var{buf}[5];
+@end smallexample
+
+@deftypefn {Built-in Function} {int} __builtin_setjmp (intptr_t *@var{buf})
+This function saves the current stack context in @var{buf}.  
+@code{__builtin_setjmp} returns 0 when returning directly,
+and 1 when returning from @code{__builtin_longjmp} using the same
+@var{buf}.
+@end deftypefn
+
+@deftypefn {Built-in Function} {void} __builtin_longjmp (intptr_t *@var{buf}, int @var{val})
+This function restores the stack context in @var{buf}, 
+saved by a previous call to @code{__builtin_setjmp}.  After
+@code{__builtin_longjmp} is finished, the program resumes execution as
+if the matching @code{__builtin_setjmp} returns the value @var{val},
+which must be 1.
+
+Because @code{__builtin_longjmp} depends on the function return
+mechanism to restore the stack context, it cannot be called
+from the same function calling @code{__builtin_setjmp} to
+initialize @var{buf}.  It can only be called from a function called
+(directly or indirectly) from the function calling @code{__builtin_setjmp}.
+@end deftypefn
+
 @node Constructing Calls
 @section Constructing Function Calls
 @cindex constructing calls

Re: [PATCH] Add missing noexpect causes in tuple for move functions

2018-11-30 Thread nick

On 2018-11-30 6:12 p.m., Ville Voutilainen wrote:
> On Sat, 1 Dec 2018 at 01:05, Nicholas Krause  wrote:
>>
>> This adds the remainging noexcept causes required for this cause
>> to meet the spec as dicussed last year and documented here:
>> http://cplusplus.github.io/LWG/lwg-active.html#2899.
> 
> I don't see how this change is sufficient; the noexcept-specs need to
> be added to tuple's
> special member functions, not just to _Tuple_impl, and your suggested
> patch contains no
> tests.
> 

It was tested I just didn't mention that as it was assumed, that's my mistake 
and
sorry for that. This was more just to make sure that this is fine. If you would
prefer I send a patch  cleaning it up for all the classes i.e. tuple, and it's 
version's that's fine. I just want to ask do you want a patch or a series will
each patch touching one of the tuple clases as I assume your the maintainer.

Cheers,

Nick

Re: [RS6000] num_insns_constant ICE

2018-11-30 Thread Segher Boessenkool

On Fri, Nov 30, 2018 at 06:16:40PM +1030, Alan Modra wrote:
> On Thu, Nov 29, 2018 at 01:52:34PM -0600, Segher Boessenkool wrote:
> > On Sun, Nov 25, 2018 at 10:49:12PM +1030, Alan Modra wrote:
> > > +  while (nregs-- > 0)
> > > +{
> > > +  int ins = num_insns_constant_gpr (value);
> > 
> > You probably should mask "value" here so that it is only one GPR.
> > Alternatively, make num_insns_constant_gpr handle that.  Consider what
> > happens for a 64-bit number with 32-bit registers here?
> 
> Oh, right, the low part will always give an answer of 2 if the high
> parts isn't merely a sign extension.
> 
> > > + val = l[WORDS_BIG_ENDIAN ? 0 : 1] << 32;
> > 
> > This doesn't work, as in the other patch: long can be 32 bit.
> 
> Huh, I originally had "high" and "low" HOST_WIDE_INTs which avoided
> this sort of problem on 32-bit hosts.
> 
> Revised patch follows.  Bootstrapped and regression tested
> powerpc64le-linux.  Hopefully this has removed all the stupidity.

I think it is fine like this, or certainly an improvement :-)  Okay for
trunk.  Thanks!


Segher


>   * config/rs6000/rs6000.c (num_insns_constant_gpr): Renamed from
>   num_insns_constant_wide.  Make static.  Revise comment.
>   (num_insns_constant_multi): New function.
>   (num_insns_constant): Formatting.  Correct CONST_WIDE_INT
>   calculation.  Simplify and extract code common to both
>   CONST_INT and CONST_DOUBLE.  Add gcc_unreachable for unhandled
>   const_double modes.
>   * config/rs6000/rs6000-protos.h (num_insns_const_wide): Delete.

Re: [PATCH] Improve combiner's find_split_point (PR target/54589)

2018-11-30 Thread Segher Boessenkool

On Thu, Nov 29, 2018 at 04:36:01PM -0600, Segher Boessenkool wrote:
> Hi Jakub,
> 
> On Thu, Nov 29, 2018 at 10:49:21PM +0100, Jakub Jelinek wrote:
> > The following patch attempts to improve find_split_point inside of
> > complex MEM addresses, if the target supports REG + REG + const
> > addressing, but doesn't support more_complex_rtx + REG + const,
> > try to split it at more_complex_rtx rather than more_complex_rtx + REG.
> 
> > 2018-11-29  Jakub Jelinek  
> > 
> > PR target/54589
> > * combine.c (find_split_point): For invalid memory address
> > nonobj + obj + const, if reg + obj + const is valid addressing
> > mode, split at nonobj.  Use if rather than else if for the
> > fallback.  Comment fixes.
> > 
> > * gcc.target/i386/pr54589.c: New test.
> 
> That looks good, but let me try it on some bigger builds first.

Whoops, forgot to get back to you.  I tested it, and it looks fine, it
optimises the code quite often :-)  Please commit it to trunk.

Thanks,


Segher

Re: [PATCH] Added information about inline assembler in stack calculations (.su files)

2018-11-30 Thread Jeff Law

On 11/26/18 7:02 AM, Torbjorn SVENSSON wrote:
> Hi,
> 
> Attached is a small patch that, in case of inline assembler code, 
> indicates that the function stack usage is uncertain due to inline 
> assembler.
> 
> The test suite are using "nop" as an assembler instruction on all 
> targets, is this acceptable or is there a better way to test this?
> 
> Patch has been tested on gcc-arm-none-eabi-7-2018-q2-update for both 
> arm-none-eabi and x86_64-linux-gnu and SVN head (r266454) for 
> x86_64-linux-gnu.
One could argue that allocating stack space inside an ASM is a really
bad idea.  Consider things like dwarf debugging and unwind tables.  If
you're allocating stack inside an ASM that stuff is going to be totally
wrong.

So I think my question before moving forward with something like this is
whether or not it makes sense at all to bother dumping data for a
scenario that we'd probably suggest developers avoid to begin with.

jeff

Re: [PATCH] minor FDO profile related fixes

2018-11-30 Thread Jeff Law

On 11/9/18 7:12 PM, Indu Bhagat wrote:
> Changelog entry attached. Sorry about that.
> 
> Comments inline.
> 
> Thanks
> 
> On 11/09/2018 04:23 PM, Jeff Law wrote:
>> On 11/7/18 5:49 PM, Indu Bhagat wrote:
>>> diff --git a/gcc/coverage.c b/gcc/coverage.c
>>> index 599a3bb..7595e6c 100644
>>> --- a/gcc/coverage.c
>>> +++ b/gcc/coverage.c
>>> @@ -358,7 +358,7 @@ get_coverage_counts (unsigned counter, unsigned 
>>> cfg_checksum,
>>>
>>> if (warning_printed && dump_enabled_p ())
>>>   {
>>> dump_user_location_t loc
>>> -= dump_user_location_t::from_location_t (input_location);
>>> += dump_user_location_t::from_function_decl (current_function_decl);
>>>
>>> dump_printf_loc (MSG_MISSED_OPTIMIZATION, loc,
>>>  "use -Wno-error=coverage-mismatch to tolerate "
>>>
>>>  "the mismatch but performance may drop if the "
>>>
>> Patches should include a ChangeLog entry.  Because your patch didn't
>> include one, it's unclear if this hunk is something you really intended
>> to submit or not.  What's the point behind going from using
>> input_location to the function decl?
>>
> This is intended to be a part of the patch. The patch intends to make 
> opt-info and dumps related to ipa-profile more precise.
> 
> Using from_function_decl gives the precise location of the function. 
> from_location_t gives the same incorrect location for each function
> 
> with the coverage-mismatch issue in a compilation unit (location appears to 
> be the end of file)
OK.  And more generally we're trying to move away from input_location.
If associating the diagnostic with the function makes sense, then this
hunk is clearly OK.


>>> @@ -1317,12 +1327,12 @@ branch_prob (void)
>>> values.release ();
>>> free_edge_list (el);
>>> coverage_end_function (lineno_checksum, cfg_checksum);
>>> -  if (flag_branch_probabilities && profile_info)
>>> +  if (flag_branch_probabilities
>>> +  && (profile_status_for_fn (cfun) == PROFILE_READ))
>> So what's the point of this change?  Under what circumstances do the two
>> conditionals do something different.  I don't know this code well, but
>> ISTM that if profile_info is nonzero, then the profile status is going
>> to be PROFILE_READ.  Is that not the case?
> 
> On Trunk, there is a direct relationship between the three :
> 
> 1. (.gcda file) is present  --implies--> 2. profile_info set
> --implies--> 3. profile_status set to PROFILE_READ
Right.   That's the basic assumption I was working from.

> 
> But for functions added between profile-generate and profile-use, 3.
> should not be done based on 2.
> For such functions, profile_info may be set (because there is a .gcda
> file), but profile_status should not be set
> to PROFILE_READ, because such functions have no feedback data.
Ah, I see what you're getting at here.  THe sources have changed since
we gathered the profiling data.  In the case where new functions were
added to the source, we won't have profiling data for them.  So we
should be checking for profiling data attached to a function.  Got it.


> 
> 
> Changelog.precise-ipa-dump-optinfo.patch.ver1
> 
> gcc/ChangeLog:
> 
> 2018-11-07  "Indu Bhagat"  <"indu.bha...@oracle.com">
> 
>   * coverage.c (get_coverage_counts): Use from_function_decl for precise
>   function location.
>   * profile-count.c (profile_count::dump): Add handling for precise
>   profile quality.
>   * profile.c (compute_branch_probabilities): Rely on exec_counts instead
>   of profile_info to set x_profile_status of function.
>   (branch_prob): Do not set x_profile_status of function based on
>   profile_info. Done above based on exec_counts.
> 
I'll fix up the trivial extraneous curley braces nit and commit this
momentarily.

THanks,
Jeff

Re: [PATCH v5] Repeat jump threading after combine

2018-11-30 Thread Jeff Law

On 11/28/18 11:57 PM, Ilya Leoshkevich wrote:
> Repost of v4: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02067.html
> 
> Bootstrapped and regtested on x86_64-redhat-linux, s390x-redhat-linux
> and ppc64le-redhat-linux.
> 
> Consider the following RTL:
> 
> (insn (set (reg 65) (if_then_else (eq %cc 0) 1 0)))
> (insn (parallel [(set %cc (compare (reg 65) 0)) (clobber %scratch)]))
> (jump_insn (set %pc (if_then_else (ne %cc 0) (label_ref 23) %pc)))
> 
> Combine simplifies this into:
> 
> (note NOTE_INSN_DELETED)
> (note NOTE_INSN_DELETED)
> (jump_insn (set %pc (if_then_else (eq %cc 0) (label_ref 23) %pc)))
> 
> opening up the possibility to perform jump threading.
> 
> gcc/ChangeLog:
> 
> 2018-09-19  Ilya Leoshkevich  
> 
>   PR target/80080
>   * cfgcleanup.c (class pass_postreload_jump): New pass.
>   (pass_postreload_jump::execute): Likewise.
>   (make_pass_postreload_jump): Likewise.
>   * passes.def: Add pass_postreload_jump before
>   pass_postreload_cse.
>   * tree-pass.h (make_pass_postreload_jump): New pass.
OK
jeff

Re: [PATCH] Avoid weird integral types in reassoc range opts (PR tree-optimization/88274)

2018-11-30 Thread Jeff Law

On 11/30/18 1:44 PM, Jakub Jelinek wrote:
> Hi!
> 
> The following patch makes sure that we don't use weird integral types
> (either ones which have smaller precision than corresponding mode's
> precision, or ones like C++ -fstrict-enum enumeral types which have
> normal precision, but smaller TYPE_MAX_VALUE or larger TYPE_MIN_VALUE),
> because the constants could be outside of the range of those types.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2018-11-30  Jakub Jelinek  
> 
>   PR tree-optimization/88274
>   * tree-ssa-reassoc.c (optimize_range_tests_xor,
>   optimize_range_tests_diff): If type has smaller precision than
>   corresponding mode or if it has non-standard min/max, compute
>   everything in a standard type for the precision.
OK
jeff

Re: [PATCH] Comparison with an enum should mention enum value.

2018-11-30 Thread Jeff Law

On 11/30/18 2:49 AM, Martin Liška wrote:
> Hi.
> 
> This is one more clean up of the transition that I've just tested
> on x86_64-linux-gnu.
> 
> I'm planning to install it once I'll come back from vacation (27/12).
> 
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-11-29  Martin Liska  
> 
>   * builtins.c (expand_movstr): Compare with RETURN_BEGIN.
>   * expr.c (move_by_pieces): Likewise.
>   (store_by_pieces): Likewise.
>   (store_expr): Fix GNU coding style.
OK
jeff

Re: [PATCH] Introduce --param logical-op-non-short-circuit (PR testsuite/85368)

2018-11-30 Thread Jeff Law

On 11/30/18 1:50 PM, Jakub Jelinek wrote:
> Hi!
> 
> Until we stop depending on BRANCH_COST and LOGICAL_OP_NON_SHORT_CIRCUIT
> macros at least for early GIMPLE, I'm afraid the current state for the
> testsuite is terrible, on some targets it is enough to use
> -mbranch-cost={1,2} to pick either of the setting, but other targets,
> while they implement -mbranch-cost=, redefine LOGICAL_OP_NON_SHORT_CIRCUIT,
> so it ignores BRANCH_COST altogether, or sometimes, other targets don't
> implement -mbranch-cost= and have various complex definitions of BRANCH_COST
> or LOGICAL_OP_NON_SHORT_CIRCUIT depending on other command line options.
> 
> The following patch introduces a new param (not an option, to make it
> clearer it is intended primarily for the testsuite and we can more easily
> remove it again) that overrides the target's LOGICAL_OP_NON_SHORT_CIRCUIT
> and adjusts the testsuite to use it.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, Christophe has kindly
> tested it on ARM.  Ok for trunk?
> 
> 2018-11-30  Jakub Jelinek  
> 
>   PR testsuite/85368
>   * params.def (PARAM_LOGICAL_OP_NON_SHORT_CIRCUIT): New param.
>   * tree-ssa-ifcombine.c (ifcombine_ifandif): If
>   --param logical-op-non-short-circuit is present, override
>   LOGICAL_OP_NON_SHORT_CIRCUIT value from the param.
>   * fold-const.c (fold_range_test, fold_truth_andor): Likewise.
> 
>   * lib/target-supports.exp (logical_op_short_circuit): Remove.
>   * gcc.dg/builtin-bswap-7.c: Remove logical_op_short_circuit
>   effective target, drop -mbranch-cost= options from the test and
>   instead pass --param logical-op-non-short-circuit=0 or
>   --param logical-op-non-short-circuit=1 depending on what the
>   tests meant to test.
>   * gcc.dg/pr21643.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: Likewise.
>   * gcc.dg/tree-ssa/phi-opt-11.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-dom-thread-4.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-thread-14.c: Likewise.
>   * gcc.dg/tree-ssa/vrp47.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-dom-thread-11.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-dom-thread-16.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-dom-thread-14.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: Likewise.
>   * gcc.dg/tree-ssa/vrp87.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: Likewise.
>   * gcc.dg/tree-ssa/phi-opt-2.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-13.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-thread-11.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: Likewise.
>   * gcc.dg/tree-ssa/forwprop-28.c: Likewise.
>   * gcc.dg/binop-xor1.c: Likewise.
>   * gcc.dg/pr46309.c: Likewise.
>   * gcc.dg/tree-ssa/ssa-dom-thread-18.c: New test.
>   * gcc.dg/tree-ssa/reassoc-32.c: Add
>   --param logical-op-non-short-circuit=1 to dg-options.
>   * gcc.dg/tree-ssa/reassoc-33.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-34.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-35.c: Likewise.
>   * gcc.dg/tree-ssa/reassoc-36.c: Likewise.
OK.

Jeff

[committed] Fix style issue pointed out by Bernhard in my recent change

2018-11-30 Thread Jeff Law


Bernhard pointed out that we should be using


machine_mode 

rather than

enum machine_mode 

in one of my patches.  I guess old habits die hard.

Anyway, this trivial patch fixes the nit.  I'll try to remember to start
omitting the "enum" "class" and "struct" thingies more appropriately :-)



jeff
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6c57ed0027e..0bf512d35a9 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2018-11-30  Jeff Law  
+
+   * optabs.c (expand_binop): Use "machine_mode" rather than
+   "enum machine mode" in most recent change.
+
 2018-11-30  Wilco Dijkstra  
 
PR middle-end/64242
diff --git a/gcc/optabs.c b/gcc/optabs.c
index 130b1182ef0..1f87e428816 100644
--- a/gcc/optabs.c
+++ b/gcc/optabs.c
@@ -1377,8 +1377,8 @@ expand_binop (machine_mode mode, optab binoptab, rtx op0, 
rtx op1,
   start_sequence ();
 
   /* Do the actual arithmetic.  */
-  enum machine_mode op0_mode = GET_MODE (op0);
-  enum machine_mode op1_mode = GET_MODE (op1);
+  machine_mode op0_mode = GET_MODE (op0);
+  machine_mode op1_mode = GET_MODE (op1);
   if (op0_mode == VOIDmode)
op0_mode = int_mode;
   if (op1_mode == VOIDmode)

Re: [PATCH] Add missing noexpect causes in tuple for move functions

2018-11-30 Thread Ville Voutilainen

On Sat, 1 Dec 2018 at 01:05, Nicholas Krause  wrote:
>
> This adds the remainging noexcept causes required for this cause
> to meet the spec as dicussed last year and documented here:
> http://cplusplus.github.io/LWG/lwg-active.html#2899.

I don't see how this change is sufficient; the noexcept-specs need to
be added to tuple's
special member functions, not just to _Tuple_impl, and your suggested
patch contains no
tests.

Re: C++ PATCH to implement C++20 P0634R3, Down with typename!

2018-11-30 Thread Marek Polacek

On Wed, Nov 28, 2018 at 05:09:23PM -0500, Jason Merrill wrote:
> On 11/12/18 10:27 AM, Marek Polacek wrote:
> > This patch implements C++20 P0634R3, Down with typename!
> > 
> > which makes 'typename' optional in several contexts specified in [temp.res].
> > 
> > The gist of the patch is in cp_parser_simple_type_specifier, where, if the
> > context makes typename optional and the id is qualified, we pretend we've
> > seen the typename keyword.
> 
> Sounds good.

\o/

> > There's quite a lot of churn because we need to be careful where we want
> > to make typename optional, and e.g. a flag in cp_parser would be too global.
> 
> Did you consider adding a cp_parser_flags parameter rather than a specific
> bool?  I don't feel strongly about it, but it would simplify things the next
> time we want to do something like this.

I think I did but ran out of time.  I've changed the boolean parameters to
cp_parser_flags in this patch.

> > The resolve_typename_type hunk was to make typename9.C work with -fconcepts.
> 
> Makes sense.
> 
> > +   const bool typename_optional_p = (cxx_dialect >= cxx2a);
> 
> I'm uncomfortable with repeating this in lots of places in the parser; I'd
> prefer to have one place where we control whether the feature is enabled or
> not.  This seems like the right place:
> > + bool typename_p = (flags & CP_PARSER_FLAGS_TYPENAME_OPTIONAL);
> > + type = cp_parser_type_name (parser, (qualified_p && typename_p));

Indeed, this was a nice cleanup.

> > I'm not sure about some of the bits in typename5.C, not quite sure if the
> > code is valid, but I didn't have time to investigate deeply and it seems
> > pretty obscure anyway.  There are preexisting cases when g++ and clang++
> > disagree.
> 
> > +  // ??? Not sure if these are (in)valid.
> 
> I think these are all valid, since the qualified-ids all appear as a
> decl-specifier of a member-declaration.

Ok, I found a bug in the previous version of this patch: when we saw the
template keyword, we didn't call cp_parser_type_name in
cp_parser_simple_type_specifier, as a result, typename was needed in
a member-declaration even though it now shouldn't.

Fixed by calling cp_parser_make_typename_type.

> > +  typedef typename A::template N a1;
> > +  typedef typename A::template N a2;
> > +  typename A::template N a3;
> > +  typename A::template N a4;
> > +  typedef A::N a5; // { dg-error "not a type|invalid" }
> > +  typedef A::N a6; // { dg-error "not a type|invalid" }
> 
> The ::template shouldn't be required here anymore: [temp.names] "In a
> qualified-id used as the name in a typename-specifier (12.7),
> elaborated-type-specifier (9.1.7.3), using-declaration (9.8), or
> class-or-decltype (10.6), an optional keyword template appearing at the top
> level is ignored. In these contexts, a < token is always assumed to
> introduce a template-argument-list."
> 
> This is C++17 DR 1710.  You don't need to implement it in this patch, but
> let's not test for the wrong behavior.  :)

Thanks for the pointer to that!  With ::template it now passes, as shown in
typename5.C.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2018-11-30  Marek Polacek  

Implement P0634R3, Down with typename!
* parser.c (CP_PARSER_FLAGS_TYPENAME_OPTIONAL): New enumerator.
(cp_parser_type_name): Remove declaration.
(cp_parser_postfix_expression): Pass CP_PARSER_FLAGS_TYPENAME_OPTIONAL
to cp_parser_type_id.
(cp_parser_new_type_id): Pass CP_PARSER_FLAGS_TYPENAME_OPTIONAL to
cp_parser_type_specifier_seq.
(cp_parser_lambda_declarator_opt): Pass
CP_PARSER_FLAGS_TYPENAME_OPTIONAL to
cp_parser_parameter_declaration_clause.
(cp_parser_condition): Pass CP_PARSER_FLAGS_NONE to
cp_parser_declarator.
(cp_parser_simple_declaration): Pass CP_PARSER_FLAGS_NONE to
cp_parser_init_declarator.
(cp_parser_conversion_type_id): Pass CP_PARSER_FLAGS_NONE to
cp_parser_type_specifier_seq.
(cp_parser_default_type_template_argument): Pass
CP_PARSER_FLAGS_TYPENAME_OPTIONAL to cp_parser_type_id.
(cp_parser_template_parameter): Pass CP_PARSER_FLAGS_TYPENAME_OPTIONAL
to cp_parser_parameter_declaration.
(cp_parser_explicit_instantiation): Pass CP_PARSER_FLAGS_NONE to
cp_parser_declarator.
(cp_parser_simple_type_specifier): Adjust call to cp_parser_type_name
to relay if we should treat the typename keyword as optional.  Maybe
call cp_parser_make_typename_type is parsing a template-id and it's
not a TYPE_DECL.
(cp_parser_type_name): Remove unused function.
(cp_parser_enum_specifier): Pass to CP_PARSER_FLAGS_NONE
cp_parser_type_specifier_seq.
(cp_parser_alias_declaration): Pass CP_PARSER_FLAGS_TYPENAME_OPTIONAL
to cp_parser_type_id.
(cp_parser_init_declarator): New parameter.  Pass i

Re: [PATCH] Fix PR64242

2018-11-30 Thread Jeff Law

On 11/29/18 12:26 PM, Wilco Dijkstra wrote:
> Fix PR64242 - the middle end expansion for long jmp updates the
> hard frame pointer before it reads the new stack pointer.  This
> results in a corrupted stack pointer if the jmp buffer is a local.
> The obvious fix is to insert a temporary.
> 
> AArch64 bootstrap & regress pass. OK to commit?
> 
> ChangeLog:
> 2018-11-29  Wilco Dijkstra  
> 
> gcc/
>   PR middle-end/64242
>   * builtins.c (expand_builtin_longjmp): Use a temporary when restoring
>   the frame pointer.
>   (expand_builtin_nonlocal_goto): Likewise.
> 
> testsuite/
>   PR middle-end/64242
>   * gcc.c-torture/execute/pr64242.c: New test.
THanks for tracking this down.  I'd like to have this run through my
next testing cycle, so I went ahead and installed  it for you.

Thanks again,
Jeff

[PATCH] Add missing noexpect causes in tuple for move functions

2018-11-30 Thread Nicholas Krause

This adds the remainging noexcept causes required for this cause
to meet the spec as dicussed last year and documented here:
http://cplusplus.github.io/LWG/lwg-active.html#2899.

Signed-off-by: Nicholas Krause 
---
 libstdc++-v3/include/std/tuple | 4 
 1 file changed, 4 insertions(+)

diff --git a/libstdc++-v3/include/std/tuple b/libstdc++-v3/include/std/tuple
index 56b97c25eed..d17512a1b7e 100644
--- a/libstdc++-v3/include/std/tuple
+++ b/libstdc++-v3/include/std/tuple
@@ -214,6 +214,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
enable_if::type>
 explicit
 constexpr _Tuple_impl(_UHead&& __head, _UTail&&... __tail)
+noexcept(__and_,
+ is_nothrow_move_constructible<_Inherited>>::value)
: _Inherited(std::forward<_UTail>(__tail)...),
  _Base(std::forward<_UHead>(__head)) { }
 
@@ -237,6 +239,8 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 constexpr _Tuple_impl(_Tuple_impl<_Idx, _UHead, _UTails...>&& __in)
+noexcept(__and_,
+ is_nothrow_move_constructible<_Inherited>>::value)
: _Inherited(std::move
 (_Tuple_impl<_Idx, _UHead, _UTails...>::_M_tail(__in))),
  _Base(std::forward<_UHead>
-- 
2.17.1

[PATCH] v4: C/C++: add fix-it hints for missing '&' and '*' (PR c++/87850)

2018-11-30 Thread David Malcolm

On Tue, 2018-11-20 at 22:23 +, Joseph Myers wrote:
> On Tue, 20 Nov 2018, David Malcolm wrote:
> 
> > Should I do:
> 
> You should do whatever is appropriate for the warning in
> question.  But if 
> what's appropriate for the warning in question includes types that
> are 
> compatible but not the same, the comments need to avoid saying it's
> about 
> the types being the same. 

Thanks.

Here's an updated version of the patch.  I added a new
  compatible_types_for_indirection_note_p
specifically for use by maybe_emit_indirection_note, and provided
implementations for C and C++.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for trunk?

Changed in v4: introduce a compatible_types_for_indirection_note_p
function, and use it in place of same_type_p.

Changed in v3: use same_type_p rather than pointer equality.
Provide implementation of same_type_p for C.

Changed in v2: add a note, and put the fix-it hint on that instead

Blurb follows:

This patch adds a note with a fix-it hint to various
pointer-vs-non-pointer diagnostics, suggesting the addition of
a leading '&' or '*'.

For example, note the ampersand fix-it hint in the following:

demo.c: In function 'int main()':
demo.c:5:22: error: invalid conversion from 'pthread_key_t' {aka 'unsigned int'}
   to 'pthread_key_t*' {aka 'unsigned int*'} [-fpermissive]
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  |
  |  pthread_key_t {aka unsigned int}
demo.c:5:22: note: possible fix: take the address with '&'
5 |   pthread_key_create(key, NULL);
  |  ^~~
  |  &
In file included from demo.c:1:
/usr/include/pthread.h:1122:47: note:   initializing argument 1 of
   'int pthread_key_create(pthread_key_t*, void (*)(void*))'
 1122 | extern int pthread_key_create (pthread_key_t *__key,
  |~~~^

gcc/c-family/ChangeLog:
PR c++/87850
* c-common.c: Include "gcc-rich-location.h".
(maybe_emit_indirection_note): New function.
* c-common.h (maybe_emit_indirection_note): New decl.
(compatible_types_for_indirection_note_p): New decl.

gcc/c/ChangeLog:
PR c++/87850
* c-typeck.c (compatible_types_for_indirection_note_p): New
function.
(convert_for_assignment): Call maybe_emit_indirection_note for
pointer vs non-pointer diagnostics.

gcc/cp/ChangeLog:
PR c++/87850
* call.c (convert_like_real): Call
maybe_emit_indirection_note for "invalid conversion" diagnostic.
* typeck.c (compatible_types_for_indirection_note_p): New
function.

gcc/testsuite/ChangeLog:
PR c++/87850
* c-c++-common/indirection-fixits.c: New test.
---
 gcc/c-family/c-common.c |  35 +++
 gcc/c-family/c-common.h |   4 +
 gcc/c/c-typeck.c|  18 +-
 gcc/cp/call.c   |   2 +
 gcc/cp/typeck.c |   8 +
 gcc/testsuite/c-c++-common/indirection-fixits.c | 299 
 6 files changed, 364 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/indirection-fixits.c

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 9d51815..f235c9d 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -48,6 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "substring-locations.h"
 #include "spellcheck.h"
+#include "gcc-rich-location.h"
 #include "selftest.h"

 cpp_reader *parse_in;  /* Declared in c-pragma.h.  */
@@ -8420,6 +8421,40 @@ maybe_suggest_missing_token_insertion (rich_location 
*richloc,
 }
 }

+/* Potentially emit a note about likely missing '&' or '*',
+   depending on EXPR and EXPECTED_TYPE.  */
+
+void
+maybe_emit_indirection_note (location_t loc,
+tree expr, tree expected_type)
+{
+  gcc_assert (expr);
+  gcc_assert (expected_type);
+
+  tree actual_type = TREE_TYPE (expr);
+
+  /* Missing '&'.  */
+  if (TREE_CODE (expected_type) == POINTER_TYPE
+  && compatible_types_for_indirection_note_p (actual_type,
+ TREE_TYPE (expected_type))
+  && lvalue_p (expr))
+{
+  gcc_rich_location richloc (loc);
+  richloc.add_fixit_insert_before ("&");
+  inform (&richloc, "possible fix: take the address with %qs", "&");
+}
+
+  /* Missing '*'.  */
+  if (TREE_CODE (actual_type) == POINTER_TYPE
+  && compatible_types_for_indirection_note_p (TREE_TYPE (actual_type),
+ expected_type))
+{
+  gcc_rich_location richloc (loc);
+  richloc.add_fixit_insert_before ("*");
+  inform (&richloc, "possible fix: dereference with %qs", "*");
+}
+}
+
 #if CHECKING_P

 namespace selftest {

Re: [patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-11-30 Thread Joseph Myers

On Fri, 30 Nov 2018, Julian Brown wrote:

> ChangeLog
> 
> gcc/
> * tree-pretty-print.c (dump_omp_clause): Make default case
> gcc_unreachable.

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 09:35:45PM +, Julian Brown wrote:
> How's this? (Obvious, but re-tested anyway.)
> 
> Thanks,
> 
> Julian
> 
> ChangeLog
> 
> gcc/
> * tree-pretty-print.c (dump_omp_clause): Make default case
> gcc_unreachable.

Ok, thanks.

> diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
> index 99eca4a..0861cc9 100644
> --- a/gcc/tree-pretty-print.c
> +++ b/gcc/tree-pretty-print.c
> @@ -1180,9 +1180,7 @@ dump_omp_clause (pretty_printer *pp, tree clause, int 
> spc, dump_flags_t flags)
>break;
>  
>  default:
> -  /* Should never happen.  */
> -  dump_generic_node (pp, clause, spc, flags, false);
> -  break;
> +  gcc_unreachable ();
>  }
>  }
>  


Jakub

Patch to fix PR88179

2018-11-30 Thread Vladimir Makarov


  The following patch fixes

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88179

  The patch was successfully bootstrapped and tested on x86-64.

   Committed as rev. 266682.

Index: ChangeLog
===
--- ChangeLog	(revision 266681)
+++ ChangeLog	(working copy)
@@ -1,3 +1,9 @@
+2018-11-30  Vladimir Makarov  
+
+	PR rtl-optimization/88179
+	* lra-constraints.c (address_eliminator::address_eliminator):
+	Don't eleminate regs in illegitimate address.
+
 2018-11-30  David Malcolm  
 
 	PR preprocessor/88257
Index: lra-constraints.c
===
--- lra-constraints.c	(revision 266678)
+++ lra-constraints.c	(working copy)
@@ -359,14 +359,20 @@ address_eliminator::address_eliminator (
   if (m_base_loc != NULL)
 {
   m_base_reg = *m_base_loc;
-  lra_eliminate_reg_if_possible (m_base_loc);
+  /* If we have non-legitimate address which is decomposed not in
+	 the way we expected, don't do elimination here.  In such case
+	 the address will be reloaded and elimination will be done in
+	 reload insn finally.  */
+  if (REG_P (m_base_reg))
+	lra_eliminate_reg_if_possible (m_base_loc);
   if (m_ad->base_term2 != NULL)
 	*m_ad->base_term2 = *m_ad->base_term;
 }
   if (m_index_loc != NULL)
 {
   m_index_reg = *m_index_loc;
-  lra_eliminate_reg_if_possible (m_index_loc);
+  if (REG_P (m_index_reg))
+	lra_eliminate_reg_if_possible (m_index_loc);
 }
 }

Re: [patch,openacc] Fix infinite recursion in OMP clause pretty-printing, default label

2018-11-30 Thread Julian Brown

On Thu, 29 Nov 2018 21:25:33 +
Joseph Myers  wrote:

> On Thu, 29 Nov 2018, Julian Brown wrote:
> 
> > On Thu, 20 Sep 2018 10:08:51 -0700
> > Cesar Philippidis  wrote:
> >   
> > > Apparently, Tom ran into an ICE when we were adding support for
> > > new clauses back in the gomp-4_0-branch days.  This patch
> > > shouldn't be necessary because all of the clauses are fully
> > > implemented now, but it may prevent similar bugs from occurring
> > > in the future at least during development.
> > > 
> > > Is this patch OK for trunk? I bootstrapped and regtested it for
> > > x86_64 Linux with nvptx offloading.  
> > 
> > Joseph, could you take a look at this please?  
> 
> Lots of other places in the same function use gcc_unreachable ().  I
> think using gcc_unreachable () here as well would be more appropriate
> than special-casing this one place in this function to use "unknown".

How's this? (Obvious, but re-tested anyway.)

Thanks,

Julian

ChangeLog

gcc/
* tree-pretty-print.c (dump_omp_clause): Make default case
gcc_unreachable.
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index 99eca4a..0861cc9 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -1180,9 +1180,7 @@ dump_omp_clause (pretty_printer *pp, tree clause, int spc, dump_flags_t flags)
   break;
 
 default:
-  /* Should never happen.  */
-  dump_generic_node (pp, clause, spc, flags, false);
-  break;
+  gcc_unreachable ();
 }
 }

Re: [PATCH, libfortran] PR 88137 Initialize backtrace state once

2018-11-30 Thread Janne Blomqvist

On Fri, Nov 30, 2018 at 9:29 PM Janne Blomqvist 
wrote:

> On Fri, Nov 30, 2018 at 9:07 PM Thomas Schwinge 
> wrote:
>
>> Hi!
>>
>> On Thu, 22 Nov 2018 11:17:24 +0200, Janne Blomqvist <
>> blomqvist.ja...@gmail.com> wrote:
>> > From backtrace.h for backtrace_create_state:
>> >
>> >Calling this function allocates resources that can not be freed.
>> >There is no backtrace_free_state function.  The state is used to
>> >cache information that is expensive to recompute.  Programs are
>> >expected to call this function at most once and to save the return
>> >value for all later calls to backtrace functions.
>> >
>> > So instead of calling backtrace_create_state every time we wish to
>> > show a backtrace, do it once and store the result in a static
>> > variable.  libbacktrace allows multiple threads to access the state,
>> > so no need to use TLS.
>>
>> Hmm, OK, but...
>>
>> > --- a/libgfortran/runtime/backtrace.c
>> > +++ b/libgfortran/runtime/backtrace.c
>> > @@ -146,11 +146,15 @@ full_callback (void *data, uintptr_t pc, const
>> char *filename,
>> >  void
>> >  show_backtrace (bool in_signal_handler)
>> >  {
>> > -  struct backtrace_state *lbstate;
>> > +  /* Note that libbacktrace allows the state to be accessed from
>> > + multiple threads, so we don't need to use a TLS variable for the
>> > + state here.  */
>> > +  static struct backtrace_state *lbstate;
>> >struct mystate state = { 0, false, in_signal_handler };
>> > -
>> > -  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
>> > - error_callback, NULL);
>> > +
>> > +  if (!lbstate)
>> > +lbstate = backtrace_create_state (NULL, __gthread_active_p (),
>> > +   error_callback, NULL);
>>
>> ... don't you still have to make sure that only one thread ever executes
>> "backtrace_create_state", and writes into "lbstate" (via locking, or
>> atomics, or "pthread_once", or some such)?  Or is that situation (more
>> than one thread entering "show_backtrace" with uninitialized "lbstate")
>> not possible to happen for other reasons?
>>
>
> I thought about that, but I think it probably(?) doesn't matter.
>
> - Two threads could race and run backtrace_create_state() in parallel, and
> probably we'd waste some memory then, but that was already possible before.
>
> - As for locking, the function can be called from a signal handler, so
> pthread_mutex is out. I guess I could implement a spinlock with atomics,
> but, meh, seems overkill.
>
> - And using atomics to access lbstate, it's an aligned pointer anyway, so
> while it's a race to access it, it shouldn't be possible to get a situation
> with a corrupted value for the pointer, right? I could use
> __atomic_load/store to access it, but that doesn't buy much in the end,
> does it, since the problem is parallel initialization, and not non-atomic
> access to the lbstate pointer? Or does gcc support targets with non-atomic
> access to aligned pointers?
>
> Or is there something I'm missing?
>

Using atomics for accessing the static variable can be done as below,
passes regtesting, Ok for trunk/8/7 with a ChangeLog?  If no objections,
I'll commit it as obvious in a few days.

diff --git a/libgfortran/runtime/backtrace.c
b/libgfortran/runtime/backtrace.c
index 3fc973a5e6d..93ea14af19d 100644
--- a/libgfortran/runtime/backtrace.c
+++ b/libgfortran/runtime/backtrace.c
@@ -149,15 +149,20 @@ show_backtrace (bool in_signal_handler)
   /* Note that libbacktrace allows the state to be accessed from
  multiple threads, so we don't need to use a TLS variable for the
  state here.  */
-  static struct backtrace_state *lbstate;
+  static struct backtrace_state *lbstate_saved;
+  struct backtrace_state *lbstate;
   struct mystate state = { 0, false, in_signal_handler };

+  lbstate = __atomic_load_n (&lbstate_saved, __ATOMIC_RELAXED);
   if (!lbstate)
-lbstate = backtrace_create_state (NULL, __gthread_active_p (),
- error_callback, NULL);
-
-  if (lbstate == NULL)
-return;
+{
+  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
+   error_callback, NULL);
+  if (lbstate)
+   __atomic_store_n (&lbstate_saved, lbstate, __ATOMIC_RELAXED);
+  else
+   return;
+}

   if (!BACKTRACE_SUPPORTED || (in_signal_handler && BACKTRACE_USES_MALLOC))
 {


-- 
Janne Blomqvist

[PATCH] Add vec_concatv4sf_0 and vec_concatv4si_0 insns (PR target/88278)

2018-11-30 Thread Jakub Jelinek

Hi!

As mentioned in the PR, while in vec_concatv2df or vec_concatv2di we have
alternatives where the lower half is low 64-bit part of a xmm reg or 64-bit
memory and upper half is zero using movq/vmovq (or movq2dq), for
vec_concatv4sf and vec_concatv4si we don't have anything similar, although
the operations do pretty much the same thing.  I'm not advocating to put
in alternatives with GPR operands as having V2SF or V2SI in a GPR is too
weird, but for the cases where a simple movq or vmovq instruction can move
64-bits and clear upper 64-bits these patterns look useful to me.

I had to tweak the pr53759.c testcase because it started FAILing, but only
because it changed:
-   vxorps  %xmm0, %xmm0, %xmm0
-   vmovlps (%rsi), %xmm0, %xmm0
+   vmovq   (%rsi), %xmm0
vaddps  %xmm0, %xmm0, %xmm0
vmovaps %xmm0, (%rdi)
ret
I've added a variant of that testcase that tests its original purpose by
using a register there not known to be all zeros.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-30  Jakub Jelinek  

PR target/88278
* config/i386/sse.md (*vec_concatv4sf_0, *vec_concatv4si_0): New insns.

* gcc.target/i386/pr88278.c: New test.
* gcc.target/i386/pr53759.c: Don't expect vmovlps insn, expect vmovq
instead.
* gcc.target/i386/pr53759-2.c: New test.

--- gcc/config/i386/sse.md.jj   2018-11-30 21:36:22.277762263 +0100
+++ gcc/config/i386/sse.md  2018-11-30 21:38:02.261120768 +0100
@@ -7248,6 +7248,17 @@ (define_insn "*vec_concatv4sf"
(set_attr "prefix" "orig,maybe_evex,orig,maybe_evex")
(set_attr "mode" "V4SF,V4SF,V2SF,V2SF")])
 
+(define_insn "*vec_concatv4sf_0"
+  [(set (match_operand:V4SF 0 "register_operand"   "=v")
+   (vec_concat:V4SF
+ (match_operand:V2SF 1 "nonimmediate_operand" "xm")
+ (match_operand:V2SF 2 "const0_operand"   " C")))]
+  "TARGET_SSE2"
+  "%vmovq\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex")
+   (set_attr "mode" "DF")])
+
 ;; Avoid combining registers from different units in a single alternative,
 ;; see comment above inline_secondary_memory_needed function in i386.c
 (define_insn "vec_set_0"
@@ -14409,6 +14420,23 @@ (define_insn "*vec_concatv4si"
(set_attr "prefix" "orig,maybe_evex,orig,orig,maybe_evex")
(set_attr "mode" "TI,TI,V4SF,V2SF,V2SF")])
 
+(define_insn "*vec_concatv4si_0"
+  [(set (match_operand:V4SI 0 "register_operand"   "=v,x")
+   (vec_concat:V4SI
+ (match_operand:V2SI 1 "nonimmediate_operand" "vm,?!*y")
+ (match_operand:V2SI 2 "const0_operand"   " C,C")))]
+  "TARGET_SSE2"
+  "@
+   %vmovq\t{%1, %0|%0, %1}
+   movq2dq\t{%1, %0|%0, %1}"
+  [(set_attr "type" "ssemov")
+   (set_attr "prefix" "maybe_vex,orig")
+   (set_attr "mode" "TI")
+   (set (attr "preferred_for_speed")
+ (if_then_else (eq_attr "alternative" "1")
+   (symbol_ref "TARGET_INTER_UNIT_MOVES_FROM_VEC")
+   (symbol_ref "true")))])
+
 ;; movd instead of movq is required to handle broken assemblers.
 (define_insn "vec_concatv2di"
   [(set (match_operand:V2DI 0 "register_operand"
--- gcc/testsuite/gcc.target/i386/pr88278.c.jj  2018-11-30 21:38:02.261120768 
+0100
+++ gcc/testsuite/gcc.target/i386/pr88278.c 2018-11-30 21:38:02.261120768 
+0100
@@ -0,0 +1,34 @@
+/* PR target/88278 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse2 -mno-sse3 -fgimple -masm=att" } */
+/* { dg-final { scan-assembler-times "movq\[ \t]+\\(" 2 } } */
+/* { dg-final { scan-assembler-not "punpcklqdq\[ \t]+" } } */
+/* { dg-final { scan-assembler-not "pxor\[ \t]+" } } */
+
+typedef unsigned char v16qi __attribute__((vector_size(16)));
+typedef unsigned char v8qi __attribute__((vector_size(8)));
+typedef unsigned int v4si __attribute__((vector_size(16)));
+typedef unsigned int v2si __attribute__((vector_size(8)));
+
+v16qi __GIMPLE foo (unsigned char *p)
+{
+  v8qi _2;
+  v16qi _3;
+
+bb_2:
+  _2 = __MEM  (p_1(D));
+  _3 = _Literal (v16qi) { _2, _Literal (v8qi) { _Literal (unsigned char) 0, 
_Literal (unsigned char) 0, _Literal (unsigned char) 0, _Literal (unsigned 
char) 0, _Literal (unsigned char) 0, _Literal (unsigned char) 0, _Literal 
(unsigned char) 0 } };
+  return _3;
+}
+
+
+v4si __GIMPLE bar (unsigned int *p)
+{
+  v2si _2;
+  v4si _3;
+
+bb_2:
+  _2 = __MEM  (p_1(D));
+  _3 = _Literal (v4si) { _2, _Literal (v2si) { 0u, 0u } };
+  return _3;
+}
--- gcc/testsuite/gcc.target/i386/pr53759.c.jj  2016-05-22 12:20:04.184034591 
+0200
+++ gcc/testsuite/gcc.target/i386/pr53759.c 2018-11-30 22:04:56.432806470 
+0100
@@ -12,5 +12,6 @@ foo (__m128 *x, __m64 *y)
   *x = _mm_add_ps (b, b);
 }
 
-/* { dg-final { scan-assembler "vmovlps\[ \\t\]" } } */
+/* { dg-final { scan-assembler "vmovq\[ \\t\]" } } */
+/* { dg-final { scan-assembler-not "vmovlps\[ \\t\]" } } */
 /* { dg-final { scan-assembler-not "vshufps\[ \\t\]" } } */
--- gcc/testsuite/gcc.target/i386/pr53759-2.c.jj2018-11-30 
22:05:

[committed] [PR88288, OpenACC, libgomp] Adjust offsets for present data clauses

2018-11-30 Thread Thomas Schwinge

Hi!

On Fri, 30 Nov 2018 17:55:17 +0800, Chung-Lin Tang  
wrote:
> On 2018/7/21 6:07 AM, Cesar Philippidis wrote:
> > This is another old gomp4 patch

The patch as posted here contains additional changes compared to the
original patch.  I removed these additional changes.

The Fortran test case included doesn't actually FAIL without the patch,
so I removed that one, too.  The C/C++ test case included is enough to
motivate the bug, and its fix.

> > that corrects a bug where the runtime
> > was passing the wrong offset for subarray data to the accelerator. The
> > original description of this patch can be found here
> > 

I wish PRs would get filed right when such things are discovered...  I
just filed  "[OpenACC, libgomp] Adjust
offsets for present data clauses".  (Better late than never...)

> I think this patch is pretty obvious; this is what the 'offset' field
> of struct target_var_desc is supposed to be used for, and is in line
> with other sites throughout target.c.

Thanks for having a look, I agree.

This will likely need to be fixed on all active release branches.

For now, committed to trunk in r266688:

commit 2110057d427ba710d7f60045fe33c161e6b181c4
Author: tschwinge 
Date:   Fri Nov 30 20:39:49 2018 +

[PR88288, OpenACC, libgomp] Adjust offsets for present data clauses

Make libgomp respect the on device offset of subarrays which may arise in
present data clauses.

libgomp/
PR libgomp/88288
* oacc-parallel.c (GOACC_parallel_keyed): Add offset to devaddrs.
* testsuite/libgomp.oacc-c-c++-common/pr88288.c: New test.

Reviewed-by: Thomas Schwinge 

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266688 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  |  6 
 libgomp/oacc-parallel.c|  3 +-
 .../testsuite/libgomp.oacc-c-c++-common/pr88288.c  | 41 ++
 3 files changed, 49 insertions(+), 1 deletion(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index a9dcbd808200..d095a196fb6c 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,9 @@
+2018-11-30  Cesar Philippidis  
+
+   PR libgomp/88288
+   * oacc-parallel.c (GOACC_parallel_keyed): Add offset to devaddrs.
+   * testsuite/libgomp.oacc-c-c++-common/pr88288.c: New test.
+
 2018-11-30  Thomas Schwinge  
 
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: New file.
diff --git libgomp/oacc-parallel.c libgomp/oacc-parallel.c
index b80ace585907..1e08af70b4da 100644
--- libgomp/oacc-parallel.c
+++ libgomp/oacc-parallel.c
@@ -232,7 +232,8 @@ GOACC_parallel_keyed (int device, void (*fn) (void *),
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
 devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-   + tgt->list[i].key->tgt_offset);
+   + tgt->list[i].key->tgt_offset
+   + tgt->list[i].offset);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
  async, dims, tgt);
diff --git libgomp/testsuite/libgomp.oacc-c-c++-common/pr88288.c 
libgomp/testsuite/libgomp.oacc-c-c++-common/pr88288.c
new file mode 100644
index ..d13e3359a3ec
--- /dev/null
+++ libgomp/testsuite/libgomp.oacc-c-c++-common/pr88288.c
@@ -0,0 +1,41 @@
+/* Test present data clauses in acc offloaded regions when the
+   subarray inside the present clause does not have the same base
+   offset value as the subarray in the enclosing acc data or acc enter
+   data variable.  */
+
+#include 
+
+void
+offset (int *data, int n)
+{
+  int i;
+
+#pragma acc parallel loop present (data[0:n])
+  for (i = 0; i < n; i++)
+data[i] = n;
+}
+
+int
+main ()
+{
+  const int n = 30;
+  int data[n], i;
+
+  for (i = 0; i < n; i++)
+data[i] = -1;
+
+#pragma acc data copy(data[0:n])
+  {
+offset (data + 10, 10);
+  }
+
+  for (i = 0; i < n; i++)
+{
+  if (i < 10 || i >= 20)
+   assert (data[i] == -1);
+  else
+   assert (data[i] == 10);
+}
+
+  return 0;
+}


Grüße
 Thomas

Re: [PATCH v3] Make function clone name numbering independent.

2018-11-30 Thread Michael Ploujnikov

Hi,

On 2018-11-30 2:25 a.m., Richard Biener wrote:
> +  unsigned &clone_number = lto_clone_numbers->get_or_insert (
> + IDENTIFIER_POINTER (DECL_NAME (decl)));
>name = maybe_rewrite_identifier (name);
>symtab->change_decl_assembler_name (decl,
> -  clone_function_name_numbered (
> -  name, "lto_priv"));
> +  clone_function_name (
> +  name, "lto_priv", clone_number));
> 
> since we use 'name' from maybe_rewrite_identifier in the end wouldn't it
> make more sense to use that as a key for lto_clone_numbers?

Yup, that looks much better. Fixed it.

> For the ipa-hsa.c changes since we iterate over all defined functions
> we at most create a single clone per cgraph_node.  That means your
> hash-map keyed on cgraph_node is useless and you could use
> an unnumbered clone (or a static zero number).

Good catch. I was thinking of doing this, but it somehow fell through the 
cracks during the rebase.

> 
> -  SET_DECL_ASSEMBLER_NAME (new_decl, clone_function_name_numbered (old_decl,
> -   suffix));
> +  SET_DECL_ASSEMBLER_NAME (new_decl,
> +   clone_function_name (
> + IDENTIFIER_POINTER (
> +   DECL_ASSEMBLER_NAME (old_decl)),
> + suffix,
> + num_suffix));
> 
> can you please hide the implementation detail of accessing the assembler name
> by adding an overload to clone_function_name_numbered with an explicit
> number?

Done.


> 
> OK with those changes.
> 
> Thanks,
> Richard.

Thank you for the review. I will commit as soon as my last test run finishes.

- Michael
From ac1f1579d37804c97833d460ec6cd5b87d6184c7 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Thu, 1 Nov 2018 12:57:30 -0400
Subject: [PATCH 1/3] Make function assembly more independent.

This is achieved by having clone_function_name assign unique clone
numbers for each function independently.

gcc:

2018-11-30  Michael Ploujnikov  

	* cgraphclones.c: Replaced clone_fn_id_num with clone_fn_ids;
	hash map.
	(clone_function_name_numbered): Use clone_fn_ids.

gcc/testsuite:

2018-11-30  Michael Ploujnikov  

	* gcc.dg/independent-cloneids-1.c: New test.
---
 gcc/cgraphclones.c| 10 -
 gcc/testsuite/gcc.dg/independent-cloneids-1.c | 38 +++
 2 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/independent-cloneids-1.c

diff --git gcc/cgraphclones.c gcc/cgraphclones.c
index e17959c0ca..fdd84b60d3 100644
--- gcc/cgraphclones.c
+++ gcc/cgraphclones.c
@@ -513,7 +513,7 @@ cgraph_node::create_clone (tree new_decl, profile_count prof_count,
   return new_node;
 }
 
-static GTY(()) unsigned int clone_fn_id_num;
+static GTY(()) hash_map *clone_fn_ids;
 
 /* Return a new assembler name for a clone of decl named NAME.  Apart
from the string SUFFIX, the new name will end with a unique (for
@@ -525,7 +525,13 @@ static GTY(()) unsigned int clone_fn_id_num;
 tree
 clone_function_name_numbered (const char *name, const char *suffix)
 {
-  return clone_function_name (name, suffix, clone_fn_id_num++);
+  /* Initialize the function->counter mapping the first time it's
+ needed.  */
+  if (!clone_fn_ids)
+clone_fn_ids = hash_map::create_ggc (64);
+  unsigned int &suffix_counter = clone_fn_ids->get_or_insert (
+   IDENTIFIER_POINTER (get_identifier (name)));
+  return clone_function_name (name, suffix, suffix_counter++);
 }
 
 /* Return a new assembler name for a clone of DECL.  Apart from string
diff --git gcc/testsuite/gcc.dg/independent-cloneids-1.c gcc/testsuite/gcc.dg/independent-cloneids-1.c
new file mode 100644
index 00..3203895492
--- /dev/null
+++ gcc/testsuite/gcc.dg/independent-cloneids-1.c
@@ -0,0 +1,38 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fipa-cp -fipa-cp-clone"  } */
+
+extern int printf (const char *, ...);
+
+static int __attribute__ ((noinline))
+foo (int arg)
+{
+  return 7 * arg;
+}
+
+static int __attribute__ ((noinline))
+bar (int arg)
+{
+  return arg * arg;
+}
+
+int
+baz (int arg)
+{
+  printf("%d\n", bar (3));
+  printf("%d\n", bar (4));
+  printf("%d\n", foo (5));
+  printf("%d\n", foo (6));
+  /* adding or removing the following call should not affect foo
+ function's clone numbering */
+  printf("%d\n", bar (7));
+  return foo (8);
+}
+
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*bar[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]0:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]1:} 1 } } */
+/* { dg-final { scan-assembler-times {(?n)\m_*foo[.$_]constprop[.$_]2:} 1 } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]3:} } } */
+/* { dg-final { scan-assembler-not {(?n)\m_*foo[.$_]constprop[.$_]4:} } } */
-- 
2.19.1

From 9656fc66b05c4ee9efd1b4a0533d564a35a85bc5 Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov

Re: [patch,openacc] use existing local variable in cp_parser_oacc_enter_exit_data

2018-11-30 Thread Thomas Schwinge

Hi!

On Thu, 29 Nov 2018 21:33:37 +, Julian Brown  
wrote:
> On Wed, 26 Sep 2018 11:21:33 -0700
> Cesar Philippidis  wrote:
> 
> > This is an old gomp4 patch that updates the location of the clause for
> > acc enter/exit data. Apparently, it didn't impact any test cases. Is
> > this OK for trunk or should we drop it from OG8?
> > 
> > I bootstrapped and regtested it for x86_64 Linux with nvptx
> > offloading.
> 
> At least at a glance, there is no actual change to behaviour given in
> this patch, it is just an extremely minor cleanup. I.e. in:
> 
>   location_t loc = pragma_tok->location;
>   [...]
>   SET_EXPR_LOCATION (stmt, pragma_tok->location);
> 
> the variable "loc" is used in the SET_EXPR_LOCATION instead. It doesn't
> look like anything could mutate either variable in the interim.

Thanks, agreed.

> So, OK, or shall we just drop this?

This tiny change obviously should've been part of (much) earlier changes
(when that code was reworked, "loc" added), and has now gotten much too
much attention already; committed to trunk in r266687:

commit 52b928e7cd1b7dee36c4ea14b5ac9df3f022efd5
Author: tschwinge 
Date:   Fri Nov 30 20:39:40 2018 +

[C++] Use existing local variable in cp_parser_oacc_enter_exit_data

gcc/cp/
* parser.c (cp_parser_oacc_enter_exit_data): Use existing local
variable.

Reviewed-by: Thomas Schwinge 

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266687 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/cp/ChangeLog | 5 +
 gcc/cp/parser.c  | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git gcc/cp/ChangeLog gcc/cp/ChangeLog
index 98a2528783a9..2c8b7d10ae4e 100644
--- gcc/cp/ChangeLog
+++ gcc/cp/ChangeLog
@@ -1,3 +1,8 @@
+2018-11-30  James Norris  
+
+   * parser.c (cp_parser_oacc_enter_exit_data): Use existing local
+   variable.
+
 2018-11-29  Paolo Carlini  
 
* decl.c (compute_array_index_type_loc): New, like the current
diff --git gcc/cp/parser.c gcc/cp/parser.c
index 3ef1eda45188..634485b5a8c7 100644
--- gcc/cp/parser.c
+++ gcc/cp/parser.c
@@ -38472,7 +38472,7 @@ cp_parser_oacc_enter_exit_data (cp_parser *parser, 
cp_token *pragma_tok,
   stmt = enter ? make_node (OACC_ENTER_DATA) : make_node (OACC_EXIT_DATA);
   TREE_TYPE (stmt) = void_type_node;
   OMP_STANDALONE_CLAUSES (stmt) = clauses;
-  SET_EXPR_LOCATION (stmt, pragma_tok->location);
+  SET_EXPR_LOCATION (stmt, loc);
   add_stmt (stmt);
   return stmt;
 }


Grüße
 Thomas

Re: [PATCH, OpenACC] Properly handle wait clause with no arguments

2018-11-30 Thread Thomas Schwinge

Hi Chung-Lin!

On Wed, 07 Nov 2018 20:13:29 +0100, Thomas Schwinge  
wrote:
> On Thu, 30 Aug 2018 21:27:22 +0800, Chung-Lin Tang  
> wrote:
> > Hi, this patch properly handles OpenACC 'wait' clauses without arguments, 
> > making it an equivalent of "wait all".

> > (current trunk basically discards and ignores such argument-less wait
> > clauses)
> 
> Bugs should be filed, for later reference.  Now done:
>  "OpenACC wait clauses without
> async-arguments".  (I couldn't put you in CC because "clt...@gcc.gnu.org
> did not match anything"?)

This will, by the way, need to be fixed on all active release branches.


> No test cases included.  I'm working on a few, will post/commit later.

I thought I had also written a libgomp execution test case, during
travel/attending the SuperComputing 2018 conference, but I can't find it
right now...  ;-|

Anyway, with XFAILs (which you then please remove as part of your patch),
at least the following compile-time test cases now committed to trunk in
r266686:

commit 1d89613e77d7db420b13ce3ad8b98f07aaf474e8
Author: tschwinge 
Date:   Fri Nov 30 20:39:30 2018 +

[PR87924] Add (XFAILed) test cases for OpenACC wait clauses without 
async-arguments

gcc/testsuite/
PR c/87924
* c-c++-common/goacc/asyncwait-5.c: Update.
* gfortran.dg/goacc/asyncwait-5.f: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266686 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/testsuite/ChangeLog|  4 
 gcc/testsuite/c-c++-common/goacc/asyncwait-5.c | 21 +
 gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f  | 20 
 3 files changed, 45 insertions(+)

diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 75ca70b4af28..68186d8ab837 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,5 +1,9 @@
 2018-11-30  Thomas Schwinge  
 
+   PR c/87924
+   * c-c++-common/goacc/asyncwait-5.c: Update.
+   * gfortran.dg/goacc/asyncwait-5.f: Likewise.
+
* c-c++-common/goacc/asyncwait-5.c: New file.
* gfortran.dg/goacc/asyncwait-5.f: Likewise.
 
diff --git gcc/testsuite/c-c++-common/goacc/asyncwait-5.c 
gcc/testsuite/c-c++-common/goacc/asyncwait-5.c
index fe6f8a0cf2da..80d4a8477b93 100644
--- gcc/testsuite/c-c++-common/goacc/asyncwait-5.c
+++ gcc/testsuite/c-c++-common/goacc/asyncwait-5.c
@@ -11,4 +11,25 @@ void f()
 #pragma acc parallel async (2) wait (11, 12) wait (13)
   ;
   /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(13\\) 
wait\\(12\\) wait\\(11\\) async\\(2\\)\$" 1 "original" } } */
+
+
+#pragma acc parallel async (3) wait
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(-1\\) 
async\\(3\\)$" 1 "original" { xfail *-*-* } } } */
+
+#pragma acc parallel async (4) wait (100) wait
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(-1\\) 
wait\\(100\\) async\\(4\\)$" 1 "original" { xfail *-*-* } } } */
+
+#pragma acc parallel async (5) wait wait (101)
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(101\\) 
wait\\(-1\\) async\\(5\\)$" 1 "original" { xfail *-*-* } } } */
+
+#pragma acc parallel async (6) wait wait (102, 103) wait wait
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(-1\\) 
wait\\(-1\\) wait\\(103\\) wait\\(102\\) wait\\(-1\\) async\\(6\\)$" 1 
"original" { xfail *-*-* } } } */
+
+#pragma acc parallel async (7) wait (104) wait wait (105, 106)
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(106\\) 
wait\\(105\\) wait\\(-1\\) wait\\(104\\) async\\(7\\)$" 1 "original" { xfail 
*-*-* } } } */
 }
diff --git gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f 
gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f
index 59b886343af6..7ad5813b8a03 100644
--- gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f
+++ gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f
@@ -10,4 +10,24 @@
 !$ACC END PARALLEL
 ! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(2\\) 
wait\\(11\\) wait\\(12\\) wait\\(13\\)$" 1 "original" } }
 
+!$ACC PARALLEL ASYNC (3) WAIT
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(3\\) 
wait\\(-1\\)$" 1 "original" { xfail *-*-* } } }
+
+!$ACC PARALLEL ASYNC (4) WAIT (100) WAIT
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(4\\) 
wait\\(100\\) wait\\(-1\\)$" 1 "original" { xfail *-*-* } } }
+
+!$ACC PARALLEL ASYNC (5) WAIT WAIT (101)
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(5\\) 
wait\\(-1\\) wait\\(101\\)$" 1 "original" { xfail *-*-* } } }
+
+!$ACC PARALLEL ASYNC (6) WAIT WAIT (102, 103) WAIT WAIT
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(6\\) 
wait\\(-1\\) wait\\(102\\) wait\\(103\\) wait\\(-1\\) wait\\(-1\\)$" 1 
"original" { xfail

[committed] Add testcase for already fixed PR debug/85550

2018-11-30 Thread Jakub Jelinek

Hi!

The following testcase has been fixed by Nathan with r266158
aka PR debug/88006 and PR debug/87462.  In order to close the PR, I've
tested following testcase on x86_64-linux and i686-linux and committed to
trunk.

2018-11-30  Jakub Jelinek  

PR debug/85550
* g++.dg/debug/dwarf2/pr85550.C: New test.

--- gcc/testsuite/g++.dg/debug/dwarf2/pr85550.C.jj  2018-11-30 
17:32:55.959892418 +0100
+++ gcc/testsuite/g++.dg/debug/dwarf2/pr85550.C 2018-11-30 17:33:46.842056407 
+0100
@@ -0,0 +1,17 @@
+// PR debug/85550
+// { dg-do link }
+// { dg-options "-O2 -g -fdebug-types-section" }
+
+struct A {
+  int bar () const { return 0; }
+};
+template 
+struct B {
+};
+
+B<&A::bar> b;
+
+int
+main ()
+{
+}

Jakub

[C++ PATCH] Fix C++ parser endless diagnostics on CPP_PRAGMA_EOL (PR c++/88258)

2018-11-30 Thread Jakub Jelinek

Hi!

The following testcase endlessly diagnoses a syntax error.
The problem is that the various C++ parser routines to skip tokens until
something happily skip over CPP_PRAGMA that starts a pragma line, but
always stop at CPP_PRAGMA_EOL, but the caller expect them to stop at
something different.

The fix is similar to what the C FE already does, treat CPP_PRAGMA_EOL
as CPP_EOF only if in_pragma flag is set.  Otherwise we didn't really parse
CPP_PRAGMA normally and should skip everything until we find whatever we are
looking for.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-30  Jakub Jelinek  

PR c++/88258
* parser.c (cp_parser_skip_to_closing_parenthesis_1,
cp_parser_skip_to_end_of_statement,
cp_parser_skip_to_end_of_block_or_statement,
cp_parser_skip_to_closing_brace,
cp_parser_skip_to_closing_square_bracket,
cp_parser_skip_balanced_tokens): Don't treat CPP_PRAGMA_EOL specially
if in_pragma is false.

* g++.dg/gomp/pr88258.C: New test.

--- gcc/cp/parser.c.jj  2018-11-29 08:41:30.804788606 +0100
+++ gcc/cp/parser.c 2018-11-30 14:18:51.928839749 +0100
@@ -3556,8 +3556,11 @@ cp_parser_skip_to_closing_parenthesis_1
 
   switch (token->type)
{
-   case CPP_EOF:
case CPP_PRAGMA_EOL:
+ if (!parser->lexer->in_pragma)
+   break;
+ /* FALLTHRU */
+   case CPP_EOF:
  /* If we've run out of tokens, then there is no closing `)'.  */
  return 0;
 
@@ -3652,8 +3655,11 @@ cp_parser_skip_to_end_of_statement (cp_p
 
   switch (token->type)
{
-   case CPP_EOF:
case CPP_PRAGMA_EOL:
+ if (!parser->lexer->in_pragma)
+   break;
+ /* FALLTHRU */
+   case CPP_EOF:
  /* If we've run out of tokens, stop.  */
  return;
 
@@ -3742,8 +3748,11 @@ cp_parser_skip_to_end_of_block_or_statem
 
   switch (token->type)
{
-   case CPP_EOF:
case CPP_PRAGMA_EOL:
+ if (!parser->lexer->in_pragma)
+   break;
+ /* FALLTHRU */
+   case CPP_EOF:
  /* If we've run out of tokens, stop.  */
  return;
 
@@ -3792,8 +3801,11 @@ cp_parser_skip_to_closing_brace (cp_pars
 
   switch (token->type)
{
-   case CPP_EOF:
case CPP_PRAGMA_EOL:
+ if (!parser->lexer->in_pragma)
+   break;
+ /* FALLTHRU */
+   case CPP_EOF:
  /* If we've run out of tokens, stop.  */
  return false;
 
@@ -22498,8 +22510,11 @@ cp_parser_skip_to_closing_square_bracket
 
   switch (token->type)
{
-   case CPP_EOF:
case CPP_PRAGMA_EOL:
+ if (!parser->lexer->in_pragma)
+   break;
+ /* FALLTHRU */
+   case CPP_EOF:
  /* If we've run out of tokens, then there is no closing `]'.  */
  return false;
 
@@ -26008,8 +26023,11 @@ cp_parser_skip_balanced_tokens (cp_parse
   do
 switch (cp_lexer_peek_nth_token (parser->lexer, n++)->type)
   {
-  case CPP_EOF:
   case CPP_PRAGMA_EOL:
+   if (!parser->lexer->in_pragma)
+ break;
+   /* FALLTHRU */
+  case CPP_EOF:
/* Ran out of tokens.  */
return orig_n;
   case CPP_OPEN_PAREN:
--- gcc/testsuite/g++.dg/gomp/pr88258.C.jj  2018-11-30 14:26:00.806805739 
+0100
+++ gcc/testsuite/g++.dg/gomp/pr88258.C 2018-11-30 14:25:33.614251741 +0100
@@ -0,0 +1,11 @@
+// PR c++/88258
+// { dg-do compile }
+// { dg-options "-fopenmp" }
+
+void
+foo (bar int p)// { dg-error "variable or field|was not 
declared in this scope" }
+{
+  int i, x;
+  #pragma omp atomic write
+  x = 6;
+}

Jakub

[PATCH] Introduce --param logical-op-non-short-circuit (PR testsuite/85368)

2018-11-30 Thread Jakub Jelinek

Hi!

Until we stop depending on BRANCH_COST and LOGICAL_OP_NON_SHORT_CIRCUIT
macros at least for early GIMPLE, I'm afraid the current state for the
testsuite is terrible, on some targets it is enough to use
-mbranch-cost={1,2} to pick either of the setting, but other targets,
while they implement -mbranch-cost=, redefine LOGICAL_OP_NON_SHORT_CIRCUIT,
so it ignores BRANCH_COST altogether, or sometimes, other targets don't
implement -mbranch-cost= and have various complex definitions of BRANCH_COST
or LOGICAL_OP_NON_SHORT_CIRCUIT depending on other command line options.

The following patch introduces a new param (not an option, to make it
clearer it is intended primarily for the testsuite and we can more easily
remove it again) that overrides the target's LOGICAL_OP_NON_SHORT_CIRCUIT
and adjusts the testsuite to use it.

Bootstrapped/regtested on x86_64-linux and i686-linux, Christophe has kindly
tested it on ARM.  Ok for trunk?

2018-11-30  Jakub Jelinek  

PR testsuite/85368
* params.def (PARAM_LOGICAL_OP_NON_SHORT_CIRCUIT): New param.
* tree-ssa-ifcombine.c (ifcombine_ifandif): If
--param logical-op-non-short-circuit is present, override
LOGICAL_OP_NON_SHORT_CIRCUIT value from the param.
* fold-const.c (fold_range_test, fold_truth_andor): Likewise.

* lib/target-supports.exp (logical_op_short_circuit): Remove.
* gcc.dg/builtin-bswap-7.c: Remove logical_op_short_circuit
effective target, drop -mbranch-cost= options from the test and
instead pass --param logical-op-non-short-circuit=0 or
--param logical-op-non-short-circuit=1 depending on what the
tests meant to test.
* gcc.dg/pr21643.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-2.c: Likewise.
* gcc.dg/tree-ssa/phi-opt-11.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-1.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-3.c: Likewise.
* gcc.dg/tree-ssa/ssa-thread-14.c: Likewise.
* gcc.dg/tree-ssa/vrp47.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-16.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-14.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-5.c: Likewise.
* gcc.dg/tree-ssa/vrp87.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-6.c: Likewise.
* gcc.dg/tree-ssa/phi-opt-2.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-13.c: Likewise.
* gcc.dg/tree-ssa/ssa-thread-11.c: Likewise.
* gcc.dg/tree-ssa/ssa-ifcombine-ccmp-4.c: Likewise.
* gcc.dg/tree-ssa/forwprop-28.c: Likewise.
* gcc.dg/binop-xor1.c: Likewise.
* gcc.dg/pr46309.c: Likewise.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: New test.
* gcc.dg/tree-ssa/reassoc-32.c: Add
--param logical-op-non-short-circuit=1 to dg-options.
* gcc.dg/tree-ssa/reassoc-33.c: Likewise.
* gcc.dg/tree-ssa/reassoc-34.c: Likewise.
* gcc.dg/tree-ssa/reassoc-35.c: Likewise.
* gcc.dg/tree-ssa/reassoc-36.c: Likewise.

--- gcc/params.def.jj   2018-11-21 11:35:44.410041861 +0100
+++ gcc/params.def  2018-11-30 12:22:22.180688145 +0100
@@ -1360,6 +1360,11 @@ DEFPARAM(PARAM_AVOID_FMA_MAX_BITS,
 "Maximum number of bits for which we avoid creating FMAs.",
 0, 0, 512)
 
+DEFPARAM(PARAM_LOGICAL_OP_NON_SHORT_CIRCUIT,
+"logical-op-non-short-circuit",
+"True if a non-short-circuit operation is optimal.",
+-1, -1, 1)
+
 /*
 
 Local variables:
--- gcc/tree-ssa-ifcombine.c.jj 2018-11-29 08:41:32.681757293 +0100
+++ gcc/tree-ssa-ifcombine.c2018-11-30 12:29:42.936453120 +0100
@@ -40,6 +40,7 @@ along with GCC; see the file COPYING3.
 #include "gimplify-me.h"
 #include "tree-cfg.h"
 #include "tree-ssa.h"
+#include "params.h"
 
 #ifndef LOGICAL_OP_NON_SHORT_CIRCUIT
 #define LOGICAL_OP_NON_SHORT_CIRCUIT \
@@ -563,7 +564,11 @@ ifcombine_ifandif (basic_block inner_con
{
  tree t1, t2;
  gimple_stmt_iterator gsi;
- if (!LOGICAL_OP_NON_SHORT_CIRCUIT || flag_sanitize_coverage)
+ bool logical_op_non_short_circuit = LOGICAL_OP_NON_SHORT_CIRCUIT;
+ if (PARAM_VALUE (PARAM_LOGICAL_OP_NON_SHORT_CIRCUIT) != -1)
+   logical_op_non_short_circuit
+ = PARAM_VALUE (PARAM_LOGICAL_OP_NON_SHORT_CIRCUIT);
+ if (!logical_op_non_short_circuit || flag_sanitize_coverage)
return false;
  /* Only do this optimization if the inner bb contains only the 
conditional. */
  if (!gsi_one_before_end_p (gsi_start_nondebug_after_labels_bb 
(inner_cond_bb)))
--- gcc/fold-const.c.jj 2018-10-19 10:59:03.857467657 +0200
+++ gcc/fold-const.c2018-11-30 12:27:50.739294835 +0100
@@ -5572,12 +5572,15 @@ fold_range_test (location_t loc, enum tr
   /* On machines where the branch cost is expensive, if this is a
  short-circuited b

[committed] Clean up Fortran OpenACC wait clause handling

2018-11-30 Thread Thomas Schwinge

Hi!

commit 3e3de40a5ab21d72f08071a7a40120dd05608cc1
Author: tschwinge 
Date:   Fri Nov 30 20:39:18 2018 +

Clean up Fortran OpenACC wait clause handling

"wait" can be deduced from "wait_list".

gcc/fortran/
* gfortran.h (struct gfc_omp_clauses): Remove "wait".  Adjust all
users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266685 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog  | 3 +++
 gcc/fortran/gfortran.h | 2 +-
 gcc/fortran/openmp.c   | 7 ++-
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git gcc/fortran/ChangeLog gcc/fortran/ChangeLog
index 435ecf82f970..76bb0b9f5c0c 100644
--- gcc/fortran/ChangeLog
+++ gcc/fortran/ChangeLog
@@ -1,5 +1,8 @@
 2018-11-30  Thomas Schwinge  
 
+   * gfortran.h (struct gfc_omp_clauses): Remove "wait".  Adjust all
+   users.
+
* openmp.c (gfc_match_omp_clauses): Support multiple OpenACC wait
clauses.
 
diff --git gcc/fortran/gfortran.h gcc/fortran/gfortran.h
index 4dd6298b3ddb..a14b4c44a18a 100644
--- gcc/fortran/gfortran.h
+++ gcc/fortran/gfortran.h
@@ -1345,7 +1345,7 @@ typedef struct gfc_omp_clauses
   gfc_expr_list *wait_list;
   gfc_expr_list *tile_list;
   unsigned async:1, gang:1, worker:1, vector:1, seq:1, independent:1;
-  unsigned wait:1, par_auto:1, gang_static:1;
+  unsigned par_auto:1, gang_static:1;
   unsigned if_present:1, finalize:1;
   locus loc;
 
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index e1560c1fe372..fb9c073ff779 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1878,7 +1878,6 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  if ((mask & OMP_CLAUSE_WAIT)
  && gfc_match ("wait") == MATCH_YES)
{
- c->wait = true;
  match m = match_oacc_expr_list (" (", &c->wait_list, false);
  if (m == MATCH_ERROR)
{
@@ -4779,10 +4778,8 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
 resolve_positive_int_expr (omp_clauses->worker_expr, "WORKER");
   if (omp_clauses->vector_expr)
 resolve_positive_int_expr (omp_clauses->vector_expr, "VECTOR");
-  if (omp_clauses->wait)
-if (omp_clauses->wait_list)
-  for (el = omp_clauses->wait_list; el; el = el->next)
-   resolve_scalar_int_expr (el->expr, "WAIT");
+  for (el = omp_clauses->wait_list; el; el = el->next)
+resolve_scalar_int_expr (el->expr, "WAIT");
   if (omp_clauses->collapse && omp_clauses->tile_list)
 gfc_error ("Incompatible use of TILE and COLLAPSE at %L", &code->loc);
   if (omp_clauses->depend_source && code->op != EXEC_OMP_ORDERED)


Grüße
 Thomas

[committed] Support multiple OpenACC wait clauses

2018-11-30 Thread Thomas Schwinge

Hi!

I ran into this while working on something else, and it turned into a
prerequisite.  Committed to trunk in r266684:

commit ba688147af7632d7e1c420c98f2d301f7b9e427c
Author: tschwinge 
Date:   Fri Nov 30 20:39:08 2018 +

Support multiple OpenACC wait clauses

Support for this is not explicitly called for in OpenACC 2.6, but given that
GCC internally decomposes "wait (1, 2)" into "wait (1) wait (2)" (similar 
for
other clauses, too), it's reasonable to also support that syntax in the 
front
ends -- which happens to already be the case for C, C++, and easy enough to 
do
for Fortran.

gcc/fortran/
* openmp.c (gfc_match_omp_clauses): Support multiple OpenACC wait
clauses.
gcc/testsuite/
* c-c++-common/goacc/asyncwait-5.c: New file.
* gfortran.dg/goacc/asyncwait-5.f: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266684 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/fortran/ChangeLog  |  5 +
 gcc/fortran/openmp.c   |  1 -
 gcc/testsuite/ChangeLog|  5 +
 gcc/testsuite/c-c++-common/goacc/asyncwait-5.c | 14 ++
 gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f  | 13 +
 5 files changed, 37 insertions(+), 1 deletion(-)

diff --git gcc/fortran/ChangeLog gcc/fortran/ChangeLog
index 06e7400eda7d..435ecf82f970 100644
--- gcc/fortran/ChangeLog
+++ gcc/fortran/ChangeLog
@@ -1,3 +1,8 @@
+2018-11-30  Thomas Schwinge  
+
+   * openmp.c (gfc_match_omp_clauses): Support multiple OpenACC wait
+   clauses.
+
 2018-11-27  Martin Liska  
 
* decl.c (gfc_match_gcc_builtin): New function.
diff --git gcc/fortran/openmp.c gcc/fortran/openmp.c
index 6430e61ea7a6..e1560c1fe372 100644
--- gcc/fortran/openmp.c
+++ gcc/fortran/openmp.c
@@ -1876,7 +1876,6 @@ gfc_match_omp_clauses (gfc_omp_clauses **cp, const 
omp_mask mask,
  break;
case 'w':
  if ((mask & OMP_CLAUSE_WAIT)
- && !c->wait
  && gfc_match ("wait") == MATCH_YES)
{
  c->wait = true;
diff --git gcc/testsuite/ChangeLog gcc/testsuite/ChangeLog
index 874c158f75dd..75ca70b4af28 100644
--- gcc/testsuite/ChangeLog
+++ gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2018-11-30  Thomas Schwinge  
+
+   * c-c++-common/goacc/asyncwait-5.c: New file.
+   * gfortran.dg/goacc/asyncwait-5.f: Likewise.
+
 2018-11-30  Peter Bergner  
 
PR target/87496
diff --git gcc/testsuite/c-c++-common/goacc/asyncwait-5.c 
gcc/testsuite/c-c++-common/goacc/asyncwait-5.c
new file mode 100644
index ..fe6f8a0cf2da
--- /dev/null
+++ gcc/testsuite/c-c++-common/goacc/asyncwait-5.c
@@ -0,0 +1,14 @@
+/* Multiple OpenACC wait clauses.  */
+
+/* { dg-additional-options "-fdump-tree-original" } */
+
+void f()
+{
+#pragma acc parallel async (1) wait (14)
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(14\\) 
async\\(1\\)$" 1 "original" } } */
+
+#pragma acc parallel async (2) wait (11, 12) wait (13)
+  ;
+  /* { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel wait\\(13\\) 
wait\\(12\\) wait\\(11\\) async\\(2\\)\$" 1 "original" } } */
+}
diff --git gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f 
gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f
new file mode 100644
index ..59b886343af6
--- /dev/null
+++ gcc/testsuite/gfortran.dg/goacc/asyncwait-5.f
@@ -0,0 +1,13 @@
+! Multiple OpenACC wait clauses.
+
+! { dg-additional-options "-fdump-tree-original" } 
+
+!$ACC PARALLEL ASYNC (1) WAIT (14)
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(1\\) 
wait\\(14\\)$" 1 "original" } }
+
+!$ACC PARALLEL ASYNC (2) WAIT (11, 12) WAIT(13)
+!$ACC END PARALLEL
+! { dg-final { scan-tree-dump-times "(?n)#pragma acc parallel async\\(2\\) 
wait\\(11\\) wait\\(12\\) wait\\(13\\)$" 1 "original" } }
+
+  END


Grüße
 Thomas

[PATCH] Avoid weird integral types in reassoc range opts (PR tree-optimization/88274)

2018-11-30 Thread Jakub Jelinek

Hi!

The following patch makes sure that we don't use weird integral types
(either ones which have smaller precision than corresponding mode's
precision, or ones like C++ -fstrict-enum enumeral types which have
normal precision, but smaller TYPE_MAX_VALUE or larger TYPE_MIN_VALUE),
because the constants could be outside of the range of those types.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2018-11-30  Jakub Jelinek  

PR tree-optimization/88274
* tree-ssa-reassoc.c (optimize_range_tests_xor,
optimize_range_tests_diff): If type has smaller precision than
corresponding mode or if it has non-standard min/max, compute
everything in a standard type for the precision.

--- gcc/tree-ssa-reassoc.c.jj   2018-10-23 10:13:25.278875175 +0200
+++ gcc/tree-ssa-reassoc.c  2018-11-30 11:13:37.232393154 +0100
@@ -2537,8 +2537,23 @@ optimize_range_tests_xor (enum tree_code
   if (!tree_int_cst_equal (lowxor, highxor))
 return false;
 
+  exp = rangei->exp;
+  scalar_int_mode mode = as_a  (TYPE_MODE (type));
+  int prec = GET_MODE_PRECISION (mode);
+  if (TYPE_PRECISION (type) < prec
+  || (wi::to_wide (TYPE_MIN_VALUE (type))
+ != wi::min_value (prec, TYPE_SIGN (type)))
+  || (wi::to_wide (TYPE_MAX_VALUE (type))
+ != wi::max_value (prec, TYPE_SIGN (type
+{
+  type = build_nonstandard_integer_type (prec, TYPE_UNSIGNED (type));
+  exp = fold_convert (type, exp);
+  lowxor = fold_convert (type, lowxor);
+  lowi = fold_convert (type, lowi);
+  highi = fold_convert (type, highi);
+}
   tem = fold_build1 (BIT_NOT_EXPR, type, lowxor);
-  exp = fold_build2 (BIT_AND_EXPR, type, rangei->exp, tem);
+  exp = fold_build2 (BIT_AND_EXPR, type, exp, tem);
   lowj = fold_build2 (BIT_AND_EXPR, type, lowi, tem);
   highj = fold_build2 (BIT_AND_EXPR, type, highi, tem);
   if (update_range_test (rangei, rangej, NULL, 1, opcode, ops, exp,
@@ -2581,7 +2596,16 @@ optimize_range_tests_diff (enum tree_cod
   if (!integer_pow2p (tem1))
 return false;
 
-  type = unsigned_type_for (type);
+  scalar_int_mode mode = as_a  (TYPE_MODE (type));
+  int prec = GET_MODE_PRECISION (mode);
+  if (TYPE_PRECISION (type) < prec
+  || (wi::to_wide (TYPE_MIN_VALUE (type))
+ != wi::min_value (prec, TYPE_SIGN (type)))
+  || (wi::to_wide (TYPE_MAX_VALUE (type))
+ != wi::max_value (prec, TYPE_SIGN (type
+type = build_nonstandard_integer_type (prec, 1);
+  else
+type = unsigned_type_for (type);
   tem1 = fold_convert (type, tem1);
   tem2 = fold_convert (type, tem2);
   lowi = fold_convert (type, lowi);

Jakub

Re: [PATCH, OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions

2018-11-30 Thread Thomas Schwinge

Hi!

On Mon, 10 Sep 2018 10:22:17 -0700, Cesar Philippidis  
wrote:
> On 09/10/2018 08:04 AM, Chung-Lin Tang wrote:
> > Index: libgomp/openacc_lib.h
> 
> I don't see a test case for this.

Right.  (Better test coverage would be desirable generally.)

> I believe that openacc_lib.h is used
> by fixed-mode Fortran programs (those that end in a .f). Can you add a
> fixed-mode version of lib-16.f90?

Not only with fixed-form, but also with free-form Fortran programs.
Committed to trunk in r266683:

commit d084eb0a61d209ee0d852089ea76a672f519883b
Author: tschwinge 
Date:   Fri Nov 30 20:38:57 2018 +

Add libgomp.oacc-fortran/lib-16-2.f90

This is a copy of libgomp.oacc-fortran/lib-16.f90, but does 'include
"openacc_lib.h"' instead of 'use openacc'.

libgomp/
* testsuite/libgomp.oacc-fortran/lib-16-2.f90: New file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@266683 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog   | 4 
 libgomp/testsuite/libgomp.oacc-fortran/{lib-16.f90 => lib-16-2.f90} | 3 ++-
 libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90   | 1 +
 3 files changed, 7 insertions(+), 1 deletion(-)

diff --git libgomp/ChangeLog libgomp/ChangeLog
index d3c1bc36c145..a9dcbd808200 100644
--- libgomp/ChangeLog
+++ libgomp/ChangeLog
@@ -1,3 +1,7 @@
+2018-11-30  Thomas Schwinge  
+
+   * testsuite/libgomp.oacc-fortran/lib-16-2.f90: New file.
+
 2018-10-19  Richard Biener  
 
PR tree-optimization/88182
diff --git libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 
libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
similarity index 94%
copy from libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
copy to libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
index 9701b52dd257..fa76f65912fb 100644
--- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
+++ libgomp/testsuite/libgomp.oacc-fortran/lib-16-2.f90
@@ -1,9 +1,10 @@
+! See also "lib-16.f90".
 ! { dg-do run }
 ! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
 
 program main
-  use openacc
   implicit none
+  include "openacc_lib.h"
 
   integer, parameter :: N = 256
   integer, allocatable :: h(:)
diff --git libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 
libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
index 9701b52dd257..011f9cf31db4 100644
--- libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
+++ libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
@@ -1,3 +1,4 @@
+! See also "lib-16-2.f90".
 ! { dg-do run }
 ! { dg-skip-if "" { *-*-* } { "*" } { "-DACC_MEM_SHARED=0" } }
 


Grüße
 Thomas

[PATCH,RFC] PR target/61976 PowerPC & AIX parameter passing

2018-11-30 Thread David Edelsohn

This patch corrects the manner in which single element structures are
passed.  The implementation always has been wrong and did not match
the AIX ABI. The AIX ABI specifies that aggregates are passed by
values in GPRs, but GCC passes small floating point aggregates in
FPRs. The PPC64 BE ELFv1 Linux ABI was updated in the past to match
the behavior of GCC.

For AIX, the implementation can produce wrong code for the case of a
single precision floating point value (SFmode) argument in 64 bit
mode.

The rs6000 backend parameter passing code chose the location for
arguments based on the MODE of the argument. For small aggregates, GCC
packs the aggregate into a scalar mode and the function arg machinery
is presented with an argument mode of the scalar mode, as opposed to
BLKmode.

On AIX, this meant that a single element floating point aggregate
(float or double) is represented as SFmode or DFmode.  The current
rs6000 function arg machinery passes the arguments in FPRs.  On AIX,
argument are padded upwards in the register, as if the arguments are
strings, not numerical values.  In 64 bit mode, GCC dutifully shifts a
32 bit SFmode value so that it is loaded into the upper 32 bits of the
FPR.  Power FPRs always are 64 bit double precision format, regardless
of the precision, so the value loaded is garbage, in addition to being
passed in the wrong location.  In 32 bit mode (still the default for
AIX), GCC didn't try to shift the value, so it accidentally worked --
self consistently passing the value in an FPR that GCC interpreted as
word mode size.

This patch adds a test for the argument TYPE to override the floating
point argument passing for aggregates only for AIX.

Yes, this will break the ABI for GCC on AIX.  GCC was not following
the AIX ABI. No one really noticed. It only produced wrong code in 64
bit mode. Single element aggregates are rare.  The change sucks, but
it doesn't make sense to have GCC produce code that is incompatible
with the AIX ABI.

I am using AGGREGATE_TYPE_P() to test the type. I believe that macro
captures all GCC cases of interest. Please let me know if this seems
like the wrong choice.  The macro includes ARRAY_TYPE, but arrays
commonly are passed by reference or pointer, not by value, so I don't
believe that this will affect single element vectors, but please
correct me if I'm wrong.  Another option is RECORD_OR_UNION_TYPE_P().

I believe that this change will correct GCC's conformance with the AIX
ABI.  I realize that most people don't know enough or care enough
about AIX and the ABI to determine if the conformance is correct. I am
interested in any feedback about the mechanism (AGGREGATE_TYPE_P) used
to select the behavior.

Bootstrapped on powerpc-ibm-aix7.2.0.0

Thanks, David

* config/rs6000/rs6000.c (rs6000_function_arg): Don't pass aggregates
in FPRs on AIX.
(rs6000_arg_partial_bytes): Same.

Index: rs6000.c
===
--- rs6000.c(revision 266671)
+++ rs6000.c(working copy)
@@ -11989,7 +11989,8 @@ rs6000_function_arg (cumulative_args_t cum_v, mach
   if (elt_mode == TDmode && (cum->fregno % 2) == 1)
cum->fregno++;

-  if (USE_FP_FOR_ARG_P (cum, elt_mode))
+  if (USE_FP_FOR_ARG_P (cum, elt_mode)
+ && !(TARGET_AIX && AGGREGATE_TYPE_P (type)))
{
  rtx rvec[GP_ARG_NUM_REG + AGGR_ARG_NUM_REG + 1];
  rtx r, off;
@@ -12125,7 +12126,8 @@ rs6000_arg_partial_bytes (cumulative_args_t cum_v,

   align_words = rs6000_parm_start (mode, type, cum->words);

-  if (USE_FP_FOR_ARG_P (cum, elt_mode))
+  if (USE_FP_FOR_ARG_P (cum, elt_mode)
+  && !(TARGET_AIX && AGGREGATE_TYPE_P (type)))
 {
   unsigned long n_fpreg = (GET_MODE_SIZE (elt_mode) + 7) >> 3;

Re: [PATCH] Adjust offsets for present data clauses

2018-11-30 Thread Thomas Schwinge

Hi Julian!

On Fri, 30 Nov 2018 10:37:09 +, Julian Brown  
wrote:
> > On 2018/7/21 6:07 AM, Cesar Philippidis wrote:
> > > This is another old gomp4 patch that corrects a bug where the
> > > runtime was passing the wrong offset for subarray data to the
> > > accelerator. The original description of this patch can be found
> > > here 

> This one will be superseded

Well, this is clearly a bug, so should be addressed independently.  I
just filed  "[OpenACC, libgomp] Adjust
offsets for present data clauses".  (Better late than never...)

> by the attach/detach patch, I think (where
> the additional offset is added also, via calling "gomp_map_val".

Aha, good to know: during review, I wondered about whether that function
should actually be used.  (But it also isn't in other places in
"libgomp/target.c".)

Grüße
 Thomas

Re: [PATCH] Use proper location for tls_init function (PR c++/88263).

2018-11-30 Thread Jason Merrill

OK.

Re: [PATCH, libfortran] PR 88137 Initialize backtrace state once

2018-11-30 Thread Janne Blomqvist

On Fri, Nov 30, 2018 at 9:07 PM Thomas Schwinge 
wrote:

> Hi!
>
> On Thu, 22 Nov 2018 11:17:24 +0200, Janne Blomqvist <
> blomqvist.ja...@gmail.com> wrote:
> > From backtrace.h for backtrace_create_state:
> >
> >Calling this function allocates resources that can not be freed.
> >There is no backtrace_free_state function.  The state is used to
> >cache information that is expensive to recompute.  Programs are
> >expected to call this function at most once and to save the return
> >value for all later calls to backtrace functions.
> >
> > So instead of calling backtrace_create_state every time we wish to
> > show a backtrace, do it once and store the result in a static
> > variable.  libbacktrace allows multiple threads to access the state,
> > so no need to use TLS.
>
> Hmm, OK, but...
>
> > --- a/libgfortran/runtime/backtrace.c
> > +++ b/libgfortran/runtime/backtrace.c
> > @@ -146,11 +146,15 @@ full_callback (void *data, uintptr_t pc, const
> char *filename,
> >  void
> >  show_backtrace (bool in_signal_handler)
> >  {
> > -  struct backtrace_state *lbstate;
> > +  /* Note that libbacktrace allows the state to be accessed from
> > + multiple threads, so we don't need to use a TLS variable for the
> > + state here.  */
> > +  static struct backtrace_state *lbstate;
> >struct mystate state = { 0, false, in_signal_handler };
> > -
> > -  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> > - error_callback, NULL);
> > +
> > +  if (!lbstate)
> > +lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> > +   error_callback, NULL);
>
> ... don't you still have to make sure that only one thread ever executes
> "backtrace_create_state", and writes into "lbstate" (via locking, or
> atomics, or "pthread_once", or some such)?  Or is that situation (more
> than one thread entering "show_backtrace" with uninitialized "lbstate")
> not possible to happen for other reasons?
>

I thought about that, but I think it probably(?) doesn't matter.

- Two threads could race and run backtrace_create_state() in parallel, and
probably we'd waste some memory then, but that was already possible before.

- As for locking, the function can be called from a signal handler, so
pthread_mutex is out. I guess I could implement a spinlock with atomics,
but, meh, seems overkill.

- And using atomics to access lbstate, it's an aligned pointer anyway, so
while it's a race to access it, it shouldn't be possible to get a situation
with a corrupted value for the pointer, right? I could use
__atomic_load/store to access it, but that doesn't buy much in the end,
does it, since the problem is parallel initialization, and not non-atomic
access to the lbstate pointer? Or does gcc support targets with non-atomic
access to aligned pointers?

Or is there something I'm missing?


-- 
Janne Blomqvist

Re: [PATCH, libfortran] PR 88137 Initialize backtrace state once

2018-11-30 Thread Thomas Schwinge

Hi!

On Thu, 22 Nov 2018 11:17:24 +0200, Janne Blomqvist  
wrote:
> From backtrace.h for backtrace_create_state:
> 
>Calling this function allocates resources that can not be freed.
>There is no backtrace_free_state function.  The state is used to
>cache information that is expensive to recompute.  Programs are
>expected to call this function at most once and to save the return
>value for all later calls to backtrace functions.
> 
> So instead of calling backtrace_create_state every time we wish to
> show a backtrace, do it once and store the result in a static
> variable.  libbacktrace allows multiple threads to access the state,
> so no need to use TLS.

Hmm, OK, but...

> --- a/libgfortran/runtime/backtrace.c
> +++ b/libgfortran/runtime/backtrace.c
> @@ -146,11 +146,15 @@ full_callback (void *data, uintptr_t pc, const char 
> *filename,
>  void
>  show_backtrace (bool in_signal_handler)
>  {
> -  struct backtrace_state *lbstate;
> +  /* Note that libbacktrace allows the state to be accessed from
> + multiple threads, so we don't need to use a TLS variable for the
> + state here.  */
> +  static struct backtrace_state *lbstate;
>struct mystate state = { 0, false, in_signal_handler };
> - 
> -  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> - error_callback, NULL);
> +
> +  if (!lbstate)
> +lbstate = backtrace_create_state (NULL, __gthread_active_p (),
> +   error_callback, NULL);

... don't you still have to make sure that only one thread ever executes
"backtrace_create_state", and writes into "lbstate" (via locking, or
atomics, or "pthread_once", or some such)?  Or is that situation (more
than one thread entering "show_backtrace" with uninitialized "lbstate")
not possible to happen for other reasons?


Grüße
 Thomas

Re: [wwwdocs] [PATCH]introduce new -flive-patching master option into gcc9

2018-11-30 Thread Jeff Law

On 11/30/18 10:03 AM, Qing Zhao wrote:
> Hi,
> 
> This is the patch for update https://gcc.gnu.org/gcc-9/changes.html 
>  to include the introducing of new 
> option
> -flive-patching into gcc9.
> 
> Okay for commit?
OK
jeff

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler [v3]

2018-11-30 Thread Pedro Alves

On 11/30/2018 05:41 PM, Nick Clifton wrote:
> @@ -4693,10 +4694,21 @@
>  demangle_nested_args (struct work_stuff *work, const char **mangled,
>string *declp)
>  {
> +  static unsigned long recursion_level = 0;
>string* saved_previous_argument;
>int result;
>int saved_nrepeats;
>  
> +  if ((work->options & DMGL_RECURSE_LIMIT)
> +  && work->recursion_level > DEMANGLE_RECURSION_LIMIT)
> +{
> +  /* FIXME: There ought to be a way to report that the recursion limit
> +  has been reached.  */
> +  return 0;
> +}
> +
> +  recursion_level ++;
> +

There's still a static recursion level counter here?

Thanks,
Pedro Alves

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler [v3]

2018-11-30 Thread Jakub Jelinek

Hi!

Just spelling nitpicking.

On Fri, Nov 30, 2018 at 05:41:35PM +, Nick Clifton wrote:
> +  order to dmangle truely complicated names, but it also leaves the tools

truly? Multiple times.

> +The @option{-r} option is a snyonym for the

synonym? Multiple times.

Jakub

Re: [PATCH][AArch64][2/3] Correct type attribute for mul and mneg instructions

2018-11-30 Thread Steve Ellcey

On Fri, 2018-11-30 at 15:37 +, Kyrill Tkachov wrote:
> 
> In thunderx2t99.md the reservation changes from thunderx2t99_mul to
> thunderx2t99_madd.
> 
> Steve, can you share whether the AArch64 MUL and MNEG instructions
> really do have different latencies and reservations from MADD and
> MSUB
> on Thunderx2? If so, then this change is wrong :( and we'll want to
> model these forms differently indeed.
> 
> Thanks,
> Kyrill

According to the thunderx2 documents I looked at, the mul/mneg
instructions do have the same latencies as madd/msub.  So this
patch is OK from that standpoint and fixes an existing problem
on Thunderx2.

Steve Ellcey

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler [v3]

2018-11-30 Thread Nick Clifton

Hi Guys,

>> I think it would be fine to have a large fixed limit plus a flag to
>> disable the limit.

Great - in which case please may I present version 3 of the patch.  In 
this version:

  * The cplus_demangle_set_recursion_limit() function has been removed
and instead a new constant - DEMANGLE_RECURSION_LIMIT - is defined in
demangle.h.

  * The recursion counters in cp-demangle.c have been merged into one
counter, now contained in the d_info structure.

  * In cplus-dem.c the recursion counter has been moved into the work
structure.

  * The description of the DMGL_RECURSE_LIMIT option in demangle.h has
been enhanced to add a note that if the option is not used, then
bug reports about stack overflows in the demangler will be rejected.

  * The binutils patch has been updated to reflect these changes.  The
addr2line, cxxfilt, nm and objdump programs now have two new options
--recurse-limit and --no-recurse-limit, with --recurse-limit being
the default.  The documentation is updated to describe these options
and to also add a note about bug reports being rejected if 
--no-recurse-limit is used.

What do you think, is this version acceptable ?

Cheers
  Nick

libiberty/ChangeLog
2018-11-29  Nick Clifton  

PR 87681
PR 87675
PR 87636
PR 87335
* cp-demangle.h (struct d_info): Add recursion_limit field.
* cp-demangle.c (d_function_type): If the recursion limit is 
enabled and reached, return with a failure result.
(d_demangle_callback): If the recursion limit is enabled, check
for a mangled string that is so long that there is not enough
stack space for the local arrays.
* cplus-dem.c (struct work): Add recursion_level field.
(demangle_nested_args): If the recursion limit is enabled and 
reached, return with a failure result.

include/ChangeLog
2018-11-29  Nick Clifton  

* demangle.h (DMGL_RECURSE_LIMIT): Define.
(DEMANGLE_RECURSION_LIMIT): Prototype.

binutils/ChangeLog
2018-11-29  Nick Clifton  

* addr2line.c (demangle_flags): New static variable.
(long_options): Add --recurse-limit and --no-recurse-limit.
(translate_address): Pass demangle_flags to bfd_demangle.
(main): Handle --recurse-limit and --no-recurse-limit options.
* cxxfilt.c (flags): Add DMGL_RECURSE_LIMIT.
(long_options): Add --recurse-limit and --no-recurse-limit.
(main): Handle new options.
* dlltool.c (gen_def_file): Include DMGL_RECURSE_LIMIT in flags
passed to cplus_demangle.
* nm.c (demangle_flags): New static variable.
(long_options): Add --recurse-limit and --no-recurse-limit.
(main): Handle new options.
* objdump.c (demangle_flags): New static variable.
(usage): Add --recurse-limit and --no-recurse-limit.
(long_options): Likewise.
(objdump_print_symname): Pass demangle_flags to bfd_demangle.
(disassemble_section): Likewise.
(dump_dymbols): Likewise.
(main): Handle new options.
* prdbg.c (demangle_flags): New static variable.
(tg_variable): Pass demangle_flags to demangler.
(tg_start_function): Likewise.
* stabs.c (demangle_flags): New static variable.
(stab_demangle_template): Pass demangle_flags to demangler.
(stab_demangle_v3_argtypes): Likewise.
(stab_demangle_v3_arg): Likewise.
* doc/binutuls.texi: Document new command line options.
* NEWS: Mention the new feature.
* testsuite/config/default.exp (CXXFILT): Define if not already
defined.
(CXXFILTFLAGS): Likewise.
* testsuite/binutils-all/cxxfilt.exp: New file.  Runs a few
simple tests of the cxxfilt program.
diff --git a/binutils/NEWS b/binutils/NEWS
index a3ee86ef7f..5ed61c8513 100644
--- a/binutils/NEWS
+++ b/binutils/NEWS
@@ -1,5 +1,16 @@
 -*- text -*-
 
+* The addr2line, c++filt, nm and objdump tools now have a limit on the
+  maximum amount of recursion that is allowed whilst demangling strings.
+  The value for this limit is defined by the DEMANGLE_RECRUSE_LIMIT
+  constant declared in the include/demangle.h header file.  At the time
+  of writing this constant has the value of 1024.
+
+  The --no-recurse-limit option can be used to remove the limit, restoring
+  the behaviour of earlier versions of these tools.  This may be needed in
+  order to dmangle truely complicated names, but it also leaves the tools
+  vulnerable to stack exhaustion from maliciously constructed mangled names.
+
 * Objdump's --disassemble option can now take a parameter, specifying the
   starting symbol for disassembly.  Disassembly will continue from this
   symbol up to the next symbol.
diff --git a/binutils/addr2line.c b/binutils/addr2line.c
index 008e62026e..3085806a4a 100644
--- a/binutils/addr2line.c
+++ b/binutils/addr2line.c
@@ -45,6 +45,9 @@ static bfd_boolean do_dema

Re: [PATCH, libstdc++, C++20] Implement P0457R2 String Prefix and Suffix Checking.

2018-11-30 Thread Ed Smith-Rowland


On 11/30/18 5:33 AM, Jonathan Wakely wrote:

On 29/11/18 21:35 -0500, Ed Smith-Rowland wrote:

Greetings,

This patch implements starts_with and ends_with for basic_string and 
basic_string_view for C++20.


This was on my TODO list, thanks for taking care of it.


+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const 
noexcept
+  { return __sv_type(this->data(), 
this->size()).starts_with(__x); }

+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), 
this->size()).starts_with(__x); }

+
+  bool
+  starts_with(const _CharT* __x) const


This can be noexcept. It isn't in the standard, because it has a
narrow contract (there's a precondition that __x is null-terminated).
But since we can't reliably detect whether __x is null-terminated, and
even if we could we wouldn't want to throw an exception, we can just
make it noexcept. Same for the ends_with(const _CharT*) one, and the
equivalent members in the COW basic_string and basic_string_view.


Index: include/std/string_view
===
--- include/std/string_view    (revision 266645)
+++ include/std/string_view    (working copy)
@@ -227,7 +227,6 @@
__sv = __tmp;
  }

-
  // [string.view.ops], string operations:

  size_type
@@ -387,6 +386,36 @@
  traits_type::length(__str));
  }

+#if __cplusplus > 201703L
+  constexpr bool
+  starts_with(basic_string_view __x) const noexcept
+  { return this->size() >= __x.size()
+    && this->compare(0, __x.size(), __x) == 0; }


Please put the opening and closing braces on their own lines, as for
ends_with below:


+  constexpr bool
+  ends_with(basic_string_view __x) const noexcept
+  {
+    return this->size() >= __x.size()
+    && this->compare(this->size() - __x.size(), npos, __x) == 0;
+  }


OK with those changes, thanks


Committed 266674.

New patch attached.

Ed



2018-11-30  Edward Smith-Rowland  <3dw...@verizon.net>

Implement P0457R2 String Prefix and Suffix Checking.
* include/bits/basic_string.h: Add starts_with, ends_with members.
* include/std/string_view: Ditto.
* testsuite/21_strings/basic_string/operations/starts_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/starts_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string/operations/ends_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string/operations/ends_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/starts_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/starts_with/
wchar_t/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/ends_with/
char/1.cc: New test.
* testsuite/21_strings/basic_string_view/operations/ends_with/
wchar_t/1.cc: New test.

Index: include/bits/basic_string.h
===
--- include/bits/basic_string.h (revision 266671)
+++ include/bits/basic_string.h (working copy)
@@ -3038,6 +3038,32 @@
   compare(size_type __pos, size_type __n1, const _CharT* __s,
  size_type __n2) const;
 
+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(const _CharT* __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  ends_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+
+  bool
+  ends_with(const _CharT* __x) const noexcept
+  { return __sv_type(this->data(), this->size()).ends_with(__x); }
+#endif // C++20
+
   // Allow basic_stringbuf::__xfer_bufptrs to call _M_length:
   template friend class basic_stringbuf;
 };
@@ -5884,6 +5910,32 @@
   compare(size_type __pos, size_type __n1, const _CharT* __s,
  size_type __n2) const;
 
+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(const _CharT* __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_

[committed] Fix ICE in substring locations from macros in header files (PR preprocessor/88257)

2018-11-30 Thread David Malcolm

PR preprocessor/88257 reports an ICE on gcc.dg/format/pr78304.c
when compiled using g++:

void test (const char *msg)
{
  printf ("size: %" PRIu32 "\n", msg);
}

due to mismatching files (and line maps) between
linemap_resolve_location and expand_location_to_spelling_point
when PRIu32 is defined in a system header.

The root cause is that expand_location_to_spelling_point stops
unwinding locations when it reaches a system header, whereas
linemap_resolve_location can resolve into a system header,
which can lead to locations within get_substring_ranges_for_loc
getting out of sync, and using the wrong line map.

This patch fixes the issue by checking that the files are the
same.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Committed to trunk as r266671.

gcc/ChangeLog:
PR preprocessor/88257
* input.c (get_substring_ranges_for_loc): Fix indentation.
Bulletproof against getting a different files back from
linemap_resolve_location and expand_location_to_spelling_point.

gcc/testsuite/ChangeLog:
PR preprocessor/88257
* c-c++-common/Wformat-pr88257.c: New test.
* c-c++-common/Wformat-pr88257.h: New test header.
* c-c++-common/empty.h: New test header.
---
 gcc/input.c  |  7 ++-
 gcc/testsuite/c-c++-common/Wformat-pr88257.c | 23 +++
 gcc/testsuite/c-c++-common/Wformat-pr88257.h | 26 ++
 gcc/testsuite/c-c++-common/empty.h   |  0
 4 files changed, 55 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/c-c++-common/Wformat-pr88257.c
 create mode 100644 gcc/testsuite/c-c++-common/Wformat-pr88257.h
 create mode 100644 gcc/testsuite/c-c++-common/empty.h

diff --git a/gcc/input.c b/gcc/input.c
index 237c0d5..b4d53d5 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -1432,7 +1432,12 @@ get_substring_ranges_for_loc (cpp_reader *pfile,
 for start vs finish due to line-length jumps.  */
   if (start_ord_map != final_ord_map
  && start_ord_map->to_file != final_ord_map->to_file)
- return "start and finish are spelled in different ordinary maps";
+   return "start and finish are spelled in different ordinary maps";
+  /* The file from linemap_resolve_location ought to match that from
+expand_location_to_spelling_point.  */
+  if (start_ord_map->to_file != start.file)
+   return "mismatching file after resolving linemap";
+
   location_t start_loc
= linemap_position_for_line_and_column (line_table, final_ord_map,
start.line, start.column);
diff --git a/gcc/testsuite/c-c++-common/Wformat-pr88257.c 
b/gcc/testsuite/c-c++-common/Wformat-pr88257.c
new file mode 100644
index 000..59e4db5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wformat-pr88257.c
@@ -0,0 +1,23 @@
+/* { dg-options "-Wformat -fdiagnostics-show-caret" } */
+
+#include "Wformat-pr88257.h"
+
+void test (const char *msg)
+{
+  __builtin_printf ("size: %" PRIu32 "\n", msg); /* { dg-warning "expects 
argument of type" } */
+  /* { dg-begin-multiline-output "" }
+   __builtin_printf ("size: %" PRIu32 "\n", msg);
+ ^  ~~~
+|
+const char *
+ { dg-end-multiline-output "" { target c } } */
+  /* { dg-begin-multiline-output "" }
+ # define PRIu32  "u"
+ { dg-end-multiline-output "" { target c } } */
+  /* { dg-begin-multiline-output "" }
+   __builtin_printf ("size: %" PRIu32 "\n", msg);
+ ^  ~~~
+|
+const char*
+ { dg-end-multiline-output "" { target c++ } } */
+}
diff --git a/gcc/testsuite/c-c++-common/Wformat-pr88257.h 
b/gcc/testsuite/c-c++-common/Wformat-pr88257.h
new file mode 100644
index 000..c778e93
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/Wformat-pr88257.h
@@ -0,0 +1,26 @@
+#pragma GCC system_header
+
+/* Whitespace is significant here, due to line numbering.
+   See the comment below.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+  End of significant whitespace */
+
+#include "empty.h"
+/* The preceding include ensures a line map starts here, with a line
+   number greater than that of the usage site, to try to trigger
+   an assertion failure.  */
+
+# define PRIu32  "u"
diff --git a/gcc/testsuite/c-c++-common/empty.h 
b/gcc/testsuite/c-c++-common/empty.h
new file mode 100644
index 000..e69de29
-- 
1.8.5.3

Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-30 Thread Martin Sebor


On 11/30/18 12:57 AM, Richard Biener wrote:

On Thu, Nov 29, 2018 at 9:34 PM Martin Sebor  wrote:


On 11/16/2018 02:07 AM, Richard Biener wrote:

On Fri, Nov 16, 2018 at 4:12 AM Martin Sebor  wrote:


Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

Please let me know if there is something I need to change here
to make the fix acceptable or if I should stop trying.


I have one more comment about

+  /* Defer warning (and folding) until the next statement in the basic
+ block is reachable.  */
+  if (!gimple_bb (stmt))
+return false;
+

it's not about the next statement in the basic-block being "reachable"
(even w/o a CFG you can use gsi_next()) but rather that the next
stmt isn't yet gimplified and thus not inserted into the gimple sequence,
right?


No, it's about the current statement not being associated with
a basic block yet when the warning code runs for the first time
(during gimplify_expr), and so gsi_next() returning null.


You apply this to gimple_fold_builtin_strncpy but I'd rather
see us not sprinkling this over gimple-fold.c but instead do this
in gimplify.c:maybe_fold_stmt, delaying folding until say lowering.

See the attached (untested).


I would also prefer this solution.  I had tested it (in response
to you first mentioning it back in September) and it causes quite
a bit of fallout in tests that look for the folding to take place
very early.  See the end of my reply here:

https://gcc.gnu.org/ml/gcc-patches/2018-09/msg01248.html

But I'm willing to do the test suite cleanup if you think it's
suitable for GCC 9.  (If you're thinking GCC 10 please let me
know now.)


I very much prefer that to the hacks in gimple-fold.c if it doesn't
help now then I'll rather live with some bogus warnings for GCC 9
and fix it up properly for GCC 10.

I expect the fallout to be quite minimal (also considering my
suggestion to do the folding in gimple-low.c).


Yes, it is.  Please see the full patch in my reply to Jeff and
let me know if that's fine for GCC 9.

As we discussed before, for GCC 10 Jeff and I are already planning
to look into merging the strlen pass with others (sprintf and
perhaps also object size) to improve things.

Martin



Richard.


Thanks
Martin



Richard.




On 10/31/2018 10:33 AM, Martin Sebor wrote:

Ping: https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01818.html

On 10/20/2018 06:01 PM, Martin Sebor wrote:

On 10/16/2018 03:21 PM, Jeff Law wrote:

On 10/4/18 9:51 AM, Martin Sebor wrote:

On 10/04/2018 08:58 AM, Jeff Law wrote:

On 8/27/18 9:42 AM, Richard Biener wrote:

On Mon, Aug 27, 2018 at 5:32 PM Jeff Law  wrote:


On 08/27/2018 02:29 AM, Richard Biener wrote:

On Sun, Aug 26, 2018 at 7:26 AM Jeff Law  wrote:


On 08/24/2018 09:58 AM, Martin Sebor wrote:

The warning suppression for -Wstringop-truncation looks for
the next statement after a truncating strncpy to see if it
adds a terminating nul.  This only works when the next
statement can be reached using the Gimple statement iterator
which isn't until after gimplification.  As a result, strncpy
calls that truncate their constant argument that are being
folded to memcpy this early get diagnosed even if they are
followed by the nul assignment:

   const char s[] = "12345";
   char d[3];

   void f (void)
   {
 strncpy (d, s, sizeof d - 1);   // -Wstringop-truncation
 d[sizeof d - 1] = 0;
   }

To avoid the warning I propose to defer folding strncpy to
memcpy until the pointer to the basic block the strnpy call
is in can be used to try to reach the next statement (this
happens as early as ccp1).  I'm aware of the preference to
fold things early but in the case of strncpy (a relatively
rarely used function that is often misused), getting
the warning right while folding a bit later but still fairly
early on seems like a reasonable compromise.  I fear that
otherwise, the false positives will drive users to adopt
other unsafe solutions (like memcpy) where these kinds of
bugs cannot be as readily detected.

Tested on x86_64-linux.

Martin

PS There still are outstanding cases where the warning can
be avoided.  I xfailed them in the test for now but will
still try to get them to work for GCC 9.

gcc-87028.diff


PR tree-optimization/87028 - false positive -Wstringop-truncation
strncpy with global variable source string
gcc/ChangeLog:

   PR tree-optimization/87028
   * gimple-fold.c (gimple_fold_builtin_strncpy): Avoid
folding when
   statement doesn't belong to a basic block.
   * tree-ssa-strlen.c (maybe_diag_stxncpy_trunc): Handle
MEM_REF on
   the left hand side of assignment.

gcc/testsuite/ChangeLog:

   PR tree-optimization/87028
   * c-c++-common/Wstringop-truncation.c: Remove xfails.
   * gcc.dg/Wstringop-truncation-5.c: New test.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index 07341eb..284c2fb 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1702,6 +1702,11 @@ gimple_fold_builtin_strncpy
(gimple_stmt_iterator *gsi,
if (tree_int_cs

Re: [PATCH][AArch64][2/3] Correct type attribute for mul and mneg instructions

2018-11-30 Thread Kyrill Tkachov




On 28/11/18 16:37, James Greenhalgh wrote:

On Mon, Nov 26, 2018 at 11:36:43AM -0600, Kyrill Tkachov wrote:

Hi all,

In the AAarch64 ISA the MUL and MNEG instructions are actually aliases of MADD 
and MSUB.
Therefore they should have the type attribute mla, rather than mul, which 
should only be used
for AArch32 32-bit multiplication instructions.

This will ensure more consistent scheduling decisions.

Bootstrapped and tested on aarch64-none-linux-gnu.

Ok for trunk?

OK in principle. Did you audit the pipeline models to check this doesn't
change scheduling class in an undesirable way for any of our supported
targets? OK if so, if not can you run that audit and figure out the right
thing to do to resolve it.


This has a change in 3 models.

In saphira.md this changes the reservation of these instructions
from saphira_muldiv_4_x_mul to saphira_muldiv_4_x_mla.
Both have the same characteristics (latency etc) but they differ in
their bypass behaviour: saphira_muldiv_4_x_mla can receive a bypass from 
saphira_muldiv_4_x_mul,
but not the other way around.

So with this change the MUL and MNEG will now be able to receive an early 
forwarded result from other multiply/multiply-accumulate operations,
whereas before this was only done on MADD instructions.
There is a similar situation in falkor.md.
Sameera, can you share whether this new behaviour is appropriate for Falkor and 
Saphira?
I note that if the forwarding only occurs on the accumulator operand, the 
bypass should be guarded on aarch_accumulator_forwarding

In thunderx2t99.md the reservation changes from thunderx2t99_mul to 
thunderx2t99_madd.

Steve, can you share whether the AArch64 MUL and MNEG instructions really do 
have different latencies and reservations from MADD and MSUB
on Thunderx2? If so, then this change is wrong :( and we'll want to model these 
forms differently indeed.

Thanks,
Kyrill



Thanks,
James

Re: [PATCH] v3: C/C++: add fix-it hints for missing '&' and '*' (PR c++/87850)

2018-11-30 Thread David Malcolm

On Tue, 2018-11-20 at 17:39 -0700, Martin Sebor wrote:
> > +void test_2 (void)
> > +{
> > +  takes_int_ptr(ivar); /* { dg-warning "" "" { target c } } */
> > +  /* { dg-error "" "" { target c++ } .-1 } */
> > +  /* { dg-message "possible fix: take the address with '&'" "" {
> > target *-*-* } .-2 } */
> > +
> > +  /* Expect an '&' fix-it hint.  */
> > +  /* { dg-begin-multiline-output "" }
> > +   takes_int_ptr(ivar);
> > + ^~~~
> > + |
> > + int
> > + { dg-end-multiline-output "" } */
> > +  /* { dg-begin-multiline-output "" }
> > +   takes_int_ptr(ivar);
> > + ^~~~
> > + &
> 
> I experimented with adding hints to sizeof_pointer_memaccess_warning
> over the weekend and ran into a location difference between the two
> front-ends.  In an attempt to resolve it, rather than using
> an "insertion hint" like I think you did above, I used a replacement
> hint.  And although it started out as a workaround I ended up liking
> the result better.  What I think the same approach would result in
> here is something like:
> 
>takes_int_ptr(ivar);
>note: possible fix: take the address with '&'
>note: takes_int_ptr(ivar);
>^~~~
>|
>int
> note: takes_int_ptr(ivar);
> ^~~~
> &ivar
> 
> Have you considered this style, i.e., printing valid expressions
> in the hints when possible rather than just operators?  It solves
> the problem of the lone operators looking a little too terse and
> cryptic.
> 
> I realize it's not always possible (not every fix-it hint is for
> a bad expression) but I'm wondering if it would be worth using
> when it is.

I definitely prefer the style of the output in your example, but I
think there's a "data model vs presentation" thing going on here.

I'd prefer to avoid the underlying rich_location using replacement when
we mean insertion, for the same reasons as for deletion discussed at
[1]

I think the issue here is with the presentation of insertion fix-it
hints in diagnostic-show-locus.c: there's currently no way for the user
to tell if this:

  takes_int_ptr(ivar);
^~~~
&

means:
  (a) replace "ivar" with "&", or
  (b) insert "&" in front of "ivar" (the intended behavior)

and this alternative presentation would definitely be clearer:

  takes_int_ptr(ivar);
^~~~
&ivar

I can have a look at reimplementing how we present insertions in
diagnostic-show-locus.c.

Dave

[1] 
https://gcc.gnu.org/onlinedocs/gccint/Guidelines-for-Diagnostics.html#Express-deletion-in-terms-of-deletion_002c-not-replacement

Re: [patch,openacc] C, C++ OpenACC wait diagnostic change

2018-11-30 Thread Thomas Schwinge

Hi Julian!

On Fri, 28 Sep 2018 14:17:42 +0100, Julian Brown  
wrote:
> On Wed, 26 Sep 2018 14:08:37 -0700
> Cesar Philippidis  wrote:
> 
> > On 09/26/2018 12:50 PM, Joseph Myers wrote:
> > > On Wed, 26 Sep 2018, Cesar Philippidis wrote:
> > >   
> > >> Attached is an old patch which updated the C and C++ FEs to use
> > >> %<)%> for the right ')' symbol. It's mostly a cosmetic change. All
> > >> of the changes are self-contained to the OpenACC code path.  
> > > 
> > > Why is the "before ')'" included in the call to c_parser_error at
> > > all? c_parser_error calls c_parse_error which adds its own " before
> > > " and token description or expansion, so I'd expect the current
> > > error to result in a message ending in something of the form
> > > "before X before Y".  
> 
> > Julian, I need to start working on deep copy in OpenACC. Can you take
> > over this patch? The error handling code in the C FE needs to be
> > removed because it's dead.
> 
> I agree that the error-handling path in question in the C FE is dead.
> The difference is that in C, c_parser_oacc_wait_list parses the open
> parenthesis, the list and then the close parenthesis separately, and so
> a token sequence like:
> 
>(1
> 
> will return an expression list of length 1. In the C++ FE rather, a
> cp_parser_parenthesized_expression_list is parsed all in one go, and if
> the input is not that well-formed sequence then NULL is returned (or a
> zero-length vector for an empty list).
> 
> But for C, it does not appear that c_parser_expr_list has a code path
> that can return a zero-length list at all.

In addition to your "(1" token sequence (and similar ones), I suppose
what these code paths in C and C++ are supposed to catch the "wait ()"
case (see line 149 of gcc/testsuite/c-c++-common/goacc/asyncwait-1.c).

I suppose in C, we do diagnose an "error: expected expression before ')'
token" in "c_parser_expr_list"/"c_parser_expr_no_commas", and then return
a list with an "error_mark_node", right?  (I have not verified that.)

> So, we can elide the
> diagnostic with no change to compiler behaviour.

In that case, yes.

> This patch does that,
> and also changes the C++ diagnostic, leading to errors being reported
> like:
> 
> diag.c: In function 'int main(int, char*)':
> diag.c:6:59: error: expected ')' before end of line
> 6 | #pragma acc parallel copyin (a[0:N]) copy (b[0:N]) wait (1
>   | ~ ^
>   |   )
> diag.c:6:59: error: expected integer expression list before end of line 
> 
> Actually I'm not too sure how useful the second error line is. Maybe we
> should just remove it to improve consistency between C & C++?

Right, one single error diagnostic is enough.

But please make sure that the "wait ()" case continues to be diagnosed
correctly -- similarly to C, I suggest "expected expression before ')'
token" (or whatever is natural to the C++ parser), and then accordingly
tidy up that "dg-error" regular expression on line 149 of
gcc/testsuite/c-c++-common/goacc/asyncwait-1.c.

In C++, this is the case that: "args != NULL && args->length () == 0", I
suppose?  (I have not verified that.)


Oh, and next to "wait ()" please also add test coverage for "wait (".


Grüße
 Thomas


> The attached has been tested with offloading to nvptx and bootstrapped.
> OK?
> 
> Thanks,
> 
> Julian
> 
> 2018-XX-YY  James Norris  
>   Cesar Philippidis  
> Julian Brown  
> 
>   gcc/c/
>   * c-parser.c (c_parser_oacc_wait_list): Remove dead diagnostic
>   code.
> 
> gcc/cp/
> * parser.c (cp_parser_oacc_wait_list): Change error message.
> 
>   gcc/testsuite/
> * c-c++-common/goacc/asyncwait-1: Update expected errors.
> commit 3a59bdbccc3c2383c0056c74797d698c7d81dce2
> Author: Julian Brown 
> Date:   Fri Sep 28 05:52:55 2018 -0700
> 
> OpenACC wait list diagnostic change
> 
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index 1f173fc..92a8089 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -11597,14 +11597,6 @@ c_parser_oacc_wait_list (c_parser *parser, 
> location_t clause_loc, tree list)
>  return list;
>  
>args = c_parser_expr_list (parser, false, true, NULL, NULL, NULL, NULL);
> -
> -  if (args->length () == 0)
> -{
> -  c_parser_error (parser, "expected integer expression before ')'");
> -  release_tree_vector (args);
> -  return list;
> -}
> -
>args_tree = build_tree_list_vec (args);
>  
>for (t = args_tree; t; t = TREE_CHAIN (t))
> diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
> index 6696f17..43128e0 100644
> --- a/gcc/cp/parser.c
> +++ b/gcc/cp/parser.c
> @@ -32086,7 +32086,7 @@ cp_parser_oacc_wait_list (cp_parser *parser, 
> location_t clause_loc, tree list)
>  
>if (args == NULL || args->length () == 0)
>  {
> -  cp_parser_error (parser, "expected integer expression before ')'");
> +  cp_parser_error (parser,

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Ian Lance Taylor

Michael Matz  writes:

> On Fri, 30 Nov 2018, Nick Clifton wrote:
>
>> Not without modifying the current demangling interface.  The problem is 
>> that the context structure is created for each invocation of a 
>> demangling function (from outside the library), and no state is 
>> preserved across demangling calls.  Thus in order to have a recursion 
>> limit which is configurable by the caller, you either need to have a 
>> global variable or else extend the demangling interface to include a 
>> recursion limit parameter.
>> 
>> I did consider just having a fixed limit, that the user cannot change, 
>> but I thought that this might be rejected by reviewers.  (On the grounds 
>> that different limits are appropriate to different execution 
>> environments). Note - enabling or disabling the recursion limit is 
>> controlled by a separate feature of the proposed patch, ie the new 
>> DMGL_RECURSE_LIMIT flag in the options field of the cplus_demangleXXX() 
>> functions.  But there is not enough room in the options field to also 
>> include a recursion limit value.
>
> Or we decide to not ignore this part of the GNU coding standard ...
>
>> 4.2 Writing Robust Programs
>>  
>> Avoid arbitrary limits on the length or number of any data structure, 
>> including file names, lines, files, and symbols, by allocating all data 
>> structures dynamically. In most Unix utilities, “long lines are silently 
>> truncated”. This is not acceptable in a GNU utility.
>
> ... just because script kiddies do mindless fuzzing work.  I realize that 
> you didn't implement a fixed limit, but IMHO it's bordering with that.

That section is "Writing Robust Programs."  Robustness guarantees have
to be different for utilities and servers.  A robust server doesn't
crash because of arbitrary user input, but there are servers that
demangle names that are provided by the user.  So we need two modes for
the demangler: one that takes anything and sometimes crashes, for
utilities like c++filt, and one that doesn't crash, for servers.  And it
seems like that is what Nick is suggesting.

Ian

Re: [PATCH] Call decl_default_tls_model with a proper type (PR gcov-profile/88279).

2018-11-30 Thread Jan Hubicka

> Hi.
> 
> This is quite obvious fix where I mixed type and variable
> when calling decl_default_tls_model.
> 
> I'm testing the patch on x86_64-linux-gnu.
> 
> Ready for trunk?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2018-11-30  Alexander Monakov  
> 
>   PR gcov-profile/88279
>   * tree-profile.c (init_ic_make_global_vars): Call
>   decl_default_tls_model for variable and not it's type.
OK, thanks!
Honza
> ---
>  gcc/tree-profile.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> 

> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> index 48204423eaf..506e00d6e90 100644
> --- a/gcc/tree-profile.c
> +++ b/gcc/tree-profile.c
> @@ -111,7 +111,7 @@ init_ic_make_global_vars (void)
>DECL_INITIAL (ic_tuple_var) = NULL;
>DECL_EXTERNAL (ic_tuple_var) = 1;
>if (targetm.have_tls)
> -set_decl_tls_model (ic_tuple_var, decl_default_tls_model (tuple_type));
> +set_decl_tls_model (ic_tuple_var, decl_default_tls_model (ic_tuple_var));
>  }
>  
>  /* Create the type and function decls for the interface with gcov.  */
>

Re: [PATCH][GCC][ARM] Ensure dotproduct is only enabled on armv8 neon

2018-11-30 Thread Richard Earnshaw (lists)

On 30/11/2018 14:19, Kyrill Tkachov wrote:
> 
> On 23/11/18 17:01, Sam Tebbs wrote:
>> Hi all,
>>
>> Currently on AArch32, invoking with -march=armv8.2-a+dotprod -mfpu=neon
>> incorrectly enables armv7 dotproduct. This patch restricts dotproduct
>> to armv8
>> to correct the issue.
>>
>> When using a float ABI different from that of the host platform,
>> including
>> /usr/include/gnu/stubs.h errors due to a non-existent stubs-soft.h
>> file, so
>> an inclusion of a standard header file is required to compile
>> successfully and
>> find the correct float ABI to select in
>> check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache.
>>
>> Bootstrapped and regression tested on arm-none-linux-gnueabihf with no
>> regressions.
>>
>> OK for trunk?
>>
> 
> Ok. I'd hope it's a niche usecase as the -mfpu=auto is the recommended
> way these days.
> 

Hmm, this should work in the same way as removing the fp16 or fp16fml
extensions.  So why doesn't specifying -mfpu=neon cause that bit to be
cleared from the feature sets?

R.

> Thanks,
> Kyrill
> 
>> gcc/ChangeLog:
>>
>> 2018-11-23  Sam Tebbs
>>
>>     * config/arm/arm.h (TARGET_DOTPROD): Add TARGET_VFP5 constraint.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2018-11-23  Sam Tebbs
>>
>>     * gcc.target/arm/neon-dotprod-restriction.c: New file.
>>  * lib/target-supports.exp
>> (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
>>  Include stdint.h.
>>
>

Re: [PATCH 4/4][libbacktrace] Add tests for unused formats

2018-11-30 Thread Ian Lance Taylor via gcc-patches

On Fri, Nov 30, 2018 at 1:06 AM, Tom de Vries  wrote:
> On 29-11-18 19:28, Ian Lance Taylor wrote:
>> On Fri, Nov 23, 2018 at 12:56 PM, Tom de Vries  wrote:
>>>
>>> When building libbacktrace, we typically use elf.c, and don't build 
>>> pecoff.c,
>>> xcoff.c or unknown.c
>>>
>>> Add testcases that use unused format to ensure that we also build and
>>> test those on a typical development setup.
>>>
>>> Bootstrapped and reg-tested on x86_64.
>>>
>>> OK for trunk?
>>>
>>> Thanks,
>>> - Tom
>>>
>>> [libbacktrace] Add tests for unused formats
>>>
>>> 2018-11-23  Tom de Vries  
>>>
>>> * configure.ac (NOT_HAVE_FORMAT_ELF, NOT_HAVE_FORMAT_PECOFF)
>>> (NOT_HAVE_FORMAT_UNKNOWN, NOT_HAVE_FORMAT_XCOFF_32)
>>> (NOT_HAVE_FORMAT_XCOFF_64): New AM_CONDITIONAL.
>>> * configure: Regenerate.
>>> * Makefile.am (check_PROGRAMS): Add test_elf, test_xcoff_32,
>>> test_xcoff_64, test_pecoff and test_unknown.
>>> * Makefile.in: Regenerate.
>>> * test_format.c: New file.
>>
>>
>> Again it seems feasible to avoid GNU make features, and skip the
>> negative conditionals and just build the tests for all formats
>> including the one on the current system.  It's not worth adding the
>> complexity to avoid building the test.  Thanks.
>
> Done.
>
> OK for trunk?

This is OK.

Thanks.

Ian

Re: [PATCH 1/4][libbacktrace] Test check_PROGRAMS without mmap

2018-11-30 Thread Ian Lance Taylor via gcc-patches

>> As far as I know libbacktrace does not currently rely on using GNU
>> make.  I'd rather not add that dependency for this purpose (I don't
>> mind adding this kind of testing but to me this seems only of mild
>> interest--I expect that all significant libbacktrace users have mmap).
>> You should be able to write something like
>>
>> libbacktrace_alloc_la_SOURCES = $(libbacktrace_SOURCES)
>> libbacktrace_alloc_la_LIBADD = $(BACKTRACE_FILE) $(FORMAT_FILE) read.c 
>> alloc.c
>>
>
> Done.
>
>> Then I wouldn't bother with only running the tests with HAVE_MMAP,
>> just add unconditional tests for btest_alloc, etc.  It's OK to run
>> duplicate tests for the incredibly rare case of a host that doesn't
>> support mmap.
>
> Done.
>
> OK for trunk?

This is OK.  (I assume the tests pass).

Thanks.

Ian

Re: [PATCH, libfortran] PR 88137 Initialize backtrace state once

2018-11-30 Thread Jerry DeLisle


On 11/29/18 11:37 PM, Janne Blomqvist wrote:

PING!



LGTM,

OK and thanks

Jerry


(for the GCC-7 branch, I'll commit it after the 7.4 release)

On Thu, Nov 22, 2018 at 11:17 AM Janne Blomqvist 
wrote:


 From backtrace.h for backtrace_create_state:

Calling this function allocates resources that can not be freed.
There is no backtrace_free_state function.  The state is used to
cache information that is expensive to recompute.  Programs are
expected to call this function at most once and to save the return
value for all later calls to backtrace functions.

So instead of calling backtrace_create_state every time we wish to
show a backtrace, do it once and store the result in a static
variable.  libbacktrace allows multiple threads to access the state,
so no need to use TLS.

Regtested on x86_64-pc-linux-gnu, Ok for trunk/8/7?

libgfortran/ChangeLog:

2018-11-22  Janne Blomqvist  

 PR libfortran/88137
 * runtime/backtrace.c (show_backtrace): Make lbstate a static
 variable, initialize once.
---
  libgfortran/runtime/backtrace.c | 12 
  1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/libgfortran/runtime/backtrace.c
b/libgfortran/runtime/backtrace.c
index e0c277044b6..3fc973a5e6d 100644
--- a/libgfortran/runtime/backtrace.c
+++ b/libgfortran/runtime/backtrace.c
@@ -146,11 +146,15 @@ full_callback (void *data, uintptr_t pc, const char
*filename,
  void
  show_backtrace (bool in_signal_handler)
  {
-  struct backtrace_state *lbstate;
+  /* Note that libbacktrace allows the state to be accessed from
+ multiple threads, so we don't need to use a TLS variable for the
+ state here.  */
+  static struct backtrace_state *lbstate;
struct mystate state = { 0, false, in_signal_handler };
-
-  lbstate = backtrace_create_state (NULL, __gthread_active_p (),
-   error_callback, NULL);
+
+  if (!lbstate)
+lbstate = backtrace_create_state (NULL, __gthread_active_p (),
+ error_callback, NULL);

if (lbstate == NULL)
  return;
--
2.17.1

Re: [PATCH][GCC][ARM] Ensure dotproduct is only enabled on armv8 neon

2018-11-30 Thread Kyrill Tkachov




On 23/11/18 17:01, Sam Tebbs wrote:

Hi all,

Currently on AArch32, invoking with -march=armv8.2-a+dotprod -mfpu=neon
incorrectly enables armv7 dotproduct. This patch restricts dotproduct to armv8
to correct the issue.

When using a float ABI different from that of the host platform, including
/usr/include/gnu/stubs.h errors due to a non-existent stubs-soft.h file, so
an inclusion of a standard header file is required to compile successfully and
find the correct float ABI to select in
check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache.

Bootstrapped and regression tested on arm-none-linux-gnueabihf with no
regressions.

OK for trunk?



Ok. I'd hope it's a niche usecase as the -mfpu=auto is the recommended way 
these days.

Thanks,
Kyrill


gcc/ChangeLog:

2018-11-23  Sam Tebbs

* config/arm/arm.h (TARGET_DOTPROD): Add TARGET_VFP5 constraint.

gcc/testsuite/ChangeLog:

2018-11-23  Sam Tebbs

* gcc.target/arm/neon-dotprod-restriction.c: New file.
 * lib/target-supports.exp
(check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
 Include stdint.h.

Re: [PATCH][GCC][ARM] Ensure dotproduct is only enabled on armv8 neon

2018-11-30 Thread Sam Tebbs

On 11/23/18 5:01 PM, Sam Tebbs wrote:

> Hi all,
>
> Currently on AArch32, invoking with -march=armv8.2-a+dotprod -mfpu=neon
> incorrectly enables armv7 dotproduct. This patch restricts dotproduct to armv8
> to correct the issue.
>
> When using a float ABI different from that of the host platform, including
> /usr/include/gnu/stubs.h errors due to a non-existent stubs-soft.h file, so
> an inclusion of a standard header file is required to compile successfully and
> find the correct float ABI to select in
> check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache.
>
> Bootstrapped and regression tested on arm-none-linux-gnueabihf with no
> regressions.
>
> OK for trunk?
>
> gcc/ChangeLog:
>
> 2018-11-23  Sam Tebbs
>
>   * config/arm/arm.h (TARGET_DOTPROD): Add TARGET_VFP5 constraint.
>
> gcc/testsuite/ChangeLog:
>
> 2018-11-23  Sam Tebbs
>
>   * gcc.target/arm/neon-dotprod-restriction.c: New file.
>   * lib/target-supports.exp
>   (check_effective_target_arm_v8_2a_dotprod_neon_ok_nocache):
>   Include stdint.h.
>
ping

Re: [PATCH, ARM] Improve robustness of -mslow-flash-data

2018-11-30 Thread Kyrill Tkachov


Hi Thomas,

On 19/11/18 17:56, Thomas Preudhomme wrote:

Hi,

Current code to handle -mslow-flash-data in machine description files
suffers from a number of issues which this patch fixes:

1) The insn_and_split in vfp.md to load a generic floating-point
constant via GPR first and move it to VFP register are guarded by
!reload_completed which is forbidden explicitely in the GCC internals
documentation section 17.2 point 3;

2) A number of testcase in the testsuite ICEs under -mslow-flash-data
when targeting the hardfloat ABI [1];

3) Instructions performing load from literal pool are not disabled.

These problems are addressed by 2 separate actions:

1) Making the splitters take a clobber and changing the expanders
accordingly to generate a mov with clobber in cases where a literal
pool would be used. The splitter can thus be enabled after reload since
it does not call gen_reg_rtx anymore;

2) Adding new predicates and constraints to disable literal pool loads
in existing instructions when -mslow-flash-data is in effect.



Please split these into two separate patches so we can more clearly see which 
changes address which problem


The patch also rework the splitter for DFmode slightly to generate an
intermediate DI load instead of 2 intermediate SI loads, thus relying on
the existing DI splitters instead of redoing their job. At last, the
patch adds some missing arm_fp_ok effective target to some of the
slow-flash-data testcases.

[1]
c-c++-common/Wunused-var-3.c
gcc.c-torture/compile/pr72771.c
gcc.c-torture/compile/vector-5.c
gcc.c-torture/compile/vector-6.c
gcc.c-torture/execute/20030914-1.c
gcc.c-torture/execute/20050316-1.c
gcc.c-torture/execute/pr59643.c
gcc.dg/builtin-tgmath-1.c
gcc.dg/debug/pr55730.c
gcc.dg/graphite/interchange-7.c
gcc.dg/pr56890-2.c
gcc.dg/pr68474.c
gcc.dg/pr80286.c
gcc.dg/torture/pr35227.c
gcc.dg/torture/pr65077.c
gcc.dg/torture/pr86363.c
g++.dg/torture/pr81112.C
g++.dg/torture/pr82985.C
g++.dg/warn/Wunused-var-7.C
and a lot more in libstdc++ in special_functions/*_comp_ellint_* and
special_functions/*_ellint_* directories.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2018-11-14  Thomas Preud'homme 

* config/arm/arm.md (arm_movdi): Split if -mslow-flash-data and
source is a constant that would be loaded by literal pool.
(movsf expander): Generate a no_literal_pool_sf_immediate insn if
-mslow-flash-data is present, targeting hardfloat ABI and source is a
float constant that cannot be loaded via vmov.
(movdf expander): Likewise but generate a no_literal_pool_df_immediate
insn.
(arm_movsf_soft_insn): Split if -mslow-flash-data and source is a
float constant that would be loaded by literal pool.
(softfloat constant movsf splitter): Splitter for the above case.
(movdf_soft_insn): Split if -mslow-flash-data and source is a float
constant that would be loaded by literal pool.
(softfloat constant movdf splitter): Splitter for the above case.
* config/arm/constraints.md (Pz): Document existing constraint.
(Ha): Define constraint.
(Tu): Likewise.
* config/arm/predicates.md (hard_sf_operand): New predicate.
(hard_df_operand): Likewise.
* config/arm/thumb2.md (thumb2_movsi_insn): Split if
-mslow-flash-data and constant would be loaded by literal pool.
* constant/arm/vfp.md (thumb2_movsi_vfp): Likewise and disable constant
load in VFP register.
(movdi_vfp): Likewise.
(thumb2_movsf_vfp): Use hard_sf_operand as predicate for source to
prevent match for a constant load if -mslow-flash-data and constant
cannot be loaded via vmov.  Adapt constraint accordingly by
using Ha instead of E for generic floating-point constant load.
(thumb2_movdf_vfp): Likewise using hard_df_operand predicate instead.
(no_literal_pool_df_immediate): Add a clobber to use as the
intermediate general purpose register and also enable it after reload
but disable it constant is a valid FP constant.  Add constraints and
generate a DI intermediate load rather than 2 SI loads.
(no_literal_pool_sf_immediate): Add a clobber to use as the
intermediate general purpose register and also enable it after
reload.

*** gcc/testsuite/ChangeLog ***

2018-11-14  Thomas Preud'homme 

* gcc.target/arm/thumb2-slow-flash-data-2.c: Require arm_fp_ok
effective target.
* gcc.target/arm/thumb2-slow-flash-data-3.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-4.c: Likewise.
* gcc.target/arm/thumb2-slow-flash-data-5.c: Likewise.

Testing: Built arm-none-eabi cross compilers for Armv7E-M defaulting to
softfloat and hardfloat ABI which showed no regression and some
FAIL->PASS for hardfloat ABI. Bootstraped on Arm and Thumb-2 without any
regression. Compiled SPEC2k6 without -mslow-flash-data and checked that
code genera

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 05:55:52AM -0800, Ian Lance Taylor wrote:
> Nick Clifton  writes:
> 
> > I did consider just having a fixed limit, that the user cannot change, but
> > I thought that this might be rejected by reviewers.  (On the grounds that
> > different limits are appropriate to different execution environments).
> > Note - enabling or disabling the recursion limit is controlled by a separate
> > feature of the proposed patch, ie the new DMGL_RECURSE_LIMIT flag in the 
> > options field of the cplus_demangleXXX() functions.  But there is not enough
> > room in the options field to also include a recursion limit value.
> 
> I think it would be fine to have a large fixed limit plus a flag to
> disable the limit.  I can't think of any reason why a program would want
> to change the limit unless it has complete trust in the symbols it is
> demangling, and in that case it may as well simply disable the limit.

Well, disabling the limit is what the people fuzzing it will use then
and report it still crashes.
We'd need to document that if somebody asks for no limit, then we don't
consider any cases of running as out of stack etc. as bugs, and simply
people shouldn't set that on when running on untrusted symbols.

Jakub

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Ian Lance Taylor

Nick Clifton  writes:

> I did consider just having a fixed limit, that the user cannot change, but
> I thought that this might be rejected by reviewers.  (On the grounds that
> different limits are appropriate to different execution environments).
> Note - enabling or disabling the recursion limit is controlled by a separate
> feature of the proposed patch, ie the new DMGL_RECURSE_LIMIT flag in the 
> options field of the cplus_demangleXXX() functions.  But there is not enough
> room in the options field to also include a recursion limit value.

I think it would be fine to have a large fixed limit plus a flag to
disable the limit.  I can't think of any reason why a program would want
to change the limit unless it has complete trust in the symbols it is
demangling, and in that case it may as well simply disable the limit.

Ian

Re: [PATCH][RFC] Extend locations where to seach for Fortran pre-include.

2018-11-30 Thread Martin Liška

On 11/27/18 2:40 PM, Martin Liška wrote:
> On 11/26/18 7:44 PM, Joseph Myers wrote:
>> On Mon, 26 Nov 2018, Martin Liška wrote:
>>
 I don't see how this version ensures that NATIVE_SYSTEM_HEADER_DIR is 
 properly sysrooted.  Note there's add_sysrooted_prefix separate from 
 add_prefix (but that's *not* the correct thing to use here because it uses 
 target_sysroot_suffix whereas you need target_sysroot_hdrs_suffix).
>>>
>>> I address that in updated version of the patch.
>>
>> However, this version seems to make TOOL_INCLUDE_DIR sysrooted as well.  
>> I don't think that's correct; TOOL_INCLUDE_DIR ($prefix/$target/include, 
>> roughly) is a non-sysroot location for headers.  Note that it's not 
>> sysrooted in cppdefault.c, which is a good guide to which directories 
>> should or should not be sysrooted, and what order they should come in 
>> (though as discussed, various of the directories there are not relevant 
>> for the present issue).
>>
>> The patch appears to be against some tree other than current trunk.  At 
>> least, it shows a function find_fortran_preinclude_file in gcc.c as 
>> already existing in the diff context, but I see no such function in the 
>> current sources.
>>
> 
> I've just installed a prerequisite patch and now you should be able
> to apply the patch on top of current trunk.
> 
> Thanks,
> Martin
> 

Hi Joseph.

About this: I'll be away for next 3 weeks and I'm planning to return to
this once I'm back. If you find a spare cycles and help me with the
location which should be searched in find_fortran_preinclude_file, I would
be happy ;)

Thanks,
Martin

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Michael Matz

Hi,

On Fri, 30 Nov 2018, Nick Clifton wrote:

> Not without modifying the current demangling interface.  The problem is 
> that the context structure is created for each invocation of a 
> demangling function (from outside the library), and no state is 
> preserved across demangling calls.  Thus in order to have a recursion 
> limit which is configurable by the caller, you either need to have a 
> global variable or else extend the demangling interface to include a 
> recursion limit parameter.
> 
> I did consider just having a fixed limit, that the user cannot change, 
> but I thought that this might be rejected by reviewers.  (On the grounds 
> that different limits are appropriate to different execution 
> environments). Note - enabling or disabling the recursion limit is 
> controlled by a separate feature of the proposed patch, ie the new 
> DMGL_RECURSE_LIMIT flag in the options field of the cplus_demangleXXX() 
> functions.  But there is not enough room in the options field to also 
> include a recursion limit value.

Or we decide to not ignore this part of the GNU coding standard ...

> 4.2 Writing Robust Programs
>  
> Avoid arbitrary limits on the length or number of any data structure, 
> including file names, lines, files, and symbols, by allocating all data 
> structures dynamically. In most Unix utilities, “long lines are silently 
> truncated”. This is not acceptable in a GNU utility.

... just because script kiddies do mindless fuzzing work.  I realize that 
you didn't implement a fixed limit, but IMHO it's bordering with that.

But re the static variables: you have two, once the recursion depth, and 
once the limit.

The limit can be a static variable just fine, you state, as part of the 
interface that it sets global state, and if that needs to happen from 
multiple threads that the user must synchronize.  This is fine because 
there won't be many calls to constantly change that limit.

The current depth needs to be part of the d_info (resp. workstuff) 
structure, avoiding the thread-unsafety.

Another crazy idea: do encode the limit in the option field.  If you 
declare that the limit is a power of two (which IMHO doesn't 
materially change the usefullness of such limit) then you only need to 
encode the log2 of it, for which 5 bits are plenty enough (you can encode 
limits from 1 to 1<<32 then).

Ciao,
Michael.

[PATCH] Call decl_default_tls_model with a proper type (PR gcov-profile/88279).

2018-11-30 Thread Martin Liška

Hi.

This is quite obvious fix where I mixed type and variable
when calling decl_default_tls_model.

I'm testing the patch on x86_64-linux-gnu.

Ready for trunk?
Thanks,
Martin

gcc/ChangeLog:

2018-11-30  Alexander Monakov  

PR gcov-profile/88279
* tree-profile.c (init_ic_make_global_vars): Call
decl_default_tls_model for variable and not it's type.
---
 gcc/tree-profile.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)


diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index 48204423eaf..506e00d6e90 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -111,7 +111,7 @@ init_ic_make_global_vars (void)
   DECL_INITIAL (ic_tuple_var) = NULL;
   DECL_EXTERNAL (ic_tuple_var) = 1;
   if (targetm.have_tls)
-set_decl_tls_model (ic_tuple_var, decl_default_tls_model (tuple_type));
+set_decl_tls_model (ic_tuple_var, decl_default_tls_model (ic_tuple_var));
 }
 
 /* Create the type and function decls for the interface with gcov.  */

[PATCH] Use proper location for tls_init function (PR c++/88263).

2018-11-30 Thread Martin Liška

Hi.

The PR is about a function (__tls_init) that ends before it begins
(from line point of view). The patch uses location of a variable where
we first create the function declaration. I'm not much familiar with C++ FE,
but it works.

Survives bootstrap and regression test on ppc64le-linux-gnu.

Ready for trunk?
Thanks,
Martin

gcc/cp/ChangeLog:

2018-11-30  Martin Liska  

PR c++/88263
* decl2.c (get_local_tls_init_fn): Add location_t argument and
use it.
(get_tls_init_fn): Call it with location of variable for which
we'll need to create tls_init function.
(handle_tls_init): Likewise.

gcc/testsuite/ChangeLog:

2018-11-30  Martin Liska  

* g++.dg/gcov/pr88263.C: New test.
---
 gcc/cp/decl2.c  | 12 ++--
 gcc/testsuite/g++.dg/gcov/pr88263.C | 30 +
 2 files changed, 36 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/gcov/pr88263.C


diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index ffc0d0d6ec4..0c1f3026c16 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -3327,15 +3327,15 @@ var_needs_tls_wrapper (tree var)
translation unit.  */
 
 static tree
-get_local_tls_init_fn (void)
+get_local_tls_init_fn (location_t loc)
 {
   tree sname = get_identifier ("__tls_init");
   tree fn = get_global_binding (sname);
   if (!fn)
 {
-  fn = build_lang_decl (FUNCTION_DECL, sname,
-			 build_function_type (void_type_node,
-		  void_list_node));
+  fn = build_lang_decl_loc (loc, FUNCTION_DECL, sname,
+build_function_type (void_type_node,
+		 void_list_node));
   SET_DECL_LANGUAGE (fn, lang_c);
   TREE_PUBLIC (fn) = false;
   DECL_ARTIFICIAL (fn) = true;
@@ -3365,7 +3365,7 @@ get_tls_init_fn (tree var)
   /* If the variable is internal, or if we can't generate aliases,
  call the local init function directly.  */
   if (!TREE_PUBLIC (var) || !TARGET_SUPPORTS_ALIASES)
-return get_local_tls_init_fn ();
+return get_local_tls_init_fn (DECL_SOURCE_LOCATION (var));
 
   tree sname = mangle_tls_init_fn (var);
   tree fn = get_global_binding (sname);
@@ -4431,7 +4431,7 @@ handle_tls_init (void)
   set_decl_tls_model (guard, decl_default_tls_model (guard));
   pushdecl_top_level_and_finish (guard, NULL_TREE);
 
-  tree fn = get_local_tls_init_fn ();
+  tree fn = get_local_tls_init_fn (loc);
   start_preparsed_function (fn, NULL_TREE, SF_PRE_PARSED);
   tree body = begin_function_body ();
   tree if_stmt = begin_if_stmt ();
diff --git a/gcc/testsuite/g++.dg/gcov/pr88263.C b/gcc/testsuite/g++.dg/gcov/pr88263.C
new file mode 100644
index 000..4dc4063fe7e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/gcov/pr88263.C
@@ -0,0 +1,30 @@
+// PR gcov-profile/88045
+// { dg-options "-fprofile-arcs -ftest-coverage -std=c++11" }
+// { dg-do run { target native } }
+
+#include 
+
+namespace log {
+
+class Logstream {
+public:
+
+private:
+/// The logging stream
+static thread_local std::ostringstream os_;
+};
+
+}
+
+namespace log {
+
+thread_local std::ostringstream Logstream::os_;
+
+}
+
+int main()
+{
+  return 0;
+}
+
+// { dg-final { run-gcov pr88263.C } }

Re: [PATCH 2/2] asm inline

2018-11-30 Thread Richard Biener

On Tue, Oct 30, 2018 at 6:31 PM Segher Boessenkool
 wrote:
>
> The Linux kernel people want a feature that makes GCC pretend some
> inline assembler code is tiny (while it would think it is huge), so
> that such code will be inlined essentially always instead of
> essentially never.
>
> This patch lets you say "asm inline" instead of just "asm", with the
> result that that inline assembler is always counted as minimum cost
> for inlining.  It implements this for C and C++.

The middle-end and documentation changes are OK.  I suppose the FE changes
are as well in case [2/2] is approved.

Thanks,
Richard.

>
> 2018-10-30  Segher Boessenkool  
>
> * doc/extend.texi (Using Assembly Language with C): Document asm 
> inline.
> (Size of an asm): Fix typo.  Document asm inline.
> * gimple-pretty-print.c (dump_gimple_asm): Handle asm inline.
> * gimple.h (enum gf_mask): Add GF_ASM_INLINE.
> (gimple_asm_set_volatile): Fix typo.
> * gimple_asm_inline_p: New.
> * gimple_asm_set_inline: New.
> * gimplify.c (gimplify_asm_expr): Propagate the asm inline flag from
> tree to gimple.
> * ipa-icf-gimple.c (func_checker::compare_gimple_asm): Compare the
> gimple_asm_inline_p flag, too.
> * tree-core.h (tree_base): Document that protected_flag is 
> ASM_INLINE_P
> in an ASM_EXPR.
> * tree-inline.c (estimate_num_insns): If gimple_asm_inline_p return
> a minimum size for an asm.
> * tree.h (ASM_INLINE_P): New.
>
> gcc/c/
> * c-parser.c (c_parser_asm_statement): Detect the inline keyword
> after asm.  Pass a flag for it to build_asm_expr.
> * c-tree.h (build_asm_expr): Update declaration.
> * c-typeck.c (build_asm_stmt): Add is_inline parameter.  Use it to
> set ASM_INLINE_P.
>
> gcc/cp/
> * cp-tree.h (finish_asm_stmt): Update declaration.
> * parser.c (cp_parser_asm_definition): Detect the inline keyword
> after asm.  Pass a flag for it to finish_asm_stmt.
> * pt.c (tsubst_expr): Pass the ASM_INLINE_P flag to finish_asm_stmt.
> * semantics.c (finish_asm_stmt): Add inline_p parameter.  Use it to
> set ASM_INLINE_P.
>
> gcc/testsuite/
> * c-c++-common/torture/asm-inline.c: New testcase.
>
> ---
>  gcc/c/c-parser.c| 15 +--
>  gcc/c/c-tree.h  |  3 +-
>  gcc/c/c-typeck.c|  3 +-
>  gcc/cp/cp-tree.h|  2 +-
>  gcc/cp/parser.c | 15 ++-
>  gcc/cp/pt.c |  2 +-
>  gcc/cp/semantics.c  |  3 +-
>  gcc/doc/extend.texi | 10 -
>  gcc/gimple-pretty-print.c   |  2 +
>  gcc/gimple.h| 24 ++-
>  gcc/gimplify.c  |  1 +
>  gcc/ipa-icf-gimple.c|  3 ++
>  gcc/testsuite/c-c++-common/torture/asm-inline.c | 53 
> +
>  gcc/tree-core.h |  3 ++
>  gcc/tree-inline.c   |  3 ++
>  gcc/tree.h  |  3 ++
>  16 files changed, 133 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/torture/asm-inline.c
>
> diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
> index ce9921e..b28b712 100644
> --- a/gcc/c/c-parser.c
> +++ b/gcc/c/c-parser.c
> @@ -6283,11 +6283,12 @@ c_parser_for_statement (c_parser *parser, bool ivdep, 
> unsigned short unroll,
>  }
>
>  /* Parse an asm statement, a GNU extension.  This is a full-blown asm
> -   statement with inputs, outputs, clobbers, and volatile and goto tag
> -   allowed.
> +   statement with inputs, outputs, clobbers, and volatile, inline, and goto
> +   tags allowed.
>
> asm-qualifier:
>   type-qualifier
> + inline
>   goto
>
> asm-qualifier-list:
> @@ -6315,7 +6316,7 @@ static tree
>  c_parser_asm_statement (c_parser *parser)
>  {
>tree quals, str, outputs, inputs, clobbers, labels, ret;
> -  bool simple, is_goto;
> +  bool simple, is_inline, is_goto;
>location_t asm_loc = c_parser_peek_token (parser)->location;
>int section, nsections;
>
> @@ -6323,6 +6324,7 @@ c_parser_asm_statement (c_parser *parser)
>c_parser_consume_token (parser);
>
>quals = NULL_TREE;
> +  is_inline = false;
>is_goto = false;
>for (bool done = false; !done; )
>  switch (c_parser_peek_token (parser)->keyword)
> @@ -6340,6 +6342,10 @@ c_parser_asm_statement (c_parser *parser)
> c_parser_peek_token (parser)->value);
> c_parser_consume_token (parser);
> break;
> +  case RID_INLINE:
> +   is_inline = true;
> +   c_parser_consume_token (parser);
> +   break;
>case RID_GOTO:
> is_goto = true;
> c_

Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-11-30 Thread Richard Biener

On Fri, Nov 30, 2018 at 9:33 AM Alexander Monakov  wrote:
>
> On Fri, 30 Nov 2018, Richard Biener wrote:
> > > Ping - as I think this approach addresses the root of the problem, I 
> > > wouldn't
> > > like it to be forgotten.
> >
> > I agree this is also useful but it addresses another issue (that may appear 
> > to
> > be related).  asm inline is really a hint to the inliner estimates (with no 
> > way
> > to get semantics botched) while marking off-section parts is making the
> > asm text more precise also affecting code generation and thus has the
> > possibility to cause correctness issues (if you say mark everything as
> > off-section just to make it inline better).
>
> I don't think that's true: if the user marks too much of the template as
> off-section and makes GCC under-estimate branch ranges, they may receive an
> error from the assembler. But that's a build failure, not a correctness
> issue. Surely build error is a reasonable outcome from misuse of inline asm.

Yes, but I say they can't mark all of the asm test off-section to make it more
likely be inlined.  Because that might fire back (only on some weird targets
in weird circumstances).  But they can use asm inline for that.

> > I'm sympathtetic to both patches but clearly the kernel folks have shown
> > need for the inline hint (arguably the kernel folks are the ones we could
> > expect to get usages of %` correct ...).  I haven't seen any comments
> > from possible users of %`
>
> FWIW, when I raised the idea in the kernel thread, the response from Borislav
> was:
>
> >> I don't mind it but I see you guys are still discussing what would be
> >> the better solution here, on the gcc ML. And we, as one user, are a
> >> happy camper as long as it does what it is meant to do. But how the
> >> feature looks like syntactically is something for gcc folk to decide as
> >> they're going to support it for the foreseeable future and I'm very well
> >> aware of how important it is for a supportable feature to be designed
> >> properly.
>
> Alexander

Re: [RS6000] PowerPC64 soft-float

2018-11-30 Thread Segher Boessenkool

On Fri, Nov 30, 2018 at 06:28:28PM +1030, Alan Modra wrote:
> On Thu, Nov 29, 2018 at 12:15:06PM -0600, Segher Boessenkool wrote:
> > On Sun, Nov 25, 2018 at 10:50:27PM +1030, Alan Modra wrote:
> > > This patch aims to prevent long sequences loading soft-float
> > > constants.  For 32-bit, it makes sense to load values inline to a gpr
> > > with lis, addi, but not so much for 64-bit where a 5 insn sequence
> > > might be needed for each gpr.  For TFmode in particular, a 10 insn
> > > sequence is reduced to 2 loads from memory plus 1 or 2 address setup
> > > insns.
> > > 
> > > Bootstrapped etc. powerpc64le-linux and powerpc64-linux.  OK for
> > > next stage1?
> > 
> > It's okay now, even.
> 
> Thanks!  Revised patch as per your other comments bootstrapped and
> regression tested powerpc64le-linux.

That looks great.  Okay for trunk.  Thanks!

>   * config/rs6000/predicates.md (easy_fp_constant): Avoid long
>   dependent insn sequences.
>   * config/rs6000/rs6000.c (num_insns_constant): Support long
>   double constants.
>   * config/rs6000/rs6000.md (mov_softfloat128) Adjust length
>   attribute.

Missing colon on that last line.


Segher

[PATCH] [MIPS] GCC: Fix Loongson3 LLSC Errata

2018-11-30 Thread Paul Hua

In some older Loongson3 processors there is a LL/SC errata that can
cause the CPU to deadlock occasionally.  The details are very
complicated. We find a way to work around this errata by a) adding a
sync before ll/lld instruction, b) adding a sync
before branch target that between ll and sc. The assembler do the jobs
'a', gcc do the jobs 'b'.

This patch also add a configure options
--with-mips-fix-loongson3-llsc=[yes|no] to enable fix-loongson3-llsc
by config.
From 16f0fd9e32d2098637dc0eb3e576444c48c43f22 Mon Sep 17 00:00:00 2001
From: Chenghua Xu 
Date: Fri, 30 Nov 2018 19:57:38 +0800
Subject: [PATCH] [MIPS][GCC]  Fix Loongson3 LLSC Errata.

gcc/
	* config.gcc (supported_defaults): Add fix-loongson3-llsc
	(with_fix_loongson3_llsc): Add validation.
	(all_defaults): Add fix-loongson3-llsc.
	* config/mips/mips.c (mips_process_sync_loop): Add sync before
	branch target that between ll and sc.
	* config/mips/mips.h (OPTION_DEFAULT_SPECS): Add a default for
	fix-loongson3-llsc.
	gcc/config/mips/mips.opt: New option.
	* doc/install.texi (--with-fix-loongson3-llsc):Document the new
	option.
	* doc/invoke.texi (-mfix-loongson3-llsc):Document the new option.

gcc/testsuite/
	* gcc.target/mips/fix-loongson3-llsc.c: New test.
	* gcc.target/mips/mips.exp (option): Add fix-loongson3-llsc.
---
 gcc/config.gcc| 19 +--
 gcc/config/mips/mips.c| 13 +++--
 gcc/config/mips/mips.h|  4 +++-
 gcc/config/mips/mips.opt  |  4 
 gcc/doc/install.texi  |  4 
 gcc/doc/invoke.texi   |  8 
 .../gcc.target/mips/fix-loongson3-llsc.c  | 10 ++
 gcc/testsuite/gcc.target/mips/mips.exp|  1 +
 8 files changed, 58 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/fix-loongson3-llsc.c

diff --git a/gcc/config.gcc b/gcc/config.gcc
index f6162ed496e..9887b43dc87 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4323,7 +4323,7 @@ case "${target}" in
 		;;
 
 	mips*-*-*)
-		supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 madd4"
+		supported_defaults="abi arch arch_32 arch_64 float fpu nan fp_32 odd_spreg_32 tune tune_32 tune_64 divide llsc mips-plt synci lxc1-sxc1 madd4 fix-loongson3-llsc"
 
 		case ${with_float} in
 		"" | soft | hard)
@@ -4476,6 +4476,21 @@ case "${target}" in
 			exit 1
 			;;
 		esac
+
+		case ${with_fix_loongson3_llsc} in
+		yes)
+			with_fix_loongson3_llsc=fix-loongson3-llsc
+			;;
+		no)
+			with_fix_loongson3_llsc=no-fix-loongson3-llsc
+			;;
+		"")
+			;;
+		*)
+			echo "Unknown fix-loongson3-llsc type used in --with-fix-loongson3-llsc" 1>&2
+			exit 1
+			;;
+		esac
 		;;
 
 	nds32*-*-*)
@@ -4995,7 +5010,7 @@ case ${target} in
 esac
 
 t=
-all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls lxc1-sxc1 madd4"
+all_defaults="abi cpu cpu_32 cpu_64 arch arch_32 arch_64 tune tune_32 tune_64 schedule float mode fpu nan fp_32 odd_spreg_32 divide llsc mips-plt synci tls lxc1-sxc1 madd4 fix-loongson3-llsc"
 for option in $all_defaults
 do
 	eval "val=\$with_"`echo $option | sed s/-/_/g`
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 55b44078518..717f3d03292 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -14127,7 +14127,7 @@ mips_process_sync_loop (rtx_insn *insn, rtx *operands)
   mips_multi_start ();
 
   /* Output the release side of the memory barrier.  */
-  if (need_atomic_barrier_p (model, true))
+  if (need_atomic_barrier_p (model, true) && !FIX_LOONGSON3_LLSC)
 {
   if (required_oldval == 0 && TARGET_OCTEON)
 	{
@@ -14148,6 +14148,10 @@ mips_process_sync_loop (rtx_insn *insn, rtx *operands)
   /* Output the branch-back label.  */
   mips_multi_add_label ("1:");
 
+  /* Loongson3 target need sync before ll/lld.  */
+  if (need_atomic_barrier_p (model,  true) && FIX_LOONGSON3_LLSC)
+mips_multi_add_insn ("sync", NULL);
+
   /* OLDVAL = *MEM.  */
   mips_multi_add_insn (is_64bit_p ? "lld\t%0,%1" : "ll\t%0,%1",
 		   oldval, mem, NULL);
@@ -14257,13 +14261,18 @@ mips_process_sync_loop (rtx_insn *insn, rtx *operands)
 mips_multi_add_insn ("li\t%0,1", cmp, NULL);
 
   /* Output the acquire side of the memory barrier.  */
-  if (TARGET_SYNC_AFTER_SC && need_atomic_barrier_p (model, false))
+  if (TARGET_SYNC_AFTER_SC && need_atomic_barrier_p (model, false)
+  && !FIX_LOONGSON3_LLSC)
 mips_multi_add_insn ("sync", NULL);
 
   /* Output the exit label, if needed.  */
   if (required_oldval)
 mips_multi_add_label ("2:");
 
+  /* Loongson3 need a sync before branch target that between ll and sc.  */
+  if (FIX_LOONGSON3_LLSC)
+mips_multi_add_insn ("sync", NULL);
+
 #undef READ_OPERAND
 }
 
diff --git a/gcc/config/mips/mips.h b/gcc/config/

Re: [patch,openacc] Fix acc_shutdown issue

2018-11-30 Thread Julian Brown

On Thu, 20 Sep 2018 10:05:30 -0700
Cesar Philippidis  wrote:

> Attached is an old gomp4 patch that allegedly fixes an shutdown
> runtime issue involving OpenACC accelerators. Unfortunately, the
> original patch didn't include a test case, nor did it generate any
> regressions in the libgomp testsuite when I reverted it in og8.
> 
> With that said, I like how this patch eliminates the redundant use of
> gomp_mutex_lock to unmap variables (because gomp_unmap_vars already
> acquires a lock). However, the trade-off is that it does increase
> tgt->list_count to num_funcs + num_vars.
> 
> Does anyone have any strong opinion on this patch and is it OK for
> trunk? I bootstrapped and regtested it for x86_64 Linux with nvptx
> offloading and I didn't encounter any regressions.

I'd like to withdraw this patch (on behalf of Cesar, who has left
Mentor). It's been superseded by:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02556.html

Thanks,

Julian

Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 12:44:04PM +0100, Martin Liška wrote:
> Ok, I'm sending updated version of the patch. I factored out the shadow memory
> byte emission into a class, it's responsible for underlying flushing and 
> guarantees
> that stores are 4B aligned (when beginning of stack vars is properly aligned
> to ASAN_RED_ZONE_SIZE).
> 
> So far I tested the patch on x86_64-linux-gnu and ppc64le-linux-gnu machine.

Can you please do a bootstrap-asan too at least on the former to test it
some more?
Ok for trunk if that succeeds or doesn't regress compared to without this patch.

Thanks.

> 2018-11-30  Martin Liska  
> 
>   PR sanitizer/81715
>   * asan.c (asan_shadow_cst): Remove, partially transform
>   into flush_redzone_payload.
>   (RZ_BUFFER_SIZE): New.
>   (struct asan_redzone_buffer): New.
>   (asan_redzone_buffer::emit_redzone_byte): Likewise.
>   (asan_redzone_buffer::flush_redzone_payload): Likewise.
>   (asan_redzone_buffer::flush_if_full): Likewise.
>   (asan_emit_stack_protection): Use asan_redzone_buffer class
>   that is responsible for proper aligned stores and flushing
>   of shadow memory payload.
>   * asan.h (ASAN_MIN_RED_ZONE_SIZE): New.
>   (asan_var_and_redzone_size): Likewise.
>   * cfgexpand.c (expand_stack_vars): Use smaller alignment
>   (ASAN_MIN_RED_ZONE_SIZE) in order to make shadow memory
>   for automatic variables more compact.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-11-30  Martin Liska  
> 
>   PR sanitizer/81715
>   * c-c++-common/asan/asan-stack-small.c: New test.

Jakub

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Pedro Alves

On 11/30/2018 08:26 AM, Nick Clifton wrote:
> Hi Pedro, Hi Tom,
> 
>> Pedro> E.g., in GDB, loading big binaries, demangling is very high
>> Pedro> in profiles, and so we've kicked around the desire to parallelize
>> Pedro> it
> 
> I did consider this, but I encountered two problems:
> 
>   1. I wanted users of the demangling functions to be able to change
>  the recursion limit.  So for example in environments with a very
>  limited amount of stack space the limit could be reduced.  This
>  is one of the purposes of the cplus_demangle_set_recursion_limit()
>  function.
> 
>  Since a new d_info structure is created every time a string is demangled
>  the only way to implement cplus_demangle_set_recursion_limit would
>  be to have it set the value that is used to initialize the field in
>  the structure, which is basically back to where we are now.

That's a lesser evil.  With the proposed cplus_demangle_set_recursion_limit
interface, you essentially end up with an interface that makes all threads in
the process end up with the same limit.  There's precedent for global options
in the current_demangling_style global, set by the cplus_demangle_set_style
function.  That's one I recalled, there may be others.  I'd prefer interfaces
that pass down the options as arguments, or a pointer to an options struct, but
that's not the main issue.

The main issue is the current recursion level counter.
That's the static variable that I had pointed at:

  d_function_type (struct d_info *di)
  {
[...]
 +  static unsigned long recursion_level = 0;
[...]
 +  recursion_level ++;
[...]
 +  recursion_level --;
return ret;
 }

Even if we go with a recursion limit per process, this recursion
level count must be moved to d_info for thread-safety without
synchronization.

> 
>   2. I wanted to be able to access the recursion limit from code
>  in cplus-dem.c, which does not use the d_info structure.  

That should just mean a bit of plumbing is in order to pass down
either a d_info structure or something like it?

> This was
>  the other purpose of the cplus_demangler_set_recursion_limit()
>  function - it allows the limit to be read without changing it.
> 
> As an alternative I did consider adding an extra parameter to the 
> cplus_demangle(), cplus_demangle_opname() and related functions.  But
> this would make them non-backwards compatible and I did not want to 
> do that.

I'd say that ideally, we'd aim at the interface we want, and then go
from there.  If changing interfaces is a problem, we can always add
a new entry point and keep the old as backward compatibility.  
But is it (a problem)?  Do we require ABI compatibility here?

But again, the main issue I'm seeing is the recursion level counter,
not the global limit.

> I would imagine that changing the recursion limit would be a very
> rare event, possibly only ever done once in an executable's runtime, 
> so I doubt that there will ever be any real-life thread safety 
> problems associated with it.  But I do understand that this is just
> an assumption, not a guarantee.
See above.

Thanks,
Pedro Alves

[PATCH] OpenACC reference count consistency checking

2018-11-30 Thread Julian Brown


This is a trunk-compatible version of the patch posted here:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02365.html

I understand it may not be suitable for committing (especially not
outside stage 1 -- though it's "obviously harmless" in its dormant state),
but it might be helpful for review purposes for the main attach/detach
patch, i.e.:

  https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02556.html

For convenience, I will copy the blurb from the og8 submission of the
patch here.

[...] The model used for checking is as follows.

 1. Each splay tree key that references a target memory descriptor
increases that descriptor's refcount by 1.

 2. Each variable listed in a target memory descriptor that links back to a
splay tree key increases that key's refcount by 1. Each target memory
descriptor's variable list is counted only once, even if multiple
splay tree keys point to it (via their "tgt" field).

 3. Additional ("real") target memory descriptors may be present
representing data mapped through "acc data" or "acc parallel/kernels"
blocks.  These descriptors have their refcount bumped, and the
variables linked through such blocks have their refcounts bumped also
(again, with "once only" semantics).

 4. Asynchronous operations "artificially" bump the reference counts for
referenced target memory descriptors (but *not* for linked
variables/splay tree keys), in order to delay freeing mapped device
memory until the asynchronous operation has completed.  We model this,
for checking purposes only, using an off-side linked list.

 5. "Virtual" reference counts ("virtual_refcount") cannot be checked
purely statically, so we add the incoming value to each key's
statically-determined reference count ("refcount_chk"), and make
sure that the total matches the incoming reference count ("refcount").

Thanks,

Julian

ChangeLog

libgomp/
* libgomp.h (RC_CHECKING): New macro, disabled by default, guarding all
hunks in this patch.
(target_mem_desc): Add forward declaration.
(async_tgt_use): New struct.
(target_mem_desc): Add refcount_chk, mark fields.
(acc_dispatch_t): Add tgt_uses, au_lock fields.
(dump_tgt, gomp_rc_check): Add prototypes.
* oacc-async (goacc_async_unmap_tgt): Add refcount self-check code.
(goacc_async_copyout_unmap_vars): Likewise.
(goacc_remove_var_async): Likewise.
* oacc-parallel.c (GOACC_parallel_keyed_internal): Add refcount
self-check code.
(GOACC_data_start, GOACC_data_end, GOACC_enter_exit_data): Likewise.
* target.c (stdio.h): Include.
(dump_tgt, rc_check_clear, rc_check_count, rc_check_verify)
(gomp_rc_check): New functions to consistency-check reference counts.
(gomp_target_init): Initialise self-check-related device fields.
---
 libgomp/libgomp.h   |   31 +++
 libgomp/oacc-async.c|   46 +++
 libgomp/oacc-parallel.c |   33 
 libgomp/target.c|  199 +++
 4 files changed, 309 insertions(+), 0 deletions(-)

diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index df49c1b..24cbddd 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -874,9 +874,26 @@ struct target_var_desc {
   uintptr_t length;
 };
 
+/* Uncomment to enable reference-count consistency checking (for development
+   use only).  */
+/*#define RC_CHECKING 1*/
+
+#ifdef RC_CHECKING
+struct target_mem_desc;
+
+struct async_tgt_use {
+  struct target_mem_desc *tgt;
+  struct async_tgt_use *next;
+};
+#endif
+
 struct target_mem_desc {
   /* Reference count.  */
   uintptr_t refcount;
+#ifdef RC_CHECKING
+  uintptr_t refcount_chk;
+  bool mark;
+#endif
   /* All the splay nodes allocated together.  */
   splay_tree_node array;
   /* Start of the target region.  */
@@ -925,6 +942,10 @@ struct splay_tree_key_s {
  "present increment" operations (via "acc enter data") refering to the same
  host-memory block.  */
   uintptr_t virtual_refcount;
+#ifdef RC_CHECKING
+  /* The recalculated reference count, for verification.  */
+  uintptr_t refcount_chk;
+#endif
   /* For a block with attached pointers, the attachment counters for each.  */
   unsigned short *attach_count;
   /* Pointer to the original mapping of "omp declare target link" object.  */
@@ -958,6 +979,10 @@ typedef struct acc_dispatch_t
 int nasyncqueue;
 struct goacc_asyncqueue **asyncqueue;
 struct goacc_asyncqueue_list *active;
+#ifdef RC_CHECKING
+struct async_tgt_use *tgt_uses;
+gomp_mutex_t au_lock;
+#endif
 
 __typeof (GOMP_OFFLOAD_openacc_async_construct) *construct_func;
 __typeof (GOMP_OFFLOAD_openacc_async_destruct) *destruct_func;
@@ -1085,6 +1110,12 @@ extern void gomp_detach_pointer (struct gomp_device_descr *,
  struct goacc_asyncqueue *, splay_tree_key,
  uintptr_t, bool, struct gomp_coalesce_buf *);
 
+#ifdef RC_CHECKING
+extern v

Re: [PATCH] PR libstdc++/67843 set shared_ptr lock policy at build-time

2018-11-30 Thread Ramana Radhakrishnan



Firstly thanks a lot for working on this.

On 28/11/2018 12:49, Jonathan Wakely wrote:
> On 28/11/18 12:30 +, Jonathan Wakely wrote:
>> 3) We could change libstdc++'s configure to automatically override
>> --with-libstdcxx-lock-policy for arm-linux so it always uses "atomic"
>> (because we know the kernel helpers are available). That causes an ABI
>> change for a GCC built for --target=armv5t-*-linux* so I didn't do
>> that by default. Packagers who want that can use the --with option
>> explicitly to build a libstdc++ with atomic refcounting that can be
>> used on any armv5t or later CPU, allowing users to choose a newer
>> -march for their own code. (Until my patch that didn't work, because
> 
> [...]
> 
>> Option 3) is not my call to make. If the ARM maintainers want to
>> change the default older arm-linux targets I have no objections.

We could change the defaults when(if) we next bump up the libstdc++ ABI 
in general :-/ and if we remember to do this. I don't feel comfortable 
changing the defaults silently and I don't view this as something we can 
decide on by ourselves as the Arm maintainers as this really falls in 
the space of folks distributing Linux using by defaults versions of the 
architecture that result in the use of mutexes rather than the atomics.

It's also not clear to me if we can use any other magic tricks to make 
this co-exist and whether that is worth it.
> 
> Another way to approach option 3) that we discussed at Cauldron, and
> which I want to start a separate discussion about on g...@gcc.gnu.org,
> is to change the value of __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 et al
> when we have kernel helpers to implement those operations.
> 
> The shared_ptr code doesn't care if an atomic CAS comes from the CPU
> or a kernel helper in libgcc. If the atomic CAS is supported via
> kernel helpers, let's use it! But those predefined macros only tell
> you about native CPU support (for the current target selected by
> -march).
> 
> 
> It's worth noting that all this shared_ptr code predates libatomic, so
> we couldn't just say "always use atomics, and link to libatomic if
> needed". If __GCC_HAVE_SYNC_COMPARE_AND_SWAP_4 was not defined, we had
> to assume no CAS *at all*. Not native, not provided by libatomic,
> nothing. I'd love to simply rely on std::atomic in std::shared_ptr,
> but that would be an ABI break for targets currently using a mutex,
> and would add new dependencies on libatomic for some targets. (It
> might also pessimize some single-threaded programs on targets that do
> use atomic ref-counts, because currently some updates are non-atomic
> when __gthread_active_p() == false).

Yep, though you want it to go to the kernel helpers or indeed libatomic.

regards
Ramana

> 
>

Re: [PATCH] Come up with --param asan-stack-small-redzone (PR sanitizer/81715).

2018-11-30 Thread Martin Liška

On 11/29/18 5:46 PM, Jakub Jelinek wrote:
> On Thu, Nov 29, 2018 at 05:37:11PM +0100, Martin Liška wrote:
>>   0x10007fff7b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b30: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b40: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>> =>0x10007fff7b50: 00 00 00 00 f1 f1 f1 f1[01]f2 00 00 01 f2 f2 f2
>>   0x10007fff7b60: f2 f2 00 00 00 06 f2 f2 f2 f2 00 00 00 00 00 f3
>>   0x10007fff7b70: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7b90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>   0x10007fff7ba0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>>
>> as seen 06f2f2f2 is completely unaligned write.
> 
> Sure, but the whole frame section where the vars from one function are
> is properly aligned.  So, you shouldn't store 0xf2f2f206, but 0xf2f20600
> one byte earlier, and then 0xf2f2 after it.  Of course, if we decide
> that we want to do solely SImode stores.  If we want to complicate it
> through determining costs and comparing if we should do a 1 or 2 or 4 byte
> store and if unaligned is possible and not very expensive consider even
> those in some cases, the code might be more complex.
> I guess for now I'd be fine with SImode stores only and if we get complains,
> we can address that incrementally (the question is if we'd want hooks or
> what to determine it).
> 
>   Jakub
> 

Ok, I'm sending updated version of the patch. I factored out the shadow memory
byte emission into a class, it's responsible for underlying flushing and 
guarantees
that stores are 4B aligned (when beginning of stack vars is properly aligned
to ASAN_RED_ZONE_SIZE).

So far I tested the patch on x86_64-linux-gnu and ppc64le-linux-gnu machine.

Martin
>From 06b434ecf6e80bd1f99c1f48d41a72fd6fe1d1f6 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 25 Sep 2018 10:54:37 +0200
Subject: [PATCH] Make red zone size more flexible for stack variables (PR
 sanitizer/81715).

gcc/ChangeLog:

2018-11-30  Martin Liska  

	PR sanitizer/81715
	* asan.c (asan_shadow_cst): Remove, partially transform
	into flush_redzone_payload.
	(RZ_BUFFER_SIZE): New.
	(struct asan_redzone_buffer): New.
	(asan_redzone_buffer::emit_redzone_byte): Likewise.
	(asan_redzone_buffer::flush_redzone_payload): Likewise.
	(asan_redzone_buffer::flush_if_full): Likewise.
	(asan_emit_stack_protection): Use asan_redzone_buffer class
	that is responsible for proper aligned stores and flushing
	of shadow memory payload.
	* asan.h (ASAN_MIN_RED_ZONE_SIZE): New.
	(asan_var_and_redzone_size): Likewise.
	* cfgexpand.c (expand_stack_vars): Use smaller alignment
	(ASAN_MIN_RED_ZONE_SIZE) in order to make shadow memory
	for automatic variables more compact.

gcc/testsuite/ChangeLog:

2018-11-30  Martin Liska  

	PR sanitizer/81715
	* c-c++-common/asan/asan-stack-small.c: New test.
---
 gcc/asan.c| 202 ++
 gcc/asan.h|  25 +++
 gcc/cfgexpand.c   |  14 +-
 .../c-c++-common/asan/asan-stack-small.c  |  28 +++
 4 files changed, 219 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/asan-stack-small.c

diff --git a/gcc/asan.c b/gcc/asan.c
index 45906bf8fee..5d1d1dec064 100644
--- a/gcc/asan.c
+++ b/gcc/asan.c
@@ -1155,20 +1155,6 @@ asan_pp_string (pretty_printer *pp)
   return build1 (ADDR_EXPR, shadow_ptr_types[0], ret);
 }
 
-/* Return a CONST_INT representing 4 subsequent shadow memory bytes.  */
-
-static rtx
-asan_shadow_cst (unsigned char shadow_bytes[4])
-{
-  int i;
-  unsigned HOST_WIDE_INT val = 0;
-  gcc_assert (WORDS_BIG_ENDIAN == BYTES_BIG_ENDIAN);
-  for (i = 0; i < 4; i++)
-val |= (unsigned HOST_WIDE_INT) shadow_bytes[BYTES_BIG_ENDIAN ? 3 - i : i]
-	   << (BITS_PER_UNIT * i);
-  return gen_int_mode (val, SImode);
-}
-
 /* Clear shadow memory at SHADOW_MEM, LEN bytes.  Can't call a library call here
though.  */
 
@@ -1235,6 +1221,136 @@ shadow_mem_size (unsigned HOST_WIDE_INT size)
   return ROUND_UP (size, ASAN_SHADOW_GRANULARITY) / ASAN_SHADOW_GRANULARITY;
 }
 
+/* Always emit 4 bytes at a time.  */
+#define RZ_BUFFER_SIZE 4
+
+/* ASAN redzone buffer container that handles emission of shadow bytes.  */
+struct asan_redzone_buffer
+{
+  /* Constructor.  */
+  asan_redzone_buffer (rtx shadow_mem, HOST_WIDE_INT prev_offset):
+m_shadow_mem (shadow_mem), m_prev_offset (prev_offset),
+m_original_offset (prev_offset), m_shadow_bytes (RZ_BUFFER_SIZE)
+  {}
+
+  /* Emit VALUE shadow byte at a given OFFSET.  */
+  void emit_redzone_byte (HOST_WIDE_INT offset, unsigned char value);
+
+  /* Emit RTX emission of the content of the buffer.  */
+  void flush_redzone_payload (void);
+
+private:
+  /* Flush if the c

[C++ Patch] Use FUNC_OR_METHOD_TYPE_P in cp/

2018-11-30 Thread Paolo Carlini


Hi,

as promised in the exchange with Marek, I'm sending the straightforward 
patch, likely for next Stage 1. Tested x86_64-linux, as usual.


Thanks, Paolo.



2018-11-30  Paolo Carlini  

* call.c (build_call_a): Use FUNC_OR_METHOD_TYPE_P.
* cp-gimplify.c (cp_fold): Likewise.
* cp-objcp-common.c (cp_type_dwarf_attribute): Likewise.
* cp-tree.h (TYPE_OBJ_P, TYPE_PTROBV_P): Likewise.
* cvt.c (perform_qualification_conversions): Likewise.
* decl.c (grokdeclarator): Likewise.
* decl2.c (build_memfn_type): Likewise.
* mangle.c (canonicalize_for_substitution, write_type): Likewise.
* parser.c (cp_parser_omp_declare_reduction): Likewise.
* pt.c (check_explicit_specialization, uses_deducible_template_parms,
check_cv_quals_for_unify, dependent_type_p_r): Likewise.
* rtti.c (ptr_initializer): Likewise.
* semantics.c (finish_asm_stmt, finish_offsetof,
cp_check_omp_declare_reduction): Likewise.
* tree.c (cp_build_qualified_type_real,
cp_build_type_attribute_variant, cxx_type_hash_eq,
cxx_copy_lang_qualifiers, cp_free_lang_data): Likewise.
* typeck.c (structural_comptypes, convert_arguments,
cp_build_addr_expr_1, unary_complex_lvalue, cp_build_c_cast,
cp_build_modify_expr, comp_ptr_ttypes_real, type_memfn_rqual):
Likewise.
Index: call.c
===
--- call.c  (revision 266503)
+++ call.c  (working copy)
@@ -357,8 +357,7 @@ build_call_a (tree function, int n, tree *argarray
 
   gcc_assert (TYPE_PTR_P (TREE_TYPE (function)));
   fntype = TREE_TYPE (TREE_TYPE (function));
-  gcc_assert (TREE_CODE (fntype) == FUNCTION_TYPE
- || TREE_CODE (fntype) == METHOD_TYPE);
+  gcc_assert (FUNC_OR_METHOD_TYPE_P (fntype));
   result_type = TREE_TYPE (fntype);
   /* An rvalue has no cv-qualifiers.  */
   if (SCALAR_TYPE_P (result_type) || VOID_TYPE_P (result_type))
Index: cp-gimplify.c
===
--- cp-gimplify.c   (revision 266503)
+++ cp-gimplify.c   (working copy)
@@ -2306,8 +2306,7 @@ cp_fold (tree x)
 
   /* Cope with user tricks that amount to offsetof.  */
   if (op0 != error_mark_node
- && TREE_CODE (TREE_TYPE (op0)) != FUNCTION_TYPE
- && TREE_CODE (TREE_TYPE (op0)) != METHOD_TYPE)
+ && !FUNC_OR_METHOD_TYPE_P (TREE_TYPE (op0)))
{
  tree val = get_base_address (op0);
  if (val
Index: cp-objcp-common.c
===
--- cp-objcp-common.c   (revision 266503)
+++ cp-objcp-common.c   (working copy)
@@ -248,8 +248,7 @@ cp_type_dwarf_attribute (const_tree type, int attr
   switch (attr)
 {
 case DW_AT_reference:
-  if ((TREE_CODE (type) == FUNCTION_TYPE
-  || TREE_CODE (type) == METHOD_TYPE)
+  if (FUNC_OR_METHOD_TYPE_P (type)
  && FUNCTION_REF_QUALIFIED (type)
  && !FUNCTION_RVALUE_QUALIFIED (type))
return 1;
@@ -256,8 +255,7 @@ cp_type_dwarf_attribute (const_tree type, int attr
   break;
 
 case DW_AT_rvalue_reference:
-  if ((TREE_CODE (type) == FUNCTION_TYPE
-  || TREE_CODE (type) == METHOD_TYPE)
+  if (FUNC_OR_METHOD_TYPE_P (type)
  && FUNCTION_REF_QUALIFIED (type)
  && FUNCTION_RVALUE_QUALIFIED (type))
return 1;
Index: cp-tree.h
===
--- cp-tree.h   (revision 266503)
+++ cp-tree.h   (working copy)
@@ -4313,8 +4313,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
 #define TYPE_OBJ_P(NODE)   \
   (!TYPE_REF_P (NODE)  \
&& !VOID_TYPE_P (NODE)  \
-   && TREE_CODE (NODE) != FUNCTION_TYPE\
-   && TREE_CODE (NODE) != METHOD_TYPE)
+   && !FUNC_OR_METHOD_TYPE_P (NODE))
 
 /* Returns true if NODE is a pointer to an object.  Keep these checks
in ascending tree code order.  */
@@ -4330,8 +4329,7 @@ more_aggr_init_expr_args_p (const aggr_init_expr_a
void.  Keep these checks in ascending tree code order.  */
 #define TYPE_PTROBV_P(NODE)\
   (TYPE_PTR_P (NODE)   \
-   && !(TREE_CODE (TREE_TYPE (NODE)) == FUNCTION_TYPE  \
-   || TREE_CODE (TREE_TYPE (NODE)) == METHOD_TYPE))
+   && !FUNC_OR_METHOD_TYPE_P (TREE_TYPE (NODE)))
 
 /* Returns true if NODE is a pointer to function type.  */
 #define TYPE_PTRFN_P(NODE) \
Index: cvt.c
===
--- cvt.c   (revision 266503)
+++ cvt.c   (working copy)
@@ -1979,8 +1979,7 @@ perform_qualification_conversions (tree type, tree
 bool
 tx_safe_fn_type_p (tree t)
 {
-  if (TREE_CODE (t) != FUNCTION_TYPE
-  && TREE_CODE (t) != METHOD_TYPE)
+

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-30 Thread Jonathan Wakely


On 30/11/18 12:40 +0200, Ville Voutilainen wrote:

On Fri, 30 Nov 2018 at 12:25, Jonathan Wakely  wrote:

Yes, that's exactly what I had in mind (and what I expect to get
proposed for C++20 in the near future).

What does everyone else think, should we go ahead and do this?


Yes, if we are confident that's what will be in C++20.


We're not, but we don't need to be. Anything in the current draft
could get removed before the final standard.


The point is that C++20 changed forward_list (and list, for
consistency) to return the number of removed elements. The reason for
that is you can't easily find out the size before and after removing
elements, because forward_list doesn't have a size() member!
So IMHO the non-member erase should also return the size, and so that


Oops, I meant "should also return the number of removed elements", not
"the number of removed element and also the new size".



it's uniform it should do that for all containers not just the lists.


I don't mind forward_list being an exception. Returning the size from
other containers
is a bit pointless; presumably it would need to return a pair, and
handling that is clunky
even with structured bindings.


No need for a pair, sorry for the confusing wording above.

[PATCH] Add constructor parsing to the GIMPLE FE

2018-11-30 Thread Richard Biener



I failed to come up with ways using generic vector code to create
specific GIMPLE the vectorizer creates so the following fills
in the missing parsing of constructors in the GIMPLE FE.  Notably
aggregate assignments from empty constructors was also missing.
(the special case of clobbers still is)

In particular I was searching for the best way to represent

  movq mem, %xmm0

on GIMPLE.  Aka an 8 byte load into a vector register
implicitely zero-extending.  I settled with

 _2 = { v8qi, { 0, } };

but the explicit zero doesn't get optimized.  Expect a PR
for that now with a GIMPLE testcase ;)

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

Richard.

2018-11-30  Richard Biener  

c/
* gimple-parser.c (c_parser_gimple_postfix_expression): Parse
_Literal (type) { ... } as empty aggregate or vector constructor.

* gcc.dg/gimplefe-34.c: New testcase.
* gcc.dg/gimplefe-35.c: Likewise.

Index: gcc/c/gimple-parser.c
===
--- gcc/c/gimple-parser.c   (revision 266657)
+++ gcc/c/gimple-parser.c   (working copy)
@@ -806,6 +806,7 @@ c_parser_gimple_call_internal (c_parser
  identifier
  constant
  string-literal
+ constructor
  gimple-call-internal
 
 */
@@ -934,7 +935,7 @@ c_parser_gimple_postfix_expression (c_pa
}
  else if (strcmp (IDENTIFIER_POINTER (id), "_Literal") == 0)
{
- /* _Literal '(' type-name ')' [ '-' ] constant */
+ /* _Literal '(' type-name ')' ( [ '-' ] constant | constructor ) 
*/
  c_parser_consume_token (parser);
  tree type = NULL_TREE;
  if (c_parser_require (parser, CPP_OPEN_PAREN, "expected %<(%>"))
@@ -946,28 +947,90 @@ c_parser_gimple_postfix_expression (c_pa
  c_parser_skip_until_found (parser, CPP_CLOSE_PAREN,
 "expected %<)%>");
}
- bool neg_p;
- if ((neg_p = c_parser_next_token_is (parser, CPP_MINUS)))
-   c_parser_consume_token (parser);
- tree val = c_parser_gimple_postfix_expression (parser).value;
- if (! type
- || ! val
- || val == error_mark_node
- || ! CONSTANT_CLASS_P (val))
+ if (! type)
{
  c_parser_error (parser, "invalid _Literal");
  return expr;
}
- if (neg_p)
+ if (c_parser_next_token_is (parser, CPP_OPEN_BRACE))
{
- val = const_unop (NEGATE_EXPR, TREE_TYPE (val), val);
- if (! val)
+ c_parser_consume_token (parser);
+ if (!AGGREGATE_TYPE_P (type)
+ && !VECTOR_TYPE_P (type))
+   {
+ c_parser_error (parser, "invalid type for _Literal with "
+ "constructor");
+ c_parser_skip_until_found (parser, CPP_CLOSE_BRACE,
+"expected %<}%>");
+ return expr;
+   }
+ vec *v = NULL;
+ bool constant_p = true;
+ if (VECTOR_TYPE_P (type)
+ && !c_parser_next_token_is (parser, CPP_CLOSE_BRACE))
+   {
+ vec_alloc (v, TYPE_VECTOR_SUBPARTS (type).to_constant ());
+ do
+   {
+ tree val
+   = c_parser_gimple_postfix_expression (parser).value;
+ if (! val
+ || val == error_mark_node
+ || (! CONSTANT_CLASS_P (val)
+ && ! SSA_VAR_P (val)))
+   {
+ c_parser_error (parser, "invalid _Literal");
+ return expr;
+   }
+ CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, val);
+ if (! CONSTANT_CLASS_P (val))
+   constant_p = false;
+ if (c_parser_next_token_is (parser, CPP_COMMA))
+   c_parser_consume_token (parser);
+ else
+   break;
+   }
+ while (1);
+   }
+ if (c_parser_require (parser, CPP_CLOSE_BRACE,
+   "expected %<}%>"))
+   {
+ if (v && constant_p)
+   expr.value = build_vector_from_ctor (type, v);
+ else
+   expr.value = build_constructor (type, v);
+   }
+ else
+   {
+ c_parser_skip_until_found

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-30 Thread Ville Voutilainen

On Fri, 30 Nov 2018 at 12:25, Jonathan Wakely  wrote:
> Yes, that's exactly what I had in mind (and what I expect to get
> proposed for C++20 in the near future).
>
> What does everyone else think, should we go ahead and do this?

Yes, if we are confident that's what will be in C++20.

> The point is that C++20 changed forward_list (and list, for
> consistency) to return the number of removed elements. The reason for
> that is you can't easily find out the size before and after removing
> elements, because forward_list doesn't have a size() member!
> So IMHO the non-member erase should also return the size, and so that
> it's uniform it should do that for all containers not just the lists.

I don't mind forward_list being an exception. Returning the size from
other containers
is a bit pointless; presumably it would need to return a pair, and
handling that is clunky
even with structured bindings.

Re: [PATCH] Adjust offsets for present data clauses

2018-11-30 Thread Julian Brown

On Fri, 30 Nov 2018 17:55:17 +0800
Chung-Lin Tang  wrote:

> On 2018/7/21 6:07 AM, Cesar Philippidis wrote:
> > This is another old gomp4 patch that corrects a bug where the
> > runtime was passing the wrong offset for subarray data to the
> > accelerator. The original description of this patch can be found
> > here 
> > 
> > I bootstrapped and regtested on x86_64/nvptx. Is it OK for trunk?
> > 
> > Thanks,
> > Cesar
> >   
> 
> Hi Thomas, this patch should be within your maintainership area now.
> 
> I think this patch is pretty obvious; this is what the 'offset' field
> of struct target_var_desc is supposed to be used for, and is in line
> with other sites throughout target.c.
> 
> I do think it might be better to use a more succinct form like as
> attached, you may consider which form better suits your taste when
> you apply it.

This one will be superseded by the attach/detach patch, I think (where
the additional offset is added also, via calling "gomp_map_val".

HTH,

Julian

[PATCH] Fix PR88274

2018-11-30 Thread Richard Biener



Another inconsistency with -fstrict-enums handling of VRP.

Bootstrapped and tested on x86_64-unknown-linux-gnu, applied.

Richard.

2018-11-30  Richard Biener  

PR tree-optimization/88274
* tree-vrp.c (ranges_from_anti_range): Fix handling of
TYPE_MIN/MAX_VALUE.

Index: gcc/tree-vrp.c
===
--- gcc/tree-vrp.c  (revision 266657)
+++ gcc/tree-vrp.c  (working copy)
@@ -1249,14 +1249,14 @@ ranges_from_anti_range (const value_rang
   || !vrp_val_max (type))
 return false;
 
-  if (!vrp_val_is_min (ar->min ()))
-*vr0 = value_range (VR_RANGE,
-   vrp_val_min (type),
-   wide_int_to_tree (type, wi::to_wide (ar->min ()) - 1));
-  if (!vrp_val_is_max (ar->max ()))
-*vr1 = value_range (VR_RANGE,
-   wide_int_to_tree (type, wi::to_wide (ar->max ()) + 1),
-   vrp_val_max (type));
+  if (tree_int_cst_lt (vrp_val_min (type), ar->min ()))
+vr0->set (VR_RANGE,
+ vrp_val_min (type),
+ wide_int_to_tree (type, wi::to_wide (ar->min ()) - 1));
+  if (tree_int_cst_lt (ar->max (), vrp_val_max (type)))
+vr1->set (VR_RANGE,
+ wide_int_to_tree (type, wi::to_wide (ar->max ()) + 1),
+ vrp_val_max (type));
   if (vr0->undefined_p ())
 {
   *vr0 = *vr1;

Re: [PATCH, libstdc++, C++20] Implement P0457R2 String Prefix and Suffix Checking.

2018-11-30 Thread Jonathan Wakely


On 29/11/18 21:35 -0500, Ed Smith-Rowland wrote:

Greetings,

This patch implements starts_with and ends_with for basic_string and 
basic_string_view for C++20.


This was on my TODO list, thanks for taking care of it.


+#if __cplusplus > 201703L
+  bool
+  starts_with(basic_string_view<_CharT, _Traits> __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(_CharT __x) const noexcept
+  { return __sv_type(this->data(), this->size()).starts_with(__x); }
+
+  bool
+  starts_with(const _CharT* __x) const


This can be noexcept. It isn't in the standard, because it has a
narrow contract (there's a precondition that __x is null-terminated).
But since we can't reliably detect whether __x is null-terminated, and
even if we could we wouldn't want to throw an exception, we can just
make it noexcept. Same for the ends_with(const _CharT*) one, and the
equivalent members in the COW basic_string and basic_string_view.


Index: include/std/string_view
===
--- include/std/string_view (revision 266645)
+++ include/std/string_view (working copy)
@@ -227,7 +227,6 @@
__sv = __tmp;
  }

-
  // [string.view.ops], string operations:

  size_type
@@ -387,6 +386,36 @@
  traits_type::length(__str));
  }

+#if __cplusplus > 201703L
+  constexpr bool
+  starts_with(basic_string_view __x) const noexcept
+  { return this->size() >= __x.size()
+   && this->compare(0, __x.size(), __x) == 0; }


Please put the opening and closing braces on their own lines, as for
ends_with below:


+  constexpr bool
+  ends_with(basic_string_view __x) const noexcept
+  {
+   return this->size() >= __x.size()
+   && this->compare(this->size() - __x.size(), npos, __x) == 0;
+  }


OK with those changes, thanks.

Re: [PATCH] Restrict vector extension to use 128bit vectors (PR testsuite/88265).

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 11:16:21AM +0100, Martin Liška wrote:
> This is about restriction of vector width on x86_64 target in order
> to match expected _ZGVbN4v_* glibc vector functions.

If you want to test just that, then just add -mno-avx instead of
-mavx2 -mno-avx512f -mprefer-vector-width=128.

Ok with that change.

> 2018-11-30  Martin Liska  
> 
>   PR testsuite/88265
>   * gfortran.dg/simd-builtins-1.f90: Restrict to 128b vectors.
>   * gfortran.dg/simd-builtins-6.f90: Likewise.

Jakub

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Nick Clifton

Hi Jakub,

>>   Also - Tom and Pedro have raised the issue that the patch introduces
>>   a new static variable to the library that is not thread safe.  I am
>>   not sure of the best way to address this problem.  Possibly the
>>   variable could be made thread local ?  Are there any other static
> 
> Please don't.

OK. :-)

> Most of the demangler functions pass around a pointer to a struct with
> context, can't this be added in there?

Not without modifying the current demangling interface.  The problem is 
that the context structure is created for each invocation of a demangling 
function (from outside the library), and no state is preserved across 
demangling calls.  Thus in order to have a recursion limit which is 
configurable by the caller, you either need to have a global variable or 
else extend the demangling interface to include a recursion limit parameter.

I did consider just having a fixed limit, that the user cannot change, but
I thought that this might be rejected by reviewers.  (On the grounds that
different limits are appropriate to different execution environments).
Note - enabling or disabling the recursion limit is controlled by a separate
feature of the proposed patch, ie the new DMGL_RECURSE_LIMIT flag in the 
options field of the cplus_demangleXXX() functions.  But there is not enough
room in the options field to also include a recursion limit value.

Cheers
  Nick

Re: RFC (branch prediction): PATCH to implement P0479R5, [[likely]] and [[unlikely]].

2018-11-30 Thread Martin Liška

Hi Jason.

Just small nits I noticed for:

cat test4.C
int a, b, c;

void
__attribute__((noinline))
bar()
{
  if (a == 123)
[[likely]] c = 5;
  else
[[likely]] b = 77;
}

int main()
{
  bar ();
  return 0;
}

$ g++ test4.C -c
test4.C: In function ‘void bar()’:
test4.C:8:16: warning: both branches of ‘if’ statement marked as ‘hot label’ 
[-Wattributes]
8 | [[likely]] c = 5;
  |^
9 |   else
   10 | [[likely]] b = 77;
  |~

1) I would expect 'likely' instead of 'hot label'
2) maybe we can put the carousel to the attribute instead of the first 
statement in the block?

Thanks,
Martin

Re: [PATCH, libstdc++] Uniform container erasure for c++20.

2018-11-30 Thread Jonathan Wakely


On 29/11/18 12:05 -0500, Ed Smith-Rowland wrote:

On 11/26/18 6:18 AM, Jonathan Wakely wrote:

On 24/11/18 13:54 -0500, Ed Smith-Rowland wrote:

All,

I's very late but uniform container erasure is, I think, the last 
little tidbit to graduate from fundamentals/v2 to std at the last 
meeting.  I think it would be a shame not to nudge this into 
gcc-9.  The routines are very short so I just copied them. Ditto 
the testcases (with adjustments).  The node erasure tool was moved 
from experimental to std/bits and adjustments made to affected 
*set/*map headers.


The experimental code has been in our tree since 2015.

This builds and tests clean on x86_64-linux.


OK for trunk as it only touches experimental C++2a and TS material.
Thanks.

I pointed out to the committee that the erase_if functions added to
C++20 completely overlook P0646R1 "Improving the Return Value of
Erase-Like Algorithms" and so fail to preserve the useful information
returned by the members of the linked list containers.
Is *this* what you has in mind for this?  I basically return the 
number of erased items for all the things.


Yes, that's exactly what I had in mind (and what I expect to get
proposed for C++20 in the near future).

What does everyone else think, should we go ahead and do this?

The point is that C++20 changed forward_list (and list, for
consistency) to return the number of removed elements. The reason for
that is you can't easily find out the size before and after removing
elements, because forward_list doesn't have a size() member!

So IMHO the non-member erase should also return the size, and so that
it's uniform it should do that for all containers not just the lists.



Index: include/std/forward_list
===
--- include/std/forward_list(revision 266623)
+++ include/std/forward_list(working copy)
@@ -66,16 +66,17 @@
{
_GLIBCXX_BEGIN_NAMESPACE_VERSION
  template
-inline void
+inline typename forward_list<_Tp, _Alloc>::size_type
erase_if(forward_list<_Tp, _Alloc>& __cont, _Predicate __pred)
-{ __cont.remove_if(__pred); }
+{ return __cont.remove_if(__pred); }

  template
-inline void
+inline typename forward_list<_Tp, _Alloc>::size_type
erase(forward_list<_Tp, _Alloc>& __cont, const _Up& __value)
{
  using __elem_type = typename forward_list<_Tp, _Alloc>::value_type;
-  erase_if(__cont, [&](__elem_type& __elem) { return __elem == __value; });
+  return erase_if(__cont,
+ [&](__elem_type& __elem) { return __elem == __value; });
}


I realised this can be a polymorphic lambda (here and in the TS
version) because it's only defined for C++14 and later. Also, the
erase overload that calls erase_if needs to qualify it as std::erase_if
to prevent ADL:

 return std::erase_if(__cont,
  [&](auto& __elem) { return __elem == __value; });

Same comment for the erase(std::list) case.

[PATCH] Restrict vector extension to use 128bit vectors (PR testsuite/88265).

2018-11-30 Thread Martin Liška

Hi.

This is about restriction of vector width on x86_64 target in order
to match expected _ZGVbN4v_* glibc vector functions.

Ready for trunk?

Martin

gcc/testsuite/ChangeLog:

2018-11-30  Martin Liska  

PR testsuite/88265
* gfortran.dg/simd-builtins-1.f90: Restrict to 128b vectors.
* gfortran.dg/simd-builtins-6.f90: Likewise.
---
 gcc/testsuite/gfortran.dg/simd-builtins-1.f90 | 2 +-
 gcc/testsuite/gfortran.dg/simd-builtins-6.f90 | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)


diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-1.f90 b/gcc/testsuite/gfortran.dg/simd-builtins-1.f90
index e5ee380943f..25ea3e059e3 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-1.f90
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-1.f90
@@ -1,5 +1,5 @@
 ! { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } }
-! { dg-additional-options "-msse2 -nostdinc -Ofast -fpre-include=simd-builtins-1.h -fdump-tree-optimized" }
+! { dg-additional-options "-msse2 -mavx2 -mno-avx512f -mprefer-vector-width=128 -nostdinc -Ofast -fpre-include=simd-builtins-1.h -fdump-tree-optimized" }
 
 program test_overloaded_intrinsic
   real(4) :: x4(3200), y4(3200)
diff --git a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90 b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
index 5ff99212cf1..1a3e3be3f0a 100644
--- a/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
+++ b/gcc/testsuite/gfortran.dg/simd-builtins-6.f90
@@ -1,5 +1,5 @@
 ! { dg-do compile { target { i?86-*-linux* x86_64-*-linux* } } }
-! { dg-additional-options "-msse2 -nostdinc -Ofast -fdump-tree-optimized" }
+! { dg-additional-options "-msse2 -mavx2 -mno-avx512f -mprefer-vector-width=128 -nostdinc -Ofast -fdump-tree-optimized" }
 
 !GCC$ builtin (sin) attributes simd (inbranch)
 !GCC$ builtin (sinf) attributes simd (notinbranch)

Re: [PATCH] Adjust offsets for present data clauses

2018-11-30 Thread Chung-Lin Tang


On 2018/7/21 6:07 AM, Cesar Philippidis wrote:

This is another old gomp4 patch that corrects a bug where the runtime
was passing the wrong offset for subarray data to the accelerator. The
original description of this patch can be found here


I bootstrapped and regtested on x86_64/nvptx. Is it OK for trunk?

Thanks,
Cesar



Hi Thomas, this patch should be within your maintainership area now.

I think this patch is pretty obvious; this is what the 'offset' field
of struct target_var_desc is supposed to be used for, and is in line
with other sites throughout target.c.

I do think it might be better to use a more succinct form like as attached,
you may consider which form better suits your taste when you apply it.

Chung-Lin
Index: libgomp/oacc-parallel.c
===
--- libgomp/oacc-parallel.c (revision 266611)
+++ libgomp/oacc-parallel.c (working copy)
@@ -231,8 +231,11 @@ GOACC_parallel_keyed (int device, void (*fn) (void
 
   devaddrs = gomp_alloca (sizeof (void *) * mapnum);
   for (i = 0; i < mapnum; i++)
-devaddrs[i] = (void *) (tgt->list[i].key->tgt->tgt_start
-   + tgt->list[i].key->tgt_offset);
+devaddrs[i] = (tgt->list[i].key
+  ? (void *) (tgt->list[i].key->tgt->tgt_start
+  + tgt->list[i].key->tgt_offset
+  + tgt->list[i].offset)
+  : NULL);
 
   acc_dev->openacc.exec_func (tgt_fn, mapnum, hostaddrs, devaddrs,
  async, dims, tgt);

[PATCH] Comparison with an enum should mention enum value.

2018-11-30 Thread Martin Liška

Hi.

This is one more clean up of the transition that I've just tested
on x86_64-linux-gnu.

I'm planning to install it once I'll come back from vacation (27/12).

Thanks,
Martin

gcc/ChangeLog:

2018-11-29  Martin Liska  

* builtins.c (expand_movstr): Compare with RETURN_BEGIN.
* expr.c (move_by_pieces): Likewise.
(store_by_pieces): Likewise.
(store_expr): Fix GNU coding style.
---
 gcc/builtins.c | 3 ++-
 gcc/expr.c | 8 
 2 files changed, 6 insertions(+), 5 deletions(-)


diff --git a/gcc/builtins.c b/gcc/builtins.c
index dcac49d8be1..5728ee0e86e 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -3937,7 +3937,8 @@ expand_movstr (tree dest, tree src, rtx target, memop_ret retmode)
   dest_mem = replace_equiv_address (dest_mem, target);
 }
 
-  create_output_operand (&ops[0], retmode ? target : NULL_RTX, Pmode);
+  create_output_operand (&ops[0],
+			 retmode != RETURN_BEGIN ? target : NULL_RTX, Pmode);
   create_fixed_operand (&ops[1], dest_mem);
   create_fixed_operand (&ops[2], src_mem);
   if (!maybe_expand_insn (targetm.code_for_movstr, 3, ops))
diff --git a/gcc/expr.c b/gcc/expr.c
index 85b7847431b..43b129ce5b1 100644
--- a/gcc/expr.c
+++ b/gcc/expr.c
@@ -1220,7 +1220,7 @@ move_by_pieces (rtx to, rtx from, unsigned HOST_WIDE_INT len,
 
   data.run ();
 
-  if (retmode)
+  if (retmode != RETURN_BEGIN)
 return data.finish_retmode (retmode);
   else
 return to;
@@ -1388,7 +1388,7 @@ store_by_pieces (rtx to, unsigned HOST_WIDE_INT len,
   store_by_pieces_d data (to, constfun, constfundata, len, align);
   data.run ();
 
-  if (retmode)
+  if (retmode != RETURN_BEGIN)
 return data.finish_retmode (retmode);
   else
 return to;
@@ -5609,13 +5609,13 @@ store_expr (tree exp, rtx target, int call_param_p,
 
   dest_mem = target;
 
+  memop_ret retmode = exp_len > str_copy_len ? RETURN_END : RETURN_BEGIN;
   dest_mem = store_by_pieces (dest_mem,
   str_copy_len, builtin_strncpy_read_str,
   CONST_CAST (char *,
 	  TREE_STRING_POINTER (str)),
   MEM_ALIGN (target), false,
-  (exp_len > str_copy_len ? RETURN_END :
-   RETURN_BEGIN));
+  retmode);
   if (exp_len > str_copy_len)
 	clear_storage (adjust_address (dest_mem, BLKmode, 0),
 		   GEN_INT (exp_len - str_copy_len),

[PATCH] Improve predictions for hot and cold labels ([[likely]], [[unlikely]]).

2018-11-30 Thread Martin Liška

Hi.

This patch is a reaction to Jason's commit where he introduced new C++ 
attributes.
First I would like to align cold/hot to __builtin_expect, so I adjusted 
probability
and made the predictors first match predictors.

Second I fixed how we consider the predictors in switch statements, so that
we can correctly predict situation in predict-3.

Honza is fine with the patch, I'll install it later if there are no objections.
Survives tests and bootstrap on xc86_64-linux-gnu.

Martin

gcc/ChangeLog:

2018-11-29  Martin Liska  

* predict.c (set_even_probabilities): Include also
unlikely_count in calculation.
(combine_predictions_for_bb): Consider also HOT and
COLD labels predictions.
* predict.def (PRED_HOT_LABEL): Move it just after
__builtin_expect_with_probability predictor.
(PRED_COLD_LABEL): Likewise.

gcc/testsuite/ChangeLog:

2018-11-29  Martin Liska  

* g++.dg/predict-2.C: New test.
* g++.dg/predict-3.C: New test.
* g++.dg/predict-4.C: New test.
* gcc.dg/tree-ssa/attr-hotcold-2.c: Adjust test-case.
---
 gcc/predict.c  | 15 ---
 gcc/predict.def| 15 ---
 gcc/testsuite/g++.dg/predict-2.C   | 16 
 gcc/testsuite/g++.dg/predict-3.C   | 17 +
 gcc/testsuite/g++.dg/predict-4.C   | 17 +
 gcc/testsuite/gcc.dg/tree-ssa/attr-hotcold-2.c |  4 ++--
 6 files changed, 72 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/predict-2.C
 create mode 100644 gcc/testsuite/g++.dg/predict-3.C
 create mode 100644 gcc/testsuite/g++.dg/predict-4.C


diff --git a/gcc/predict.c b/gcc/predict.c
index 5ad252c2d39..81dbb67da4e 100644
--- a/gcc/predict.c
+++ b/gcc/predict.c
@@ -878,11 +878,18 @@ set_even_probabilities (basic_block bb,
 	profile_probability prob
 	  = profile_probability::from_reg_br_prob_base (p);
 	profile_probability remainder = prob.invert ();
+	remainder -= profile_probability::very_unlikely ()
+	  .apply_scale (unlikely_count, 1);
+	int count = nedges - unlikely_count - 1;
+	gcc_assert (count >= 0);
+	profile_probability even = remainder.apply_scale (1, count);
 
 	if (prediction->ep_edge == e)
 	  e->probability = prob;
+	else if (unlikely_edges != NULL && unlikely_edges->contains (e))
+	  e->probability = profile_probability::very_unlikely ();
 	else
-	  e->probability = remainder.apply_scale (1, nedges - 1);
+	  e->probability = even;
 	  }
 	else
 	  e->probability = profile_probability::never ();
@@ -1217,10 +1224,12 @@ combine_predictions_for_bb (basic_block bb, bool dry_run)
   if (preds)
 	for (pred = *preds; pred; pred = pred->ep_next)
 	  {
-	if (pred->ep_probability <= PROB_VERY_UNLIKELY)
+	if (pred->ep_probability <= PROB_VERY_UNLIKELY
+		|| pred->ep_predictor == PRED_COLD_LABEL)
 	  unlikely_edges.add (pred->ep_edge);
 	if (pred->ep_probability >= PROB_VERY_LIKELY
-		|| pred->ep_predictor == PRED_BUILTIN_EXPECT)
+		|| pred->ep_predictor == PRED_BUILTIN_EXPECT
+		|| pred->ep_predictor == PRED_HOT_LABEL)
 	  likely_edges.add (pred);
 	  }
 
diff --git a/gcc/predict.def b/gcc/predict.def
index e2a56f95955..27d0e4dc6f9 100644
--- a/gcc/predict.def
+++ b/gcc/predict.def
@@ -78,6 +78,14 @@ DEF_PREDICTOR (PRED_BUILTIN_EXPECT_WITH_PROBABILITY,
 	   "__builtin_expect_with_probability", PROB_UNINITIALIZED,
 	   PRED_FLAG_FIRST_MATCH)
 
+/* Branches to hot labels are likely.  */
+DEF_PREDICTOR (PRED_HOT_LABEL, "hot label", HITRATE (90),
+	   PRED_FLAG_FIRST_MATCH)
+
+/* Branches to cold labels are extremely unlikely.  */
+DEF_PREDICTOR (PRED_COLD_LABEL, "cold label", HITRATE (90),
+	   PRED_FLAG_FIRST_MATCH)
+
 /* Use number of loop iterations guessed by the contents of the loop.  */
 DEF_PREDICTOR (PRED_LOOP_ITERATIONS_GUESSED, "guessed loop iterations",
 	   PROB_UNINITIALIZED, PRED_FLAG_FIRST_MATCH)
@@ -171,13 +179,6 @@ DEF_PREDICTOR (PRED_LOOP_GUARD, "loop guard", HITRATE (73), 0)
 DEF_PREDICTOR (PRED_LOOP_GUARD_WITH_RECURSION, "loop guard with recursion",
 	   HITRATE (85), 0)
 
-/* Branches to hot labels are likely.  */
-DEF_PREDICTOR (PRED_HOT_LABEL, "hot label", HITRATE (85), 0)
-
-/* Branches to cold labels are extremely unlikely.  */
-DEF_PREDICTOR (PRED_COLD_LABEL, "cold label", PROB_VERY_LIKELY,
-	   PRED_FLAG_FIRST_MATCH)
-
 /* The following predictors are used in Fortran. */
 
 /* Branch leading to an integer overflow are extremely unlikely.  */
diff --git a/gcc/testsuite/g++.dg/predict-2.C b/gcc/testsuite/g++.dg/predict-2.C
new file mode 100644
index 000..1e0ac1d2c48
--- /dev/null
+++ b/gcc/testsuite/g++.dg/predict-2.C
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-profile_estimate -std=c++11" } */
+
+int a, b, c;
+
+void
+bar()
+{
+  if (a == 123)
+[[likely]] c = 5;

Re: [PATCH 4/4][libbacktrace] Add tests for unused formats

2018-11-30 Thread Tom de Vries

On 29-11-18 19:28, Ian Lance Taylor wrote:
> On Fri, Nov 23, 2018 at 12:56 PM, Tom de Vries  wrote:
>>
>> When building libbacktrace, we typically use elf.c, and don't build pecoff.c,
>> xcoff.c or unknown.c
>>
>> Add testcases that use unused format to ensure that we also build and
>> test those on a typical development setup.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> OK for trunk?
>>
>> Thanks,
>> - Tom
>>
>> [libbacktrace] Add tests for unused formats
>>
>> 2018-11-23  Tom de Vries  
>>
>> * configure.ac (NOT_HAVE_FORMAT_ELF, NOT_HAVE_FORMAT_PECOFF)
>> (NOT_HAVE_FORMAT_UNKNOWN, NOT_HAVE_FORMAT_XCOFF_32)
>> (NOT_HAVE_FORMAT_XCOFF_64): New AM_CONDITIONAL.
>> * configure: Regenerate.
>> * Makefile.am (check_PROGRAMS): Add test_elf, test_xcoff_32,
>> test_xcoff_64, test_pecoff and test_unknown.
>> * Makefile.in: Regenerate.
>> * test_format.c: New file.
> 
> 
> Again it seems feasible to avoid GNU make features, and skip the
> negative conditionals and just build the tests for all formats
> including the one on the current system.  It's not worth adding the
> complexity to avoid building the test.  Thanks.

Done.

OK for trunk?

Thanks,
- Tom
[libbacktrace] Add tests for unused formats

When building libbacktrace, we typically use elf.c, and don't build pecoff.c,
xcoff.c or unknown.c

Add testcases that use unused format to ensure that we also build and
test those on a typical development setup.

Bootstrapped and reg-tested on x86_64.

2018-11-23  Tom de Vries  

	* Makefile.am (check_PROGRAMS): Add test_elf, test_xcoff_32,
	test_xcoff_64, test_pecoff and test_unknown.
	* Makefile.in: Regenerate.
	* test_format.c: New file.

---
 libbacktrace/Makefile.am   |  39 +
 libbacktrace/Makefile.in   | 139 +
 libbacktrace/test_format.c |  55 ++
 3 files changed, 221 insertions(+), 12 deletions(-)

diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index e7e9c7b6697..1a3680bc98c 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -96,6 +96,45 @@ libbacktrace_alloc_la_LIBADD = $(BACKTRACE_FILE) $(FORMAT_FILE) read.lo alloc.lo
 
 libbacktrace_alloc_la_DEPENDENCIES = $(libbacktrace_alloc_la_LIBADD)
 
+check_LTLIBRARIES += libbacktrace_noformat.la
+
+libbacktrace_noformat_la_SOURCES = $(libbacktrace_la_SOURCES)
+libbacktrace_noformat_la_LIBADD = $(BACKTRACE_FILE) $(VIEW_FILE) $(ALLOC_FILE)
+
+libbacktrace_noformat_la_DEPENDENCIES = $(libbacktrace_noformat_la_LIBADD)
+
+xcoff_%.c: xcoff.c
+	SEARCH='#error "Unknown BACKTRACE_XCOFF_SIZE"'; \
+	REPLACE='#undef BACKTRACE_XCOFF_SIZE\n#define BACKTRACE_XCOFF_SIZE'; \
+	$(SED) "s/^$$SEARCH\$$/$$REPLACE $*/" \
+		$(srcdir)/xcoff.c \
+		> $@
+
+test_elf_SOURCES = test_format.c testlib.c
+test_elf_LDADD = libbacktrace_noformat.la elf.lo
+
+check_PROGRAMS += test_elf
+
+test_xcoff_32_SOURCES = test_format.c testlib.c
+test_xcoff_32_LDADD = libbacktrace_noformat.la xcoff_32.lo
+
+check_PROGRAMS += test_xcoff_32
+
+test_xcoff_64_SOURCES = test_format.c testlib.c
+test_xcoff_64_LDADD = libbacktrace_noformat.la xcoff_64.lo
+
+check_PROGRAMS += test_xcoff_64
+
+test_pecoff_SOURCES = test_format.c testlib.c
+test_pecoff_LDADD = libbacktrace_noformat.la pecoff.lo
+
+check_PROGRAMS += test_pecoff
+
+test_unknown_SOURCES = test_format.c testlib.c
+test_unknown_LDADD = libbacktrace_noformat.la unknown.lo
+
+check_PROGRAMS += test_unknown
+
 unittest_SOURCES = unittest.c testlib.c
 unittest_LDADD = libbacktrace.la
 
diff --git a/libbacktrace/Makefile.in b/libbacktrace/Makefile.in
index 119d3d2bbc1..6eaa1e28c01 100644
--- a/libbacktrace/Makefile.in
+++ b/libbacktrace/Makefile.in
@@ -121,9 +121,10 @@ build_triplet = @build@
 host_triplet = @host@
 target_triplet = @target@
 check_PROGRAMS = $(am__EXEEXT_1) $(am__EXEEXT_2) $(am__EXEEXT_3)
-@NATIVE_TRUE@am__append_1 = unittest unittest_alloc btest btest_alloc \
-@NATIVE_TRUE@	stest stest_alloc ztest ztest_alloc edtest \
-@NATIVE_TRUE@	edtest_alloc
+@NATIVE_TRUE@am__append_1 = test_elf test_xcoff_32 test_xcoff_64 \
+@NATIVE_TRUE@	test_pecoff test_unknown unittest unittest_alloc \
+@NATIVE_TRUE@	btest btest_alloc stest stest_alloc ztest \
+@NATIVE_TRUE@	ztest_alloc edtest edtest_alloc
 @HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_2 = -lz
 @HAVE_ZLIB_TRUE@@NATIVE_TRUE@am__append_3 = -lz
 @HAVE_PTHREAD_TRUE@@NATIVE_TRUE@am__append_4 = ttest ttest_alloc
@@ -168,11 +169,18 @@ am__objects_1 = atomic.lo dwarf.lo fileline.lo posix.lo print.lo \
 @NATIVE_TRUE@am_libbacktrace_alloc_la_OBJECTS = $(am__objects_1)
 libbacktrace_alloc_la_OBJECTS = $(am_libbacktrace_alloc_la_OBJECTS)
 @NATIVE_TRUE@am_libbacktrace_alloc_la_rpath =
-@NATIVE_TRUE@am__EXEEXT_1 = unittest$(EXEEXT) unittest_alloc$(EXEEXT) \
-@NATIVE_TRUE@	btest$(EXEEXT) btest_alloc$(EXEEXT) \
-@NATIVE_TRUE@	stest$(EXEEXT) stest_alloc$(EXEEXT) \
-@NATIVE_TRUE@	ztest$(EXEEXT) ztest_alloc$(EXEEXT) \
-@NATIVE_TRUE@	edtest$(EXEEXT) edte

Re: [PATCH 1/4][libbacktrace] Test check_PROGRAMS without mmap

2018-11-30 Thread Tom de Vries

On 29-11-18 19:26, Ian Lance Taylor wrote:
> On Fri, Nov 23, 2018 at 12:47 PM, Tom de Vries  wrote:
>> [ was: [PATCH 1/2][libbacktrace] Handle realloc returning NULL if size == 0 ]
>>
>> On Thu, Nov 22, 2018 at 01:35:43PM +0100, Tom de Vries wrote:
>>> Hi,
>>>
>>> Build and tested on x86_64, with mmap.c replaced by alloc.c to ensure that
>>> backtrace_vector_release in alloc.c is tested.
>>
>> Hi,
>>
>> I came up with a more structural solution.
>>
>> OK for trunk?
>>
>> Thanks,
>> - Tom
>>
>> [libbacktrace] Test check_PROGRAMS without mmap
>>
>> When building libbacktrace, we typically use mmapio.c and mmap.c, and don't
>> build read.c and alloc.c.
>>
>> Add testcases that use read.c and alloc.c to ensure that we also build and
>> test those on a typical development setup.
>>
>> Bootstrapped and reg-tested on x86_64.
>>
>> 2018-11-23  Tom de Vries  
>>
>> * configure.ac (HAVE_MMAP): New AM_CONDITIONAL.
>> * configure: Regenerate.
>> * Makefile.am : Add _with_alloc version for each test in
>> check_PROGRAMS.
>> * Makefile.in: Regenerate.
>>
>> ---
>>  libbacktrace/Makefile.am  |  45 
>>  libbacktrace/Makefile.in  | 284 
>> +++---
>>  libbacktrace/configure|  18 ++-
>>  libbacktrace/configure.ac |   1 +
>>  4 files changed, 331 insertions(+), 17 deletions(-)
>>
>> diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
>> index 3c1bd49dd7b..ee47daee415 100644
>> --- a/libbacktrace/Makefile.am
>> +++ b/libbacktrace/Makefile.am
>> @@ -95,11 +95,25 @@ btest_CFLAGS = $(AM_CFLAGS) -g -O
>>  btest_LDADD = libbacktrace.la
>>
>>  check_PROGRAMS += btest
>> +if HAVE_MMAP
>> +libbacktrace_with_alloc = $(libbacktrace_la_OBJECTS) \
>> +   $(filter-out mmap.lo mmapio.lo,$(libbacktrace_la_LIBADD)) \
>> +   alloc.lo read.lo
> 
> 
> As far as I know libbacktrace does not currently rely on using GNU
> make.  I'd rather not add that dependency for this purpose (I don't
> mind adding this kind of testing but to me this seems only of mild
> interest--I expect that all significant libbacktrace users have mmap).
> You should be able to write something like
> 
> libbacktrace_alloc_la_SOURCES = $(libbacktrace_SOURCES)
> libbacktrace_alloc_la_LIBADD = $(BACKTRACE_FILE) $(FORMAT_FILE) read.c alloc.c
> 

Done.

> Then I wouldn't bother with only running the tests with HAVE_MMAP,
> just add unconditional tests for btest_alloc, etc.  It's OK to run
> duplicate tests for the incredibly rare case of a host that doesn't
> support mmap.

Done.

OK for trunk?

Thanks,
- Tom
[libbacktrace] Test check_PROGRAMS without mmap

When building libbacktrace, we typically use mmapio.c and mmap.c, and don't
build read.c and alloc.c.

Add testcases that use read.c and alloc.c to ensure that we also build and
test those on a typical development setup.

Bootstrapped and reg-tested on x86_64.

2018-11-23  Tom de Vries  

	* Makefile.am : Add _with_alloc version for each test in
	check_PROGRAMS.
	* Makefile.in: Regenerate.

---
 libbacktrace/Makefile.am |  54 
 libbacktrace/Makefile.in | 319 +++
 2 files changed, 349 insertions(+), 24 deletions(-)

diff --git a/libbacktrace/Makefile.am b/libbacktrace/Makefile.am
index 13e94f27aef..e7e9c7b6697 100644
--- a/libbacktrace/Makefile.am
+++ b/libbacktrace/Makefile.am
@@ -89,38 +89,74 @@ check_PROGRAMS =
 TESTS = $(check_PROGRAMS)
 
 if NATIVE
+check_LTLIBRARIES = libbacktrace_alloc.la
+
+libbacktrace_alloc_la_SOURCES = $(libbacktrace_la_SOURCES)
+libbacktrace_alloc_la_LIBADD = $(BACKTRACE_FILE) $(FORMAT_FILE) read.lo alloc.lo
+
+libbacktrace_alloc_la_DEPENDENCIES = $(libbacktrace_alloc_la_LIBADD)
+
 unittest_SOURCES = unittest.c testlib.c
 unittest_LDADD = libbacktrace.la
 
 check_PROGRAMS += unittest
 
+unittest_alloc_SOURCES = $(unittest_SOURCES)
+unittest_alloc_LDADD = libbacktrace_alloc.la
+
+check_PROGRAMS += unittest_alloc
+
 btest_SOURCES = btest.c testlib.c
 btest_CFLAGS = $(AM_CFLAGS) -g -O
 btest_LDADD = libbacktrace.la
 
 check_PROGRAMS += btest
 
+btest_alloc_SOURCES = $(btest_SOURCES)
+btest_alloc_CFLAGS = $(btest_CFLAGS)
+btest_alloc_LDADD = libbacktrace_alloc.la
+
+check_PROGRAMS += btest_alloc
+
 stest_SOURCES = stest.c
 stest_LDADD = libbacktrace.la
 
 check_PROGRAMS += stest
 
+stest_alloc_SOURCES = $(stest_SOURCES)
+stest_alloc_LDADD = libbacktrace_alloc.la
+
+check_PROGRAMS += stest_alloc
+
 ztest_SOURCES = ztest.c testlib.c
 ztest_CFLAGS = -DSRCDIR=\"$(srcdir)\"
 ztest_LDADD = libbacktrace.la
+ztest_alloc_LDADD = libbacktrace_alloc.la
 
 if HAVE_ZLIB
 ztest_LDADD += -lz
+ztest_alloc_LDADD += -lz
 endif
 ztest_LDADD += $(CLOCK_GETTIME_LINK)
+ztest_alloc_LDADD += $(CLOCK_GETTIME_LINK)
 
 check_PROGRAMS += ztest
 
+ztest_alloc_SOURCES = $(ztest_SOURCES)
+ztest_alloc_CFLAGS = $(ztest_CFLAGS)
+
+check_PROGRAMS += ztest_alloc
+
 edtest_SOURCES = edtest.c edtest2_build.c testlib.c
 edtest_LDADD = libbacktrace.la
 
 check_PROGRAMS += edtest

Re: [PATCH 1/5][libbacktrace] Factor out backtrace_vector_free

2018-11-30 Thread Tom de Vries

On 29-11-18 19:17, Ian Lance Taylor wrote:
> On Thu, Nov 29, 2018 at 4:33 AM, Tom de Vries  wrote:
>> On 29-11-18 00:26, Ian Lance Taylor wrote:
>>> On Wed, Nov 28, 2018 at 3:15 PM, Tom de Vries  wrote:

 this patch factors out new function backtrace_vector_free.

 Bootstrapped and reg-tested on x86_64.

 OK for trunk?
>>>
>>> We should only add new files if we really absolutely must, as this
>>> package is copied around to a lot of places (e.g.,
>>> libsanitizer/libbacktrace) and adding files here requires
>>> modifications in all those places.
>>>
>>
>> I see, thanks for the explanation.
>>
>> How about his patch? It does not add a file, though it does add an
>> external function which requires a rename in libsanitizer/libbacktrace
>> (I don't know whether that requires changes in any other places).
>>
>> [ Also, it inlines backtrace-vector.c into alloc.c and mmap.c, so it
>> duplicates code. If that is not acceptable, I could move it to
>> internal.h as static inline or static ATTRIBUTE_UNUSED. ]
> 
> Yes, let's just use a static inline function or a macro.  Thanks.

Committed as static inline.

Thanks,
- Tom
[libbacktrace] Factor out backtrace_vector_free

Factor out new function backtrace_vector_free.

Bootstrapped and reg-tested on x86_64.

2018-11-28  Tom de Vries  

	* internal.h (backtrace_vector_free): New static inline fuction,
	factored out of ...
	* dwarf.c (read_line_info): ... here.

---
 libbacktrace/dwarf.c|  4 +---
 libbacktrace/internal.h | 12 
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/libbacktrace/dwarf.c b/libbacktrace/dwarf.c
index 34543747c8f..48ef3638a5f 100644
--- a/libbacktrace/dwarf.c
+++ b/libbacktrace/dwarf.c
@@ -2057,9 +2057,7 @@ read_line_info (struct backtrace_state *state, struct dwarf_data *ddata,
   return 1;
 
  fail:
-  vec.vec.alc += vec.vec.size;
-  vec.vec.size = 0;
-  backtrace_vector_release (state, &vec.vec, error_callback, data);
+  backtrace_vector_free (state, &vec.vec, error_callback, data);
   free_line_header (state, hdr, error_callback, data);
   *lines = (struct line *) (uintptr_t) -1;
   *lines_count = 0;
diff --git a/libbacktrace/internal.h b/libbacktrace/internal.h
index bff8ed470e4..548f9d70905 100644
--- a/libbacktrace/internal.h
+++ b/libbacktrace/internal.h
@@ -257,6 +257,18 @@ extern int backtrace_vector_release (struct backtrace_state *state,
  backtrace_error_callback error_callback,
  void *data);
 
+/* Free the space managed by VEC.  This will reset VEC.  */
+
+static inline void
+backtrace_vector_free (struct backtrace_state *state,
+		   struct backtrace_vector *vec,
+		   backtrace_error_callback error_callback, void *data)
+{
+  vec->alc += vec->size;
+  vec->size = 0;
+  backtrace_vector_release (state, vec, error_callback, data);
+}
+
 /* Read initial debug data from a descriptor, and set the
fileline_data, syminfo_fn, and syminfo_data fields of STATE.
Return the fileln_fn field in *FILELN_FN--this is done this way so

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 08:38:48AM +, Nick Clifton wrote:
>   Also - Tom and Pedro have raised the issue that the patch introduces
>   a new static variable to the library that is not thread safe.  I am
>   not sure of the best way to address this problem.  Possibly the
>   variable could be made thread local ?  Are there any other static

Please don't.  That has a cost for all the programs that link against
libstdc++ or any other library that includes the demangler, even when they
don't use the demangler at all (99.9% of the users).
Most of the demangler functions pass around a pointer to a struct with
context, can't this be added in there?

Jakub

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Nick Clifton

Hi Scott,

> Thank you for looking into this Nick. I've been staring at a few of these 
> CVEs off-and-on for a few days, and the following CVEs all look like 
> duplicates:
> 
> CVE-2018-17985: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87335
> CVE-2018-18484: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87636
> CVE-2018-18701: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87675
> CVE-2018-18700: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87681

Yes, essentially they are.  They actually trigger stack exhaustion at
different places inside libiberty, but the root cause is the same.
I am also happy to say that my proposed patch fixes *all* of these PRs.

> Perhaps some of these should be rejected?

That would nice, but I think that if the patch is accepted then the
issue should be resolved and we should stop seeing this kind of CVE.

(I must admit that my motivation for creating this patch in the first
place is that I am fed up with the amount of hassle that is involved
each time a new CVE is created.  Especially when they are essentially
all the same bug).

Cheers
  Nick

Re: RFA/RFC: Add stack recursion limit to libiberty's demangler

2018-11-30 Thread Nick Clifton

Hi Ian,

  *sigh* it is always the way.  You post a patch and five minutes later
  you find a bug in it.  In this case it turned out that I had forgotten
  that gcc has its own copy of the libiberty sources, so the bootstrap
  test that I had run did not use the patched sources.  Doh.  When I did
  rerun the bootstrap with the patched sources it failed because I had
  forgotten to use the CP_STATIC_IF_GLIBCPP_V3 macro when declaring the
  new cplus_demangle_set_recursion_limit() function.

  I am attaching a revised patch with this bug fixed, and an updated
  changelog entry as I have found a few more CVE PRs that it fixes.

  Also - Tom and Pedro have raised the issue that the patch introduces
  a new static variable to the library that is not thread safe.  I am
  not sure of the best way to address this problem.  Possibly the
  variable could be made thread local ?  Are there any other static
  variables in libiberty that face the same issue ?
  
Cheers
  Nick

libiberty/ChangeLog
2018-11-29  Nick Clifton  

PR 87681
PR 87675
PR 87636
PR 87335
* cp-demangle.c (demangle_recursion_limit): New static
variable.
(d_function_type): Add recursion counter.  If the recursion
limit is enabled and reached, return with a failure result.
(d_demangle_callback): If the recursion limit is enabled, check
for a mangled string that is so long that there is not enough
stack space for the local arrays.
(cplus_demangle_set_recursion): New function.  Sets and/or
returns the current stack recursion limit.
* cplus-dem.c (demangle_nested_args): Add recursion counter.  If
the recursion limit is enabled and reached, return with a
failure result.

Index: libiberty/cp-demangle.c
===
--- libiberty/cp-demangle.c	(revision 266657)
+++ libiberty/cp-demangle.c	(working copy)
@@ -62,6 +62,7 @@
   cplus_demangle_fill_dtor
   cplus_demangle_print
   cplus_demangle_print_callback
+  cplus_demangle_set_recursion_limit
and other functions defined in the file cp-demint.c.
 
This file also defines some other functions and variables which are
@@ -181,6 +182,9 @@
 #define cplus_demangle_init_info d_init_info
 static void d_init_info (const char *, int, size_t, struct d_info *);
 
+#define cplus_demangle_set_recursion_limit d_set_recursion_level
+static  unsigned int d_set_recursion_limit (unsigned int);
+
 #else /* ! defined(IN_GLIBCPP_V3) */
 #define CP_STATIC_IF_GLIBCPP_V3
 #endif /* ! defined(IN_GLIBCPP_V3) */
@@ -2852,21 +2856,34 @@
 static struct demangle_component *
 d_function_type (struct d_info *di)
 {
-  struct demangle_component *ret;
+  static unsigned long recursion_level = 0;
+  struct demangle_component *ret = NULL;
 
-  if (! d_check_char (di, 'F'))
-return NULL;
-  if (d_peek_char (di) == 'Y')
+  if ((di->options & DMGL_RECURSE_LIMIT)
+  && recursion_level > demangle_recursion_limit)
 {
-  /* Function has C linkage.  We don't print this information.
-	 FIXME: We should print it in verbose mode.  */
-  d_advance (di, 1);
+  /* FIXME: There ought to be a way to report that the recursion limit
+	 has been reached.  */
+  return NULL;
 }
-  ret = d_bare_function_type (di, 1);
-  ret = d_ref_qualifier (di, ret);
 
-  if (! d_check_char (di, 'E'))
-return NULL;
+  recursion_level ++;
+  if (d_check_char (di, 'F'))
+{
+  if (d_peek_char (di) == 'Y')
+	{
+	  /* Function has C linkage.  We don't print this information.
+	 FIXME: We should print it in verbose mode.  */
+	  d_advance (di, 1);
+	}
+  ret = d_bare_function_type (di, 1);
+  ret = d_ref_qualifier (di, ret);
+  
+  if (! d_check_char (di, 'E'))
+	ret = NULL;
+}
+
+  recursion_level --;
   return ret;
 }
 
@@ -6242,6 +6259,20 @@
 
   cplus_demangle_init_info (mangled, options, strlen (mangled), &di);
 
+  /* PR 87675 - Check for a mangled string that is so long
+ that we do not have enough stack space to demangle it.  */
+  if ((options & DMGL_RECURSE_LIMIT)
+  /* This check is a bit arbitrary, since what we really want to do is to
+	 compare the sizes of the di.comps and di.subs arrays against the
+	 amount of stack space remaining.  But there is no portable way to do
+	 this, so instead we use the recursion limit as a guide to the maximum
+	 size of the arrays.  */
+  && (unsigned long) di.num_comps > demangle_recursion_limit)
+{
+  /* FIXME: We need a way to indicate that a stack limit has been reached.  */
+  return 0;
+}
+
   {
 #ifdef CP_DYNAMIC_ARRAYS
 __extension__ struct demangle_component comps[di.num_comps];
@@ -6324,6 +6355,23 @@
   return dgs.buf;
 }
 
+/* Set a limit on the amount of recursion allowed in mangled strings.
+   Only effective if the DMGL_RECURSE_LIMIT option has been set.
+   Returns the previous value of the recursion limit.
+   Ignores settin

Re: [PATCH] asm non-code template parts (alternative to asm inline)

2018-11-30 Thread Alexander Monakov

On Fri, 30 Nov 2018, Richard Biener wrote:
> > Ping - as I think this approach addresses the root of the problem, I 
> > wouldn't
> > like it to be forgotten.
> 
> I agree this is also useful but it addresses another issue (that may appear to
> be related).  asm inline is really a hint to the inliner estimates (with no 
> way
> to get semantics botched) while marking off-section parts is making the
> asm text more precise also affecting code generation and thus has the
> possibility to cause correctness issues (if you say mark everything as
> off-section just to make it inline better).

I don't think that's true: if the user marks too much of the template as
off-section and makes GCC under-estimate branch ranges, they may receive an
error from the assembler. But that's a build failure, not a correctness
issue. Surely build error is a reasonable outcome from misuse of inline asm.

> I'm sympathtetic to both patches but clearly the kernel folks have shown
> need for the inline hint (arguably the kernel folks are the ones we could
> expect to get usages of %` correct ...).  I haven't seen any comments
> from possible users of %`

FWIW, when I raised the idea in the kernel thread, the response from Borislav
was:

>> I don't mind it but I see you guys are still discussing what would be
>> the better solution here, on the gcc ML. And we, as one user, are a
>> happy camper as long as it does what it is meant to do. But how the
>> feature looks like syntactically is something for gcc folk to decide as
>> they're going to support it for the foreseeable future and I'm very well
>> aware of how important it is for a supportable feature to be designed
>> properly.

Alexander

Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-30 Thread Jakub Jelinek

On Fri, Nov 30, 2018 at 09:04:56AM +0100, Richard Biener wrote:
> So, let's suppose we delay folding of (builtin [string]) calls to some
> first special pass that also diagnoses those issues.  One obvious
> place to place this pass would be where we now do the
> early object-size pass.  In fact we might want to merge
> the object-size pass with the strlen pass because they sound so
> much related.  This pass would then fold the calls and set some

I don't see them related at all and there is nothing they have in common in
the way how they are implemented actually.

That said, the object-size pass is like a fab pass for a single builtin,
so handling in its main loop other builtins is fine.

Jakub

Re: [PATCH] clarify comments for implicit_p flag for built-ins

2018-11-30 Thread Richard Biener

On Wed, Nov 28, 2018 at 7:18 PM Martin Sebor  wrote:
>
> On 11/28/18 6:35 AM, Richard Biener wrote:
> > On Tue, Nov 27, 2018 at 3:52 AM Martin Sebor  wrote:
> >>
> >> Ping: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg01759.html
> >>
> >> If there are no objections or suggestions for tweaks I'll commit
> >> this updated comment this week.
> >
> > Please do not commit such changes w/o approval.
>
> Let me follow up on this separately.
>
>
> > Does the new comments match existing usage, for example in gimplify.c
> > where we do
> >
> > static void
> > build_stack_save_restore (gcall **save, gcall **restore)
> > {
> >tree tmp_var;
> >
> >*save = gimple_build_call (builtin_decl_implicit (BUILT_IN_STACK_SAVE), 
> > 0);
> >tmp_var = create_tmp_var (ptr_type_node, "saved_stack");
> >gimple_call_set_lhs (*save, tmp_var);
> >
> >*restore
> >  = gimple_build_call (builtin_decl_implicit (BUILT_IN_STACK_RESTORE),
> >   1, tmp_var);
> > }
> >
> > ?
>
> I don't know why this code uses builtin_decl_implicit here.  I don't
> think it's necessary or even correct.  If builtin_decl_implicit were
> to return null gimple_build_call() would ICE.  Replacing it with
> builtin_decl_explicit passes an x86_64 bootstrap and regression test.
>
> I see the same pattern in some of the other calls to
> builtin_decl_implicit with non-library built-ins.  E.g., in
> gimplify_function_tree:
>
>x = builtin_decl_implicit (BUILT_IN_RETURN_ADDRESS);
>call = gimple_build_call (x, 1, integer_zero_node);
>
> > +/* Return the tree node for the built-in function declaration corresponding
> > +   to FNCODE if its IMPLICIT_P flag has been set or NULL otherwise.
> > +   IMPLICIT_P is clear for library built-ins that GCC implements but that
> > +   may not be implemented in the runtime library on the target.  See also
> > +   the DEF_BUILTIN macro in builtins.def.  */
> > +
> >
> > it isn't exactly helpful to specify when IMPLICIT_P is not set.  When is it
> > actually set?
> >
> > My understanding of the difference is that the compiler may create a call
> > to a builtin_decl_implicit decl but _not_ to a builtin_decl_explicit one
> > unless a call to such function already existed.
>
> This sounds like an important detail to capture but I'm not sure
> I quite understand it to do so.
>
> Besides the BUILT_IN_STACK question above, cp/cp-gimplify.c for
> example calls builtin_decl_explicit with BUILT_IN_UNREACHABLE
> to unconditionally insert a call to the built-in.  It does that
> whether or not a call to the built-in has been seen in the program.
> E.g., this:
>
>int f () { }
>int g () { return f (); }
>
> ends up with a __builtin_unreachable call (IMPLICIT is set here
> so the result would be the same if cp/cp-gimplify.c called
> builtin_decl_implicit instead, but that seems beside the point).
>
> > Note we set the implicit flag during gimplification if we see such decl
> > referenced and it is marked as properly declared by the FE.
>
> I see this in gimplify.c:
>
>/* If we see a call to a declared builtin or see its address
>  being taken (we can unify those cases here) then we can mark
>  the builtin for implicit generation by GCC.  */
>if (TREE_CODE (op0) == FUNCTION_DECL
>   && fndecl_built_in_p (op0, BUILT_IN_NORMAL)
>   && builtin_decl_declared_p (DECL_FUNCTION_CODE (op0)))
> set_builtin_decl_implicit_p (DECL_FUNCTION_CODE (op0), true);
>
> This makes sense to me for library built-ins but for others like
> __builtin_unreachable it seems unnecessary.  Their IMPLICIT flag
> is already set.  The only calls to set_builtin_decl_implicit_p
> that I can find that clear the IMPLICIT flag are in the VMS
> back-end for BUILT_IN_FWRITE (apparently because the function
> is non-standard there).
>
> >
> > Overall your patch is an improvement I think but maybe you can
> > refer to builtins.def in the documentation for builtin_decl_implicit?
>
> Sure, I'm happy to adjust it once we're confident our understanding
> of how the flag works is correct.  In light of your comments I for
> one don't feel I'm there yet.

Everytime I get to the difference of implicit vs. explicit I am confused
as well.  Thus I am not saying I completely capture the difference...
(given all the inconsistencies in uses throughout the GCC codebase).

In fact I fail to see the need to have both - having only _implicit would
seem enough for the middle-end ...

Richard.

> Thanks
> Martin
>
> >
> >> On 11/20/18 1:24 PM, Martin Sebor wrote:
> >>> On 11/20/2018 11:02 AM, Martin Sebor wrote:
>  Would the updated comments in the attached patch more accurately
>  describe the purpose of the IMPLICIT_P flag and
>  the builtin_decl_explicit() and builtin_decl_implicit() functions?
> 
>  I ended up here while trying to understand the differences between
>  the functions on different targets and decide which one should be
>  used to diagnose bugs like

Re: [PING #4][PATCH] avoid warning on constant strncpy until next statement is reachable (PR 87028)

2018-11-30 Thread Richard Biener

On Fri, Nov 30, 2018 at 3:02 AM Jeff Law  wrote:
>
> On 11/29/18 4:43 PM, Martin Sebor wrote:
> >> The fallout on existing tests is minimal.  What's more concerning is
> >> that it doesn't actually pass the new test from Martin's original
> >> submission.  We get bogus warnings.
> >>
> >> At least part of the problem is weakness in maybe_diag_stxncpy_trunc.
> >> It can't handle something like this:
> >>
> >> test_literal (char * d, struct S * s)
> >> {
> >>strncpy (d, "1234567890", 3);
> >>_1 = d + 3;
> >>*_1 = 0;
> >> }
> >>
> >>
> >> Note the pointer arithmetic between the strncpy and storing the NUL
> >> terminator.

I already said the way the code looks for the next stmt is totally backwards...

Oh, and I also already voiced my concerns about emitting warnings
from folding code, did I? ...

So, let's suppose we delay folding of (builtin [string]) calls to some
first special pass that also diagnoses those issues.  One obvious
place to place this pass would be where we now do the
early object-size pass.  In fact we might want to merge
the object-size pass with the strlen pass because they sound so
much related.  This pass would then fold the calls and set some
cfun->gimple_df->after_call_warnings flag we could test in the
folder (similar to how we have avoid_folding_inline_builtin ()).

This placement ensures that we already got functions early
inlined (albeit in "early optimized" form but with their diagnostics
already been emitted).

This is of course all GCC 10 material.

> > Right.  I'm less concerned about this case because it involves
> > a literal that's obviously longer than the destination but it
> > would be nice if the suppression worked here as well in case
> > the literal comes from macro expansion.  It will require
> > another tweak.
> OK.  If this isn't at the core of the regression BZ, then xfailing those
> particular cases and coming back to them is fine.
>
> >
> > But the test from my patch passes with the changes to calls.c
> > from my patch, so that's an improvement.
> >
> > I have done the test suite cleanup in the attached patch.  It
> > was indeed minimal -- not sure why I saw so many failures with
> > my initial approach.
> Richi's does the folding as part of gimple lowering.  So it's still
> pretty early -- basically it ends up hitting just a few tests that are
> looking for folded stuff in the .gimple dump.
>
> I had actually targeted this patch as one to work through and try to get
> resolved today.  Kind of funny that we were poking at it at the same time.
>
>
> >
> > I can submit a patch to handle the literal case above as
> > a followup unless you would prefer it done at the same time.
> Follow-up is fine by me.
>
> jeff

1 2 >

100 matches

Mail list logo