date:20180831

Re: Make safe_iterator inline friends

2018-08-31 Thread Jonathan Wakely


On 30/08/18 21:56 +0200, François Dumont wrote:

--- a/libstdc++-v3/include/debug/safe_iterator.h
+++ b/libstdc++-v3/include/debug/safe_iterator.h
@@ -36,6 +36,28 @@
#include 
#include 

+#define _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, _BadMsgId, _DiffMsgId) \
+  _GLIBCXX_DEBUG_VERIFY(!_Lhs._M_singular() && !_Rhs._M_singular(),\
+   _M_message(_BadMsgId)   \
+   ._M_iterator(_Lhs, #_Lhs)   \
+   ._M_iterator(_Rhs, #_Rhs)); \
+  _GLIBCXX_DEBUG_VERIFY(_Lhs._M_can_compare(_Rhs), \
+   _M_message(_DiffMsgId)  \
+   ._M_iterator(_Lhs, #_Lhs)   \
+   ._M_iterator(_Rhs, #_Rhs))
+
+#define _GLIBCXX_DEBUG_VERIFY_CMP_OPERANDS(_Lhs, _Rhs) \
+  _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, __msg_iter_compare_bad,   \
+__msg_compare_different)
+
+#define _GLIBCXX_DEBUG_VERIFY_ORDER_OPERANDS(_Lhs, _Rhs)   \
+  _GLIBCXX_DEBUG_VERIFY_OPERANDS(_Lhs, _Rhs, __msg_iter_order_bad, \
+__msg_order_different)


Please use "EQ" instead of "CMP" and "REL" instead of "ORDER". Those
are the names used in the standard (the relational operators like less
than are still "comparisons" so only using CMP/compare for equality is
wrong).

OK for trunk with that change.

Re: DWARF: Represent hard frame pointer as stack pointer + offset

2018-08-31 Thread H.J. Lu

On Fri, Aug 31, 2018 at 1:32 PM, Jason Merrill  wrote:
> On Fri, Aug 31, 2018 at 3:33 PM, H.J. Lu  wrote:
>> On Thu, Aug 30, 2018 at 10:21 AM, Jason Merrill  wrote:
>>>
 r138335 allowed arg_pointer_rtx to be eliminated by either FP or SP,
 but only when dynamic stack alignment is supported.  In this case,
 arg_pointer_rtx is eliminated by FP even when frame_pointer_needed
 is false and there is no dynamic stack alignment at all.

> gcc_assert (elim == stack_pointer_rtx || (frame_pointer_needed && elim
> == hard_frame_pointer_rtx));
>
> so as not to allow eliminating to an uninitialized FP.

 FP isn't uninitialized.  It is initialized the same way as in the case of
 SUPPORTS_STACK_ALIGNMENT is true.
>>>
>>> How is that?  Why would it be initialized when frame_pointer_needed is
>>> false?  What does it mean for frame_pointer_needed to be false, if
>>> not, as in the documentation of -fomit-frame-pointer,
>>>
>>> "This avoids the instructions to save, set up and restore the frame
>>> pointer; on many targets it also makes an extra register available."
>>>
>>> ?
>>
>> A backend may not set up frame pointer even with -fno-omit-frame-pointer.
>> In the case of x86,  hard frame pointer can be represented by stack pointer
>> - UNITS_PER_WORD.
>>
>> This patch adds hard_frame_pointer_from_stack_pointer_plus_offset and
>> hard_frame_pointer_offset to rtl_data to allow a backend to represent
>> hard frame pointer as stack pointer + offset.
>
> Shouldn't this be fixed in eliminate_regs rather than dwarf2out?
>

With -fno-omit-frame-pointer, arg pointer is eliminated with hard frame
pointer.  But

commit cd557ff63f388ad27c376d0a225e74d3594a6f9d
Author: hjl 
Date:   Thu Aug 10 15:29:05 2017 +

i386: Don't use frame pointer without stack access

When there is no stack access, there is no need to use frame pointer
even if -fno-omit-frame-pointer is used and caller's frame pointer is
unchanged.

changed it in the last minute.  It is too late to go back.  When it is done,
hard frame pointer must be replaced by stack pointer - UNITS_PER_WORD
if it is ever used.


-- 
H.J.

Re: [PATCH][debug] Add -gdescribe-dies

2018-08-31 Thread Jason Merrill

On Fri, Aug 24, 2018 at 11:38 AM, Tom de Vries  wrote:
> [ was: Re: [PATCH][debug] Add -gdescriptive-dies ]
> On Fri, Aug 24, 2018 at 12:44:38PM +0200, Richard Biener wrote:
>> On Wed, 22 Aug 2018, Tom de Vries wrote:
>>
>> > [ was: Re: [PATCH][debug] Add -gforce-named-dies ]
>> >
>> > On 08/22/2018 11:46 AM, Tom de Vries wrote:
>> > > On 08/22/2018 08:56 AM, Tom de Vries wrote:
>> > >> This is an undocumented developer-only option, because using this 
>> > >> option may
>> > >> change behaviour of dwarf consumers, f.i., gdb shows the artificial 
>> > >> variables:
>> > >> ...
>> > >> (gdb) info locals
>> > >> a = 0x7fffda90 "\005"
>> > >> D.4278 = 
>> > >> ...
>> > > I just found in the dwarf 5 spec the attribute DW_AT_description
>> > > (present since version 3):
>> > > ...
>> > > 2.20 Entity Descriptions
>> > > Some debugging information entries may describe entities in the program
>> > > that are artificial, or which otherwise have a “name” that is not a
>> > > valid identifier in the programming language.
>> > >
>> > > This attribute provides a means for the producer to indicate the purpose
>> > > or usage of the containing debugging infor
>> > >
>> > > Generally, any debugging information entry that has, or may have, a
>> > > DW_AT_name attribute, may also have a DW_AT_description attribute whose
>> > > value is a null-terminated string providing a description of the entity.
>> > >
>> > > It is expected that a debugger will display these descriptions as part
>> > > of displaying other properties of an entity.
>> > > ...
>> > >
>> > > AFAICT, gdb currently does not explicitly handle this attribute, which I
>> > > think means it's ignored.
>> > >
>> > > It seems applicable to use DW_AT_description at least for the artificial
>> > > decls.
>> > >
>> > > Perhaps even for all cases that are added by the patch?
>> > >
>> >
>> > I've chosen for this option. Using DW_AT_desciption instead of
>> > DW_AT_name should minimize difference in gdb behaviour with and without
>> > -gdescriptive-dies.
>>
>> -gdescribe-dies?
>>
>
> Done.
>
>> > > I'll rewrite the patch.
>> >
>> > OK for trunk?
>>
>> Few comments:
>>
>> +static void
>> +add_desc_attribute (dw_die_ref die, tree decl)
>> +{
>> +  tree decl_name;
>> +
>> +  if (!flag_descriptive_dies || dwarf_version < 3)
>> +return;
>> +
>>
>> you said this is a DWARF5 "feature",
>
> No, it's a DWARF3 "feature". I copied the text from the DWARF5 spec.
>
>> I'd suggest changing the
>> check to
>>
>>   if (!flag_desctiprive_dies || (dwarf_version < 5 && dwarf_strict))
>>
>> given -gdescribe-dies is enough of a user request to enable the
>> feature.
>
> Done.
>
>> Not sure if we should warn about -gstrict-dwarf
>> combination with it and -gdwarf-N < 5.
>>
>
> Not sure either, left it out.
>
>> +  else if (TREE_CODE (decl) == VAR_DECL || TREE_CODE (decl) ==
>> CONST_DECL)
>> +{
>> +  char buf[32];
>> +  char decl_letter = TREE_CODE (decl) == CONST_DECL ? 'C' : 'D';
>> +  sprintf (buf, "%c.%u", decl_letter, DECL_UID (decl));
>> +  add_desc_attribute (die, buf);
>> +}
>>
>> I wondered before if we can use pretty-printing of decl here.  At
>> least I wouldn't restrict it to VAR_DECLs, FUNCTION_DECLs can
>> certainly appear here I think.
>>
>
> Done.
>
>> +@item -gdescriptive-dies
>> +@opindex gdescriptive-dies
>> +Add description attribute to DWARF DIEs that have no name attribute.
>> +
>>
>> Either "description attributes" or "a description attribute", not
>> 100% sure being a non-native speaker.
>>
>
> Went for "description attributes".
>
>> Otherwise the patch looks OK to me but please leave Jason time
>> to comment.
>>
>
> Will do.
>
> Untested patch attached.
>
> Thanks,
> - Tom
>
> [debug] Add -gdescribe-dies
>
> This patch adds option -gdescribe-dies.  It sets the DW_AT_description
> attribute of dies that do not get a DW_AT_name attribute, to make it easier to
> figure out what the die is describing.
>
> The option exports the names of artificial variables:
> ...
>  DIE0: DW_TAG_variable (0x7fa934dd54b0)
> +  DW_AT_description: "D.1922"
>DW_AT_type: die -> 0 (0x7fa934dd0d70)
>DW_AT_artificial: 1
>
> ...
> which can be traced back to gimple dumps:
> ...
>   char a[0:D.1922] [value-expr: *a.0];
> ...
>
> Furthermore, it adds names to external references:
> ...
>  DIE0: DW_TAG_subprogram (0x7fa88b9650f0)
> +DW_AT_description: "main"
>  DW_AT_abstract_origin: die -> label: vla_1.c.6719312a + 29 (0x7fa88b965140)

Please add more of this description to the one-line documentation
patch you have now; there are many DIEs that have no name because they
don't need one, and this patch doesn't add names to all of them.

These two cases seem like they have very different uses, but I suppose
we don't really need two new options.

Jason

Re: V3 [PATCH] C/C++: Add -Waddress-of-packed-member

2018-08-31 Thread Jason Merrill


On 07/23/2018 05:24 PM, H.J. Lu wrote:

On Mon, Jun 18, 2018 at 12:26 PM, Joseph Myers  wrote:

On Mon, 18 Jun 2018, Jason Merrill wrote:


On Mon, Jun 18, 2018 at 11:59 AM, Joseph Myers  wrote:

On Mon, 18 Jun 2018, Jason Merrill wrote:


+  if (TREE_CODE (rhs) == COND_EXPR)
+{
+  /* Check the THEN path first.  */
+  tree op1 = TREE_OPERAND (rhs, 1);
+  context = check_address_of_packed_member (type, op1);


This should handle the GNU extension of re-using operand 0 if operand
1 is omitted.


Doesn't that just use a SAVE_EXPR?


Hmm, I suppose it does, but many places in the compiler seem to expect
that it produces a COND_EXPR with TREE_OPERAND 1 as NULL_TREE.


Maybe that's used somewhere inside the C++ front end.  For C a SAVE_EXPR
is produced directly.


Here is the updated patch.  Changes from the last one:

1. Handle COMPOUND_EXPR.
2. Fixed typos in comments.
3. Combined warn_for_pointer_of_packed_member and
warn_for_address_of_packed_member into
warn_for_address_or_pointer_of_packed_member.



c.i:4:33: warning: converting a packed ‘struct C *’ pointer increases the 
alignment of ‘long int *’ pointer from 1 to 8 [-Waddress-of-packed-member]


I think this would read better as

c.i:4:33: warning: converting a packed ‘struct C *’ pointer (alignment 
1) to ‘long int *’ (alignment 8) may result in an unaligned pointer 
value [-Waddress-of-packed-member]



+  while (TREE_CODE (base) == ARRAY_REF)
+   base = TREE_OPERAND (base, 0);
+  if (TREE_CODE (base) != COMPONENT_REF)
+   return NULL_TREE;


Are you deliberately not handling the other handled_component_p cases? 
If so, there should be a comment.



+  /* Check alignment of the object.  */
+  if (TREE_CODE (object) == COMPONENT_REF)
+{
+  field = TREE_OPERAND (object, 1);
+  if (TREE_CODE (field) == FIELD_DECL && DECL_PACKED (field))
+   {
+ type_align = TYPE_ALIGN (type);
+ context = DECL_CONTEXT (field);
+ record_align = TYPE_ALIGN (context);
+ if ((record_align % type_align) != 0)
+   return context;
+   }
+}


Why doesn't this recurse?  What if you have a packed field three 
COMPONENT_REFs down?



+  if (TREE_CODE (rhs) == COND_EXPR)
+{
+  /* Check the THEN path first.  */
+  tree op1 = TREE_OPERAND (rhs, 1);
+  context = check_address_of_packed_member (type, op1);
+  if (context)
+   rhs = op1;
+  else
+   {
+ /* Check the ELSE path.  */
+ rhs = TREE_OPERAND (rhs, 2);
+ context = check_address_of_packed_member (type, rhs);
+   }
+}


Likewise, what if you have more levels of COND_EXPR?  Or COMPOUND_EXPR 
within COND_EXPR?



@@ -7470,6 +7470,9 @@ convert_for_arg_passing (tree type, tree val, 
tsubst_flags_t complain)
+  warn_for_address_or_pointer_of_packed_member (true, type, val);



@@ -8914,6 +8914,8 @@ convert_for_assignment (tree type, tree rhs,
+  warn_for_address_or_pointer_of_packed_member (true, type, rhs);


Why would address_p be true in these calls?  It seems that you are 
warning at the point of assignment but looking for the warning about 
taking the address rather than the one about assignment.


If you want to warn about taking the address, shouldn't that happen 
under cp_build_addr_expr?  Alternately, drop the address_p parameter and 
choose your path inside warn_for_*_packed_member based on whether rhs is 
an ADDR_EXPR there rather than in the caller.


Jason

Re: [C++ Patch] PR 84980 ("[concepts] ICE with missing typename in concept")

2018-08-31 Thread Jason Merrill

OK.  And I'd say that this sort of error recovery qualifies as obvious.

Jason


On Fri, Aug 31, 2018 at 5:32 AM, Paolo Carlini  wrote:
> Hi,
>
> another straightforward ICE on invalid.  I spent however quite a bit of time
> on it, because I tried to catch the error_mark_node as early as possible,
> but that doesn't seem to work well error-recovery-wise: if, say, we catch it
> in deduce_constrained_parameter we end-up emitting a second redundant "‘C’
> is not a type" error message because we skip 'C' completely. Thus, at least
> for the time being, I propose to simply catch the problem in
> finish_shorthand_constraint (which likely avoids more ICEs already in
> Bugzilla...). Tested x86_64-linux.
>
> Thanks, Paolo.
>
> 
>

Re: DWARF: Represent hard frame pointer as stack pointer + offset

2018-08-31 Thread Jason Merrill

On Fri, Aug 31, 2018 at 3:33 PM, H.J. Lu  wrote:
> On Thu, Aug 30, 2018 at 10:21 AM, Jason Merrill  wrote:
>>
>>> r138335 allowed arg_pointer_rtx to be eliminated by either FP or SP,
>>> but only when dynamic stack alignment is supported.  In this case,
>>> arg_pointer_rtx is eliminated by FP even when frame_pointer_needed
>>> is false and there is no dynamic stack alignment at all.
>>>
 gcc_assert (elim == stack_pointer_rtx || (frame_pointer_needed && elim
 == hard_frame_pointer_rtx));

 so as not to allow eliminating to an uninitialized FP.
>>>
>>> FP isn't uninitialized.  It is initialized the same way as in the case of
>>> SUPPORTS_STACK_ALIGNMENT is true.
>>
>> How is that?  Why would it be initialized when frame_pointer_needed is
>> false?  What does it mean for frame_pointer_needed to be false, if
>> not, as in the documentation of -fomit-frame-pointer,
>>
>> "This avoids the instructions to save, set up and restore the frame
>> pointer; on many targets it also makes an extra register available."
>>
>> ?
>
> A backend may not set up frame pointer even with -fno-omit-frame-pointer.
> In the case of x86,  hard frame pointer can be represented by stack pointer
> - UNITS_PER_WORD.
>
> This patch adds hard_frame_pointer_from_stack_pointer_plus_offset and
> hard_frame_pointer_offset to rtl_data to allow a backend to represent
> hard frame pointer as stack pointer + offset.

Shouldn't this be fixed in eliminate_regs rather than dwarf2out?

Jason

DWARF: Represent hard frame pointer as stack pointer + offset

2018-08-31 Thread H.J. Lu

On Thu, Aug 30, 2018 at 10:21 AM, Jason Merrill  wrote:
>
>> r138335 allowed arg_pointer_rtx to be eliminated by either FP or SP,
>> but only when dynamic stack alignment is supported.  In this case,
>> arg_pointer_rtx is eliminated by FP even when frame_pointer_needed
>> is false and there is no dynamic stack alignment at all.
>>
>>> gcc_assert (elim == stack_pointer_rtx || (frame_pointer_needed && elim
>>> == hard_frame_pointer_rtx));
>>>
>>> so as not to allow eliminating to an uninitialized FP.
>>
>> FP isn't uninitialized.  It is initialized the same way as in the case of
>> SUPPORTS_STACK_ALIGNMENT is true.
>
> How is that?  Why would it be initialized when frame_pointer_needed is
> false?  What does it mean for frame_pointer_needed to be false, if
> not, as in the documentation of -fomit-frame-pointer,
>
> "This avoids the instructions to save, set up and restore the frame
> pointer; on many targets it also makes an extra register available."
>
> ?
>

A backend may not set up frame pointer even with -fno-omit-frame-pointer.
In the case of x86,  hard frame pointer can be represented by stack pointer
- UNITS_PER_WORD.

This patch adds hard_frame_pointer_from_stack_pointer_plus_offset and
hard_frame_pointer_offset to rtl_data to allow a backend to represent
hard frame pointer as stack pointer + offset.

OK for trunk?


-- 
H.J.
From 6fc2b6ca9ca77afa2057d2d48ec8f8131036100e Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 8 Aug 2018 06:20:27 -0700
Subject: [PATCH] DWARF: Represent hard frame pointer as stack pointer + offset

With

commit cd557ff63f388ad27c376d0a225e74d3594a6f9d
Author: hjl 
Date:   Thu Aug 10 15:29:05 2017 +

i386: Don't use frame pointer without stack access

When there is no stack access, there is no need to use frame pointer
even if -fno-omit-frame-pointer is used and caller's frame pointer is
unchanged.

frame pointer may not be available even if -fno-omit-frame-pointer is
used.  When this happened, arg pointer and frame pointer may be eliminated
by hard frame pointer.  In this case, hard frame pointer may be represented
by stack pointer + offset.

This patch adds hard_frame_pointer_from_stack_pointer_plus_offset and
hard_frame_pointer_offset to rtl_data to allow a backend to represent
hard frame pointer as stack pointer + offset.

gcc/

	PR debug/86593
	* dwarf2out.c (based_loc_descr): When frame pointer isn't
	used, if arg pointer or frame pointer are eliminated by hard
	frame pointer, use stack pointer + offset to access stack
	variables if possible.  Update gcc_assert.
	* emit-rtl.h (rtl_data): Add hard_frame_pointer_offset and
	hard_frame_pointer_from_stack_pointer_plus_offset.
	* config/i386/i386.c (ix86_finalize_stack_frame_flags): Set
	hard_frame_pointer_from_stack_pointer_plus_offset and
	hard_frame_pointer_offset to represent hard frame pointer as
	stack pointer - UNITS_PER_WORD if frame pointer isn't used.

gcc/testsuite/

	PR debug/86593
	* g++.dg/pr86593.C: New test.
---
 gcc/config/i386/i386.c |  6 --
 gcc/dwarf2out.c| 13 +++--
 gcc/emit-rtl.h |  8 
 gcc/testsuite/g++.dg/pr86593.C | 11 +++
 4 files changed, 34 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/pr86593.C

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 8672a666024..775b96473b1 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -13161,10 +13161,12 @@ ix86_finalize_stack_frame_flags (void)
 	  df_compute_regs_ever_live (true);
 	  df_analyze ();
 
+	  /* Since frame pointer is no longer available, replace it with
+	 stack pointer - UNITS_PER_WORD in debug insns.  */
+	  crtl->hard_frame_pointer_from_stack_pointer_plus_offset = true;
+	  crtl->hard_frame_pointer_offset = -UNITS_PER_WORD;
 	  if (flag_var_tracking)
 	{
-	  /* Since frame pointer is no longer available, replace it with
-		 stack pointer - UNITS_PER_WORD in debug insns.  */
 	  df_ref ref, next;
 	  for (ref = DF_REG_USE_CHAIN (HARD_FRAME_POINTER_REGNUM);
 		   ref; ref = next)
diff --git a/gcc/dwarf2out.c b/gcc/dwarf2out.c
index 77317ed2575..09f3c192129 100644
--- a/gcc/dwarf2out.c
+++ b/gcc/dwarf2out.c
@@ -14326,9 +14326,18 @@ based_loc_descr (rtx reg, poly_int64 offset,
   if (elim != reg)
 	{
 	  elim = strip_offset_and_add (elim, );
+
+	  if (!frame_pointer_needed
+	  && elim == hard_frame_pointer_rtx
+	  && crtl->hard_frame_pointer_from_stack_pointer_plus_offset)
+	{
+	  int fp_offset = crtl->hard_frame_pointer_offset;
+	  int base_reg = DWARF_FRAME_REGNUM (STACK_POINTER_REGNUM);
+	  return new_reg_loc_descr (base_reg, offset + fp_offset);
+	}
+
 	  gcc_assert ((SUPPORTS_STACK_ALIGNMENT
-		   && (elim == hard_frame_pointer_rtx
-			   || elim == stack_pointer_rtx))
+		   && elim == stack_pointer_rtx)
 	  || elim == (frame_pointer_needed
   ? hard_frame_pointer_rtx
   : stack_pointer_rtx));
diff --git

Re: [PING v2][PATCH] Make function clone name numbering independent.

2018-08-31 Thread Michael Ploujnikov

On 2018-08-13 07:58 PM, Michael Ploujnikov wrote:
> Ping and I've updated the patch since last time as follows:
> 
>   - unittest scans assembly rather than the constprop dump because its
> forward changed
>   - unittests should handle different hosts where any of
> NO_DOT_IN_LABEL, NO_DOLLAR_IN_LABEL or __USER_LABEL_PREFIX__ may
> be defined
>   - not 100% it's safe to change DECL_NAME to DECL_ASSEMBLER_NAME in
> cgraph_node::create_virtual_clone, but I've attempted to reduce
> some code duplication
>   - lto-partition.c: privatize_symbol_name_1 *does* need numbered
> names
>   - but cold sections don't
>   - Expecting an IDENTIFIER_NODE in clone_function_name_1 avoids
> unreliable string pointer use as pointed out in the first review
>   - renamed clone_function_name_1 and clone_function_name to
> numbered_clone_function_name_1 and numbered_clone_function_name to
> clarify purpose and discourage future unintended uses
> 
> Also bootstrapped and regtested.

Ping.

I've done some more digging into the current uses of
numbered_clone_function_name and checked if any tests fail if I change
it to suffixed_function_name:

  - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(thunk->decl, "artificial_thunk");
- no new tests fail, inconclusive
- and despite the comment on redirect_callee_duplicating_thunks
  about "one or more" duplicates it doesn't seem like
  duplicate_thunk_for_node would be called more than once for each
  node, assuming each node is named uniquely, but I'm far from an
  expert in this area
  - gcc/cgraphclones.c:  SET_DECL_ASSEMBLER_NAME (new_decl, 
numbered_clone_function_name (old_decl, suffix));
- called by cgraph_node::create_virtual_clone, definitely needs it
- this is where clones like .constprop come from
  - gcc/cgraphclones.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(old_decl, suffix);
- more tests fail
- this is where clones like .simdclone come from
- called by cgraph_node::create_version_clone_with_body, most likely needs 
it
  - gcc/config/rs6000/rs6000.c:  tree decl_name = numbered_clone_function_name 
(default_decl, "resolver");
- doesn't look like this needs the numbering as there should only
  be one resolver per multi-version function, but need someone
  with an rs/6000 to confirm
  - gcc/lto/lto-partition.c:  
numbered_clone_function_name_1 (identifier,
- definitely needed for disambiguation, shown by unit tests failing
  - gcc/multiple_target.c:
numbered_clone_function_name (node->decl,
- create_dispatcher_calls says it only creates the dispatcher once
- no new tests fail, inconclusive
  - gcc/multiple_target.c:
numbered_clone_function_name (node->decl,
- I have a feeling this isn't necessary
- no new tests fail, inconclusive
  - gcc/omp-expand.c:  DECL_NAME (kern_fndecl) = numbered_clone_function_name 
(kern_fndecl, "kernel");
- no new tests fail, inconclusive
- I didn't see (and couldn't figure out a way to get) any of the
  existing omp/acc tests actually exercise this codeptah
  - gcc/omp-low.c:  return numbered_clone_function_name (current_function_decl,
- definitely needed based on
  gcc/testsuite/c-c++-common/goacc/kernels-loop-2.c and others
  - gcc/omp-simd-clone.c:  DECL_NAME (new_decl) = 
numbered_clone_function_name (old_decl, "simdclone");
- no tests fail, inconclusive
- can definitely have more than one simdclone per function as above, but
  maybe not these external types
- some tests, like declare-simd.c actually exercise this codepath,
  but I couldn't find the resulting .simdclone decl the the
  simdclone pass dump nor in any of the other dumps to verify
  - gcc/symtab.c:  DECL_NAME (new_decl) = numbered_clone_function_name 
(node->decl, "localalias");
- no tests fail, inconclusive
- my understanding of function_and_variable_visibility says that
  there can only be one per function so maybe this isn't; hubicka?

I'll add patches to switch these once someone with expertise in these
areas can confirm that the numbering isn't needed in the respective
decl names.


- Michael

From adde0632266d3f1b0540e09b9db931df4302d2bc Mon Sep 17 00:00:00 2001
From: Michael Ploujnikov 
Date: Mon, 16 Jul 2018 12:55:49 -0400
Subject: [PATCH 1/4] Make function assembly more independent.

This changes clone_function_name such that clone names are based on a
per-function counter rather than a global one.

gcc:
2018-08-02  Michael Ploujnikov  

   Make function clone name numbering independent.
   * cgraphclones.c: Replace clone_fn_id_num with clone_fn_ids.
   (clone_function_name_1): Take an IDENTIFIER_NODE instead of a
   string. Use clone_fn_id_num.

lto:
2018-08-02  Michael Ploujnikov  

   * lto-partition.c (privatize_symbol_name_1): Pass a

Re: [Patch][Aarch64] Implement Aarch64 SIMD ABI and aarch64_vector_pcs attribute

2018-08-31 Thread Steve Ellcey



Ping.  Any feedback from the Aarch64 maintainers?

Steve Ellcey
sell...@cavium.com



On Mon, 2018-08-20 at 10:37 -0700, Steve Ellcey wrote:
> On Tue, 2018-08-07 at 12:15 -0500, Segher Boessenkool wrote:
> 
> > 
> > > 
> > > +/* { dg-final { scan-assembler-not "\[ \t\]stp\tq\[01234567\]" }
> > > }
> > > */
> > That's [0-7] but maybe you find [01234567] more readable here.
> Segher,  I fixed all the issues you pointed out except this one.
>  Since
> there are some uses of non consecutive numbers in one of the tests I 
> decided to leave [01234567] instead of using [0-7].  Here is the 
> latest version of the patch, there are no semantic changes, just
> syntactical ones to address the issues that you pointed out.
> 
> Steve Ellcey
> sell...@cavium.com
> 
> 
> 2018-08-20  Steve Ellcey  
> 
>   * config/aarch64/aarch64-protos.h
> (aarch64_use_simple_return_insn_p):
>   New prototype.
>   (aarch64_epilogue_uses): Ditto.
>   * config/aarch64/aarch64.c (aarch64_attribute_table): New
> array.
>   (aarch64_simd_decl_p): New function.
>   (aarch64_reg_save_mode): New function.
>   (aarch64_is_simd_call_p): New function.
>   (aarch64_function_ok_for_sibcall): Check for simd calls.
>   (aarch64_layout_frame): Check for simd function.
>   (aarch64_gen_storewb_pair): Handle E_TFmode.
>   (aarch64_push_regs): Use aarch64_reg_save_mode to get mode.
>   (aarch64_gen_loadwb_pair): Handle E_TFmode.
>   (aarch64_pop_regs): Use aarch64_reg_save_mode to get mode.
>   (aarch64_gen_store_pair): Handle E_TFmode.
>   (aarch64_gen_load_pair): Ditto.
>   (aarch64_save_callee_saves): Handle different mode sizes.
>   (aarch64_restore_callee_saves): Ditto.
>   (aarch64_components_for_bb): Check for simd function.
>   (aarch64_epilogue_uses): New function.
>   (aarch64_process_components): Check for simd function.
>   (aarch64_expand_prologue): Ditto.
>   (aarch64_expand_epilogue): Ditto.
>   (aarch64_expand_call): Ditto.
>   (TARGET_ATTRIBUTE_TABLE): New define.
>   * config/aarch64/aarch64.h (EPILOGUE_USES): Redefine.
>   (FP_SIMD_SAVED_REGNUM_P): New macro.
>   * config/aarch64/aarch64.md (V23_REGNUM) New constant.
>   (simple_return): New define_expand.
>   (load_pair_dw_tftf): New instruction.
>   (store_pair_dw_tftf): Ditto.
>   (loadwb_pair_): Ditto.
>   ("storewb_pair_): Ditto.
> 
> 2018-08-20  Steve Ellcey  
> 
>   * gcc.target/aarch64/torture/aarch64-torture.exp: New file.
>   * gcc.target/aarch64/torture/simd-abi-1.c: New test.
>   * gcc.target/aarch64/torture/simd-abi-2.c: Ditto.
>   * gcc.target/aarch64/torture/simd-abi-3.c: Ditto.
>   * gcc.target/aarch64/torture/simd-abi-4.c: Ditto.

Re: [PATCH] Adjust c_getstr/c_strlen to new STRING_CST semantic

2018-08-31 Thread Bernd Edlinger

On 08/31/18 16:42, Jeff Law wrote:
> On 08/31/2018 01:08 AM, Bernd Edlinger wrote:
>> Hi,
>>
>>
>> when re-testing the new STRING_CST semantic patch here:
>> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01569.html
>>
>> I noticed that the (my) fix for PR 87053 does still use
>> semantic properties of the STRING_CST that is not compatible
>> with the new proposed STRING_CST semantics.
>>
>> That means that c_strlen needs to handle the case
>> where strings are not zero terminated.
>>
>> When this is fixed, fortran.dg/pr45636.f90 starts to regress,
>> because the check in gimple_fold_builtin_memory_op here:
>>
>> if (tree_fits_uhwi_p (len)
>> && compare_tree_int (len, MOVE_MAX) <= 0
>> /* ???  Don't transform copies from strings with known length 
>> this
>>confuses the tree-ssa-strlen.c.  This doesn't handle
>>the case in gcc.dg/strlenopt-8.c which is XFAILed for that
>>reason.  */
>> && !c_strlen (src, 2))
>>
>> does now return NULL_TREE, because the fortran string is not null terminated.
>> However that allows the folding which makes the fortran test case fail.
>>
>> I fixed that by pulling in the c_getstr patch again, and use
>> it to make another exception for string constants without any embedded NUL.
>> That is what tree-ssa-forwprop.c can handle, and should make this work in
>> fortran again.
>>
>>
>> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
>> Is it OK for trunk?
> I've gone the rounds on pr45636 a couple times and it's part of the
> reason why there's a FIXME in the bits I committed earlier this week.
> 

Yes, the rationale here is that tree-ssa-forwprop will likely choke if there
is a NUL byte in the string:

   /* Neither builtin_strncpy_read_str nor builtin_memcpy_read_str
  handle embedded '\0's.  */
   if (strlen (src_buf) != src_len)
 break;

Prior to this the c_strlen (src, 2) returned 2, thus this folding was not done,
but since the string does not contain any NUL bytes this returns now NULL.
However it is easy to add an exception that triggers only in a fortran string
in this way.

> I'll look at this closely in conjunction with the (unposted) patch which
> addresses the FIXME.
> 
> Jeff
> 

Bernd.

Re: [PATCHv2] Call braced_list_to_string after array size is fixed

2018-08-31 Thread Bernd Edlinger

On 08/31/18 18:45, Jeff Law wrote:
> On 08/26/2018 01:45 AM, Bernd Edlinger wrote:
> 
 cp:
 2018-08-24  Bernd Edlinger  

* decl.c (eval_check_narrowing): Remove.
(check_initializer): Move call to braced_list_to_string from here ...
* typeck2.c (store_init_value): ... to here.
(digest_init_r): Remove handing of signed/unsigned char strings.

 testsuite:
 2018-08-24  Bernd Edlinger  

* c-c++-common/array-init.c: New test.
* g++.dg/init/string2.C: Remove xfail.
>>> My concern here is that you removed code that was explicitly added
>>> during the initial review work of the patch that turned braced
>>> initializers into strings.
>>>
>>
>> Yes, you mean BRACE_ENCLOSED_INITIALIZER_P
>> which is (TREE_CODE (NODE) == CONSTRUCTOR && TREE_TYPE (NODE) == 
>> init_list_type_node)
> I was referring to the callback into the eval routine from within
> braced_list_to_string which is related to the concern where you dropped
> the c++98_only selector.  Sorry I wasn't clear about that.
> 

Ah, that you mean.

I think it should be fine, since the main purpose of eval_check_narrowing
was to call check_narrowing on the value, which is normally done by reshape_init
but this is by-passed if braced_list_to_string is successful.

It is much smarter to call braced_list_to_string that after digest_init 
completed.


> 
>>
>>>
 -a[] = { 1, 2, 333, 0 }; // { dg-warning 
 "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
 +a[] = { 1, 2, 333, 0 }; // { dg-warning 
 "\\\[\(-Wnarrowing|-Woverflow\)" }
>>> This is not an XFAIL, this is a selector.  Essentially it says that the
>>> diagnostic is appropriate when not in c++98 mode.
>>>
>>> You can see that to be the case if you compile the test in c++98 mode
>>> without your change.  It will compile with no errors or warnings.
>>>
>>> However, after your change it issues a warning for the narrowing
>>> conversion in c++98 mode, which AFAICT it should not do.
>>>
>>
>> This just restores the state _before_ Martin's braced initializer patch.
>> So I have the impression that is actually a regression due to the
>> original braced initializer patch.
>> It is unfortunate that Martin did not check that.
> The ChangeLog said it was removing an xfail, so naturally when I saw it
> was removing a selector I assumed that the selector was right
> (particularly since it made sense given current behavior) and you'd made
> a minor goof.
> 

Yes, indeed, the change log is wrong, it should be

* g++.dg/init/string2.C: Adjust test expectations.


> But I can confirm that prior to Martin's changes we did issue a
> diagnostic when in C++98 mode:
> 
> j.C: In instantiation of ‘int tmplen() [with T = char]’:
> j.C:61:27:   required from here
> j.C:57:5: warning: overflow in conversion from ‘int’ to ‘char’ changes
> value from ‘333’ to ‘'M'’ [-Woverflow]
> 57 | a[] = { 1, 2, 333, 0 }; // { dg-warning
> "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
> | ^
> 
> You can see that with the trunk just before Martin's change or with
> older compilers (I also tested with 6.4.1 as I had it handy).
> 
> So I'll withdraw my objection to this change.  I'll dig into your
> updated patch momentarily.
> 
> jeff
> 


Thanks
Bernd.

Re: [PATCHv2] Call braced_list_to_string after array size is fixed

2018-08-31 Thread Jeff Law

On 08/26/2018 01:45 AM, Bernd Edlinger wrote:

>>> cp:
>>> 2018-08-24  Bernd Edlinger  
>>>
>>> * decl.c (eval_check_narrowing): Remove.
>>> (check_initializer): Move call to braced_list_to_string from here ...
>>> * typeck2.c (store_init_value): ... to here.
>>> (digest_init_r): Remove handing of signed/unsigned char strings.
>>>
>>> testsuite:
>>> 2018-08-24  Bernd Edlinger  
>>>
>>> * c-c++-common/array-init.c: New test.
>>> * g++.dg/init/string2.C: Remove xfail.
>> My concern here is that you removed code that was explicitly added
>> during the initial review work of the patch that turned braced
>> initializers into strings.
>>
> 
> Yes, you mean BRACE_ENCLOSED_INITIALIZER_P
> which is (TREE_CODE (NODE) == CONSTRUCTOR && TREE_TYPE (NODE) == 
> init_list_type_node)
I was referring to the callback into the eval routine from within
braced_list_to_string which is related to the concern where you dropped
the c++98_only selector.  Sorry I wasn't clear about that.

[ snip ]

> 
>>
>>> -a[] = { 1, 2, 333, 0 }; // { dg-warning 
>>> "\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
>>> +a[] = { 1, 2, 333, 0 }; // { dg-warning 
>>> "\\\[\(-Wnarrowing|-Woverflow\)" }
>> This is not an XFAIL, this is a selector.  Essentially it says that the
>> diagnostic is appropriate when not in c++98 mode.
>>
>> You can see that to be the case if you compile the test in c++98 mode
>> without your change.  It will compile with no errors or warnings.
>>
>> However, after your change it issues a warning for the narrowing
>> conversion in c++98 mode, which AFAICT it should not do.
>>
> 
> This just restores the state _before_ Martin's braced initializer patch.
> So I have the impression that is actually a regression due to the
> original braced initializer patch.
> It is unfortunate that Martin did not check that.
The ChangeLog said it was removing an xfail, so naturally when I saw it
was removing a selector I assumed that the selector was right
(particularly since it made sense given current behavior) and you'd made
a minor goof.

But I can confirm that prior to Martin's changes we did issue a
diagnostic when in C++98 mode:

j.C: In instantiation of ‘int tmplen() [with T = char]’:
j.C:61:27:   required from here
j.C:57:5: warning: overflow in conversion from ‘int’ to ‘char’ changes
value from ‘333’ to ‘'M'’ [-Woverflow]
57 | a[] = { 1, 2, 333, 0 }; // { dg-warning
"\\\[\(-Wnarrowing|-Woverflow\)" "" { target { ! c++98_only } } }
   | ^

You can see that with the trunk just before Martin's change or with
older compilers (I also tested with 6.4.1 as I had it handy).

So I'll withdraw my objection to this change.  I'll dig into your
updated patch momentarily.

jeff

Re: [GCC][PATCH v2][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-08-31 Thread Sam Tebbs





On 08/28/2018 11:53 PM, James Greenhalgh wrote:

On Wed, Aug 01, 2018 at 10:07:23AM -0500, Sam Tebbs wrote:

+   ones from the MSB.  */
+bool
+aarch64_is_left_consecutive (HOST_WIDE_INT i)
+{
+  return (i | (i - 1)) == HOST_WIDE_INT_M1;

exact_log2(-i) != HOST_WIDE_INT_M1?


I could change this but I'm not sure what benefit it would have over the 
original implementation. You're welcome to explain so as I could be 
missing something.

Re: [PR c++/87137] GCC-8 Fix

2018-08-31 Thread Joseph Myers

On Fri, 31 Aug 2018, Jason Merrill wrote:

> > Anonymous structures and unions are in C11 (and before that a widely
> > accepted extension).
> 
> This isn't an anonymous union, it's named "x1".  I don't think that
> line declares a field in C, either.

Indeed, the case where the field has a tag is one of the extensions from 
-fms-extensions / -fplan9-extensions, not part of the C11 anonymous struct 
/ union feature which requires a struct or union without a tag as the type 
of the unnamed field.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH] Add -Wabsolute-value

2018-08-31 Thread Joseph Myers

On Fri, 31 Aug 2018, Martin Jambor wrote:

> diff --git a/gcc/common.opt b/gcc/common.opt
> index ebc3ef43ce2..2950760fb2a 100644
> --- a/gcc/common.opt
> +++ b/gcc/common.opt
> @@ -815,6 +815,10 @@ Wvector-operation-performance
>  Common Var(warn_vector_operation_performance) Warning
>  Warn when a vector operation is compiled outside the SIMD.
>  
> +Wabsolute-value
> +Common Var(warn_absolute_value) Warning EnabledBy(Wextra)
> +Warn on suspicious calls of standard functions computing absolute values.

If it's C-specific I'd expect it to be in c.opt (in the appropriate 
alphabetical position) and have "C ObjC" instead of Common.

> +@item -Wabsolute-value
> +@opindex Wabsolute-value
> +@opindex Wno-absolute-value
> +Warn when a wrong absolute value function seems to be used or when it
> +does not have any effect because its argument is an unsigned type.
> +This warning is specific to C language and can be suppressed with an
> +explicit type cast.  This warning is also enabled by @option{-Wextra}.

And then the @item would have @r{(C and Objective-C only)}, similar to 
other such options (rather than saying "specific to C language" in the 
text).

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PR c++/87137] GCC-8 Fix

2018-08-31 Thread Jason Merrill

On Fri, Aug 31, 2018 at 10:35 AM, Michael Matz  wrote:
> Hi,
>
> On Thu, 30 Aug 2018, Nathan Sidwell wrote:
>
>> > struct foo
>> >{
>> >  unsigned a : 21;
>> >  union x1 { char x; };
>> >  unsigned b : 11;
>> >  union x1 u;
>> >};
>>
>> (for C++) this happens to be a case we already get right.   I think
>> that'd be a C vendor extension, I don't think unnamed fields are permitted
>> there?
>
> Anonymous structures and unions are in C11 (and before that a widely
> accepted extension).

This isn't an anonymous union, it's named "x1".  I don't think that
line declares a field in C, either.

Jason

Re: [GCC][PATCH v2][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-08-31 Thread Alexander Monakov

On Fri, 31 Aug 2018, Kyrill Tkachov wrote:
> > That sounds good! Although I'm confused about what the 'p' means.
> 
> It stands for "predicate" meaning a boolean function with no side-effects.
> It's the preferred way to name these kinds of functions in GCC (though I can't
> seem to find the documentation mandate for it).
> IIRC it's a remnant from the LISP days. You'll find a lot of *_p functions in
> GCC.

It's mentioned in the jargon file as "The -P convention" (see e.g. at
https://www.dourish.com/goodies/jargon.html ).

Alexander

Re: [GCC][PATCH v2][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-08-31 Thread Kyrill Tkachov

On 31/08/18 16:27, Sam Tebbs wrote:

On 08/31/2018 11:59 AM, Kyrill Tkachov wrote:

>
> On 30/08/18 16:53, Sam Tebbs wrote:
>>
>>
>> On 08/28/2018 11:53 PM, James Greenhalgh wrote:
>> > Hm, I'm not very sure about the naming here; "left consecutive"
>> isn't a
>> > common phrase to denote the mask you're looking for (exact_log2
>> (-i) != -1
>> > if I'm reading right), and is misleading 0x is 'left
>> consecutive'
>> > too, just with zeroes rather than ones.
>> >
>> I think you're right about it not being the best naming. Do you have any
>> suggestions for a better name?
>
> Naming things is hard... :(
>
> How about aarch64_hi_bits_all_ones_p ?
>
> Kyrill

That sounds good! Although I'm confused about what the 'p' means.

It stands for "predicate" meaning a boolean function with no side-effects.
It's the preferred way to name these kinds of functions in GCC (though I can't 
seem to find the documentation mandate for it).
IIRC it's a remnant from the LISP days. You'll find a lot of *_p functions in 
GCC.

Kyrill

Re: [GCC][PATCH v2][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-08-31 Thread Sam Tebbs


On 08/31/2018 11:59 AM, Kyrill Tkachov wrote:



On 30/08/18 16:53, Sam Tebbs wrote:



On 08/28/2018 11:53 PM, James Greenhalgh wrote:
> Hm, I'm not very sure about the naming here; "left consecutive" 
isn't a
> common phrase to denote the mask you're looking for (exact_log2 
(-i) != -1
> if I'm reading right), and is misleading 0x is 'left 
consecutive'

> too, just with zeroes rather than ones.
>
I think you're right about it not being the best naming. Do you have any
suggestions for a better name?


Naming things is hard... :(

How about aarch64_hi_bits_all_ones_p ?

Kyrill


That sounds good! Although I'm confused about what the 'p' means.

Re: [PATCH][AArch64] Implement new intrinsics vabsd_s64 and vnegd_s64

2018-08-31 Thread Vlad Lazar


On 28/08/18 22:58, James Greenhalgh wrote:

On Tue, Aug 28, 2018 at 03:59:25AM -0500, Vlad Lazar wrote:

Gentle ping.

On 08/08/18 17:38, Vlad Lazar wrote:

On 01/08/18 18:35, James Greenhalgh wrote:

On Wed, Aug 01, 2018 at 07:13:53AM -0500, Vlad Lazar wrote:

On 31/07/18 22:48, James Greenhalgh wrote:

On Fri, Jul 20, 2018 at 04:37:34AM -0500, Vlad Lazar wrote:

Hi,

The patch adds implementations for the NEON intrinsics vabsd_s64 and vnegd_s64.
(https://developer.arm.com/products/architecture/cpu-architecture/a-profile/docs/ihi0073/latest/arm-neon-intrinsics-reference-architecture-specification)

Bootstrapped and regtested on aarch64-none-linux-gnu and there are no 
regressions.

OK for trunk?

+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vnegd_s64 (int64_t __a)
+{
+  return -__a;
+}


Does this give the correct behaviour for the minimum value of int64_t? That
would be undefined behaviour in C, but well-defined under ACLE.

Thanks,
James



Hi. Thanks for the review.

For the minimum value of int64_t it behaves as the ACLE specifies:
"The negative of the minimum (signed) value is itself."


What should happen in this testcase? The spoiler is below, but try to work out
what should happen and what goes wrong with your implementation.

int foo (int64_t x)
{
  if (x < (int64_t) 0)
return vnegd_s64(x) < (int64_t) 0;
  else
return 0;
}
int bar (void)
{
  return foo (INT64_MIN);
}
Thanks,
James


-






INT64_MIN < 0 should be true, so we should return vnegd_s64(INT64_MIN) < 0.
vnegd_s64(INT64_MIN) is identity, so the return value should be
INT64_MIN < 0; i.e. True.

This isn't what the compiler thinks... The compiler makes use of the fact
that -INT64_MIN is undefined behaviour in C, and doesn't need to be considered
as a special case. The if statement gives you a range reduction to [-INF, -1],
negating that gives you a range [1, INF], and [1, INF] is never less than 0,
so the compiler folds the function to return false. We have a mismatch in
semantics


I see your point now. I have updated the vnegd_s64 intrinsic to convert to
unsigned before negating. This means that if the predicted range of x is
[INT64_MIN, y], then the predicted range of vnegd_s64 (x) will be
~[INT64_MIN + 1, y] which seems to resolve the issue. I've also added testcases
which reflect the issue you've pointed out. Note that I've change the vabsd_s64
intrinsic in order to avoid moves between integer and vector registers.


I think from my reading of the standard that this is OK, but I may be rusty
and missing a corner case.

OK for trunk.

Thanks,
James


Committed with an obvious change to testsuite/gcc.target/aarch64/vneg_s.c 
testcase:
merged two scan assembler directives which were searching for the same pattern.
See the patch below.

Thanks,
Vlad
Index: ChangeLog
===
--- ChangeLog	(revision 264018)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2018-08-31  Vlad Lazar  
+
+	* config/aarch64/arm_neon.h (vabsd_s64): New.
+	(vnegd_s64): Likewise.
+
 2018-08-31  Martin Jambor  
 
 	* ipa-cp.c (estimate_local_effects): Replace wrong MAX with MIN.
Index: config/aarch64/arm_neon.h
===
--- config/aarch64/arm_neon.h	(revision 264018)
+++ config/aarch64/arm_neon.h	(working copy)
@@ -11822,6 +11822,18 @@
   return __builtin_aarch64_absv2di (__a);
 }
 
+/* Try to avoid moving between integer and vector registers.
+   For why the cast to unsigned is needed check the vnegd_s64 intrinsic.
+   There is a testcase related to this issue:
+   gcc.target/aarch64/vabsd_s64.c.  */
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vabsd_s64 (int64_t __a)
+{
+  return __a < 0 ? - (uint64_t) __a : __a;
+}
+
 /* vadd */
 
 __extension__ extern __inline int64_t
@@ -22907,6 +22919,25 @@
   return -__a;
 }
 
+/* According to the ACLE, the negative of the minimum (signed)
+   value is itself.  This leads to a semantics mismatch, as this is
+   undefined behaviour in C.  The value range predictor is not
+   aware that the negation of a negative number can still be negative
+   and it may try to fold the expression.  See the test in
+   gcc.target/aarch64/vnegd_s64.c for an example.
+
+   The cast below tricks the value range predictor to include
+   INT64_MIN in the range it computes.  So for x in the range
+   [INT64_MIN, y] the range prediction after vnegd_s64 (x) will
+   be ~[INT64_MIN + 1, y].  */
+
+__extension__ extern __inline int64_t
+__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
+vnegd_s64 (int64_t __a)
+{
+  return - (uint64_t) __a;
+}
+
 __extension__ extern __inline float32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vnegq_f32 (float32x4_t __a)
Index: testsuite/ChangeLog

Re: [PATCH] Adjust c_getstr/c_strlen to new STRING_CST semantic

2018-08-31 Thread Jeff Law

On 08/31/2018 01:08 AM, Bernd Edlinger wrote:
> Hi,
> 
> 
> when re-testing the new STRING_CST semantic patch here:
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01569.html
> 
> I noticed that the (my) fix for PR 87053 does still use
> semantic properties of the STRING_CST that is not compatible
> with the new proposed STRING_CST semantics.
> 
> That means that c_strlen needs to handle the case
> where strings are not zero terminated.
> 
> When this is fixed, fortran.dg/pr45636.f90 starts to regress,
> because the check in gimple_fold_builtin_memory_op here:
> 
>if (tree_fits_uhwi_p (len)
>&& compare_tree_int (len, MOVE_MAX) <= 0
>/* ???  Don't transform copies from strings with known length this
>   confuses the tree-ssa-strlen.c.  This doesn't handle
>   the case in gcc.dg/strlenopt-8.c which is XFAILed for that
>   reason.  */
>&& !c_strlen (src, 2))
>   
> does now return NULL_TREE, because the fortran string is not null terminated.
> However that allows the folding which makes the fortran test case fail.
> 
> I fixed that by pulling in the c_getstr patch again, and use
> it to make another exception for string constants without any embedded NUL.
> That is what tree-ssa-forwprop.c can handle, and should make this work in
> fortran again.
> 
> 
> Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
> Is it OK for trunk?
I've gone the rounds on pr45636 a couple times and it's part of the
reason why there's a FIXME in the bits I committed earlier this week.

I'll look at this closely in conjunction with the (unposted) patch which
addresses the FIXME.

Jeff

RE: Add new warning flag "warn_prio_ctor_dtor"

2018-08-31 Thread Joseph Myers

On Fri, 31 Aug 2018, Vinay Kumar wrote:

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 1fbcbd5..8dc9fb4 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2018-08-31  Vinay Kumar  
> +
> + * doc/invoke.texi (-Wreturn-type): Document new warning
> + -Wprio-ctor-dtor.

The documentation is of a new option, not of -Wreturn-type, so the 
ChangeLog entry should name the option being documented or the section to 
which the documentation is added.

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 637f5ad..49678d7 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -294,7 +294,7 @@ Objective-C and Objective-C++ Dialects}.
>  -Wdisabled-optimization @gol
>  -Wno-discarded-qualifiers  -Wno-discarded-array-qualifiers @gol
>  -Wno-div-by-zero  -Wdouble-promotion @gol
> --Wduplicated-branches  -Wduplicated-cond @gol
> +-Wprio-ctor-dtor  -Wduplicated-branches  -Wduplicated-cond @gol

Please see where invoke.texi says:

Many options have long names starting with @samp{-f} or with
@samp{-W}---for example,
@option{-fmove-loop-invariants}, @option{-Wformat} and so on.  Most of
these have both positive and negative forms; the negative form of
@option{-ffoo} is @option{-fno-foo}.  This manual documents
only one of these two forms, whichever one is not the default.

Thus, you should list the option as -Wno-prio-ctor-dtor when documenting 
it.

This list appears to be in alphabetical order (ignoring the "no-"), so the 
natural place for this option in this list would be after -Wno-pragmas.

> +@item -Wprio-ctor-dtor

And likewise here.  See other such options, e.g. -Wno-attributes, for 
examples.

> +@opindex Wno-prio-ctor-dtor
> +@opindex Wprio-ctor-dtor
> +Warn if a priority from 0 to 100 is used for constructor or destructor.

And then you need to say "Do not warn", to be consistent with documenting 
the negative form of the option.

> +The use of constructor and destructor attributes allow you to assign a
> +priority to the constructor/destructor to control its order of execution
> +before @code{main}, no () is called or after it returns.  The priority

You're not meant to have the literal text ", no ()" in the manual.  The 
point of my ", no ()" in the review comments is to draw attention to the 
point in the GNU Coding Standards that you don't put "()" after the name 
of a function to indicate that it's a function, because putting "()" there 
means a function call with no arguments.

> + * c-c++-common/Wprio-ctor-dtor.c: New test.
> + * g++.dg/warn/Wprio-ctor-dtor.C: New test.

Given that you have a c-c++-common test, I don't see why a g++.dg one is 
needed as well; it appears to be testing the same thing.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PR c++/87137] GCC-8 Fix

2018-08-31 Thread Michael Matz

Hi,

On Thu, 30 Aug 2018, Nathan Sidwell wrote:

> > struct foo
> >    {
> >      unsigned a : 21;
> >      union x1 { char x; };
> >      unsigned b : 11;
> >      union x1 u;
> >    };
> 
> (for C++) this happens to be a case we already get right.   I think
> that'd be a C vendor extension, I don't think unnamed fields are permitted
> there?

Anonymous structures and unions are in C11 (and before that a widely 
accepted extension).

Ciao,
Michael.

Re: Fix PR87074: miscompilation with unroll-and-jam

2018-08-31 Thread Jakub Jelinek

On Fri, Aug 31, 2018 at 04:31:09PM +0200, Richard Biener wrote:
> On August 31, 2018 4:11:26 PM GMT+02:00, Michael Matz  wrote:
> >Hi,
> >
> >as Jakub correctly identified, uses of values produced by the inner
> >loop 
> >that occur inside the outer loop prevent fusion as well.  We are 
> >in LCSSA so the check is easy.  (Uses outside the outer loop are okay,
> >see 
> >the comment)
> >
> >Regstrapping on x86-64-linux in progress.  Okay for trunk?
> 
> OK. 

And please backport it for 8.x too ;)

Jakub

Re: Fix PR87074: miscompilation with unroll-and-jam

2018-08-31 Thread Richard Biener

On August 31, 2018 4:11:26 PM GMT+02:00, Michael Matz  wrote:
>Hi,
>
>as Jakub correctly identified, uses of values produced by the inner
>loop 
>that occur inside the outer loop prevent fusion as well.  We are 
>in LCSSA so the check is easy.  (Uses outside the outer loop are okay,
>see 
>the comment)
>
>Regstrapping on x86-64-linux in progress.  Okay for trunk?

OK. 

Richard. 

>
>Ciao,
>Michael.
>
>   * gimple-loop-jam.c (unroll_jam_possible_p): Check loop exit
>   PHIs for outer-loop uses.
>
>testsuite/
>   * gcc.dg/pr87074.c: New test.
>
>diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
>index 2ecd1bb5a7c..c6bd0428684 100644
>--- a/gcc/gimple-loop-jam.c
>+++ b/gcc/gimple-loop-jam.c
>@@ -161,7 +161,7 @@ bb_prevents_fusion_p (basic_block bb)
>   gimple_stmt_iterator gsi;
>   /* BB is duplicated by outer unrolling and then all N-1 first copies
>move into the body of the fused inner loop.  If BB exits the outer loop
>- the last copy still doess so, and the first N-1 copies are
>cancelled
>+ the last copy still does so, and the first N-1 copies are
>cancelled
>  by loop unrolling, so also after fusion it's the exit block.
>  But there might be other reasons that prevent fusion:
>* stores or unknown side-effects prevent fusion
>@@ -227,6 +227,33 @@ unroll_jam_possible_p (struct loop *outer, struct
>loop *loop)
>   || !expr_invariant_in_loop_p (outer, niter.niter))
> return false;
> 
>+  /* If the inner loop produces any values that are used inside the
>+ outer loop (except the virtual op) then it can flow
>+ back (perhaps indirectly) into the inner loop.  This prevents
>+ fusion: without fusion the value at the last iteration is used,
>+ with fusion the value after the initial iteration is used.
>+
>+ If all uses are outside the outer loop this doesn't prevent
>fusion;
>+ the value of the last iteration is still used (and the values
>from
>+ all intermediate iterations are dead).  */
>+  gphi_iterator psi;
>+  for (psi = gsi_start_phis (single_exit (loop)->dest);
>+   !gsi_end_p (psi); gsi_next ())
>+{
>+  imm_use_iterator imm_iter;
>+  use_operand_p use_p;
>+  tree op = gimple_phi_result (psi.phi ());
>+  if (virtual_operand_p (op))
>+  continue;
>+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op)
>+  {
>+gimple *use_stmt = USE_STMT (use_p);
>+if (!is_gimple_debug (use_stmt)
>+&& flow_bb_inside_loop_p (outer, gimple_bb (use_stmt)))
>+  return false;
>+  }
>+}
>+
>   /* And check blocks belonging to just outer loop.  */
>   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
>n = get_loop_body_with_size (outer, bbs, n_basic_blocks_for_fn (cfun));
>@@ -245,7 +272,6 @@ unroll_jam_possible_p (struct loop *outer, struct
>loop *loop)
> body would be the after-iter value of the first body) if it's over
>  an associative and commutative operation.  We wouldn't
>  be able to handle unknown cycles.  */
>-  gphi_iterator psi;
>for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next
>())
> {
>   affine_iv iv;
>diff --git a/gcc/testsuite/gcc.dg/pr87074.c
>b/gcc/testsuite/gcc.dg/pr87074.c
>new file mode 100644
>index 000..d838fcd8fc5
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/pr87074.c
>@@ -0,0 +1,25 @@
>+/* { dg-do run } */
>+/* { dg-options "-O3 -floop-unroll-and-jam --param
>unroll-jam-min-percent=0" } */
>+long b;
>+unsigned c[5];
>+unsigned long long d = 3;
>+int e, f, g;
>+
>+void h() {
>+  for (; f < 11; f++) {
>+b = g;
>+for (e = 0; e < 5; e++) {
>+  c[e] = e - b - (c[e] >> 5);
>+  g = c[e];
>+}
>+  }
>+  if (c[0])
>+d = 0;
>+}
>+
>+extern void abort(void);
>+int main() {
>+  h();
>+  if (d != 0)
>+abort ();
>+}

Re: [PATCH, OpenACC] Support C++ "this" in OpenACC directives (PR66053)

2018-08-31 Thread Jakub Jelinek

On Fri, Aug 31, 2018 at 10:04:07AM -0400, Nathan Sidwell wrote:
> On 08/30/2018 04:27 PM, Jason Merrill wrote:
> 
> > On Thu, Aug 30, 2018 at 3:31 PM, Julian Brown  
> > wrote:
> > 
> > > "Apart from parsing, it's necessary to prevent the "cannot take the
> > > address of 'this', which is an rvalue expression" error from appearing
> 
> Breaking a rather fundamental language attribute does not seem wise.
> 
> > Why does referring to this[0:1] require making 'this' addressable?
> > Surely what we're interested in is the value of 'this', not the
> > address.
> Yes, transferring the this pointer is very unlikely to be what the user
> wants -- the object being referred to contains the data.  It might be wise
> to look at the DR's and changes relating to lambdas and this capture.  Those
> changes now make it much harder to simply capture the pointer
> unintentionally.

Yeah, I agree we shouldn't try to make this addressable.
Does OpenACC try to map the base of the array section (rather than what e.g.
OpenMP does, privatize the pointer base instead and assign the pointer the
new value inside of the region)?
Even if it is mapped, can't it be mapped by taking an address of a temporary
initialized from this?

Jakub

Re: [ARM/FDPIC v2 09/21] [ARM] FDPIC: Add support for taking address of nested function

2018-08-31 Thread Christophe Lyon

On Wed, 29 Aug 2018 at 13:01, Kyrill Tkachov
 wrote:
>
> Hi Christophe,
>
> On 13/07/18 17:11, christophe.l...@st.com wrote:
> > From: Christophe Lyon 
> >
> > In FDPIC mode, the trampoline generated to support pointers to nested
> > functions looks like:
> >
> >.word trampoline address
> >.word trampoline GOT address
> >ldrr12, [pc, #8]
> >ldrr9, [pc, #8]
> >ldr   pc, [pc]
>
> The comment in the code says the last one is:
> ldr   pc,  [pc, #8] ; #4 for Thumb2
>
> I'm assuming the code one is correct.

Right, it looks like a typo in the commit message.

>
> >.word static chain value
> >.word GOT address
> >.word function's address
> >
> > because in FDPIC function pointers are actually pointers to function
> > descriptors, we have to actually generate a function descriptor for
> > the trampoline.
> >
> > 2018-XX-XX  Christophe Lyon  
> > Mickaël Guêné 
> >
> > gcc/
> > * config/arm/arm.c (arm_asm_trampoline_template): Add FDPIC
> > support.
> > (arm_trampoline_init): Likewise.
> > (arm_trampoline_init): Likewise.
> > * config/arm/arm.h (TRAMPOLINE_SIZE): Likewise.
> >
> > Change-Id: I4b5127261a9aefa0f0318f110574ec07a856aeb1
> >
> > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > index 51da2bc..ffc9128 100644
> > --- a/gcc/config/arm/arm.c
> > +++ b/gcc/config/arm/arm.c
> > @@ -3950,13 +3950,50 @@ arm_warn_func_return (tree decl)
> > .word static chain value
> > .word function's address
> > XXX FIXME: When the trampoline returns, r8 will be clobbered.  */
> > +/* In FDPIC mode, the trampoline looks like:
> > +  .word trampoline address
> > +  .word trampoline GOT address
> > +  ldrr12, [pc, #8] ; #4 for Thumb2
> > +  ldrr9,  [pc, #8] ; #4 for Thumb2
> > +  ldr   pc,  [pc, #8] ; #4 for Thumb2
> > +  .word static chain value
> > +  .word GOT address
> > +  .word function's address
> > +*/
> >  static void
> >  arm_asm_trampoline_template (FILE *f)
> >  {
> >fprintf (f, "\t.syntax unified\n");
> >
> > -  if (TARGET_ARM)
> > +  if (TARGET_FDPIC)
> > +{
> > +  /* The first two words are a function descriptor pointing to the
> > +trampoline code just below.  */
> > +  if (TARGET_ARM)
> > +   fprintf (f, "\t.arm\n");
> > +  else if (TARGET_THUMB2)
> > +   fprintf (f, "\t.thumb\n");
> > +  else
> > +   /* Only ARM and Thumb-2 are supported.  */
> > +   gcc_assert ( !TARGET_ARM && !TARGET_THUMB2);
> > +
>
> This cannot trigger based on the two clauses above. I think you want to just 
> make it gcc_unreachable ().
>
OK

> > +  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
> > +  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
> > +  /* Trampoline code which sets the static chain register but also
> > +PIC register before jumping into real code.  */
> > +  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
> > +  STATIC_CHAIN_REGNUM, PC_REGNUM,
> > +  TARGET_THUMB2 ? 8 : 4);
> > +  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
> > +  PIC_OFFSET_TABLE_REGNUM, PC_REGNUM,
> > +  TARGET_THUMB2 ? 8 : 4);
> > +  asm_fprintf (f, "\tldr\t%r, [%r, #%d]\n",
> > +  PC_REGNUM, PC_REGNUM,
> > +  TARGET_THUMB2 ? 8 : 4);
> > +  assemble_aligned_integer (UNITS_PER_WORD, const0_rtx);
> > +}
> > +  else if (TARGET_ARM)
> >  {
> >fprintf (f, "\t.arm\n");
> >asm_fprintf (f, "\tldr\t%r, [%r, #0]\n", STATIC_CHAIN_REGNUM, 
> > PC_REGNUM);
> > @@ -3997,12 +4034,37 @@ arm_trampoline_init (rtx m_tramp, tree fndecl, rtx 
> > chain_value)
> >emit_block_move (m_tramp, assemble_trampoline_template (),
> > GEN_INT (TRAMPOLINE_SIZE), BLOCK_OP_NORMAL);
> >
> > -  mem = adjust_address (m_tramp, SImode, TARGET_32BIT ? 8 : 12);
> > -  emit_move_insn (mem, chain_value);
> > +  if (TARGET_FDPIC)
> > +{
> > +  rtx funcdesc = XEXP (DECL_RTL (fndecl), 0);
> > +  rtx fnaddr = gen_rtx_MEM (Pmode, funcdesc);
> > +  rtx gotaddr = gen_rtx_MEM (Pmode, plus_constant (Pmode, funcdesc, 
> > 4));
> > +  rtx trampoline_code_start
> > +   = plus_constant (Pmode, XEXP (m_tramp, 0), TARGET_THUMB2 ? 9 : 8);
>
> 9? Can you comment on this value?
>
The function start address is a offset 8, but in Thumb mode we want
bit 0 set to 1 to indicate thumb-ness.

> > +
> > +  /* Write initial funcdesc which points to the trampoline.  */
> > +  mem = adjust_address (m_tramp, SImode, 0);
> > +  emit_move_insn (mem, trampoline_code_start);
> > +  mem = adjust_address (m_tramp, SImode, 4);
> > +  emit_move_insn (mem, gen_rtx_REG (Pmode, PIC_OFFSET_TABLE_REGNUM));
> > +  /* Setup static

[patch,nvptx] Basic -misa support for nvptx

2018-08-31 Thread Cesar Philippidis

Attached is an nvptx patch that adds support for a new, albeit rarely
used, compiler option -misa. At present, there are only two valid ISA
arguments, SM_30 and SM_35. Without that flag, GCC will default to
SM_30. The major advantage of using the SM_35 ISA is to enable the use
PTX atom instructions for __atomic_fetch_{add,and,or,xor} for DI
integers. Without -misa, GCC would use an atomic CAS loop for them. As
an aside, this patch also enables PTX atom instructions for those
aforementioned functions for SI integers.

Is this patch OK for trunk?

Thanks,
Cesar
Basic -misa support for nvptx

2018-XX-YY  Cesar Philippidis  
	Bernd Schmidt  

	gcc/
	* config/nvptx/nvptx-opts.h: New file.
	* config/nvptx/nvptx.c (nvptx_file_start): Print the correct .target.
	* config/nvptx/nvptx.h: Include "nvptx-opts.h".
	(ASM_SPEC): Define.
	(TARGET_SM35): New macro.
	* config/nvptx/nvptx.md (atomic_fetch_): Enable with the
	correct predicate.
	* config/nvptx/nvptx.opt (ptx_isa, sm_30, sm_35): New enum and its
	values.
	(misa=): New option.
	* doc/invoke.texi (Nvidia PTX Options): Document -misa.

	gcc/testsuite/
	* gcc.target/nvptx/atomic_fetch-1.c: New test.
	* gcc.target/nvptx/atomic_fetch-1.c: New test.


diff --git a/gcc/config/nvptx/nvptx-opts.h b/gcc/config/nvptx/nvptx-opts.h
new file mode 100644
index 000..55d9599917e
--- /dev/null
+++ b/gcc/config/nvptx/nvptx-opts.h
@@ -0,0 +1,30 @@
+/* Definitions for the NVPTX port needed for option handling.
+   Copyright (C) 2015-2018 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   .  */
+
+#ifndef NVPTX_OPTS_H
+#define NVPTX_OPTS_H
+
+enum ptx_isa
+{
+  PTX_ISA_SM30,
+  PTX_ISA_SM35
+};
+
+#endif
+
diff --git a/gcc/config/nvptx/nvptx.c b/gcc/config/nvptx/nvptx.c
index c0b0a2ec3ab..9903a273863 100644
--- a/gcc/config/nvptx/nvptx.c
+++ b/gcc/config/nvptx/nvptx.c
@@ -4931,7 +4931,10 @@ nvptx_file_start (void)
 {
   fputs ("// BEGIN PREAMBLE\n", asm_out_file);
   fputs ("\t.version\t3.1\n", asm_out_file);
-  fputs ("\t.target\tsm_30\n", asm_out_file);
+  if (TARGET_SM35)
+fputs ("\t.target\tsm_35\n", asm_out_file);
+  else
+fputs ("\t.target\tsm_30\n", asm_out_file);
   fprintf (asm_out_file, "\t.address_size %d\n", GET_MODE_BITSIZE (Pmode));
   fputs ("// END PREAMBLE\n", asm_out_file);
 }
diff --git a/gcc/config/nvptx/nvptx.h b/gcc/config/nvptx/nvptx.h
index dfa1e9aa859..a2fe8b68b22 100644
--- a/gcc/config/nvptx/nvptx.h
+++ b/gcc/config/nvptx/nvptx.h
@@ -21,10 +21,16 @@
 #ifndef GCC_NVPTX_H
 #define GCC_NVPTX_H
 
+#ifndef NVPTX_OPTS_H
+#include "config/nvptx/nvptx-opts.h"
+#endif
+
 /* Run-time Target.  */
 
 #define STARTFILE_SPEC "%{mmainkernel:crt0.o}"
 
+#define ASM_SPEC "%{misa=*:-m %*}"
+
 #define TARGET_CPU_CPP_BUILTINS()		\
   do		\
 {		\
@@ -87,6 +93,8 @@
 #define Pmode (TARGET_ABI64 ? DImode : SImode)
 #define STACK_SIZE_MODE Pmode
 
+#define TARGET_SM35 (ptx_isa_option >= PTX_ISA_SM35)
+
 /* Registers.  Since ptx is a virtual target, we just define a few
hard registers for special purposes and leave pseudos unallocated.
We have to have some available hard registers, to keep gcc setup
diff --git a/gcc/config/nvptx/nvptx.md b/gcc/config/nvptx/nvptx.md
index 2988f5dfa91..ca00b1d8073 100644
--- a/gcc/config/nvptx/nvptx.md
+++ b/gcc/config/nvptx/nvptx.md
@@ -1440,7 +1440,6 @@
 (define_code_iterator any_logic [and ior xor])
 (define_code_attr logic [(and "and") (ior "or") (xor "xor")])
 
-;; Currently disabled until we add better subtarget support - requires sm_32.
 (define_insn "atomic_fetch_"
   [(set (match_operand:SDIM 1 "memory_operand" "+m")
 	(unspec_volatile:SDIM
@@ -1450,7 +1449,7 @@
 	  UNSPECV_LOCK))
(set (match_operand:SDIM 0 "nvptx_register_operand" "=R")
 	(match_dup 1))]
-  "0"
+  "mode == SImode || TARGET_SM35"
   "%.\\tatom%A1.b%T0.\\t%0, %1, %2;"
   [(set_attr "atomic" "true")])
 
diff --git a/gcc/config/nvptx/nvptx.opt b/gcc/config/nvptx/nvptx.opt
index 04277d1d98e..8194c0324d6 100644
--- a/gcc/config/nvptx/nvptx.opt
+++ b/gcc/config/nvptx/nvptx.opt
@@ -48,3 +48,17 @@ Generate code that can keep local state uniform across all lanes.
 mgomp
 Target Report Mask(GOMP)
 Generate code for OpenMP offloading: enables -msoft-stack and -muniform-simt.
+
+Enum
+Name(ptx_isa) Type(int)
+Known PTX ISA versions (for use with the -misa= option):
+
+EnumValue
+Enum(ptx_isa)

Re: [ARM/FDPIC v2 02/21] [ARM] FDPIC: Handle arm--uclinuxfdpiceabi in configure scripts

2018-08-31 Thread Christophe Lyon

On Wed, 29 Aug 2018 at 12:46, Kyrill Tkachov
 wrote:
>
>
> On 13/07/18 17:10, christophe.l...@st.com wrote:
> > From: Christophe Lyon 
> >
> > The new arm-uclinuxfdpiceabi target behaves pretty much like
> > arm-linux-gnueabi. In order the enable the same set of features, we
> > have to update several configure scripts that generally match targets
> > like *-*-linux*: in most places, we add *-uclinux* where there is
> > already *-linux*, or uclinux* when there is already linux*.
> >
> > In gcc/config.gcc and libgcc/config.host we use *-*-uclinuxfdpiceabi
> > because there is already a different behaviour for *-*uclinux* target.
> >
> > In libtool.m4, we use uclinuxfdpiceabi in cases where ELF shared
> > libraries support is required, as uclinux does not guarantee that.
> >
> > 2018-XX-XX  Christophe Lyon  
> >
> > * config/futex.m4: Handle *-uclinux*.
> > * config/tls.m4 (GCC_CHECK_TLS): Likewise.
> > * gcc/config.gcc: Handle *-*-uclinuxfdpiceabi.
> > * libatomic/configure.tgt: Handle arm*-*-uclinux*.
> > * libgcc/config.host: Handle *-*-uclinuxfdpiceabi.
> > * libitm/configure.tgt: Handle *-*-uclinux*.
> > * libatomic/configure: Regenerate.
> > * libitm/configure: Regenerate.
> > * libstdc++-v3/acinclude.m4: Handle uclinux*.
> > * libstdc++-v3/configure: Regenerate.
> > * libstdc++-v3/configure.host: Handle uclinux*
> > * libtool.m4: Handle uclinux*.
> >
>
> Most of these sub-directories have their own ChangeLogs (just for the record).
> What happens if a user tries to configure armeb-*-linuxfdpiceabi. Is this an 
> unsupported configuration?
> Will this error out? I think some regexes here will allow such a target.

Yes. I wondered about that. I must admit I haven't tested it, but
since I don't think anything in the patch series is
endianness-dependent, I thought I should include armeb.
Do you prefer I remove it?
>
> Is the target triplet set in stone now?
I think it is part of binutils-2.31, but we can probably still change
it, since the whole toolchain isn't available yet, no product should
rely on the target name.

It will be just painful to adjust the testsuite again

> I think Richard had some thoughts on the naming...
>
> Thanks,
> Kyrill
>
> > Change-Id: I6a1fdcd9847d8a82179a214612a3474c1f492916
> >
> > diff --git a/config/futex.m4 b/config/futex.m4
> > index e95144d..4dffe15 100644
> > --- a/config/futex.m4
> > +++ b/config/futex.m4
> > @@ -9,7 +9,7 @@ AC_DEFUN([GCC_LINUX_FUTEX],[dnl
> >  GCC_ENABLE(linux-futex,default, ,[use the Linux futex system call],
> > permit yes|no|default)
> >  case "$target" in
> > -  *-linux*)
> > +  *-linux* | *-uclinux*)
> >  case "$enable_linux_futex" in
> >default)
> >  # If headers don't have gettid/futex syscalls definition, then
> > diff --git a/config/tls.m4 b/config/tls.m4
> > index 4e170c8..5a8676e 100644
> > --- a/config/tls.m4
> > +++ b/config/tls.m4
> > @@ -76,7 +76,7 @@ AC_DEFUN([GCC_CHECK_TLS], [
> >dnl Shared library options may depend on the host; this check
> >dnl is only known to be needed for GNU/Linux.
> >case $host in
> > -   *-*-linux*)
> > +   *-*-linux* | -*-uclinux*)
> >LDFLAGS="-shared -Wl,--no-undefined $LDFLAGS"
> >;;
> >esac
> > diff --git a/gcc/config.gcc b/gcc/config.gcc
> > index ef67c88..808ff82 100644
> > --- a/gcc/config.gcc
> > +++ b/gcc/config.gcc
> > @@ -759,7 +759,7 @@ case ${target} in
> >  *-*-fuchsia*)
> >native_system_header_dir=/include
> >;;
> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-gnu* | 
> > *-*-kopensolaris*-gnu)
> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-gnu* | 
> > *-*-kopensolaris*-gnu | *-*-uclinuxfdpiceabi)
> >extra_options="$extra_options gnu-user.opt"
> >gas=yes
> >gnu_ld=yes
> > @@ -768,7 +768,7 @@ case ${target} in
> >esac
> >tmake_file="t-slibgcc"
> >case $target in
> > -*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu)
> > +*-*-linux* | frv-*-*linux* | *-*-kfreebsd*-gnu | *-*-kopensolaris*-gnu 
> >  | *-*-uclinuxfdpiceabi)
> >:;;
> >  *-*-gnu*)
> >native_system_header_dir=/include
> > @@ -788,7 +788,7 @@ case ${target} in
> >  *-*-*android*)
> >tm_defines="$tm_defines DEFAULT_LIBC=LIBC_BIONIC"
> >;;
> > -*-*-*uclibc*)
> > +*-*-*uclibc* | *-*-uclinuxfdpiceabi)
> >tm_defines="$tm_defines DEFAULT_LIBC=LIBC_UCLIBC"
> >;;
> >  *-*-*musl*)
> > @@ -1135,7 +1135,7 @@ arm*-*-netbsdelf*)
> >  tmake_file="${tmake_file} arm/t-arm"
> >  target_cpu_cname="arm6"
> >  ;;
> > -arm*-*-linux-*)# ARM GNU/Linux with ELF
> > +arm*-*-linux-* | arm*-*-uclinuxfdpiceabi)  # ARM 
> > GNU/Linux with ELF
> >  tm_file="dbxelf.h elfos.h gnu-user.h linux.h linux-android.h 
> >

Fix PR87074: miscompilation with unroll-and-jam

2018-08-31 Thread Michael Matz

Hi,

as Jakub correctly identified, uses of values produced by the inner loop 
that occur inside the outer loop prevent fusion as well.  We are 
in LCSSA so the check is easy.  (Uses outside the outer loop are okay, see 
the comment)

Regstrapping on x86-64-linux in progress.  Okay for trunk?


Ciao,
Michael.

* gimple-loop-jam.c (unroll_jam_possible_p): Check loop exit
PHIs for outer-loop uses.

testsuite/
* gcc.dg/pr87074.c: New test.

diff --git a/gcc/gimple-loop-jam.c b/gcc/gimple-loop-jam.c
index 2ecd1bb5a7c..c6bd0428684 100644
--- a/gcc/gimple-loop-jam.c
+++ b/gcc/gimple-loop-jam.c
@@ -161,7 +161,7 @@ bb_prevents_fusion_p (basic_block bb)
   gimple_stmt_iterator gsi;
   /* BB is duplicated by outer unrolling and then all N-1 first copies
  move into the body of the fused inner loop.  If BB exits the outer loop
- the last copy still doess so, and the first N-1 copies are cancelled
+ the last copy still does so, and the first N-1 copies are cancelled
  by loop unrolling, so also after fusion it's the exit block.
  But there might be other reasons that prevent fusion:
* stores or unknown side-effects prevent fusion
@@ -227,6 +227,33 @@ unroll_jam_possible_p (struct loop *outer, struct loop 
*loop)
   || !expr_invariant_in_loop_p (outer, niter.niter))
 return false;
 
+  /* If the inner loop produces any values that are used inside the
+ outer loop (except the virtual op) then it can flow
+ back (perhaps indirectly) into the inner loop.  This prevents
+ fusion: without fusion the value at the last iteration is used,
+ with fusion the value after the initial iteration is used.
+
+ If all uses are outside the outer loop this doesn't prevent fusion;
+ the value of the last iteration is still used (and the values from
+ all intermediate iterations are dead).  */
+  gphi_iterator psi;
+  for (psi = gsi_start_phis (single_exit (loop)->dest);
+   !gsi_end_p (psi); gsi_next ())
+{
+  imm_use_iterator imm_iter;
+  use_operand_p use_p;
+  tree op = gimple_phi_result (psi.phi ());
+  if (virtual_operand_p (op))
+   continue;
+  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, op)
+   {
+ gimple *use_stmt = USE_STMT (use_p);
+ if (!is_gimple_debug (use_stmt)
+ && flow_bb_inside_loop_p (outer, gimple_bb (use_stmt)))
+   return false;
+   }
+}
+
   /* And check blocks belonging to just outer loop.  */
   bbs = XNEWVEC (basic_block, n_basic_blocks_for_fn (cfun));
   n = get_loop_body_with_size (outer, bbs, n_basic_blocks_for_fn (cfun));
@@ -245,7 +272,6 @@ unroll_jam_possible_p (struct loop *outer, struct loop 
*loop)
  body would be the after-iter value of the first body) if it's over
  an associative and commutative operation.  We wouldn't
  be able to handle unknown cycles.  */
-  gphi_iterator psi;
   for (psi = gsi_start_phis (loop->header); !gsi_end_p (psi); gsi_next ())
 {
   affine_iv iv;
diff --git a/gcc/testsuite/gcc.dg/pr87074.c b/gcc/testsuite/gcc.dg/pr87074.c
new file mode 100644
index 000..d838fcd8fc5
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr87074.c
@@ -0,0 +1,25 @@
+/* { dg-do run } */
+/* { dg-options "-O3 -floop-unroll-and-jam --param unroll-jam-min-percent=0" } 
*/
+long b;
+unsigned c[5];
+unsigned long long d = 3;
+int e, f, g;
+
+void h() {
+  for (; f < 11; f++) {
+b = g;
+for (e = 0; e < 5; e++) {
+  c[e] = e - b - (c[e] >> 5);
+  g = c[e];
+}
+  }
+  if (c[0])
+d = 0;
+}
+
+extern void abort(void);
+int main() {
+  h();
+  if (d != 0)
+abort ();
+}

Re: [ARM/FDPIC v2 01/21] [ARM] FDPIC: Add -mfdpic option support

2018-08-31 Thread Christophe Lyon

On Wed, 29 Aug 2018 at 12:46, Kyrill Tkachov
 wrote:
>
> Hi Christophe,
>
> On 13/07/18 17:10, christophe.l...@st.com wrote:
> > From: Christophe Lyon 
> >
> > 2018-XX-XX  Christophe Lyon  
> > Mickaël Guêné  
> >
> > gcc/
> > * config/arm/arm.opt: Add -mfdpic option.
> >
> > Change-Id: Ie5c4ed7434488933de6133186da09cd3ea1291a7
> >
> > diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
> > index a1286a4..231c1cb 100644
> > --- a/gcc/config/arm/arm.opt
> > +++ b/gcc/config/arm/arm.opt
> > @@ -302,3 +302,7 @@ When linking for big-endian targets, generate a legacy 
> > BE32 format image.
> >  mbranch-cost=
> >  Target RejectNegative Joined UInteger Var(arm_branch_cost) Init(-1)
> >  Cost to assume for a branch insn.
> > +
> > +mfdpic
> > +Target Report Mask(FDPIC)
> > +Enable Function Descriptor PIC mode.
>
> So since this is an ABI, why isn't it just adding a new -mabi value?

I think it's just an oversight, because we always talk about "FDPIC
mode". I can change it, it's meant to be mostly hidden and toggled by
the target name.

> In any case, command-line option changes need documentation in invoke.texi.
I keep forgetting this, sorry.

> Thanks,
> Kyrill
>
>
> > --
> > 2.6.3
> >
>

RE: Add new warning flag "warn_prio_ctor_dtor"

2018-08-31 Thread Vinay Kumar

Hi Joseph,

>> Isn't it enabled by default, not by -Wall (so the main option is 
>> -Wno-prio-ctor-dtor to disable the warning)?

Thanks for reviewing the patch and your suggestions.

Please find attached the modified patch as per your review comments.

Please review the patch and let me know if its okay?

Thanks,
Vinay


tcvxwgcc-222-2.patch
Description: tcvxwgcc-222-2.patch

Re: [PATCH, OpenACC] Support C++ "this" in OpenACC directives (PR66053)

2018-08-31 Thread Nathan Sidwell


On 08/30/2018 04:27 PM, Jason Merrill wrote:


On Thu, Aug 30, 2018 at 3:31 PM, Julian Brown  wrote:


"Apart from parsing, it's necessary to prevent the "cannot take the
address of 'this', which is an rvalue expression" error from appearing


Breaking a rather fundamental language attribute does not seem wise.


Why does referring to this[0:1] require making 'this' addressable?
Surely what we're interested in is the value of 'this', not the
address.
Yes, transferring the this pointer is very unlikely to be what the user 
wants -- the object being referred to contains the data.  It might be 
wise to look at the DR's and changes relating to lambdas and this 
capture.  Those changes now make it much harder to simply capture the 
pointer unintentionally.


nathan
--
Nathan Sidwell

[SVE ACLE] Add support for svasrd

2018-08-31 Thread Richard Sandiford

These functions require the shift amount to be an integer constant
expression in the range [1, N].  The patch adds a new hook so that
targets can check this.

Also, the ACLE asm tests try many combinations per file, so I'd added
code to parallelise them at single-file granularity.  However,
the code to do that was using variables that only exist when -j
is used and so this patch makes it conditional.

Tested on aarch64-linux-gnu and committed to aarch64/sve-acle-branch.

Richard



diff --git a/gcc/testsuite/g++.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp 
b/gcc/testsuite/g++.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
index e4a65b809a4..88cf53396e5 100644
--- a/gcc/testsuite/g++.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
+++ b/gcc/testsuite/g++.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
@@ -48,8 +48,10 @@ if { ![string equal $sve_flags ""] || [aarch64_sve_bits] == 
0 } {
 }
 
 global gcc_runtest_parallelize_limit_minor
-set old_limit_minor $gcc_runtest_parallelize_limit_minor
-set gcc_runtest_parallelize_limit_minor 1
+if { [info exists gcc_runtest_parallelize_limit_minor] } {
+set old_limit_minor $gcc_runtest_parallelize_limit_minor
+set gcc_runtest_parallelize_limit_minor 1
+}
 
 torture-init
 set-torture-options {
@@ -76,7 +78,9 @@ gcc-dg-runtest [lsort $files] \
 
 torture-finish
 
-set gcc_runtest_parallelize_limit_minor $old_limit_minor
+if { [info exists gcc_runtest_parallelize_limit_minor] } {
+set gcc_runtest_parallelize_limit_minor $old_limit_minor
+}
 
 # All done.
 dg-finish
diff --git a/gcc/testsuite/gcc.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp 
b/gcc/testsuite/gcc.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
index 7dbdee2825e..b4fd451adea 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
+++ b/gcc/testsuite/gcc.target/aarch64/sve-acle/aarch64-sve-acle-asm.exp
@@ -48,8 +48,10 @@ if { ![string equal $sve_flags ""] || [aarch64_sve_bits] == 
0 } {
 }
 
 global gcc_runtest_parallelize_limit_minor
-set old_limit_minor $gcc_runtest_parallelize_limit_minor
-set gcc_runtest_parallelize_limit_minor 1
+if { [info exists gcc_runtest_parallelize_limit_minor] } {
+set old_limit_minor $gcc_runtest_parallelize_limit_minor
+set gcc_runtest_parallelize_limit_minor 1
+}
 
 torture-init
 set-torture-options {
@@ -72,7 +74,9 @@ gcc-dg-runtest [lsort $files] \
 
 torture-finish
 
-set gcc_runtest_parallelize_limit_minor $old_limit_minor
+if { [info exists gcc_runtest_parallelize_limit_minor] } {
+set gcc_runtest_parallelize_limit_minor $old_limit_minor
+}
 
 # All done.
 dg-finish



diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index f5ea772..9bb3feedf03 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5749,16 +5749,24 @@ builtin_function_validate_nargs (location_t loc, tree 
fndecl, int nargs,
 /* Verifies the NARGS arguments ARGS to the builtin function FNDECL.
Returns false if there was an error, otherwise true.  LOC is the
location of the function; ARG_LOC is a vector of locations of the
-   arguments.  */
+   arguments.  If FNDECL is the result of resolving an overloaded
+   target built-in, ORIG_FNDECL is the original function decl,
+   otherwise it is null.  */
 
 bool
 check_builtin_function_arguments (location_t loc, vec arg_loc,
- tree fndecl, int nargs, tree *args)
+ tree fndecl, tree orig_fndecl,
+ int nargs, tree *args)
 {
-  if (!DECL_BUILT_IN (fndecl)
-  || DECL_BUILT_IN_CLASS (fndecl) != BUILT_IN_NORMAL)
+  if (!DECL_BUILT_IN (fndecl))
 return true;
 
+  if (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_MD)
+return (!targetm.check_builtin_call
+   || targetm.check_builtin_call (loc, arg_loc, fndecl,
+  orig_fndecl, nargs, args));
+
+  gcc_assert (DECL_BUILT_IN_CLASS (fndecl) == BUILT_IN_NORMAL);
   switch (DECL_FUNCTION_CODE (fndecl))
 {
 case BUILT_IN_ALLOCA_WITH_ALIGN_AND_MAX:
diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index c266fee74c7..f14c2d8f84c 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -803,7 +803,7 @@ extern void check_function_arguments_recurse (void (*)
  void *, tree,
  unsigned HOST_WIDE_INT);
 extern bool check_builtin_function_arguments (location_t, vec,
- tree, int, tree *);
+ tree, tree, int, tree *);
 extern void check_function_format (tree, int, tree *, vec *);
 extern bool attribute_fallthrough_p (tree);
 extern tree handle_format_attribute (tree *, tree, tree, int, bool *);
@@ -979,7 +979,8 @@ extern bool

Re: [PATCH 3/3] or1k: gcc: initial support for openrisc

2018-08-31 Thread Paul Koning




> On Aug 30, 2018, at 9:02 PM, Jeff Law  wrote:
> 
> On 08/30/2018 10:58 AM, Richard Henderson wrote:
>> On 08/28/2018 07:13 AM, Jeff Law wrote:
>>> Please consider using function descriptors rather than trampolines.
>>> This allows you to make the stack non-executable at all times which is
>>> good from a security standpoint.  The downside is the indirect calling
>>> mechanism has to change slightly to distinguish between a simple
>>> indirect call and one through a function descriptor (usually by having a
>>> low bit on to indicate the latter).  GIven this is an ABI change, now is
>>> probably the last opportunity to make this change.
>> 
>> Correct me if I'm wrong here:
>> 
>> Define TARGET_CUSTOM_FUNCTION_DESCRIPTORS to an appropriate value -- easy 
>> for a
>> RISC target -- and that's it.
>> 
>> Further, it pretty much only gets used by the Ada front end.  One should not
>> expect these to be used by the C front end nested functions.
> I thought it was used more extensively than that...  Thanks for checking
> into it though.
> 
> Jeff

My impression from reading the internals manual is that it's an alternative to 
trampolines -- and in fact it appears to suggest it's a superior alternative.  
I've been planning to try turning it on for pdp11 where executable stacks can 
be problematic.  (For that matter, they are on lots of other machines -- which 
is why descriptors instead of trampolines sounds like a good thing.)

paul

[PATCH] Fix PR87168

2018-08-31 Thread Richard Biener



I am testing the following patch to fix a missed optimization and 
wrong-code issue in region VN.

{,O1-}Bootstrap and regtest running on x86_64-unknown-linux-gnu.

2018-08-31  Richard Biener  

PR tree-optimization/87168
* tree-ssa-sccvn.c (SSA_VAL): Add visited output parameter.
(rpo_elim::eliminate_avail): When OP was not visited it must
be available.

* gcc.dg/torture/pr87168.c: New testcase.

Index: gcc/tree-ssa-sccvn.c
===
--- gcc/tree-ssa-sccvn.c(revision 263982)
+++ gcc/tree-ssa-sccvn.c(working copy)
@@ -456,10 +456,14 @@ VN_INFO (tree name)
 /* Return the SSA value of X.  */
 
 inline tree
-SSA_VAL (tree x)
+SSA_VAL (tree x, bool *visited = NULL)
 {
   vn_ssa_aux_t tem = vn_ssa_aux_hash->find_with_hash (x, SSA_NAME_VERSION (x));
-  return tem && tem->visited ? tem->valnum : x;
+  if (!tem)
+return x;
+  if (visited)
+*visited = tem->visited;
+  return tem->visited ? tem->valnum : x;
 }
 
 /* Return whether X was visited.  */
@@ -5681,7 +5685,12 @@ rpo_elim::~rpo_elim ()
 tree
 rpo_elim::eliminate_avail (basic_block bb, tree op)
 {
-  tree valnum = SSA_VAL (op);
+  bool visited;
+  tree valnum = SSA_VAL (op, );
+  /* If we didn't visit OP then it must be defined outside of the
+ region we process and also dominate it.  So it is available.  */
+  if (!visited)
+return op;
   if (TREE_CODE (valnum) == SSA_NAME)
 {
   if (SSA_NAME_IS_DEFAULT_DEF (valnum))
Index: gcc/testsuite/gcc.dg/torture/pr87168.c
===
--- gcc/testsuite/gcc.dg/torture/pr87168.c  (nonexistent)
+++ gcc/testsuite/gcc.dg/torture/pr87168.c  (working copy)
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+
+int a, b, c, d, e, f, *g;
+
+int main ()
+{ 
+  unsigned i;
+  while (b)
+{ 
+  int j, m;
+L1:
+  f = j;
+L2:
+  if (i && e)
+   { 
+ i = f;
+ goto L2;
+   }
+  j = f;
+  if (a)
+   goto L3;
+  for (m = 0; m < 2; m++)
+   if (d)
+ goto L1;
+  goto L2;
+L3:
+  ( != g) | c;
+}
+  return 0;
+}

Re: [PATCHv3][PR 59521] Respect __builtin_expect in switch statements

2018-08-31 Thread Jan Hubicka

> 
> 2018-08-23  Martin Liska  
> 
>   PR middle-end/59521
>   * predict.c (set_even_probabilities): Add likely_edges
> argument and handle cases where we have precisely one
> likely edge.
>   (combine_predictions_for_bb): Catch also likely_edges.
>   (tree_predict_by_opcode): Handle gswitch statements.
>   * tree-cfg.c (find_taken_edge_switch_expr): Remove
> const quantifier.
>   (find_case_label_for_value): Likewise.
>   * tree-cfg.h (find_case_label_for_value): New declaration.
>   (find_taken_edge_switch_expr): Likewise.
>   * tree-switch-conversion.c (switch_decision_tree::balance_case_nodes):
> Find pivot in decision tree based on probabily, not by number of
> nodes.
> 
> gcc/testsuite/ChangeLog:
> 
> 2018-08-23  Martin Liska  
> 
>   PR middle-end/59521
>   * c-c++-common/pr59521-1.c: New test.
>   * c-c++-common/pr59521-2.c: New test.
>   * gcc.dg/tree-prof/pr59521-3.c: New test.
> ---
>  gcc/predict.c  | 96 +-
>  gcc/testsuite/c-c++-common/pr59521-1.c | 15 
>  gcc/testsuite/c-c++-common/pr59521-2.c | 15 
>  gcc/testsuite/gcc.dg/tree-prof/pr59521-3.c | 34 
>  gcc/tree-cfg.c | 10 +--
>  gcc/tree-cfg.h |  2 +
>  gcc/tree-switch-conversion.c   | 45 +-
>  7 files changed, 168 insertions(+), 49 deletions(-)
>  create mode 100644 gcc/testsuite/c-c++-common/pr59521-1.c
>  create mode 100644 gcc/testsuite/c-c++-common/pr59521-2.c
>  create mode 100644 gcc/testsuite/gcc.dg/tree-prof/pr59521-3.c
> 
> diff --git a/gcc/predict.c b/gcc/predict.c
> index 8c8e79153fc..db1fe737cb4 100644
> --- a/gcc/predict.c
> +++ b/gcc/predict.c
> @@ -831,7 +831,8 @@ unlikely_executed_bb_p (basic_block bb)
>  
>  static void
>  set_even_probabilities (basic_block bb,
> - hash_set *unlikely_edges = NULL)
> + hash_set *unlikely_edges = NULL,
> + hash_set *likely_edges = NULL)

Please add/update documentation of set_even_probabilities.
>  {
>unsigned nedges = 0, unlikely_count = 0;
>edge e = NULL;
> @@ -843,7 +844,7 @@ set_even_probabilities (basic_block bb,
>all -= e->probability;
>  else if (!unlikely_executed_edge_p (e))
>{
> -nedges ++;
> + nedges++;
>  if (unlikely_edges != NULL && unlikely_edges->contains (e))
> {
>   all -= profile_probability::very_unlikely ();
> @@ -852,26 +853,54 @@ set_even_probabilities (basic_block bb,
>}
>  
>/* Make the distribution even if all edges are unlikely.  */
> +  unsigned likely_count = likely_edges ? likely_edges->elements () : 0;
>if (unlikely_count == nedges)
>  {
>unlikely_edges = NULL;
>unlikely_count = 0;
>  }
>  
> -  unsigned c = nedges - unlikely_count;
> -
> -  FOR_EACH_EDGE (e, ei, bb->succs)
> -if (e->probability.initialized_p ())
> -  ;
> -else if (!unlikely_executed_edge_p (e))
> -  {
> - if (unlikely_edges != NULL && unlikely_edges->contains (e))
> -   e->probability = profile_probability::very_unlikely ();
> +  /* If we have one likely edge, then use its probability and distribute
> + remaining probabilities as even.  */
> +  if (likely_count == 1)
> +{
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> + if (e->probability.initialized_p ())
> +   ;
> + else if (!unlikely_executed_edge_p (e))
> +   {
> + edge_prediction *prediction = *likely_edges->begin ();
> + int p = prediction->ep_probability;
> + profile_probability prob
> +   = profile_probability::from_reg_br_prob_base (p);
> + profile_probability remainder = prob.invert ();
> +
> + if (prediction->ep_edge == e)
> +   e->probability = prob;
> + else
> +   e->probability = remainder.apply_scale (1, nedges - 1);
> +   }
>   else
> -   e->probability = all.apply_scale (1, c).guessed ();
> -  }
> -else
> -  e->probability = profile_probability::never ();
> +   e->probability = profile_probability::never ();
> +}
> +  else
> +{
> +  /* Make all unlikely edges unlikely and the rest will have even
> +  probability.  */
> +  unsigned scale = nedges - unlikely_count;
> +  FOR_EACH_EDGE (e, ei, bb->succs)
> + if (e->probability.initialized_p ())
> +   ;
> + else if (!unlikely_executed_edge_p (e))
> +   {
> + if (unlikely_edges != NULL && unlikely_edges->contains (e))
> +   e->probability = profile_probability::very_unlikely ();
> + else
> +   e->probability = all.apply_scale (1, scale);
> +   }
> + else
> +   e->probability = profile_probability::never ();
> +}
>  }
>  
>  /* Add REG_BR_PROB note to JUMP with PROB.  */
> @@ -1175,6 +1204,7 @@ combine_predictions_for_bb (basic_block bb, bool 
> dry_run)
>if

Re: [PATCH] Add -Wabsolute-value

2018-08-31 Thread Martin Jambor

Hi,

thank you very much for your comments.

On Fri, Aug 24 2018, Joseph Myers wrote:
> On Fri, 24 Aug 2018, Martin Jambor wrote:
>
>> +/* Assuming we have encountered a call to a probably wrong kind of abs, 
>> issue a
>> +   warning.  LOC is the location of the call, FNKIND is a string 
>> characterizing
>> +   the class of the used abs function, FNDEC is the actual function 
>> declaration
>> +   and ATYPE is type of the supplied actual argument.  */
>
> For proper i18n, you have to use an enumeration here, not English string 
> fragments.
>
>> +  warning_at (loc, OPT_Wabsolute_value,
>> +  "using %s absolute value function %qD when argument "
>> +  "is of %s type %qT", fnkind, fndecl, act, atype);
>
> And then have all the possible combinations as complete sentences, in 
> separate warning_at calls in appropriate switch statments or marked up 
> with G_() if you put more than one in a single warning_at call using ?:, 
> for translation purposes.  (Any cases that are impossible combinations for 
> the warning - you don't want translators to have to produce useless 
> translations where e.g. both argument and call are of the same kind and so 
> the warning shouldn't occur - should use gcc_unreachable ().)

Understood.  That however defeats the purpose of having the warning
string generation in a separate function and so I removed it and just
put separate warning_at calls with the entire string in warn_for_abs.

>
>> +case BUILT_IN_FABS:
>> +case BUILT_IN_FABSF:
>> +case BUILT_IN_FABSL:
>> +  if (!SCALAR_FLOAT_TYPE_P (atype)
>> +  || DECIMAL_FLOAT_MODE_P (TYPE_MODE (atype)))
>
> Should include the _FloatN / _FloatNx built-in functions here (CASE_FLT_FN 
> together with CASE_FLT_FN_FLOATN_NX).

OK, fixed.

>
>> +case BUILT_IN_CABS:
>> +case BUILT_IN_CABSF:
>> +case BUILT_IN_CABSL:
>
> And use CASE_FLT_FN here rather than hardcoding the list (but we don't yet 
> have _FloatN / _FloatNx built-in variants of cabs).

Likewise.

>
>> +  tree ftype = TREE_VALUE (TYPE_ARG_TYPES (TREE_TYPE (fndecl)));
>> +  if (tree_fits_uhwi_p (TYPE_SIZE (atype))
>> +  && tree_to_uhwi (TYPE_SIZE (atype)) > tree_to_uhwi (TYPE_SIZE 
>> (ftype)))
>> +warning_at (loc, OPT_Wabsolute_value,
>> +"absolute value function %qD given an argument of type %qT "
>> +"but has parameter of type %qT which may cause truncation "
>> +"of value ", fndecl, atype, ftype);
>
> Should not have space at end of warning text.  I don't think TYPE_SIZE is 
> the right thing to use in general; for example, on x86_64, you should warn 
> for passing a _Float128 value to fabsl, but both long double and _Float128 
> are 16-byte types (with only 10 bytes non-padding in long double).

I have looked at how unsafe_conversion_p detects these cases and it
seems it relies entirely on TYPE_PRECISION, so I changed my code to do
the same.  I have also added a test for long double vs _Float128 to the
first test case.

Bootstrapped and tested on x86_64-linux.  What do you think about it
now?

Thanks,

Martin



2018-08-29  Martin Jambor  

gcc/
* common.opt (Wabsolute-value): New.
* doc/invoke.texi (Warning Options): Likewise.

gcc/c/
(warn_for_abs): New function.
(c_parser_postfix_expression_after_primary): Call it.

testsuite/
* gcc.dg/warn-abs-1.c: New test.
* gcc.dg/dfp/warn-abs-2.c: Likewise.
---
 gcc/c/c-parser.c  | 155 +-
 gcc/common.opt|   4 +
 gcc/doc/invoke.texi   |   8 ++
 gcc/testsuite/gcc.dg/dfp/warn-abs-2.c |  28 +
 gcc/testsuite/gcc.dg/warn-abs-1.c |  67 +++
 5 files changed, 256 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/dfp/warn-abs-2.c
 create mode 100644 gcc/testsuite/gcc.dg/warn-abs-1.c

diff --git a/gcc/c/c-parser.c b/gcc/c/c-parser.c
index 0d5dbea8f67..c180d9a391e 100644
--- a/gcc/c/c-parser.c
+++ b/gcc/c/c-parser.c
@@ -9110,6 +9110,144 @@ sizeof_ptr_memacc_comptypes (tree type1, tree type2)
   return comptypes (type1, type2) == 1;
 }
 
+/* Warn for patterns where abs-like function appears to be used incorrectly,
+   gracely ignore any non-abs-like function.  The warning location should be
+   LOC.  FNDECL is the declaration of called function, it must be a
+   BUILT_IN_NORMAL function.  ARG is the first and only argument of the
+   call.  */
+
+static void
+warn_for_abs (location_t loc, tree fndecl, tree arg)
+{
+  tree atype = TREE_TYPE (arg);
+
+  /* Casts from pointers (and thus arrays and fndecls) will generate
+ -Wint-conversion warnings.  Most other wrong types hopefully lead to type
+ mismatch errors.  TODO: Think about what to do with FIXED_POINT_TYPE_P
+ types and possibly other exotic types.  */
+  if (!INTEGRAL_TYPE_P (atype)
+  && !SCALAR_FLOAT_TYPE_P (atype)
+  && TREE_CODE (atype) != COMPLEX_TYPE)
+return;
+
+  enum

[PR c++/87155] Anonymous namespace and

2018-08-31 Thread Nathan Sidwell

This fixes 87155 where an inline namespace caused us to find an inner 
anonymous namespace.


The literal wording of the standard is wrong.  Nobody implements that. 
They implement essentially 1 of 2 different interpretations, which were 
equivalent until inline namespaces and DR2061.  Then one of those 
interpretations has a surprise!


This patch implements the interpretation that gives us the desired 
behaviour.  While we could implement that by giving anonymous namespaces 
names for lookup purposes, that started messing with the debug machinery 
and common core more than desirable.  We can simply not look in inline 
namespaces when searching for a NULL name.


It turns out that 84707 was similar case of this, that we considered ill 
formed (relying on our interpretation of the std).  84707 is well formed 
code.


booted & tested on x86_64-linux, will commit trunk and gcc-8.

nathan

--
Nathan Sidwell
2018-08-31  Nathan Sidwell  

	PR c++/87155
	PR c++/84707
	cp/
	* name-lookup.c (name_lookup::search_namespace): Don't look at
	inlines when searching for NULL names.
	testsuite/
	* g++.dg/cpp0x/pr87155.C: New.
	* g++.dg/cpp0x/inline-ns10.C: Adjust.

Index: gcc/cp/name-lookup.c
===
--- gcc/cp/name-lookup.c	(revision 264014)
+++ gcc/cp/name-lookup.c	(working copy)
@@ -558,11 +558,14 @@ name_lookup::search_namespace (tree scop
 
   /* Look in exactly namespace. */
   bool found = search_namespace_only (scope);
-  
-  /* Recursively look in its inline children.  */
-  if (vec *inlinees = DECL_NAMESPACE_INLINEES (scope))
-for (unsigned ix = inlinees->length (); ix--;)
-  found |= search_namespace ((*inlinees)[ix]);
+
+  /* Don't look into inline children, if we're looking for an
+ anonymous name -- it must be in the current scope, if anywhere.  */
+  if (name)
+/* Recursively look in its inline children.  */
+if (vec *inlinees = DECL_NAMESPACE_INLINEES (scope))
+  for (unsigned ix = inlinees->length (); ix--;)
+	found |= search_namespace ((*inlinees)[ix]);
 
   if (found)
 mark_found (scope);
Index: gcc/testsuite/g++.dg/cpp0x/inline-ns10.C
===
--- gcc/testsuite/g++.dg/cpp0x/inline-ns10.C	(revision 264014)
+++ gcc/testsuite/g++.dg/cpp0x/inline-ns10.C	(working copy)
@@ -2,7 +2,10 @@
 // { dg-do compile { target c++11 } }
 
 inline namespace {
-  namespace {}
+	 namespace {} // not this one
+	 void a ();
 }
 
-namespace {} // { dg-error "conflicts" }
+namespace {
+  int a (); // { dg-error "ambiguating" "" }
+}
Index: gcc/testsuite/g++.dg/cpp0x/pr87155.C
===
--- gcc/testsuite/g++.dg/cpp0x/pr87155.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp0x/pr87155.C	(working copy)
@@ -0,0 +1,14 @@
+// { dg-do compile { target c++11 } }
+// PR c++/87155 confused about which anon namespace
+
+namespace {
+  void a (); // this one
+}
+
+inline namespace n2 {
+	 namespace {}
+} 
+
+namespace {
+  int a ();  // { dg-error "ambiguating" "" }
+}

Re: [PATCH] Fix thinko in estimate_local_effects in IPA-CP

2018-08-31 Thread Richard Biener

On Fri, Aug 31, 2018 at 1:31 PM Martin Jambor  wrote:
>
> Hi,
>
> I have discovered the following thinko in IPA-CP's
> estimate_local_effects added during conversion to use nonspecialized
> time.  The intent clearly was to add an upper bound to the time
> difference, not a lower one.
>
> The patch introduces a guality failure:
>
>   gcc.dg/guality/pr41616-1.c   -O3 -g  -DPREVENT_OPTIMIZATION  execution test
>
> the testcase however also already fails at -O2 and -O3 -flto and it
> occurs now also at -O3 because we do not clone a function which we quite
> clearly shouldn't.
>
> Bootstrapped and tested on x86_64-linux, I have also made sure the
> change does not affect SPEC 2006 and 2017 -Ofast.  OK for trunk?

OK.

> Thanks,
>
> Martin
>
>
> 2018-08-23  Martin Jambor  
>
> * ipa-cp.c (estimate_local_effects): Replace wrong MAX with MIN.
> ---
>  gcc/ipa-cp.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 42dd4cc2904..2117529aebb 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -2911,7 +2911,7 @@ estimate_local_effects (struct cgraph_node *node)
>  "known contexts, code not going to grow.\n");
> }
>else if (good_cloning_opportunity_p (node,
> -  MAX ((base_time - time).to_int (),
> +  MIN ((base_time - time).to_int (),
> 65536),
>stats.freq_sum, stats.count_sum,
>size))
> --
> 2.18.0
>

Re: [PATCH] IPA ICF: make type cache a static field sem_item.

2018-08-31 Thread Richard Biener

On Fri, Aug 31, 2018 at 12:49 PM Martin Liška  wrote:
>
> Hi.
>
> As noticed here:
> https://gcc.gnu.org/ml/gcc/2018-08/msg4.html
>
> There's ugly usage of a static variable in sem_item_optimizer
> that's called from sem_item.
>
> I believe logically the hash should live in sem_item.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK.

> Martin
>
>
> gcc/ChangeLog:
>
> 2018-08-29  Martin Liska  
>
> * ipa-icf.c (sem_item::add_type): Use
> sem_item::m_type_hash_cache.
> * ipa-icf.h: Move the cache from sem_item_optimizer
> to sem_item.
> ---
>  gcc/ipa-icf.c | 6 --
>  gcc/ipa-icf.h | 6 +++---
>  2 files changed, 7 insertions(+), 5 deletions(-)
>
>

[PATCH] Fix thinko in estimate_local_effects in IPA-CP

2018-08-31 Thread Martin Jambor

Hi,

I have discovered the following thinko in IPA-CP's
estimate_local_effects added during conversion to use nonspecialized
time.  The intent clearly was to add an upper bound to the time
difference, not a lower one.

The patch introduces a guality failure:

  gcc.dg/guality/pr41616-1.c   -O3 -g  -DPREVENT_OPTIMIZATION  execution test

the testcase however also already fails at -O2 and -O3 -flto and it
occurs now also at -O3 because we do not clone a function which we quite
clearly shouldn't.

Bootstrapped and tested on x86_64-linux, I have also made sure the
change does not affect SPEC 2006 and 2017 -Ofast.  OK for trunk?

Thanks,

Martin


2018-08-23  Martin Jambor  

* ipa-cp.c (estimate_local_effects): Replace wrong MAX with MIN.
---
 gcc/ipa-cp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 42dd4cc2904..2117529aebb 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -2911,7 +2911,7 @@ estimate_local_effects (struct cgraph_node *node)
 "known contexts, code not going to grow.\n");
}
   else if (good_cloning_opportunity_p (node,
-  MAX ((base_time - time).to_int (),
+  MIN ((base_time - time).to_int (),
65536),
   stats.freq_sum, stats.count_sum,
   size))
-- 
2.18.0

Re: [PATCH] tree-vrp: add "const" qualifier to various value_range pointers

2018-08-31 Thread Aldy Hernandez





On 08/30/2018 06:24 PM, David Malcolm wrote:

Adding "const" to these indicates to both humans and the compiler in which
direction information is intended to flow.  For example, compare:
   extract_range_from_binary_expr_1, which takes two vr and modifies a third
with:
   ranges_from_anti_range, which takes one and modifies two others.



Dde!  Thanks so much.  I will adjust my patch accordingly.

Aldy

[PATCH] GCOV: Print one decimal place in human readable mode.

2018-08-31 Thread Martin Liška

Hi.

There are sometimes situations where a line is executed:

 1G:  307:read_size (unsigned char *p, int size)

in such case a more fine value is appreciated:

 1.4G:  307:read_size (unsigned char *p, int size)

Thus I'm suggesting to add one digit after dot.

Patch survives gcov.exp, will install if not objections.

Martin

gcc/ChangeLog:

2018-08-31  Martin Liska  

* doc/gcov.texi: Update documentation of humar
readable mode.
* gcov.c (format_count): Print one decimal place, it provides
more fine number of situations like '1G' vs. '1.4G'.

gcc/testsuite/ChangeLog:

2018-08-31  Martin Liska  

* g++.dg/gcov/loop.C: Update test to support new format.
---
 gcc/doc/gcov.texi| 2 +-
 gcc/gcov.c   | 4 ++--
 gcc/testsuite/g++.dg/gcov/loop.C | 6 +++---
 3 files changed, 6 insertions(+), 6 deletions(-)


diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index f33dc8f6aed..98f4a876293 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -255,7 +255,7 @@ lcount:36,1,0
 
 @item -j
 @itemx --human-readable
-Write counts in human readable format (like 24k).
+Write counts in human readable format (like 24.6k).
 
 @item -k
 @itemx --use-colors
diff --git a/gcc/gcov.c b/gcc/gcov.c
index ff4020c713e..6a24a320046 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -2216,8 +2216,8 @@ format_count (gcov_type count)
   if (count + divisor / 2 < 1000 * divisor)
 	break;
 }
-  gcov_type r  = (count + divisor / 2) / divisor;
-  sprintf (buffer, "%" PRId64 "%c", r, units[i]);
+  float r = 1.0f * count / divisor;
+  sprintf (buffer, "%.1f%c", r, units[i]);
   return buffer;
 }
 
diff --git a/gcc/testsuite/g++.dg/gcov/loop.C b/gcc/testsuite/g++.dg/gcov/loop.C
index 7f3be5587af..24f580634d9 100644
--- a/gcc/testsuite/g++.dg/gcov/loop.C
+++ b/gcc/testsuite/g++.dg/gcov/loop.C
@@ -2,11 +2,11 @@
 /* { dg-do run { target native } } */
 
 unsigned
-loop (unsigned n, int value)		  /* count(14k) */
+loop (unsigned n, int value)		  /* count(14.0k) */
 {
   for (unsigned i = 0; i < n - 1; i++)
   {
-value += i;  /* count(21M) */
+value += i;  /* count(21.0M) */
   }
 
   return value;
@@ -18,7 +18,7 @@ int main(int argc, char **argv)
   for (unsigned i = 0; i < 7 * 1000; i++)
   {
 sum += loop (1000, sum);
-sum += loop (2000, sum);		  /* count(7k) */
+sum += loop (2000, sum);		  /* count(7.0k) */
   }
 
   return 0;  /* count(1) */

Re: [testsuite] Add dg-require-fileio to some libstdc++ wchar_t tests

2018-08-31 Thread Jonathan Wakely


On 30/08/18 19:19 -0600, Sandra Loosemore wrote:
One of the simulators we use for testing nios2-elf builds doesn't 
include full semihosted fileio support.  (IIRC it can do I/O to the 
console and open and close files, but doesn't support fseek, 
stat/fstat, unlink, etc).  There are a bunch of libstdc++ 
wchar_t/$test.cc test cases that fail on this target where the 
corresponding char/$test.cc test is skipped, because the char test 
case files already specify "dg-require-fileio" and the wchar_t ones 
don't.


I used grep etc to find all such pairs of test cases, and then added 
"dg-require-fileio" to the wchar_t files where it was missing.  Note 
that this patch touches more test cases than the ones that were 
actually failing,


Three of them use dg-require-namedlocale to only run when localedata
for a specific locale is available, and four are restricted to only be
tested for { target *-*-mingw* }, would that explain it?

and I didn't confirm that all the similarly-named 
char and wchar_t files were actually testing the exact same things.


These ones don't match the corresponding /char/ version:

libstdc++-v3/testsuite/27_io/basic_filebuf/underflow/wchar_t/2.cc
libstdc++-v3/testsuite/27_io/basic_filebuf/underflow/wchar_t/3.cc
libstdc++-v3/testsuite/27_io/basic_ifstream/cons/wchar_t/1.cc
libstdc++-v3/testsuite/27_io/basic_ofstream/cons/wchar_t/1.cc
libstdc++-v3/testsuite/27_io/basic_ofstream/open/wchar_t/1.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/10.cc

They do all use file I/O though.

Is this OK to check in?  Or is some deeper analysis required to 
identify which tests are affected?


I checked the files, most of them #include  so we can
safely assume they require file I/O. These ones don't include
 but they all use fopen and/or freopen:

libstdc++-v3/testsuite/27_io/objects/wchar_t/10.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/12048-1.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/12048-2.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/12048-3.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/12048-4.cc
libstdc++-v3/testsuite/27_io/objects/wchar_t/12048-5.cc
libstdc++-v3/testsuite/ext/stdio_sync_filebuf/wchar_t/1.cc

So the patch seems fine. Thanks for keeping the test results clean.

Re: [PATCH] Optimise sqrt reciprocal multiplications

2018-08-31 Thread Richard Biener

On Thu, 30 Aug 2018, Kyrill Tkachov wrote:

> Ping.
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01496.html
> 
> Thanks,
> Kyrill
> 
> On 23/08/18 18:09, Kyrill Tkachov wrote:
> > Hi Richard,
> > 
> > On 23/08/18 11:13, Richard Sandiford wrote:
> > > Kyrill  Tkachov  writes:
> > > > Hi all,
> > > > 
> > > > This patch aims to optimise sequences involving uses of 1.0 / sqrt (a)
> > > > under -freciprocal-math and -funsafe-math-optimizations.
> > > > In particular consider:
> > > > 
> > > > x = 1.0 / sqrt (a);
> > > > r1 = x * x;  // same as 1.0 / a
> > > > r2 = a * x; // same as sqrt (a)
> > > > 
> > > > If x, r1 and r2 are all used further on in the code, this can be
> > > > transformed into:
> > > > tmp1 = 1.0 / a
> > > > tmp2 = sqrt (a)
> > > > tmp3 = tmp1 * tmp2
> > > > x = tmp3
> > > > r1 = tmp1
> > > > r2 = tmp2
> > > Nice optimisation :-)  Someone who knows the pass better should review,
> > > but:
> > 
> > Thanks for the review.
> > 
> > > There seems to be an implicit assumption that this is a win even
> > > when the r1 and r2 assignments are only conditionally executed.
> > > That's probably true, but it might be worth saying explicitly.
> > 
> > I'll admit I had not considered that case.
> > I think it won't make a difference in practice, as the really expensive
> > operations here
> > are the sqrt and the division and they are on the executed path in either
> > case and them
> > becoming independent should be a benefit of its own.
> > 
> > > > +/* Return TRUE if USE_STMT is a multiplication of DEF by A.  */
> > > > +
> > > > +static inline bool
> > > > +is_mult_by (gimple *use_stmt, tree def, tree a)
> > > > +{
> > > > +  if (gimple_code (use_stmt) == GIMPLE_ASSIGN
> > > > +  && gimple_assign_rhs_code (use_stmt) == MULT_EXPR)
> > > > +{
> > > > +  tree op0 = gimple_assign_rhs1 (use_stmt);
> > > > +  tree op1 = gimple_assign_rhs2 (use_stmt);
> > > > +
> > > > +  return (op0 == def && op1 == a)
> > > > +  || (op0 == a && op1 == def);
> > > > +}
> > > > +  return 0;
> > > > +}
> > > Seems like is_square_of could now be a light-weight wrapper around this.
> > 
> > Indeed, I've done the wrapping now.
> > 
> > > > @@ -652,6 +669,180 @@ execute_cse_reciprocals_1 (gimple_stmt_iterator
> > > > *def_gsi, tree def)
> > > > occ_head = NULL;
> > > >   }
> > > >   +/* Transform sequences like
> > > > +   x = 1.0 / sqrt (a);
> > > > +   r1 = x * x;
> > > > +   r2 = a * x;
> > > > +   into:
> > > > +   tmp1 = 1.0 / a;
> > > > +   tmp2 = sqrt (a);
> > > > +   tmp3 = tmp1 * tmp2;
> > > > +   x = tmp3;
> > > > +   r1 = tmp1;
> > > > +   r2 = tmp2;
> > > > +   depending on the uses of x, r1, r2.  This removes one multiplication
> > > > and
> > > > +   allows the sqrt and division operations to execute in parallel.
> > > > +   DEF_GSI is the gsi of the initial division by sqrt that defines
> > > > +   DEF (x in the example abovs).  */
> > > > +
> > > > +static void
> > > > +optimize_recip_sqrt (gimple_stmt_iterator *def_gsi, tree def)
> > > > +{
> > > > +  use_operand_p use_p;
> > > > +  imm_use_iterator use_iter;
> > > > +  gimple *stmt = gsi_stmt (*def_gsi);
> > > > +  tree x = def;
> > > > +  tree orig_sqrt_ssa_name = gimple_assign_rhs2 (stmt);
> > > > +  tree div_rhs1 = gimple_assign_rhs1 (stmt);
> > > > +
> > > > +  if (TREE_CODE (orig_sqrt_ssa_name) != SSA_NAME
> > > > +  || TREE_CODE (div_rhs1) != REAL_CST
> > > > +  || !real_equal (_REAL_CST (div_rhs1), ))
> > > > +return;
> > > > +
> > > > +  gimple *sqrt_stmt = SSA_NAME_DEF_STMT (orig_sqrt_ssa_name);
> > > > +  if (!is_gimple_call (sqrt_stmt)
> > > > +  || !gimple_call_lhs (sqrt_stmt))
> > > > +return;
> > > > +
> > > > +  gcall *call = as_a  (sqrt_stmt);
> > > Very minor, but:
> > > 
> > >gcall *sqrt_stmt
> > >  = dyn_cast  (SSA_NAME_DEF_STMT (orig_sqrt_ssa_name));
> > >if (!sqrt_stmt || !gimple_call_lhs (sqrt_stmt))
> > >  return;
> > > 
> > > would avoid the need for the separate as_a<>, and would mean that
> > > we only call gimple_call_* on gcalls.
> > 
> > Ok.
> > 
> > > > +  if (has_other_use)
> > > > +{
> > > > +  /* Using the two temporaries tmp1, tmp2 from above
> > > > + the original x is now:
> > > > + x = tmp1 * tmp2.  */
> > > > +  gcc_assert (mult_ssa_name);
> > > > +  gcc_assert (sqr_ssa_name);
> > > > +  gimple_stmt_iterator gsi2 = gsi_for_stmt (stmt);
> > > > +
> > > > +  tree new_ssa_name
> > > > += make_temp_ssa_name (TREE_TYPE (a), NULL,
> > > > "recip_sqrt_transformed");
> > > > +  gimple *new_stmt
> > > > += gimple_build_assign (new_ssa_name, MULT_EXPR,
> > > > +   mult_ssa_name, sqr_ssa_name);
> > > > +  gsi_insert_before (, new_stmt, GSI_SAME_STMT);
> > > > +  gcc_assert (gsi_stmt (gsi2) == stmt);
> > > > +  gimple_assign_set_rhs_from_tree (, new_ssa_name);
> > > > +  fold_stmt ();
> > > > +  update_stmt (stmt);
> > > In this case we're replacing the statement in its original

Re: [GCC][PATCH v2][Aarch64] Exploiting BFXIL when OR-ing two AND-operations with appropriate bitmasks

2018-08-31 Thread Kyrill Tkachov




On 30/08/18 16:53, Sam Tebbs wrote:



On 08/28/2018 11:53 PM, James Greenhalgh wrote:
> Hm, I'm not very sure about the naming here; "left consecutive" isn't a
> common phrase to denote the mask you're looking for (exact_log2 (-i) != -1
> if I'm reading right), and is misleading 0x is 'left consecutive'
> too, just with zeroes rather than ones.
>
I think you're right about it not being the best naming. Do you have any
suggestions for a better name?


Naming things is hard... :(

How about aarch64_hi_bits_all_ones_p ?

Kyrill


>> diff --git a/gcc/config/aarch64/aarch64-protos.h 
b/gcc/config/aarch64/aarch64-protos.h
>> index 
af5db9c595385f7586692258f750b6aceb3ed9c8..01d9e1bd634572fcfa60208ba4dc541805af5ccd 
100644
>> --- a/gcc/config/aarch64/aarch64-protos.h
>> +++ b/gcc/config/aarch64/aarch64-protos.h
>> @@ -574,4 +574,6 @@ rtl_opt_pass *make_pass_fma_steering (gcc::context 
*ctxt);
>>
>>   poly_uint64 aarch64_regmode_natural_size (machine_mode);
>>
>> +bool aarch64_is_left_consecutive (HOST_WIDE_INT);
>> +
>>   #endif /* GCC_AARCH64_PROTOS_H */
>> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> index 
fa01475aa9ee579b6a3b2526295b622157120660..3cfa51b15af3e241672f1383cf881c12a44494a5 
100644
>> --- a/gcc/config/aarch64/aarch64.c
>> +++ b/gcc/config/aarch64/aarch64.c
>> @@ -1454,6 +1454,14 @@ aarch64_hard_regno_caller_save_mode (unsigned regno, 
unsigned,
>>   return SImode;
>>   }
>>
>> +/* Implement IS_LEFT_CONSECUTIVE.  Check if I's bits are consecutive
> What is IS_LEFT_CONSECUTIVE - I don't see it elsewhere in the GCC code, so
> what does the comment refer to implementing?
Thanks for pointing out this mistake, it should read
"AARCH64_IS_LEFT_CONSECUTIVE" to refer to the function definition in
aarch64-protos.h. This will of course change once a better name is
thought of.

Thanks,
Sam

[PATCH] IPA ICF: make type cache a static field sem_item.

2018-08-31 Thread Martin Liška

Hi.

As noticed here:
https://gcc.gnu.org/ml/gcc/2018-08/msg4.html

There's ugly usage of a static variable in sem_item_optimizer
that's called from sem_item.

I believe logically the hash should live in sem_item.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Martin


gcc/ChangeLog:

2018-08-29  Martin Liska  

* ipa-icf.c (sem_item::add_type): Use
sem_item::m_type_hash_cache.
* ipa-icf.h: Move the cache from sem_item_optimizer
to sem_item.
---
 gcc/ipa-icf.c | 6 --
 gcc/ipa-icf.h | 6 +++---
 2 files changed, 7 insertions(+), 5 deletions(-)


diff --git a/gcc/ipa-icf.c b/gcc/ipa-icf.c
index 39b96ba13be..8a6a7a3f32f 100644
--- a/gcc/ipa-icf.c
+++ b/gcc/ipa-icf.c
@@ -227,6 +227,8 @@ void sem_item::set_hash (hashval_t hash)
   m_hash_set = true;
 }
 
+hash_map sem_item::m_type_hash_cache;
+
 /* Semantic function constructor that uses STACK as bitmap memory stack.  */
 
 sem_function::sem_function (bitmap_obstack *stack)
@@ -1587,7 +1589,7 @@ sem_item::add_type (const_tree type, inchash::hash )
 	  return;
 	}
 
-  hashval_t *val = optimizer->m_type_hash_cache.get (type);
+  hashval_t *val = m_type_hash_cache.get (type);
 
   if (!val)
 	{
@@ -1607,7 +1609,7 @@ sem_item::add_type (const_tree type, inchash::hash )
 	  hstate2.add_int (nf);
 	  hash = hstate2.end ();
 	  hstate.add_hwi (hash);
-	  optimizer->m_type_hash_cache.put (type, hash);
+	  m_type_hash_cache.put (type, hash);
 	}
   else
 hstate.add_hwi (*val);
diff --git a/gcc/ipa-icf.h b/gcc/ipa-icf.h
index 622aebc00c0..a64b3852efb 100644
--- a/gcc/ipa-icf.h
+++ b/gcc/ipa-icf.h
@@ -281,6 +281,9 @@ private:
   /* Initialize internal data structures. Bitmap STACK is used for
  bitmap memory allocation process.  */
   void setup (bitmap_obstack *stack);
+
+  /* Because types can be arbitrarily large, avoid quadratic bottleneck.  */
+  static hash_map m_type_hash_cache;
 }; // class sem_item
 
 class sem_function: public sem_item
@@ -524,9 +527,6 @@ public:
   /* Gets a congruence class group based on given HASH value and TYPE.  */
   congruence_class_group *get_group_by_hash (hashval_t hash,
   sem_item_type type);
-
-  /* Because types can be arbitrarily large, avoid quadratic bottleneck.  */
-  hash_map m_type_hash_cache;
 private:
 
   /* For each semantic item, append hash values of references.  */

[PATCH][ipa-inline][obvious] Fix typos in comment

2018-08-31 Thread Kyrill Tkachov


Hi all,

I'm committing this as obvious.

Thanks,
Kyrill

2018-08-31  Kyrylo Tkachov  

* ipa-inline.c (can_inline_edge_by_limits_p): Fix typos in comment.
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index a84d1d9ad3e45935eff51f00b8f6b5058a691c9e..025788522fbe602e1d190a30ccea006b9ab053c5 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -430,9 +430,9 @@ can_inline_edge_by_limits_p (struct cgraph_edge *e, bool report,
   ipa_fn_summary *caller_info = ipa_fn_summaries->get (caller);
   ipa_fn_summary *callee_info = ipa_fn_summaries->get (callee);
 
- /* Until GCC 4.9 we did not check the semantics alterning flags
-	bellow and inline across optimization boundry.
-	Enabling checks bellow breaks several packages by refusing
+ /* Until GCC 4.9 we did not check the semantics-altering flags
+	below and inlined across optimization boundaries.
+	Enabling checks below breaks several packages by refusing
 	to inline library always_inline functions. See PR65873.
 	Disable the check for early inlining for now until better solution
 	is found.  */

Re: [PATCH] look harder for MEM_REF operand equality to avoid -Wstringop-truncation (PR 84561)

2018-08-31 Thread Richard Biener

On Thu, Aug 30, 2018 at 7:39 PM Martin Sebor  wrote:
>
> On 08/30/2018 11:22 AM, Richard Biener wrote:
> > On August 30, 2018 6:54:21 PM GMT+02:00, Martin Sebor  
> > wrote:
> >> On 08/30/2018 02:35 AM, Richard Biener wrote:
> >>> On Thu, Aug 30, 2018 at 2:12 AM Martin Sebor 
> >> wrote:
> 
>  The attached patch adds code to work harder to determine whether
>  the destination of an assignment involving MEM_REF is the same
>  as the destination of a prior strncpy call.  The included test
>  case demonstrates when this situation comes up.  During ccp,
>  dstbase and lhsbase returned by get_addr_base_and_unit_offset()
>  end up looking like this:
> >>>
> >>> "During CCP" means exactly when?  The CCP lattice tracks copies
> >>> so CCP should already know that _1 == _8.  I suppose during
> >>> substitute_and_fold then?  But that replaces uses before folding
> >>> the stmt.
> >>
> >> Yes, when ccp_finalize() performs the final substitution during
> >> substitute_and_fold().
> >
> > But then you shouldn't need the loop but at most look at the pointer SSA 
> > Def to get at the non-invariant ADDR_EXPR.
>
> I don't follow.   Are you suggesting to compare
> SSA_NAME_DEF_STMT (dstbase) to SSA_NAME_DEF_STMT (lhsbase) for
> equality?  They're not equal.

No.

> The first loop iterates once and retrieves
>
>1.  _8 = _3(D)->a;
>
> The second loop iterates three times and retrieves:
>
>1.  _1 = _9
>2.  _9 = _8
>3.  _8 = _3(D)->a;
>
> How do I get from _1 to _3(D)->a without iterating?  Or are
> you saying to still iterate but compare the SSA_NAME_DEF_STMT?

I say you should retrieve _8 = _3(D)->a immediately since the
copies should be
propagated out at this stage.

> Martin
>
> >
> > Richard.
> >
> >> Martin
> >>
> >>>
> >>> So I'm confused.
> >>>
> 
> _8 = _3(D)->a;
> _9 = _8;
> _1 = _9;
> strncpy (MEM_REF (_3(D)->a), ...);
> MEM[(struct S *)_1].a[n_7] = 0;
> 
>  so the loops follow the simple assignments until we get at
>  the ADDR_EXPR assigned to _8 which is the same as the strncpy
>  destination.
> 
>  Tested on x86_64-linux.
> 
>  Martin
> >
>

Re: [PR c++/87137] GCC-8 Fix

2018-08-31 Thread Richard Biener

On Thu, Aug 30, 2018 at 5:09 PM JonY <10wa...@gmail.com> wrote:
>
> On 08/30/2018 11:59 AM, Nathan Sidwell wrote:
> > On 08/29/2018 11:06 PM, Liu Hao wrote:
> >
> >> It is strictly an ABI break but I doubt whether code in real world
> >> that got broken by this bug ever exists. Usually when people expect
> >> consecutive bitfields to be packed into a single word they wouldn't
> >> put irrelevant declarations between them.
> >
> > You're probably right.  I'm guessing this bug was found because:
> >
> >int bob : 1;
> >int get_bob () const {return bob;}
> >void set_bob (bool v) {bob=v;}
> >int bill : 1;
> >...
> >
> > might be a useful style.
> >
> >> An important reason why such code could be rare is that the following
> >> code compiles differently as C and C++:
> >
> >> struct foo
> >>{
> >>  unsigned a : 21;
> >>  union x1 { char x; };
> >>  unsigned b : 11;
> >>  union x1 u;
> >>};
> >
> > (for C++) this happens to be a case we already get right.   I
> > think that'd be a C vendor extension, I don't think unnamed fields are
> > permitted there?
> >
> > Here's a version of the patch to completely resolve the issue, tested on
> > trunk.  I noticed regular x86 targets support the ms_struct attribute,
> > so we can give this wider testing.  I did consider trying to reorder the
> > field decls before struct layour, but that seemed a more invasive change
> > -- besides it might be nice to be able to remove the template-specific
> > CLASSTYPE_DECL_LIST which almost but not quite mirrors TYPE_FIELDS.
> >
> > Ok for trunk, ok for 8?
> >
> > nathan
>
> I don't have any objections.

Me neither if appropriately documented in changes.html.

Richard.

>

Re: [committed] C++: fix-it hint for missing "typename" (PR c++/63392)

2018-08-31 Thread Paolo Carlini


Hi David,

On 28/08/2018 02:20, David Malcolm wrote:

This patch adds a fix-it hint to missing "typename" errors in the C++
frontend, suggesting the insertion of "typename ".
... it occurs to me, that probably we could have a fix-it hint for 
another kind of missing typename (class) in testcases like c++/84980: 
saying *only*

"‘T’ has not been declared" doesn't seem to help the novice much!?!

Paolo.

[C++ Patch] PR 84980 ("[concepts] ICE with missing typename in concept")

2018-08-31 Thread Paolo Carlini


Hi,

another straightforward ICE on invalid.  I spent however quite a bit of 
time on it, because I tried to catch the error_mark_node as early as 
possible, but that doesn't seem to work well error-recovery-wise: if, 
say, we catch it in deduce_constrained_parameter we end-up emitting a 
second redundant "‘C’ is not a type" error message because we skip 'C' 
completely. Thus, at least for the time being, I propose to simply catch 
the problem in finish_shorthand_constraint (which likely avoids more 
ICEs already in Bugzilla...). Tested x86_64-linux.


Thanks, Paolo.



/cp
2018-08-31  Paolo Carlini  

PR c++/84980
* constraint.cc (finish_shorthand_constraint): Early return if the
constraint is erroneous.

/testsuite
2018-08-31  Paolo Carlini  

PR c++/84980
* g++.dg/concepts/pr84980.C: New.
Index: cp/constraint.cc
===
--- cp/constraint.cc(revision 264009)
+++ cp/constraint.cc(working copy)
@@ -1259,6 +1259,9 @@ finish_shorthand_constraint (tree decl, tree const
   if (!constr)
 return NULL_TREE;
 
+  if (error_operand_p (constr))
+return NULL_TREE;
+
   tree proto = CONSTRAINED_PARM_PROTOTYPE (constr);
   tree con = CONSTRAINED_PARM_CONCEPT (constr);
   tree args = CONSTRAINED_PARM_EXTRA_ARGS (constr);
Index: testsuite/g++.dg/concepts/pr84980.C
===
--- testsuite/g++.dg/concepts/pr84980.C (nonexistent)
+++ testsuite/g++.dg/concepts/pr84980.C (working copy)
@@ -0,0 +1,6 @@
+// { dg-do compile { target c++14 } }
+// { dg-additional-options "-fconcepts" }
+
+template concept bool C = true;  // { dg-error "has not been declared" }
+
+template struct A;

Re: [PATCH] Come up with TARGET_GET_VALID_OPTION_VALUES option hook (PR driver/83193).

2018-08-31 Thread Martin Liška

On 08/30/2018 12:16 PM, Richard Biener wrote:
> On Wed, Aug 29, 2018 at 2:47 PM Martin Liška  wrote:
>>
>> On 08/29/2018 01:06 PM, Richard Biener wrote:
>>> On Mon, Aug 27, 2018 at 12:00 PM Martin Liška  wrote:

 On 08/13/2018 03:00 PM, Martin Liška wrote:
> On 08/13/2018 02:54 PM, Ramana Radhakrishnan wrote:
>> On Mon, Aug 13, 2018 at 1:49 PM, Martin Liška  wrote:
>>> PING^1
>>>
>>> On 07/24/2018 02:05 PM, Martin Liška wrote:
 Hi.

 I'm sending updated version of the patch. It comes up with a new 
 target common hook
 that provide option completion list. It's used both in --help=target 
 and with --completion
 option. I implemented support for -match and -mtune for i386 target.

 Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>>
>> Err I don't maintain the x86 backend to review this effectively. In an
>> ideal world you would have split this into 2 patches 1 for the common
>> parts and 1 for x86 and CC'd the relevant x86 maintainers. I'm not
>> clear what you are looking for here from me :-/
>
> Sorry, I was not clear. I would like to hear from ARM's folks that 
> interface
> design is fine for them? I know a global reviewer will have to approve 
> that.

 I'm CC'ing Jakub and Richard who can help us with the new target hook 
 infrastructure.
>>>
>>> +vec
>>> +ix86_get_valid_option_values (int option_code, const char *prefix)
>>> +{
>>>
>>> prefix isn't used - why does that not fail bootstrap?
>>
>> Will add ATTRIBUTE_UNUSED.
>>
>>  It requires documentation
>>> that honoring prefix isn't required and callers have to deal with
>>
>> It's more detail described in common-target.def:
>>
>> 'The hook is used for options that have a non-trivial list of\
>>  possible option values.  OPTION_CODE is option code of opt_code\
>>  enum type.  PREFIX is used for bash completion and allows an implementation\
>>  to return more specific completion based on the prefix.  All string values\
>>  should be allocated from heap memory and consumers should release them.'
>>
>> Should I copy it to the implementation.
>>
>>> that.  IMHO that
>>> makes prefix useless?
>>
>> ARM folks requested that, they want to do a smart filtering for bash 
>> completion.
>> It was there request.
> 
> Ah, I see.  Based on your x86 example below I guess that generic code already
> does prefix handling, yes?  I think that's something that should be 
> documented,
> that is, "The result will be pruned to cases with PREFIX if not NULL" or so?

Done.

> 
>>>
>>> Unfortunately option_proposer::build_option_suggestions isn't documented
>>> so I don't see whether it only receives target options.  If not then
>>>
>>> -   add_misspelling_candidates (m_option_suggestions, option,
>>> -   opt_text);
>>> +   {
>>> + vec option_values
>>> +   = targetm_common.get_valid_option_values (i, prefix);
>>> + if (!option_values.is_empty ())
>>>
>>> this should be guarded with a check for whether this is a target
>>> option (CL_TARGET
>>> in flags).
>>
>> Good point!
>>
>>  I wonder why misspellings are to be checked for the bash
>>> completion case?
>>
>> Note that option_proposer::build_option_suggestions is shared infrastructure
>> in between bash completion and misspellings.
>>
>> 2 examples:
>>
>> $ /xgcc -B. -march=znver2 -c /tmp/empty.c
>> cc1: error: bad value (‘znver2’) for ‘-march=’ switch
>> cc1: note: valid arguments to ‘-march=’ switch are:  ... did you mean 
>> ‘znver1’?
> 
> Hmm, not very pretty ;)  If you do -march=bdver5, what will it print?
> "... did you mean 'bdver1' ... did you mean 'bdver2' ..."?

Yep, it's not ideal, I created: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87165

> 
> Anyway, sort that out with David ;)
> 
>> $  ./xgcc -B. --completion=-march=znver
>> -march=znver1
>>
>>>
>>> I also wonder why those target options could not simply be of Enum type and 
>>> thus
>>> be automatically handled?
>>
>> What a question, I asked the same one before I implemented that. It's because
>> you have modifiers like: -march=armv8.3-a+simd+crypto+nofp for aarch64 
>> target.
>> For i386, it's also manually parsed because of tables that group
>> options and various values:
>>
>> processor_target_table
>> processor_alias_table
> 
> Eh, OK.
> 
>> I'm sending updated version of patch that I'm going to test.
> 
> The middle-end changes (target hook addition) is OK.
> 
> I guess the other non-target parts as well, we can improve over the
> prettyness as followup.
> 
> Please get target maintainer approval for the rest.

Good then, I'm sending final version of the patch that I've justed
tested on x86_64-linux-gnu.

Martin

> 
> Thanks,
> Richard.
> 
>> Martin
>>
>>>
>>> Richard.
>>>
 Martin

>
> Martin
>
>>
>> Ramana

 Martin

[PATCH] rs6000: Make lrounddi2 depend on TARGET_FPRND (PR86684)

2018-08-31 Thread Segher Boessenkool

TARGET_FPRND should be on for everything ISA 2.04 and later, and
TARGET_VSX implies ISA 2.06 or later; but it is possible to disable
TARGET_FPRND (separately via -mno-fprnd, but also implicitly)
currently, and then things fall down.  This patch makes things not
fall down.

Committing to trunk, and queued up for backports.


Segher


2018-08-31  Segher Boessenkool  

PR target/86684
PR target/87149
* config/rs6000/rs6000.md (lrounddi2): Gate on TARGET_FPRND.

---
 gcc/config/rs6000/rs6000.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index 49e2090..c4ef878 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -5985,7 +5985,7 @@ (define_expand "lrounddi2"
(set (match_operand:DI 0 "gpc_reg_operand")
(unspec:DI [(match_dup 2)]
   UNSPEC_FCTID))]
-  "TARGET_HARD_FLOAT && TARGET_VSX"
+  "TARGET_HARD_FLOAT && TARGET_VSX && TARGET_FPRND"
 {
   operands[2] = gen_reg_rtx (mode);
 })
-- 
1.8.3.1

Re: Fix even more merging PIC and PIE options

2018-08-31 Thread Richard Biener

On Thu, 30 Aug 2018, Jan Hubicka wrote:

> > On Fri, 10 Aug 2018, Jan Hubicka wrote:
> > 
> > > Hi,
> > > this patch should fix merging of PIC and PIE options so we always resort
> > > to the least common denominator of the object files compiled (i.e. 
> > > linking together -fpic and -fPIE will result in -fpie binary).
> > > Note that we also use information about format of output passed by linker
> > > plugin so we will disable pic/pie when resulting binary is not 
> > > relocatable.
> > > However for shared libraries and pie we want to pick right mode that makes
> > > sense.
> > > 
> > > I wrote simple script that tries all possible options of combining two 
> > > files.
> > > Mode picked is specified in the output file.
> > > 
> > > To support targets that default to pic/pie well, I had to also hack
> > > lto_write_options to always stream what mode was used in the given file.
> > > 
> > > lto-bootstrapped/regtested x86_64-linux, OK?
> > > 
> > > Honza
> > > 
> > >   PR lto/86517
> > >   * lto-opts.c (lto_write_options): Always stream PIC/PIE mode.
> > >   * lto-wrapper.c (merge_and_complain): Fix merging of PIC/PIE
> > > Index: lto-opts.c
> > > ===
> > > --- lto-opts.c(revision 263356)
> > > +++ lto-opts.c(working copy)
> > > @@ -78,6 +78,22 @@ lto_write_options (void)
> > >&& !global_options.x_flag_openacc)
> > >  append_to_collect_gcc_options (_obstack, _p,
> > >  "-fno-openacc");
> > > +  /* Append PIC/PIE mode because its default depends on target and it is
> > > + subject of merging in lto-wrapper.  */
> > > +  if ((!global_options_set.x_flag_pic || global_options.x_flag_pic == 0)
> > 
> > why's that checked only for flag_pic?
> > Shouldn't it be consistent with flag_pie?
> > 
> > Isn't it so that setting either -f[no-]pic or -f[no-]pie fully specifies
> > the state?  So the == 0 check is superfluous?
> 
> Yep, it is superflows. I have changed it to test if at least one option is 
> set,
> re-tested and going to commit.
> 
> OK for release branch after some soaking?

OK after Cauldron if nothing bad happened.

Richard.

> Honza
> 
> > 
> > The rest of the patch looks OK to me.
> > 
> > Thanks,
> > Richard.
> > 
> > > +  && !global_options_set.x_flag_pie)
> > > +{
> > > +   append_to_collect_gcc_options (_obstack, _p,
> > > +   global_options.x_flag_pic == 2
> > > +   ? "-fPIC"
> > > +   : global_options.x_flag_pic == 1
> > > +   ? "-fpic"
> > > +   : global_options.x_flag_pie == 2
> > > +   ? "-fPIE"
> > > +   : global_options.x_flag_pie == 1
> > > +   ? "-fpie"
> > > +   : "-fno-pie");
> > > +}
> > >  
> > >/* Append options from target hook and store them to offload_lto 
> > > section.  */
> > >if (lto_stream_offload_p)
> > > Index: lto-wrapper.c
> > > ===
> > > --- lto-wrapper.c (revision 263356)
> > > +++ lto-wrapper.c (working copy)
> > > @@ -408,6 +408,11 @@ merge_and_complain (struct cl_decoded_op
> > >   It is a common mistake to mix few -fPIC compiled objects into 
> > > otherwise
> > >   non-PIC code.  We do not want to build everything with PIC then.
> > >  
> > > + Similarly we merge PIE options, however in addition we keep
> > > +  -fPIC + -fPIE = -fPIE
> > > +  -fpic + -fPIE = -fpie
> > > +  -fPIC/-fpic + -fpie = -fpie
> > > +
> > >   It would be good to warn on mismatches, but it is bit hard to do as
> > >   we do not know what nothing translates to.  */
> > >  
> > > @@ -415,11 +420,38 @@ merge_and_complain (struct cl_decoded_op
> > >  if ((*decoded_options)[j].opt_index == OPT_fPIC
> > >  || (*decoded_options)[j].opt_index == OPT_fpic)
> > >{
> > > - if (!pic_option
> > > - || (pic_option->value > 0) != ((*decoded_options)[j].value > 0))
> > > -   remove_option (decoded_options, j, decoded_options_count);
> > > - else if (pic_option->opt_index == OPT_fPIC
> > > -  && (*decoded_options)[j].opt_index == OPT_fpic)
> > > + /* -fno-pic in one unit implies -fno-pic everywhere.  */
> > > + if ((*decoded_options)[j].value == 0)
> > > +   j++;
> > > + /* If we have no pic option or merge in -fno-pic, we still may turn
> > > +existing pic/PIC mode into pie/PIE if -fpie/-fPIE is present.  */
> > > + else if ((pic_option && pic_option->value == 0)
> > > +  || !pic_option)
> > > +   {
> > > + if (pie_option)
> > > +   {
> > > + bool big = (*decoded_options)[j].opt_index == OPT_fPIC
> > > +&& pie_option->opt_index == OPT_fPIE;
> > > + (*decoded_options)[j].opt_index = big ? OPT_fPIE : OPT_fpie;
> > > + if

[PATCH] Adjust c_getstr/c_strlen to new STRING_CST semantic

2018-08-31 Thread Bernd Edlinger

Hi,


when re-testing the new STRING_CST semantic patch here:
https://gcc.gnu.org/ml/gcc-patches/2018-08/msg01569.html

I noticed that the (my) fix for PR 87053 does still use
semantic properties of the STRING_CST that is not compatible
with the new proposed STRING_CST semantics.

That means that c_strlen needs to handle the case
where strings are not zero terminated.

When this is fixed, fortran.dg/pr45636.f90 starts to regress,
because the check in gimple_fold_builtin_memory_op here:

   if (tree_fits_uhwi_p (len)
   && compare_tree_int (len, MOVE_MAX) <= 0
   /* ???  Don't transform copies from strings with known length this
  confuses the tree-ssa-strlen.c.  This doesn't handle
  the case in gcc.dg/strlenopt-8.c which is XFAILed for that
  reason.  */
   && !c_strlen (src, 2))
  
does now return NULL_TREE, because the fortran string is not null terminated.
However that allows the folding which makes the fortran test case fail.

I fixed that by pulling in the c_getstr patch again, and use
it to make another exception for string constants without any embedded NUL.
That is what tree-ssa-forwprop.c can handle, and should make this work in
fortran again.


Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
Is it OK for trunk?


Thanks
Bernd.
2018-08-30  Bernd Edlinger  

	* builtins.c (c_strlen): Handle not zero terminated STRING_CSTs
	correctly.
	* fold-const.c (c_getstr): Fix function comment.  Remove unused third
	argument.  Fix range checks.
	* fold-const.h (c_getstr): Adjust protoype.
	* gimple-fold.c (gimple_fold_builtin_memory_op): Avoid folding when
	string is constant but contains no NUL byte.

diff -Npur gcc/builtins.c gcc/builtins.c
--- gcc/builtins.c	2018-08-30 08:21:13.0 +0200
+++ gcc/builtins.c	2018-08-30 21:46:11.155211333 +0200
@@ -604,12 +604,12 @@ c_strlen (tree src, int only_value, unsi
  In that case, the elements of the array after the terminating NUL are
  all NUL.  */
   HOST_WIDE_INT strelts = TREE_STRING_LENGTH (src);
-  strelts = strelts / eltsize - 1;
+  strelts = strelts / eltsize;
 
   if (!tree_fits_uhwi_p (memsize))
 return NULL_TREE;
 
-  HOST_WIDE_INT maxelts = tree_to_uhwi (memsize) / eltsize - 1;
+  HOST_WIDE_INT maxelts = tree_to_uhwi (memsize) / eltsize;
 
   /* PTR can point to the byte representation of any string type, including
  char* and wchar_t*.  */
@@ -617,10 +617,6 @@ c_strlen (tree src, int only_value, unsi
 
   if (byteoff && TREE_CODE (byteoff) != INTEGER_CST)
 {
-  /* For empty strings the result should be zero.  */
-  if (maxelts == 0)
-	return ssize_int (0);
-
   /* The code below works only for single byte character types.  */
   if (eltsize != 1)
 	return NULL_TREE;
@@ -632,9 +628,13 @@ c_strlen (tree src, int only_value, unsi
   unsigned len = string_length (ptr, eltsize, strelts);
 
   /* Return when an embedded null character is found or none at all.  */
-  if (len < strelts || len > maxelts)
+  if (len + 1 < strelts || len >= maxelts)
 	return NULL_TREE;
 
+  /* For empty strings the result should be zero.  */
+  if (len == 0)
+	return ssize_int (0);
+
   /* We don't know the starting offset, but we do know that the string
 	 has no internal zero bytes.  If the offset falls within the bounds
 	 of the string subtract the offset from the length of the string,
@@ -644,7 +644,7 @@ c_strlen (tree src, int only_value, unsi
   offsave = fold_convert (ssizetype, offsave);
   tree condexp = fold_build2_loc (loc, LE_EXPR, boolean_type_node, offsave,
   build_int_cst (ssizetype, len));
-  tree lenexp = size_diffop_loc (loc, ssize_int (strelts), offsave);
+  tree lenexp = size_diffop_loc (loc, ssize_int (len), offsave);
   return fold_build3_loc (loc, COND_EXPR, ssizetype, condexp, lenexp,
 			  build_zero_cst (ssizetype));
 }
@@ -663,7 +663,7 @@ c_strlen (tree src, int only_value, unsi
 
   /* If the offset is known to be out of bounds, warn, and call strlen at
  runtime.  */
-  if (eltoff < 0 || eltoff > maxelts)
+  if (eltoff < 0 || eltoff >= maxelts)
 {
  /* Suppress multiple warnings for propagated constant strings.  */
   if (only_value != 2
@@ -691,9 +691,9 @@ c_strlen (tree src, int only_value, unsi
   unsigned len = string_length (ptr + eltoff * eltsize, eltsize,
 strelts - eltoff);
 
-  /* Don't know what to return if there was no zero termination. 
+  /* Don't know what to return if there was no zero termination.
  Ideally this would turn into a gcc_checking_assert over time.  */
-  if (len > maxelts - eltoff)
+  if (len >= maxelts - eltoff)
 return NULL_TREE;
 
   return ssize_int (len);
diff -Npur gcc/fold-const.c gcc/fold-const.c
--- gcc/fold-const.c	2018-08-30 08:21:13.0 +0200
+++ gcc/fold-const.c	2018-08-30 21:03:12.612460165 +0200
@@ -14576,23 +14576,20 @@ fold_build_pointer_plus_hwi_loc (locatio
 /* Return a pointer P to a

Re: [PATCHv2] Call braced_list_to_string after array size is fixed

2018-08-31 Thread Bernd Edlinger

On 08/30/18 09:07, Bernd Edlinger wrote:
> On 08/30/18 06:34, Jason Merrill wrote:
>> On 08/24/2018 03:52 PM, Bernd Edlinger wrote:
>>> this updated patch fixes one regression with current trunk due
>>> to a new test case.  Sorry for the confusion.
>>>
>>> The change to the previous version is:
>>> 1) the check to avoid folding on empty char arrays is restored.
>>> 2) A null-termination character is added except when the string is full 
>>> length.
>>
>>> -  && TYPE_STRING_FLAG (TREE_TYPE (valtype)))
>>
>>> +  tree typ1 = TYPE_MAIN_VARIANT (TREE_TYPE (type));
>>> +  if (typ1 == char_type_node
>>> +  || typ1 == signed_char_type_node
>>> +  || typ1 == unsigned_char_type_node)
>>
>>
>> Why stop using TYPE_STRING_FLAG?
>>
> 
> No longer sure, I will try it with TYPE_STRING_FLAG.
> 

This is an update that uses TYPE_STRING_FLAG.
It does not seem to make any difference, though.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.


Is it OK for trunk?


Thanks
Bernd.
c-family:
2018-08-24  Bernd Edlinger  

	* c-common.c (braced_list_to_string): Remove eval parameter.
	Add some more checks.  Always create zero-terminated STRING_CST.
	* c-common.h (braced_list_to_string): Adjust prototype.

c:
2018-08-24  Bernd Edlinger  

	* c-decl.c (finish_decl): Call braced_list_to_string here ...
	* c-parser.c (c_parser_declaration_or_fndef): ... instead of here.


cp:
2018-08-24  Bernd Edlinger  

	* decl.c (eval_check_narrowing): Remove.
	(check_initializer): Move call to braced_list_to_string from here ...
	* typeck2.c (store_init_value): ... to here.
	(digest_init_r): Remove handing of signed/unsigned char strings.

testsuite:
2018-08-24  Bernd Edlinger  

	* c-c++-common/array-init.c: New test.
	* g++.dg/init/string2.C: Remove xfail.

diff -Npur gcc/c/c-decl.c gcc/c/c-decl.c
--- gcc/c/c-decl.c	2018-08-21 08:17:41.0 +0200
+++ gcc/c/c-decl.c	2018-08-30 12:03:12.508230259 +0200
@@ -5031,6 +5031,12 @@ finish_decl (tree decl, location_t init_
   relayout_decl (decl);
 }
 
+  if (TREE_CODE (type) == ARRAY_TYPE
+  && TYPE_STRING_FLAG (TREE_TYPE (type))
+  && DECL_INITIAL (decl)
+  && TREE_CODE (DECL_INITIAL (decl)) == CONSTRUCTOR)
+DECL_INITIAL (decl) = braced_list_to_string (type, DECL_INITIAL (decl));
+
   if (VAR_P (decl))
 {
   if (init && TREE_CODE (init) == CONSTRUCTOR)
diff -Npur gcc/c/c-parser.c gcc/c/c-parser.c
--- gcc/c/c-parser.c	2018-08-21 08:17:41.0 +0200
+++ gcc/c/c-parser.c	2018-08-24 20:03:12.511230218 +0200
@@ -2127,15 +2127,6 @@ c_parser_declaration_or_fndef (c_parser
 	  if (d != error_mark_node)
 		{
 		  maybe_warn_string_init (init_loc, TREE_TYPE (d), init);
-
-		  /* Try to convert a string CONSTRUCTOR into a STRING_CST.  */
-		  tree valtype = TREE_TYPE (init.value);
-		  if (TREE_CODE (init.value) == CONSTRUCTOR
-		  && TREE_CODE (valtype) == ARRAY_TYPE
-		  && TYPE_STRING_FLAG (TREE_TYPE (valtype)))
-		if (tree str = braced_list_to_string (valtype, init.value))
-		  init.value = str;
-
 		  finish_decl (d, init_loc, init.value,
 			   init.original_type, asm_name);
 		}
diff -Npur gcc/c-family/c-common.c gcc/c-family/c-common.c
--- gcc/c-family/c-common.c	2018-08-17 05:02:11.0 +0200
+++ gcc/c-family/c-common.c	2018-08-24 20:03:12.512230205 +0200
@@ -8511,39 +8511,28 @@ maybe_add_include_fixit (rich_location *
 }
 
 /* Attempt to convert a braced array initializer list CTOR for array
-   TYPE into a STRING_CST for convenience and efficiency.  When non-null,
-   use EVAL to attempt to evalue constants (used by C++).  Return
-   the converted string on success or null on failure.  */
+   TYPE into a STRING_CST for convenience and efficiency.  Return
+   the converted string on success or the original ctor on failure.  */
 
 tree
-braced_list_to_string (tree type, tree ctor, tree (*eval)(tree, tree))
+braced_list_to_string (tree type, tree ctor)
 {
-  unsigned HOST_WIDE_INT nelts = CONSTRUCTOR_NELTS (ctor);
+  if (!tree_fits_uhwi_p (TYPE_SIZE_UNIT (type)))
+return ctor;
 
   /* If the array has an explicit bound, use it to constrain the size
  of the string.  If it doesn't, be sure to create a string that's
  as long as implied by the index of the last zero specified via
  a designator, as in:
const char a[] = { [7] = 0 };  */
-  unsigned HOST_WIDE_INT maxelts = HOST_WIDE_INT_M1U;
-  if (tree size = TYPE_SIZE_UNIT (type))
-{
-  if (tree_fits_uhwi_p (size))
-	{
-	  maxelts = tree_to_uhwi (size);
-	  maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type)));
+  unsigned HOST_WIDE_INT maxelts = tree_to_uhwi (TYPE_SIZE_UNIT (type));
+  maxelts /= tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (type)));
 
-	  /* Avoid converting initializers for zero-length arrays.  */
-	  if (!maxelts)
-	return NULL_TREE;
-	}
-}
-  else if (!nelts)
-/* Avoid handling the undefined/erroneous case of an empty
-   initializer for an arrays with unspecified bound.  */
-return

58 matches

Mail list logo