Do not call null register_common in emutls

2020-02-05 Thread Alexandre Oliva
Thread-local variables with DECL_COMMON trigger an internal compiler
error on targets that use emulated TLS without register_common, when
we attempt to expand a call to the NULL register_common, with
testcases as simple as gcc.dg/tls/emutls-2.c.

The documentation states that, on such targets, common variables would
fall back to explicitly initialized.  This patch rearranges the code
that deals with initialization of common and non-common variables,
complementing code that is already in place to detect
register_common-less targets.


Regstrapped on x86_64-linux-gnu, tested on an affected target.
Ok to install?


for  gcc/ChangeLog

* tree-emutls.c (new_emutls_decl, emutls_common_1): Complete
handling of register_common-less targets.

for  gcc/testsuite/ChangeLog

* gcc.dg/tls/emutls-3.c: New, combining emutls-2.c and
thr-init-2.c into an execution test with explicitly common
variables.
---
 testsuite/gcc.dg/tls/emutls-3.c |   26 ++
 tree-emutls.c   |4 ++--
 2 files changed, 28 insertions(+), 2 deletions(-)
 create mode 100644 testsuite/gcc.dg/tls/emutls-3.c

diff --git gcc/testsuite/gcc.dg/tls/emutls-3.c 
gcc/testsuite/gcc.dg/tls/emutls-3.c
new file mode 100644
index ..e062ba8
--- /dev/null
+++ gcc/testsuite/gcc.dg/tls/emutls-3.c
@@ -0,0 +1,26 @@
+/* { dg-do run } */
+/* { dg-require-effective-target tls } */
+/* { dg-require-effective-target global_constructor } */
+/* { dg-options "-O2" } */
+/* { dg-require-effective-target tls_runtime } */
+/* { dg-add-options tls } */
+
+__thread int i __attribute__((common));
+
+extern void abort (void);
+
+int test_code(int b)
+{
+  i += b ;
+  return i;
+}
+
+int main (int ac, char *av[])
+{
+  int a = test_code(test_code(1));
+  
+  if ((a != 2) || (i != 2))
+abort () ;
+  
+  return 0;
+}
diff --git gcc/tree-emutls.c gcc/tree-emutls.c
index e924124..44755dd4 100644
--- gcc/tree-emutls.c
+++ gcc/tree-emutls.c
@@ -322,7 +322,7 @@ new_emutls_decl (tree decl, tree alias_of)
  control structure with size and alignment information.  Initialization
  of COMMON block variables happens elsewhere via a constructor.  */
   if (!DECL_EXTERNAL (to)
-  && (!DECL_COMMON (to)
+  && (!DECL_COMMON (to) || !targetm.emutls.register_common
   || (DECL_INITIAL (decl)
   && DECL_INITIAL (decl) != error_mark_node)))
 {
@@ -360,7 +360,7 @@ emutls_common_1 (tree tls_decl, tree control_decl, tree 
*pstmts)
   tree x;
   tree word_type_node;
 
-  if (! DECL_COMMON (tls_decl)
+  if (!DECL_COMMON (tls_decl) || !targetm.emutls.register_common
   || (DECL_INITIAL (tls_decl)
  && DECL_INITIAL (tls_decl) != error_mark_node))
 return;


-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


[FYI] [Ada] Initialize barrier_cache for ARM EH ABI compliance

2020-02-05 Thread Alexandre Oliva
The ARM Exception Handling ABI requires personality functions in
phase1 to initialize barrier_cache before returning
_URC_HANDLER_FOUND, and we don't.

Although our own ARM personality function does not use barrier_cache
at all, other languages' ARM personality functions, during phase2, are
allowed and expected to test barrier_cache.sp to check whether the
handler frame was reached, which implies that personality functions is
in charge of the frame, and the remaining fields of barrier_cache hold
whatever values it put there in phase1.

Since we did not set barrier_cache.sp, an earlier exception, already
handled by a non-Ada handler and then released, may have its storage
reused for a new exception, that phase1 matches to an Ada frame, but
if that leaves barrier_cache.sp alone, the phase2 personality function
that handled the earlier exception, upon reaching the frame that
handled the earlier exception, may believe the information in
barrier_cache applies to the current exception.  The C++ personality
function, for example, would take the information in the barrier_cache
and end up activating the handler that handled the earlier exception:

  try {
throw 1;
  } catch (int i) {
std::cout << "caught " << i << " by c++" << std::endl;
  }
  raise_ada_exception (); // might loop back to the handler above


Regstrapped on x86_64-linux-gnu (no change expected), and tested on
ARM-based systems that use the ARM EH ABI.  I'm checking this in.


for  gcc/ada/ChangeLog

* raise-gcc.c (personality_body) [__ARM_EABI_UNWINDER__]:
Initialize barrier_cache.sp when ending phase1.
---
 ada/raise-gcc.c |   10 ++
 1 file changed, 10 insertions(+)

diff --git gcc/ada/raise-gcc.c gcc/ada/raise-gcc.c
index 1ba8af1..3b6c21fc 100644
--- gcc/ada/raise-gcc.c
+++ gcc/ada/raise-gcc.c
@@ -1211,6 +1211,16 @@ personality_body (_Unwind_Action uw_phases,
}
   else
{
+#ifdef __ARM_EABI_UNWINDER__
+ /* Though we do not use this field ourselves, initializing
+it is required by the ARM EH ABI before a personality
+function in phase1 returns _URC_HANDLER_FOUND, so that
+any personality function can use it in phase2 to test
+whether the handler frame was reached.  */
+ uw_exception->barrier_cache.sp
+   = _Unwind_GetGR (uw_context, UNWIND_STACK_REG);
+#endif
+
 #ifndef CERT
  /* Trigger the appropriate notification routines before the second
 phase starts, when the stack is still intact.  First install what

-- 
Alexandre Oliva, freedom fighter   he/him   https://FSFLA.org/blogs/lxo
Free Software Evangelist   Stallman was right, but he's left :(
GNU Toolchain EngineerFSMatrix: It was he who freed the first of us
FSF & FSFLA board memberThe Savior shall return (true);


Re: [PATCH] i386: Make xmm16-xmm31 call used even in ms ABI

2020-02-05 Thread Jakub Jelinek
On Thu, Feb 06, 2020 at 01:00:36AM +, JonY wrote:
> On 2/4/20 11:42 AM, Jakub Jelinek wrote:
> > Hi!
> > 
> > On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
> >> I guess that Comment #9 patch form the PR should be trivially correct,
> >> but althouhg it looks obvious, I don't want to propose the patch since
> >> I have no means of testing it.
> > 
> > I don't have means of testing it either.
> > https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
> > is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
> > 128-bits only) are call preserved.
> > 
> > Jonathan, could you please test this if it is sufficient to just change
> > CALL_USED_REGISTERS or if e.g. something in the pro/epilogue needs tweaking
> > too?  Thanks.
> 
> Is this patch testing still required? I just got back from traveling.

Yes, our reading of the MS ABI docs show that xmm16-31 are to be call used
(not preserved over calls), while in gcc they are currently handled as
preserved across the calls.

Jakub



Re: [PATCH v2] c++: Fix ICE with lambda in operator function [PR93597]

2020-02-05 Thread Jason Merrill

On 2/5/20 5:04 PM, Marek Polacek wrote:

On Wed, Feb 05, 2020 at 04:40:50PM -0500, Jason Merrill wrote:

On 2/5/20 4:31 PM, Marek Polacek wrote:

If we are going to use get_first_fn let's make sure we operate
on is_overloaded_fn, as the rest of the codebase does.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

PR c++/93597 - ICE with lambda in operator function.
* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
---
   gcc/cp/name-lookup.c | 13 -
   gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C |  8 
   2 files changed, 16 insertions(+), 5 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 2447166a444..aa539926332 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7624,11 +7624,14 @@ maybe_save_operator_binding (tree e)
 if (!fns && (fns = op_unqualified_lookup (fnname)))
   {
-  tree fn = get_first_fn (fns);
-  if (DECL_CLASS_SCOPE_P (fn))
-   /* We don't need to remember class-scope functions, normal unqualified
-  lookup will find them again.  */
-   return;
+  if (is_overloaded_fn (fns))
+   {
+ tree fn = get_first_fn (fns);
+ if (DECL_CLASS_SCOPE_P (fn))
+   /* We don't need to remember class-scope functions, normal
+  unqualified lookup will find them again.  */
+   return;
+   }


I think we want to return early in this testcase, too, i.e. if lookup finds
any class-scope declaration.


Ah okay, how about this, if it passes the usual testing?

-- >8 --
If we are going to use get_first_fn let's make sure we operate on
is_overloaded_fn, as the rest of the codebase does, and if lookup finds
any class-scope declaration, return early too.

PR c++/93597 - ICE with lambda in operator function.
* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
---
  gcc/cp/name-lookup.c | 8 
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C | 8 
  2 files changed, 12 insertions(+), 4 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 2447166a444..8853c39f0eb 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7624,10 +7624,10 @@ maybe_save_operator_binding (tree e)
  
if (!fns && (fns = op_unqualified_lookup (fnname)))

  {
-  tree fn = get_first_fn (fns);
-  if (DECL_CLASS_SCOPE_P (fn))
-   /* We don't need to remember class-scope functions, normal unqualified
-  lookup will find them again.  */
+  tree d = is_overloaded_fn (fns) ? get_first_fn (fns) : fns;
+  if (DECL_CLASS_SCOPE_P (d))


We likely want to guard this with DECL_P (d).  OK with that change.

Jason


+   /* We don't need to remember class-scope functions or declarations,
+  normal unqualified lookup will find them again.  */
return;
  
bindings = tree_cons (fnname, fns, bindings);

diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
new file mode 100644
index 000..257d9c7cdfd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
@@ -0,0 +1,8 @@
+// PR c++/93597 - ICE with lambda in operator function.
+// { dg-do compile { target c++11 } }
+
+template 
+struct S {
+  using T ::operator<;
+  void operator==(T x) { [x] { 0 < x; }; }
+};

base-commit: d10cddeaad2a315c114063b7c1ff11c6a356ab65





Re: [PATCH, rs6000]: mark clobber for registers changed by untpyed_call

2020-02-05 Thread Jiufu Guo
Segher Boessenkool  writes:

> On Wed, Feb 05, 2020 at 09:53:27PM +0800, Jiufu Guo wrote:
>> As PR93047 said, __builtin_apply/__builtin_return does not work well with
>> -frename-registers.  This is caused by return register(e.g. r3) is used to
>> rename another register, before return register is stored to stack.
>> 
>> This patch fix this issue by emitting clobber for those egisters which
>> maybe changed by untyped call.
>
> Yeah.  untyped_call does
>
>   /* The optimizer does not know that the call sets the function value
>  registers we stored in the result block.  We avoid problems by
>  claiming that all hard registers are used and clobbered at this
>  point.  */
>   emit_insn (gen_blockage ());
>
> but blockage does not say registers are used and clobbered:
>
> @cindex @code{blockage} instruction pattern
> @item @samp{blockage}
> This pattern defines a pseudo insn that prevents the instruction
> scheduler and other passes from moving instructions and using register
> equivalences across the boundary defined by the blockage insn.
> This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.
>
> Many archs have this same implementation of untyped_call (and of
> blockage, too).  It all just works by luck (or it doesn't work).
>
> What we have is:
>
>   emit_call_insn (gen_call (operands[0], const0_rtx, const0_rtx));
>
>   for (i = 0; i < XVECLEN (operands[2], 0); i++)
> {
>   rtx set = XVECEXP (operands[2], 0, i);
>   emit_move_insn (SET_DEST (set), SET_SRC (set));
> }
>
> ... and nothing in the rtl stream says that those return registers are
> actually set by that call.  Maybe we should use gen_call_value?  Can we
> ever be asked to return more than a single thing here?
I was also thinking about using "gen_call_value" or "emit_clobber (r3)"
which could generate rtl: "%3:DI=call [foo]" or "call [foo]; clobber
r3".  This could tell optimizer that %3 is changed.  While there are
potential issues that untyped_call may change other registers.  So, mark
clobber for all touched registers maybe more safe.

>
> Some trivial patch comments:
>
>> gcc/
>> 2020-02-05  Jiufu Guo  
>> 
>>  PR target/93047
>>  * config/rs6000/rs6000.md (untyped_call): add emit_clobber.
>
> "Add", capital.
Thanks,
>
>> gcc/testsuite
>> 2020-02-05  Jiufu Guo  
>> 
>>  PR target/93047
>>  * gcc.dg/torture/stackalign/builtin-return-2.c: New case.
>
> "New test case."  (And there is trailing whitespace here; Git warns
> about that, so this won't happen much in the future :-) )
Oh, get it, thanks. The withspace is after this line.

Jiufu
>
>
> Segher


Re: [PATCH] i386: Make xmm16-xmm31 call used even in ms ABI

2020-02-05 Thread JonY
On 2/4/20 11:42 AM, Jakub Jelinek wrote:
> Hi!
> 
> On Tue, Feb 04, 2020 at 11:16:06AM +0100, Uros Bizjak wrote:
>> I guess that Comment #9 patch form the PR should be trivially correct,
>> but althouhg it looks obvious, I don't want to propose the patch since
>> I have no means of testing it.
> 
> I don't have means of testing it either.
> https://docs.microsoft.com/en-us/cpp/build/x64-calling-convention?view=vs-2019
> is quite explicit that [xyz]mm16-31 are call clobbered and only xmm6-15 (low
> 128-bits only) are call preserved.
> 
> Jonathan, could you please test this if it is sufficient to just change
> CALL_USED_REGISTERS or if e.g. something in the pro/epilogue needs tweaking
> too?  Thanks.

Is this patch testing still required? I just got back from traveling.



signature.asc
Description: OpenPGP digital signature


[committed] avoid pedantic warning in C++ 98 mode: comma at end of enumerator list

2020-02-05 Thread Martin Sebor

I removed the trailing comma and (after a few false starts) managed
to push the change in commit
  r10-6466-g297aa668293d55ffe100d810e92fbe592f262557.

I got the error below for my first few attempts.  The message had
the expected format so I wasn't sure what the problem was until
I removed the "[-Wpedantic]" part at the end.  Was it looking for
a bug id and getting confused?

Martin

remote: *** Invalid revision history for commit 
1475ba69fd77b6c22b10e76ad94fa487d931dabd:

remote: *** The first line should be the subject of the commit,
remote: *** followed by an empty line.
remote: ***
remote: *** Below are the first few lines of the revision history:
remote: *** | Remove trailing comma to avoid pedantic warning in C++ 98 
mode:

remote: *** |   comma at end of enumerator list [-Wpedantic]
remote: ***
remote: *** Please amend the commit's revision history and try again.
remote: error: hook declined to update refs/heads/master


Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-05 Thread Martin Sebor

On 2/4/20 7:35 AM, Richard Biener wrote:



...

Jakub/Richi, comments on this hunk?


+ tree ref = TREE_OPERAND (TREE_OPERAND (arg, 0), 0);
+ tree off = TREE_OPERAND (arg, 1);
+ if ((TREE_CODE (ref) == PARM_DECL || VAR_P (ref))
+ && (!DECL_EXTERNAL (ref)
+ || !array_at_struct_end_p (arg)))
+   {

I think you'd want decl_binds_to_current_def_p (ref) instead of !DECL_EXTERNAL.


I've made the change.


Since 'arg' is originally a pointer array_at_struct_end_p is
meaningless here since
that's about the structure of a reference while the pointer is just a
value.


array_at_struct_end_p handles MEM_REF by looking at the argument
(i.e., at the DECL when it is one), so its use here avoids DECLs
with flexible arrays but allows others.  In other words, I don't
want to exclude MEM_REFs to a in:

  extern char a[4];

just because a is extern.


So if
you're concerned the objects size might not be as it looks like then you have to
rely on decl_binds_to_current_def_p only.


I'm only concerned about sizes of extern objects of struct types
with flexible array members.  I believe others are handled fine.


You also shouldn't use 'off' natively
in the code below but use mem_ref_offset to access the embedded offset
which is to be interpreted as signed integer (it's a pointer as you use it).
You compare it against an unsigned size...


I've changed it in the latest revision of the patch.

Martin


Re: [PATCH] adjust object size computation for union accesses and PHIs (PR 92765)

2020-02-05 Thread Martin Sebor

On 2/3/20 11:44 AM, Jeff Law wrote:

On Fri, 2020-01-31 at 12:04 -0700, Martin Sebor wrote:

Attached is a reworked patch since the first one didn't go far
enough to solve the major problems.  The new solution relies on
get_range_strlen_dynamic the same way as the sprintf optimization,
and does away with the determine_min_objsize function and calling
compute_builtin_object_size.

To minimize testsuite fallout I extended get_range_strlen to handle
a couple more kinds expressions(*), but I still had to xfail and
disable a few tests that were relying on being able to use the type
of the destination object as the upper bound on the string length.

Tested on x86_64-linux.

Martin

[*] With all the issues around MEM_REFs and types this change needs
extra scrutiny.  I'm still not sure I fully understand what can and
what cannot be safely relied on at this level.

On 1/15/20 6:18 AM, Martin Sebor wrote:

The strcmp optimization newly introduced in GCC 10 relies on
the size of the smallest referenced array object to determine
whether the function can return zero.  When the size of
the object is smaller than the length of the other string
argument the optimization folds the equality to false.

The bug report has identified a couple of problems here:
1) when the access to the array object is via a pointer to
a (possibly indirect) member of a union, in GIMPLE the pointer
may actually point to a different member than the one in
the original source code.  Thus the size of the array may
appear to be smaller than in the source code which can then
result in the optimization being invalid.
2) when the pointer in the access may point to two or more
arrays of different size (i.e., it's the result of a PHI),
assuming it points to the smallest of them can also lead
to an incorrect result when the optimization is applied.

The attached patch adjusts the optimization to 1) avoid making
any assumptions about the sizes of objects accessed via union
types, and b) use the size of the largest object in PHI nodes.

Tested on x86_64-linux.

Martin



PR tree-optimization/92765 - wrong code for strcmp of a union member

gcc/ChangeLog:

 PR tree-optimization/92765
 * gimple-fold.c (get_range_strlen_tree): Handle MEM_REF and PARM_DECL.
 * tree-ssa-strlen.c (compute_string_length): Remove.
 (determine_min_objsize): Remove.
 (get_len_or_size): Add an argument.  Call get_range_strlen_dynamic.
 Avoid using type size as the upper bound on string length.
 (handle_builtin_string_cmp): Add an argument.  Adjust.
 (strlen_check_and_optimize_call): Pass additional argument to
 handle_builtin_string_cmp.

gcc/testsuite/ChangeLog:

 PR tree-optimization/92765
 * g++.dg/tree-ssa/strlenopt-1.C: New test.
 * g++.dg/tree-ssa/strlenopt-2.C: New test.
 * gcc.dg/Warray-bounds-58.c: New test.
 * gcc.dg/Wrestrict-20.c: Avoid a valid -Wformat-overflow.
 * gcc.dg/Wstring-compare.c: Xfail a test.
 * gcc.dg/strcmpopt_2.c: Disable tests.
 * gcc.dg/strcmpopt_4.c: Adjust tests.
 * gcc.dg/strcmpopt_10.c: New test.
 * gcc.dg/strlenopt-69.c: Disable tests.
 * gcc.dg/strlenopt-92.c: New test.
 * gcc.dg/strlenopt-93.c: New test.
 * gcc.dg/strlenopt.h: Declare calloc.
 * gcc.dg/tree-ssa/pr92056.c: Xfail tests until pr93518 is resolved.
 * gcc.dg/tree-ssa/builtin-sprintf-warn-23.c: Correct test (pr93517).

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index ed225922269..d70ac67e1ca 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1280,7 +1280,7 @@ get_range_strlen_tree (tree arg, bitmap *visited, 
strlen_range_kind rkind,
c_strlen_data *pdata, unsigned eltsize)
  {
gcc_assert (TREE_CODE (arg) != SSA_NAME);
-
+
/* The length computed by this invocation of the function.  */
tree val = NULL_TREE;
  
@@ -1422,7 +1422,42 @@ get_range_strlen_tree (tree arg, bitmap *visited, strlen_range_kind rkind,

  type about the length here.  */
   tight_bound = true;
 }
-  else if (VAR_P (arg))
+  else if (TREE_CODE (arg) == MEM_REF
+  && TREE_CODE (TREE_TYPE (arg)) == ARRAY_TYPE
+  && TREE_CODE (TREE_TYPE (TREE_TYPE (arg))) == INTEGER_TYPE
+  && TREE_CODE (TREE_OPERAND (arg, 0)) == ADDR_EXPR)
+   {
+ /* Handle a MEM_REF into a DECL accessing an array of integers,
+being conservative about references to extern structures with
+flexible array members that can be initialized to arbitrary
+numbers of elements as an extension (static structs are okay).
+FIXME: Make this less conservative -- see
+component_ref_size in tree.c.  */

I think it's generally been agreed that we can look at sizes of _DECL
nodes and this code doesn't look like this walks backwards through
casts or anything like that.  So the worry would be 

[COMMITTED] c++: Fix decltype of empty pack expansion of parm.

2020-02-05 Thread Jason Merrill
In unevaluated context, we only substitute a single PARM_DECL, not the
entire chain, but the handling of an empty pack expansion was missing that
check.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/93140
* pt.c (tsubst_decl) [PARM_DECL]: Check cp_unevaluated_operand in
handling of TREE_CHAIN for empty pack.
---
 gcc/cp/pt.c |  2 +-
 gcc/testsuite/g++.dg/cpp0x/variadic-parm1.C | 17 +
 2 files changed, 18 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/variadic-parm1.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 40ff3c3a089..01bade85cdf 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -14076,7 +14076,7 @@ tsubst_decl (tree t, tree args, tsubst_flags_t complain)
 
 /* Zero-length parameter packs are boring. Just substitute
into the chain.  */
-if (len == 0)
+   if (len == 0 && !cp_unevaluated_operand)
   RETURN (tsubst (TREE_CHAIN (t), args, complain,
  TREE_CHAIN (t)));
   }
diff --git a/gcc/testsuite/g++.dg/cpp0x/variadic-parm1.C 
b/gcc/testsuite/g++.dg/cpp0x/variadic-parm1.C
new file mode 100644
index 000..4300c781400
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/variadic-parm1.C
@@ -0,0 +1,17 @@
+// PR c++/93140
+// { dg-do compile { target c++11 } }
+
+int
+bar ()
+{
+  return 42;
+}
+
+template 
+void foo (R... r, decltype (bar (r...)) x = 0) {}
+
+int
+main ()
+{
+  foo (3);
+}

base-commit: 5a8ad97b6e4823d4ded00a3ce8d80e4bf93368d4
-- 
2.18.1



Re: [PATCH RFA] cgraph: A COMDAT decl always has non-zero address.

2020-02-05 Thread Jan Hubicka
> We should be able to assume that a template instantiation or other COMDAT
> has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets
> DECL_WEAK and we haven't yet decided to emit a definition in this
> translation unit.
> 
> Tested x86_64-pc-linux-gnu, OK for trunk?
> 
>   PR c++/92003
>   * symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has
>   non-zero address even if weak and not yet defined.
OK, thanks!

Honza
> ---
>  gcc/symtab.c| 10 +-
>  gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C | 17 +
>  2 files changed, 22 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C
> 
> diff --git a/gcc/symtab.c b/gcc/symtab.c
> index eae891ab211..a879c095a1a 100644
> --- a/gcc/symtab.c
> +++ b/gcc/symtab.c
> @@ -2058,22 +2058,22 @@ symtab_node::nonzero_address ()
>   bind to NULL. This is on by default on embedded targets only.
>  
>   Otherwise all non-WEAK symbols must be defined and thus non-NULL or
> - linking fails.  Important case of WEAK we want to do well are comdats.
> - Those are handled by later check for definition.
> + linking fails.  Important case of WEAK we want to do well are comdats,
> + which also must be defined somewhere.
>  
>   When parsing, beware the cases when WEAK attribute is added later.  */
> -  if (!DECL_WEAK (decl)
> +  if ((!DECL_WEAK (decl) || DECL_COMDAT (decl))
>&& flag_delete_null_pointer_checks)
>  {
>refuse_visibility_changes = true;
>return true;
>  }
>  
> -  /* If target is defined and either comdat or not extern, we know it will be
> +  /* If target is defined and not extern, we know it will be
>   output and thus it will bind to non-NULL.
>   Play safe for flag_delete_null_pointer_checks where weak definition may
>   be re-defined by NULL.  */
> -  if (definition && (!DECL_EXTERNAL (decl) || DECL_COMDAT (decl))
> +  if (definition && !DECL_EXTERNAL (decl)
>&& (flag_delete_null_pointer_checks || !DECL_WEAK (decl)))
>  {
>if (!DECL_WEAK (decl))
> diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C 
> b/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C
> new file mode 100644
> index 000..644f9f7f893
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C
> @@ -0,0 +1,17 @@
> +// PR c++/92003
> +// { dg-do compile { target c++11 } }
> +// { dg-prune-output "narrowing conversion" }
> +
> +constexpr char const* get_c_str() { return "abc"; }
> +constexpr bool use_get_c_str_in_constexpr_context{get_c_str()}; // works
> +
> +template 
> +struct string {
> +  static constexpr char const* c_str() { return c; }
> +
> + private:
> +  static constexpr char c[]{Cs..., '\0'};
> +};
> +
> +constexpr char const* cstr{string<'a', 'b', 'c'>::c_str()};
> +constexpr bool use_cstr_in_constexpr_context{cstr}; // doesn't work
> 
> base-commit: fa0c6e297b22d5883857d0db4a6a8be0967cb16f
> -- 
> 2.18.1
> 


Re: [PATCH/RFC] Make -fanalyzer C only for GCC 10 (PR 93392)

2020-02-05 Thread David Malcolm
On Wed, 2020-02-05 at 16:18 +0100, Jakub Jelinek wrote:
> On Wed, Feb 05, 2020 at 09:59:19AM -0500, David Malcolm wrote:
> > Although the analyzer works on GIMPLE SSA and therefore in theory
> > ought
> > to support all languages supported by GCC, the code currently only
> > supports the subset of GIMPLE SSA expressible via the C frontend.
> > 
> > For GCC 10 I want to explicitly restrict the scope of the analyzer
> > to
> > C code, to keep the initial scope of the feature sane.
> > 
> > For example, various C++ things aren't yet supported by the
> > analyzer and
> > won't be in GCC 10 (exceptions, new/delete diagnostics, ctors that
> > run
> > before main, etc)
> 
> C has ctors before main too, look for attribute constructor.

Thanks.

> I must say I don't really like this, if you encounter something in
> the IL
> that the analyzer can't handle yet, punt, that is fine, 

Fair enough.

FWIW I managed to get double-free detection working in gfortran earlier
today, and in doing so discovered and fixed an issue affecting C,
proving your point, I think.

> but you could
> write code in C++ that doesn't use exceptions nor new/delete, there
> is
> no reason not to analyze that, etc.  And teaching it to handle
> new/delete
> operator like malloc/free shouldn't be that hard, but even if you
> don't,
> you still need some way how to deal with other allocators in C, say
> for
> stuff allocated with mmap etc.

Right.  I want to generalize the malloc/free stuff to deal with
arbitrary families of acquire/release APIs, but at this point we're
deep in stage 4, and so I want to try to set some expectations about
what's likely to work, and what isn't, and it seems prudent to leave
that generalization work to gcc 11.

My plan for gcc 10 is to focus on C code that uses malloc and free,
possibly with a simple wrapper around them.  I think the highest
priority item is ensuring that the analyzer scales up to handle real-
world C code rather than just the reduced examples in the testsuite. 
I'm working on a fix for a rather convoluted set of issues in state-
merging that would otherwise lead to the analyzer burning CPU cycles
w/o properly exploring the user's code.

But I'm also fixing ICEs as I go.

> Mentioning in the documentation that it is for now primarily intended
> for C
> is one thing (that is fine), but stopping analyzing something is
> another.

I'll do this in release notes and docs then.

Hope the above sounds like a reasonable plan.

Thanks
Dave



Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 2:51 PM H.J. Lu  wrote:
>
> On Wed, Feb 5, 2020 at 2:37 PM Marek Polacek  wrote:
> >
> > On Wed, Feb 05, 2020 at 02:24:48PM -0800, H.J. Lu wrote:
> > > On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu  wrote:
> > > >
> > > > On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford
> > > >  wrote:
> > > > >
> > > > > "H.J. Lu"  writes:
> > > > > > Currently patchable area is at the wrong place.
> > > > >
> > > > > Agreed :-)
> > > > >
> > > > > > It is placed immediately
> > > > > > after function label and before .cfi_startproc.  A backend should 
> > > > > > be able
> > > > > > to add a pseudo patchable area instruction durectly into RTL.  This 
> > > > > > patch
> > > > > > adds patch_area_size and patch_area_entry to cfun so that the 
> > > > > > patchable
> > > > > > area info is available in RTL passes.
> > > > >
> > > > > It might be better to add it to crtl, since it should only be needed
> > > > > during rtl generation.
> > > > >
> > > > > > It also limits patch_area_size and patch_area_entry to 65535, which 
> > > > > > is
> > > > > > a reasonable maximum size for patchable area.
> > > > > >
> > > > > > gcc/
> > > > > >
> > > > > >   PR target/93492
> > > > > >   * function.c (expand_function_start): Set 
> > > > > > cfun->patch_area_size
> > > > > >   and cfun->patch_area_entry.
> > > > > >   * function.h (function): Add patch_area_size and 
> > > > > > patch_area_entry.
> > > > > >   * opts.c (common_handle_option): Limit
> > > > > >   function_entry_patch_area_size and 
> > > > > > function_entry_patch_area_start
> > > > > >   to USHRT_MAX.  Fix a typo in error message.
> > > > > >   * varasm.c (assemble_start_function): Use 
> > > > > > cfun->patch_area_size
> > > > > >   and cfun->patch_area_entry.
> > > > > >   * doc/invoke.texi: Document the maximum value for
> > > > > >   -fpatchable-function-entry.
> > > > > >
> > > > > > gcc/testsuite/
> > > > > >
> > > > > >   PR target/93492
> > > > > >   * c-c++-common/patchable_function_entry-error-1.c: New test.
> > > > > >   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
> > > > > >   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> > > > > > ---
> > > > > >  gcc/doc/invoke.texi   |  1 +
> > > > > >  gcc/function.c| 35 
> > > > > > +++
> > > > > >  gcc/function.h|  6 
> > > > > >  gcc/opts.c|  4 ++-
> > > > > >  .../patchable_function_entry-error-1.c|  9 +
> > > > > >  .../patchable_function_entry-error-2.c|  9 +
> > > > > >  .../patchable_function_entry-error-3.c| 20 +++
> > > > > >  gcc/varasm.c  | 30 ++--
> > > > > >  8 files changed, 85 insertions(+), 29 deletions(-)
> > > > > >  create mode 100644 
> > > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
> > > > > >  create mode 100644 
> > > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
> > > > > >  create mode 100644 
> > > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
> > > > > >
> > > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > > > index 35b341e759f..dd4835199b0 100644
> > > > > > --- a/gcc/doc/invoke.texi
> > > > > > +++ b/gcc/doc/invoke.texi
> > > > > > @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
> > > > > >  The NOP instructions are inserted at---and maybe before, depending 
> > > > > > on
> > > > > >  @var{M}---the function entry address, even before the prologue.
> > > > > >
> > > > > > +The maximum value of @var{N} and @var{M} is 65535.
> > > > > >  @end table
> > > > > >
> > > > > >
> > > > > > diff --git a/gcc/function.c b/gcc/function.c
> > > > > > index d8008f60422..badbf538eec 100644
> > > > > > --- a/gcc/function.c
> > > > > > +++ b/gcc/function.c
> > > > > > @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
> > > > > >/* If we are doing generic stack checking, the probe should go 
> > > > > > here.  */
> > > > > >if (flag_stack_check == GENERIC_STACK_CHECK)
> > > > > >  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> > > > > > +
> > > > > > +  unsigned HOST_WIDE_INT patch_area_size = 
> > > > > > function_entry_patch_area_size;
> > > > > > +  unsigned HOST_WIDE_INT patch_area_entry = 
> > > > > > function_entry_patch_area_start;
> > > > > > +
> > > > > > +  tree patchable_function_entry_attr
> > > > > > += lookup_attribute ("patchable_function_entry",
> > > > > > + DECL_ATTRIBUTES (cfun->decl));
> > > > > > +  if (patchable_function_entry_attr)
> > > > > > +{
> > > > > > +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> > > > > > +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> > > > > > +
> > > > > > +  patch_area_size = tree_to_uhwi 
> > > > > > 

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 2:37 PM Marek Polacek  wrote:
>
> On Wed, Feb 05, 2020 at 02:24:48PM -0800, H.J. Lu wrote:
> > On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu  wrote:
> > >
> > > On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford
> > >  wrote:
> > > >
> > > > "H.J. Lu"  writes:
> > > > > Currently patchable area is at the wrong place.
> > > >
> > > > Agreed :-)
> > > >
> > > > > It is placed immediately
> > > > > after function label and before .cfi_startproc.  A backend should be 
> > > > > able
> > > > > to add a pseudo patchable area instruction durectly into RTL.  This 
> > > > > patch
> > > > > adds patch_area_size and patch_area_entry to cfun so that the 
> > > > > patchable
> > > > > area info is available in RTL passes.
> > > >
> > > > It might be better to add it to crtl, since it should only be needed
> > > > during rtl generation.
> > > >
> > > > > It also limits patch_area_size and patch_area_entry to 65535, which is
> > > > > a reasonable maximum size for patchable area.
> > > > >
> > > > > gcc/
> > > > >
> > > > >   PR target/93492
> > > > >   * function.c (expand_function_start): Set cfun->patch_area_size
> > > > >   and cfun->patch_area_entry.
> > > > >   * function.h (function): Add patch_area_size and 
> > > > > patch_area_entry.
> > > > >   * opts.c (common_handle_option): Limit
> > > > >   function_entry_patch_area_size and 
> > > > > function_entry_patch_area_start
> > > > >   to USHRT_MAX.  Fix a typo in error message.
> > > > >   * varasm.c (assemble_start_function): Use cfun->patch_area_size
> > > > >   and cfun->patch_area_entry.
> > > > >   * doc/invoke.texi: Document the maximum value for
> > > > >   -fpatchable-function-entry.
> > > > >
> > > > > gcc/testsuite/
> > > > >
> > > > >   PR target/93492
> > > > >   * c-c++-common/patchable_function_entry-error-1.c: New test.
> > > > >   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
> > > > >   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> > > > > ---
> > > > >  gcc/doc/invoke.texi   |  1 +
> > > > >  gcc/function.c| 35 
> > > > > +++
> > > > >  gcc/function.h|  6 
> > > > >  gcc/opts.c|  4 ++-
> > > > >  .../patchable_function_entry-error-1.c|  9 +
> > > > >  .../patchable_function_entry-error-2.c|  9 +
> > > > >  .../patchable_function_entry-error-3.c| 20 +++
> > > > >  gcc/varasm.c  | 30 ++--
> > > > >  8 files changed, 85 insertions(+), 29 deletions(-)
> > > > >  create mode 100644 
> > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
> > > > >  create mode 100644 
> > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
> > > > >  create mode 100644 
> > > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
> > > > >
> > > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > > index 35b341e759f..dd4835199b0 100644
> > > > > --- a/gcc/doc/invoke.texi
> > > > > +++ b/gcc/doc/invoke.texi
> > > > > @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
> > > > >  The NOP instructions are inserted at---and maybe before, depending on
> > > > >  @var{M}---the function entry address, even before the prologue.
> > > > >
> > > > > +The maximum value of @var{N} and @var{M} is 65535.
> > > > >  @end table
> > > > >
> > > > >
> > > > > diff --git a/gcc/function.c b/gcc/function.c
> > > > > index d8008f60422..badbf538eec 100644
> > > > > --- a/gcc/function.c
> > > > > +++ b/gcc/function.c
> > > > > @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
> > > > >/* If we are doing generic stack checking, the probe should go 
> > > > > here.  */
> > > > >if (flag_stack_check == GENERIC_STACK_CHECK)
> > > > >  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> > > > > +
> > > > > +  unsigned HOST_WIDE_INT patch_area_size = 
> > > > > function_entry_patch_area_size;
> > > > > +  unsigned HOST_WIDE_INT patch_area_entry = 
> > > > > function_entry_patch_area_start;
> > > > > +
> > > > > +  tree patchable_function_entry_attr
> > > > > += lookup_attribute ("patchable_function_entry",
> > > > > + DECL_ATTRIBUTES (cfun->decl));
> > > > > +  if (patchable_function_entry_attr)
> > > > > +{
> > > > > +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> > > > > +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> > > > > +
> > > > > +  patch_area_size = tree_to_uhwi 
> > > > > (patchable_function_entry_value1);
> > > > > +  patch_area_entry = 0;
> > > > > +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> > > > > + {
> > > > > +   tree patchable_function_entry_value2
> > > > > + = TREE_VALUE (TREE_CHAIN (pp_val));
> > > > > +   patch_area_entry = tree_to_uhwi 
> > > > > 

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread Marek Polacek
On Wed, Feb 05, 2020 at 02:24:48PM -0800, H.J. Lu wrote:
> On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu  wrote:
> >
> > On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford
> >  wrote:
> > >
> > > "H.J. Lu"  writes:
> > > > Currently patchable area is at the wrong place.
> > >
> > > Agreed :-)
> > >
> > > > It is placed immediately
> > > > after function label and before .cfi_startproc.  A backend should be 
> > > > able
> > > > to add a pseudo patchable area instruction durectly into RTL.  This 
> > > > patch
> > > > adds patch_area_size and patch_area_entry to cfun so that the patchable
> > > > area info is available in RTL passes.
> > >
> > > It might be better to add it to crtl, since it should only be needed
> > > during rtl generation.
> > >
> > > > It also limits patch_area_size and patch_area_entry to 65535, which is
> > > > a reasonable maximum size for patchable area.
> > > >
> > > > gcc/
> > > >
> > > >   PR target/93492
> > > >   * function.c (expand_function_start): Set cfun->patch_area_size
> > > >   and cfun->patch_area_entry.
> > > >   * function.h (function): Add patch_area_size and patch_area_entry.
> > > >   * opts.c (common_handle_option): Limit
> > > >   function_entry_patch_area_size and function_entry_patch_area_start
> > > >   to USHRT_MAX.  Fix a typo in error message.
> > > >   * varasm.c (assemble_start_function): Use cfun->patch_area_size
> > > >   and cfun->patch_area_entry.
> > > >   * doc/invoke.texi: Document the maximum value for
> > > >   -fpatchable-function-entry.
> > > >
> > > > gcc/testsuite/
> > > >
> > > >   PR target/93492
> > > >   * c-c++-common/patchable_function_entry-error-1.c: New test.
> > > >   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
> > > >   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> > > > ---
> > > >  gcc/doc/invoke.texi   |  1 +
> > > >  gcc/function.c| 35 +++
> > > >  gcc/function.h|  6 
> > > >  gcc/opts.c|  4 ++-
> > > >  .../patchable_function_entry-error-1.c|  9 +
> > > >  .../patchable_function_entry-error-2.c|  9 +
> > > >  .../patchable_function_entry-error-3.c| 20 +++
> > > >  gcc/varasm.c  | 30 ++--
> > > >  8 files changed, 85 insertions(+), 29 deletions(-)
> > > >  create mode 100644 
> > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
> > > >  create mode 100644 
> > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
> > > >  create mode 100644 
> > > > gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
> > > >
> > > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > > index 35b341e759f..dd4835199b0 100644
> > > > --- a/gcc/doc/invoke.texi
> > > > +++ b/gcc/doc/invoke.texi
> > > > @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
> > > >  The NOP instructions are inserted at---and maybe before, depending on
> > > >  @var{M}---the function entry address, even before the prologue.
> > > >
> > > > +The maximum value of @var{N} and @var{M} is 65535.
> > > >  @end table
> > > >
> > > >
> > > > diff --git a/gcc/function.c b/gcc/function.c
> > > > index d8008f60422..badbf538eec 100644
> > > > --- a/gcc/function.c
> > > > +++ b/gcc/function.c
> > > > @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
> > > >/* If we are doing generic stack checking, the probe should go here. 
> > > >  */
> > > >if (flag_stack_check == GENERIC_STACK_CHECK)
> > > >  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> > > > +
> > > > +  unsigned HOST_WIDE_INT patch_area_size = 
> > > > function_entry_patch_area_size;
> > > > +  unsigned HOST_WIDE_INT patch_area_entry = 
> > > > function_entry_patch_area_start;
> > > > +
> > > > +  tree patchable_function_entry_attr
> > > > += lookup_attribute ("patchable_function_entry",
> > > > + DECL_ATTRIBUTES (cfun->decl));
> > > > +  if (patchable_function_entry_attr)
> > > > +{
> > > > +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> > > > +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> > > > +
> > > > +  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> > > > +  patch_area_entry = 0;
> > > > +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> > > > + {
> > > > +   tree patchable_function_entry_value2
> > > > + = TREE_VALUE (TREE_CHAIN (pp_val));
> > > > +   patch_area_entry = tree_to_uhwi 
> > > > (patchable_function_entry_value2);
> > > > + }
> > > > +  if (patch_area_size > USHRT_MAX || patch_area_size > USHRT_MAX)
> > > > + error ("invalid values for % 
> > > > attribute");
> > >
> > > This should probably go in handle_patchable_function_entry_attribute
> > > instead.  It doesn't look like we have 

Re: [PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 12:20 PM H.J. Lu  wrote:
>
> On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford
>  wrote:
> >
> > "H.J. Lu"  writes:
> > > Currently patchable area is at the wrong place.
> >
> > Agreed :-)
> >
> > > It is placed immediately
> > > after function label and before .cfi_startproc.  A backend should be able
> > > to add a pseudo patchable area instruction durectly into RTL.  This patch
> > > adds patch_area_size and patch_area_entry to cfun so that the patchable
> > > area info is available in RTL passes.
> >
> > It might be better to add it to crtl, since it should only be needed
> > during rtl generation.
> >
> > > It also limits patch_area_size and patch_area_entry to 65535, which is
> > > a reasonable maximum size for patchable area.
> > >
> > > gcc/
> > >
> > >   PR target/93492
> > >   * function.c (expand_function_start): Set cfun->patch_area_size
> > >   and cfun->patch_area_entry.
> > >   * function.h (function): Add patch_area_size and patch_area_entry.
> > >   * opts.c (common_handle_option): Limit
> > >   function_entry_patch_area_size and function_entry_patch_area_start
> > >   to USHRT_MAX.  Fix a typo in error message.
> > >   * varasm.c (assemble_start_function): Use cfun->patch_area_size
> > >   and cfun->patch_area_entry.
> > >   * doc/invoke.texi: Document the maximum value for
> > >   -fpatchable-function-entry.
> > >
> > > gcc/testsuite/
> > >
> > >   PR target/93492
> > >   * c-c++-common/patchable_function_entry-error-1.c: New test.
> > >   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
> > >   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> > > ---
> > >  gcc/doc/invoke.texi   |  1 +
> > >  gcc/function.c| 35 +++
> > >  gcc/function.h|  6 
> > >  gcc/opts.c|  4 ++-
> > >  .../patchable_function_entry-error-1.c|  9 +
> > >  .../patchable_function_entry-error-2.c|  9 +
> > >  .../patchable_function_entry-error-3.c| 20 +++
> > >  gcc/varasm.c  | 30 ++--
> > >  8 files changed, 85 insertions(+), 29 deletions(-)
> > >  create mode 100644 
> > > gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
> > >  create mode 100644 
> > > gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
> > >  create mode 100644 
> > > gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
> > >
> > > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > > index 35b341e759f..dd4835199b0 100644
> > > --- a/gcc/doc/invoke.texi
> > > +++ b/gcc/doc/invoke.texi
> > > @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
> > >  The NOP instructions are inserted at---and maybe before, depending on
> > >  @var{M}---the function entry address, even before the prologue.
> > >
> > > +The maximum value of @var{N} and @var{M} is 65535.
> > >  @end table
> > >
> > >
> > > diff --git a/gcc/function.c b/gcc/function.c
> > > index d8008f60422..badbf538eec 100644
> > > --- a/gcc/function.c
> > > +++ b/gcc/function.c
> > > @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
> > >/* If we are doing generic stack checking, the probe should go here.  
> > > */
> > >if (flag_stack_check == GENERIC_STACK_CHECK)
> > >  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> > > +
> > > +  unsigned HOST_WIDE_INT patch_area_size = 
> > > function_entry_patch_area_size;
> > > +  unsigned HOST_WIDE_INT patch_area_entry = 
> > > function_entry_patch_area_start;
> > > +
> > > +  tree patchable_function_entry_attr
> > > += lookup_attribute ("patchable_function_entry",
> > > + DECL_ATTRIBUTES (cfun->decl));
> > > +  if (patchable_function_entry_attr)
> > > +{
> > > +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> > > +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> > > +
> > > +  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> > > +  patch_area_entry = 0;
> > > +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> > > + {
> > > +   tree patchable_function_entry_value2
> > > + = TREE_VALUE (TREE_CHAIN (pp_val));
> > > +   patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
> > > + }
> > > +  if (patch_area_size > USHRT_MAX || patch_area_size > USHRT_MAX)
> > > + error ("invalid values for % attribute");
> >
> > This should probably go in handle_patchable_function_entry_attribute
> > instead.  It doesn't look like we have a clear policy about whether
> > errors or warnings are right for unrecognised arguments to known
> > attribute names, but handle_patchable_function_entry_attribute reports
> > an OPT_Wattributes warning for arguments that aren't integers.  Doing the
> > same for out-of-range integers would be more 

Re: [PATCH v2] c++: Fix ICE with lambda in operator function [PR93597]

2020-02-05 Thread Marek Polacek
On Wed, Feb 05, 2020 at 04:40:50PM -0500, Jason Merrill wrote:
> On 2/5/20 4:31 PM, Marek Polacek wrote:
> > If we are going to use get_first_fn let's make sure we operate
> > on is_overloaded_fn, as the rest of the codebase does.
> > 
> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > PR c++/93597 - ICE with lambda in operator function.
> > * name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.
> > 
> > * g++.dg/cpp0x/lambda/lambda-93597.C: New test.
> > ---
> >   gcc/cp/name-lookup.c | 13 -
> >   gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C |  8 
> >   2 files changed, 16 insertions(+), 5 deletions(-)
> >   create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
> > 
> > diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
> > index 2447166a444..aa539926332 100644
> > --- a/gcc/cp/name-lookup.c
> > +++ b/gcc/cp/name-lookup.c
> > @@ -7624,11 +7624,14 @@ maybe_save_operator_binding (tree e)
> > if (!fns && (fns = op_unqualified_lookup (fnname)))
> >   {
> > -  tree fn = get_first_fn (fns);
> > -  if (DECL_CLASS_SCOPE_P (fn))
> > -   /* We don't need to remember class-scope functions, normal unqualified
> > -  lookup will find them again.  */
> > -   return;
> > +  if (is_overloaded_fn (fns))
> > +   {
> > + tree fn = get_first_fn (fns);
> > + if (DECL_CLASS_SCOPE_P (fn))
> > +   /* We don't need to remember class-scope functions, normal
> > +  unqualified lookup will find them again.  */
> > +   return;
> > +   }
> 
> I think we want to return early in this testcase, too, i.e. if lookup finds
> any class-scope declaration.

Ah okay, how about this, if it passes the usual testing?

-- >8 --
If we are going to use get_first_fn let's make sure we operate on
is_overloaded_fn, as the rest of the codebase does, and if lookup finds
any class-scope declaration, return early too.

PR c++/93597 - ICE with lambda in operator function.
* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
---
 gcc/cp/name-lookup.c | 8 
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C | 8 
 2 files changed, 12 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 2447166a444..8853c39f0eb 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7624,10 +7624,10 @@ maybe_save_operator_binding (tree e)
 
   if (!fns && (fns = op_unqualified_lookup (fnname)))
 {
-  tree fn = get_first_fn (fns);
-  if (DECL_CLASS_SCOPE_P (fn))
-   /* We don't need to remember class-scope functions, normal unqualified
-  lookup will find them again.  */
+  tree d = is_overloaded_fn (fns) ? get_first_fn (fns) : fns;
+  if (DECL_CLASS_SCOPE_P (d))
+   /* We don't need to remember class-scope functions or declarations,
+  normal unqualified lookup will find them again.  */
return;
 
   bindings = tree_cons (fnname, fns, bindings);
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
new file mode 100644
index 000..257d9c7cdfd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
@@ -0,0 +1,8 @@
+// PR c++/93597 - ICE with lambda in operator function.
+// { dg-do compile { target c++11 } }
+
+template 
+struct S {
+  using T ::operator<;
+  void operator==(T x) { [x] { 0 < x; }; }
+};

base-commit: d10cddeaad2a315c114063b7c1ff11c6a356ab65
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



Re: [PATCH] Fix PR 93568 on PowerPC (vector extract failures)

2020-02-05 Thread Segher Boessenkool
On Wed, Feb 05, 2020 at 04:29:38PM -0500, Michael Meissner wrote:
> 2020-02-05  Michael Meissner  
> 
>   PR target/93568
>   * config/rs6000/rs6000.c (get_vector_offset): Fix
> 
> --- /tmp/a8cqkr_rs6000.c  2020-02-05 14:55:36.255021903 -0600
> +++ gcc/config/rs6000/rs6000.c2020-02-05 13:27:00.393877012 -0600
> @@ -6744,8 +6744,7 @@ get_vector_offset (rtx mem, rtx element,
>  
>/* All insns should use the 'Q' constraint (address is a single register) 
> if
>   the element number is not a constant.  */
> -  rtx addr = XEXP (mem, 0);
> -  gcc_assert (satisfies_constraint_Q (addr));
> +  gcc_assert (satisfies_constraint_Q (mem));
>  
>/* Mask the element to make sure the element number is between 0 and the
>   maximum number of elements - 1 so that we don't generate an address

Okay for trunk.  Thank you!


Segher


[committed] wwwdocs: New GCC mirror from Rabat, Morocco

2020-02-05 Thread Gerald Pfeifer
On Sun, 5 Jan 2020, Gerald Pfeifer wrote:
> Happy to have you as a mirror, and if you'd like to submit a patch
> for https://gcc.gnu.org/mirrors.html that'd be great. Otherwise we
> can create one.

I applied this patch that I created on behalf of Sami.

Gerald


- Log -
commit c69ec31fe4eec31ec6f1c76ebec23ad69cfb85d3
Author: Gerald Pfeifer 
Date:   Wed Feb 5 22:38:24 2020 +0100

Add mirror.marwan.ma.

diff --git a/htdocs/mirrors.html b/htdocs/mirrors.html
index 1917d04a..6813de72 100644
--- a/htdocs/mirrors.html
+++ b/htdocs/mirrors.html
@@ -30,6 +30,10 @@ mirrors.  The following sites mirror the gcc.gnu.org 
download site
 Greece: ftp://ftp.ntua.gr/pub/gnu/gcc/;>ftp.ntua.gr, thanks 
to ftpadm at ntua.gr
 Hungary, Budapest: http://robotlab.itk.ppke.hu/gcc/;>robotlab.itk.ppke.hu, thanks to 
Adam Rak (neurhlp at gmail.com)
 Japan: http://ftp.tsukuba.wide.ad.jp/software/gcc/;>ftp.tsukuba.wide.ad.jp, 
thanks to Kohei Takahashi (tsukuba-ftp-servers at tsukuba.wide.ad.jp)
+Morocco:
+  https://mirror.marwan.ma/gcc/;>mirror.marwan.ma
+ (rsync://mirror.marwan.ma/gcc/),
+  thanks to Sami Ait Ali Oulahcen (s...@marwan.ma)
 The Netherlands, Dronten:
 ?? http://mirror.koddos.net/gcc/;>http://mirror.koddos.net/gcc/ |
 ?? rsync://mirror.koddos.net/gcc/,


Re: [PATCH] c++: Fix ICE with lambda in operator function [PR93597]

2020-02-05 Thread Jason Merrill

On 2/5/20 4:31 PM, Marek Polacek wrote:

If we are going to use get_first_fn let's make sure we operate
on is_overloaded_fn, as the rest of the codebase does.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

PR c++/93597 - ICE with lambda in operator function.
* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
---
  gcc/cp/name-lookup.c | 13 -
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C |  8 
  2 files changed, 16 insertions(+), 5 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 2447166a444..aa539926332 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7624,11 +7624,14 @@ maybe_save_operator_binding (tree e)
  
if (!fns && (fns = op_unqualified_lookup (fnname)))

  {
-  tree fn = get_first_fn (fns);
-  if (DECL_CLASS_SCOPE_P (fn))
-   /* We don't need to remember class-scope functions, normal unqualified
-  lookup will find them again.  */
-   return;
+  if (is_overloaded_fn (fns))
+   {
+ tree fn = get_first_fn (fns);
+ if (DECL_CLASS_SCOPE_P (fn))
+   /* We don't need to remember class-scope functions, normal
+  unqualified lookup will find them again.  */
+   return;
+   }


I think we want to return early in this testcase, too, i.e. if lookup 
finds any class-scope declaration.



bindings = tree_cons (fnname, fns, bindings);
if (attr)
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
new file mode 100644
index 000..257d9c7cdfd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
@@ -0,0 +1,8 @@
+// PR c++/93597 - ICE with lambda in operator function.
+// { dg-do compile { target c++11 } }
+
+template 
+struct S {
+  using T ::operator<;
+  void operator==(T x) { [x] { 0 < x; }; }
+};

base-commit: fa0c6e297b22d5883857d0db4a6a8be0967cb16f





[committed] x86: Simplify post epilogue_completed splitters.

2020-02-05 Thread Uros Bizjak
Simplify post epilogue_completed splitters.

Now that we have post epilogue_completed split point for all
optimization levels, we can simplify post epilogue_completed splitters
considerably. If corresponding define_peephole2 pattern fails to
allocate a temporary register (or if peephole2 pass isn't run at all),
we can now always split invalid RTX after epilogue_completed is set.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

* config/i386/i386.md (*pushdi2_rex64 peephole2): Remove.
(*pushdi2_rex64 peephole2): Unconditionally split after
epilogue_completed.
(*ashl3_doubleword): Ditto.
(*3_doubleword): Ditto.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 46b442dae51..496a843 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1688,38 +1688,21 @@
 ;; First try to get scratch register and go through it.  In case this
 ;; fails, push sign extended lower part first and then overwrite
 ;; upper part by 32bit move.
+
 (define_peephole2
   [(match_scratch:DI 2 "r")
(set (match_operand:DI 0 "push_operand")
 (match_operand:DI 1 "immediate_operand"))]
-  "TARGET_64BIT && !symbolic_operand (operands[1], DImode)
+  "TARGET_64BIT
+   && !symbolic_operand (operands[1], DImode)
&& !x86_64_immediate_operand (operands[1], DImode)"
   [(set (match_dup 2) (match_dup 1))
(set (match_dup 0) (match_dup 2))])
 
-;; We need to define this as both peepholer and splitter for case
-;; peephole2 pass is not run.
-;; "&& 1" is needed to keep it from matching the previous pattern.
-(define_peephole2
-  [(set (match_operand:DI 0 "push_operand")
-(match_operand:DI 1 "immediate_operand"))]
-  "TARGET_64BIT && !symbolic_operand (operands[1], DImode)
-   && !x86_64_immediate_operand (operands[1], DImode) && 1"
-  [(set (match_dup 0) (match_dup 1))
-   (set (match_dup 2) (match_dup 3))]
-{
-  split_double_mode (DImode, [1], 1, [2], [3]);
-
-  operands[1] = gen_lowpart (DImode, operands[2]);
-  operands[2] = gen_rtx_MEM (SImode, gen_rtx_PLUS (Pmode, stack_pointer_rtx,
-  GEN_INT (4)));
-})
-
 (define_split
   [(set (match_operand:DI 0 "push_operand")
 (match_operand:DI 1 "immediate_operand"))]
-  "TARGET_64BIT && ((optimize > 0 && flag_peephole2)
-   ? epilogue_completed : reload_completed)
+  "TARGET_64BIT && epilogue_completed
&& !symbolic_operand (operands[1], DImode)
&& !x86_64_immediate_operand (operands[1], DImode)"
   [(set (match_dup 0) (match_dup 1))
@@ -10586,7 +10569,7 @@
(ashift:DWI (match_operand:DWI 1 "nonmemory_operand")
(match_operand:QI 2 "nonmemory_operand")))
(clobber (reg:CC FLAGS_REG))]
-  "(optimize && flag_peephole2) ? epilogue_completed : reload_completed"
+  "epilogue_completed"
   [(const_int 0)]
   "ix86_split_ashl (operands, NULL_RTX, mode); DONE;")
 
@@ -11338,7 +11321,7 @@
(clobber (reg:CC FLAGS_REG))]
   ""
   "#"
-  "(optimize && flag_peephole2) ? epilogue_completed : reload_completed"
+  "epilogue_completed"
   [(const_int 0)]
   "ix86_split_ (operands, NULL_RTX, mode); DONE;"
   [(set_attr "type" "multi")])


[PATCH] c++: Fix ICE with lambda in operator function [PR93597]

2020-02-05 Thread Marek Polacek
If we are going to use get_first_fn let's make sure we operate
on is_overloaded_fn, as the rest of the codebase does.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

PR c++/93597 - ICE with lambda in operator function.
* name-lookup.c (maybe_save_operator_binding): Check is_overloaded_fn.

* g++.dg/cpp0x/lambda/lambda-93597.C: New test.
---
 gcc/cp/name-lookup.c | 13 -
 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C |  8 
 2 files changed, 16 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C

diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index 2447166a444..aa539926332 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -7624,11 +7624,14 @@ maybe_save_operator_binding (tree e)
 
   if (!fns && (fns = op_unqualified_lookup (fnname)))
 {
-  tree fn = get_first_fn (fns);
-  if (DECL_CLASS_SCOPE_P (fn))
-   /* We don't need to remember class-scope functions, normal unqualified
-  lookup will find them again.  */
-   return;
+  if (is_overloaded_fn (fns))
+   {
+ tree fn = get_first_fn (fns);
+ if (DECL_CLASS_SCOPE_P (fn))
+   /* We don't need to remember class-scope functions, normal
+  unqualified lookup will find them again.  */
+   return;
+   }
 
   bindings = tree_cons (fnname, fns, bindings);
   if (attr)
diff --git a/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C 
b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
new file mode 100644
index 000..257d9c7cdfd
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-93597.C
@@ -0,0 +1,8 @@
+// PR c++/93597 - ICE with lambda in operator function.
+// { dg-do compile { target c++11 } }
+
+template 
+struct S {
+  using T ::operator<;
+  void operator==(T x) { [x] { 0 < x; }; }
+};

base-commit: fa0c6e297b22d5883857d0db4a6a8be0967cb16f
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH] Fix PR 93568 on PowerPC (vector extract failures)

2020-02-05 Thread Michael Meissner
When I submitted my recent patches, in updating one of the patches, I made a
thinko that resulted in a lot of failures on big endian systems (but not as
many on the little endian systems).

I have done bootstraps on both big endian and little endian systems.  Can I
check in this patch?

On a big endian power8 system, the following tests now pass:

gcc.target/powerpc/pr87532-mc.c
gcc.target/powerpc/pr89765-mc.c
gcc.target/powerpc/vec-extract-3.c
gcc.target/powerpc/vec-extract-5.c
gcc.target/powerpc/vec-extract-6.c
gcc.target/powerpc/vec-extract-7.c
gcc.target/powerpc/vec-extract-8.c
gcc.target/powerpc/vec-extract-9.c
gcc.target/powerpc/vec-extract-v16qi-df.c
gcc.target/powerpc/vec-extract-v16qi.c
gcc.target/powerpc/vec-extract-v16qiu-df.c
gcc.target/powerpc/vec-extract-v16qiu.c
gcc.target/powerpc/vec-extract-v2df.c
gcc.target/powerpc/vec-extract-v2di.c
gcc.target/powerpc/vec-extract-v4sf.c
gcc.target/powerpc/vec-extract-v4si-df.c
gcc.target/powerpc/vec-extract-v4si.c
gcc.target/powerpc/vec-extract-v4siu-df.c
gcc.target/powerpc/vec-extract-v4siu.c
gcc.target/powerpc/vec-extract-v8hi-df.c
gcc.target/powerpc/vec-extract-v8hi.c
gcc.target/powerpc/vec-extract-v8hiu-df.c
gcc.target/powerpc/vec-extract-v8hiu.c
gcc.target/powerpc/vsx-builtin-10b.c
gcc.target/powerpc/vsx-builtin-11b.c
gcc.target/powerpc/vsx-builtin-12b.c
gcc.target/powerpc/vsx-builtin-14b.c
gcc.target/powerpc/vsx-builtin-15b.c
gcc.target/powerpc/vsx-builtin-16b.c
gcc.target/powerpc/vsx-builtin-17b.c
gcc.target/powerpc/vsx-builtin-18b.c
gcc.target/powerpc/vsx-builtin-19b.c
gcc.target/powerpc/vsx-builtin-9b.c

On a little endian power8 system, the following tests now pass:

gcc.target/powerpc/pr87532-mc.c
gcc.target/powerpc/pr89765-mc.c
gcc.target/powerpc/vec-extract-v2di.c
gcc.target/powerpc/vsx-builtin-12b.c
gcc.target/powerpc/vsx-builtin-19b.c

2020-02-05  Michael Meissner  

PR target/93568
* config/rs6000/rs6000.c (get_vector_offset): Fix

--- /tmp/a8cqkr_rs6000.c2020-02-05 14:55:36.255021903 -0600
+++ gcc/config/rs6000/rs6000.c  2020-02-05 13:27:00.393877012 -0600
@@ -6744,8 +6744,7 @@ get_vector_offset (rtx mem, rtx element,
 
   /* All insns should use the 'Q' constraint (address is a single register) if
  the element number is not a constant.  */
-  rtx addr = XEXP (mem, 0);
-  gcc_assert (satisfies_constraint_Q (addr));
+  gcc_assert (satisfies_constraint_Q (mem));
 
   /* Mask the element to make sure the element number is between 0 and the
  maximum number of elements - 1 so that we don't generate an address

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


Re: [PATCH] c++: Fix ICE with CONSTRUCTOR flags verification [PR93559]

2020-02-05 Thread Jason Merrill
On Wed, Feb 5, 2020 at 4:18 PM Marek Polacek  wrote:

> Since reshape_init_array_1 can now reuse a single constructor for
> an array of non-aggregate type, we might run into a scenario where
> we reuse a constructor with TREE_SIDE_EFFECTS.  This broke this test
> because we have something like { { expr } } and we try to reshape it,
> so we recurse on the inner CONSTRUCTOR, reuse an existing CONSTRUCTOR
> with TREE_SIDE_EFFECTS, and then ICE on the discrepancy because the
> outermost CONSTRUCTOR doesn't have TREE_SIDE_EFFECTS.  In this case
> EXPR was a call to an operator function so TREE_SIDE_EFFECTS should
> be set.  Naturally one would want to fix this by calling
> recompute_constructor_flags in an appropriate place so that the flags
> on the CONSTRUCTORs match.  The appropriate place would be at the end
> of reshape_init, but this breaks initlist109.C: there we are dealing
> with { { TARGET_EXPR <{}> } } where the outermost { } is TREE_CONSTANT
> but the inner { } is not, so recompute_constructor_flags would clear
> the constant flag in the outermost { }.  Seems resonable but it upsets
> check_initializer which then complains about "non-constant in-class
> initialization invalid for static member".  TARGET_EXPRs are always
> created with TREE_SIDE_EFFECTS on, but that is mutually exclusive
> with TREE_CONSTANT.  So we're in a bind.
>
> Fixed by not reusing a CONSTRUCTOR that has TREE_SIDE_EFFECTS; in the
> grand scheme of things it isn't measurable: it only affects ~3 tests
> in the testsuite.
>

Bootstrapped/regtested on x86_64-linux, ok for trunk?
>

OK.

PR c++/93559 - ICE with CONSTRUCTOR flags verification.
> * decl.c (reshape_init_array_1): Don't reuse a CONSTRUCTOR with
> TREE_SIDE_EFFECTS.
>
> * g++.dg/cpp0x/initlist119.C: New test.
> * g++.dg/cpp0x/initlist120.C: New test.
> ---
>  gcc/cp/decl.c|  4 +++-
>  gcc/testsuite/g++.dg/cpp0x/initlist119.C | 15 +++
>  gcc/testsuite/g++.dg/cpp0x/initlist120.C | 16 
>  3 files changed, 34 insertions(+), 1 deletion(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist119.C
>  create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist120.C
>
> diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
> index 859fd1bb931..d90007ba475 100644
> --- a/gcc/cp/decl.c
> +++ b/gcc/cp/decl.c
> @@ -5989,7 +5989,9 @@ reshape_init_array_1 (tree elt_type, tree max_index,
> reshape_iter *d,
>/* The initializer for an array is always a CONSTRUCTOR.  If this is the
>   outermost CONSTRUCTOR and the element type is non-aggregate, we
> don't need
>   to build a new one.  */
> -  bool reuse = first_initializer_p && !CP_AGGREGATE_TYPE_P (elt_type);
> +  bool reuse = (first_initializer_p
> +   && !CP_AGGREGATE_TYPE_P (elt_type)
> +   && !TREE_SIDE_EFFECTS (first_initializer_p));
>if (reuse)
>  new_init = first_initializer_p;
>else
> diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist119.C
> b/gcc/testsuite/g++.dg/cpp0x/initlist119.C
> new file mode 100644
> index 000..80f391f64c7
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/initlist119.C
> @@ -0,0 +1,15 @@
> +// PR c++/93559 - ICE with CONSTRUCTOR flags verification.
> +// { dg-do compile { target c++11 } }
> +
> +struct E { int d[10]; };
> +
> +struct S {
> +  constexpr int operator()(char) { return 42; }
> +};
> +
> +template  struct X {
> +  constexpr static E foo(S s) { return {{s(1)}}; }
> +};
> +
> +S s;
> +static_assert((X::foo(s), 1), "");
> diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist120.C
> b/gcc/testsuite/g++.dg/cpp0x/initlist120.C
> new file mode 100644
> index 000..8d03166e197
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp0x/initlist120.C
> @@ -0,0 +1,16 @@
> +// PR c++/93559 - ICE with CONSTRUCTOR flags verification.
> +// { dg-do compile { target c++11 } }
> +
> +struct F { int d[10]; };
> +struct E { F f; };
> +
> +struct S {
> +  constexpr int operator()(char) { return 42; }
> +};
> +
> +template  struct X {
> +  constexpr static E foo(S s) { return {{{s(1)}}}; }
> +};
> +
> +S s;
> +static_assert((X::foo(s), 1), "");
>
> base-commit: 91bc3c98851670360dfcd312399c3ba35fb50231
> --
> Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA
>
>


[PATCH] c++: Fix ICE with CONSTRUCTOR flags verification [PR93559]

2020-02-05 Thread Marek Polacek
Since reshape_init_array_1 can now reuse a single constructor for
an array of non-aggregate type, we might run into a scenario where
we reuse a constructor with TREE_SIDE_EFFECTS.  This broke this test
because we have something like { { expr } } and we try to reshape it,
so we recurse on the inner CONSTRUCTOR, reuse an existing CONSTRUCTOR
with TREE_SIDE_EFFECTS, and then ICE on the discrepancy because the
outermost CONSTRUCTOR doesn't have TREE_SIDE_EFFECTS.  In this case
EXPR was a call to an operator function so TREE_SIDE_EFFECTS should
be set.  Naturally one would want to fix this by calling
recompute_constructor_flags in an appropriate place so that the flags
on the CONSTRUCTORs match.  The appropriate place would be at the end
of reshape_init, but this breaks initlist109.C: there we are dealing
with { { TARGET_EXPR <{}> } } where the outermost { } is TREE_CONSTANT
but the inner { } is not, so recompute_constructor_flags would clear
the constant flag in the outermost { }.  Seems resonable but it upsets
check_initializer which then complains about "non-constant in-class
initialization invalid for static member".  TARGET_EXPRs are always
created with TREE_SIDE_EFFECTS on, but that is mutually exclusive
with TREE_CONSTANT.  So we're in a bind.

Fixed by not reusing a CONSTRUCTOR that has TREE_SIDE_EFFECTS; in the
grand scheme of things it isn't measurable: it only affects ~3 tests
in the testsuite.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

PR c++/93559 - ICE with CONSTRUCTOR flags verification.
* decl.c (reshape_init_array_1): Don't reuse a CONSTRUCTOR with
TREE_SIDE_EFFECTS.

* g++.dg/cpp0x/initlist119.C: New test.
* g++.dg/cpp0x/initlist120.C: New test.
---
 gcc/cp/decl.c|  4 +++-
 gcc/testsuite/g++.dg/cpp0x/initlist119.C | 15 +++
 gcc/testsuite/g++.dg/cpp0x/initlist120.C | 16 
 3 files changed, 34 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist119.C
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist120.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 859fd1bb931..d90007ba475 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -5989,7 +5989,9 @@ reshape_init_array_1 (tree elt_type, tree max_index, 
reshape_iter *d,
   /* The initializer for an array is always a CONSTRUCTOR.  If this is the
  outermost CONSTRUCTOR and the element type is non-aggregate, we don't need
  to build a new one.  */
-  bool reuse = first_initializer_p && !CP_AGGREGATE_TYPE_P (elt_type);
+  bool reuse = (first_initializer_p
+   && !CP_AGGREGATE_TYPE_P (elt_type)
+   && !TREE_SIDE_EFFECTS (first_initializer_p));
   if (reuse)
 new_init = first_initializer_p;
   else
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist119.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist119.C
new file mode 100644
index 000..80f391f64c7
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist119.C
@@ -0,0 +1,15 @@
+// PR c++/93559 - ICE with CONSTRUCTOR flags verification.
+// { dg-do compile { target c++11 } }
+
+struct E { int d[10]; };
+
+struct S {
+  constexpr int operator()(char) { return 42; }
+};
+
+template  struct X {
+  constexpr static E foo(S s) { return {{s(1)}}; }
+};
+
+S s;
+static_assert((X::foo(s), 1), "");
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist120.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist120.C
new file mode 100644
index 000..8d03166e197
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist120.C
@@ -0,0 +1,16 @@
+// PR c++/93559 - ICE with CONSTRUCTOR flags verification.
+// { dg-do compile { target c++11 } }
+
+struct F { int d[10]; };
+struct E { F f; };
+
+struct S {
+  constexpr int operator()(char) { return 42; }
+};
+
+template  struct X {
+  constexpr static E foo(S s) { return {{{s(1)}}}; }
+};
+
+S s;
+static_assert((X::foo(s), 1), "");

base-commit: 91bc3c98851670360dfcd312399c3ba35fb50231
-- 
Marek Polacek • Red Hat, Inc. • 300 A St, Boston, MA



[PATCH RFA] cgraph: A COMDAT decl always has non-zero address.

2020-02-05 Thread Jason Merrill
We should be able to assume that a template instantiation or other COMDAT
has non-zero address even if MAKE_DECL_ONE_ONLY for the target sets
DECL_WEAK and we haven't yet decided to emit a definition in this
translation unit.

Tested x86_64-pc-linux-gnu, OK for trunk?

PR c++/92003
* symtab.c (symtab_node::nonzero_address): A DECL_COMDAT decl has
non-zero address even if weak and not yet defined.
---
 gcc/symtab.c| 10 +-
 gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C | 17 +
 2 files changed, 22 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C

diff --git a/gcc/symtab.c b/gcc/symtab.c
index eae891ab211..a879c095a1a 100644
--- a/gcc/symtab.c
+++ b/gcc/symtab.c
@@ -2058,22 +2058,22 @@ symtab_node::nonzero_address ()
  bind to NULL. This is on by default on embedded targets only.
 
  Otherwise all non-WEAK symbols must be defined and thus non-NULL or
- linking fails.  Important case of WEAK we want to do well are comdats.
- Those are handled by later check for definition.
+ linking fails.  Important case of WEAK we want to do well are comdats,
+ which also must be defined somewhere.
 
  When parsing, beware the cases when WEAK attribute is added later.  */
-  if (!DECL_WEAK (decl)
+  if ((!DECL_WEAK (decl) || DECL_COMDAT (decl))
   && flag_delete_null_pointer_checks)
 {
   refuse_visibility_changes = true;
   return true;
 }
 
-  /* If target is defined and either comdat or not extern, we know it will be
+  /* If target is defined and not extern, we know it will be
  output and thus it will bind to non-NULL.
  Play safe for flag_delete_null_pointer_checks where weak definition may
  be re-defined by NULL.  */
-  if (definition && (!DECL_EXTERNAL (decl) || DECL_COMDAT (decl))
+  if (definition && !DECL_EXTERNAL (decl)
   && (flag_delete_null_pointer_checks || !DECL_WEAK (decl)))
 {
   if (!DECL_WEAK (decl))
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C 
b/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C
new file mode 100644
index 000..644f9f7f893
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-static13.C
@@ -0,0 +1,17 @@
+// PR c++/92003
+// { dg-do compile { target c++11 } }
+// { dg-prune-output "narrowing conversion" }
+
+constexpr char const* get_c_str() { return "abc"; }
+constexpr bool use_get_c_str_in_constexpr_context{get_c_str()}; // works
+
+template 
+struct string {
+  static constexpr char const* c_str() { return c; }
+
+ private:
+  static constexpr char c[]{Cs..., '\0'};
+};
+
+constexpr char const* cstr{string<'a', 'b', 'c'>::c_str()};
+constexpr bool use_cstr_in_constexpr_context{cstr}; // doesn't work

base-commit: fa0c6e297b22d5883857d0db4a6a8be0967cb16f
-- 
2.18.1



Re: [PATCH] x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

2020-02-05 Thread Uros Bizjak
On Wed, Feb 5, 2020 at 6:59 PM H.J. Lu  wrote:
>
> MS_ABI requires passing aggregates with only float/double in integer
> registers.  Checked gcc outputs against Clang and fixed:
>
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O0
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O2
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O0
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O2
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O0
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
> FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56
> -Wno-unused-variable -Wno-unused-parameter
> -Wno-unused-but-set-variable -Wno-uninitialized -O2
> -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
>
> in libffi testsuite.
>
> OK for master and backports to GCC 8/9 branches?
>
> gcc/
>
> PR target/85667
> * config/i386/i386.c (function_arg_ms_64): Add a type argument.
> Don't return aggregates with only SFmode and DFmode in SSE
> register.
> (ix86_function_arg): Pass arg.type to function_arg_ms_64.
>
> gcc/testsuite/
>
> PR target/85667
> * gcc.target/i386/pr85667-10.c: New test.
> * gcc.target/i386/pr85667-7.c: Likewise.
> * gcc.target/i386/pr85667-8.c: Likewise.
> * gcc.target/i386/pr85667-9.c: Likewise.

LGTM, but should really be reviewed by cygwin, mingw-w64 maintainer (CC'd).

Thanks,
Uros.

>
> --
> H.J.


[PATCH] Add patch_area_size and patch_area_entry to crtl

2020-02-05 Thread H.J. Lu
On Wed, Feb 5, 2020 at 9:00 AM Richard Sandiford
 wrote:
>
> "H.J. Lu"  writes:
> > Currently patchable area is at the wrong place.
>
> Agreed :-)
>
> > It is placed immediately
> > after function label and before .cfi_startproc.  A backend should be able
> > to add a pseudo patchable area instruction durectly into RTL.  This patch
> > adds patch_area_size and patch_area_entry to cfun so that the patchable
> > area info is available in RTL passes.
>
> It might be better to add it to crtl, since it should only be needed
> during rtl generation.
>
> > It also limits patch_area_size and patch_area_entry to 65535, which is
> > a reasonable maximum size for patchable area.
> >
> > gcc/
> >
> >   PR target/93492
> >   * function.c (expand_function_start): Set cfun->patch_area_size
> >   and cfun->patch_area_entry.
> >   * function.h (function): Add patch_area_size and patch_area_entry.
> >   * opts.c (common_handle_option): Limit
> >   function_entry_patch_area_size and function_entry_patch_area_start
> >   to USHRT_MAX.  Fix a typo in error message.
> >   * varasm.c (assemble_start_function): Use cfun->patch_area_size
> >   and cfun->patch_area_entry.
> >   * doc/invoke.texi: Document the maximum value for
> >   -fpatchable-function-entry.
> >
> > gcc/testsuite/
> >
> >   PR target/93492
> >   * c-c++-common/patchable_function_entry-error-1.c: New test.
> >   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
> >   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> > ---
> >  gcc/doc/invoke.texi   |  1 +
> >  gcc/function.c| 35 +++
> >  gcc/function.h|  6 
> >  gcc/opts.c|  4 ++-
> >  .../patchable_function_entry-error-1.c|  9 +
> >  .../patchable_function_entry-error-2.c|  9 +
> >  .../patchable_function_entry-error-3.c| 20 +++
> >  gcc/varasm.c  | 30 ++--
> >  8 files changed, 85 insertions(+), 29 deletions(-)
> >  create mode 100644 
> > gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
> >  create mode 100644 
> > gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
> >  create mode 100644 
> > gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
> >
> > diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> > index 35b341e759f..dd4835199b0 100644
> > --- a/gcc/doc/invoke.texi
> > +++ b/gcc/doc/invoke.texi
> > @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
> >  The NOP instructions are inserted at---and maybe before, depending on
> >  @var{M}---the function entry address, even before the prologue.
> >
> > +The maximum value of @var{N} and @var{M} is 65535.
> >  @end table
> >
> >
> > diff --git a/gcc/function.c b/gcc/function.c
> > index d8008f60422..badbf538eec 100644
> > --- a/gcc/function.c
> > +++ b/gcc/function.c
> > @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
> >/* If we are doing generic stack checking, the probe should go here.  */
> >if (flag_stack_check == GENERIC_STACK_CHECK)
> >  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> > +
> > +  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
> > +  unsigned HOST_WIDE_INT patch_area_entry = 
> > function_entry_patch_area_start;
> > +
> > +  tree patchable_function_entry_attr
> > += lookup_attribute ("patchable_function_entry",
> > + DECL_ATTRIBUTES (cfun->decl));
> > +  if (patchable_function_entry_attr)
> > +{
> > +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> > +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> > +
> > +  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> > +  patch_area_entry = 0;
> > +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> > + {
> > +   tree patchable_function_entry_value2
> > + = TREE_VALUE (TREE_CHAIN (pp_val));
> > +   patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
> > + }
> > +  if (patch_area_size > USHRT_MAX || patch_area_size > USHRT_MAX)
> > + error ("invalid values for % attribute");
>
> This should probably go in handle_patchable_function_entry_attribute
> instead.  It doesn't look like we have a clear policy about whether
> errors or warnings are right for unrecognised arguments to known
> attribute names, but handle_patchable_function_entry_attribute reports
> an OPT_Wattributes warning for arguments that aren't integers.  Doing the
> same for out-of-range integers would be more consistent and means that
> we wouldn't break existing code if we relaxed/changed the rules in future.

Like this?  OK for master if there is no regression?

Thanks.

-- 
H.J.
From 8a56c3424d4194dfc0290eaa666962c6e75f9ce8 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 5 Feb 

Re: [PATCH] document that alias and target must have the same type

2020-02-05 Thread Martin Sebor

On 2/4/20 6:05 PM, Martin Sebor wrote:

GCC diagnoses declarations of function aliases whose type doesn't
match that of the target (ditto for attribute weakref).  It doesn't
yet diagnose such incompatbilities for variable aliases but that's
just an oversight that I will try to remember to correct in GCC 11.
The attached patch updates the manual to make it clear that
aliases must have the same type as their targets, or the behavior
is undefined (and may be diagnosed).


On further review I noticed a few problems with the documentation
of attribute weakref.  The manual says that

  Without a target, given as an argument to weakref or to alias,
  weakref is equivalent to weak.

and

  At present, a declaration to which weakref is attached can only
  be static.

However, GCC accepts the following declaration:

  extern int x (void) __attribute__ ((weakref));

so the second sentence isn't correct without qualification (unlike
weakref, a weak declaration must be external).

Another documentation problem is with the example in the manual that
says that this declaration:

  static int x() __attribute__ ((weakref ("y")));

  /* is equivalent to... */

  static int x() __attribute__ ((weak, weakref, alias ("y")));

but GCC rejects the latter with

  error: weak declaration of ‘x’ must be public

Changing the latter from static to extern changes the error to

  error: ‘weakref’ symbol ‘x’ must have static linkage

So clearly two declarations with the two sets of attributes are not
equivalent, either extern or static.

I've fixed up these documentation problems in the attached revision
of the original patch.  I also mention that besides their types
having to match, the alias must have the same size and alignment
as the target.

Martin

PS I also noted that when the function is then also used GCC issues:

  warning: ‘weakref’ attribute should be accompanied with an ‘alias’ 
attribute


This matches what the manual says in "Without arguments, it should
be accompanied by an alias attribute naming the target symbol."

But when the weakref function is not used there is no warning.
That seems like an unfortunate side-effect of the choice of issuing
the warning a little too late (waning from the front-end it would
make it consistent regardless of whether the function was used, and
it would highlight the omission even in translation units that
define the weakref without using it).
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index ec99c38a607..3634ce1c423 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -2557,8 +2557,11 @@ __attribute__ ((access (write_only, 1, 2), access (read_write, 3))) int fgets (c
 
 @item alias ("@var{target}")
 @cindex @code{alias} function attribute
-The @code{alias} attribute causes the declaration to be emitted as an
-alias for another symbol, which must be specified.  For instance,
+The @code{alias} attribute causes the declaration to be emitted as an alias
+for another symbol, which must have been previously declared with the same
+type, and for variables same size and alignment.  Declaring an alias with
+a different type than the target is undefined and may be diagnosed.  For
+instance, an alias for another symbol, which must be specified.  For instance,
 
 @smallexample
 void __f () @{ /* @r{Do something.} */; @}
@@ -3919,31 +3922,41 @@ results in warning on line 5.
 
 @item weak
 @cindex @code{weak} function attribute
-The @code{weak} attribute causes the declaration to be emitted as a weak
-symbol rather than a global.  This is primarily useful in defining
-library functions that can be overridden in user code, though it can
-also be used with non-function declarations.  Weak symbols are supported
-for ELF targets, and also for a.out targets when using the GNU assembler
-and linker.
+The @code{weak} attribute causes a declaration of an external symbol
+to be emitted as a weak symbol rather than a global.  This is primarily
+useful in defining library functions that can be overridden in user code,
+though it can also be used with non-function declarations.  The overriding
+symbol must have the same type, and for variables size, and alignment as
+the weak symbol.  Weak symbols are supported for ELF targets, and also for
+a.out targets when using the GNU assembler and linker.
 
 @item weakref
 @itemx weakref ("@var{target}")
 @cindex @code{weakref} function attribute
 The @code{weakref} attribute marks a declaration as a weak reference.
 Without arguments, it should be accompanied by an @code{alias} attribute
-naming the target symbol.  Optionally, the @var{target} may be given as
-an argument to @code{weakref} itself.  In either case, @code{weakref}
-implicitly marks the declaration as @code{weak}.  Without a
-@var{target}, given as an argument to @code{weakref} or to @code{alias},
-@code{weakref} is equivalent to @code{weak}.
+naming the target symbol.  Alernatively, @var{target} may be given as
+an argument to @code{weakref} itself, naming the target definition of
+the 

Re: [PATCH 2/3] libstdc++: Implement C++20 constrained algorithms

2020-02-05 Thread Patrick Palka
[resending with attachment now compressed]

On Wed, 5 Feb 2020, François Dumont wrote:

> Hi
> 
>     Is it me or the patch isn't an attachment ? It is far more convenient to
> provide something easy to extract and apply locally.

Good point, I've attached the patch as an attachment and I'll make sure
to provide big patches as an attachment in the future.  I should also
have noted that the patches are also available in my user branch
libstdcxx-constrained-algo-adaptors which you can fetch with

git fetch origin 
refs/users/ppalka/heads/libstdcxx-constrained-algos-adaptors

> 
> On 2/4/20 3:07 AM, Patrick Palka wrote:
> > This patch implements the C++20 ranges overloads for the algorithms in
> > [algorithms].  Most of the algorithms were reimplemented, with each of their
> > implementations very closely following the existing implementation in
> > bits/stl_algo.h and bits/stl_algobase.h.  The reason for reimplementing most
> > of
> > the algorithms instead of forwarding to their STL-style overload is because
> > forwarding cannot be conformantly and efficiently performed for algorithms
> > that
> > operate on non-random-access iterators.  But algorithms that operate on
> > random
> > access iterators can safely and efficiently be forwarded to the STL-style
> > implementation, and this patch does so for push_heap, pop_heap, make_heap,
> > sort_heap, sort, stable_sort, nth_element, inplace_merge and
> > stable_partition.
> > 
> > What's missing from this patch is debug-iterator
> 
> Always the 5th wheel of the car like we say in French :-)
> 
> I'll be looking at this point once I manage to apply the patch.

That would be much appreciated! :)

> 
> >   and container specializations
> > that are present for some of the STL-style algorithms that need to be ported
> > over to the ranges algos.  I marked them missing at TODO comments.  There
> > are
> > also some other minor outstanding TODOs.
> > 
> > The code that could use the most thorough review is ranges::__copy_or_move,
> > ranges::__copy_or_move_backward, ranges::__equal and
> > ranges::__lexicographical_compare.  In the tests, I tried to test the
> > interface
> > of each new overload, as well as the correctness of the new implementation.
> > 
> > diff --git a/libstdc++-v3/include/bits/ranges_algo.h
> > b/libstdc++-v3/include/bits/ranges_algo.h
> > new file mode 100644
> > index 000..2e177ce7f7a
> > --- /dev/null
> > +++ b/libstdc++-v3/include/bits/ranges_algo.h
> > @@ -0,0 +1,3640 @@
> > +// Core algorithmic facilities -*- C++ -*-
> > +
> > +// Copyright (C) 2019-2020 Free Software Foundation, Inc.
> 
> Copyright for new files is wrong, should be only 2020. I know it is painful to
> maintain that when you work on patch on several years.

Thanks, fixed!

> 
> > 
> > +//
> > +// This file is part of the GNU ISO C++ Library.  This library is free
> > +// software; you can redistribute it and/or modify it under the
> > +// terms of the GNU General Public License as published by the
> > +// Free Software Foundation; either version 3, or (at your option)
> > +// any later version.
> > +
> > +// This library is distributed in the hope that it will be useful,
> > +// but WITHOUT ANY WARRANTY; without even the implied warranty of
> > +// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > +// GNU General Public License for more details.
> > +
> > +// Under Section 7 of GPL version 3, you are granted additional
> > +// permissions described in the GCC Runtime Library Exception, version
> > +// 3.1, as published by the Free Software Foundation.
> > +
> > +// You should have received a copy of the GNU General Public License and
> > +// a copy of the GCC Runtime Library Exception along with this program;
> > +// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
> > +// .
> > +
> > +/** @file bits/ranges_algo.h
> > + *  This is an internal header file, included by other library headers.
> > + *  Do not attempt to use it directly. @headername{algorithm}
> > + */
> > +
> > +#ifndef _RANGES_ALGO_H
> > +#define _RANGES_ALGO_H 1
> > +
> > +#if __cplusplus > 201703L
> > +
> > +#include 
> > +#include 
> > +#include 
> > +// #include 
> > +#include 
> > +#include 
> > +#include  // __is_byte
> > +#include  // concept uniform_random_bit_generator
> > +
> > +#if __cpp_lib_concepts
> > +namespace std _GLIBCXX_VISIBILITY(default)
> > +{
> > +_GLIBCXX_BEGIN_NAMESPACE_VERSION
> > +namespace ranges
> > +{
> > +  namespace __detail
> > +  {
> > +template
> > +constexpr inline bool __is_normal_iterator = false;
> > +
> > +template
> > +constexpr inline bool
> > +  __is_normal_iterator<__gnu_cxx::__normal_iterator<_Iterator,
> > _Container>>
> > +  = true;
> > +
> > +template
> > +constexpr inline bool __is_reverse_iterator = false;
> > +
> > +template
> > +constexpr inline bool
> > +  __is_reverse_iterator> = true;
> > +
> > +template
> > +constexpr inline 

Re: [PATCH, rs6000]: mark clobber for registers changed by untpyed_call

2020-02-05 Thread Segher Boessenkool
On Wed, Feb 05, 2020 at 09:53:27PM +0800, Jiufu Guo wrote:
> As PR93047 said, __builtin_apply/__builtin_return does not work well with
> -frename-registers.  This is caused by return register(e.g. r3) is used to
> rename another register, before return register is stored to stack.
> 
> This patch fix this issue by emitting clobber for those egisters which
> maybe changed by untyped call.

Yeah.  untyped_call does

  /* The optimizer does not know that the call sets the function value
 registers we stored in the result block.  We avoid problems by
 claiming that all hard registers are used and clobbered at this
 point.  */
  emit_insn (gen_blockage ());

but blockage does not say registers are used and clobbered:

@cindex @code{blockage} instruction pattern
@item @samp{blockage}
This pattern defines a pseudo insn that prevents the instruction
scheduler and other passes from moving instructions and using register
equivalences across the boundary defined by the blockage insn.
This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM.

Many archs have this same implementation of untyped_call (and of
blockage, too).  It all just works by luck (or it doesn't work).

What we have is:

  emit_call_insn (gen_call (operands[0], const0_rtx, const0_rtx));

  for (i = 0; i < XVECLEN (operands[2], 0); i++)
{
  rtx set = XVECEXP (operands[2], 0, i);
  emit_move_insn (SET_DEST (set), SET_SRC (set));
}

... and nothing in the rtl stream says that those return registers are
actually set by that call.  Maybe we should use gen_call_value?  Can we
ever be asked to return more than a single thing here?

Some trivial patch comments:

> gcc/
> 2020-02-05  Jiufu Guo  
> 
>   PR target/93047
>   * config/rs6000/rs6000.md (untyped_call): add emit_clobber.

"Add", capital.

> gcc/testsuite
> 2020-02-05  Jiufu Guo  
> 
>   PR target/93047
>   * gcc.dg/torture/stackalign/builtin-return-2.c: New case.

"New test case."  (And there is trailing whitespace here; Git warns
about that, so this won't happen much in the future :-) )


Segher


[COMMITTED] c++: Fix SEGV with malformed constructor decl.

2020-02-05 Thread Jason Merrill
In the testcase, since there's no declaration of T, ref_view(T) declares a
non-static data member T of type ref_view, the same type as its enclosing
class.  Then when we try to do C++20 aggregate class template argument
deduction we recursively try to adjust the braced-init-list to match the
template class definition until we run out of stack.

Fixed by rejecting the template data member.  I was interested to see that
there was another test in the suite with this same wrong declaration.

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/92593
* decl.c (grokdeclarator): Reject field of current class type even
in a template.
---
 gcc/cp/decl.c  | 11 +++
 gcc/testsuite/g++.dg/cpp1z/class-deduction68.C | 10 ++
 gcc/testsuite/g++.dg/parse/undefined3.C|  2 +-
 gcc/testsuite/g++.dg/template/pr71710.C|  4 ++--
 4 files changed, 20 insertions(+), 7 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp1z/class-deduction68.C

diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index 859fd1bb931..794370d0ff9 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -13259,10 +13259,13 @@ grokdeclarator (const cp_declarator *declarator,
if (declspecs->explicit_specifier)
  store_explicit_specifier (decl, declspecs->explicit_specifier);
  }
-   else if (!staticp && !dependent_type_p (type)
-&& !COMPLETE_TYPE_P (complete_type (type))
-&& (!complete_or_array_type_p (type)
-|| initialized == 0))
+   else if (!staticp
+&& ((current_class_type
+ && same_type_p (type, current_class_type))
+|| (!dependent_type_p (type)
+&& !COMPLETE_TYPE_P (complete_type (type))
+&& (!complete_or_array_type_p (type)
+|| initialized == 0
  {
if (TREE_CODE (type) != ARRAY_TYPE
|| !COMPLETE_TYPE_P (TREE_TYPE (type)))
diff --git a/gcc/testsuite/g++.dg/cpp1z/class-deduction68.C 
b/gcc/testsuite/g++.dg/cpp1z/class-deduction68.C
new file mode 100644
index 000..a761e70e09c
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/class-deduction68.C
@@ -0,0 +1,10 @@
+// PR c++/92593
+// { dg-do compile { target c++17 } }
+
+template
+struct ref_view
+{
+  ref_view(T) { }; // { dg-error "incomplete" }
+};
+
+ref_view r{1}; // { dg-error "no match|deduction failed" }
diff --git a/gcc/testsuite/g++.dg/parse/undefined3.C 
b/gcc/testsuite/g++.dg/parse/undefined3.C
index 6bafd6fc695..ad445bcdf2f 100644
--- a/gcc/testsuite/g++.dg/parse/undefined3.C
+++ b/gcc/testsuite/g++.dg/parse/undefined3.C
@@ -2,5 +2,5 @@
 // Origin: Volker Reichelt 
 // { dg-do compile }
 
-template struct A { A(B); };
+template struct A { A(B); }; // { dg-error "incomplete" }
 template A::A(B) {} // { dg-error "" }
diff --git a/gcc/testsuite/g++.dg/template/pr71710.C 
b/gcc/testsuite/g++.dg/template/pr71710.C
index 7c394e793b7..1e78cbb62fe 100644
--- a/gcc/testsuite/g++.dg/template/pr71710.C
+++ b/gcc/testsuite/g++.dg/template/pr71710.C
@@ -3,8 +3,8 @@
 
 template < typename > struct A
 {
-  A a;
+  A a;// { dg-error "incomplete" }
   template < int > using B = decltype (a);
-  B < 0 > b;
+  B < 0 > b;  // { dg-prune-output "B. does not name a type" }
   template < int C > B < C > foo ();
 };

base-commit: efe0e5cd64be70690ba287a8dd5841a2b5eefef1
-- 
2.18.1



[PATCH] Use a non-empty test program to test ability to link

2020-02-05 Thread Sandra Loosemore
This patch is for PR 79193 and 88999, problems where libstdc++ is 
mis-configuring itself when building for a bare-metal target because it 
thinks it can link programs without pulling in the BSP that provides 
low-level I/O support.  (Specifically, this was observed on nios2-elf 
with Newlib and GDB semihosting.)  It'll build just fine if it 
recognizes that it can only compile programs and not link them, but it's 
confused because using an empty program to test for ability to link 
succeeds.


Is this configure change OK, and suitable for stage 4?

BTW, I did run autoconf in every subdirectory that contains a 
configure.ac, but it appears only libstc++-v3 actually uses this test; 
all the other regenerated configure scripts were unchanged.


-Sandra
>From 44b769a9b5e01a58c9b89b24ca5a00fc1ff53012 Mon Sep 17 00:00:00 2001
From: Sandra Loosemore 
Date: Wed, 5 Feb 2020 10:03:58 -0800
Subject: [PATCH] Use a non-empty test program to test ability to link.

On bare-metal targets, I/O support is typically provided by a BSP and
requires a linker script and/or hosting library to be specified on the
linker command line.  Linking an empty program with the default linker
script may succeed, however, which confuses libstdc++ configuration
when programs that probe for the presence of various I/O features fail
with link errors.

2020-02-05  Sandra Loosemore  

	PR libstdc++/79193
	PR libstdc++/88999

	config/
	* no-executables.m4: Use a non-empty program to test for linker
	support.

	libstdc++v-3/
	* configure: Regenerated.
---
 config/ChangeLog | 8 
 config/no-executables.m4 | 4 +++-
 libstdc++-v3/ChangeLog   | 7 +++
 libstdc++-v3/configure   | 4 ++--
 4 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/config/ChangeLog b/config/ChangeLog
index f1fec81..d2a12bd 100644
--- a/config/ChangeLog
+++ b/config/ChangeLog
@@ -1,3 +1,11 @@
+2020-02-05  Sandra Loosemore  
+
+	PR libstdc++/79193
+	PR libstdc++/88999
+
+	* no-executables.m4: Use a non-empty program to test for linker
+	support.
+
 2020-02-01  Andrew Burgess  
 
 	* lib-link.m4 (AC_LIB_LINKFLAGS_BODY): Update shell syntax.
diff --git a/config/no-executables.m4 b/config/no-executables.m4
index 9061624..6842f84 100644
--- a/config/no-executables.m4
+++ b/config/no-executables.m4
@@ -25,7 +25,9 @@ AC_BEFORE([$0], [_AC_COMPILER_EXEEXT])
 AC_BEFORE([$0], [AC_LINK_IFELSE])
 
 m4_define([_AC_COMPILER_EXEEXT],
-[AC_LANG_CONFTEST([AC_LANG_PROGRAM()])
+[AC_LANG_CONFTEST([AC_LANG_PROGRAM(
+		 [#include ],
+		 [printf ("hello world\n");])])
 # FIXME: Cleanup?
 AS_IF([AC_TRY_EVAL(ac_link)], [gcc_no_link=no], [gcc_no_link=yes])
 if test x$gcc_no_link = xyes; then
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 76a6e2b..46ab7c0 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,10 @@
+2020-02-05  Sandra Loosemore  
+
+	PR libstdc++/79193
+	PR libstdc++/88999
+
+	* configure: Regenerated.
+
 2020-02-05  Jonathan Wakely  
 
 	* include/bits/iterator_concepts.h (iter_reference_t)
diff --git a/libstdc++-v3/configure b/libstdc++-v3/configure
index a39c33b..9f9c5a2 100755
--- a/libstdc++-v3/configure
+++ b/libstdc++-v3/configure
@@ -4130,11 +4130,11 @@ done
 
 cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 /* end confdefs.h.  */
-
+#include 
 int
 main ()
 {
-
+printf ("hello world\n");
   ;
   return 0;
 }
-- 
2.8.1



Re: [PATCH 2/3] libstdc++: Implement C++20 constrained algorithms

2020-02-05 Thread François Dumont

Hi

    Is it me or the patch isn't an attachment ? It is far more 
convenient to provide something easy to extract and apply locally.


On 2/4/20 3:07 AM, Patrick Palka wrote:

This patch implements the C++20 ranges overloads for the algorithms in
[algorithms].  Most of the algorithms were reimplemented, with each of their
implementations very closely following the existing implementation in
bits/stl_algo.h and bits/stl_algobase.h.  The reason for reimplementing most of
the algorithms instead of forwarding to their STL-style overload is because
forwarding cannot be conformantly and efficiently performed for algorithms that
operate on non-random-access iterators.  But algorithms that operate on random
access iterators can safely and efficiently be forwarded to the STL-style
implementation, and this patch does so for push_heap, pop_heap, make_heap,
sort_heap, sort, stable_sort, nth_element, inplace_merge and stable_partition.

What's missing from this patch is debug-iterator


Always the 5th wheel of the car like we say in French :-)

I'll be looking at this point once I manage to apply the patch.


  and container specializations
that are present for some of the STL-style algorithms that need to be ported
over to the ranges algos.  I marked them missing at TODO comments.  There are
also some other minor outstanding TODOs.

The code that could use the most thorough review is ranges::__copy_or_move,
ranges::__copy_or_move_backward, ranges::__equal and
ranges::__lexicographical_compare.  In the tests, I tried to test the interface
of each new overload, as well as the correctness of the new implementation.

diff --git a/libstdc++-v3/include/bits/ranges_algo.h 
b/libstdc++-v3/include/bits/ranges_algo.h
new file mode 100644
index 000..2e177ce7f7a
--- /dev/null
+++ b/libstdc++-v3/include/bits/ranges_algo.h
@@ -0,0 +1,3640 @@
+// Core algorithmic facilities -*- C++ -*-
+
+// Copyright (C) 2019-2020 Free Software Foundation, Inc.


Copyright for new files is wrong, should be only 2020. I know it is 
painful to maintain that when you work on patch on several years.




+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file bits/ranges_algo.h
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{algorithm}
+ */
+
+#ifndef _RANGES_ALGO_H
+#define _RANGES_ALGO_H 1
+
+#if __cplusplus > 201703L
+
+#include 
+#include 
+#include 
+// #include 
+#include 
+#include 
+#include  // __is_byte
+#include  // concept uniform_random_bit_generator
+
+#if __cpp_lib_concepts
+namespace std _GLIBCXX_VISIBILITY(default)
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+namespace ranges
+{
+  namespace __detail
+  {
+template
+constexpr inline bool __is_normal_iterator = false;
+
+template
+constexpr inline bool
+  __is_normal_iterator<__gnu_cxx::__normal_iterator<_Iterator, _Container>>
+  = true;
+
+template
+constexpr inline bool __is_reverse_iterator = false;
+
+template
+constexpr inline bool
+  __is_reverse_iterator> = true;
+
+template
+constexpr inline bool __is_move_iterator = false;
+
+template
+constexpr inline bool
+  __is_move_iterator> = true;


Did you consider the __is_move_iterator in stl_iterator.h ?

At least this version will also detect a move_iterator in different 
situation. I haven't check yet what you are doing with that but it might 
be an easy way to get better debug iterators integration for instance.


François


Re: [PATCH] c++: Handle CONSTRUCTORs without indexes in find_array_ctor_elt [PR93549]

2020-02-05 Thread Jason Merrill

On 2/4/20 4:20 AM, Jakub Jelinek wrote:

Hi!

My change
* typeck2.c (store_init_value): Don't call cp_fully_fold_init on
initializers of automatic non-constexpr variables in constexpr
functions.
-  value = cp_fully_fold_init (value);
+  /* Don't fold initializers of automatic variables in constexpr functions,
+ that might fold away something that needs to be diagnosed at constexpr
+ evaluation time.  */
+  if (!current_function_decl
+  || !DECL_DECLARED_CONSTEXPR_P (current_function_decl)
+  || TREE_STATIC (decl))
+value = cp_fully_fold_init (value);
from the constexpr new change apparently broke the following testcase.
When handling COND_EXPR, we build_vector_from_val, however as the argument we
pass to it is not an INTEGER_CST/REAL_CST, but that wrapped in a
NON_LVALUE_EXPR location wrapper, we end up with a CONSTRUCTOR and as it is
middle-end that builds it, it doesn't bother with indexes.  The
cp_fully_fold_init call used to fold it into VECTOR_CST in the past, but as
we intentionally don't invoke it anymore as it might fold away something
that needs to be diagnosed during constexpr evaluation, we end up evaluating
ARRAY_REF into the index-less CONSTRUCTOR.  The following patch fixes the
ICE by teaching find_array_ctor_elt to handle CONSTRUCTORs without indexes
(that itself could be still very efficient) and CONSTRUCTORs with some
indexes present and others missing (the rules are that if the index on the
first element is missing, then it is the array's lowest index (in C/C++ 0)
and if other indexes are missing, they are the index of the previous element
+ 1).


Is it currently possible to get a CONSTRUCTOR with non-init-list type 
that has some indexes present and others missing?  Other than from the 
new code in your patch that sets some indexes?



Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Slightly OT, for large arrays it might be efficient to leave indexes out of
the CONSTRUCTORs too, but as the patch shows, we can't then access the
elements using binary search.



Could we (for GCC11+) use either some flag on
the CONSTRUCTOR to denote CONSTRUCTORs with no indexes at all, or some new
tree as index on the first element that would be similar to RANGE_EXPR, but
would say this element has index xyz and the following n elements have
linearly increasing indexes omitted?  If we have such an assurance, we could
do direct access to access the array elts in such CONSTRUCTORs and be even
more efficient than binary search.


Is it unreasonable to assume that if the first element has no index, 
none of the elements do?



2020-02-04  Jakub Jelinek  

PR c++/93549
* constexpr.c (find_array_ctor_elt): Deal with some or all indexes in
CONSTRUCTOR missing.

* g++.dg/ext/constexpr-pr93549.C: New test.

--- gcc/cp/constexpr.c.jj   2020-01-26 00:20:26.532367552 +0100
+++ gcc/cp/constexpr.c  2020-02-03 17:14:21.520780109 +0100
@@ -3023,7 +3023,8 @@ find_array_ctor_elt (tree ary, tree dind
if (end > 0)
  {
tree cindex = (*elts)[end - 1].index;
-  if (TREE_CODE (cindex) == INTEGER_CST
+  if (cindex
+ && TREE_CODE (cindex) == INTEGER_CST
  && compare_tree_int (cindex, end - 1) == 0)
{
  if (i < end)
@@ -3037,8 +3038,32 @@ find_array_ctor_elt (tree ary, tree dind
while (begin != end)
  {
unsigned HOST_WIDE_INT middle = (begin + end) / 2;
-  constructor_elt  = (*elts)[middle];
-  tree idx = elt.index;
+  constructor_elt *elt = &(*elts)[middle];
+  tree idx = elt->index;
+
+  if (idx == NULL_TREE)
+   {
+ /* If some or all indexes are missing, we can't use binary search.  */
+ unsigned HOST_WIDE_INT j
+   = (begin > 0 ? tree_to_uhwi ((*elts)[begin - 1].index) + 1 : 0);
+ for (middle = begin; middle < end; middle++)
+   if ((*elts)[middle].index)
+ {
+   elt = &(*elts)[middle];
+   idx = elt->index;
+   break;
+ }
+   else if (i == j + (middle - begin))
+ {
+   (*elts)[middle].index = dindex;


Why set this index?


+   return middle;
+ }
+ if (middle == end)
+   {
+ begin = end;
+ continue;
+   }
+   }
  
int cmp = array_index_cmp (dindex, idx);

if (cmp < 0)
@@ -3053,7 +3078,7 @@ find_array_ctor_elt (tree ary, tree dind
  constructor_elt e;
  tree lo = TREE_OPERAND (idx, 0);
  tree hi = TREE_OPERAND (idx, 1);
- tree value = elt.value;
+ tree value = elt->value;
  dindex = fold_convert (sizetype, dindex);
  if (tree_int_cst_lt (lo, dindex))
{
@@ -3062,7 +3087,7 @@ find_array_ctor_elt (tree ary, tree dind
 size_one_node);
  if (tree_int_cst_equal (lo, new_hi))

Re: [PATCH] c++: Mark __builtin_convertvector operand as read [PR93557]

2020-02-05 Thread Jason Merrill

On 2/5/20 5:44 AM, Jakub Jelinek wrote:

Hi!

In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument.  I guess it could misbehave even with lambda implicit captures.

Fixed by calling decay_conversion on the argument, we use the argument as
rvalue so we want the standard lvalue to rvalue conversions, but as the
argument must be a vector type, e.g. integral promotions aren't really
needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK.


2020-02-05  Jakub Jelinek  

PR c++/93557
* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
prior to passing it to c_build_vec_convert.

* c-c++-common/Wunused-var-17.c: New test.

--- gcc/cp/semantics.c.jj   2020-02-01 10:01:36.581701153 +0100
+++ gcc/cp/semantics.c  2020-02-04 14:38:43.618588979 +0100
@@ -10389,7 +10389,8 @@ cp_build_vec_convert (tree arg, location
  
tree ret = NULL_TREE;

if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
-ret = c_build_vec_convert (cp_expr_loc_or_input_loc (arg), arg,
+ret = c_build_vec_convert (cp_expr_loc_or_input_loc (arg),
+  decay_conversion (arg, complain),
   loc, type, (complain & tf_error) != 0);
  
if (!processing_template_decl)

--- gcc/testsuite/c-c++-common/Wunused-var-17.c.jj  2020-02-04 
14:42:33.648152691 +0100
+++ gcc/testsuite/c-c++-common/Wunused-var-17.c 2020-02-04 14:41:46.092862987 
+0100
@@ -0,0 +1,19 @@
+/* PR c++/93557 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wunused-but-set-variable" } */
+
+typedef int VI __attribute__((vector_size (sizeof (int) * 4)));
+typedef float VF __attribute__((vector_size (sizeof (float) * 4)));
+
+void
+foo (VI *p, VF *q)
+{
+  VI a = (VI) { 1, 2, 3, 4 };  /* { dg-bogus "set but not 
used" } */
+  q[0] = __builtin_convertvector (a, VF);
+  VI b = p[1]; /* { dg-bogus "set but not 
used" } */
+  q[1] = __builtin_convertvector (b, VF);
+  VF c = (VF) { 5.0f, 6.0f, 7.0f, 8.0f };  /* { dg-bogus "set but not 
used" } */
+  p[2] = __builtin_convertvector (c, VI);
+  VF d = q[3]; /* { dg-bogus "set but not 
used" } */
+  p[3] = __builtin_convertvector (d, VI);
+}

Jakub





[PATCH] x86-64: Pass aggregates with only float/double in GPRs for MS_ABI

2020-02-05 Thread H.J. Lu
MS_ABI requires passing aggregates with only float/double in integer
registers.  Checked gcc outputs against Clang and fixed:

FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O0
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O2
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O0
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O2
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O0
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56
-Wno-unused-variable -Wno-unused-parameter
-Wno-unused-but-set-variable -Wno-uninitialized -O2
-DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test

in libffi testsuite.

OK for master and backports to GCC 8/9 branches?

gcc/

PR target/85667
* config/i386/i386.c (function_arg_ms_64): Add a type argument.
Don't return aggregates with only SFmode and DFmode in SSE
register.
(ix86_function_arg): Pass arg.type to function_arg_ms_64.

gcc/testsuite/

PR target/85667
* gcc.target/i386/pr85667-10.c: New test.
* gcc.target/i386/pr85667-7.c: Likewise.
* gcc.target/i386/pr85667-8.c: Likewise.
* gcc.target/i386/pr85667-9.c: Likewise.


-- 
H.J.
From e561fd8fcb46b8d8e40942c077e26ce120832747 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 5 Feb 2020 09:49:56 -0800
Subject: [PATCH] x86-64: Pass aggregates with only float/double in GPRs for
 MS_ABI

MS_ABI requires passing aggregates with only float/double in integer
registers.  Checked gcc outputs against Clang and fixed:

FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=54 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=55 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O0 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test
FAIL: libffi.bhaible/test-callback.c -W -Wall -Wno-psabi -DDGTEST=56 -Wno-unused-variable -Wno-unused-parameter -Wno-unused-but-set-variable -Wno-uninitialized -O2 -DABI_NUM=FFI_GNUW64 -DABI_ATTR=MSABI execution test

in libffi testsuite.

gcc/

	PR target/85667
	* config/i386/i386.c (function_arg_ms_64): Add a type argument.
	Don't return aggregates with only SFmode and DFmode in SSE
	register.
	(ix86_function_arg): Pass arg.type to function_arg_ms_64.

gcc/testsuite/

	PR target/85667
	* gcc.target/i386/pr85667-10.c: New test.
	* gcc.target/i386/pr85667-7.c: Likewise.
	* gcc.target/i386/pr85667-8.c: Likewise.
	* gcc.target/i386/pr85667-9.c: Likewise.
---
 gcc/config/i386/i386.c | 10 --
 gcc/testsuite/gcc.target/i386/pr85667-10.c | 21 +
 gcc/testsuite/gcc.target/i386/pr85667-7.c  | 36 ++
 gcc/testsuite/gcc.target/i386/pr85667-8.c  | 21 +
 gcc/testsuite/gcc.target/i386/pr85667-9.c  | 36 ++
 5 files changed, 121 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85667-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85667-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85667-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr85667-9.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ffda3e8fd21..f769cb8f75e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -3153,7 +3153,7 @@ function_arg_64 (const CUMULATIVE_ARGS *cum, machine_mode mode,
 
 static rtx
 function_arg_ms_64 (const CUMULATIVE_ARGS *cum, machine_mode mode,
-		machine_mode orig_mode, 

[committed] amdgcn: Remove redundant multilib

2020-02-05 Thread Andrew Stubbs
This patch removes a redundant "gfx900/gfx906" multilib that was added 
by accident. We need those options independently, but not together.


Andrew
amdgcn: Remove redundant multilib

2020-02-05  Andrew Stubbs  

	gcc/
	* config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Use / not space.

diff --git a/gcc/config/gcn/t-gcn-hsa b/gcc/config/gcn/t-gcn-hsa
index 9bd16abbc02..af203c5cf05 100644
--- a/gcc/config/gcn/t-gcn-hsa
+++ b/gcc/config/gcn/t-gcn-hsa
@@ -42,7 +42,7 @@ ALL_HOST_OBJS += gcn-run.o
 gcn-run$(exeext): gcn-run.o
 	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ $< -ldl
 
-MULTILIB_OPTIONS = march=gfx900 march=gfx906
+MULTILIB_OPTIONS = march=gfx900/march=gfx906
 MULTILIB_DIRNAMES = gfx900 gfx906
 
 PASSES_EXTRA += $(srcdir)/config/gcn/gcn-passes.def


Re: [PATCH V2] Generalized value pass-through for self-recursive function (ipa/pr93203)

2020-02-05 Thread Martin Jambor
Hi,

On Sat, Jan 25 2020, Feng Xue OS wrote:
> Made some changes.
>
> Feng
>
> 
> From: Feng Xue OS
> Sent: Saturday, January 25, 2020 5:54 PM
> To: mjam...@suse.cz; Jan Hubicka; gcc-patches@gcc.gnu.org
> Subject: [PATCH] Generalized value pass-through for self-recursive function 
> (ipa/pr93203)
>
> Besides simple pass-through (aggregate) jump function, arithmetic (aggregate)
> jump function could also bring same (aggregate) value as parameter passed-in
> for self-feeding recursive call.  For example,
>
>   f1 (int i)/*  normal jump function */
>  {
> f1 (i & 1);
>  }
>
> Suppose i is 0, recursive propagation via (i & 1) also gets 0, which
> can be seen as a simple pass-through of i.
>
>   f2 (int *p)  /* aggregate jump function */
>  {
> int t = *p & 1;
> f2 ();
>  }
> Likewise, if *p is 0, (*p & 1) is also 0, and  is an aggregate simple
> pass-through of p.
>
> This patch is to support these two kinds of value pass-through.
> Bootstrapped/regtested on x86_64-linux and aarch64-linux.

sorry for the delay in the review.  As far as I am concerned, I am OK
with the patch but please see the few comments below:

> ---
> 2020-01-25  Feng Xue  
>
> PR ipa/93203
> * ipa-cp.c (ipcp_lattice::add_value): Add source with same call
> edge but different source value.
> (adjust_callers_for_value_intersection): New function.
> (gather_edges_for_value): Adjust order of callers to let a
> non-self-recursive caller be the first element.
> (self_recursive_pass_through_p): Add a new parameter simple, and
> check generalized self-recursive pass-through jump function.
> (self_recursive_agg_pass_through_p): Likewise.
> (find_more_scalar_values_for_callers_subset): Compute value from
> pass-through jump function for self-recursive.
> (intersect_with_plats): Remove code of itersection with unknown
> place holder value.
> (intersect_with_agg_replacements): Likewise.
> (intersect_aggregates_with_edge): Deduce with from pass-through
> jump function for self-recursive.
> (decide_whether_version_node): Remove dead callers and adjust
> order to let a non-self-recursive caller be the first element.
>
> From 74aef0cd2f40ff828a4b2abcbbdbbf4b1aea1fcf Mon Sep 17 00:00:00 2001
> From: Feng Xue 
> Date: Tue, 21 Jan 2020 20:53:38 +0800
> Subject: [PATCH] Generalized value pass-through for self-recusive function
>
> ---
>  gcc/ipa-cp.c   | 195 ++---
>  gcc/testsuite/g++.dg/ipa/pr93203.C |  95 ++
>  gcc/testsuite/gcc.dg/ipa/ipcp-1.c  |   2 +-
>  3 files changed, 216 insertions(+), 76 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr93203.C
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 17da1d8e8a7..64d23a34292 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c

...

> @@ -4817,19 +4867,12 @@ intersect_with_plats (class ipcp_param_lattices 
> *plats,
>   break;
> if (aglat->offset - offset == item->offset)
>   {
> -   gcc_checking_assert (item->value);

I've been staring at this for quite a while, trying to figure out how
your patch can put NULL here before I realized it was just a clean-up
:-)  Sending such changes independently or pointing them out in the
email/ChangeLog makes review easier.

> if (aglat->is_single_const ())
>   {
> tree value = aglat->values->value;
>  
> if (values_equal_for_ipcp_p (item->value, value))
>   found = true;
> -   else if (item->value == error_mark_node)
> - {
> -   /* Replace unknown place holder value with real one.  */
> -   item->value = value;
> -   found = true;
> - }
>   }
> break;
>   }

...

> @@ -5564,7 +5610,6 @@ decide_whether_version_node (struct cgraph_node *node)
>   }
>clone = create_specialized_node (node, known_csts, known_contexts,
>  aggvals, callers);
> -  info = IPA_NODE_REF (node);

please either drop this change or change it to:

   gcc_checking_assert (info == IPA_NODE_REF (node));

this line of code was actually necessary when adding nodes possibly
invalidated addresses of all summaries - like fast_function_summary
classes still do.  So if we ever decide to use fast summaries we need a
test to remind us that info address must be obtained again.

Thanks for working on this!

Martin



Re: [PATCH 2/3] Add patch_area_size and patch_area_entry to cfun

2020-02-05 Thread Richard Sandiford
"H.J. Lu"  writes:
> Currently patchable area is at the wrong place.

Agreed :-)

> It is placed immediately
> after function label and before .cfi_startproc.  A backend should be able
> to add a pseudo patchable area instruction durectly into RTL.  This patch
> adds patch_area_size and patch_area_entry to cfun so that the patchable
> area info is available in RTL passes.

It might be better to add it to crtl, since it should only be needed
during rtl generation.

> It also limits patch_area_size and patch_area_entry to 65535, which is
> a reasonable maximum size for patchable area.
>
> gcc/
>
>   PR target/93492
>   * function.c (expand_function_start): Set cfun->patch_area_size
>   and cfun->patch_area_entry.
>   * function.h (function): Add patch_area_size and patch_area_entry.
>   * opts.c (common_handle_option): Limit
>   function_entry_patch_area_size and function_entry_patch_area_start
>   to USHRT_MAX.  Fix a typo in error message.
>   * varasm.c (assemble_start_function): Use cfun->patch_area_size
>   and cfun->patch_area_entry.
>   * doc/invoke.texi: Document the maximum value for
>   -fpatchable-function-entry.
>
> gcc/testsuite/
>
>   PR target/93492
>   * c-c++-common/patchable_function_entry-error-1.c: New test.
>   * c-c++-common/patchable_function_entry-error-2.c: Likewise.
>   * c-c++-common/patchable_function_entry-error-3.c: Likewise.
> ---
>  gcc/doc/invoke.texi   |  1 +
>  gcc/function.c| 35 +++
>  gcc/function.h|  6 
>  gcc/opts.c|  4 ++-
>  .../patchable_function_entry-error-1.c|  9 +
>  .../patchable_function_entry-error-2.c|  9 +
>  .../patchable_function_entry-error-3.c| 20 +++
>  gcc/varasm.c  | 30 ++--
>  8 files changed, 85 insertions(+), 29 deletions(-)
>  create mode 100644 
> gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
>  create mode 100644 
> gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
>  create mode 100644 
> gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
>
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 35b341e759f..dd4835199b0 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
>  The NOP instructions are inserted at---and maybe before, depending on
>  @var{M}---the function entry address, even before the prologue.
>  
> +The maximum value of @var{N} and @var{M} is 65535.
>  @end table
>  
>  
> diff --git a/gcc/function.c b/gcc/function.c
> index d8008f60422..badbf538eec 100644
> --- a/gcc/function.c
> +++ b/gcc/function.c
> @@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
>/* If we are doing generic stack checking, the probe should go here.  */
>if (flag_stack_check == GENERIC_STACK_CHECK)
>  stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
> +
> +  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
> +  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
> +
> +  tree patchable_function_entry_attr
> += lookup_attribute ("patchable_function_entry",
> + DECL_ATTRIBUTES (cfun->decl));
> +  if (patchable_function_entry_attr)
> +{
> +  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
> +  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
> +
> +  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
> +  patch_area_entry = 0;
> +  if (TREE_CHAIN (pp_val) != NULL_TREE)
> + {
> +   tree patchable_function_entry_value2
> + = TREE_VALUE (TREE_CHAIN (pp_val));
> +   patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
> + }
> +  if (patch_area_size > USHRT_MAX || patch_area_size > USHRT_MAX)
> + error ("invalid values for % attribute");

This should probably go in handle_patchable_function_entry_attribute
instead.  It doesn't look like we have a clear policy about whether
errors or warnings are right for unrecognised arguments to known
attribute names, but handle_patchable_function_entry_attribute reports
an OPT_Wattributes warning for arguments that aren't integers.  Doing the
same for out-of-range integers would be more consistent and means that
we wouldn't break existing code if we relaxed/changed the rules in future.

Thanks,
Richard


Re: [GCC][BUG][Aarch64][ARM] (PR93300) Fix ICE due to BFmode placement in GET_MODES_WIDER chain.

2020-02-05 Thread Richard Sandiford
Stam Markianos-Wright  writes:
> On 2/4/20 12:02 PM, Richard Sandiford wrote:
>> Stam Markianos-Wright  writes:
>>> On 1/31/20 1:45 PM, Richard Sandiford wrote:
 Stam Markianos-Wright  writes:
> On 1/30/20 10:01 AM, Richard Sandiford wrote:
>> Stam Markianos-Wright  writes:
>>> On 1/29/20 12:42 PM, Richard Sandiford wrote:
 Stam Markianos-Wright  writes:
> Hi all,
>
> This fixes:
>
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93300
>
> Genmodes.c was generating the "wider_mode" chain as follows:
>
> HF -> BF -> SF - > DF -> TF -> VOID
>
> This caused issues in some rare cases where conversion between modes
> was needed, such as the above PR93300 where BFmode was being picked up
> as a valid mode for:
>
> optabs.c:prepare_float_lib_cmp
>
> which then led to the ICE at expr.c:convert_mode_scalar.
>>>
>>> Hi Richard,
>>>

 Can you go into more details about why this chain was a problem?
 Naively, it's the one I'd have expected: HF should certainly have
 priority over BF,
>>>
>>> Is that because functionally it looks like genmodes puts things in 
>>> reverse
>>> alphabetical order if all else is equal? (If I'm reading the comment 
>>> about
>>> MODE_RANDOM, MODE_CC correctly)
>>>
 but BF coming before SF doesn't seem unusual in itself.

 I'm not saying the patch is wrong.  It just wasn't clear why it was
 right either.

>>> Yes, I see what you mean. I'll go through my thought process here:
>>>
>>> In investigating the ICE PR93300 I found that the diversion from 
>>> pre-bf16
>>> behaviour was specifically at `optabs.c:prepare_float_lib_cmp`, where a
>>> `FOR_EACH_MODE_FROM (mode, orig_mode)` is used to then go off and 
>>> generate
>>> library calls for conversions.
>>>
>>> This was then being caught further down by the gcc_assert at expr.c:325 
>>> where
>>> GET_MODE_PRECISION (from_mode) was equal to GET_MODE_PRECISION 
>>> (to_mode) because
>>> it was trying to emit a HF->BF conversion libcall as `bl __extendhfbf2` 
>>> (which
>>> is what happened if i removed the gcc_assert at expr.c:325)
>>>
>>> With BFmode being a target-defined mode, I didn't want to add something 
>>> like `if
>>> (mode != BFmode)` to specifically exclude BFmode from being selected 
>>> for this.
>>> (and there's nothing different between HFmode and BFmode here to allow 
>>> me to
>>> make this distinction?)
>>>
>>> Also I couldn't find anywhere where the target back-end is not 
>>> consulted for a
>>> "is this supported: yes/no" between the `FOR_EACH_MODE_FROM` loop and 
>>> the
>>> libcall being created later on as __extendhfbf2.
>>
>> Yeah, prepare_float_lib_cmp just checks for libfuncs rather than
>> calling target hooks directly.  The libfuncs themselves are under
>> the control of the target though.
>>
>> By default we assume all float modes have associated libfuncs.
>> It's then up to the target to remove functions that don't exist
>> (or redirect them to other functions).  So I think we need to remove
>> BFmode libfuncs in arm_init_libfuncs in the same way as we currently
>> do for HFmode.
>>
>> I guess we should also nullify the conversion libfuncs for BFmode,
>> not just the arithmetic and comparison ones.
>
> Ahhh now this works, thank you for the suggestion!
>
> I was aware of arm_init_libfuncs, but I had not realised that returning 
> NULL
> would have the desired effect for us, in this case. So I have essentially 
> rolled
> back the whole previous version of the patch and done this in the new 
> diff.
> It seems to have fixed the ICE and I am currently in the process of 
> regression
> testing!

 LGTM behaviourally, just a couple of requests about how it's written:

>
> Thank you!
> Stam
>
>>
>> Thanks,
>> Richard
>>
>>> Finally, because we really don't want __bf16 to be in the same 
>>> "conversion rank"
>>> as standard float modes for things like automatic promotion, this 
>>> seemed like a
>>> reasonable solution to that problem :)
>>>
>>> Let me know of your thoughts!
>>>
>>> Cheers,
>>> Stam
>
> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> index c47fc232f39..18055d4a75e 100644
> --- a/gcc/config/arm/arm.c
> +++ b/gcc/config/arm/arm.c
> @@ -2643,6 +2643,30 @@ arm_init_libfuncs (void)
>default:
>  break;
>}
> +
> +  /* For all possible libcalls in BFmode, return NULL.  */
> +  /* Conversions.  */
> +  set_conv_libfunc (trunc_optab, BFmode, HFmode, (NULL));
> 

Re: [Patch][Testsuite] – More fixes for remote execution: check_gc_sections_available, …

2020-02-05 Thread Richard Sandiford
Tobias Burnus  writes:
> Still pending: libgomp-Testsuite patch 
> https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00207.html
>
> This is the same fix – but for gcc/testsuite/.
>
> To illustrate the problem again. Using remote testing
> (here: modified target_compile, but DejaGNU's default_target_compile is 
> likewise),
> and using check_gc_sections_available, I get the following result without the 
> fix:
>
> call_remote  download  -print-prog-name=ld -print-prog-name=ld
> Invoking the compiler as powerpc64le-none-linux-gnu-gcc … 
> /scratch/…/default/gcc.d/-print-prog-name=ld
> …
> UNSUPPORTED: gcc.dg/special/gcsec-1.c
>
> Which obvious fails both for uploading the flag as file and for compiling the
> flag as filename.
>
>
> WITH the patch, I get the expected:
>
> Invoking the compiler as powerpc64le-none-linux-gnu-gcc … 
> -fdiagnostics-urls=never  -print-prog-name=ld -I .
> …
> PASS: gcc.dg/special/gcsec-1.c (test for excess errors)
> PASS: gcc.dg/special/gcsec-1.c execution test
>
> I tested the attached patch for check_gc_sections_available thoroughly
> – esp. via RUNTESTFLAGS="-v -v -v -v -v special.exp=gcsec-1.c"
> w/ and w/o remote but also using, intermittently, some debugging "puts".
>
> I also fixed check_effective_target_gld, check_effective_target_gas, 
> check_runtime,
> check_multi_dir, but tested them only lightly. (The first two are unused, the
> others are only used by mips and arm.)
> Regarding the '{ }' see below – or in the linked other patch. Without, only 
> one
> argument is actually passed.
>
> OK for the trunk?
>
> Tobias
>
> On 2/4/20 4:49 PM, Tobias Burnus wrote:
>> DejaGNU uses in lib/target.exp's
>>    proc default_target_compile {source destfile type options}
>> uses the following.
>>
>> When running locally, $source is simply used
>> as argument. However, when run remotely, it is tried to be uploaded
>> on the remote host – and otherwise skipped. That's fine if the
>> argument is an actual file – but not if it is a flag.
>>
>> Hence, flags should be passed as $options not as $source.
>> Quoting now from DejaGNU's default_target_compile#:
>>
>>     set sources ""
>>     if {[isremote host]} {
>>     foreach x $source {
>>     set file [remote_download host $x]
>>     if { $file eq "" } {
>>     warning "Unable to download $x to host."
>>     return "Unable to download $x to host."
>>     } else {
>>     append sources " $file"
>>     }
>>     }
>>     } else {
>>     set sources $source
>>     }
>>
>>  * * *
>>
>> FIRST, I think that affects the following calls, but I have not
>> checked. — I assume that moving it from the first to the last
>> argument should work and fix the "isremote" issue.
>>
>> gcc/testsuite/gcc.target/arm/multilib.exp:    set gcc_output 
>> [${tool}_target_compile "--print-multi-directory $gcc_opts" "" "none" ""]
>> gcc/testsuite/lib/target-supports.exp:    set gcc_output 
>> [${tool}_target_compile "-v" "" "none" ""]
>> gcc/testsuite/lib/target-supports.exp:  set gcc_ld [lindex 
>> [${tool}_target_compile "-print-prog-name=ld" "" "none" ""] 0]
>> gcc/testsuite/lib/target-supports.exp:  set gcc_as [lindex 
>> [${tool}_target_compile "-print-prog-name=as" "" "none" ""] 0]
>> gcc/testsuite/lib/target-supports.exp:  set gcc_ld [lindex 
>> [${tool}_target_compile "-print-prog-name=ld" "" "none" ""] 0]
>>
>> TODO: One should probably change this.
>>
>>
>> SECONDLY: I hit a very similar issue in libgomp, which I actually did 
>> debug.
>>
>> In check_effective_target_offload_target_nvptx, one has something 
>> similar, namely:
>>   set gcc_output [libgomp_target_compile "-v $options" "" "none" ""]
>> This currently tries (w/o success) to upload the flags to the remote 
>> host and then
>> fails as the required "-v" flag fails (i.e. no input).
>>
>> However, using "-v" as options argument does not work as seemingly 
>> only additional_flags=
>> arguments are passed on. Additionally, if one does not only want to 
>> pass on the first
>> argument, curly braces are needed.
>>
>> PATCH: The attached patch does this – and have successfully tested it 
>> both with local
>> runs and with remote runs. (RUNTESTFLAGS="-v -v -v" did help a lot for 
>> studying the
>> behavior in both cases.)
>>
>> OK for the trunk?
>>
>> Cheers,
>>
>> Tobias
>
>   gcc/testsuite/
>   * gcc.target/arm/multilib.exp (multilib_config): Pass flags to
>   …_target_compile as (additional_flags=) option and not as source
>   filename to make it work with remote execution.
>   * lib/target-supports.exp (check_runtime, check_gc_sections_available,
>   check_effective_target_gas, check_effective_target_gld): Likewise.
>
> diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp 
> b/gcc/testsuite/gcc.target/arm/multilib.exp
> index 67d00266f6b..60b9edeebd0 100644
> --- a/gcc/testsuite/gcc.target/arm/multilib.exp
> +++ b/gcc/testsuite/gcc.target/arm/multilib.exp
> @@ -40,7 +40,7 @@ proc 

Re: [Patch][Testsuite][libgomp] – Fix check_effective_target_offload_target_nvptx for remote execution

2020-02-05 Thread Richard Sandiford
Tobias Burnus  writes:
> diff --git a/libgomp/testsuite/lib/libgomp.exp 
> b/libgomp/testsuite/lib/libgomp.exp
> index 7e94527c7ca..cb7757b6a91 100644
> --- a/libgomp/testsuite/lib/libgomp.exp
> +++ b/libgomp/testsuite/lib/libgomp.exp
> @@ -346,11 +346,11 @@ proc check_effective_target_offload_target_nvptx { } {
>  # files; in particular, '-foffload', 'libgomp.oacc-*/*.exp'), which don't
>  # get passed on to 'check_effective_target_*' functions.  (Not caching 
> the
>  # result due to that.)
> -set options [current_compiler_flags]
> +set options [concat "{additional_flags=-v" [current_compiler_flags] "}"]

Using:

set options [list "additional_flags=[concat "-v" [current_compiler_flags]]"]

is maybe more obvious, and should get the { ... } quoting right for
unusual cases.

LGTM otherwise.

Thanks,
Richard


Re: [PATCH] avoid issuing -Wrestrict from folder (PR 93519)

2020-02-05 Thread Martin Sebor

On 2/5/20 1:19 AM, Richard Biener wrote:

On Tue, Feb 4, 2020 at 11:02 PM Martin Sebor  wrote:


On 2/4/20 2:31 PM, Jeff Law wrote:

On Tue, 2020-02-04 at 13:08 -0700, Martin Sebor wrote:

On 2/4/20 12:15 PM, Richard Biener wrote:

On February 4, 2020 5:30:42 PM GMT+01:00, Jeff Law  wrote:

On Tue, 2020-02-04 at 10:34 +0100, Richard Biener wrote:

On Tue, Feb 4, 2020 at 1:44 AM Martin Sebor  wrote:

PR 93519 reports a false positive -Wrestrict issued for an inlined

call

to strcpy that carefully guards against self-copying.  This is

caused

by the caller's arguments substituted into the call during inlining

and

before dead code elimination.

The attached patch avoids this by removing -Wrestrict from the

folder

and deferring folding perfectly overlapping (and so undefined)

calls

to strcpy (and mempcpy, but not memcpy) until much later.  Calls to
perfectly overlapping calls to memcpy are still folded early.


Why do we bother to warn at all for this case?  Just DWIM here.

Warnings like

this can be emitted from the analyzer?

They potentially can, but the analyzer is and will almost always
certainly be considerably slower.  I would not expect it to be used
nearly as much as the core compiler.

WHether or not a particular warning makes sense in the core compiler or
analyzer would seem to me to depend on whether or not we can reasonably
issue warnings without interprocedural analysis.  double-free
realistically requires interprocedural analysis to be effective.  I'm
not sure Wrestrict really does.



That is, I suggest to simply remove the bogus warning code from

folding

(and _not_ fail the folding).

I haven't looked at the patch, but if we can get the warning out of the
folder that's certainly preferable.  And we could investigate deferring
self-copy removal.


I think the issue is as usual, warning for code we'll later remove as dead. 
Warning at folding is almost always premature.


In this instance the code is reachable (or isn't obviously unreachable).
GCC doesn't remove it, but provides benign (and reasonable) semantics
for it(*).  To me, that's one aspect of quality.  Letting the user know
that the code is buggy is another.  I view that as at least as important
as folding the ill-effects away because it makes it possible to fix
the problem so the code works correctly even with compilers that don't
provide these benign semantics.

If you look at the guts of what happens at the point where we issue the
warning from within gimple_fold_builtin_strcpy we have:


DCH_to_char (char * in, char * out, int collid)
{
int type;
char * D.2148;
char * dest;
char * num;
long unsigned int _4;
char * _5;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
;;succ:   4

;;   basic block 4, loop depth 0
;;pred:   2
;;succ:   5

;;   basic block 5, loop depth 0
;;pred:   4
;;succ:   6

;;   basic block 6, loop depth 0
;;pred:   5
if (0 != 0)
  goto ; [53.47%]
else
  goto ; [46.53%]
;;succ:   7
;;8

;;   basic block 7, loop depth 0
;;pred:   6
strcpy (out_1(D), out_1(D));
;;succ:   8

;;   basic block 8, loop depth 0
;;pred:   6
;;7
_4 = __builtin_strlen (out_1(D));
_5 = out_1(D) + _4;
__builtin_memcpy (_5, "foo", 4);
;;succ:   3

;;   basic block 3, loop depth 0
;;pred:   8
return;
;;succ:   EXIT

}



Which shows the code is obviously unreachable in the case we're warning
about.  You can't see this in the dumps because it's exposed by
inlining, then cleaned up before writing the dump file.


In the specific case of the bug the code is of course eliminated
because it's guarded by the if (s != d).  I was referring to
the general (unguarded) case of:

char *s = "", *p;

int main (void)
{
  p = strcpy (s, s);
  puts (p);
}

where GCC folds the assignment 'p = strcpy(s, s);' to effectively
p = s;  That's perfectly reasonable but it could equally as well
leave the call alone, as it does when s is null, for instance.

I think folding it away is not only reasonable but preferable to
making the invalid call, but it's done only rarely.  Most of
the time GCC does emit the undefined access (it does that with
calls to library functions as well as with direct stores and
reads).  (I am hoping we can change that in the future so that
these kinds of problems are handled with some consistency.)



ISTM this would be a case we could handle with the __builtin_warning
stuff.

I think the question is do we want to do anything about it this cycle?


If so, I think Martin's approach is quite reasonable.  It disables
folding away the self-copies from gimple-fold and moves the warning
into the expander.  So if there's such a call in the IL at expansion
time we get a warning (-O0).

I'd hazard a guess that the diagnostic was added to the strlen pass to
capture the missed warning when we're optimizing and the 

Re: [PATCH/RFC] Make -fanalyzer C only for GCC 10 (PR 93392)

2020-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2020 at 04:18:49PM +0100, Jakub Jelinek wrote:
> C has ctors before main too, look for attribute constructor.

And you can have exceptions in C too, attribute cleanup for the C
"destructors" that are invoked when exception is thrown or pthread
cancellation is invoked.

Jakub



Re: [PATCH/RFC] Make -fanalyzer C only for GCC 10 (PR 93392)

2020-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2020 at 09:59:19AM -0500, David Malcolm wrote:
> Although the analyzer works on GIMPLE SSA and therefore in theory ought
> to support all languages supported by GCC, the code currently only
> supports the subset of GIMPLE SSA expressible via the C frontend.
> 
> For GCC 10 I want to explicitly restrict the scope of the analyzer to
> C code, to keep the initial scope of the feature sane.
> 
> For example, various C++ things aren't yet supported by the analyzer and
> won't be in GCC 10 (exceptions, new/delete diagnostics, ctors that run
> before main, etc)

C has ctors before main too, look for attribute constructor.

I must say I don't really like this, if you encounter something in the IL
that the analyzer can't handle yet, punt, that is fine, but you could
write code in C++ that doesn't use exceptions nor new/delete, there is
no reason not to analyze that, etc.  And teaching it to handle new/delete
operator like malloc/free shouldn't be that hard, but even if you don't,
you still need some way how to deal with other allocators in C, say for
stuff allocated with mmap etc.

Mentioning in the documentation that it is for now primarily intended for C
is one thing (that is fine), but stopping analyzing something is another.

Jakub



[committed] libstdc++: Remove workarounds for constraints on alias templates

2020-02-05 Thread Jonathan Wakely
The G++ bug has been fixed for a couple of months so we can remove these
workarounds that define alias templates in terms of constrained class
templates. We can just apply constraints directly to alias templates as
specified in the C++20 working draft.

* include/bits/iterator_concepts.h (iter_reference_t)
(iter_rvalue_reference_t, iter_common_reference_t, indirect_result_t):
Remove workarounds for PR c++/67704.
* testsuite/24_iterators/aliases.cc: New test.

Tested powerpc64le-linux, committed to master.


commit 269e8130b77065452698ab97e5da77d132d00276
Author: Jonathan Wakely 
Date:   Wed Feb 5 10:35:19 2020 +

libstdc++: Remove workarounds for constraints on alias templates

The G++ bug has been fixed for a couple of months so we can remove these
workarounds that define alias templates in terms of constrained class
templates. We can just apply constraints directly to alias templates as
specified in the C++20 working draft.

* include/bits/iterator_concepts.h (iter_reference_t)
(iter_rvalue_reference_t, iter_common_reference_t, 
indirect_result_t):
Remove workarounds for PR c++/67704.
* testsuite/24_iterators/aliases.cc: New test.

diff --git a/libstdc++-v3/include/bits/iterator_concepts.h 
b/libstdc++-v3/include/bits/iterator_concepts.h
index bf581597229..d9b8958d0a7 100644
--- a/libstdc++-v3/include/bits/iterator_concepts.h
+++ b/libstdc++-v3/include/bits/iterator_concepts.h
@@ -70,17 +70,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{
  { *__t } -> __can_reference;
};
-
-// FIXME: needed due to PR c++/67704
-template<__detail::__dereferenceable _Tp>
-  struct __iter_ref
-  {
-   using type = decltype(*std::declval<_Tp&>());
-  };
   } // namespace __detail
 
-  template
-using iter_reference_t = typename __detail::__iter_ref<_Tp>::type;
+  template<__detail::__dereferenceable _Tp>
+using iter_reference_t = decltype(*std::declval<_Tp&>());
 
   namespace ranges
   {
@@ -127,26 +120,11 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 } // inline namespace __cust
   } // namespace ranges
 
-  namespace __detail
-  {
-// FIXME: needed due to PR c++/67704
-template<__detail::__dereferenceable _Tp>
-  struct __iter_rvalue_ref
-  { };
-
-template<__detail::__dereferenceable _Tp>
-  requires requires(_Tp& __t)
-  {
-   { ranges::iter_move(__t) } -> __detail::__can_reference;
-  }
-  struct __iter_rvalue_ref<_Tp>
-  { using type = decltype(ranges::iter_move(std::declval<_Tp&>())); };
-
-  } // namespace __detail
-
-  template
+  template<__detail::__dereferenceable _Tp>
+requires requires(_Tp& __t)
+{ { ranges::iter_move(__t) } -> __detail::__can_reference; }
 using iter_rvalue_reference_t
-  = typename __detail::__iter_rvalue_ref<_Tp>::type;
+  = decltype(ranges::iter_move(std::declval<_Tp&>()));
 
   template struct incrementable_traits { };
 
@@ -467,18 +445,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   && common_reference_with&&,
   const iter_value_t<_In>&>;
 
-  namespace __detail
-  {
-// FIXME: needed due to PR c++/67704
-template
-  struct __iter_common_ref
-  : common_reference, iter_value_t<_Tp>&>
-  { };
-  } // namespace __detail
-
-  template
+  template
 using iter_common_reference_t
-  = typename __detail::__iter_common_ref<_Tp>::type;
+  = common_reference_t, iter_value_t<_Tp>&>;
 
   /// Requirements for writing a value into an iterator's referenced object.
   template
@@ -663,24 +632,10 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   && strict_weak_order<_Fn&, iter_common_reference_t<_I1>,
   iter_common_reference_t<_I2>>;
 
-  namespace __detail
-  {
-// FIXME: needed due to PR c++/67704
-template
-  struct __indirect_result
-  { };
-
-template
-  requires (readable<_Is> && ...)
-   && invocable<_Fn, iter_reference_t<_Is>...>
-  struct __indirect_result<_Fn, _Is...>
-  : invoke_result<_Fn, iter_reference_t<_Is>...>
-  { };
-  } // namespace __detail
-
   template
-using indirect_result_t = typename
-  __detail::__indirect_result<_Fn, _Is...>::type;
+requires (readable<_Is> && ...)
+  && invocable<_Fn, iter_reference_t<_Is>...>
+using indirect_result_t = invoke_result_t<_Fn, iter_reference_t<_Is>...>;
 
   /// [projected], projected
   template _Proj>
diff --git a/libstdc++-v3/testsuite/24_iterators/aliases.cc 
b/libstdc++-v3/testsuite/24_iterators/aliases.cc
new file mode 100644
index 000..29bb56ec0d3
--- /dev/null
+++ b/libstdc++-v3/testsuite/24_iterators/aliases.cc
@@ -0,0 +1,61 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by 

Re: [PATCH][AARCH64] Fix for PR86901

2020-02-05 Thread Wilco Dijkstra
Hi Modi,

Thanks for your patch! 

> Adding support for extv and extzv on aarch64 as described in 
> PR86901. I also changed
> extract_bit_field_using_extv to use gen_lowpart_if_possible instead of 
> gen_lowpart directly. Using
> gen_lowpart directly will fail with an ICE in building libgcc when the 
> compiler fails to successfully do so 
> whereas gen_lowpart_if_possible will bail out of matching this pattern 
> gracefully.

I did a quick bootstrap, this shows several failures like:

gcc/builtins.c:9427:1: error: unrecognizable insn:
 9427 | }
  | ^
(insn 212 211 213 24 (set (reg:SI 207)
(zero_extract:SI (reg:SI 206)
(const_int 26 [0x1a])
(const_int 6 [0x6]))) "/gcc/builtins.c":9413:44 -1
 (nil))

The issue here is that 26+6 = 32 and that's not a valid ubfx encoding. Currently
cases like this are split into a right shift in aarch64.md around line 5569:

;; When the bit position and width add up to 32 we can use a W-reg LSR
;; instruction taking advantage of the implicit zero-extension of the X-reg.
(define_split
  [(set (match_operand:DI 0 "register_operand")
(zero_extract:DI (match_operand:DI 1 "register_operand")
 (match_operand 2
   "aarch64_simd_shift_imm_offset_di")
 (match_operand 3
   "aarch64_simd_shift_imm_di")))]
  "IN_RANGE (INTVAL (operands[2]) + INTVAL (operands[3]), 1,
 GET_MODE_BITSIZE (DImode) - 1)
   && (INTVAL (operands[2]) + INTVAL (operands[3]))
   == GET_MODE_BITSIZE (SImode)"
  [(set (match_dup 0)
(zero_extend:DI (lshiftrt:SI (match_dup 4) (match_dup 3]
  {
operands[4] = gen_lowpart (SImode, operands[1]);
  }

However that only supports DImode, not SImode, so it needs to be changed to
be more general using GPI.

Your new extv patterns should replace the magic patterns above it:

;; ---
;; Bitfields
;; ---

(define_expand ""

These are the current extv/extzv patterns, but without a mode. They are no 
longer
used when we start using the new ones.

Note you can write  to combine the extzv and extz patterns.
But please add a comment mentioning the pattern names so they are easy to find!

Besides a bootstrap it is always useful to compile a large body of code with 
your change
(say SPEC2006/2017) and check for differences in at least codesize. If there is 
an increase
in instruction count then there may be more issues that need to be resolved.

> I'm looking through past mails and https://gcc.gnu.org/contribute.html which 
> details testing bootstrap.
> I'm building a cross-compiler (x64_aarch64) and the instructions don't 
> address that scenario. The GCC 
> cross-build is green and there's no regressions on the C/C++ tests (The 
> go/fortran etc. look like they 
> need additional infrastructure built on my side to work). Is there a workflow 
> for cross-builds or should I 
> aim to get an aarch64 native machine for full validation?

I find it easiest to develop on a many-core AArch64 server so you get much 
faster builds,
bootstraps and regression tests. Cross compilers are mostly useful if you want 
to test big-endian
or new architecture features which are not yet supported in hardware. You don't 
normally need
to test Go/Fortran/ADA etc unless your patch does something that would directly 
affect them.

Finally do you have a copyright assignment with the FSF?

Cheers,
Wilco

[PATCH/RFC] Make -fanalyzer C only for GCC 10 (PR 93392)

2020-02-05 Thread David Malcolm
Although the analyzer works on GIMPLE SSA and therefore in theory ought
to support all languages supported by GCC, the code currently only
supports the subset of GIMPLE SSA expressible via the C frontend.

For GCC 10 I want to explicitly restrict the scope of the analyzer to
C code, to keep the initial scope of the feature sane.

For example, various C++ things aren't yet supported by the analyzer and
won't be in GCC 10 (exceptions, new/delete diagnostics, ctors that run
before main, etc)

There are already a couple of ICEs in BZ relating to -fanalyzer with
non-C code (PR 93288 and PR 93405, using C++ and Fortran respectively).

Rather than have the feature potentially crash if a user attempts to use
it on non C, I think it's more user-friendly to explicitly mark it as
C-only for the GCC 10 release.

I'm not sure of the ideal way to implement this.

This patch moves -fanalyzer from common.opt to c-family/c.opt, changing
it from "Common" to "C".

Unfortunately, doing it this way seems to mean losing LTO support, so
the patch also marks the LTO analyzer tests to be skipped.
LTO analysis currently does work when used on C code, albeit with some
UI warts, but there doesn't seem to be a good way to express
  "only use this in LTO with C"
and in theory a user could compile C++ to .o with -flto, and then link
with -fanalyzer etc - so it seems simplest to also drop link-time
-fanalyzer support for this release.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

OK for master?  Is there a better way to implement this?

Thanks
Dave


gcc/c-family/ChangeLog:
PR analyzer/93392
* c.opt (fanalyzer): Move here from common.opt, make C-specific.

gcc/ChangeLog:
PR analyzer/93392
* common.opt (fanalyzer): Move to c.opt.
* doc/invoke.texi (-fanalyzer): Note that it is specific to the C
front-end.

gcc/testsuite/ChangeLog:
PR analyzer/93392
* gcc.dg/analyzer/double-free-lto-1-a.c: Skip the test until LTO
support can be re-enabled.
* gcc.dg/analyzer/malloc-ipa-8-lto-c.c: Likewise.
* gcc.dg/analyzer/torture/analyzer-torture.exp: Disable LTO for
now when running torture tests.
---
 gcc/c-family/c.opt |  4 
 gcc/common.opt |  4 
 gcc/doc/invoke.texi|  2 +-
 .../gcc.dg/analyzer/double-free-lto-1-a.c  |  3 +++
 gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-lto-c.c |  3 +++
 .../gcc.dg/analyzer/torture/analyzer-torture.exp   | 14 ++
 6 files changed, 25 insertions(+), 5 deletions(-)

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 814ed17f7c4..da087f28f8e 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1303,6 +1303,10 @@ d
 C ObjC C++ ObjC++ Joined
 ; Documented in common.opt.  FIXME - what about -dI, -dD, -dN and -dD?
 
+fanalyzer
+C Var(flag_analyzer)
+Enable static analysis pass.
+
 fabi-compat-version=
 C++ ObjC++ Joined RejectNegative UInteger Var(flag_abi_compat_version) Init(-1)
 The version of the C++ ABI used for -Wabi warnings and link compatibility 
aliases.
diff --git a/gcc/common.opt b/gcc/common.opt
index 5692cd04374..e9b29fb4ee0 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -989,10 +989,6 @@ fallow-store-data-races
 Common Report Var(flag_store_data_races) Optimization
 Allow the compiler to introduce new data races on stores.
 
-fanalyzer
-Common Var(flag_analyzer)
-Enable static analysis pass.
-
 fargument-alias
 Common Ignore
 Does nothing. Preserved for backward compatibility.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4dec0c8326b..a71992efcae 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -8360,7 +8360,7 @@ Enabling this option effectively enables the following 
warnings:
 }
 
 This option is only available if GCC was configured with analyzer
-support enabled.
+support enabled, and is specific to the C front-end.
 
 @item -Wanalyzer-too-complex
 @opindex Wanalyzer-too-complex
diff --git a/gcc/testsuite/gcc.dg/analyzer/double-free-lto-1-a.c 
b/gcc/testsuite/gcc.dg/analyzer/double-free-lto-1-a.c
index 61e78467732..5e05e6aec9d 100644
--- a/gcc/testsuite/gcc.dg/analyzer/double-free-lto-1-a.c
+++ b/gcc/testsuite/gcc.dg/analyzer/double-free-lto-1-a.c
@@ -3,6 +3,9 @@
 /* { dg-additional-options "-flto" } */
 /* { dg-additional-sources double-free-lto-1-b.c } */
 
+/* For now, LTO support is disabled (PR analyzer/93392), so skip this test.  */
+/* { dg-skip-if "PR analyzer/93392" { *-*-* } } */
+
 #include 
 #include "double-free-lto-1.h"
 
diff --git a/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-lto-c.c 
b/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-lto-c.c
index d332db13ef0..03452c923db 100644
--- a/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-lto-c.c
+++ b/gcc/testsuite/gcc.dg/analyzer/malloc-ipa-8-lto-c.c
@@ -3,6 +3,9 @@
 /* { dg-additional-options "-flto" } */
 /* { dg-additional-sources "malloc-ipa-8-lto-a.c 

[committed] analyzer: add enode status and revamp __analyzer_dump_exploded_nodes

2020-02-05 Thread David Malcolm
The analyzer recognizes __analyzer_dump_exploded_nodes as a "magic"
function for use in DejaGnu tests: at the end of the pass, it issues
a warning at each such call, dumping the count of exploded nodes seen at
the call, which can be checked in test cases via dg-warning directives,
along with the IDs of the enodes (which is helpful when debugging).

My intent was to give a way of testing the results of the state-merging
code.

The state-merging code can generate duplicate exploded nodes at a point
when state merging occurs, taking a pair of enodes from the worklist
that share a program_point and sufficiently similar state.  For these
cases it generates a merged state, and adds edges from those enodes to
the merged-state enode (potentially a new or a pre-existing enode); the
input enodes don't have process_node called on them.

This means that at a CFG join point there can be an unpredictable number
of enodes that we don't care about, where the precise number depends on
the details of the state-merger code, immediately followed by a more
predictable number that we do care about.

I've been papering over this in the analyzer DejaGnu tests somewhat
by adding pairs of __analyzer_dump_exploded_nodes calls at CFG join
points, where the output at the first call is somewhat arbitrary, and
the second has the number we care about; the first number tends to
change "at random" as I tweak the state merging code, in ways that
aren't interesting, but require the tests to be updated.

See e.g. gcc.dg/analyzer/paths-6.c which had:

  __analyzer_dump_exploded_nodes (0); /* { dg-warning "2 exploded nodes" } */
  // FIXME: the above can vary between 2 and 3 exploded nodes
  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 exploded node" } */

This patch remedies this situation by tracking which enodes are
processed, and which are merely "merger" enodes.  It updates the
output for __analyzer_dump_exploded_nodes so that count of enodes
only includes the *processed* enodes, and that the IDs are split
into "processed" and "merger" enodes.

The patch simplifies the testsuite by eliminating the redundant calls
described above; the example above becomes:

  __analyzer_dump_exploded_nodes (0); /* { dg-warning "1 processed enode" } */

where the output in question is now:

  warning: 1 processed enode: [EN: 94] merger(s): [EN: 93]

The patch also adds various checks on the status of enodes, to ensure
e.g. that each enode is processed at most once.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Pushed to master as a4d3bfc0851ac1b489c4dea5b57dcc08adb20457.

gcc/analyzer/ChangeLog:
* engine.cc (exploded_node::dump_dot): Show merger enodes.
(worklist::add_node): Assert that the node's m_status is
STATUS_WORKLIST.
(exploded_graph::process_worklist): Likewise for nodes from the
worklist.  Set status of merged nodes to STATUS_MERGER.
(exploded_graph::process_node): Set status of node to
STATUS_PROCESSED.
(exploded_graph::dump_exploded_nodes): Rework handling of
"__analyzer_dump_exploded_nodes", splitting enodes by status into
"processed" and "merger", showing the count of just the processed
enodes at the call, rather than the count of all enodes.
* exploded-graph.h (exploded_node::status): New enum.
(exploded_node::exploded_node): Initialize m_status to
STATUS_WORKLIST.
(exploded_node::get_status): New getter.
(exploded_node::set_status): New setter.
(exploded_node::m_status): New field.

gcc/ChangeLog:
* doc/analyzer.texi
(Special Functions for Debugging the Analyzer): Update description
of __analyzer_dump_exploded_nodes.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/data-model-1.c: Update for changed output to
__analyzer_dump_exploded_nodes, dropping redundant call at merger.
* gcc.dg/analyzer/data-model-7.c: Likewise.
* gcc.dg/analyzer/loop-2.c: Update for changed output format.
* gcc.dg/analyzer/loop-2a.c: Likewise.
* gcc.dg/analyzer/loop-4.c: Likewise.
* gcc.dg/analyzer/loop.c: Likewise.
* gcc.dg/analyzer/malloc-paths-10.c: Likewise; drop redundant
call at merger.
* gcc.dg/analyzer/malloc-vs-local-1a.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-1b.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-2.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-3.c: Likewise.
* gcc.dg/analyzer/paths-1.c: Likewise.
* gcc.dg/analyzer/paths-1a.c: Likewise.
* gcc.dg/analyzer/paths-2.c: Likewise.
* gcc.dg/analyzer/paths-3.c: Likewise.
* gcc.dg/analyzer/paths-4.c: Update for changed output format.
* gcc.dg/analyzer/paths-5.c: Likewise.
* gcc.dg/analyzer/paths-6.c: Likewise; drop redundant calls
at merger.
* gcc.dg/analyzer/paths-7.c: Likewise.
* gcc.dg/analyzer/torture/conditionals-2.c: Update for 

[PATCH 2/3] Add patch_area_size and patch_area_entry to cfun

2020-02-05 Thread H.J. Lu
Currently patchable area is at the wrong place.  It is placed immediately
after function label and before .cfi_startproc.  A backend should be able
to add a pseudo patchable area instruction durectly into RTL.  This patch
adds patch_area_size and patch_area_entry to cfun so that the patchable
area info is available in RTL passes.

It also limits patch_area_size and patch_area_entry to 65535, which is
a reasonable maximum size for patchable area.

gcc/

PR target/93492
* function.c (expand_function_start): Set cfun->patch_area_size
and cfun->patch_area_entry.
* function.h (function): Add patch_area_size and patch_area_entry.
* opts.c (common_handle_option): Limit
function_entry_patch_area_size and function_entry_patch_area_start
to USHRT_MAX.  Fix a typo in error message.
* varasm.c (assemble_start_function): Use cfun->patch_area_size
and cfun->patch_area_entry.
* doc/invoke.texi: Document the maximum value for
-fpatchable-function-entry.

gcc/testsuite/

PR target/93492
* c-c++-common/patchable_function_entry-error-1.c: New test.
* c-c++-common/patchable_function_entry-error-2.c: Likewise.
* c-c++-common/patchable_function_entry-error-3.c: Likewise.
---
 gcc/doc/invoke.texi   |  1 +
 gcc/function.c| 35 +++
 gcc/function.h|  6 
 gcc/opts.c|  4 ++-
 .../patchable_function_entry-error-1.c|  9 +
 .../patchable_function_entry-error-2.c|  9 +
 .../patchable_function_entry-error-3.c| 20 +++
 gcc/varasm.c  | 30 ++--
 8 files changed, 85 insertions(+), 29 deletions(-)
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c

diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 35b341e759f..dd4835199b0 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -13966,6 +13966,7 @@ If @code{N=0}, no pad location is recorded.
 The NOP instructions are inserted at---and maybe before, depending on
 @var{M}---the function entry address, even before the prologue.
 
+The maximum value of @var{N} and @var{M} is 65535.
 @end table
 
 
diff --git a/gcc/function.c b/gcc/function.c
index d8008f60422..badbf538eec 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -5202,6 +5202,41 @@ expand_function_start (tree subr)
   /* If we are doing generic stack checking, the probe should go here.  */
   if (flag_stack_check == GENERIC_STACK_CHECK)
 stack_check_probe_note = emit_note (NOTE_INSN_DELETED);
+
+  unsigned HOST_WIDE_INT patch_area_size = function_entry_patch_area_size;
+  unsigned HOST_WIDE_INT patch_area_entry = function_entry_patch_area_start;
+
+  tree patchable_function_entry_attr
+= lookup_attribute ("patchable_function_entry",
+   DECL_ATTRIBUTES (cfun->decl));
+  if (patchable_function_entry_attr)
+{
+  tree pp_val = TREE_VALUE (patchable_function_entry_attr);
+  tree patchable_function_entry_value1 = TREE_VALUE (pp_val);
+
+  patch_area_size = tree_to_uhwi (patchable_function_entry_value1);
+  patch_area_entry = 0;
+  if (TREE_CHAIN (pp_val) != NULL_TREE)
+   {
+ tree patchable_function_entry_value2
+   = TREE_VALUE (TREE_CHAIN (pp_val));
+ patch_area_entry = tree_to_uhwi (patchable_function_entry_value2);
+   }
+  if (patch_area_size > USHRT_MAX || patch_area_size > USHRT_MAX)
+   error ("invalid values for % attribute");
+}
+
+  if (patch_area_entry > patch_area_size)
+{
+  if (patch_area_size > 0)
+   warning (OPT_Wattributes,
+"patchable function entry %wu exceeds size %wu",
+patch_area_entry, patch_area_size);
+  patch_area_entry = 0;
+}
+
+  cfun->patch_area_size = patch_area_size;
+  cfun->patch_area_entry = patch_area_entry;
 }
 
 void
diff --git a/gcc/function.h b/gcc/function.h
index 1ee8ed3de53..1ed7c400f23 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -332,6 +332,12 @@ struct GTY(()) function {
   /* Last assigned dependence info clique.  */
   unsigned short last_clique;
 
+  /* How many NOP insns to place at each function entry by default.  */
+  unsigned short patch_area_size;
+
+  /* How far the real asm entry point is into this area.  */
+  unsigned short patch_area_entry;
+
   /* Collected bit flags.  */
 
   /* Number of units of general registers that need saving in stdarg
diff --git a/gcc/opts.c b/gcc/opts.c
index 7affeb41a96..c6011f1f9b7 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -2598,10 +2598,12 @@ common_handle_option (struct gcc_options *opts,
function_entry_patch_area_start = 0;

[PATCH 3/3] x86: Simplify UNSPECV_PATCHABLE_AREA generation

2020-02-05 Thread H.J. Lu
Since patch_area_size and patch_area_entry have been added to cfun,
we can use them to directly insert the pseudo UNSPECV_PATCHABLE_AREA
instruction.

PR target/93492
* config/i386/i386-features.c
(rest_of_insert_endbr_and_patchable_area): Change
need_patchable_area argument to patchable_area_size.  Generate
UNSPECV_PATCHABLE_AREA instruction with proper arguments.
(pass_insert_endbr_and_patchable_area::gate): Set and check
patchable_area_size instead of need_patchable_area.
(pass_insert_endbr_and_patchable_area::execute): Pass
patchable_area_size to rest_of_insert_endbr_and_patchable_area.
(pass_insert_endbr_and_patchable_area): Replace
need_patchable_area with patchable_area_size.
* config/i386/i386.c (ix86_print_patchable_function_entry):
Just return if function table has been emitted.
(x86_function_profiler): Use cfun->patch_area_size and
cfun->patch_area_entry.
* config/i386/i386.h (machine_function): Remove patch_area_size
and record_patch_area.
* config/i386/i386.md (patchable_area): Set length attribute.
---
 gcc/config/i386/i386-features.c | 22 +---
 gcc/config/i386/i386.c  | 60 ++---
 gcc/config/i386/i386.h  |  6 
 gcc/config/i386/i386.md | 10 +++---
 4 files changed, 25 insertions(+), 73 deletions(-)

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index be46f036126..d358abe7064 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -1941,7 +1941,7 @@ make_pass_stv (gcc::context *ctxt)
 
 static void
 rest_of_insert_endbr_and_patchable_area (bool need_endbr,
-bool need_patchable_area)
+unsigned int patchable_area_size)
 {
   rtx endbr;
   rtx_insn *insn;
@@ -1980,7 +1980,7 @@ rest_of_insert_endbr_and_patchable_area (bool need_endbr,
}
 }
 
-  if (need_patchable_area)
+  if (patchable_area_size)
 {
   if (crtl->profile && flag_fentry)
{
@@ -1992,10 +1992,9 @@ rest_of_insert_endbr_and_patchable_area (bool need_endbr,
}
   else
{
- /* ix86_print_patchable_function_entry will provide actual
-size.  */
- rtx patchable_area = gen_patchable_area (GEN_INT (0),
-  GEN_INT (0));
+ rtx patchable_area
+   = gen_patchable_area (GEN_INT (patchable_area_size),
+ GEN_INT (cfun->patch_area_entry == 0));
  if (endbr_insn)
emit_insn_after (patchable_area, endbr_insn);
  else
@@ -2123,25 +2122,22 @@ public:
   virtual bool gate (function *fun)
 {
   need_endbr = (flag_cf_protection & CF_BRANCH) != 0;
-  need_patchable_area
-   = (function_entry_patch_area_size
-  || lookup_attribute ("patchable_function_entry",
-   DECL_ATTRIBUTES (fun->decl)));
-  return need_endbr || need_patchable_area;
+  patchable_area_size = fun->patch_area_size - fun->patch_area_entry;
+  return need_endbr || patchable_area_size;
 }
 
   virtual unsigned int execute (function *)
 {
   timevar_push (TV_MACH_DEP);
   rest_of_insert_endbr_and_patchable_area (need_endbr,
-  need_patchable_area);
+  patchable_area_size);
   timevar_pop (TV_MACH_DEP);
   return 0;
 }
 
 private:
   bool need_endbr;
-  bool need_patchable_area;
+  unsigned int patchable_area_size;
 }; // class pass_insert_endbr_and_patchable_area
 
 } // anon namespace
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 051a1fcbdc2..79a8823f61e 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -9130,53 +9130,11 @@ ix86_print_patchable_function_entry (FILE *file,
 {
   if (cfun->machine->function_label_emitted)
 {
-  /* The insert_endbr_and_patchable_area pass inserted a dummy
-UNSPECV_PATCHABLE_AREA with 0 patchable area size.  If the
-patchable area is placed after the function label, we replace
-0 patchable area size with the real one.  Otherwise, the
-dummy UNSPECV_PATCHABLE_AREA will be ignored.  */
-  if (cfun->machine->insn_queued_at_entrance)
-   {
- /* Record the patchable area.  Both ENDBR and patchable area
-will be inserted by x86_function_profiler later.  */
- cfun->machine->patch_area_size = patch_area_size;
- cfun->machine->record_patch_area = record_p;
- return;
-   }
-
-  /* We can have
-
-UNSPECV_NOP_ENDBR
-UNSPECV_PATCHABLE_AREA
-
-or just
-
-UNSPECV_PATCHABLE_AREA
-   */
-  rtx_insn *patchable_insn;
-  rtx_insn *insn = next_real_nondebug_insn (get_insns ());
-  if 

[PATCH 1/3] x86: Add UNSPECV_PATCHABLE_AREA

2020-02-05 Thread H.J. Lu
Currently patchable area is at the wrong place.  It is placed immediately
after function label, before both .cfi_startproc and ENDBR.  This patch
adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
changes ENDBR insertion pass to also insert a dummy patchable area.
TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to provide the
actual size for patchable area.  It places patchable area immediately
after .cfi_startproc and ENDBR.

gcc/

PR target/93492
* config/i386/i386-features.c (rest_of_insert_endbranch):
Renamed to ...
(rest_of_insert_endbr_and_patchable_area): Change return type
to void.  Don't call timevar_push nor timevar_pop.  Replace
endbr_queued_at_entrance with insn_queued_at_entrance.  Insert
UNSPECV_PATCHABLE_AREA for patchable area.
(pass_data_insert_endbranch): Renamed to ...
(pass_data_insert_endbr_and_patchable_area): This.  Change
pass name to endbr_and_patchable_area.
(pass_insert_endbranch): Renamed to ...
(pass_insert_endbr_and_patchable_area): This.  Add need_endbr
and need_patchable_area.
(pass_insert_endbr_and_patchable_area::gate): Set and check
need_endbr/need_patchable_area.
(pass_insert_endbr_and_patchable_area::execute): Call
timevar_push and timevar_pop.  Pass need_endbr amd
need_patchable_area to rest_of_insert_endbr_and_patchable_area.
(make_pass_insert_endbranch): Renamed to ...
(make_pass_insert_endbr_and_patchable_area): This.
* config/i386/i386-passes.def: Replace pass_insert_endbranch
with pass_insert_endbr_and_patchable_area.
* config/i386/i386-protos.h (ix86_output_patchable_area): New.
(make_pass_insert_endbranch): Renamed to ...
(make_pass_insert_endbr_and_patchable_area): This.
* config/i386/i386.c (ix86_asm_output_function_label): Set
function_label_emitted to true.
(ix86_print_patchable_function_entry): New function.
(ix86_output_patchable_area): Likewise.
(x86_function_profiler): Replace endbr_queued_at_entrance with
insn_queued_at_entrance.  Generate ENDBR only for TYPE_ENDBR.
Call ix86_output_patchable_area to generate patchable area.
(TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New.
* i386.h (queued_insn_type): New.
(machine_function): Add patch_area_size, record_patch_area and
function_label_emitted.  Replace endbr_queued_at_entrance with
insn_queued_at_entrance.
* config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New.
(patchable_area): New.

gcc/testsuite/

PR target/93492
* gcc.target/i386/pr93492-1.c: New test.
* gcc.target/i386/pr93492-2.c: Likewise.
* gcc.target/i386/pr93492-3.c: Likewise.
* gcc.target/i386/pr93492-4.c: Likewise.
* gcc.target/i386/pr93492-5.c: Likewise.
---
 gcc/config/i386/i386-features.c   | 136 +++---
 gcc/config/i386/i386-passes.def   |   2 +-
 gcc/config/i386/i386-protos.h |   5 +-
 gcc/config/i386/i386.c|  91 ++-
 gcc/config/i386/i386.h|  20 +++-
 gcc/config/i386/i386.md   |  15 +++
 gcc/testsuite/gcc.target/i386/pr93492-1.c |  73 
 gcc/testsuite/gcc.target/i386/pr93492-2.c |  12 ++
 gcc/testsuite/gcc.target/i386/pr93492-3.c |  13 +++
 gcc/testsuite/gcc.target/i386/pr93492-4.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr93492-5.c |  12 ++
 11 files changed, 339 insertions(+), 51 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-5.c

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index b49e6f8d408..be46f036126 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -1937,43 +1937,79 @@ make_pass_stv (gcc::context *ctxt)
   return new pass_stv (ctxt);
 }
 
-/* Inserting ENDBRANCH instructions.  */
+/* Inserting ENDBR and pseudo patchable-area instructions.  */
 
-static unsigned int
-rest_of_insert_endbranch (void)
+static void
+rest_of_insert_endbr_and_patchable_area (bool need_endbr,
+bool need_patchable_area)
 {
-  timevar_push (TV_MACH_DEP);
-
-  rtx cet_eb;
+  rtx endbr;
   rtx_insn *insn;
+  rtx_insn *endbr_insn = NULL;
   basic_block bb;
 
-  /* Currently emit EB if it's a tracking function, i.e. 'nocf_check' is
- absent among function attributes.  Later an optimization will be
- introduced to make analysis if an address of a static function is
- taken.  A static function whose address is not taken will get a
- nocf_check attribute.  This will allow to 

[PATCH 0/3] Update -fpatchable-function-entry implementation

2020-02-05 Thread H.J. Lu
The current -fpatchable-function-entry implementation is done almost
behind the backend back.  Backend doesn't know if and where patchable
area will be generated during RTL passes.

TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is only used to print out
assembly codes for patchable area.   This leads to wrong codes with
-fpatchable-function-entry:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92424
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93492

Also .cfi_startproc and DWARF info are at the wrong places when
-fpatchable-function-entry is used.

This patch set has 3 parts:

1. Add a pseudo UNSPECV_PATCHABLE_AREA instruction and define
TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY to work around the
-fpatchable-function-entry implementation deficiency.
2. Add patch_area_size and patch_area_entry to cfun to make the 
patchable area info is available in RTL passes so that backend
can handle patchable area properly.  It also limits patch_area_size
and patch_area_entry to 65535, which is a reasonable maximum size
for patchable area.  Other backends can also use it properly generate
patchable area.
3. Remove workaround in UNSPECV_PATCHABLE_AREA generation. 

If the patch set is acceptable, I can combine patch 1 and patch 3
into a single patch.

H.J. Lu (3):
  x86: Add UNSPECV_PATCHABLE_AREA
  Add patch_area_size and patch_area_entry to cfun
  x86: Simplify UNSPECV_PATCHABLE_AREA generation

 gcc/config/i386/i386-features.c   | 130 --
 gcc/config/i386/i386-passes.def   |   2 +-
 gcc/config/i386/i386-protos.h |   5 +-
 gcc/config/i386/i386.c|  51 ++-
 gcc/config/i386/i386.h|  14 +-
 gcc/config/i386/i386.md   |  17 +++
 gcc/doc/invoke.texi   |   1 +
 gcc/function.c|  35 +
 gcc/function.h|   6 +
 gcc/opts.c|   4 +-
 .../patchable_function_entry-error-1.c|   9 ++
 .../patchable_function_entry-error-2.c|   9 ++
 .../patchable_function_entry-error-3.c|  20 +++
 gcc/testsuite/gcc.target/i386/pr93492-1.c |  73 ++
 gcc/testsuite/gcc.target/i386/pr93492-2.c |  12 ++
 gcc/testsuite/gcc.target/i386/pr93492-3.c |  13 ++
 gcc/testsuite/gcc.target/i386/pr93492-4.c |  11 ++
 gcc/testsuite/gcc.target/i386/pr93492-5.c |  12 ++
 gcc/varasm.c  |  30 +---
 19 files changed, 375 insertions(+), 79 deletions(-)
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-1.c
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-2.c
 create mode 100644 
gcc/testsuite/c-c++-common/patchable_function_entry-error-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr93492-5.c

-- 
2.24.1



Re: gcov: reduce code quality loss by reproducible topn merging [PR92924]

2020-02-05 Thread Martin Liška

On 1/30/20 5:13 PM, Jan Hubicka wrote:

Martin, I welcome your opinion on the patch


Ok, recently we've made quite some experiments with Honza about
TOP N counters. Based on the numbers, it shows that tracking a negative
value per-counter value does not help much. So that I suggest to use
only one global flag (a negative total) in order to distinguish in between
reproducible and non-reproducible profile.

I'm sending patch that simplifies the merging and introduces a new option.

For the future, we may (similarly to LLVM) come up with a dynamic allocation
for TOPN counters. I've git a working patch for that.

Martin
>From acf77e7ebba2a4f4370388f83901d67cc71e5a0c Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Wed, 5 Feb 2020 14:44:43 +0100
Subject: [PATCH] Introduce -fprofile-reproducible and support it with TOP N.

gcc/ChangeLog:

2020-02-05  Martin Liska  

	PR ipa/92924
	* common.opt: Add -fprofile-reproducible.
	* doc/invoke.texi: Document it.
	* value-prof.c (dump_histogram_value):
	Document and support behavior for counters[0]
	being a negative value.
	(get_nth_most_common_value): Handle negative
	counters[0] in respect to flag_profile_reproducible.

libgcc/ChangeLog:

2020-02-05  Martin Liska  

	PR ipa/92924
	* libgcov-merge.c (merge_topn_values_set): Record
	when a TOP N counter becomes invalid.  When merging
	remove a smallest value if the space is needed.
---
 gcc/common.opt |  4 
 gcc/doc/invoke.texi| 12 +-
 gcc/value-prof.c   | 26 --
 libgcc/libgcov-merge.c | 50 +-
 4 files changed, 69 insertions(+), 23 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index 5692cd04374..7f643516a62 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2168,6 +2168,10 @@ fprofile-exclude-files=
 Common Joined RejectNegative Var(flag_profile_exclude_files)
 Instrument only functions from files where names do not match all the regular expressions (separated by a semi-colon).
 
+fprofile-reproducible
+Common Report Var(flag_profile_reproducible)
+Enable only profile based optimizations that do not depend on order of training runs.
+
 Enum
 Name(profile_update) Type(enum profile_update) UnknownError(unknown profile update method %qs)
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 35b341e759f..6725c543c3b 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -562,7 +562,7 @@ Objective-C and Objective-C++ Dialects}.
 -fprofile-abs-path @gol
 -fprofile-dir=@var{path}  -fprofile-generate  -fprofile-generate=@var{path} @gol
 -fprofile-note=@var{path}  -fprofile-update=@var{method} @gol
--fprofile-filter-files=@var{regex}  -fprofile-exclude-files=@var{regex} @gol
+-fprofile-filter-files=@var{regex}  -fprofile-exclude-files=@var{regex} -fprofile-reproducibly @gol
 -fsanitize=@var{style}  -fsanitize-recover  -fsanitize-recover=@var{style} @gol
 -fasan-shadow-offset=@var{number}  -fsanitize-sections=@var{s1},@var{s2},... @gol
 -fsanitize-undefined-trap-on-error  -fbounds-check @gol
@@ -13324,6 +13324,16 @@ all the regular expressions (separated by a semi-colon).
 For example, @option{-fprofile-exclude-files=/usr/*} will prevent instrumentation
 of all files that are located in @file{/usr/} folder.
 
+Enable only profile based optimizations (PGO) that do not depend on order of training runs.
+
+The PGO optimizations depend on a run-time profile that is always merged after
+each training run.  The merged profile can end up being different based on
+sequence of these training runs.  Using the option, the compiler will use
+only the profile information which cannot be changed by order of training runs.
+
+@item -fprofile-reproducible
+@opindex fprofile-reproducible
+
 @item -fsanitize=address
 @opindex fsanitize=address
 Enable AddressSanitizer, a fast memory error detector.
diff --git a/gcc/value-prof.c b/gcc/value-prof.c
index f0456c8e93d..64b9c9de6dd 100644
--- a/gcc/value-prof.c
+++ b/gcc/value-prof.c
@@ -265,8 +265,10 @@ dump_histogram_value (FILE *dump_file, histogram_value hist)
 		? "Top N value counter" : "Indirect call counter"));
 	  if (hist->hvalue.counters)
 	{
-	  fprintf (dump_file, " all: %" PRId64 ", values: ",
-		   (int64_t) hist->hvalue.counters[0]);
+	  fprintf (dump_file, " all: %" PRId64 "%s, values: ",
+		   abs ((int64_t) hist->hvalue.counters[0]),
+		   hist->hvalue.counters[0] < 0
+		   ? " (values missing)": "");
 	  for (unsigned i = 0; i < GCOV_TOPN_VALUES; i++)
 		{
 		  fprintf (dump_file, "[%" PRId64 ":%" PRId64 "]",
@@ -719,26 +721,36 @@ gimple_divmod_fixed_value (gassign *stmt, tree value, profile_probability prob,
 
 /* Return the n-th value count of TOPN_VALUE histogram.  If
there's a value, return true and set VALUE and COUNT
-   arguments.  */
+   arguments.
+
+   Counters have the following meaning.
+
+   abs (counters[0]) is the number of executions
+   for i in 0 ... TOPN-1
+ counters[2 * i + 1] is target
+ abs 

[PATCH, rs6000]: mark clobber for registers changed by untpyed_call

2020-02-05 Thread Jiufu Guo
As PR93047 said, __builtin_apply/__builtin_return does not work well with
-frename-registers.  This is caused by return register(e.g. r3) is used to
rename another register, before return register is stored to stack.

This patch fix this issue by emitting clobber for those egisters which
maybe changed by untyped call.

Because this would be a kind of wrong code, so I'm thinking to submit to
gcc10.

Bootstrap and regtest on powerpc64le are pass.  Is this okay for trunk?

Thanks,
Jiufu

gcc/
2020-02-05  Jiufu Guo  

PR target/93047
* config/rs6000/rs6000.md (untyped_call): add emit_clobber.

gcc/testsuite
2020-02-05  Jiufu Guo  

PR target/93047
* gcc.dg/torture/stackalign/builtin-return-2.c: New case.


diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
index f3c8eb0..d68bc24 100644
--- a/gcc/config/rs6000/rs6000.md
+++ b/gcc/config/rs6000/rs6000.md
@@ -10867,6 +10867,9 @@
 
   emit_call_insn (gen_call (operands[0], const0_rtx, const0_rtx));
 
+  for (int i = 0; i < XVECLEN (operands[2], 0); i++)
+emit_clobber (SET_SRC (XVECEXP (operands[2], 0, i)));
+
   for (i = 0; i < XVECLEN (operands[2], 0); i++)
 {
   rtx set = XVECEXP (operands[2], 0, i);
diff --git a/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-2.c 
b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-2.c
new file mode 100644
index 000..7719109
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/stackalign/builtin-return-2.c
@@ -0,0 +1,40 @@
+/* PR target/93047 */
+/* Originator: Andrew Church  */
+/* { dg-do run } */
+/* { dg-additional-options "-O3 -frename-registers" } */
+/* { dg-require-effective-target untyped_assembly } */
+
+#ifdef __MMIX__
+/* No parameters on stack for bar.  */
+#define STACK_ARGUMENTS_SIZE 0
+#else
+#define STACK_ARGUMENTS_SIZE 64
+#endif
+
+extern void abort(void);
+
+int foo(int n)
+{
+  return n+1;
+}
+
+int bar(int n)
+{
+  __builtin_return(__builtin_apply((void (*)(void))foo, __builtin_apply_args(),
+  STACK_ARGUMENTS_SIZE));
+}
+
+int main(void)
+{
+  /* Allocate 64 bytes on the stack to make sure that __builtin_apply
+ can read at least 64 bytes above the return address.  */
+  char dummy[64];
+
+  __asm__ ("" : : "" (dummy));
+
+  if (bar(1) != 2)
+abort();
+
+  return 0;
+}
+



Re: [RFA] [PR rtl-optimization/90275] Handle nop reg->reg copies in cse

2020-02-05 Thread Richard Sandiford
Jeff Law  writes:
> Richard & Segher, if y'all could check my analysis here, it'd be
> appreciated.
>
> pr90275 is a P2 regression that is only triggering on ARM.  David's
> testcase in c#1 is the best for this problem as it doesn't require
> magic flags like -fno-dce to trigger.
>
> The block in question:
>
>> (code_label 89 88 90 24 15 (nil) [0 uses])
>> (note 90 89 97 24 [bb 24] NOTE_INSN_BASIC_BLOCK)
>> (insn 97 90 98 24 (parallel [
>> (set (reg:CC 100 cc)
>> (compare:CC (reg:SI 131 [ d_lsm.21 ])
>> (const_int 0 [0])))
>> (set (reg:SI 135 [ d_lsm.21 ])
>> (reg:SI 131 [ d_lsm.21 ]))
>> ]) "pr90275.c":21:45 248 {*movsi_compare0}
>>  (expr_list:REG_DEAD (reg:SI 131 [ d_lsm.21 ])
>> (nil)))
>> (insn 98 97 151 24 (set (reg:SI 136 [+4 ])
>> (reg:SI 132 [ d_lsm.21+4 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>  (expr_list:REG_DEAD (reg:SI 132 [ d_lsm.21+4 ])
>> (expr_list:REG_DEAD (reg:CC 100 cc)
>> (nil
>> (insn 151 98 152 24 (set (reg:SI 131 [ d_lsm.21 ])
>> (reg:SI 131 [ d_lsm.21 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>  (expr_list:REG_DEAD (reg:SI 135 [ d_lsm.21 ])
>> (nil)))
>> (insn 152 151 103 24 (set (reg:SI 132 [ d_lsm.21+4 ])
>> (reg:SI 136 [+4 ])) "pr90275.c":21:45 241 {*arm_movsi_insn}
>>  (expr_list:REG_DEAD (reg:SI 136 [+4 ])
>> (nil)))
>> 
> insns 97 and 151 are the most important.
>
> We process insn 97 which creates an equivalency between r135 and r131. 
> This is expressed by putting both on on the "same_value" chain
> (table_elt->{next,prev}_same_value).
>
> When we put the REGs on the chain we'll set REG_QTY to a positive
> number which indicates their values are valid.
>
> We continue processing insns forward and run into insn 151 which is a
> self-copy.
>
> First CSE will invalidate r131 (because its set).  Invalidation is
> accomplished by setting REG_QTY for r131 to a negative value.  It does
> not remove r131 from the same value chains.
>
> Then CSE will call insert_regs for r131.  The qty is not valid, so we
> get into this code:
>
>>  if (modified || ! qty_valid)
>> {
>>   if (classp)
>> for (classp = classp->first_same_value;
>>  classp != 0;
>>  classp = classp->next_same_value)
>>   if (REG_P (classp->exp)
>>   && GET_MODE (classp->exp) == GET_MODE (x))
>> {
>>   unsigned c_regno = REGNO (classp->exp);
>> 
>>   gcc_assert (REGNO_QTY_VALID_P (c_regno));
>> [ ... ]
>
> So we walk the chain of same values for r131.  WHen walking we run into
> r131 itself.  Since r131 has been invalidated  we trip the assert.
>
>
> The fix is pretty simple.  We just arrange to stop processing insns
> that are nop reg->reg copies much like we already do for mem->mem
> copies and (set (pc) (pc)).
>
> This has bootstrapped and regression tested on x86_64.  I've also
> verified it fixes the testcase in c#1 of pr90275, the test in pr93125
> and pr92388.  Interestingly enough I couldn't trigger the original
> testcase in 90275, but I'm confident this is ultimately all the same
> problem.

This looks similar to the infamous (to me):

   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01581.html

which had to be reverted because it broke powerpc64 bootstrap.
The problem was that n_sets is misleading for calls:

   https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01858.html

That's easy to fix (and I have a fix).  But given the damage this caused
last time, I think it's probably best left to GCC 11.

Thanks,
Richard


[PATCH] testsuite/92177 fix for SLP build changes

2020-02-05 Thread Richard Biener
We're now consistently building SLP operations with only
scalar defs from scalars which makes the testcase no longer
testing multiplication vectorization.  The following smuggles
in a constant making the vector variant profitable for SLP build.

Tested on x86_64-linux, pushed.

2020-02-05  Richard Biener  

PR testsuite/92177
* gcc.dg/vect/bb-slp-22.c: Adjust.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-22.c | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
index 992f5898409..6dc2375f5d1 100644
--- a/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-22.c
@@ -29,10 +29,10 @@ main1 (unsigned int x, unsigned int y)
 }
   else
 {
-  out[0] = a0 * x;
-  out[1] = a1 * y;
-  out[2] = a2 * x;
-  out[3] = a3 * y;
+  out[0] = a0 * (x + 1);
+  out[1] = a1 * (y + 1);
+  out[2] = a2 * (x + 1);
+  out[3] = a3 * (y + 1);
 }
 
   if (x)
@@ -40,10 +40,10 @@ main1 (unsigned int x, unsigned int y)
 
   /* Check results.  */
   if ((x <= y 
-   && (out[0] != (in[0] + 23) * x
-   || out[1] != (in[1] + 142) * y
-   || out[2] != (in[2] + 2) * x
-   || out[3] != (in[3] + 31) * y))
+   && (out[0] != (in[0] + 23) * (x + 1)
+   || out[1] != (in[1] + 142) * (y + 1)
+   || out[2] != (in[2] + 2) * (x + 1)
+   || out[3] != (in[3] + 31) * (y + 1)))
|| (x > y
&& (b[0] != (in[0] + 23)
|| b[1] != (in[1] + 142)
-- 
2.16.4


[PATCH] middle-end/90648 fend off builtin calls with not enough arguments from match

2020-02-05 Thread Richard Biener
This adds guards to genmatch generated code before accessing call
expression or stmt arguments that might be out of bounds when
the user provided bogus prototypes for what we consider builtins.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

2020-02-05  Richard Biener  

PR middle-end/90648
* genmatch.c (dt_node::gen_kids_1): Emit number of argument
checks before matching calls.

* gcc.dg/pr90648.c: New testcase.
---
 gcc/genmatch.c | 22 ++
 gcc/testsuite/gcc.dg/pr90648.c |  8 
 2 files changed, 22 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr90648.c

diff --git a/gcc/genmatch.c b/gcc/genmatch.c
index d174d4144fd..0a8cba62e0c 100644
--- a/gcc/genmatch.c
+++ b/gcc/genmatch.c
@@ -3060,10 +3060,15 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, 
int depth,
{
  expr *e = as_a (fns[i]->op);
  fprintf_indent (f, indent, "case %s:\n", e->operation->id);
- fprintf_indent (f, indent, "  {\n");
- fns[i]->gen (f, indent + 4, true, depth);
- fprintf_indent (f, indent, "break;\n");
- fprintf_indent (f, indent, "  }\n");
+ /* We need to be defensive against bogus prototypes allowing
+calls with not enough arguments.  */
+ fprintf_indent (f, indent,
+ "  if (gimple_call_num_args (_c%d) == %d)\n"
+ "{\n", depth, e->ops.length ());
+ fns[i]->gen (f, indent + 6, true, depth);
+ fprintf_indent (f, indent,
+ "}\n"
+ "  break;\n");
}
 
  fprintf_indent (f, indent, "default:;\n");
@@ -3125,10 +3130,11 @@ dt_node::gen_kids_1 (FILE *f, int indent, bool gimple, 
int depth,
  gcc_assert (e->operation->kind == id_base::FN);
 
  fprintf_indent (f, indent, "case %s:\n", e->operation->id);
- fprintf_indent (f, indent, "  {\n");
- generic_fns[j]->gen (f, indent + 4, false, depth);
- fprintf_indent (f, indent, "break;\n");
- fprintf_indent (f, indent, "  }\n");
+ fprintf_indent (f, indent, "  if (call_expr_nargs (%s) == %d)\n"
+"{\n", kid_opname, e->ops.length ());
+ generic_fns[j]->gen (f, indent + 6, false, depth);
+ fprintf_indent (f, indent, "}\n"
+"  break;\n");
}
   fprintf_indent (f, indent, "default:;\n");
 
diff --git a/gcc/testsuite/gcc.dg/pr90648.c b/gcc/testsuite/gcc.dg/pr90648.c
new file mode 100644
index 000..bf1fa989478
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr90648.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-O" } */
+
+extern double copysign ();
+double foo (double x)
+{
+  return x * copysign (); /* { dg-warning "too few arguments" } */
+}
-- 
2.16.4


Re: [PATCH 00/14] rs6000: Begin replacing built-in support

2020-02-05 Thread Bill Schmidt



On 2/5/20 6:30 AM, Segher Boessenkool wrote:

Hi!

On Wed, Feb 05, 2020 at 08:57:16AM +0100, Richard Biener wrote:

On Tue, Feb 4, 2020 at 6:40 PM Segher Boessenkool
 wrote:

On Mon, Feb 03, 2020 at 08:26:01PM -0600, Bill Schmidt wrote:

My intent is to make adding new built-in functions as simple as adding
a few lines to a couple of files, and automatically generating as much
of the initialization, overload resolution, and expansion logic as
possible.  This patch series establishes the format of the input files
and creates a new program (rs6000-genbif) to:

Let's call it rs6000-gen-builtins or similar.  Not as cryptic.

I believe we talked about this a few years ago.  Any reason this is powerpc
specific?  If sufficiently generic most targets would benefit and maybe even
frontends and the middle-end could make use of this.  The generator
program, that is.  (disclaimer: I didn't look into the patches at all)



One thing that's powerpc-unique (I believe) is our peculiar overloading 
infrastructure for the original AltiVec interface (extended to cover 
quite a bit more territory since).  But that's largely an extra level of 
abstraction that could eventually be optional.


There's also some specificity to our vector types (things like vector 
bool and vector pixel) that would need to be abstracted away.


Finally, there's a set of flags for special handling that are definitely 
Power-specific and would have to be abstracted away also.


Nothing that couldn't be dealt with given enough attention, so far as I 
can see.  But honestly I have not looked a great deal into other 
targets' built-in handling to see what other landmines might be present.



Absolutely, but we first want to solve the urgent problem for Power
(because that is what it is); it's a huge job with that reduction of
scope, already.  After *that* is done, it will be clearer how to do
things for what is wanted generically, will be clearer what is wanted
in the first place :-)



Yes, this is a necessary first step to even be able to see what's going 
on...





I always wondered if we can make our C frontend spit out things from
C declarations (with maybe extra #pragmas for some of the more obscure
details) and how to fit that into the bootstrap.

I think there will be too many problem cases, a direct description of
the builtins will work better (but is more verbose of course).

In any case, Bill's patches keep the exact same approach in rs6000 as
we had before, just with some more pre-processing and macros etc.;
which results in a much shorter description, many cases folded into one,
which as a bonus also fixes bugs (directly, when two things you fold
should be the same but are not, at least one of them is wrong; and maybe
more importantly indirectly: a reader of the tables will spot errors
much more easily if they fit on one screen, if you have similar entries
on the screen at the same time so you *can* compare; and there will be
more readers as well even, people are actually scared of having to look
at it currently).

So, yes, this same approach might be a good fit generically, but we'll
do it for rs6000 only, in the interest of ever getting it done ;-)
The generator programs etc. can move to generic code later, if that
helps and there is interest in it, there isn't much (if anything) in
here that is specific to our arch.



I'll keep this possibility in mind as we move forward.  It's probably a 
matter of months to get everything converted over just for Power.  But 
this set of patches is the most generic; the remaining patches will all 
be quite Power-specific.


Thanks,
Bill



Segher


Re: [PATCH coroutines] Change lowering behavior of alias variable from copy to substitute

2020-02-05 Thread JunMa

在 2020/2/4 下午8:17, JunMa 写道:

Hi
When testing coroutines with lambda function, I find we copy each 
captured
variable to frame. However, according to gimplify pass, for each 
declaration

that is an alias for another expression(DECL_VALUE_EXPR), we can
substitute them directly.

Since lambda captured variables is one of this kind. It is better to 
replace them
rather than copy them, This can reduce frame size (all of the captured 
variables

are field of closure class) and avoid extra copy behavior as well.

This patch remove all of the code related to copy captured variable.
Instead, we first rewrite DECL_VALUE_EXPR with frame field, then we check
every variable whether it has DECL_VALUE_EXPR, and then substitute it, 
this

patch does not create frame field for such variables.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa


Hi

minor update: only handle var_decl when iterate BIND_EXPR_VARS
in register_local_var_uses.

Regards
JunMa

gcc/cp
2020-02-04  Jun Ma 

    * coroutines.cc (morph_fn_to_coro): Remove code related to
    copy captured variable.
    (register_local_var_uses):  Ditto.
    (register_param_uses):  Collect use of parameters inside
    DECL_VALUE_EXPR.
    (transform_local_var_uses): Substitute the alias variable
    with DECL_VALUE_EXPR if it has one.


gcc/testsuite
2020-02-04  Jun Ma 

    * g++.dg/coroutines/lambda-07-multi-capture.C: New test.



---
 gcc/cp/coroutines.cc  | 117 +-
 .../torture/lambda-07-multi-capture.C |  55 
 2 files changed, 85 insertions(+), 87 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/lambda-07-multi-capture.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 0c2ec32d7db..1bc2ed7f89c 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -1790,6 +1790,11 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  cp_walk_tree (_SIZE (lvar), transform_local_var_uses, d, NULL);
  cp_walk_tree (_SIZE_UNIT (lvar), transform_local_var_uses, d,
NULL);
+ if (VAR_P (lvar) && DECL_HAS_VALUE_EXPR_P (lvar))
+   {
+ tree t  = DECL_VALUE_EXPR (lvar);
+ cp_walk_tree (, transform_local_var_uses, d, NULL);
+   }
 
  /* TODO: implement selective generation of fields when vars are
 known not-used.  */
@@ -1815,9 +1820,9 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  gcc_checking_assert (existed);
 
  if (local_var.field_id == NULL_TREE)
-   pvar = _CHAIN (*pvar); /* Wasn't used.  */
-
- *pvar = DECL_CHAIN (*pvar); /* discard this one, we replaced it.  */
+   pvar = _CHAIN (*pvar); /* Wasn't used or was an alias.  */
+ else
+   *pvar = DECL_CHAIN (*pvar); /* discard this one, we replaced it.  */
}
 
   *do_subtree = 0; /* We've done the body already.  */
@@ -1847,8 +1852,16 @@ transform_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
   if (local_var_i == NULL)
 return NULL_TREE;
 
-  /* This is our revised 'local' i.e. a frame slot.  */
-  tree revised = local_var_i->field_idx;
+  /* This is our revised 'local' i.e. a frame slot or an alias.  */
+  tree revised  = NULL_TREE;
+  if (local_var_i->field_id == NULL_TREE)
+{
+  gcc_checking_assert (DECL_HAS_VALUE_EXPR_P (var_decl));
+  /* If the decl is an alias for another expression, substitute it.  */
+  revised = DECL_VALUE_EXPR (var_decl);
+}
+  else
+revised = local_var_i->field_idx;
   gcc_checking_assert (DECL_CONTEXT (var_decl) == lvd->context);
 
   if (decl_expr_p && DECL_INITIAL (var_decl))
@@ -2796,6 +2809,12 @@ register_param_uses (tree *stmt, int *do_subtree 
ATTRIBUTE_UNUSED, void *d)
 {
   param_frame_data *data = (param_frame_data *) d;
 
+  if (VAR_P (*stmt) && DECL_HAS_VALUE_EXPR_P (*stmt))
+{
+  tree x = DECL_VALUE_EXPR (*stmt);
+  cp_walk_tree (, register_param_uses, d, NULL);
+}
+
   if (TREE_CODE (*stmt) != PARM_DECL)
 return NULL_TREE;
 
@@ -2840,7 +2859,6 @@ struct local_vars_frame_data
 {
   tree *field_list;
   hash_map *local_var_uses;
-  vec *captures;
   unsigned int nest_depth, bind_indx;
   location_t loc;
   bool saw_capture;
@@ -2869,26 +2887,27 @@ register_local_var_uses (tree *stmt, int *do_subtree, 
void *d)
  local_var_info _var
= lvd->local_var_uses->get_or_insert (lvar, );
  gcc_checking_assert (!existed);
+ /* For non-VAR_DECL or var as an alias, just leave it.  */
+ if (!VAR_P (lvar) || DECL_HAS_VALUE_EXPR_P (lvar))
+   continue;
  tree lvtype = TREE_TYPE (lvar);
  tree lvname = DECL_NAME (lvar);
- bool captured = is_normal_capture_proxy (lvar);
  /* Make names depth+index unique, so that we can support nested
 scopes with identically named locals.  */
  char *buf;
  

Re: [PATCH 00/14] rs6000: Begin replacing built-in support

2020-02-05 Thread Segher Boessenkool
Hi!

On Wed, Feb 05, 2020 at 08:57:16AM +0100, Richard Biener wrote:
> On Tue, Feb 4, 2020 at 6:40 PM Segher Boessenkool
>  wrote:
> > On Mon, Feb 03, 2020 at 08:26:01PM -0600, Bill Schmidt wrote:
> > > My intent is to make adding new built-in functions as simple as adding
> > > a few lines to a couple of files, and automatically generating as much
> > > of the initialization, overload resolution, and expansion logic as
> > > possible.  This patch series establishes the format of the input files
> > > and creates a new program (rs6000-genbif) to:
> >
> > Let's call it rs6000-gen-builtins or similar.  Not as cryptic.
> 
> I believe we talked about this a few years ago.  Any reason this is powerpc
> specific?  If sufficiently generic most targets would benefit and maybe even
> frontends and the middle-end could make use of this.  The generator
> program, that is.  (disclaimer: I didn't look into the patches at all)

Absolutely, but we first want to solve the urgent problem for Power
(because that is what it is); it's a huge job with that reduction of
scope, already.  After *that* is done, it will be clearer how to do
things for what is wanted generically, will be clearer what is wanted
in the first place :-)

> I always wondered if we can make our C frontend spit out things from
> C declarations (with maybe extra #pragmas for some of the more obscure
> details) and how to fit that into the bootstrap.

I think there will be too many problem cases, a direct description of
the builtins will work better (but is more verbose of course).

In any case, Bill's patches keep the exact same approach in rs6000 as
we had before, just with some more pre-processing and macros etc.;
which results in a much shorter description, many cases folded into one,
which as a bonus also fixes bugs (directly, when two things you fold
should be the same but are not, at least one of them is wrong; and maybe
more importantly indirectly: a reader of the tables will spot errors
much more easily if they fit on one screen, if you have similar entries
on the screen at the same time so you *can* compare; and there will be
more readers as well even, people are actually scared of having to look
at it currently).

So, yes, this same approach might be a good fit generically, but we'll
do it for rs6000 only, in the interest of ever getting it done ;-)
The generator programs etc. can move to generic code later, if that
helps and there is interest in it, there isn't much (if anything) in
here that is specific to our arch.


Segher


Re: [RFA] [PR rtl-optimization/90275] Handle nop reg->reg copies in cse

2020-02-05 Thread Segher Boessenkool
Hi all,

On Wed, Feb 05, 2020 at 07:26:03AM +0100, Jakub Jelinek wrote:
> On Tue, Feb 04, 2020 at 06:04:09PM -0700, Jeff Law wrote:
> > --- a/gcc/cse.c
> > +++ b/gcc/cse.c
> > @@ -5572,6 +5572,16 @@ cse_insn (rtx_insn *insn)
> >   sets[i].rtl = 0;
> > }
> >  
> > +  /* Similarly for no-op moves.  */

It says "no-op MEM moves" right above, so it should say "no-op REG
moves" here?

> > +  if (n_sets == 1
> > + && GET_CODE (src) == REG
> 
> Just nits:
> REG_P (src) ?

Hey that is my nit!  Find your own!  ;-)

> > + && src == dest)
> 
> Is pointer comparison ok?

It isn't, it doesn't work for hard registers.

> I mean, shouldn't we instead do
> rtx_equal_p (src, dest),

This does not see e.g.
  (set (reg:SI 123) (subreg:SI (reg:DI 123) 0))
as no-op move.

> set_noop_p (sets[i].rtl)

This doesn't catch all such cases either.

> or noop_move_p (insn)?

And this one is plain wrong (it should be called something with "maybe"
in the name, it returns false if it thinks that may lead to better
optimisation, see the REG_EQUAL handling).

What we need here is a test whether CSE can ignore this insn, and we
will run into this problem if we don't (Jeff's analysis).  Does CSE
already have everything it uses to make that decision scribbled away
somewhere, when we get here?  It would be good if we could use the
exact same condition (same predicate function for example) as what
would lead to the problem later, or we'll be playing whack-a-mole for
a while (or CSE is completely rewritten soon, my preferred option, but
"soon" on a GCC timescale will take way too long for the PR).


Segher


Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2020 at 12:11:15PM +0100, Uros Bizjak wrote:
> > I think the primary question is, do we for -O0 need to enable split2 (as in,
> > will anything in pro_and_epilogue or jump2 passes benefit from the
> > splitting and worth running two -O0 post-RA split passes), then we could go
> > with the first patch I've posted, or nothing benefits from it in the two
> > passes and it is ok to split only before stack, in that case we can go with
> > the third patch.
> 
> Oh, I see. No passes get scheduled between split3 and split4. Looks to
> me that your third patch is then the way to go.

Ok, will bootstrap/regtest it soon.  Thanks.

Jakub



Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Uros Bizjak
On Wed, Feb 5, 2020 at 12:03 PM Jakub Jelinek  wrote:
>
> On Wed, Feb 05, 2020 at 11:46:51AM +0100, Uros Bizjak wrote:
> > I think we should just enable split4 also for -O0. This would also
> > allow us to remove the "optimize > 0" check above and allow us to
> > generate a bit more optimal code even with -O0 for
> > TARGET_SSE_PARTIAL_REG_DEPENDENCY and TARGET_AVOID_FALSE_DEP_FOR_BMI.
>
> The -O0 passes are:
> test.c.284r.ira
> test.c.285r.reload
> # placement of split2
> test.c.292r.pro_and_epilogue
> test.c.295r.jump2
> # placement of split4
> # placement of split3
> test.c.307r.stack
> test.c.308r.alignments
> test.c.310r.mach
> test.c.311r.barriers
> # placement of split5
> test.c.316r.shorten
> test.c.317r.nothrow
> test.c.318r.dwarf2
> test.c.319r.final
>
> It really doesn't matter if for -O0 STACK_REGS we enable split4 or
> split3, it is the same location, just split3 is better named (split before
> reg-stack, which is run even at -O0, rather than split before sched2, which
> is not run at -O0).
>
> I think the primary question is, do we for -O0 need to enable split2 (as in,
> will anything in pro_and_epilogue or jump2 passes benefit from the
> splitting and worth running two -O0 post-RA split passes), then we could go
> with the first patch I've posted, or nothing benefits from it in the two
> passes and it is ok to split only before stack, in that case we can go with
> the third patch.

Oh, I see. No passes get scheduled between split3 and split4. Looks to
me that your third patch is then the way to go.

Uros.


Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2020 at 11:46:51AM +0100, Uros Bizjak wrote:
> I think we should just enable split4 also for -O0. This would also
> allow us to remove the "optimize > 0" check above and allow us to
> generate a bit more optimal code even with -O0 for
> TARGET_SSE_PARTIAL_REG_DEPENDENCY and TARGET_AVOID_FALSE_DEP_FOR_BMI.

The -O0 passes are:
test.c.284r.ira
test.c.285r.reload
# placement of split2
test.c.292r.pro_and_epilogue
test.c.295r.jump2
# placement of split4
# placement of split3
test.c.307r.stack
test.c.308r.alignments
test.c.310r.mach
test.c.311r.barriers
# placement of split5
test.c.316r.shorten
test.c.317r.nothrow
test.c.318r.dwarf2
test.c.319r.final

It really doesn't matter if for -O0 STACK_REGS we enable split4 or
split3, it is the same location, just split3 is better named (split before
reg-stack, which is run even at -O0, rather than split before sched2, which
is not run at -O0).

I think the primary question is, do we for -O0 need to enable split2 (as in,
will anything in pro_and_epilogue or jump2 passes benefit from the
splitting and worth running two -O0 post-RA split passes), then we could go
with the first patch I've posted, or nothing benefits from it in the two
passes and it is ok to split only before stack, in that case we can go with
the third patch.

Jakub



Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Uros Bizjak
On Wed, Feb 5, 2020 at 11:05 AM Jakub Jelinek  wrote:
>
> On Tue, Feb 04, 2020 at 02:15:04PM +0100, Uros Bizjak wrote:
> > On Tue, Feb 4, 2020 at 2:13 PM Jakub Jelinek  wrote:
> > >
> > > On Tue, Feb 04, 2020 at 01:38:51PM +0100, Uros Bizjak wrote:
> > > > As Richard advised, let's put this safety stuff back. Usually, in
> > > > i386.md, these kind of splitters are implemented as two patterns, one
> > > > (define_insn_and_split) having "#", and the other (define_insn) with a
> > > > real insn. My opinion is, that this separation avoids confusion as
> > > > much as possible.
> > >
> > > Okay.  So like this if it passes bootstrap/regtest then?
> >
> > Yes.
> >
> > > 2020-02-04  Jakub Jelinek  
> > >
> > > PR target/92190
> > > * config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): 
> > > Only
> > > include sets and not clobbers in the vzeroupper pattern.
> > > * config/i386/sse.md (*avx_vzeroupper): Require in insn condition 
> > > that
> > > the parallel has 17 (64-bit) or 9 (32-bit) elts.
> > > (*avx_vzeroupper_1): New define_insn_and_split.
> > >
> > > * gcc.target/i386/pr92190.c: New test.
> >
> > OK.
>
> Unfortunately, it breaks
> +FAIL: gcc.target/i386/avx-2.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-2.c (test for excess errors)
> +FAIL: gcc.target/i386/avx-vzeroupper-10.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-10.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-10.c scan-assembler-times 
> avx_vzeroupper 3
> +FAIL: gcc.target/i386/avx-vzeroupper-11.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-11.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-11.c scan-assembler-times 
> *avx_vzeroall 1
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-11.c scan-assembler-times 
> avx_vzeroupper 3
> +FAIL: gcc.target/i386/avx-vzeroupper-12.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-12.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-12.c scan-assembler-times 
> *avx_vzeroall 1
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-12.c scan-assembler-times 
> avx_vzeroupper 4
> +FAIL: gcc.target/i386/avx-vzeroupper-5.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-5.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-5.c scan-assembler-times 
> avx_vzeroupper 1
> +FAIL: gcc.target/i386/avx-vzeroupper-7.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-7.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-7.c scan-assembler-times 
> avx_vzeroupper 1
> +FAIL: gcc.target/i386/avx-vzeroupper-8.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-8.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-8.c scan-assembler-times 
> avx_vzeroupper 1
> +FAIL: gcc.target/i386/avx-vzeroupper-9.c (internal compiler error)
> +FAIL: gcc.target/i386/avx-vzeroupper-9.c (test for excess errors)
> +UNRESOLVED: gcc.target/i386/avx-vzeroupper-9.c scan-assembler-times 
> avx_vzeroupper 4
> +FAIL: gcc.target/i386/sse-14.c (internal compiler error)
> +FAIL: gcc.target/i386/sse-14.c (test for excess errors)
> +FAIL: gcc.target/i386/sse-22.c (internal compiler error)
> +FAIL: gcc.target/i386/sse-22.c (test for excess errors)
> +FAIL: gcc.target/i386/sse-22a.c (internal compiler error)
> +FAIL: gcc.target/i386/sse-22a.c (test for excess errors)
>
> The problem is that x86 is the only target that defines HAVE_ATTR_length and
> doesn't schedule any splitting pass at -O0 after pro_and_epilogue.
>
> So, either we go back to handling this during vzeroupper output
> (unconditionally, rather than flag_ipa_ra guarded), or we need to tweak the
> split* passes for x86.
>
> Seems there are 5 split passes, split1 is run unconditionally before reload,
> split2 is run for optimize > 0 or STACK_REGS (x86) after ra but before
> epilogue_completed, split3 is run before regstack for STACK_REGS and
> optimize and -fno-schedule-insns2, split4 is run before sched2 if sched2 is
> run and split5 is run before shorten_branches if HAVE_ATTR_length and not
> STACK_REGS.
>
> Attached are 3 possible incremental patches for recog.c, all of them fix
> all the above regressions, but haven't fully bootstrapped/regtested any of
> them yet.  My preference would be the last one, which for -O0 and x86
> disables split2 and enables split3, as it doesn't add any extra passes.
> The first one just enables split3 for -O0 on x86, the second one enables
> split5 for -O0 on x86.

Please note that in i386.md we expect that the check for
"epilogue_completed" means split4 point. There are some places with:

  "TARGET_64BIT && ((optimize > 0 && flag_peephole2)
? epilogue_completed : reload_completed)

so for flag_peephole2, we split after peephole2 pass was performed.

Apparently, we already hit this problem in the past, so the check for

[PATCH] c++: Mark __builtin_convertvector operand as read [PR93557]

2020-02-05 Thread Jakub Jelinek
Hi!

In C++ we weren't calling mark_exp_read on the __builtin_convertvector first
argument.  I guess it could misbehave even with lambda implicit captures.

Fixed by calling decay_conversion on the argument, we use the argument as
rvalue so we want the standard lvalue to rvalue conversions, but as the
argument must be a vector type, e.g. integral promotions aren't really
needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-02-05  Jakub Jelinek  

PR c++/93557
* semantics.c (cp_build_vec_convert): Call decay_conversion on arg
prior to passing it to c_build_vec_convert.

* c-c++-common/Wunused-var-17.c: New test.

--- gcc/cp/semantics.c.jj   2020-02-01 10:01:36.581701153 +0100
+++ gcc/cp/semantics.c  2020-02-04 14:38:43.618588979 +0100
@@ -10389,7 +10389,8 @@ cp_build_vec_convert (tree arg, location
 
   tree ret = NULL_TREE;
   if (!type_dependent_expression_p (arg) && !dependent_type_p (type))
-ret = c_build_vec_convert (cp_expr_loc_or_input_loc (arg), arg,
+ret = c_build_vec_convert (cp_expr_loc_or_input_loc (arg),
+  decay_conversion (arg, complain),
   loc, type, (complain & tf_error) != 0);
 
   if (!processing_template_decl)
--- gcc/testsuite/c-c++-common/Wunused-var-17.c.jj  2020-02-04 
14:42:33.648152691 +0100
+++ gcc/testsuite/c-c++-common/Wunused-var-17.c 2020-02-04 14:41:46.092862987 
+0100
@@ -0,0 +1,19 @@
+/* PR c++/93557 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -Wunused-but-set-variable" } */
+
+typedef int VI __attribute__((vector_size (sizeof (int) * 4)));
+typedef float VF __attribute__((vector_size (sizeof (float) * 4)));
+
+void
+foo (VI *p, VF *q)
+{
+  VI a = (VI) { 1, 2, 3, 4 };  /* { dg-bogus "set but not 
used" } */
+  q[0] = __builtin_convertvector (a, VF);
+  VI b = p[1]; /* { dg-bogus "set but not 
used" } */
+  q[1] = __builtin_convertvector (b, VF);
+  VF c = (VF) { 5.0f, 6.0f, 7.0f, 8.0f };  /* { dg-bogus "set but not 
used" } */
+  p[2] = __builtin_convertvector (c, VI);
+  VF d = q[3]; /* { dg-bogus "set but not 
used" } */
+  p[3] = __builtin_convertvector (d, VI);
+}

Jakub



[PATCH] Add link to porting_to.html from the changes page for GCC 9

2020-02-05 Thread apinski
From: Andrew Pinski 

Looks like the porting_to page was not linked to the changes
page for GCC 9. So uncomments it out.

Committed as obvious.

---
 htdocs/gcc-9/changes.html | 2 --
 1 file changed, 2 deletions(-)

diff --git a/htdocs/gcc-9/changes.html b/htdocs/gcc-9/changes.html
index c0e581fe..22b069c5 100644
--- a/htdocs/gcc-9/changes.html
+++ b/htdocs/gcc-9/changes.html
@@ -17,11 +17,9 @@
 
 This page is a "brief" summary of some of the huge number of improvements
 in GCC 9.
-
 
 
 
-- 
2.17.1



[committed] Fix up comment typo.

2020-02-05 Thread Jakub Jelinek
Hi!

When briefly looking at PR93586, I've noticed a typo, fixed thusly,
committed to trunk as obvious.

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index aec156eb773..a4005841f60 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,5 +1,8 @@
 2020-02-05  Jakub Jelinek  
 
+   * tree-ssa-alias.c (aliasing_matching_component_refs_p): Fix up
+   function comment typo.
+
PR middle-end/93555
* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
simd_clone_create failed when i == 0, adjust clone->nargs by
diff --git a/gcc/ChangeLog-2009 b/gcc/ChangeLog-2009
index c95c0ca129e..1d3702ebaf3 100644
--- a/gcc/ChangeLog-2009
+++ b/gcc/ChangeLog-2009
@@ -14477,7 +14477,7 @@
(aapcs_layout_arg): New function.
(arm_init_cumulative_args): Initialize AAPCS args data.
(arm_function_arg): Handle AAPCS variants using new interface.
-   (arm_arg_parital_bytes): Likewise.
+   (arm_arg_partial_bytes): Likewise.
(arm_function_arg_advance): New function.
(arm_function_ok_for_sibcall): Ensure that sibling calls agree on
calling conventions.
diff --git a/gcc/tree-ssa-alias.c b/gcc/tree-ssa-alias.c
index 670676f20c3..c7e6679f990 100644
--- a/gcc/tree-ssa-alias.c
+++ b/gcc/tree-ssa-alias.c
@@ -856,7 +856,7 @@ type_has_components_p (tree type)
 
 /* MATCH1 and MATCH2 which are part of access path of REF1 and REF2
respectively are either pointing to same address or are completely
-   disjoint. If PARITAL_OVERLAP is true, assume that outermost arrays may
+   disjoint. If PARTIAL_OVERLAP is true, assume that outermost arrays may
just partly overlap.
 
Try to disambiguate using the access path starting from the match

Jakub



[committed] openmp: Avoid ICEs with declare simd; declare simd inbranch [PR93555]

2020-02-05 Thread Jakub Jelinek
Hi!

The testcases ICE because when processing the declare simd inbranch,
we don't create the i == 0 clone as it already exists, which means
clone_info->nargs is not adjusted, but we then rely on it being adjusted
when trying other clones.

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2020-02-05  Jakub Jelinek  

PR middle-end/93555
* omp-simd-clone.c (expand_simd_clones): If simd_clone_mangle or
simd_clone_create failed when i == 0, adjust clone->nargs by
clone->inbranch.

* c-c++-common/gomp/pr93555-1.c: New test.
* c-c++-common/gomp/pr93555-2.c: New test.
* gfortran.dg/gomp/pr93555.f90: New test.

--- gcc/omp-simd-clone.c.jj 2020-01-12 11:54:36.688409260 +0100
+++ gcc/omp-simd-clone.c2020-02-04 18:08:11.271036096 +0100
@@ -1713,14 +1713,22 @@ expand_simd_clones (struct cgraph_node *
 already.  */
  tree id = simd_clone_mangle (node, clone);
  if (id == NULL_TREE)
-   continue;
+   {
+ if (i == 0)
+   clone->nargs += clone->inbranch;
+ continue;
+   }
 
  /* Only when we are sure we want to create the clone actually
 clone the function (or definitions) or create another
 extern FUNCTION_DECL (for prototypes without definitions).  */
  struct cgraph_node *n = simd_clone_create (node);
  if (n == NULL)
-   continue;
+   {
+ if (i == 0)
+   clone->nargs += clone->inbranch;
+ continue;
+   }
 
  n->simdclone = clone;
  clone->origin = node;
--- gcc/testsuite/c-c++-common/gomp/pr93555-1.c.jj  2020-02-04 
22:28:54.557506688 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr93555-1.c 2020-02-04 22:30:18.575246428 
+0100
@@ -0,0 +1,18 @@
+/* PR middle-end/93555 */
+/* { dg-do compile } */
+
+#pragma omp declare simd
+#pragma omp declare simd inbranch
+int
+foo (int x)
+{
+  return x;
+}
+
+#pragma omp declare simd inbranch
+#pragma omp declare simd
+int
+bar (int x)
+{
+  return x;
+}
--- gcc/testsuite/c-c++-common/gomp/pr93555-2.c.jj  2020-02-04 
22:28:57.581461328 +0100
+++ gcc/testsuite/c-c++-common/gomp/pr93555-2.c 2020-02-04 22:30:25.712139374 
+0100
@@ -0,0 +1,16 @@
+/* PR middle-end/93555 */
+/* { dg-do compile } */
+
+#pragma omp declare simd
+#pragma omp declare simd inbranch
+void
+foo (void)
+{
+}
+
+#pragma omp declare simd inbranch
+#pragma omp declare simd
+void
+bar (void)
+{
+}
--- gcc/testsuite/gfortran.dg/gomp/pr93555.f90.jj   2020-02-04 
22:31:27.578211386 +0100
+++ gcc/testsuite/gfortran.dg/gomp/pr93555.f90  2020-02-04 22:31:18.267351050 
+0100
@@ -0,0 +1,11 @@
+! PR middle-end/93555
+! { dg-do compile }
+
+subroutine foo
+  !$omp declare simd(foo)
+  !$omp declare simd(foo) inbranch
+end
+subroutine bar
+  !$omp declare simd(bar) inbranch
+  !$omp declare simd(bar)
+end

Jakub



Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Jakub Jelinek
On Wed, Feb 05, 2020 at 10:58:04AM +0100, Jakub Jelinek wrote:
> Attached are 3 possible incremental patches for recog.c, all of them fix
> all the above regressions, but haven't fully bootstrapped/regtested any of
> them yet.  My preference would be the last one, which for -O0 and x86
> disables split2 and enables split3, as it doesn't add any extra passes.
> The first one just enables split3 for -O0 on x86, the second one enables
> split5 for -O0 on x86.

The -da -O0 changes of the last patch are:

 test.c.275r.split1
 test.c.277r.dfinit
 test.c.278r.mode_sw
 test.c.279r.asmcons
 test.c.284r.ira
 test.c.285r.reload
-test.c.289r.split2
 test.c.292r.pro_and_epilogue
 test.c.295r.jump2
+test.c.306r.split3
 test.c.307r.stack
 test.c.308r.alignments
 test.c.310r.mach
 test.c.311r.barriers
 test.c.316r.shorten
 test.c.317r.nothrow
 test.c.318r.dwarf2
 test.c.319r.final
 test.c.320r.dfinish

while the first one just adds test.c.306r.split3 and doesn't remove
test.c.289r.split2 and the second one just adds test.c.313r.split5
before test.c.316r.shorten.

Jakub



Re: [PATCH] i386: Omit clobbers from vzeroupper until final [PR92190]

2020-02-05 Thread Jakub Jelinek
On Tue, Feb 04, 2020 at 02:15:04PM +0100, Uros Bizjak wrote:
> On Tue, Feb 4, 2020 at 2:13 PM Jakub Jelinek  wrote:
> >
> > On Tue, Feb 04, 2020 at 01:38:51PM +0100, Uros Bizjak wrote:
> > > As Richard advised, let's put this safety stuff back. Usually, in
> > > i386.md, these kind of splitters are implemented as two patterns, one
> > > (define_insn_and_split) having "#", and the other (define_insn) with a
> > > real insn. My opinion is, that this separation avoids confusion as
> > > much as possible.
> >
> > Okay.  So like this if it passes bootstrap/regtest then?
> 
> Yes.
> 
> > 2020-02-04  Jakub Jelinek  
> >
> > PR target/92190
> > * config/i386/i386-features.c (ix86_add_reg_usage_to_vzeroupper): 
> > Only
> > include sets and not clobbers in the vzeroupper pattern.
> > * config/i386/sse.md (*avx_vzeroupper): Require in insn condition 
> > that
> > the parallel has 17 (64-bit) or 9 (32-bit) elts.
> > (*avx_vzeroupper_1): New define_insn_and_split.
> >
> > * gcc.target/i386/pr92190.c: New test.
> 
> OK.

Unfortunately, it breaks
+FAIL: gcc.target/i386/avx-2.c (internal compiler error)
+FAIL: gcc.target/i386/avx-2.c (test for excess errors)
+FAIL: gcc.target/i386/avx-vzeroupper-10.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-10.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-10.c scan-assembler-times 
avx_vzeroupper 3
+FAIL: gcc.target/i386/avx-vzeroupper-11.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-11.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-11.c scan-assembler-times 
*avx_vzeroall 1
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-11.c scan-assembler-times 
avx_vzeroupper 3
+FAIL: gcc.target/i386/avx-vzeroupper-12.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-12.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-12.c scan-assembler-times 
*avx_vzeroall 1
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-12.c scan-assembler-times 
avx_vzeroupper 4
+FAIL: gcc.target/i386/avx-vzeroupper-5.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-5.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-5.c scan-assembler-times 
avx_vzeroupper 1
+FAIL: gcc.target/i386/avx-vzeroupper-7.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-7.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-7.c scan-assembler-times 
avx_vzeroupper 1
+FAIL: gcc.target/i386/avx-vzeroupper-8.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-8.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-8.c scan-assembler-times 
avx_vzeroupper 1
+FAIL: gcc.target/i386/avx-vzeroupper-9.c (internal compiler error)
+FAIL: gcc.target/i386/avx-vzeroupper-9.c (test for excess errors)
+UNRESOLVED: gcc.target/i386/avx-vzeroupper-9.c scan-assembler-times 
avx_vzeroupper 4
+FAIL: gcc.target/i386/sse-14.c (internal compiler error)
+FAIL: gcc.target/i386/sse-14.c (test for excess errors)
+FAIL: gcc.target/i386/sse-22.c (internal compiler error)
+FAIL: gcc.target/i386/sse-22.c (test for excess errors)
+FAIL: gcc.target/i386/sse-22a.c (internal compiler error)
+FAIL: gcc.target/i386/sse-22a.c (test for excess errors)

The problem is that x86 is the only target that defines HAVE_ATTR_length and
doesn't schedule any splitting pass at -O0 after pro_and_epilogue.

So, either we go back to handling this during vzeroupper output
(unconditionally, rather than flag_ipa_ra guarded), or we need to tweak the
split* passes for x86.

Seems there are 5 split passes, split1 is run unconditionally before reload,
split2 is run for optimize > 0 or STACK_REGS (x86) after ra but before
epilogue_completed, split3 is run before regstack for STACK_REGS and
optimize and -fno-schedule-insns2, split4 is run before sched2 if sched2 is
run and split5 is run before shorten_branches if HAVE_ATTR_length and not
STACK_REGS.

Attached are 3 possible incremental patches for recog.c, all of them fix
all the above regressions, but haven't fully bootstrapped/regtested any of
them yet.  My preference would be the last one, which for -O0 and x86
disables split2 and enables split3, as it doesn't add any extra passes.
The first one just enables split3 for -O0 on x86, the second one enables
split5 for -O0 on x86.

Jakub
2020-02-05  Jakub Jelinek  

PR target/92190
* recog.c (pass_split_before_regstack::gate): For STACK_REGS targets,
run even when !optimize.

--- gcc/recog.c.jj  2020-01-12 11:54:36.918405790 +0100
+++ gcc/recog.c 2020-02-05 10:13:20.236989567 +0100
@@ -3991,12 +3991,12 @@ pass_split_before_regstack::gate (functi
  split until final which doesn't allow splitting
  if HAVE_ATTR_length.  */
 # ifdef INSN_SCHEDULING
-  return (optimize && !flag_schedule_insns_after_reload);
+  return !optimize || !flag_schedule_insns_after_reload;
 # else
-  return 

Re: [PATCH] Fix -ffast-math flags handling inconsistencies

2020-02-05 Thread Richard Biener
On Fri, Jan 31, 2020 at 6:01 PM Ulrich Weigand  wrote:
>
> Hello,
>
> we've noticed some inconsistencies in how the component flags of -ffast-math
> are handled, see the discussion on the GCC list starting here:
> https://gcc.gnu.org/ml/gcc/2020-01/msg00365.html
>
> This patch fixes those inconsistencies.  Specifically, there are the
> following changes:
>
> 1. Some component flags for -ffast-math are *set* with -ffast-math
>(changing the default), but are not reset with -fno-fast-math,
>causing the latter to have surprising results.  (Note that since
>"-ffast-math -fno-fast-math" is short-cut by the driver, you'll
>only see the surprising results with "-Ofast -fno-fast-math".)
>This is fixed here by both setting and resetting the flags.
>
>This affects the following flags
>   -fcx-limited-range
>   -fexcess-precision=
>
> 2. Some component flags for -ffast-math are changed from their default,
>but are *not* included in the fast_math_flags_set_p test, causing
>__FAST_MATH__ to remain predefined even when the full set of fast
>math options is not actually in effect.  This is fixed here by
>adding those flags into the fast_math_flags_set_p test.
>
>This affects the following flags:
>  -fcx-limited-range
>  -fassociative-math
>  -freciprocal-math
>
> 3. For some math flags, set_fast_math_flags has code that sets their
>values only to what is already their default.  The overall effect
>of this code is a complete no-op.  This patch removes that dead code.
>
>This affects the following flags:
>  -frounding-math
>  -fsignaling-nans
>
>
> The overall effect of this patch is that now all component flags of
> -ffast-math are treated exactly equivalently:
>   * they are set (changed from their default) with -ffast-math
>   * they are reset to their default with -fno-fast-math
>   * __FAST_MATH__ is only defined if the value of the flag matches
> what -ffast-math would have set it to

The last part is not obviously correct to me since it doesn't match
documentation which says

@item -ffast-math
@opindex ffast-math
Sets the options @option{-fno-math-errno}, @option{-funsafe-math-optimizations},
@option{-ffinite-math-only}, @option{-fno-rounding-math},
@option{-fno-signaling-nans}, @option{-fcx-limited-range} and
@option{-fexcess-precision=fast}.

This option causes the preprocessor macro @code{__FAST_MATH__} to be defined.

to me this reads as -ffast-math -fexcess-precision=standard defines
__FAST_MATH__.
The only relevant part to define __FAST_MATH__ is specifying -ffast-math, other
options are not relevant (which of course is contradicted by
implementation - where
I didn't actually follow its history in that respect).  So can you
adjust documentation
as to when exactly __FAST_MATH__ is defined?

Also...

> Tested on s390x-ibm-linux.
>
> OK for mainline?
>
> Bye,
> Ulrich
>
> ChangeLog:
>
> * opts.c (set_fast_math_flags): In the !set case, also reset
> x_flag_cx_limited_range and x_flag_excess_precision.  Remove dead
> code resetting x_flag_signaling_nans and x_flag_rounding_math.
> (fast_math_flags_set_p): Also test x_flag_cx_limited_range,
> x_flag_associative_math, and x_flag_reciprocal_math.
>
> diff --git a/gcc/opts.c b/gcc/opts.c
> index 7affeb4..4452793 100644
> --- a/gcc/opts.c
> +++ b/gcc/opts.c
> @@ -2850,18 +2850,14 @@ set_fast_math_flags (struct gcc_options *opts, int 
> set)
>  opts->x_flag_finite_math_only = set;
>if (!opts->frontend_set_flag_errno_math)
>  opts->x_flag_errno_math = !set;
> -  if (set)
> -{
> -  if (opts->frontend_set_flag_excess_precision == 
> EXCESS_PRECISION_DEFAULT)
> -   opts->x_flag_excess_precision
> - = set ? EXCESS_PRECISION_FAST : EXCESS_PRECISION_DEFAULT;
> -  if (!opts->frontend_set_flag_signaling_nans)
> -   opts->x_flag_signaling_nans = 0;
> -  if (!opts->frontend_set_flag_rounding_math)
> -   opts->x_flag_rounding_math = 0;
> -  if (!opts->frontend_set_flag_cx_limited_range)
> -   opts->x_flag_cx_limited_range = 1;
> -}
> +  if (!opts->frontend_set_flag_cx_limited_range)
> +opts->x_flag_cx_limited_range = set;
> +  if (!opts->frontend_set_flag_excess_precision)
> +opts->x_flag_excess_precision
> +  = set ? EXCESS_PRECISION_FAST : EXCESS_PRECISION_DEFAULT;
> +
> +  // -ffast-math should also reset -fsignaling-nans and -frounding-math,
> +  // but since those are off by default, there's nothing to do for now.
...

but what happens to -fsignalling-nans -ffast-math then?  Better leave those
in I'd say.

Note frontends come into play with what is considered -ffast-math
and -fno-fast-math but below flags are tested irrespectively of that
interpretation.

Note there's -fcx-fortran-rules similar to -fcx-limited-range but not tested
above.  The canonical middle-end "flag" to look at is flag_complex_method.
Somehow -fcx-fortran-rules doesn't come into play at all above but it
affects 

[Patch][Testsuite] – More fixes for remote execution: check_gc_sections_available, …

2020-02-05 Thread Tobias Burnus

Still pending: libgomp-Testsuite patch 
https://gcc.gnu.org/ml/gcc-patches/2020-02/msg00207.html

This is the same fix – but for gcc/testsuite/.

To illustrate the problem again. Using remote testing
(here: modified target_compile, but DejaGNU's default_target_compile is 
likewise),
and using check_gc_sections_available, I get the following result without the 
fix:

call_remote  download  -print-prog-name=ld -print-prog-name=ld
Invoking the compiler as powerpc64le-none-linux-gnu-gcc … 
/scratch/…/default/gcc.d/-print-prog-name=ld
…
UNSUPPORTED: gcc.dg/special/gcsec-1.c

Which obvious fails both for uploading the flag as file and for compiling the
flag as filename.


WITH the patch, I get the expected:

Invoking the compiler as powerpc64le-none-linux-gnu-gcc … 
-fdiagnostics-urls=never  -print-prog-name=ld -I .
…
PASS: gcc.dg/special/gcsec-1.c (test for excess errors)
PASS: gcc.dg/special/gcsec-1.c execution test

I tested the attached patch for check_gc_sections_available thoroughly
– esp. via RUNTESTFLAGS="-v -v -v -v -v special.exp=gcsec-1.c"
w/ and w/o remote but also using, intermittently, some debugging "puts".

I also fixed check_effective_target_gld, check_effective_target_gas, 
check_runtime,
check_multi_dir, but tested them only lightly. (The first two are unused, the
others are only used by mips and arm.)
Regarding the '{ }' see below – or in the linked other patch. Without, only one
argument is actually passed.

OK for the trunk?

Tobias

On 2/4/20 4:49 PM, Tobias Burnus wrote:

DejaGNU uses in lib/target.exp's
   proc default_target_compile {source destfile type options}
uses the following.

When running locally, $source is simply used
as argument. However, when run remotely, it is tried to be uploaded
on the remote host – and otherwise skipped. That's fine if the
argument is an actual file – but not if it is a flag.

Hence, flags should be passed as $options not as $source.
Quoting now from DejaGNU's default_target_compile#:

    set sources ""
    if {[isremote host]} {
    foreach x $source {
    set file [remote_download host $x]
    if { $file eq "" } {
    warning "Unable to download $x to host."
    return "Unable to download $x to host."
    } else {
    append sources " $file"
    }
    }
    } else {
    set sources $source
    }

 * * *

FIRST, I think that affects the following calls, but I have not
checked. — I assume that moving it from the first to the last
argument should work and fix the "isremote" issue.

gcc/testsuite/gcc.target/arm/multilib.exp:    set gcc_output 
[${tool}_target_compile "--print-multi-directory $gcc_opts" "" "none" ""]
gcc/testsuite/lib/target-supports.exp:    set gcc_output 
[${tool}_target_compile "-v" "" "none" ""]
gcc/testsuite/lib/target-supports.exp:  set gcc_ld [lindex 
[${tool}_target_compile "-print-prog-name=ld" "" "none" ""] 0]
gcc/testsuite/lib/target-supports.exp:  set gcc_as [lindex 
[${tool}_target_compile "-print-prog-name=as" "" "none" ""] 0]
gcc/testsuite/lib/target-supports.exp:  set gcc_ld [lindex 
[${tool}_target_compile "-print-prog-name=ld" "" "none" ""] 0]


TODO: One should probably change this.


SECONDLY: I hit a very similar issue in libgomp, which I actually did 
debug.


In check_effective_target_offload_target_nvptx, one has something 
similar, namely:

  set gcc_output [libgomp_target_compile "-v $options" "" "none" ""]
This currently tries (w/o success) to upload the flags to the remote 
host and then

fails as the required "-v" flag fails (i.e. no input).

However, using "-v" as options argument does not work as seemingly 
only additional_flags=
arguments are passed on. Additionally, if one does not only want to 
pass on the first

argument, curly braces are needed.

PATCH: The attached patch does this – and have successfully tested it 
both with local
runs and with remote runs. (RUNTESTFLAGS="-v -v -v" did help a lot for 
studying the

behavior in both cases.)

OK for the trunk?

Cheers,

Tobias
	gcc/testsuite/
	* gcc.target/arm/multilib.exp (multilib_config): Pass flags to
	…_target_compile as (additional_flags=) option and not as source
	filename to make it work with remote execution.
	* lib/target-supports.exp (check_runtime, check_gc_sections_available,
	check_effective_target_gas, check_effective_target_gld): Likewise.

diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp b/gcc/testsuite/gcc.target/arm/multilib.exp
index 67d00266f6b..60b9edeebd0 100644
--- a/gcc/testsuite/gcc.target/arm/multilib.exp
+++ b/gcc/testsuite/gcc.target/arm/multilib.exp
@@ -40,7 +40,7 @@ proc multilib_config {profile} {
 proc check_multi_dir { gcc_opts multi_dir } {
 global tool
 
-set gcc_output [${tool}_target_compile "--print-multi-directory $gcc_opts" "" "none" ""]
+set gcc_output [${tool}_target_compile "" "" "none" "{additional_flags=--print-multi-directory $gcc_opts}"]
 if { [string match "$multi_dir\n" $gcc_output] } {
 	pass 

Re: [PATCH coroutines v1] Build co_await/yield_expr with unknown_type in processing_template_decl phase

2020-02-05 Thread JunMa

在 2020/2/5 下午2:14, JunMa 写道:

Hi
This patch builds co_await/yield_expr with unknown_type when we can not
know the promise type in processing_template_decl phase. it avoid to
confuse compiler when handing type deduction and conversion.

Bootstrap and test on X86_64, is it OK?

Regards
JunMa


Hi
sorry for that '}' was removed, here is the update patch:)

Regards
JunMa

gcc/cp
2020-02-05  Jun Ma 

    * coroutines.cc (finish_co_await_expr): Build co_await_expr
    with unknown_type_node.
    (finish_co_yield_expr): Ditto.
    *pt.c (type_dependent_expression_p): Set co_await/yield_expr
    with unknown type as dependent.

gcc/testsuite
2020-02-05  Jun Ma 

    * g++.dg/coroutines/torture/co-await-14-template-traits.C: New 
test.



---
 gcc/cp/coroutines.cc  |  6 ++---
 gcc/cp/pt.c   |  5 
 .../torture/co-await-14-template-traits.C | 24 +++
 3 files changed, 32 insertions(+), 3 deletions(-)
 create mode 100644 
gcc/testsuite/g++.dg/coroutines/torture/co-await-14-template-traits.C

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index 7525d7c035a..04e56b9a636 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -834,8 +834,8 @@ finish_co_await_expr (location_t kw, tree expr)
   /* If we don't know the promise type, we can't proceed.  */
   tree functype = TREE_TYPE (current_function_decl);
   if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build5_loc (kw, CO_AWAIT_EXPR, TREE_TYPE (expr), expr, NULL_TREE,
-  NULL_TREE, NULL_TREE, integer_zero_node);
+   return build5_loc (kw, CO_AWAIT_EXPR, unknown_type_node, expr,
+  NULL_TREE, NULL_TREE, NULL_TREE, integer_zero_node);
 }
 
   /* We must be able to look up the "await_transform" method in the scope of
@@ -912,7 +912,7 @@ finish_co_yield_expr (location_t kw, tree expr)
   tree functype = TREE_TYPE (current_function_decl);
   /* If we don't know the promise type, we can't proceed.  */
   if (dependent_type_p (functype) || type_dependent_expression_p (expr))
-   return build2_loc (kw, CO_YIELD_EXPR, TREE_TYPE (expr), expr,
+   return build2_loc (kw, CO_YIELD_EXPR, unknown_type_node, expr,
   NULL_TREE);
 }
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 40ff3c3a089..cfc3393991e 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -26743,6 +26743,11 @@ type_dependent_expression_p (tree expression)
   if (TREE_CODE (expression) == SCOPE_REF)
return false;
 
+  /* CO_AWAIT/YIELD_EXPR with unknown type is always dependent.  */
+  if (TREE_CODE (expression) == CO_AWAIT_EXPR
+ || TREE_CODE (expression) == CO_YIELD_EXPR)
+   return true;
+
   if (BASELINK_P (expression))
{
  if (BASELINK_OPTYPE (expression)
diff --git 
a/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-template-traits.C 
b/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-template-traits.C
new file mode 100644
index 000..4e670b1c308
--- /dev/null
+++ b/gcc/testsuite/g++.dg/coroutines/torture/co-await-14-template-traits.C
@@ -0,0 +1,24 @@
+//  { dg-do compile }
+//  Test we create co_await_expr with dependent type rather than type of 
awaitable class
+
+#include "../coro.h"
+#include "../coro1-ret-int-yield-int.h"
+#include 
+
+struct TestAwaiter {
+int recent_test;
+TestAwaiter(int test) : recent_test{test} {}
+bool await_ready() { return true; }
+void await_suspend(coro::coroutine_handle<>) {}
+int await_resume() { return recent_test;}
+void return_value(int x) { recent_test = x;}
+};
+
+template 
+coro1 test_temparg (std::chrono::duration dur)
+{
+   auto sum = co_await TestAwaiter(1);
+   if (!sum)
+dur.count();
+   co_return 0;
+}
-- 
2.19.1.3.ge56e4f7



Re: [PATCH Coroutines][OBVIOUS]Fix typo of temporary variable name

2020-02-05 Thread Iain Sandoe

bin.cheng  wrote:


This is an obvious patch fixing typo in maybe_promote_captured_temps,
previously, the vnum (== 0) is used for all captured temps...

Will commit later.


Yeah, I made this exact change locally yesterday (but I didn’t commit it yet,
so go ahead if you have time).

Iain



Re: [PATCH][DOCUMENTATION] Document ASLR for Precompiled Headers.

2020-02-05 Thread Martin Liška

On 2/5/20 8:52 AM, Richard Biener wrote:

"Address space layout randomization (ASLR) can lead to not binary identical
PCH files.  If you rely on stable PCH file contents disable ASLR when generating
PCH files."

?


I'm fine with that. Installed as 
r10-6442-gf4239581925d6a9fba049f8f771e909a7a5e5ce7.

Martin


[PATCH Coroutines][OBVIOUS]Fix typo of temporary variable name

2020-02-05 Thread bin.cheng
Hi,
This is an obvious patch fixing typo in maybe_promote_captured_temps,
previously, the vnum (== 0) is used for all captured temps...

Will commit later.

Thanks,
bin

gcc/cp
2020-02-05  Bin Cheng  

* coroutines.cc (maybe_promote_captured_temps): Increase the index
number for temporary variables' name.

obvious-vnum.patch
Description: Binary data


Re: [PATCH] avoid issuing -Wrestrict from folder (PR 93519)

2020-02-05 Thread Richard Biener
On Tue, Feb 4, 2020 at 11:02 PM Martin Sebor  wrote:
>
> On 2/4/20 2:31 PM, Jeff Law wrote:
> > On Tue, 2020-02-04 at 13:08 -0700, Martin Sebor wrote:
> >> On 2/4/20 12:15 PM, Richard Biener wrote:
> >>> On February 4, 2020 5:30:42 PM GMT+01:00, Jeff Law  
> >>> wrote:
>  On Tue, 2020-02-04 at 10:34 +0100, Richard Biener wrote:
> > On Tue, Feb 4, 2020 at 1:44 AM Martin Sebor  wrote:
> >> PR 93519 reports a false positive -Wrestrict issued for an inlined
>  call
> >> to strcpy that carefully guards against self-copying.  This is
>  caused
> >> by the caller's arguments substituted into the call during inlining
>  and
> >> before dead code elimination.
> >>
> >> The attached patch avoids this by removing -Wrestrict from the
>  folder
> >> and deferring folding perfectly overlapping (and so undefined)
>  calls
> >> to strcpy (and mempcpy, but not memcpy) until much later.  Calls to
> >> perfectly overlapping calls to memcpy are still folded early.
> >
> > Why do we bother to warn at all for this case?  Just DWIM here.
>  Warnings like
> > this can be emitted from the analyzer?
>  They potentially can, but the analyzer is and will almost always
>  certainly be considerably slower.  I would not expect it to be used
>  nearly as much as the core compiler.
> 
>  WHether or not a particular warning makes sense in the core compiler or
>  analyzer would seem to me to depend on whether or not we can reasonably
>  issue warnings without interprocedural analysis.  double-free
>  realistically requires interprocedural analysis to be effective.  I'm
>  not sure Wrestrict really does.
> 
> 
> > That is, I suggest to simply remove the bogus warning code from
>  folding
> > (and _not_ fail the folding).
>  I haven't looked at the patch, but if we can get the warning out of the
>  folder that's certainly preferable.  And we could investigate deferring
>  self-copy removal.
> >>>
> >>> I think the issue is as usual, warning for code we'll later remove as 
> >>> dead. Warning at folding is almost always premature.
> >>
> >> In this instance the code is reachable (or isn't obviously unreachable).
> >> GCC doesn't remove it, but provides benign (and reasonable) semantics
> >> for it(*).  To me, that's one aspect of quality.  Letting the user know
> >> that the code is buggy is another.  I view that as at least as important
> >> as folding the ill-effects away because it makes it possible to fix
> >> the problem so the code works correctly even with compilers that don't
> >> provide these benign semantics.
> > If you look at the guts of what happens at the point where we issue the
> > warning from within gimple_fold_builtin_strcpy we have:
> >
> >> DCH_to_char (char * in, char * out, int collid)
> >> {
> >>int type;
> >>char * D.2148;
> >>char * dest;
> >>char * num;
> >>long unsigned int _4;
> >>char * _5;
> >>
> >> ;;   basic block 2, loop depth 0
> >> ;;pred:   ENTRY
> >> ;;succ:   4
> >>
> >> ;;   basic block 4, loop depth 0
> >> ;;pred:   2
> >> ;;succ:   5
> >>
> >> ;;   basic block 5, loop depth 0
> >> ;;pred:   4
> >> ;;succ:   6
> >>
> >> ;;   basic block 6, loop depth 0
> >> ;;pred:   5
> >>if (0 != 0)
> >>  goto ; [53.47%]
> >>else
> >>  goto ; [46.53%]
> >> ;;succ:   7
> >> ;;8
> >>
> >> ;;   basic block 7, loop depth 0
> >> ;;pred:   6
> >>strcpy (out_1(D), out_1(D));
> >> ;;succ:   8
> >>
> >> ;;   basic block 8, loop depth 0
> >> ;;pred:   6
> >> ;;7
> >>_4 = __builtin_strlen (out_1(D));
> >>_5 = out_1(D) + _4;
> >>__builtin_memcpy (_5, "foo", 4);
> >> ;;succ:   3
> >>
> >> ;;   basic block 3, loop depth 0
> >> ;;pred:   8
> >>return;
> >> ;;succ:   EXIT
> >>
> >> }
> >>
> >
> > Which shows the code is obviously unreachable in the case we're warning
> > about.  You can't see this in the dumps because it's exposed by
> > inlining, then cleaned up before writing the dump file.
>
> In the specific case of the bug the code is of course eliminated
> because it's guarded by the if (s != d).  I was referring to
> the general (unguarded) case of:
>
>char *s = "", *p;
>
>int main (void)
>{
>  p = strcpy (s, s);
>  puts (p);
>}
>
> where GCC folds the assignment 'p = strcpy(s, s);' to effectively
> p = s;  That's perfectly reasonable but it could equally as well
> leave the call alone, as it does when s is null, for instance.
>
> I think folding it away is not only reasonable but preferable to
> making the invalid call, but it's done only rarely.  Most of
> the time GCC does emit the undefined access (it does that with
> calls to library functions as well as with direct stores and
> reads).  (I am hoping we can change that in the future so that
> these 

Re: [PATCH] alias: Fix offset checks involving section anchors [PR92294]

2020-02-05 Thread Richard Biener
On Tue, Feb 4, 2020 at 6:44 PM Richard Sandiford
 wrote:
>
> Richard Sandiford  writes:
> > [...]
> >> I'm not sure given the issues you've introduced if I could actually
> >> fill out the matrix of answers without more underlying information.
> >> ie, when can we get symbols without source level decls,
> >> anchors+interposition issues, etc.
> >
> > OK.  In that case, I wonder whether it would be safer to have a
> > fourth state on top of the three above:
> >
> >   - known distance apart
> >   - independent
> >   - known distance apart or independent
> >   - don't know
> >
> > with "don't know" being anything that involves bare symbols?
> >
> > Richard
>
> How does this look?  Tested on aarch64-linux-gnu and
> x86_64-linux-gnu.
>
> Full description from scratch:
>
> memrefs_conflict_p has a slightly odd structure.  It first checks
> whether two addresses based on SYMBOL_REFs refer to the same object,
> with a tristate result:
>
>   int cmp = compare_base_symbol_refs (x,y);
>
> If the addresses do refer to the same object, we can use offset-based checks:
>
>   /* If both decls are the same, decide by offsets.  */
>   if (cmp == 1)
> return offset_overlap_p (c, xsize, ysize);
>
> But then, apart from the special case of forced address alignment,
> we use an offset-based check even if we don't know whether the
> addresses refer to the same object:
>
>   /* Assume a potential overlap for symbolic addresses that went
>  through alignment adjustments (i.e., that have negative
>  sizes), because we can't know how far they are from each
>  other.  */
>   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
> return -1;
>   /* If decls are different or we know by offsets that there is no 
> overlap,
>  we win.  */
>   if (!cmp || !offset_overlap_p (c, xsize, ysize))
> return 0;
>
> This somewhat contradicts:
>
>   /* In general we assume that memory locations pointed to by different labels
>  may overlap in undefined ways.  */

Sorry for not chiming in earlier but isn't the bug that



> at the end of compare_base_symbol_refs.  In other words, we're taking -1
> to mean that either (a) the symbols are equal (via aliasing) or (b) the
> references access non-overlapping objects.
>
> But even assuming that's true for normal symbols, it doesn't cope
> correctly with section anchors.  If a symbol X at ANCHOR+OFFSET
> is preemptible, either (a) X = ANDHOR+OFFSET or (b) X and ANCHOR
> reference non-overlapping objects.
>
> And an offset-based comparison makes no sense for an anchor symbol
> vs. a bare symbol with no decl.  If the bare symbol is allowed to
> alias other symbols then it can surely alias any symbol in the
> anchor's block, so there are multiple anchor offsets that might
> induce an alias.

But then isn't the simple fix to honor the -1 and do

diff --git a/gcc/alias.c b/gcc/alias.c
index 3794f9b6a9e..bf13d37c0f7 100644
--- a/gcc/alias.c
+++ b/gcc/alias.c
@@ -2490,9 +2490,8 @@ memrefs_conflict_p (poly_int64 xsize, rtx x,
poly_int64 ysize, rtx y,
 other.  */
   if (maybe_lt (xsize, 0) || maybe_lt (ysize, 0))
return -1;
-  /* If decls are different or we know by offsets that there is no overlap,
-we win.  */
-  if (!cmp || !offset_overlap_p (c, xsize, ysize))
+  /* If decls are different, we win.  */
+  if (cmp == 0)
return 0;
   /* Decls may or may not be different and offsets overlap*/
   return -1;

?

> This patch therefore replaces the current tristate:
>
>   - known equal
>   - known independent (two accesses can't alias)
>   - equal or independent
>
> with:
>
>   - known distance apart
>   - known independent (two accesses can't alias)
>   - known distance apart or independent
>   - don't know
>
> For safety, the patch puts all bare symbols in the "don't know"
> category.  If that turns out to be too conservative, we at least
> need that behaviour for combinations involving a bare symbol
> and a section anchor.

This sounds complicated (for this stage).  Do you have any statistics as
to how it affects actual alias queries (thus outcome in {true,...}_dependence)
when you do the "simple" fix?

Thanks,
Richard.

> 2020-02-04  Richard Sandiford  
>
> gcc/
> PR rtl-optimization/92294
> * alias.c (compare_base_symbol_refs): Take an extra parameter
> and add the distance between two symbols to it.  Enshrine in
> comments that -1 means "either 0 or 1, but we can't tell
> which at compile time".  Return -2 for symbols whose
> relationship is unknown.
> (memrefs_conflict_p): Update call accordingly.
> (rtx_equal_for_memref_p): Likewise.  Punt for a return value of -2,
> without even checking the offset.  Take the distance between symbols
> into account.
> ---
>  gcc/alias.c | 53 ++---
>  1 file changed, 38 insertions(+), 15 deletions(-)
>
>