Re: [PATCH] Fix attribute access issues

2019-11-22 Thread Richard Biener
On November 23, 2019 2:03:21 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>On Thu, Nov 21, 2019 at 06:09:34PM -0700, Martin Sebor wrote:
>> > >  PR middle-end/83859
>> > >  * c-attribs.c (handle_access_attribute): New function.
>> > >  (c_common_attribute_table): Add new attribute.
>> > >  (get_argument_type): New function.
>> > >  (append_access_attrs): New function.
>
>I'm getting
>+FAIL: gcc.dg/Wstringop-overflow-24.c (internal compiler error)
>+FAIL: gcc.dg/Wstringop-overflow-24.c (test for excess errors)
>on i686-linux, while it succeeds on x86_64-linux.  On a closer look,
>there is a buffer overflow even on x86_64-linux as can be seen under
>valgrind, plus memory leak.
>
>The buffer overflow is in append_access_attrs:
>==9759== Command: ./cc1 -quiet -Wall Wstringop-overflow-24.c
>==9759== 
>==9759== Invalid write of size 1
>==9759==at 0x483BD9F: strcpy (vg_replace_strmem.c:513)
>==9759==by 0xA11FF4: append_access_attrs(tree_node*, tree_node*,
>char const*, char, long*) (c-attribs.c:3934)
>==9759==by 0xA12AD3: handle_access_attribute(tree_node**,
>tree_node*, tree_node*, int, bool*) (c-attribs.c:4158)
>==9759==by 0x88E1BF: decl_attributes(tree_node**, tree_node*, int,
>tree_node*) (attribs.c:728)
>==9759==by 0x8A6A9B: c_decl_attributes(tree_node**, tree_node*,
>int) (c-decl.c:4944)
>==9759==by 0x8A6FE2: start_decl(c_declarator*, c_declspecs*, bool,
>tree_node*) (c-decl.c:5083)
>==9759==by 0x91CB15: c_parser_declaration_or_fndef(c_parser*, bool,
>bool, bool, bool, bool, tree_node**, vec,
>bool, tree_node*, oacc_routine_data*, bool*) (c-parser.c:2216)
>==9759==by 0x91B742: c_parser_external_declaration(c_parser*)
>(c-parser.c:1690)
>==9759==by 0x91B25E: c_parser_translation_unit(c_parser*)
>(c-parser.c:1563)
>==9759==by 0x9590A4: c_parse_file() (c-parser.c:21524)
>==9759==by 0x9E308E: c_common_parse_file() (c-opts.c:1185)
>==9759==by 0x1211AEE: compile_file() (toplev.c:458)
>==9759==  Address 0x5113f68 is 0 bytes after a block of size 8 alloc'd
>==9759==at 0x483880B: malloc (vg_replace_malloc.c:309)
>==9759==by 0x229BF17: xmalloc (xmalloc.c:147)
>==9759==by 0xA11FC0: append_access_attrs(tree_node*, tree_node*,
>char const*, char, long*) (c-attribs.c:3932)
>==9759==by 0xA12AD3: handle_access_attribute(tree_node**,
>tree_node*, tree_node*, int, bool*) (c-attribs.c:4158)
>==9759==by 0x88E1BF: decl_attributes(tree_node**, tree_node*, int,
>tree_node*) (attribs.c:728)
>==9759==by 0x8A6A9B: c_decl_attributes(tree_node**, tree_node*,
>int) (c-decl.c:4944)
>==9759==by 0x8A6FE2: start_decl(c_declarator*, c_declspecs*, bool,
>tree_node*) (c-decl.c:5083)
>==9759==by 0x91CB15: c_parser_declaration_or_fndef(c_parser*, bool,
>bool, bool, bool, bool, tree_node**, vec,
>bool, tree_node*, oacc_routine_data*, bool*) (c-parser.c:2216)
>==9759==by 0x91B742: c_parser_external_declaration(c_parser*)
>(c-parser.c:1690)
>==9759==by 0x91B25E: c_parser_translation_unit(c_parser*)
>(c-parser.c:1563)
>==9759==by 0x9590A4: c_parse_file() (c-parser.c:21524)
>==9759==by 0x9E308E: c_common_parse_file() (c-opts.c:1185)
>If n2 != 0, newlen is computed as n1 + n2, but that doesn't take into
>account for the , that is added in between the two.
>
>The following patch ought to fix both the buffer overflow (by adding 1
>if n2
>is non-zero), memory leak (freeing newspec buffer after creating the
>string;
>I've considered using XALLOCAVEC instead, but I believe the string can
>be
>arbitrarily long on functions with thousands of arguments), using
>XNEWVEC
>instead of (type *) xmalloc, using auto_diagnostic_group to bind
>warning +
>inform together and fixes a typo in the documentation.
>
>Ok for trunk if it passes bootstrap/regtest on x86_64-linux and
>i686-linux?

Ok. 

Richard. 

>2019-11-23  Jakub Jelinek  
>
>   PR middle-end/83859
>   * doc/extend.texi (attribute access): Fix a typo.
>
>   * c-attribs.c (append_access_attrs): Avoid buffer overflow.  Avoid
>   memory leak.  Use XNEWVEC macro.  Use auto_diagnostic_group to
>   group warning with inform together.
>   (handle_access_attribute): Formatting fix.
>
>--- gcc/doc/extend.texi.jj 2019-11-22 19:11:53.634970558 +0100
>+++ gcc/doc/extend.texi2019-11-23 01:34:33.344849287 +0100
>@@ -2490,7 +2490,7 @@ The following attributes are supported o
> 
> The @code{access} attribute enables the detection of invalid or unsafe
> accesses by functions to which they apply to or their callers, as well
>-as wite-only accesses to objects that are never read from.  Such
>accesses
>+as write-only accesses to objects that are never read from.  Such
>accesses
> may be diagnosed by warnings such as @option{-Wstringop-overflow},
> @option{-Wunnitialized}, @option{-Wunused}, and others.
> 
>--- gcc/c-family/c-attribs.c.jj2019-11-22 19:11:54.0 +0100
>+++ gcc/c-family/c-attribs.c   2019-11-23 01:44:50.306617000 +0100
>@@ -3840,7 +3840,7 @@ append_access_attrs (tree t, 

Re: [PATCH] Perform cfg cleanup in cse if needed (PR rtl-optimization/92610)

2019-11-22 Thread Richard Biener
On November 23, 2019 2:15:36 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>cse_main which sets tem has:
>  if (cse_jumps_altered || recorded_label_ref)
>return 2;
>  else if (cse_cfg_altered)
>return 1;
>  else
>return 0;
>at the end, but after these two functions call it, they call also
>delete_trivially_dead_insns.  Even when cse_main didn't need cfg
>changes,
>delete_trivially_dead_insns could remove a trivially dead insn that had
>EH edges and so we might still need to remove unreachable blocks.
>
>The following patch implements it.  Bootstrapped/regtested on
>x86_64-linux
>and i686-linux, ok for trunk?

Ok. 

Richard. 

>2019-11-23  Jakub Jelinek  
>
>   PR rtl-optimization/92610
>   * cse.c (rest_of_handle_cse2): Call cleanup_cfg (0) also if
>   cse_cfg_altered is set, even when tem is 0.
>   (rest_of_handle_cse_after_global_opts): Likewise.
>
>   * g++.dg/opt/pr92610.C: New test.
>
>--- gcc/cse.c.jj   2019-11-19 22:27:01.960059340 +0100
>+++ gcc/cse.c  2019-11-22 15:51:34.639806155 +0100
>@@ -7701,7 +7701,7 @@ rest_of_handle_cse2 (void)
>   cse_cfg_altered |= cleanup_cfg (CLEANUP_CFG_CHANGED);
>   timevar_pop (TV_JUMP);
> }
>-  else if (tem == 1)
>+  else if (tem == 1 || cse_cfg_altered)
> cse_cfg_altered |= cleanup_cfg (0);
> 
>   cse_not_expected = 1;
>@@ -7775,7 +7775,7 @@ rest_of_handle_cse_after_global_opts (vo
>   cse_cfg_altered |= cleanup_cfg (CLEANUP_CFG_CHANGED);
>   timevar_pop (TV_JUMP);
> }
>-  else if (tem == 1)
>+  else if (tem == 1 || cse_cfg_altered)
> cse_cfg_altered |= cleanup_cfg (0);
> 
>   flag_cse_follow_jumps = save_cfj;
>--- gcc/testsuite/g++.dg/opt/pr92610.C.jj  2019-11-22 15:52:46.254734331
>+0100
>+++ gcc/testsuite/g++.dg/opt/pr92610.C 2019-11-22 15:53:14.991303498
>+0100
>@@ -0,0 +1,13 @@
>+// PR rtl-optimization/92610
>+// { dg-do compile }
>+// { dg-options "-w -fdelete-dead-exceptions
>--param=sccvn-max-alias-queries-per-access=0 -fno-dse
>-fnon-call-exceptions -Os -funroll-loops -ftrapv" }
>+
>+struct C { int x; ~C () {} };
>+
>+int
>+main ()
>+{
>+  C *buffer = new C[42];
>+  buffer[-3].x = 42;
>+  delete [] buffer;
>+}
>
>   Jakub



Re: [PATCH], V7, #5 of 7, Add more effective targets for the 'future' system to target-supports.

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 14, 2019 at 05:56:50PM -0500, Michael Meissner wrote:
>   * lib/target-supports.exp
>   (check_effective_target_powerpc_future_ok): Do not require 64-bit
>   or Linux support before doing the test.  Use a 32-bit constant in
>   PLI.

You changed from 0x12345 to 0x1234, instead -- why?

> -# Return 1 if this is a PowerPC target supporting -mfuture.
> -# Limit this to 64-bit linux systems for now until other
> -# targets support FUTURE.
> +# Return 1 if this is a PowerPC target supporting -mcpu=future.

"Return 1 if the assembler supports Future instructions."

Please make it explicit that this isn't about the compiler.

>  proc check_effective_target_powerpc_future_ok { } {
> -if { ([istarget powerpc64*-*-linux*]) } {
> +if { ([istarget powerpc*-*-*]) } {
>   return [check_no_compiler_messages powerpc_future_ok object {
>   int main (void) {
>   long e;
> - asm ("pli %0,%1" : "=r" (e) : "n" (0x12345));
> + asm ("pli %0,%1" : "=r" (e) : "n" (0x1234));
>   return e;
>   }
>   } "-mfuture"]

You are still passing -mfuture.


> +# Return 1 if this is a PowerPC target supporting -mcpu=future.  The compiler
> +# must support large numeric prefixed addresses by default when -mfuture is
> +# used.  We test loading up a large constant to verify that the full 34-bit
> +# offset for prefixed instructions is supported and we check for a prefixed
> +# load as well.

The comment says one thing.

-mfuture isn't a compiler option...  Well it still is, but that should
be removed.

The actual test uses only 30 bits (and a positive number).  Which is fine,
but the comment is misleading then: the code doesn't test "if the full
34-bit offset is supported".

I don't understand why we test both pli and pld.

> +proc check_effective_target_powerpc_prefixed_addr_ok { } {

The name says another.

> +if { ([istarget powerpc*-*-*]) } {

This part has no function?  Are there any testcases that test for our
prefixed insns that are compiler for non-powerpc?

If we want this at all, this test shouldn't be nested, just should be
an early-out.

> + return [check_no_compiler_messages powerpc_prefixed_addr_ok object {
> + int main (void) {
> + extern long l[];
> + long e, e2;
> + asm ("pli %0,%1" : "=r" (e) : "n" (0x12345678));
> + asm ("pld %0,0x12345678(%1)" : "=r" (e2) : "r" (& l[0]));

(should be "b" for the last constraint; and "[0]" is usually written
just "l").

> + return e - e2;
> + }
> + } "-mfuture"]

And the code tests two things.

-mcpu=future, instead?

Does this need to be separate from check_effective_target_powerpc_future_ok
at all?

> +# Return 1 if this is a PowerPC target supporting -mfuture.  The compiler 
> must

That is the third selector claiming to test the same thing ("target supports
-mfuture").

> +# support PC-relative addressing when -mcpu=future is used to pass this test.
> +
> +proc check_effective_target_powerpc_pcrel_ok { } {
> +if { ([istarget powerpc*-*-*]) } {
> + return [check_no_compiler_messages powerpc_pcrel_ok object {
> +   int main (void) {
> +   static int s __attribute__((__used__));
> +   int e;
> +   asm ("plwa %0,s@pcrel(0),1" : "=r" (e));
> +   return e;
> +   }
> +   } "-mfuture"]
> +  } else {
> +   return 0
> +  }
> +}

So every assembler will support either all three of these, or none.
Can you simplify this please?


Segher


[PATCH] Perform cfg cleanup in cse if needed (PR rtl-optimization/92610)

2019-11-22 Thread Jakub Jelinek
Hi!

cse_main which sets tem has:
  if (cse_jumps_altered || recorded_label_ref)
return 2;
  else if (cse_cfg_altered)
return 1;
  else
return 0;
at the end, but after these two functions call it, they call also
delete_trivially_dead_insns.  Even when cse_main didn't need cfg changes,
delete_trivially_dead_insns could remove a trivially dead insn that had
EH edges and so we might still need to remove unreachable blocks.

The following patch implements it.  Bootstrapped/regtested on x86_64-linux
and i686-linux, ok for trunk?

2019-11-23  Jakub Jelinek  

PR rtl-optimization/92610
* cse.c (rest_of_handle_cse2): Call cleanup_cfg (0) also if
cse_cfg_altered is set, even when tem is 0.
(rest_of_handle_cse_after_global_opts): Likewise.

* g++.dg/opt/pr92610.C: New test.

--- gcc/cse.c.jj2019-11-19 22:27:01.960059340 +0100
+++ gcc/cse.c   2019-11-22 15:51:34.639806155 +0100
@@ -7701,7 +7701,7 @@ rest_of_handle_cse2 (void)
   cse_cfg_altered |= cleanup_cfg (CLEANUP_CFG_CHANGED);
   timevar_pop (TV_JUMP);
 }
-  else if (tem == 1)
+  else if (tem == 1 || cse_cfg_altered)
 cse_cfg_altered |= cleanup_cfg (0);
 
   cse_not_expected = 1;
@@ -7775,7 +7775,7 @@ rest_of_handle_cse_after_global_opts (vo
   cse_cfg_altered |= cleanup_cfg (CLEANUP_CFG_CHANGED);
   timevar_pop (TV_JUMP);
 }
-  else if (tem == 1)
+  else if (tem == 1 || cse_cfg_altered)
 cse_cfg_altered |= cleanup_cfg (0);
 
   flag_cse_follow_jumps = save_cfj;
--- gcc/testsuite/g++.dg/opt/pr92610.C.jj   2019-11-22 15:52:46.254734331 
+0100
+++ gcc/testsuite/g++.dg/opt/pr92610.C  2019-11-22 15:53:14.991303498 +0100
@@ -0,0 +1,13 @@
+// PR rtl-optimization/92610
+// { dg-do compile }
+// { dg-options "-w -fdelete-dead-exceptions 
--param=sccvn-max-alias-queries-per-access=0 -fno-dse -fnon-call-exceptions -Os 
-funroll-loops -ftrapv" }
+
+struct C { int x; ~C () {} };
+
+int
+main ()
+{
+  C *buffer = new C[42];
+  buffer[-3].x = 42;
+  delete [] buffer;
+}

Jakub



[PATCH] Fix ICE with inline asm "=@cc.." (PR target/92615)

2019-11-22 Thread Jakub Jelinek
Hi!

The following testcase ICEs, because ix86_md_asm_adjust emits an invalid
insn that isn't matched.  setcc_qi output is nonimmediate_operand and so is
fine, but the problem is if we decide to do a ZERO_EXTEND, because
zero_extendqidi2 output must be register_operand, but dest could be MEM as
in the testcase.  All other cases look ok to me, including if dest_mode is
SImode and DEST_MODE is DImode (with -m32), because we then do zero_extend
into a temporary pseudo and final extension that can cope with nonimmediate
dest, or the movstrictqi which is also into a register.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk and
release branches?

2019-11-23  Jakub Jelinek  

PR target/92615
* config/i386/i386.c (ix86_md_asm_adjust): If dest_mode is
GET_MODE (dest), is not QImode, using ZERO_EXTEND and dest is not
register_operand, force x into register before storing it into dest.
Formatting fix.

* gcc.target/i386/pr92615.c: New test.

--- gcc/config/i386/i386.c.jj   2019-11-18 12:07:54.673405114 +0100
+++ gcc/config/i386/i386.c  2019-11-22 14:43:50.895674949 +0100
@@ -20819,11 +20819,15 @@ ix86_md_asm_adjust (vec , v
{
  x = force_reg (dest_mode, const0_rtx);
 
- emit_insn (gen_movstrictqi
-(gen_lowpart (QImode, x), destqi));
+ emit_insn (gen_movstrictqi (gen_lowpart (QImode, x), destqi));
}
  else
-   x = gen_rtx_ZERO_EXTEND (dest_mode, destqi);
+   {
+ x = gen_rtx_ZERO_EXTEND (dest_mode, destqi);
+ if (dest_mode == GET_MODE (dest)
+ && !register_operand (dest, GET_MODE (dest)))
+   x = force_reg (dest_mode, x);
+   }
}
 
   if (dest_mode != GET_MODE (dest))
--- gcc/testsuite/gcc.target/i386/pr92615.c.jj  2019-11-22 14:49:35.878541378 
+0100
+++ gcc/testsuite/gcc.target/i386/pr92615.c 2019-11-22 14:52:49.658658923 
+0100
@@ -0,0 +1,45 @@
+/* PR target/92615 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+void *a;
+long long b;
+char c;
+
+void
+foo (void)
+{
+  void *p;
+  long long q;
+  char r;
+  __asm__ ("" : : "r" (), "r" (), "r" ());
+  __asm__ ("" : "=@cca" (p));
+  a = p;
+  __asm__ ("" : "=@cca" (q));
+  b = q;
+  __asm__ ("" : "=@cca" (r));
+  c = r;
+  __asm__ ("" : : "r" (), "r" (), "r" ());
+}
+
+void
+bar (void)
+{
+  void *p;
+  long long q;
+  char r;
+  __asm__ ("" : "=@cca" (p));
+  a = p;
+  __asm__ ("" : "=@cca" (q));
+  b = q;
+  __asm__ ("" : "=@cca" (r));
+  c = r;
+  __asm__ ("" : : "r" (p), "A" (q), "q" (r));
+}
+
+void
+baz (void)
+{
+  void *p = (void *) 
+  __asm__ __volatile__ ("" : "=@ccng" (p) : "r" (1));
+}

Jakub



[PATCH] Fix attribute access issues

2019-11-22 Thread Jakub Jelinek
Hi!

On Thu, Nov 21, 2019 at 06:09:34PM -0700, Martin Sebor wrote:
> > >   PR middle-end/83859
> > >   * c-attribs.c (handle_access_attribute): New function.
> > >   (c_common_attribute_table): Add new attribute.
> > >   (get_argument_type): New function.
> > >   (append_access_attrs): New function.

I'm getting
+FAIL: gcc.dg/Wstringop-overflow-24.c (internal compiler error)
+FAIL: gcc.dg/Wstringop-overflow-24.c (test for excess errors)
on i686-linux, while it succeeds on x86_64-linux.  On a closer look,
there is a buffer overflow even on x86_64-linux as can be seen under
valgrind, plus memory leak.

The buffer overflow is in append_access_attrs:
==9759== Command: ./cc1 -quiet -Wall Wstringop-overflow-24.c
==9759== 
==9759== Invalid write of size 1
==9759==at 0x483BD9F: strcpy (vg_replace_strmem.c:513)
==9759==by 0xA11FF4: append_access_attrs(tree_node*, tree_node*, char 
const*, char, long*) (c-attribs.c:3934)
==9759==by 0xA12AD3: handle_access_attribute(tree_node**, tree_node*, 
tree_node*, int, bool*) (c-attribs.c:4158)
==9759==by 0x88E1BF: decl_attributes(tree_node**, tree_node*, int, 
tree_node*) (attribs.c:728)
==9759==by 0x8A6A9B: c_decl_attributes(tree_node**, tree_node*, int) 
(c-decl.c:4944)
==9759==by 0x8A6FE2: start_decl(c_declarator*, c_declspecs*, bool, 
tree_node*) (c-decl.c:5083)
==9759==by 0x91CB15: c_parser_declaration_or_fndef(c_parser*, bool, bool, 
bool, bool, bool, tree_node**, vec, bool, tree_node*, 
oacc_routine_data*, bool*) (c-parser.c:2216)
==9759==by 0x91B742: c_parser_external_declaration(c_parser*) 
(c-parser.c:1690)
==9759==by 0x91B25E: c_parser_translation_unit(c_parser*) (c-parser.c:1563)
==9759==by 0x9590A4: c_parse_file() (c-parser.c:21524)
==9759==by 0x9E308E: c_common_parse_file() (c-opts.c:1185)
==9759==by 0x1211AEE: compile_file() (toplev.c:458)
==9759==  Address 0x5113f68 is 0 bytes after a block of size 8 alloc'd
==9759==at 0x483880B: malloc (vg_replace_malloc.c:309)
==9759==by 0x229BF17: xmalloc (xmalloc.c:147)
==9759==by 0xA11FC0: append_access_attrs(tree_node*, tree_node*, char 
const*, char, long*) (c-attribs.c:3932)
==9759==by 0xA12AD3: handle_access_attribute(tree_node**, tree_node*, 
tree_node*, int, bool*) (c-attribs.c:4158)
==9759==by 0x88E1BF: decl_attributes(tree_node**, tree_node*, int, 
tree_node*) (attribs.c:728)
==9759==by 0x8A6A9B: c_decl_attributes(tree_node**, tree_node*, int) 
(c-decl.c:4944)
==9759==by 0x8A6FE2: start_decl(c_declarator*, c_declspecs*, bool, 
tree_node*) (c-decl.c:5083)
==9759==by 0x91CB15: c_parser_declaration_or_fndef(c_parser*, bool, bool, 
bool, bool, bool, tree_node**, vec, bool, tree_node*, 
oacc_routine_data*, bool*) (c-parser.c:2216)
==9759==by 0x91B742: c_parser_external_declaration(c_parser*) 
(c-parser.c:1690)
==9759==by 0x91B25E: c_parser_translation_unit(c_parser*) (c-parser.c:1563)
==9759==by 0x9590A4: c_parse_file() (c-parser.c:21524)
==9759==by 0x9E308E: c_common_parse_file() (c-opts.c:1185)
If n2 != 0, newlen is computed as n1 + n2, but that doesn't take into
account for the , that is added in between the two.

The following patch ought to fix both the buffer overflow (by adding 1 if n2
is non-zero), memory leak (freeing newspec buffer after creating the string;
I've considered using XALLOCAVEC instead, but I believe the string can be
arbitrarily long on functions with thousands of arguments), using XNEWVEC
instead of (type *) xmalloc, using auto_diagnostic_group to bind warning +
inform together and fixes a typo in the documentation.

Ok for trunk if it passes bootstrap/regtest on x86_64-linux and i686-linux?

2019-11-23  Jakub Jelinek  

PR middle-end/83859
* doc/extend.texi (attribute access): Fix a typo.

* c-attribs.c (append_access_attrs): Avoid buffer overflow.  Avoid
memory leak.  Use XNEWVEC macro.  Use auto_diagnostic_group to
group warning with inform together.
(handle_access_attribute): Formatting fix.

--- gcc/doc/extend.texi.jj  2019-11-22 19:11:53.634970558 +0100
+++ gcc/doc/extend.texi 2019-11-23 01:34:33.344849287 +0100
@@ -2490,7 +2490,7 @@ The following attributes are supported o
 
 The @code{access} attribute enables the detection of invalid or unsafe
 accesses by functions to which they apply to or their callers, as well
-as wite-only accesses to objects that are never read from.  Such accesses
+as write-only accesses to objects that are never read from.  Such accesses
 may be diagnosed by warnings such as @option{-Wstringop-overflow},
 @option{-Wunnitialized}, @option{-Wunused}, and others.
 
--- gcc/c-family/c-attribs.c.jj 2019-11-22 19:11:54.0 +0100
+++ gcc/c-family/c-attribs.c2019-11-23 01:44:50.306617000 +0100
@@ -3840,7 +3840,7 @@ append_access_attrs (tree t, tree attrs,
   if (idxs[1])
 n2 = sprintf (attrspec + n1 + 1, "%u", (unsigned) idxs[1] - 1);
 
-  size_t newlen = n1 + n2;
+  size_t newlen = n1 + n2 + !!n2;
   char 

Re: [PATCH] V7, #4 of 7, Add explicit (0),1 to @pcrel references

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 14, 2019 at 05:51:14PM -0500, Michael Meissner wrote:
> In some of my previous work, I had make a mistake forgetting that the PADDI
> instruction did not allow adding a PC-relative reference to a register (you 
> can
> either load up a PC-relative address without adding a register, or you can add
> a register to a constant).  The assembler allowed the instruction, but it
> didn't do what I expected.

So, what was the instruction?

> --- gcc/config/rs6000/rs6000.c(revision 278175)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -13241,7 +13241,10 @@ print_operand_address (FILE *file, rtx x
>if (SYMBOL_REF_P (x) && !SYMBOL_REF_LOCAL_P (x))
>   fprintf (file, "@got");
>  
> -  fprintf (file, "@pcrel");
> +  /* Specifically add (0),1 to catch uses where a @pcrel was added to a 
> an
> +  address with a base register, since the hardware does not support
> +  adding a base register to a PC-relative address.  */
> +  fprintf (file, "@pcrel(0),1");

But this is print_operand_address, it shouldn't know anything about
specific instructions, it certainly shouldn't print commas and other
fields.

Can you fix this in the caller?


Segher


Re: [PATCH], V7, #3 of 7, Use PADDI for 34-bit immediate adds

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 14, 2019 at 05:44:42PM -0500, Michael Meissner wrote:
> This patch generates PADDI to add 34-bit immediate constants on the 'future'
> system, and prevents such adds from being split.

I don't see that last part?  Is that a remnant of a previous version of
the patch?

>   * config/rs6000/predicates.md (add_operand): Add support for
>   PADDI.

More directly: Allow constants that satisfy eI.

>   * config/rs6000/rs6000.md (add3): Add support for PADDI.

(add3 for GPR): etc.

We have six things all called add3 (VI2, DDTD, SDI, SFDF, IEEE128,
VEC_F).  Erm.

Wait, this is called *add3, not add3.


Okay for trunk with that fixed.  Thanks,


Segher


Re: [PATCH], V7, #2 of 7, Use PLI to load up 32-bit SImode constants

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 14, 2019 at 05:42:43PM -0500, Michael Meissner wrote:
> -;; Split a load of a large constant into the appropriate two-insn
> -;; sequence.
> +;; Split a load of a large constant into the appropriate two-insn sequence.  
> On
> +;; systems that support PADDI (PLI), we can use PLI to load any 32-bit 
> constant
> +;; in one instruction.
>  
>  (define_split
>[(set (match_operand:SI 0 "gpc_reg_operand")
>   (match_operand:SI 1 "const_int_operand"))]
>"(unsigned HOST_WIDE_INT) (INTVAL (operands[1]) + 0x8000) >= 0x1
> -   && (INTVAL (operands[1]) & 0x) != 0"
> +   && (INTVAL (operands[1]) & 0x) != 0 && !TARGET_PREFIXED_ADDR"
>[(set (match_dup 0)
>   (match_dup 2))
> (set (match_dup 0)

Please use num_insns_constant, instead (and fix num_insns_constant_gpr
so it knows about SIGNED_34BIT).


Segher


Re: [PATCH] V7, #1 of 7, Use PLI to load up 34-bit DImode constants

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 14, 2019 at 05:40:10PM -0500, Michael Meissner wrote:
> --- gcc/config/rs6000/rs6000.c(revision 278173)
> +++ gcc/config/rs6000/rs6000.c(working copy)
> @@ -5552,7 +5552,7 @@ static int
>  num_insns_constant_gpr (HOST_WIDE_INT value)
>  {
>/* signed constant loadable with addi */
> -  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x1)
> +  if (SIGNED_16BIT_OFFSET_P (value))
>  return 1;

Hrm, so the SIGNED_nBIT_OFFSET_P should not be called "offset", if we use
them for other numbers as well.

> -;;  GPR store  GPR load   GPR move   GPR li GPR lis GPR #
> -;;  FPR store  FPR load   FPR move   AVX store  AVX store   AVX 
> load
> -;;  AVX load   VSX move   P9 0   P9 -1  AVX 0/-1VSX 0
> -;;  VSX -1 P9 const   AVX const  From SPR   To SPR  
> SPR<->SPR
> -;;  VSX->GPR   GPR->VSX
> +;;  GPR store  GPR load   GPR move   GPR li GPR lis GPR 
> pli
> +;;  GPR #  FPR store  FPR load   FPR move   AVX store   AVX 
> store
> +;;  AVX load   AVX load   VSX move   P9 0   P9 -1   AVX 
> 0/-1
> +;;  VSX 0  VSX -1 P9 const   AVX const  From SPRTo 
> SPR
> +;;  SPR<->SPR  VSX->GPR   GPR->VSX

I cannot make heads or tails of it this way.  Please just add the "pli",
don't rearrange everything else.

There do not have to be exactly six per line.  The only reason to have
some order here is to make it easier to read, not to make it *harder*!

So for this first line let's have three GPR moves, and then have four
load immediates.  Then in the future if we need to edit it again, make
the edited part make some sense, etc.

>  ; Some DImode loads are best done as a load of -1 followed by a mask
> -; instruction.
> +; instruction.  On systems that support the PADDI (PLI) instruction,
> +; num_insns_constant returns 1, so these splitter would not be used for 
> things
> +; that be loaded with PLI.

That comment doesn't add much at all?  This splitter isn't used for
constants we can load in one insn, that's right.  That happily works
just fine if we have prefixed insns as well.


Okay for trunk with those things fixed.  Thanks!


Segher


Re: [PATCH v2] Add `--with-install-sysroot=' configuration option

2019-11-22 Thread Joseph Myers
On Fri, 22 Nov 2019, Maciej W. Rozycki wrote:

>  As I recall the MIPS sysroot setup (please correct me if I got something 
> wrong here) was like:

Yes, that's the sort of layout you get with sysroot suffixes.  See 
gcc/config/mips/{st.h,t-st} for an example.

>  Then the right-hand side of /path/to/somewhere (except for usr/) is what 
> gets printed by `-print-multi-directory' or the left-hand side of output 
> from `-print-multi-lib', e.g. `sof/el/lib64' for the example above.  

Rather, it's a suffix (as in SYSROOT_SUFFIX_SPEC, no command-line option 
to print it), followed by a directory such as /lib64 that comes from 
STARTFILE_PREFIX_SPEC.  (Until MULTILIB_OSDIRNAMES / 
-print-multi-os-directory were added, I think STARTFILE_PREFIX_SPEC was 
the main mechanism for using directories such as /lib64; once the multilib 
OS directory mechanism was added, STARTFILE_PREFIX_SPEC was needed much 
less, but is still relevant for this sysroot use case, along with some 
linker configuration in binutils to teach it about such directories.)

> Similarly `-print-multi-os-directory' prints a directory path relative to 
> a lib/ subdirectory to the sysroot, so that would be `../sof/el/lib64' 
> respectively.

Rather, it's a path relative to the (non-sysroot, before your patch) 
directory where the compiler installs the libraries.  See e.g. t-st using 
paths such as ../lib64/2f.

>  Well, I agree we need to have this stuff documented beyond what we 
> currently have, but I think it applies equally to all the sysroot options 
> we have, including both the `--sysroot=' GCC driver's option, and the 
> `--with-sysroot=', `--with-build-sysroot=' and the newly-proposed 

All three of those refer to the top-level sysroot path, to which a sysroot 
suffix is appended based on SYSROOT_SUFFIX_SPEC (unless 
--no-sysroot-suffix is used).

> `--with-install-sysroot=' `configure' script's options as well.  All we 
> currently have is this paragraph:

But this is a path relative to which SYSROOT_SUFFIX_SPEC isn't used at 
all.

>  And last but not least: do we want to hold my proposed change hostage to 
> a sysroot handling documentation improvement?  It does not appear fair to 
> me as the situation with said documentation is not a new problem nor one 
> specific to this newly-added option, and the new option merely played the 

The proposed new option is, as far as I know, the first one introducing 
this new kind of sysroot option (one for which the suffix from 
SYSROOT_SUFFIX_SPEC is never added).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [PATCH v2] Add `--with-install-sysroot=' configuration option

2019-11-22 Thread Maciej W. Rozycki
On Wed, 20 Nov 2019, Joseph Myers wrote:

> >  Thanks for your concern, however again, AFAICT this change is tangential 
> > to any sysroot suffix, which necessarily has to be already included in the 
> > multilib OS directory as given by `-print-multi-os-directory', so that it 
> > gets embedded within $toolexeclibdir for the purpose of target library 
> > installation across the relevant subdirs, as per this excerpt from 
> > `configure' code right after the assignments quoted in the example above:
> > 
> > multi_os_directory=`$CC -print-multi-os-directory`
> > case $multi_os_directory in
> >   .) ;; # Avoid trailing /.
> >   *) toolexeclibdir=$toolexeclibdir/$multi_os_directory ;;
> > esac
> > 
> > or otherwise the existing arrangement where 
> > toolexeclibdir='$(toolexecdir)/lib' wouldn't have worked either (and 
> > neither would in the native case where toolexeclibdir='$(libdir)').
> > 
> >  Does this answer clear your concern?  OK to apply with the documentation 
> > thinko fixed?
> 
> The answer explains the reasoning behind the design of the option (i.e., 
> the design that means it's not particularly useful with sysroot suffixes, 
> because the user would still need to relocate libraries manually to the 
> correct suffixed sysroot).  It is indeed the case that making a version 
> useful with sysroot suffixes would not simply be a configuration change 
> but involve changes in the compiler driver to disentangle two different 
> uses of multilib OS directory suffixes.

 I think I am confused now about your mention of the existence of two 
different uses here.  This may be because it's been a while since I worked 
with a MIPS toolchain configuration and my memory may have started to 
fade.  So at this point I'll appreciate if you enlighten me a bit.

 As I recall the MIPS sysroot setup (please correct me if I got something 
wrong here) was like:

/path/to/somewhere/
  +-> lib/
  +-> lib32/
  +-> lib64/
  +-> usr/
  |  +-> lib/
  |  +-> lib32/
  |  \-> lib64/
  +-> el/
  | +-> lib/
  | +-> lib32/
  | +-> lib64/
  | \-> usr/
  |+-> lib/
  |+-> lib32/
  |\-> lib64/
  +-> sof/
  |  +-> lib/
  |  +-> lib32/ 
  |  +-> lib64/
  |  +-> usr/
  |  |  +-> lib/
  |  |  +-> lib32/
  |  |  \-> lib64/
  |  \-> el/
  |+-> lib/
  |+-> lib32/
  |+-> lib64/
  |\-> usr/
  |   +-> lib/
  |   +-> lib32/
  |   \-> lib64/
  .
  .
  .

and the use of `--sysroot=/path/to/somewhere' combined with the required 
multilib selection options, such as `-EL -mabi=64 -msoft-float' would make 
the GCC driver point the linker at the right set of directories to use for 
libraries to be linked against.  For the options given here these would be 
/path/to/somewhere/sof/el/lib64 and /path/to/somewhere/sof/el/usr/lib64.

 For RISC-V targets the structure so far is much simpler and for the Linux 
target amounts to:

/path/to/somewhere/
  +-> lib32/
  |+-> ilp32/
  |\-> ilp32d/
  +-> lib64/
  |+-> ilp64/
  |\-> ilp64d/
  \-> usr/
 +-> lib32/
 |+-> ilp32/
 |\-> ilp32d/
 \-> lib64/
  +-> ilp64/
  \-> ilp64d/

NB I have deliberately omitted header files from this consideration; these 
could or could not be shared among multilibs depending on the particular 
target although as I recall and agree with the desire was to have a single 
shared set of headers living under /path/to/somewhere/usr/include/.

 Then the right-hand side of /path/to/somewhere (except for usr/) is what 
gets printed by `-print-multi-directory' or the left-hand side of output 
from `-print-multi-lib', e.g. `sof/el/lib64' for the example above.  
Similarly `-print-multi-os-directory' prints a directory path relative to 
a lib/ subdirectory to the sysroot, so that would be `../sof/el/lib64' 
respectively.

 I have no immediate access to a MIPS toolchain (not at least one with 
multilib support configured), but I have made a quick experiment with my 
RISC-V toolchain (configured with the sysroot at 

[PATCH] Don't override various Makefile variables for gnulib et al

2019-11-22 Thread Christian Biesinger via gcc-patches
Normally the toplevel Makefile will pass various CC=foo and other
flags down to subdir Makefiles. However, for Gnulib this is a problem
because Gnulib's configure specifically sets CC to something that
includes a -std=gnu11 flag on some systems, and this override would
set it back to CC=gcc, leading to compile errors in a GDB build
with an updated Gnulib.

I don't believe this is needed outside of GCC, so this patch changes
Gnulib and other non-GCC modules to just not override any flags --
the values set during configure time should be fine. If a user
overrides them manually when invoking make, those will still work.

Under the same condition, I also removed the host_exports. I don't
understand why this is ever necessary (this is only after configure
has run).

The other option is to clear MAKEOVERRIDES in gnulib/Makefile.am, but
that means the user can't override any variables for this subdirectory.

ChangeLog:

2019-11-22  Christian Biesinger  

* Makefile.def: Pass no_exports_and_flags to various non-GCC
modules.
* Makefile.in: Allow passing a no_exports_and_flags argument to
"all" to suppress emitting exports and make flags. Useful when
invoked via host_modules from Makefile.def.
* Makefile.tpl: Regenerate.

Change-Id: I7d80328cf81c133ba6157eec7d10c422b6790723
---
 Makefile.def | 12 ++--
 Makefile.in  | 30 --
 Makefile.tpl |  9 ++---
 3 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/Makefile.def b/Makefile.def
index 311feb9de3..e1ff065202 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -33,7 +33,7 @@ build_modules= { module= fixincludes; };
 build_modules= { module= libcpp;
  extra_configure_flags='--disable-nls am_cv_func_iconv=no';};
 
-host_modules= { module= bfd; bootstrap=true; };
+host_modules= { module= bfd; bootstrap=true; no_exports_and_flags=true; };
 host_modules= { module= opcodes; bootstrap=true; };
 host_modules= { module= binutils; bootstrap=true; };
 host_modules= { module= bison; no_check_cross= true; };
@@ -105,15 +105,15 @@ host_modules= { module= libiconv;
missing= install-html;
missing= install-info; };
 host_modules= { module= m4; };
-host_modules= { module= readline; };
+host_modules= { module= readline; no_exports_and_flags=true; };
 host_modules= { module= sid; };
-host_modules= { module= sim; };
+host_modules= { module= sim; no_exports_and_flags=true; };
 host_modules= { module= texinfo; no_install= true; };
 host_modules= { module= zlib; no_install=true; no_check=true;
bootstrap=true;
extra_configure_flags='@extra_host_zlib_configure_flags@';};
-host_modules= { module= gnulib; };
-host_modules= { module= gdb; };
+host_modules= { module= gnulib; no_exports_and_flags=true; };
+host_modules= { module= gdb; no_exports_and_flags=true; };
 host_modules= { module= expect; };
 host_modules= { module= guile; };
 host_modules= { module= tk; };
@@ -129,7 +129,7 @@ host_modules= { module= lto-plugin; bootstrap=true;
extra_make_flags='@extra_linker_plugin_flags@'; };
 host_modules= { module= libcc1; extra_configure_flags=--enable-shared; };
 host_modules= { module= gotools; };
-host_modules= { module= libctf; no_check=true;
+host_modules= { module= libctf; no_check=true; no_exports_and_flags=true;
bootstrap=true; };
 
 target_modules = { module= libstdc++-v3;
diff --git a/Makefile.in b/Makefile.in
index 1aabf6ede4..bd41753543 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -3414,10 +3414,9 @@ maybe-all-bfd: all-bfd
 all-bfd: configure-bfd
@r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
-   $(HOST_EXPORTS)  \
+\
(cd $(HOST_SUBDIR)/bfd && \
- $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) 
$(STAGE1_FLAGS_TO_PASS)  \
-   $(TARGET-bfd))
+ $(MAKE) $(TARGET-bfd))
 @endif bfd
 
 
@@ -25530,10 +25529,9 @@ all-readline: configure-readline
@: $(MAKE); $(unstage)
@r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
-   $(HOST_EXPORTS)  \
+\
(cd $(HOST_SUBDIR)/readline && \
- $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) 
$(STAGE1_FLAGS_TO_PASS)  \
-   $(TARGET-readline))
+ $(MAKE) $(TARGET-readline))
 @endif readline
 
 
@@ -26412,10 +26410,9 @@ all-sim: configure-sim
@: $(MAKE); $(unstage)
@r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); ${PWD_COMMAND}`; export s; \
-   $(HOST_EXPORTS)  \
+\
(cd $(HOST_SUBDIR)/sim && \
- $(MAKE) $(BASE_FLAGS_TO_PASS) $(EXTRA_HOST_FLAGS) 
$(STAGE1_FLAGS_TO_PASS)  \
-   $(TARGET-sim))
+ $(MAKE) $(TARGET-sim))
 @endif sim
 
 
@@ -28150,10 +28147,9 @@ all-gnulib: configure-gnulib
@: $(MAKE); $(unstage)
@r=`${PWD_COMMAND}`; export r; \
s=`cd $(srcdir); 

[analyzer] Add sm-malloc.dot

2019-11-22 Thread David Malcolm
I've found myself wanting a visualization of the state machines,
so this patch adds a sm-malloc.dot, based on sm-malloc.cc

Ideally we would autogenerate the .dot file somehow, but there isn't a
good way to do this at this time.

A .png version of this .dot file can be seen at:
  https://dmalcolm.fedorapeople.org/gcc/2019-11-22/sm-malloc.png

Pushed to branch "dmalcolm/analyzer" on the GCC git mirror.

gcc/ChangeLog:
* analyzer/sm-malloc.cc (class malloc_state_machine): Add note
about sm-malloc.dot.
(malloc_state_machine::on_stmt): Fix some comments to use "freed"
rather than "free" as the name of a state.
* analyzer/sm-malloc.dot: New file.
---
 gcc/analyzer/sm-malloc.cc  |  8 +++--
 gcc/analyzer/sm-malloc.dot | 89 ++
 2 files changed, 94 insertions(+), 3 deletions(-)
 create mode 100644 gcc/analyzer/sm-malloc.dot

diff --git a/gcc/analyzer/sm-malloc.cc b/gcc/analyzer/sm-malloc.cc
index 602cb0b..d3d9147 100644
--- a/gcc/analyzer/sm-malloc.cc
+++ b/gcc/analyzer/sm-malloc.cc
@@ -34,7 +34,9 @@ namespace {
 
 
 
-/* A state machine for detecting misuses of the malloc/free API.  */
+/* A state machine for detecting misuses of the malloc/free API.
+
+   See sm-malloc.dot for an overview (keep this in-sync with that file).  */
 
 class malloc_state_machine : public state_machine
 {
@@ -615,7 +617,7 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
 
  arg = sm_ctxt->get_readable_tree (arg);
 
- /* start/unchecked/nonnull -> free.  */
+ /* start/unchecked/nonnull -> freed.  */
  sm_ctxt->on_transition (node, stmt, arg, m_start, m_freed);
  sm_ctxt->on_transition (node, stmt, arg, m_unchecked, m_freed);
  sm_ctxt->on_transition (node, stmt, arg, m_nonnull, m_freed);
@@ -623,7 +625,7 @@ malloc_state_machine::on_stmt (sm_context *sm_ctxt,
  /* Keep state "null" as-is, rather than transitioning to "free";
 we don't want want to complain about double-free of NULL.  */
 
- /* free -> stop, with warning.  */
+ /* freed -> stop, with warning.  */
  sm_ctxt->warn_for_state (node, stmt, arg, m_freed,
   new double_free (*this, arg));
  sm_ctxt->on_transition (node, stmt, arg, m_freed, m_stop);
diff --git a/gcc/analyzer/sm-malloc.dot b/gcc/analyzer/sm-malloc.dot
new file mode 100644
index 000..2050267
--- /dev/null
+++ b/gcc/analyzer/sm-malloc.dot
@@ -0,0 +1,89 @@
+/* An overview of the state machine from sm-malloc.cc.
+   Copyright (C) 2019 Free Software Foundation, Inc.
+   Contributed by David Malcolm .
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it
+under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful, but
+WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+/* Keep this in-sync with sm-malloc.cc  */
+
+digraph "malloc" {
+
+  /* STATES. */
+
+  /* Start state.  */
+  start;
+
+  /* State for a pointer returned from malloc that hasn't been checked for
+ NULL.
+ It could be a pointer to heap-allocated memory, or could be NULL.  */
+  unchecked;
+
+  /* State for a pointer that's known to be NULL.  */
+  null;
+
+  /* State for a pointer to heap-allocated memory, known to be non-NULL.  */
+  nonnull;
+
+  /* State for a pointer to freed memory.  */
+  freed;
+
+  /* State for a pointer that's known to not be on the heap (e.g. to a local
+ or global).  */
+  non_heap;
+
+  /* Stop state, for pointers we don't want to track any more.  */
+  stop;
+
+  /* TRANSITIONS. */
+
+  start -> unchecked [label="on 'X=malloc(...);'"];
+  start -> unchecked [label="on 'X=calloc(...);'"];
+
+  start -> non_heap [label="on 'X=alloca(...);'"];
+  start -> non_heap [label="on 'X=__builtin_alloca(...);'"];
+
+  /* On "free".  */
+  start -> freed [label="on 'free(X);'"];
+  unchecked -> freed [label="on 'free(X);'"];
+  nonnull -> freed [label="on 'free(X);'"];
+  freed -> stop [label="on 'free(X);':\n Warn('double-free')"];
+  non_heap -> stop  [label="on 'free(X);':\n Warn('free of non-heap')"];
+
+  /* Handle "__attribute__((nonnull))".   */
+  unchecked -> nonnull [label="on 'FN(X)' with 
__attribute__((nonnull)):\nWarn('possible NULL arg')"];
+  null -> stop [label="on 'FN(X)' with __attribute__((nonnull)):\nWarn('NULL 
arg')"];
+
+  /* is_zero_assignment.  */
+  start -> null [label="on 'X = 0;'"];
+  unchecked -> null [label="on 'X = 0;'"];
+  nonnull 

Re: [PATCH] OpenACC reference count overhaul

2019-11-22 Thread Julian Brown
On Sat, 9 Nov 2019 01:28:51 +
Julian Brown  wrote:

> On Thu, 31 Oct 2019 19:11:57 +0100
> Thomas Schwinge  wrote:
> 
> > So that's not related to reference counting, needs to be discussed
> > separately.
> > 
> > ..., and while I do agree that the current code is a bit "strange"
> > (returning 'tgt->to_free'), I couldn't quickly find or come up with
> > a test cases where this would actually do the wrong thing.  After
> > all, this is the code path taken for "not present", and 'tgt' is
> > built anew for one single mapping, with no alignment set (which
> > would cause 'to_free' to differ from 'tgt_start'); 'tgt_offset'
> > should always be zero, and 'h' always the same as 'host_start'.
> > What am I missing? That is, given the current set of libgomp test
> > cases, the attached never triggeres.  
> 
> The code can't stay exactly as it is with this patch, because the tgt
> return value from gomp_map_vars_async with
> GOMP_MAP_VARS_OPENACC_ENTER_DATA is a null pointer.
> 
> So, the device pointer calculation needed to be re-done -- although
> it's not quite a bug fix, as you point out, and some of the offsets
> will always be zero or cancel out in practice.
> 
> *However*, it looks like the device pointer calculation for the
> "present" case is wrong in the preceding code. I've addressed that in
> the patch posted here:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00661.html
> 
> The patch attached here applies on top of that one, and attempts to
> keep the device pointer calculation "the same" for the non-present
> case, modulo an extra lookup_host -- and also adds some assertions to
> make sure the assumptions about zero/cancelled-out offsets stay true.

Here's another iteration that applies over the version of the
present/subarray patch committed, and also addresses the use of
REFCOUNT_INFINITY on target blocks as queried in the following message:

https://gcc.gnu.org/ml/gcc-patches/2019-11/msg01146.html

Most uses of REFCOUNT_INFINITY indeed appear to be unreachable (as in,
a target_mem_desc with refcount == REFCOUNT_INFINITY will most of the
time be linked from a splay tree key with refcount ==
REFCOUNT_INFINITY, and the code to decrement the former's refcount
and/or free the block will never be called).

I found one case (for OpenACC) where a runtime check/error can be
added -- attempting to free a mapped target block corresponding to a
device_resident global variable using an API routine. I don't think
there's a code path using directives (for either OpenACC or OpenMP) that
exhibits any problematic behaviour in that regard. I've added a couple
of test cases, and a couple of assertions.

OK now? (Or perhaps the REFCOUNT_INFINITY bits want splitting out? It
all still arguably comes under the "refcount overhaul" umbrella!).

Thanks,

Julian
commit d027f9ab63fdf871076f40e2cf42f672ed3f0e69
Author: Julian Brown 
Date:   Mon Nov 5 15:51:46 2018 -0800

OpenACC reference count overhaul

2019-11-22  Julian Brown  
Thomas Schwinge  

libgomp/
* libgomp.h (struct splay_tree_key_s): Substitute dynamic_refcount
field for virtual_refcount.
(struct acc_dispatch_t): Remove data_environ field.
(struct gomp_device_descr): Update comment on openacc field.
(enum gomp_map_vars_kind): Add GOMP_MAP_VARS_OPENACC_ENTER_DATA.
(gomp_acc_insert_pointer, gomp_acc_remove_pointer, gomp_free_memmap):
Remove prototypes.
(gomp_remove_var_async): Add prototype.
* oacc-host.c (host_dispatch): Don't initialise removed data_environ
field.
* oacc-init.c (acc_shutdown_1): Iteratively call gomp_remove_var
instead of calling gomp_free_memmap.
* oacc-mem.c (lookup_dev_1): New function.
(lookup_dev): Reimplement using above.
(acc_free, acc_hostptr): Update calls to lookup_dev.
(acc_map_data): Likewise.  Don't add to data_environ list.
(acc_unmap_data): Remove call to gomp_unmap_vars.  Fix semantics to
remove mapping, but not mapped data.  Handle REFCOUNT_INFINITY on
target blocks.
(present_create_copy): Use virtual_refcount instead of
dynamic_refcount.  Don't manipulate data_environ.  Re-do lookup for
target pointer return value.
(delete_copyout): Update for virtual_refcount semantics.  Use
goacc_remove_var_async for asynchronous delete/copyouts.
(gomp_acc_insert_pointer, gomp_acc_remove_pointer): Remove functions.
* oacc-parallel.c (find_pointer): Remove function.
(find_group_last, goacc_enter_data_internal,
goacc_exit_data_internal): New functions.
(GOACC_enter_exit_data): Use goacc_enter_data_internal and
goacc_exit_data_internal helper functions.
* target.c (gomp_map_vars_internal): Handle

[analyzer] Add -fno-semantic-interposition to PLUGIN_CFLAGS

2019-11-22 Thread David Malcolm
An annoying wart with debugging the plugin is that attempts to set
a breakpoint on a function set two breakpoints: on the "funcname@plt",
and on the function itself, making stepping through the code more
awkward than it ought to be.

This patch detects for -fno-semantic-interposition at configure
time, and adds it to PLUGIN_CFLAGS if available, eliminating the
"funcname@plt" entries.

Presumably this also speeds up in-tree plugin code.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.

Pushed to branch "dmalcolm/analyzer" on the GCC git mirror,
along with the autogenerated changes to gcc/configure as a followup

gcc/ChangeLog:
* Makefile.in (SEM_INTERPOS_FLAGS): New.
(PLUGIN_CFLAGS): Flesh out comment.  Add SEM_INTERPOS_FLAGS.
* configure.ac (sem_interpos_flags): Add this, testing for
-fno-semantic-interposition.
---
 gcc/Makefile.in  | 16 ++--
 gcc/configure.ac | 12 
 2 files changed, 26 insertions(+), 2 deletions(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0587447..224bd75 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -188,6 +188,8 @@ NOEXCEPTION_FLAGS = @noexception_flags@
 
 ALIASING_FLAGS = @aliasing_flags@
 
+SEM_INTERPOS_FLAGS = @sem_interpos_flags@
+
 # This is set by --disable-maintainer-mode (default) to "#"
 # FIXME: 'MAINT' will always be set to an empty string, no matter if
 # --disable-maintainer-mode is used or not.  This is because the
@@ -1786,8 +1788,18 @@ ifneq ($(PLUGIN_MAKEFRAGS),)
 include $(PLUGIN_MAKEFRAGS)
 endif
 
-# Add PLUGIN_CFLAGS to objects that belong to in-tree plugins
-PLUGIN_CFLAGS = -fPIC
+# Define PLUGIN_CFLAGS and add it to objects that belong to in-tree plugins
+#
+# .o files for plugins need to built as position-independent code, hence -fPIC
+#
+# We use -fno-semantic-interposition if available (via SEM_INTERPOS_FLAGS)
+# since it fixes an annoying wart when debugging the plugin, where every
+# attempt to set a breakpoint on a function sets two breakpoints: on the
+# "funcname@plt", and on the function itself, making stepping through the
+# code more awkward than it ought to be.  Presumably this also leads
+# to faster plugin code.
+
+PLUGIN_CFLAGS = -fPIC $(SEM_INTERPOS_FLAGS)
 $(foreach file,$(ALL_HOST_PLUGIN_OBJS),$(eval CFLAGS-$(file) += 
$(PLUGIN_CFLAGS)))
 
 #
diff --git a/gcc/configure.ac b/gcc/configure.ac
index ff6023e..eb30857 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -461,7 +461,19 @@ if test "$GCC" = yes; then
 fi
 AC_SUBST(aliasing_flags)
 
+sem_interpos_flags=
+if test "$GCC" = yes; then
+  saved_CXXFLAGS="$CXXFLAGS"
+
+  CXXFLAGS="$CXXFLAGS -fno-semantic-interposition"
+  AC_MSG_CHECKING(whether the compiler supports -fno-semantic-interposition)
+  AC_COMPILE_IFELSE([AC_LANG_SOURCE([])],
+[AC_MSG_RESULT([yes]); sem_interpos_flags='-fno-semantic-interposition'],
+[AC_MSG_RESULT([no])])
 
+  CXXFLAGS="$saved_CXXFLAGS"
+fi
+AC_SUBST(sem_interpos_flags)
 
 # -
 # Warnings and checking
-- 
1.8.5.3



Re: C++ PATCH for c++/88337 - P1327R1: Allow polymorphic typeid in constexpr

2019-11-22 Thread Marek Polacek
On Fri, Nov 22, 2019 at 04:14:06PM -0500, Jason Merrill wrote:
> On 11/11/19 7:54 PM, Marek Polacek wrote:
> > Part of P1327R1 is to allow typeid with an operand of polymorphic type in
> > constexpr.  I found that we pretty much support it already, the only tweak
> > was to allow TYPEID_EXPR (only created in a template) in constexpr in C++20.
> > 
> > I also noticed this in build_typeid:
> >/* FIXME when integrating with c_fully_fold, mark
> >   resolves_to_fixed_type_p case as a non-constant expression.  */
> >if (TYPE_POLYMORPHIC_P (TREE_TYPE (exp))
> >&& ! resolves_to_fixed_type_p (exp, )
> >&& ! nonnull)
> > but I'm not quite sure what to do with it.
> 
> Remove it, I think.

Ack.

> > Bootstrapped/regtested on x86_64-linux, ok for trunk?
> > 
> > 2019-11-11  Marek Polacek  
> > 
> > PR c++/88337 - P1327R1: Allow polymorphic typeid in constexpr.
> > * constexpr.c (potential_constant_expression_1): Allow a typeid
> > expression whose operand is of polymorphic type in constexpr in
> > C++20.
> > 
> > * g++.dg/cpp2a/constexpr-typeid1.C: New test.
> > * g++.dg/cpp2a/constexpr-typeid2.C: New test.
> > * g++.dg/cpp2a/constexpr-typeid3.C: New test.
> > * g++.dg/cpp2a/constexpr-typeid4.C: New test.
> > 
> > diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
> > index 20fddc57825..430c65694b7 100644
> > --- gcc/cp/constexpr.c
> > +++ gcc/cp/constexpr.c
> > @@ -7018,11 +7018,13 @@ potential_constant_expression_1 (tree t, bool 
> > want_rval, bool strict, bool now,
> > return false;
> >   case TYPEID_EXPR:
> > -  /* -- a typeid expression whose operand is of polymorphic
> > -class type;  */
> > +  /* In C++20, a typeid expression whose operand is of polymorphic
> > +class type can be constexpr.  */
> > {
> >   tree e = TREE_OPERAND (t, 0);
> > -if (!TYPE_P (e) && !type_dependent_expression_p (e)
> > +   if (cxx_dialect < cxx2a
> 
> Do we want to allow this before C++20 if !strict?  OK either way.

Since we allow non-template constexpr typeid use, we might as well.

I'll commit the following after the usual testing.

2019-11-22  Marek Polacek  

PR c++/88337 - P1327R1: Allow polymorphic typeid in constexpr.
* constexpr.c (potential_constant_expression_1): Allow a typeid
expression whose operand is of polymorphic type in constexpr in
C++20.
* rtti.c (build_typeid): Remove obsolete FIXME comment.

* g++.dg/cpp2a/constexpr-typeid1.C: New test.
* g++.dg/cpp2a/constexpr-typeid2.C: New test.
* g++.dg/cpp2a/constexpr-typeid3.C: New test.
* g++.dg/cpp2a/constexpr-typeid4.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 9ce768bb2e6..658455cce96 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -7021,11 +7021,14 @@ potential_constant_expression_1 (tree t, bool 
want_rval, bool strict, bool now,
   return false;
 
 case TYPEID_EXPR:
-  /* -- a typeid expression whose operand is of polymorphic
-class type;  */
+  /* In C++20, a typeid expression whose operand is of polymorphic
+class type can be constexpr.  */
   {
 tree e = TREE_OPERAND (t, 0);
-if (!TYPE_P (e) && !type_dependent_expression_p (e)
+   if (cxx_dialect < cxx2a
+   && strict
+   && !TYPE_P (e)
+   && !type_dependent_expression_p (e)
&& TYPE_POLYMORPHIC_P (TREE_TYPE (e)))
   {
 if (flags & tf_error)
diff --git gcc/cp/rtti.c gcc/cp/rtti.c
index d987f8b4d83..da685961c70 100644
--- gcc/cp/rtti.c
+++ gcc/cp/rtti.c
@@ -353,8 +353,6 @@ build_typeid (tree exp, tsubst_flags_t complain)
   if (processing_template_decl)
 return build_min (TYPEID_EXPR, const_type_info_type_node, exp);
 
-  /* FIXME when integrating with c_fully_fold, mark
- resolves_to_fixed_type_p case as a non-constant expression.  */
   if (TYPE_POLYMORPHIC_P (TREE_TYPE (exp))
   && ! resolves_to_fixed_type_p (exp, )
   && ! nonnull)
diff --git gcc/testsuite/g++.dg/cpp2a/constexpr-typeid1.C 
gcc/testsuite/g++.dg/cpp2a/constexpr-typeid1.C
new file mode 100644
index 000..a81f649b44b
--- /dev/null
+++ gcc/testsuite/g++.dg/cpp2a/constexpr-typeid1.C
@@ -0,0 +1,47 @@
+// PR c++/88337 - Implement P1327R1: Allow dynamic_cast/typeid in constexpr.
+// { dg-do compile { target c++2a } }
+// Test non-polymorphic type.
+
+#include 
+
+struct B {
+  const std::type_info  = typeid (*this);
+};
+
+struct D : B { };
+
+constexpr B b;
+constexpr D d;
+
+static_assert ( ==  (B));
+static_assert ({}.ti ==  (B));
+static_assert (().ti ==  (B));
+static_assert ( ((B())) ==  (B));
+static_assert ( ((B{})) ==  (B));
+
+static_assert ( ==  (B));
+static_assert ({}.ti ==  (B));
+static_assert (().ti ==  (B));
+static_assert ( ((D())) ==  (D));
+static_assert ( ((D{})) ==  (D));
+
+extern D ed;
+static_assert ( (ed) ==  (D));
+
+constexpr const B  = d;
+static_assert ( (r) ==  

[wwwdocs] Update C++ status with Belfast motions

2019-11-22 Thread Marek Polacek
Committed to git.  Should s/http/https/ the wg21 links.

Jason, do we support P1907R1?

commit d59a823fb4ad2daa535d26f592274ec68b9cb4a1
Author: Marek Polacek 
Date:   Fri Nov 22 16:49:27 2019 -0500

Update with Belfast motions.

diff --git a/htdocs/projects/cxx-status.html b/htdocs/projects/cxx-status.html
index c240d6da..380a7dfa 100644
--- a/htdocs/projects/cxx-status.html
+++ b/htdocs/projects/cxx-status.html
@@ -116,7 +116,11 @@
http://wg21.link/p1141r2;>P1141R2
http://wg21.link/p0848r3;>P0848R3
http://wg21.link/p1616r1;>P1616R1
-   http://wg21.link/p1452r2;>P1452R2
+   http://wg21.link/p1452r2;>P1452R2
+   
+   https://wg21.link/p1972r0;>P1972R0
+   
+   https://wg21.link/p1980r0;>P1980R0
10 
__cpp_concepts = ? 
 
@@ -157,7 +161,10 @@
http://wg21.link/p1120r0;>P1120R0
http://wg21.link/p1185r2;>P1185R2
http://wg21.link/p1186r3;>P1186R3
-   http://wg21.link/p1630r1;>P1630R1
+   http://wg21.link/p1630r1;>P1630R1
+   
+   https://wg21.link/p1946r0;>P1946R0
+   
   10
__cpp_impl_three_way_comparison = 201711 
 
@@ -229,7 +236,9 @@
 
 
Class Types in Non-Type Template Parameters 
-  http://wg21.link/p0732r2;>P0732R2
+  http://wg21.link/p0732r2;>P0732R2
+   
+   https://wg21.link/p1907r1;>P1907R1
9 
__cpp_nontype_template_parameter_class = 201806 
 
@@ -332,10 +341,10 @@

 
 
-   Modules 
+   Modules 
   http://wg21.link/p1103r3;>P1103R3
-  No (Modules Wiki) 
-   
+  No (Modules Wiki) 
+   
 
 
   http://wg21.link/p1766r1;>P1766R1
@@ -346,6 +355,14 @@
 
   http://wg21.link/p1703r1;>P1703R1
 
+
+  
+  https://wg21.link/p1874r1;>P1874R1
+
+
+  
+  https://wg21.link/p1979r0;>P1979R0
+
 
Coroutines 
   http://wg21.link/p0912r5;>P0912R5
@@ -354,7 +371,9 @@
 
 
Parenthesized initialization of aggregates 
-  http://wg21.link/p0960r3;>P0960R3
+  http://wg21.link/p0960r3;>P0960R3
+  
+  https://wg21.link/p1975r0;>P1975R0
   No (https://gcc.gnu.org/PR91363;>PR91363)
__cpp_aggregate_paren_init = 201902
 



Re: [C/C++ PATCH] Fix up build of GCC 4.6 and earlier with GCC 9+ (PR c/90677, take 2)

2019-11-22 Thread Joseph Myers
On Fri, 22 Nov 2019, Jason Merrill wrote:

> On 11/22/19 2:10 PM, Jakub Jelinek wrote:
> > On Wed, Nov 20, 2019 at 02:01:58PM -0500, Jason Merrill wrote:
> > > I would think that get_named_type should find struct or enum names that
> > > have
> > > been hidden by another declaration; that would fix this without
> > > special-casing cgraph_node.  For the C++ front-end, that would be
> > > 
> > > lookup_qualified_name (global_namespace, id, /*prefer_type*/2,
> > > /*complain*/false)
> > 
> > Ok, this patch does that, for C by going from I_TAG_BINDING to the global
> > level.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> OK on Monday unless a C maintainer has a comment.

I'm fine with this patch.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [committed] [testsuite] Fix fp-int-convert-timode-1.c testism.

2019-11-22 Thread Joseph Myers
On Fri, 22 Nov 2019, Tamar Christina wrote:

> Hi Joseph,
> 
> > > Or do you want me to send them separately?
> > 
> > I think it's best to fix the test now not to have the #ifdef, then if you 
> > have execution failures those can be addressed separately.  (If you want 
> > to avoid the test FAILing before then, an XFAIL with a comment referencing 
> > an open bug in Bugzilla would be appropriate, not #ifdef that makes the 
> > test spuriously PASS.)
> > 
> 
> Fair enough, found the issue and it wasn't with the test. I've attached the
> new patch.
> 
> Ok for trunk?

OK.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: C++ PATCH for c++/88337 - P1327R1: Allow polymorphic typeid in constexpr

2019-11-22 Thread Jason Merrill

On 11/11/19 7:54 PM, Marek Polacek wrote:

Part of P1327R1 is to allow typeid with an operand of polymorphic type in
constexpr.  I found that we pretty much support it already, the only tweak
was to allow TYPEID_EXPR (only created in a template) in constexpr in C++20.

I also noticed this in build_typeid:
   /* FIXME when integrating with c_fully_fold, mark
  resolves_to_fixed_type_p case as a non-constant expression.  */
   if (TYPE_POLYMORPHIC_P (TREE_TYPE (exp))
   && ! resolves_to_fixed_type_p (exp, )
   && ! nonnull)
but I'm not quite sure what to do with it.


Remove it, I think.


Bootstrapped/regtested on x86_64-linux, ok for trunk?

2019-11-11  Marek Polacek  

PR c++/88337 - P1327R1: Allow polymorphic typeid in constexpr.
* constexpr.c (potential_constant_expression_1): Allow a typeid
expression whose operand is of polymorphic type in constexpr in
C++20.

* g++.dg/cpp2a/constexpr-typeid1.C: New test.
* g++.dg/cpp2a/constexpr-typeid2.C: New test.
* g++.dg/cpp2a/constexpr-typeid3.C: New test.
* g++.dg/cpp2a/constexpr-typeid4.C: New test.

diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 20fddc57825..430c65694b7 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -7018,11 +7018,13 @@ potential_constant_expression_1 (tree t, bool 
want_rval, bool strict, bool now,
return false;
  
  case TYPEID_EXPR:

-  /* -- a typeid expression whose operand is of polymorphic
-class type;  */
+  /* In C++20, a typeid expression whose operand is of polymorphic
+class type can be constexpr.  */
{
  tree e = TREE_OPERAND (t, 0);
-if (!TYPE_P (e) && !type_dependent_expression_p (e)
+   if (cxx_dialect < cxx2a


Do we want to allow this before C++20 if !strict?  OK either way.

Jason



Re: C++ PATCH for c++/88337 - Implement P1327R1: Allow dynamic_cast in constexpr

2019-11-22 Thread Jason Merrill

On 11/8/19 4:24 PM, Marek Polacek wrote:

After much weeping and gnashing of teeth, here's a patch to handle dynamic_cast
in constexpr evaluation.  While the change in the standard is trivial (see
), the
change in the compiler is less so.

When build_dynamic_cast realizes that a dynamic_cast needs a run-time check, it
generates a call to __dynamic_cast -- see dyncast.cc in libsupc++ for its
definition.  The gist of my approach is to evaluate such a call at compile time.

This should be easy in theory: let the constexpr machinery find out the dynamic
type and then handle a sidecast and upcast.  That's ultimately what the patch
is trying to do but there was a number of hindrances.

1) We can't use __dynamic_cast's type_info parameters, this type is not a
literal class.  But that means we have no idea what we're converting to!


get_tinfo_decl sets the TREE_TYPE of the DECL_NAME of the tinfo decl to 
the relevant type, can't you use that?



2) [class.cdtor] says that when a dynamic_cast is used in a constructor or
destructor and the operand of the dynamic_cast refers to the object under
construction or destruction, this object is considered to be a most derived
object.


This means that during the 'tor the vtable pointer refers to the 
type_info for that class and the offset-to-top is 0.  Can you use that?



This was tricky, and the only thing that seemed to work was to add
a new member to constexpr_global_ctx.  I was happy to find out that I could
use new_obj I'd added recently.  Note that destruction is *not* handled at
all and in fact I couldn't even construct a testcase where that would make
a difference.



3) We can't rely on the hint __dynamic_cast gave us; the comment in
cxx_eval_dynamic_cast_fn explains why the accessible_base_p checks were
necessary.

There are many various scanarios regarding inheritance so special care was
devoted to test as much as possible, but testing the "dynamic_cast in
a constructor" could be expanded.

This patch doesn't handle polymorphic typeid yet.  I think it will be easier
to review to separate these two.  Hopefully the typeid part will be much
easier.

Bootstrapped/regtested on x86_64-linux.

2019-11-08  Marek Polacek  

PR c++/88337 - Implement P1327R1: Allow dynamic_cast in constexpr.
* call.c (is_base_field_ref): No longer static.
* constexpr.c (struct constexpr_global_ctx): Add ctor_object member
and initialize it.
(cxx_dynamic_cast_fn_p): New function.
(cxx_eval_dynamic_cast_fn): Likewise.
(cxx_eval_call_expression): Call cxx_eval_dynamic_cast_fn for a call
to __dynamic_cast.  Save the object a constexpr constructor is
constructing.
(cxx_eval_constant_expression) : Save the target
type of a call to __dynamic_cast.
(potential_constant_expression_1): Don't give up on
cxx_dynamic_cast_fn_p.
* cp-tree.h (is_base_field_ref): Declare.
* parser.c (cp_parser_postfix_expression): Set location of expression.
* rtti.c (build_dynamic_cast_1): When creating a call to
__dynamic_cast, use the location of the original expression.

* g++.dg/cpp2a/constexpr-dynamic1.C: New test.
* g++.dg/cpp2a/constexpr-dynamic10.C: New test.
* g++.dg/cpp2a/constexpr-dynamic11.C: New test.
* g++.dg/cpp2a/constexpr-dynamic12.C: New test.
* g++.dg/cpp2a/constexpr-dynamic13.C: New test.
* g++.dg/cpp2a/constexpr-dynamic14.C: New test.
* g++.dg/cpp2a/constexpr-dynamic2.C: New test.
* g++.dg/cpp2a/constexpr-dynamic3.C: New test.
* g++.dg/cpp2a/constexpr-dynamic4.C: New test.
* g++.dg/cpp2a/constexpr-dynamic5.C: New test.
* g++.dg/cpp2a/constexpr-dynamic6.C: New test.
* g++.dg/cpp2a/constexpr-dynamic7.C: New test.
* g++.dg/cpp2a/constexpr-dynamic8.C: New test.
* g++.dg/cpp2a/constexpr-dynamic9.C: New test.

diff --git gcc/cp/call.c gcc/cp/call.c
index 0034c1cee0d..5de2aca1358 100644
--- gcc/cp/call.c
+++ gcc/cp/call.c
@@ -8193,7 +8193,7 @@ call_copy_ctor (tree a, tsubst_flags_t complain)
  
  /* Return true iff T refers to a base field.  */
  
-static bool

+bool
  is_base_field_ref (tree t)
  {
STRIP_NOPS (t);
diff --git gcc/cp/constexpr.c gcc/cp/constexpr.c
index 20fddc57825..ef7706347bc 100644
--- gcc/cp/constexpr.c
+++ gcc/cp/constexpr.c
@@ -1025,8 +1025,11 @@ struct constexpr_global_ctx {
/* Heap VAR_DECLs created during the evaluation of the outermost constant
   expression.  */
auto_vec heap_vars;
+  /* For a constructor, this is the object we're constructing.  */
+  tree ctor_object;
/* Constructor.  */
-  constexpr_global_ctx () : constexpr_ops_count (0) {}
+  constexpr_global_ctx () : constexpr_ops_count (0), ctor_object (NULL_TREE)
+{}
  };
  
  /* The constexpr expansion context.  CALL is the current function

@@ -1663,6 +1666,244 @@ 

[Darwin, X86, testsuite] Update tests for common section use.

2019-11-22 Thread Iain Sandoe
The tests amended here now have different code-gen with default
options because, previously, the access were indirected per Darwin
ABI for common accesses.  The revised code-gen does not match the
expected scan-asms because Darwin defaults to fPIC.  For these tests,
it seems that the best solution is to use '-mdynamic-no-pic' in the
m32 case which makes the output similar to the ElF platform default.

tested on x86_64-dawin16, x86_64-linux-gnu
applied to mainline
thanks
Iain

gcc/testsuite/ChangeLog:

2019-11-22  Iain Sandoe  

* gcc.target/i386/pr27971.c: Use mdynamic-no-pic for m32 on
Darwin.
* gcc.target/i386/sse2-load-multi.c: Likewise.
* gcc.target/i386/sse2-store-multi.c: Likewise.

diff --git a/gcc/testsuite/gcc.target/i386/pr27971.c 
b/gcc/testsuite/gcc.target/i386/pr27971.c
index 149bf2b..f80cb65 100644
--- a/gcc/testsuite/gcc.target/i386/pr27971.c
+++ b/gcc/testsuite/gcc.target/i386/pr27971.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -mno-tbm" } */
+/* { dg-additional-options "-mdynamic-no-pic" { target { *-*-darwin* && ia32 } 
} } */
 
 unsigned array[4];
 
@@ -16,3 +17,4 @@ unsigned foo(TYPE x)
 
 /* { dg-final { scan-assembler-not "shr\[^\\n\]*2" } } */
 /* { dg-final { scan-assembler "and\[^\\n\]*12" } } */
+ 
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/i386/sse2-load-multi.c 
b/gcc/testsuite/gcc.target/i386/sse2-load-multi.c
index 9276054..3ee0ef8 100644
--- a/gcc/testsuite/gcc.target/i386/sse2-load-multi.c
+++ b/gcc/testsuite/gcc.target/i386/sse2-load-multi.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=corei7 -O2" } */
+/* { dg-additional-options "-mdynamic-no-pic" { target { *-*-darwin* && ia32 } 
} } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/i386/sse2-store-multi.c 
b/gcc/testsuite/gcc.target/i386/sse2-store-multi.c
index 203a00f..ca04934 100644
--- a/gcc/testsuite/gcc.target/i386/sse2-store-multi.c
+++ b/gcc/testsuite/gcc.target/i386/sse2-store-multi.c
@@ -1,5 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-march=corei7 -O2" } */
+/* { dg-additional-options "-mdynamic-no-pic" { target { *-*-darwin* && ia32 } 
} } */
 
 #include 
 



Re: [PATCH] Fix PR92462

2019-11-22 Thread Jeff Law
On 11/21/19 1:00 AM, Richard Biener wrote:
> On Wed, 20 Nov 2019, Jeff Law wrote:
> 
> Sure, but we're not being called with this but with RTL
> expanded via cselib usually:
> 
> (plus:SI (plus:SI (and:SI (symbol_ref:SI ("*.LANCHOR0") [flags 0x182])
> (const_int 3 [0x3]))
> (value:SI 8:8 @0x3089c18/0x30f5e40))
> (const_int -8 [0xfff8]))
Ugh.  I'm suspect all the ANCHOR stuff came in well after the original
code to try and optimize the low-order bit-logical masking for addressing.

Without knowing something more, I'd claim that can't actually be a pointer.

> 
> so there are no REGs involved on the "subexpressions" we could
> check REG_POINTER on ... and in fact the symbol_ref _is_
> a pointer!  It's just not something we should consider the
> base of the memory reference.  We're talking about sth like
> 
> *( + ((uintptr_t) - (uintptr_t)))
> 
> where it's nearly impossible to tell (on RTL, if the address
> parts are not split to separate insns and thus REG_POINTER is
> applicable) what object we are looking at for aliasing purposes.
Right.  And in that case we have to make the most conservative
assumptions.


> 
> The appropriate way here is to not to too clever things with
> the address but instead rely on MEM_EXPR here - unfortunately
> that is non-existent for spill slots which is where most of
> the regressions with removing the code appear.
That's what doesn't make much sense to me.  We ought to be able to
analyze stack slots reasonably well.  But the RTL DSE bits have changed
a lot since Christian's  implementation  -- it actually used to be easy
to follow.


jeff



Re: [PATCH] Fix attribute((section)) for templates

2019-11-22 Thread Strager Neds
Here's a revised version of the patch. This revised version is ready for review.

When GCC encounters __attribute__((section("foo"))) on a function or
variable declaration, it adds an entry in the symbol table for the
declaration to remember its desired section. The symbol table is
separate from the declaration's tree node.

When instantiating a template, GCC copies the tree of the template
recursively. GCC does *not* copy symbol table entries when copying
function and variable declarations.

Combined, these two details mean that section attributes on function and
variable declarations in a template have no effect.

Fix this issue by copying the section name (in the symbol table) when
copying a tree node for template instantiation. This addresses PR
c++/70435 and PR c++/88061.

Originally, I tried copying section names in copy_node. This caused
problems for reasons I do not understand. This patch in this email
avoids those problems by copying section names only in the callers of
copy_node relevant to template instantation.

Known unknowns (questions for the audience):

* For all targets which support the section attribute, are functions and
  variables deduplicated (comdat) when using a custom section? It seems
  to work with GNU ELF on Linux and with Mach-O on macOS (i.e. I end up
  with only one copy), but I'm unsure about other platforms. Richard
  Biener raised this concern in PR c++/88061. Is this something I should
  worry much about?
* Do we need to check or copy implicit_section or alias? I don't know
  what these properties mean, but they look related to section_name.

Note: This patch depends on the following unmerged patches (but could be
changed to not depend on them):

* Simplify testing symbol sections:
  https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02062.html
* Fix attribute((section)) with -flto:
  https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02063.html
* Refactor copying decl section names:
  https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00979.html

Testing:

* Bootstrap on x86_64-linux-gnu with --disable-multilib
  --enable-checking=release --enable-languages=c,c++. Observe no change
  in test results (aside from the added tests).
* Bootstrap on macOS x86_64-apple-darwin16.7.0 with --disable-multilib
  --enable-checking=release --enable-languages=c,c++. Observe no change
  in test results (aside from the added tests).

2019-11-20  Matthew Glazar 

* gcc/cp/pt.c (tsubst_function_decl): Copy the section name from the
original function.
(tsubst_decl): Copy the section name from the original variable (if the
variable is global).
---
 gcc/cp/pt.c   |  5 +++
 ...section-class-template-function-template.C | 25 +++
 ...ass-template-specialized-static-variable.C | 29 +
 ...template-static-inline-variable-template.C | 19 
 ...on-class-template-static-inline-variable.C | 19 
 .../section-class-template-static-variable.C  | 20 +
 ...on-function-template-static-variable-lto.C | 19 
 ...ection-function-template-static-variable.C | 26 +++
 .../g++.dg/ext/section-function-template.C| 20 +
 .../ext/section-multi-tu-template-main.C  | 43 +++
 .../ext/section-multi-tu-template-other.C | 24 +++
 .../g++.dg/ext/section-multi-tu-template.h| 33 ++
 .../g++.dg/ext/section-variable-template.C| 16 +++
 13 files changed, 298 insertions(+)
 create mode 100644
gcc/testsuite/g++.dg/ext/section-class-template-function-template.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-class-template-specialized-static-variable.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-class-template-static-inline-variable-template.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-class-template-static-inline-variable.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-class-template-static-variable.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-function-template-static-variable-lto.C
 create mode 100644
gcc/testsuite/g++.dg/ext/section-function-template-static-variable.C
 create mode 100644 gcc/testsuite/g++.dg/ext/section-function-template.C
 create mode 100644 gcc/testsuite/g++.dg/ext/section-multi-tu-template-main.C
 create mode 100644 gcc/testsuite/g++.dg/ext/section-multi-tu-template-other.C
 create mode 100644 gcc/testsuite/g++.dg/ext/section-multi-tu-template.h
 create mode 100644 gcc/testsuite/g++.dg/ext/section-variable-template.C

diff --git gcc/cp/pt.c gcc/cp/pt.c
index 8bacb3952ff..2593cf67a20 100644
--- gcc/cp/pt.c
+++ gcc/cp/pt.c
@@ -42,6 +42,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimplify.h"
 #include "gcc-rich-location.h"
 #include "selftest.h"
+#include "cgraph.h"

 /* The type of functions taking a tree, and some additional data, and
returning an int.  */
@@ -13526,6 +13527,7 @@ tsubst_function_decl (tree t, tree args,
tsubst_flags_t complain,
   if (!DECL_DELETED_FN (r))
 DECL_INITIAL (r) = 

Re: [C++ PATCH] Fix concepts vs. PCH (PR c++/92458, take 2)

2019-11-22 Thread Jakub Jelinek
On Fri, Nov 22, 2019 at 02:43:25PM -0500, Jason Merrill wrote:
> > @@ -5358,6 +5410,18 @@ struct tree_cache_traits
> > : simple_cache_map_traits, tree> { };
> 
> We should probably use tree_hash instead of default_hash_traits here. OK
> with or without that change.

Dunno, I'd say pointer_hash ::hash which is
  return (hashval_t) ((intptr_t)candidate >> 3);
might be actually better hash function than
#define TREE_HASH(NODE) ((size_t) (NODE) & 077)
at least on 64-bit hosts, because union tree_node is 8-byte aligned there,
so the low 3 bits are all zero and we use all of the 32 bits above that.
For 32-bit hosts, it might be better to just use candidate without shifting
I guess, there aren't any extra bits we get in from above.
TREE_HASH to me unnecessarily uses the low 3 bits known to be zero and masks
with 0x3, so throws away also 14 bits that could hold useful info,
leaving for 64-bit hosts only 15 bits.

Jakub



Re: [C++20 PATCH] Implement P1920R1, Missing feature-test macros 2017-2019

2019-11-22 Thread Jason Merrill

On 11/16/19 7:38 PM, Jakub Jelinek wrote:

Hi!

This implements the core part of P1920R1, tested on x86_64-linux,
ok for trunk if it passes full bootstrap/regtest?

Jon, are you going to handle the libstdc++ side of this, assuming
there is something still not done where we have the corresponding
features implemented already?


OK.


2019-11-16  Jakub Jelinek  

Implement P1920R1, Missing feature-test macros 2017-2019.
* c-cppbuiltin.c (c_cpp_builtins): Bump __cpp_init_captures
and __cpp_generic_lambdas for -std=c++2a.  Define
__cpp_designated_initializers, __cpp_constexpr_in_decltype and
__cpp_consteval for -std=c++2a.  Remove a FIXME comment about
__cpp_concepts for -std=c++2a.

* g++.dg/cpp1z/feat-cxx1z.C: Only compile with -std=c++17.
* g++.dg/cpp2a/feat-cxx2a.C: Adjust for P1920R1 changes.
* g++.dg/cpp2a/desig15.C: New test.
* g++.dg/cpp2a/lambda-pack-init3.C: New test.
* g++.dg/cpp2a/lambda-generic6.C: New test.
* g++.dg/cpp2a/consteval15.C: New test.

--- gcc/c-family/c-cppbuiltin.c.jj  2019-11-13 19:13:15.490048963 +0100
+++ gcc/c-family/c-cppbuiltin.c 2019-11-16 18:30:02.338883062 +0100
@@ -952,8 +952,11 @@ c_cpp_builtins (cpp_reader *pfile)
{
  /* Set feature test macros for C++14.  */
  cpp_define (pfile, "__cpp_return_type_deduction=201304L");
- cpp_define (pfile, "__cpp_init_captures=201304L");
- cpp_define (pfile, "__cpp_generic_lambdas=201304L");
+ if (cxx_dialect <= cxx17)
+   {
+ cpp_define (pfile, "__cpp_init_captures=201304L");
+ cpp_define (pfile, "__cpp_generic_lambdas=201304L");
+   }
  if (cxx_dialect <= cxx14)
cpp_define (pfile, "__cpp_constexpr=201304L");
  cpp_define (pfile, "__cpp_decltype_auto=201304L");
@@ -990,7 +993,12 @@ c_cpp_builtins (cpp_reader *pfile)
if (cxx_dialect > cxx17)
{
  /* Set feature test macros for C++2a.  */
+ cpp_define (pfile, "__cpp_init_captures=201803L");
+ cpp_define (pfile, "__cpp_generic_lambdas=201707L");
+ cpp_define (pfile, "__cpp_designated_initializers=201707L");
+ cpp_define (pfile, "__cpp_constexpr_in_decltype=201711L");
  cpp_define (pfile, "__cpp_conditional_explicit=201806L");
+ cpp_define (pfile, "__cpp_consteval=201811L");
  cpp_define (pfile, "__cpp_constinit=201907L");
  cpp_define (pfile, "__cpp_nontype_template_parameter_class=201806L");
  cpp_define (pfile, "__cpp_impl_destroying_delete=201806L");
@@ -1000,7 +1008,6 @@ c_cpp_builtins (cpp_reader *pfile)
if (flag_concepts)
  {
if (cxx_dialect >= cxx2a)
-/* FIXME: Update this to the value required by the IS.  */
  cpp_define (pfile, "__cpp_concepts=201907L");
else
  cpp_define (pfile, "__cpp_concepts=201507L");
--- gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C.jj  2019-10-19 09:22:15.756879262 
+0200
+++ gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C 2019-11-16 18:34:08.045190225 
+0100
@@ -1,5 +1,5 @@
-// { dg-do compile { target c++17 } }
-// { dg-options "-I${srcdir}/g++.dg/cpp1y -I${srcdir}/g++.dg/cpp1y/testinc" }
+// { dg-do compile }
+// { dg-options "-std=c++17 -I${srcdir}/g++.dg/cpp1y 
-I${srcdir}/g++.dg/cpp1y/testinc" }
  
  //  C++98 features:
  
--- gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C.jj	2019-10-19 09:22:16.168872968 +0200

+++ gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C 2019-11-16 18:42:37.817528261 
+0100
@@ -122,14 +122,14 @@
  
  #ifndef __cpp_init_captures

  #  error "__cpp_init_captures"
-#elif __cpp_init_captures != 201304
-#  error "__cpp_init_captures != 201304"
+#elif __cpp_init_captures != 201803
+#  error "__cpp_init_captures != 201803"
  #endif
  
  #ifndef __cpp_generic_lambdas

  #  error "__cpp_generic_lambdas"
-#elif __cpp_generic_lambdas != 201304
-#  error "__cpp_generic_lambdas != 201304"
+#elif __cpp_generic_lambdas != 201707
+#  error "__cpp_generic_lambdas != 201707"
  #endif
  
  #ifndef __cpp_constexpr

@@ -507,3 +507,27 @@
  #elif __cpp_char8_t != 201811
  #  error "__cpp_char8_t != 201811"
  #endif
+
+#ifndef __cpp_designated_initializers
+#  error "__cpp_designated_initializers"
+#elif __cpp_designated_initializers != 201707
+#  error "__cpp_designated_initializers != 201707"
+#endif
+
+#ifndef __cpp_constexpr_in_decltype
+#  error "__cpp_constexpr_in_decltype"
+#elif __cpp_constexpr_in_decltype != 201711
+#  error "__cpp_constexpr_in_decltype != 201711"
+#endif
+
+#ifndef __cpp_consteval
+#  error "__cpp_consteval"
+#elif __cpp_consteval != 201811
+#  error "__cpp_consteval != 201811"
+#endif
+
+#ifndef __cpp_concepts
+#  error "__cpp_concepts"
+#elif __cpp_concepts != 201907
+#  error "__cpp_concepts != 201907"
+#endif
--- gcc/testsuite/g++.dg/cpp2a/desig15.C.jj 2019-11-16 19:07:37.527982693 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/desig15.C2019-11-16 

Re: [C++ Patch] Improve build_new_op_1, cp_build_indirect_ref_1, and cp_build_modify_expr locations

2019-11-22 Thread Jason Merrill

On 11/22/19 10:43 AM, Paolo Carlini wrote:

Hi,

I would say most of the changes are straightforward or mechanical. 
Essentially, for build_new_op_1 and cp_build_modify_expr I'm simply 
consistently using the available location argument; for 
cp_build_indirect_ref_1 I'm adding the parameter but then using it in a 
completely straightforward way. Minor nit: I wondered for a while if 
cp_build_modify_expr should use cp_expr_loc_or_loc more - normally the 
passed loc points to the '=' - but eventually, given the actual texts of 
the messages, I used it only in one place, for "void value not ignored 
as it ought to be" which is mostly about the type of 'rhs'. All the 
other messages in one way or the other talk about both sides (the 
primary clang caret appears to agree).


Tested x86_64-linux.

Thanks, Paolo.

//


OK.

Jason



Re: [C++ PATCH] Fix concepts vs. PCH (PR c++/92458, take 2)

2019-11-22 Thread Jason Merrill

On 11/22/19 2:08 PM, Jakub Jelinek wrote:

On Wed, Nov 20, 2019 at 07:46:13PM -0500, Jason Merrill wrote:

If decl_tree_cache_map will be needed in more than one spot, I'll probably
need to move it to some generic header.


Most of them probably need it, for code that uses the relevant features.
Except debug_type_map, which probably needs to use TYPE_UID.

Or we might make default_hash_traits use DECL_UID for decls and
TYPE_UID for types even if it doesn't do the more complex analysis of
tree_operand_hash.


Finding out what is a type and what is decl (and what to do with rest?)
at runtime would be more costly than necessary.

The following updated patch moves the decl_tree_cache_map to tree.h and
adds there type_tree_cache_map too and uses it in all the tree_cache_map
spots in the C++ FE.
tree-hash-traits.h can't be included from tree.h, because operand_equal_p is
declared elsewhere, so instead this patch moves some of the
tree-hash-traits.h traits from that header to tree.h.  I could move there
just tree_decl_hash if moving the other two doesn't look right to you.

Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
by compiling stdc++.h as PCH with -std=c++2a and using it.  Ok for trunk?



2019-11-22  Jakub Jelinek  

PR c++/92458
* tree-hash-traits.h (tree_decl_hash, tree_ssa_name_hash,
tree_hash): Move to ...
* tree.h (tree_decl_hash, tree_ssa_name_hash, tree_hash): ... here.
(struct decl_tree_cache_traits, struct type_tree_cache_traits): New
types.
(decl_tree_cache_map, tree_tree_cache_map): New typedefs.

* init.c (nsdmi_inst): Change type to
decl_tree_cache_map * from tree_cache_map *.
* constraint.cc (decl_constraints): Likewise.
* decl.c (get_tuple_decomp_init): Likewise.
* pt.c (defarg_inst, explicit_specifier_map): Likewise.
(tsubst_default_argument, store_explicit_specifier): Use
decl_tree_cache_map::create_ggc rather than
tree_cache_map::create_ggc.
* cp-objcp-common.c (debug_type_map): Change type to
type_tree_cache_map * from tree_cache_map *.

* g++.dg/pch/pr92458.C: New test.
* g++.dg/pch/pr92458.Hs: New test.

--- gcc/tree-hash-traits.h.jj   2019-01-01 12:37:16.902979134 +0100
+++ gcc/tree-hash-traits.h  2019-11-22 12:00:01.538725844 +0100
@@ -41,44 +41,4 @@ tree_operand_hash::equal (const value_ty
return operand_equal_p (t1, t2, 0);
  }
  
-/* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID

-   is a better hash than the pointer value and gives a predictable traversal
-   order.  */
-struct tree_decl_hash : ggc_ptr_hash 
-{
-  static inline hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_decl_hash::hash (tree t)
-{
-  return DECL_UID (t);
-}
-
-/* Hash for SSA_NAMEs in the same function.  Pointer equality is enough
-   here, but the SSA_NAME_VERSION is a better hash than the pointer
-   value and gives a predictable traversal order.  */
-struct tree_ssa_name_hash : ggc_ptr_hash 
-{
-  static inline hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_ssa_name_hash::hash (tree t)
-{
-  return SSA_NAME_VERSION (t);
-}
-
-/* Hasher for general trees, based on their TREE_HASH.  */
-struct tree_hash : ggc_ptr_hash 
-{
-  static hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_hash::hash (tree t)
-{
-  return TREE_HASH (t);
-}
-
  #endif
--- gcc/tree.h.jj   2019-11-15 00:37:26.293070143 +0100
+++ gcc/tree.h  2019-11-22 12:00:17.479478830 +0100
@@ -5351,6 +5351,58 @@ struct tree_decl_map_cache_hasher : ggc_
  #define tree_vec_map_hash tree_decl_map_hash
  #define tree_vec_map_marked_p tree_map_base_marked_p
  
+/* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID

+   is a better hash than the pointer value and gives a predictable traversal
+   order.  Additionally it can be used across PCH save/restore.  */
+struct tree_decl_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_decl_hash::hash (tree t)
+{
+  return DECL_UID (t);
+}
+
+/* Similarly for types.  Uses TYPE_UID as hash function.  */
+struct tree_type_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_type_hash::hash (tree t)
+{
+  return TYPE_UID (t);
+}
+
+/* Hash for SSA_NAMEs in the same function.  Pointer equality is enough
+   here, but the SSA_NAME_VERSION is a better hash than the pointer
+   value and gives a predictable traversal order.  */
+struct tree_ssa_name_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_ssa_name_hash::hash (tree t)
+{
+  return SSA_NAME_VERSION (t);
+}
+
+/* Hasher for general trees, based on their TREE_HASH.  */
+struct tree_hash : ggc_ptr_hash 
+{
+  static hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_hash::hash (tree t)
+{
+  return TREE_HASH (t);
+}
+
  /* A hash_map of two trees for use with GTY((cache)).  Garbage 

Re: [C/C++ PATCH] Fix up build of GCC 4.6 and earlier with GCC 9+ (PR c/90677, take 2)

2019-11-22 Thread Jason Merrill

On 11/22/19 2:10 PM, Jakub Jelinek wrote:

On Wed, Nov 20, 2019 at 02:01:58PM -0500, Jason Merrill wrote:

I would think that get_named_type should find struct or enum names that have
been hidden by another declaration; that would fix this without
special-casing cgraph_node.  For the C++ front-end, that would be

lookup_qualified_name (global_namespace, id, /*prefer_type*/2,
/*complain*/false)


Ok, this patch does that, for C by going from I_TAG_BINDING to the global
level.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?


OK on Monday unless a C maintainer has a comment.


2019-11-22  Jakub Jelinek  

PR c/90677
* c-common.h (identifier_global_tag): Declare.
* c-format.c (get_pointer_to_named_type): Renamed to ...
(get_named_type): ... this.  Use identifier_global_tag instead of
identifier_global_value, handle the return value being a TYPE_P.
(init_dynamic_diag_info): Adjust get_pointer_to_named_type callers
to call get_named_type instead.  Formatting fixes.
c/
* c-decl.c (identifier_global_tag): Define.
cp/
* cp-objcp-common.c (identifier_global_tag): Define.
testsuite/
* c-c++-common/pr90677.c: New test.

--- gcc/c-family/c-common.h.jj  2019-11-15 09:28:56.904930082 +0100
+++ gcc/c-family/c-common.h 2019-11-22 12:47:45.261379415 +0100
@@ -811,6 +811,7 @@ extern void c_register_addr_space (const
  extern bool in_late_binary_op;
  extern const char *c_addr_space_name (addr_space_t as);
  extern tree identifier_global_value (tree);
+extern tree identifier_global_tag (tree);
  extern bool names_builtin_p (const char *);
  extern tree c_linkage_bindings (tree);
  extern void record_builtin_type (enum rid, const char *, tree);
--- gcc/c-family/c-format.c.jj  2019-11-19 22:26:45.842300595 +0100
+++ gcc/c-family/c-format.c 2019-11-22 13:12:12.765658215 +0100
@@ -4899,31 +4899,32 @@ init_dynamic_gfc_info (void)
  }
  }
  
-/* Lookup the type named NAME and return a pointer-to-NAME type if found.

-   Otherwise, return void_type_node if NAME has not been used yet, or 
NULL_TREE if
-   NAME is not a type (issuing an error).  */
+/* Lookup the type named NAME and return a NAME type if found.
+   Otherwise, return void_type_node if NAME has not been used yet,
+   or NULL_TREE if NAME is not a type (issuing an error).  */
  
  static tree

-get_pointer_to_named_type (const char *name)
+get_named_type (const char *name)
  {
-  tree result;
-  if ((result = maybe_get_identifier (name)))
+  if (tree result = maybe_get_identifier (name))
  {
-  result = identifier_global_value (result);
+  result = identifier_global_tag (result);
if (result)
{
- if (TREE_CODE (result) != TYPE_DECL)
+ if (TYPE_P (result))
+   ;
+ else if (TREE_CODE (result) == TYPE_DECL)
+   result = TREE_TYPE (result);
+ else
{
  error ("%qs is not defined as a type", name);
  result = NULL_TREE;
}
- else
-   result = TREE_TYPE (result);
}
+  return result;
  }
else
-result = void_type_node;
-  return result;
+return void_type_node;
  }
  
  /* Determine the types of "tree" and "location_t" in the code being

@@ -4953,23 +4954,24 @@ init_dynamic_diag_info (void)
 an extra type level.  */
if ((local_tree_type_node = maybe_get_identifier ("tree")))
{
- local_tree_type_node = identifier_global_value (local_tree_type_node);
+ local_tree_type_node
+   = identifier_global_value (local_tree_type_node);
  if (local_tree_type_node)
{
  if (TREE_CODE (local_tree_type_node) != TYPE_DECL)
{
  error ("% is not defined as a type");
- local_tree_type_node = 0;
+ local_tree_type_node = NULL_TREE;
}
  else if (TREE_CODE (TREE_TYPE (local_tree_type_node))
   != POINTER_TYPE)
{
  error ("% is not defined as a pointer type");
- local_tree_type_node = 0;
+ local_tree_type_node = NULL_TREE;
}
  else
-   local_tree_type_node =
- TREE_TYPE (TREE_TYPE (local_tree_type_node));
+   local_tree_type_node
+ = TREE_TYPE (TREE_TYPE (local_tree_type_node));
}
}
else
@@ -4979,12 +4981,12 @@ init_dynamic_diag_info (void)
/* Similar to the above but for gimple*.  */
if (!local_gimple_ptr_node
|| local_gimple_ptr_node == void_type_node)
-local_gimple_ptr_node = get_pointer_to_named_type ("gimple");
+local_gimple_ptr_node = get_named_type ("gimple");
  
/* Similar to the above but for cgraph_node*.  */

if (!local_cgraph_node_ptr_node
|| local_cgraph_node_ptr_node == void_type_node)
-

Re: [PATCH][_GLIBCXX_DEBUG] Improve valid_range check

2019-11-22 Thread Jonathan Wakely

On 22/11/19 18:38 +0100, François Dumont wrote:

Hi

    I noticed that we are not checking that iterators are not singular 
in valid_range. Moreover __check_singular signature for pointers is 
not intercepting all kind of pointers in terms of qualification.


    I'd like to commit it next week but considering we are in stage 3 
I need proper acceptance.


    * include/debug/functions.h: Remove  include.
    (__check_singular_aux, __check_singular): Move...
    * include/debug/helper_functions.h:
    (__check_singular_aux, __check_singular): ...here.
    (__valid_range_aux): Adapt to use latter.
    * testsuite/25_algorithms/copy/debug/2_neg.cc: New.

Tested under Linux x86_64 normal and debug modes.


OK for trunk, thanks.




Re: [PATCH] Fix type handling in undistribute_bitref_for_vector (PR tree-optimization/92618)

2019-11-22 Thread Richard Biener
On November 22, 2019 3:25:05 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>The undistribute_bitref_for_vector function assumes that BIT_FIELD_REFs
>with vector first argument must have always type of the vector element,
>but that is not the case, BIT_FIELD_REF can extract any type with the
>corresponding size from any other vector.
>
>So, without this patch, when it saw e.g. addition in unsigned long long
>type, even when the vector was 2x long long, it computed the addition
>in 2x long long vector (in theory could run into undefined overflows)
>and fail on IL checking because conversion from long long to unsigned
>long
>long is not useless.  Even worse, for double addition with the vector
>2x
>long long, it would again perform the addition in 2x long long vector
>type,
>which is completely wrong.
>
>The following patch determines the right vector type and adds VCE
>around
>the operands when needed.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. 

Thanks, 
Richard. 

>2019-11-22  Jakub Jelinek  
>
>   PR tree-optimization/92618
>   * tree-ssa-reassoc.c (v_info): Change from auto_vec to a struct
>   containing the auto_vec and a tree.
>   (undistribute_bitref_for_vector): Handle the case when element type
>   of vec is not the same as type of the BIT_FIELD_REF.  Formatting
>   fixes.
>
>   * gcc.c-torture/compile/pr92618.c: New test.
>   * gcc.c-torture/execute/pr92618.c: New test.
>
>--- gcc/tree-ssa-reassoc.c.jj  2019-11-13 10:54:56.038884580 +0100
>+++ gcc/tree-ssa-reassoc.c 2019-11-22 10:57:24.956931307 +0100
>@@ -1775,7 +1775,10 @@ undistribute_ops_list (enum tree_code op
>first: element index for each relevant BIT_FIELD_REF.
>second: the index of vec ops* for each relevant BIT_FIELD_REF.  */
> typedef std::pair v_info_elem;
>-typedef auto_vec v_info;
>+struct v_info {
>+  tree vec_type;
>+  auto_vec vec;
>+};
> typedef v_info *v_info_ptr;
> 
>/* Comparison function for qsort on VECTOR SSA_NAME trees by machine
>mode.  */
>@@ -1840,8 +1843,11 @@ undistribute_bitref_for_vector (enum tre
>   if (ops->length () <= 1)
> return false;
> 
>-  if (opcode != PLUS_EXPR && opcode != MULT_EXPR && opcode !=
>BIT_XOR_EXPR
>-  && opcode != BIT_IOR_EXPR && opcode != BIT_AND_EXPR)
>+  if (opcode != PLUS_EXPR
>+  && opcode != MULT_EXPR
>+  && opcode != BIT_XOR_EXPR
>+  && opcode != BIT_IOR_EXPR
>+  && opcode != BIT_AND_EXPR)
> return false;
> 
>   hash_map v_info_map;
>@@ -1879,9 +1885,45 @@ undistribute_bitref_for_vector (enum tre
>   if (!TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
>   continue;
> 
>+  if (VECTOR_TYPE_P (TREE_TYPE (rhs))
>+|| !is_a  (TYPE_MODE (TREE_TYPE (rhs
>+  continue;
>+
>+  /* The type of BIT_FIELD_REF might not be equal to the element
>type of
>+   the vector.  We want to use a vector type with element type the
>+   same as the BIT_FIELD_REF and size the same as TREE_TYPE (vec).  */
>+  if (!useless_type_conversion_p (TREE_TYPE (rhs), TREE_TYPE
>(vec_type)))
>+  {
>+machine_mode simd_mode;
>+unsigned HOST_WIDE_INT size, nunits;
>+unsigned HOST_WIDE_INT elem_size
>+  = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (rhs)));
>+if (!GET_MODE_BITSIZE (TYPE_MODE (vec_type)).is_constant ())
>+  continue;
>+if (size <= elem_size || (size % elem_size) != 0)
>+  continue;
>+nunits = size / elem_size;
>+if (!mode_for_vector (SCALAR_TYPE_MODE (TREE_TYPE (rhs)),
>+  nunits).exists (_mode))
>+  continue;
>+vec_type = build_vector_type_for_mode (TREE_TYPE (rhs), simd_mode);
>+
>+/* Ignore it if target machine can't support this VECTOR type.  */
>+if (!VECTOR_MODE_P (TYPE_MODE (vec_type)))
>+  continue;
>+
>+/* Check const vector type, constrain BIT_FIELD_REF offset and
>+   size.  */
>+if (!TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
>+  continue;
>+
>+if (maybe_ne (GET_MODE_SIZE (TYPE_MODE (vec_type)),
>+  GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (vec)
>+  continue;
>+  }
>+
>   tree elem_type = TREE_TYPE (vec_type);
>-  unsigned HOST_WIDE_INT elem_size
>-  = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
>+  unsigned HOST_WIDE_INT elem_size = tree_to_uhwi (TYPE_SIZE
>(elem_type));
>   if (maybe_ne (bit_field_size (rhs), elem_size))
>   continue;
> 
>@@ -1898,8 +1940,13 @@ undistribute_bitref_for_vector (enum tre
>   bool existed;
>   v_info_ptr  = v_info_map.get_or_insert (vec, );
>   if (!existed)
>-  info = new v_info;
>-  info->safe_push (std::make_pair (idx, i));
>+  {
>+info = new v_info;
>+info->vec_type = vec_type;
>+  }
>+  else if (!types_compatible_p (vec_type, info->vec_type))
>+  continue;
>+  info->vec.safe_push (std::make_pair (idx, i));
> }
> 
>   /* At 

[PATCH][_GLIBCXX_DEBUG] Improve valid_range check

2019-11-22 Thread François Dumont

Hi

    I noticed that we are not checking that iterators are not singular 
in valid_range. Moreover __check_singular signature for pointers is not 
intercepting all kind of pointers in terms of qualification.


    I'd like to commit it next week but considering we are in stage 3 I 
need proper acceptance.


    * include/debug/functions.h: Remove  include.
    (__check_singular_aux, __check_singular): Move...
    * include/debug/helper_functions.h:
    (__check_singular_aux, __check_singular): ...here.
    (__valid_range_aux): Adapt to use latter.
    * testsuite/25_algorithms/copy/debug/2_neg.cc: New.

Tested under Linux x86_64 normal and debug modes.

François

diff --git a/libstdc++-v3/include/debug/functions.h b/libstdc++-v3/include/debug/functions.h
index 8c385b87244..12df745b573 100644
--- a/libstdc++-v3/include/debug/functions.h
+++ b/libstdc++-v3/include/debug/functions.h
@@ -29,7 +29,6 @@
 #ifndef _GLIBCXX_DEBUG_FUNCTIONS_H
 #define _GLIBCXX_DEBUG_FUNCTIONS_H 1
 
-#include 		// for __addressof
 #include 	// for less
 
 #if __cplusplus >= 201103L
@@ -49,23 +48,6 @@ namespace __gnu_debug
   template
 struct _Is_contiguous_sequence : std::__false_type { };
 
-  // An arbitrary iterator pointer is not singular.
-  inline bool
-  __check_singular_aux(const void*) { return false; }
-
-  // We may have an iterator that derives from _Safe_iterator_base but isn't
-  // a _Safe_iterator.
-  template
-inline bool
-__check_singular(const _Iterator& __x)
-{ return __check_singular_aux(std::__addressof(__x)); }
-
-  /** Non-NULL pointers are nonsingular. */
-  template
-inline bool
-__check_singular(const _Tp* __ptr)
-{ return __ptr == 0; }
-
   /* Checks that [first, last) is a valid range, and then returns
* __first. This routine is useful when we can't use a separate
* assertion statement because, e.g., we are in a constructor.
diff --git a/libstdc++-v3/include/debug/helper_functions.h b/libstdc++-v3/include/debug/helper_functions.h
index c3e7478f649..5a858754875 100644
--- a/libstdc++-v3/include/debug/helper_functions.h
+++ b/libstdc++-v3/include/debug/helper_functions.h
@@ -29,6 +29,7 @@
 #ifndef _GLIBCXX_DEBUG_HELPER_FUNCTIONS_H
 #define _GLIBCXX_DEBUG_HELPER_FUNCTIONS_H 1
 
+#include // for __addressof
 #include 	// for iterator_traits,
 		// categories and _Iter_base
 #include 		// for __is_integer
@@ -112,6 +113,23 @@ namespace __gnu_debug
 __get_distance(_Iterator __lhs, _Iterator __rhs)
 { return __get_distance(__lhs, __rhs, std::__iterator_category(__lhs)); }
 
+  // An arbitrary iterator pointer is not singular.
+  inline bool
+  __check_singular_aux(const void*) { return false; }
+
+  // We may have an iterator that derives from _Safe_iterator_base but isn't
+  // a _Safe_iterator.
+  template
+inline bool
+__check_singular(_Iterator const& __x)
+{ return __check_singular_aux(std::__addressof(__x)); }
+
+  /** Non-NULL pointers are nonsingular. */
+  template
+inline bool
+__check_singular(_Tp* const& __ptr)
+{ return __ptr == 0; }
+
   /** We say that integral types for a valid range, and defer to other
*  routines to realize what to do with integral types instead of
*  iterators.
@@ -138,14 +156,23 @@ namespace __gnu_debug
 inline bool
 __valid_range_aux(_InputIterator __first, _InputIterator __last,
 		  std::input_iterator_tag)
-{ return true; }
+{
+  if (__first != __last)
+	return !__check_singular(__first) && !__check_singular(__last);
+
+  return true;
+}
 
   template
 _GLIBCXX_CONSTEXPR
 inline bool
 __valid_range_aux(_InputIterator __first, _InputIterator __last,
 		  std::random_access_iterator_tag)
-{ return __first <= __last; }
+{
+  return
+	__valid_range_aux(__first, __last, std::input_iterator_tag{})
+	&& __first <= __last;
+}
 
   /** We have iterators, so figure out what kind of iterators they are
*  to see if we can check the range ahead of time.
@@ -167,6 +194,9 @@ namespace __gnu_debug
 		  typename _Distance_traits<_InputIterator>::__type& __dist,
 		  std::__false_type)
 {
+  if (!__valid_range_aux(__first, __last, std::input_iterator_tag{}))
+	return false;
+
   __dist = __get_distance(__first, __last);
   switch (__dist.second)
 	{
diff --git a/libstdc++-v3/testsuite/25_algorithms/copy/debug/2_neg.cc b/libstdc++-v3/testsuite/25_algorithms/copy/debug/2_neg.cc
new file mode 100644
index 000..8bbf873de96
--- /dev/null
+++ b/libstdc++-v3/testsuite/25_algorithms/copy/debug/2_neg.cc
@@ -0,0 +1,37 @@
+// Copyright (C) 2019 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that 

Re: [PATCH v3] PR92398: Fix testcase failure of pr72804.c

2019-11-22 Thread Segher Boessenkool
Hi!

On Fri, Nov 22, 2019 at 01:26:39PM +0800, luoxhu wrote:
> Update the code as you wish, Thanks:

The point is to make this interface easy and clear to use.  So please
tell me what *you* think about that, don't just do what I think may be
a good solution!

>   * gcc.target/powerpc/pr72804.c: Split the store function to...
>   * gcc.target/powerpc/pr72804-1.c: ... this one.  New.

Ah, good idea.

> +++ b/gcc/testsuite/gcc.target/powerpc/pr72804-1.c

> +/* store generates difference instructions as below:
> +   P9: mtvsrdd;xxlnot;stxv.
> +   P8/P7/P6 LE: not;not;std;std.
> +   P8 BE: mtvsrd;mtvsrd;xxpermdi;xxlnor;stxvd2x.
> +   P7/P6 BE: std;std;addi;lxvd2x;xxlnor;stxvd2x.  */

But it should generate just the first or second.  So just document that,
and base the tests around that as well?

> +/* { dg-final { scan-assembler-times {\mmtvsrdd\M} 1 { target power9+ } } } 
> */
> +/* { dg-final { scan-assembler-times {\mxxlnor\M} 1 { target power9+ } } } */
> +/* { dg-final { scan-assembler-times {\mstxv\M} 1 { target power9+ } } } */
> +
> +/* { dg-final { scan-assembler-times {\mnot\M} 2 { xfail {! { {! power9+} && 
> {le} } } } } } */
> +
> +/* { dg-final { scan-assembler-times {\mstd\M} 2 { xfail { { {power8} && 
> {be} } || {power9+} } } } } */

These shouldn't xfail for p9 and later: it is an *unexpected* failure there.

Maybe it is easier to duplicate this test?  One for p9+ and one for ! p9+?

> +# Return 1 if we're generating code for only power8 platforms.
> +
> +proc check_effective_target_power8 {  } {
> +  return [check_no_compiler_messages_nocache power8 assembly {
> + #if !(!defined(_ARCH_PWR9) && defined(_ARCH_PWR8))
> + #error NO
> + #endif
> +  } ""]
> +}

Do we need this?

> +# Return 1 if we're generating code for power9 and future platforms.
> +
> +proc check_effective_target_power9+ {  } {
> +  return [check_no_compiler_messages_nocache power9+ assembly {
> + #if !(defined(_ARCH_PWR9))
> + #error NO
> + #endif
> +  } ""]
> +}

Maybe it is useful to have even shorter names, p9+, since room is scarce
where it is used (just like le and be).  But maybe it doesn't matter?


Segher


Re: [PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Uros Bizjak
On Fri, Nov 22, 2019 at 5:39 PM Bernd Schmidt  wrote:
>
> On 11/22/19 3:04 PM, Uros Bizjak wrote:
> > On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt  
> > wrote:
> >>
> >> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
> >> account. That causes the pr30315 test to fail with -m32, since the cost
> >> of an add that sets the flags is estimated too high.
> >>
> >> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, 
> >> ok?
> >
> > I think that the intention of the code below  "The embedded comparison
> > operand is completely free." comment is to handle this case. It looks
> > that it should return the cost of the inside operation of COMPARE rtx.
>
> There seem to be two problems with that. We're dealing with patterns
> such as:
>
> (set (reg:CCC 17 flags)
> (compare:CCC (plus:SI (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4
> A32])
> (reg/v:SI 87 [ b ]))
> (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])))

Indeed, this is a different case, an overflow test that results in one
CMP insn. I think, we should check if the second operand is either 0
(then proceed as it is now), or if the second operand equals first
operand of PLUS insn, then we actually emit CMP insn (please see
PR30315).

Uros.

> If I remove the test for const0_rtx, it still doesn't work - I think
> setting *total to zero is ineffective, since we'll still count the MEM
> twice.
>
> So, how about the following?
>
>
> Bernd
>
> @@ -19502,9 +19502,12 @@
> }
>
>/* The embedded comparison operand is completely free.  */
> -  if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))
> - && XEXP (x, 1) == const0_rtx)
> -   *total = 0;
> +  if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0
> +   {
> + *total = rtx_cost (XEXP (x, 0), GET_MODE (XEXP (x, 0)),
> +COMPARE, 0, speed);
> + return true;
> +   }
>
>return false;
>


[committed, amdgcn] Limit LDS usage

2019-11-22 Thread Andrew Stubbs
This patch changes the amount of LDS (Local Data Store) memory requested 
for offload kernels. This allows more teams/gangs to run on the same 
compute unit, increasing potential data throughput.


For OpenMP we can reduce the allocation to almost nothing. This means we 
can have up-to 40 single-thread teams per CU.


For OpenACC we need enough LDS to broadcast data between workers, and 
the algorithm is not particularly memory efficient. This means we cannot 
yet achieve the maximum thread count, but we can at least double the 
current thread-count -- to 32 -- but halving the LDS usage and relying 
on having 16 workers. (Note that I'm assuming Julian's multi-worker 
support patches will be committed soon. Without those we can allocate no 
LDS and have 40 single-worker teams. With the patches the same can also 
be true, but that's still on the to-do list.)


LDS allocation remains unchanged for non-offload compiles (this is only 
really used for running the testsuite).


--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Limit LDS usage.

2019-11-22  Andrew Stubbs  

	gcc/
	* config/gcn/gcn.c (OMP_LDS_SIZE): Define.
	(ACC_LDS_SIZE): Define.
	(OTHER_LDS_SIZE): Define.
	(LDS_SIZE): Redefine using above.
	(gcn_expand_prologue): Initialize m0 with LDS_SIZE-1.

diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 3a8c10ed8b4..f85d84bbe95 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -70,10 +70,15 @@ int gcn_isa = 3;		/* Default to GCN3.  */
worker-single mode to worker-partitioned mode), per workgroup.  Global
analysis could calculate an exact bound, but we don't do that yet.
  
-   We reserve the whole LDS, which also prevents any other workgroup
-   sharing the Compute Unit.  */
+   We want to permit full occupancy, so size accordingly.  */
 
-#define LDS_SIZE 65536
+#define OMP_LDS_SIZE 0x600/* 0x600 is 1/40 total, rounded down.  */
+#define ACC_LDS_SIZE 32768/* Half of the total should be fine.  */
+#define OTHER_LDS_SIZE 65536  /* If in doubt, reserve all of it.  */
+
+#define LDS_SIZE (flag_openacc ? ACC_LDS_SIZE \
+		  : flag_openmp ? OMP_LDS_SIZE \
+		  : OTHER_LDS_SIZE)
 
 /* The number of registers usable by normal non-kernel functions.
The SGPR count includes any special extra registers such as VCC.  */
@@ -2876,8 +2881,11 @@ gcn_expand_prologue ()
   /* Ensure that the scheduler doesn't do anything unexpected.  */
   emit_insn (gen_blockage ());
 
+  /* m0 is initialized for the usual LDS DS and FLAT memory case.
+ The low-part is the address of the topmost addressable byte, which is
+ size-1.  The high-part is an offset and should be zero.  */
   emit_move_insn (gen_rtx_REG (SImode, M0_REG),
-		  gen_int_mode (LDS_SIZE, SImode));
+		  gen_int_mode (LDS_SIZE-1, SImode));
 
   emit_insn (gen_prologue_use (gen_rtx_REG (SImode, M0_REG)));
 


Re: [PATCH] fold strncmp of unterminated arrays (PR 92501)

2019-11-22 Thread Martin Sebor

On 11/15/19 1:15 PM, Jeff Law wrote:

On 11/14/19 10:11 AM, Martin Sebor wrote:

Adding tests for unsafe uses of unterminated constant char arrays
in string functions exposed the limitation in strncmp folding
described in PR 92501: GCC only folds strncmp calls involving
nul-terminated constant strings.

The attached patch improves the folder to also handle unterminated
constant character arrays.  This capability is in turn relied on
for the dependable detection of unsafe uses of unterminated arrays
in strncpy, a patch for which I'm about to post separately.

Tested on x86_64-linux.

Martin

gcc-92501.diff

PR tree-optimization/92501 - strncmp with constant unterminated arrays not 
folded

gcc/testsuite/ChangeLog:

PR tree-optimization/92501
* gcc.dg/strcmpopt_7.c: New test.

gcc/ChangeLog:

PR tree-optimization/92501
* gimple-fold.c ((gimple_fold_builtin_string_compare): Let strncmp
handle unterminated arrays.  Rename local variables for clarity.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c   (revision 278253)
+++ gcc/gimple-fold.c   (working copy)
@@ -2361,9 +2361,32 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
return true;
  }
  
-  const char *p1 = c_getstr (str1);

-  const char *p2 = c_getstr (str2);
+  /* Initially set to the number of characters, including the terminating
+ nul if each array has one.   Nx == strnlen(Sx, Nx) implies that
+ the array is not terminated by a nul.
+ For nul-terminated strings then adjusted to their length.  */
+  unsigned HOST_WIDE_INT len1 = HOST_WIDE_INT_MAX, len2 = len1;
+  const char *p1 = c_getstr (str1, );
+  const char *p2 = c_getstr (str2, );
  
+  /* The position of the terminting nul character if one exists, otherwise

terminting/terminating/


OK with the nit fixed.


Another review and more testing with the follow-on change to detect
the missing nuls (PR 88226) made me realize the code wasn't quite
right here.  It was at the same time overly permissive and restrictive.
I wound up tweaking it a bit and committing the attached in r278621.

Martin

PS This simple change is a good example of the delicate balance
between optimizing and detecting bugs.  The strncpy optimization
to fold calls like strncmp(A, A, N) into zero early on, without
even looking at the contents of A, means that invalid calls where
N is larger than the size of the unterminated array A are not
diagnosed.  Likewise, early folding of strcmp(A, S) to *A when
*S == 0 holds prevents the detection of the same bugs.  This might
be okay if the expected strategy is to fold the bugs away, but it's
not when the user expects GCC to consistently detect bugs even if
it can avoid their deleterious effects and substitute reasonable
behavior.  This will become a more of a problem once we provide
an option to control the strategy as we've discussed.
PR tree-optimization/92501 - strncmp with constant unterminated arrays not folded

gcc/testsuite/ChangeLog:

	PR tree-optimization/92501
	* gcc.dg/strcmpopt_7.c: New test.

gcc/ChangeLog:

	PR tree-optimization/92501
	* gimple-fold.c ((gimple_fold_builtin_string_compare): Let strncmp
	handle unterminated arrays.  Rename local variables for clarity.

Index: gcc/gimple-fold.c
===
--- gcc/gimple-fold.c	(revision 278603)
+++ gcc/gimple-fold.c	(working copy)
@@ -2323,8 +2323,7 @@ gimple_load_first_char (location_t loc, tree str,
   return var;
 }
 
-/* Fold a call to the str{n}{case}cmp builtin pointed by GSI iterator.
-   FCODE is the name of the builtin.  */
+/* Fold a call to the str{n}{case}cmp builtin pointed by GSI iterator.  */
 
 static bool
 gimple_fold_builtin_string_compare (gimple_stmt_iterator *gsi)
@@ -2337,18 +2336,19 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
   tree str1 = gimple_call_arg (stmt, 0);
   tree str2 = gimple_call_arg (stmt, 1);
   tree lhs = gimple_call_lhs (stmt);
-  HOST_WIDE_INT length = -1;
+  tree len = NULL_TREE;
+  unsigned HOST_WIDE_INT bound = HOST_WIDE_INT_M1U;
 
   /* Handle strncmp and strncasecmp functions.  */
   if (gimple_call_num_args (stmt) == 3)
 {
-  tree len = gimple_call_arg (stmt, 2);
+  len = gimple_call_arg (stmt, 2);
   if (tree_fits_uhwi_p (len))
-	length = tree_to_uhwi (len);
+	bound = tree_to_uhwi (len);
 }
 
   /* If the LEN parameter is zero, return zero.  */
-  if (length == 0)
+  if (bound == 0)
 {
   replace_call_with_value (gsi, integer_zero_node);
   return true;
@@ -2361,9 +2361,33 @@ gimple_fold_builtin_string_compare (gimple_stmt_it
   return true;
 }
 
-  const char *p1 = c_getstr (str1);
-  const char *p2 = c_getstr (str2);
+  /* Initially set to the number of characters, including the terminating
+ nul if each array has one.   LENx == strnlen (Sx, LENx) implies that
+ the array Sx is not terminated by a nul.
+ For nul-terminated 

Re: [PATCH] Make IPA-SRA follow comdat-local rules (PR 91956)

2019-11-22 Thread Jan Hubicka
> Hi,
> 
> this verifier error happens because first IPA-split splits a comdat
> function creating a comdat local callee and then IPA-SRA comes along,
> modifies the caller, makes it a local clone in the process but because
> it does not set the comdat group to the original, the verifier sees a
> call from outside of the comdat group to a private comdat symbol, which
> it does not like.
> 
> Fixed by doing what IPA-split does in the patch below.  Bootstrapped and
> tested on x86_64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 2019-11-21  Martin Jambor  
> 
>   PR ipa/91956
>   * ipa-sra.c (process_isra_node_results): Put the new node to the
>   same comdat group as the original node.
> 
>   testsuite/
>   * g++.dg/ipa/pr91956.C: New test.
> ---
>  gcc/ipa-sra.c  |  3 +++
>  gcc/testsuite/g++.dg/ipa/pr91956.C | 27 +++
>  2 files changed, 30 insertions(+)
>  create mode 100644 gcc/testsuite/g++.dg/ipa/pr91956.C
> 
> diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
> index 08606aeded1..c6ed0f44b87 100644
> --- a/gcc/ipa-sra.c
> +++ b/gcc/ipa-sra.c
> @@ -3759,6 +3759,9 @@ process_isra_node_results (cgraph_node *node,
>  = node->create_virtual_clone (callers, NULL, new_adjustments, "isra",
> suffix_counter);
>suffix_counter++;
> +  if (node->same_comdat_group)
> +new_node->add_to_same_comdat_group (node);
> +  new_node->calls_comdat_local = node->calls_comdat_local;

This looks correct, but since I do not remember how that is done,
can you please double-check that the symbol actually gets output into
the given comdat section in the assembly file?

Honza


Re: [PATCH] ipa: Prevent materialization of clones with removed bodies (PR 92109)

2019-11-22 Thread Jan Hubicka
> Hi,
> 
> this bug is an interesting one.  The immediate cause of the ICE is that
> IPA-SRA modification phase baked into clone materialization was
> expecting to see a body already changed by IPA-CP while it was actually
> looking at the original function.
> 
> That was because we were materializing the (offline) clone in an ltrans
> from which it was called but where it did not belonged at all.
> remove_unreachable_nodes realized this and wanted to keep just the
> declaration and remove the body - and therefor also all traces of the
> IPA-CP clone which was necessary just for materialization.  But
> materialization code still attempted to resurrect the clone.
> 
> Honza proposed immediately removing any node which only needs to have a
> declaration from the tree of clones to prevent materialization code from
> even considering them which is what the following patch does.
> 
> Unfortunately, I was not able to come up with a small testcase, the bug
> is triggered by fragile partitioning decisions.  I have bootstrapped,
> LTO-bootstrapped and tested the patch on an x86_64-linux, OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 
> 2019-11-21  Martin Jambor  
> 
>   PR ipa/92109
>   * cgraph.h (cgraph_node::remove_from_clone_tree): Declare.
>   * cgraphclones.c (cgraph_node::remove_from_clone_tree): New method.
>   (cgraph_materialize_clone): Move removel from clone tree to the
>   the new method and use it instead.
>   * ipa.c (symbol_table::remove_unreachable_nodes): When removing
>   bodies of clones, also remove it from the clone tree.

OK,
thanks!
Honza
> ---
>  gcc/cgraph.h   |  4 
>  gcc/cgraphclones.c | 35 ++-
>  gcc/ipa.c  | 11 ---
>  3 files changed, 34 insertions(+), 16 deletions(-)
> 
> diff --git a/gcc/cgraph.h b/gcc/cgraph.h
> index a4f14743f00..0d2442c997c 100644
> --- a/gcc/cgraph.h
> +++ b/gcc/cgraph.h
> @@ -967,6 +967,10 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : 
> public symtab_node
>ipa_param_adjustments *param_adjustments,
>const char * suffix, unsigned num_suffix);
>  
> +  /* Remove the node from the tree of virtual and inline clones and make it a
> + standalone node - not a clone any more.  */
> +  void remove_from_clone_tree ();
> +
>/* cgraph node being removed from symbol table; see if its entry can be
> replaced by other inline clone.  */
>cgraph_node *find_replacement (void);
> diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
> index bfcebb20495..ac5c57a47aa 100644
> --- a/gcc/cgraphclones.c
> +++ b/gcc/cgraphclones.c
> @@ -1013,6 +1013,22 @@ cgraph_node::create_version_clone_with_body
>return new_version_node;
>  }
>  
> +/* Remove the node from the tree of virtual and inline clones and make it a
> +   standalone node - not a clone any more.  */
> +
> +void cgraph_node::remove_from_clone_tree ()
> +{
> +  if (next_sibling_clone)
> +next_sibling_clone->prev_sibling_clone = prev_sibling_clone;
> +  if (prev_sibling_clone)
> +prev_sibling_clone->next_sibling_clone = next_sibling_clone;
> +  else
> +clone_of->clones = next_sibling_clone;
> +  next_sibling_clone = NULL;
> +  prev_sibling_clone = NULL;
> +  clone_of = NULL;
> +}
> +
>  /* Given virtual clone, turn it into actual clone.  */
>  
>  static void
> @@ -1033,22 +1049,15 @@ cgraph_materialize_clone (cgraph_node *node)
>dump_function_to_file (node->decl, symtab->dump_file, dump_flags);
>  }
>  
> +  cgraph_node *clone_of = node->clone_of;
>/* Function is no longer clone.  */
> -  if (node->next_sibling_clone)
> -node->next_sibling_clone->prev_sibling_clone = node->prev_sibling_clone;
> -  if (node->prev_sibling_clone)
> -node->prev_sibling_clone->next_sibling_clone = node->next_sibling_clone;
> -  else
> -node->clone_of->clones = node->next_sibling_clone;
> -  node->next_sibling_clone = NULL;
> -  node->prev_sibling_clone = NULL;
> -  if (!node->clone_of->analyzed && !node->clone_of->clones)
> +  node->remove_from_clone_tree ();
> +  if (!clone_of->analyzed && !clone_of->clones)
>  {
> -  node->clone_of->release_body ();
> -  node->clone_of->remove_callees ();
> -  node->clone_of->remove_all_references ();
> +  clone_of->release_body ();
> +  clone_of->remove_callees ();
> +  clone_of->remove_all_references ();
>  }
> -  node->clone_of = NULL;
>bitmap_obstack_release (NULL);
>  }
>  
> diff --git a/gcc/ipa.c b/gcc/ipa.c
> index 0c92980db46..2404024d722 100644
> --- a/gcc/ipa.c
> +++ b/gcc/ipa.c
> @@ -520,9 +520,14 @@ symbol_table::remove_unreachable_nodes (FILE *file)
>reliably.  */
> if (node->alias || node->thunk.thunk_p)
>   ;
> -   else if (!body_needed_for_clonning.contains (node->decl)
> -   && !node->alias && !node->thunk.thunk_p)
> - node->release_body ();
> +   else if 

Re: [PATCH 3/4] Set costs for jumps in combine

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 21, 2019 at 08:12:05PM -0500, Paul Koning wrote:
> > On Nov 21, 2019, at 7:42 PM, Segher Boessenkool 
> >  wrote:
> > 
> > ...
> > Maybe i386 should implement the insn_cost hook as well?  For most targets
> > that is a lot simpler to get right than rtx_cost, but allowing memory in
> > many insns and all the different insn lengths complicates matters.  At
> > least insn_cost isn't inside-out, that should make it easier to deal with
> > already.
> 
> Yes, rtx_cost is a pain to write and understand.  I looked at insn_cost in 
> the past but was told, or got the impression somehow, that it is at the 
> moment a partial solution and rtx_cost is still needed to complete the cost 
> picture.
> 
> Is that incorrect, or no longer accurate?  I would like to dump rtx_cost in 
> my target if that is now indeed a valid thing to do.

Some things still want the rtx_cost of non-insns.  CSE does this, as does
expand (for figuring out the multiplication strategy to use, for one thing),
and then there are set_src_cost and set_rtx_cost (which are probably not
hard to fix, for the cases where there *is* an insn for doing things).

With rtx_cost you need to estimate the cost of arbitrary RTL expressions,
whether that means anything for the machine or not.  This does not make
terribly much sense, but changing the compiler her without regressing
things isn't trivial.


Segher


Re: [PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Bernd Schmidt
On 11/22/19 3:04 PM, Uros Bizjak wrote:
> On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt  wrote:
>>
>> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
>> account. That causes the pr30315 test to fail with -m32, since the cost
>> of an add that sets the flags is estimated too high.
>>
>> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?
> 
> I think that the intention of the code below  "The embedded comparison
> operand is completely free." comment is to handle this case. It looks
> that it should return the cost of the inside operation of COMPARE rtx.

There seem to be two problems with that. We're dealing with patterns
such as:

(set (reg:CCC 17 flags)
(compare:CCC (plus:SI (mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4
A32])
(reg/v:SI 87 [ b ]))
(mem/c:SI (reg/f:SI 16 argp) [2 a+0 S4 A32])))

If I remove the test for const0_rtx, it still doesn't work - I think
setting *total to zero is ineffective, since we'll still count the MEM
twice.

So, how about the following?


Bernd

@@ -19502,9 +19502,12 @@
}

   /* The embedded comparison operand is completely free.  */
-  if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))
- && XEXP (x, 1) == const0_rtx)
-   *total = 0;
+  if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0
+   {
+ *total = rtx_cost (XEXP (x, 0), GET_MODE (XEXP (x, 0)),
+COMPARE, 0, speed);
+ return true;
+   }

   return false;



Prevent all uses of DFP when unsupported (PR c/91985)

2019-11-22 Thread Joseph Myers
Code that directly uses _Decimal* types on architectures not
supporting DFP is properly diagnosed ("error: decimal floating-point
not supported for this target"), via a call to
targetm.decimal_float_supported_p, if the _Decimal32, _Decimal64 or
_Decimal128 keywords are used to access it.  Use via mode attributes
is also diagnosed ("unable to emulate 'SD'"); so is use of the
FLOAT_CONST_DECIMAL64 pragma.  However, it is possible to access those
types via typeof applied to constants or built-in functions without
such an error.  I expect that there are ways to get an ICE from this;
certainly it uses a completely undefined ABI.

This patch arranges for the types not to exist in the compiler at all
when DFP is not supported.  As is done with unsupported _FloatN /
_FloatNx types, the global tree nodes are left as NULL_TREE, and the
built-in function machinery is made to use error_mark_node for them in
that case in builtin-types.def, so that the built-in functions are
unavailable.  Code handling constants is adjusted to give an error,
and other code that might not work with the global tree nodes being
NULL_TREE is also updated.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Also tested
with no regressions for cross to aarch64-linux-gnu, as a configuration
without DFP support.  OK to commit (the changes that aren't C front-end 
changes)?

gcc:
2019-11-22  Joseph Myers  

PR c/91985
* builtin-types.def (BT_DFLOAT32, BT_DFLOAT64, BT_DFLOAT128)
(BT_DFLOAT32_PTR, BT_DFLOAT64_PTR, BT_DFLOAT128_PTR): Define to
error_mark_node if corresponding global tree node is NULL.
* tree.c (build_common_tree_nodes): Do not initialize
dfloat32_type_node, dfloat64_type_node or dfloat128_type_node if
decimal floating-point not supported.

gcc/c:
2019-11-22  Joseph Myers  

PR c/91985
* c-decl.c (finish_declspecs): Use int instead of decimal
floating-point types if decimal floating-point not supported.

gcc/c-family:
2019-11-22  Joseph Myers  

PR c/91985
* c-common.c (c_common_type_for_mode): Handle decimal
floating-point types being NULL_TREE.
* c-format.c (get_format_for_type_1): Handle specified types being
NULL_TREE.
* c-lex.c (interpret_float): Give an error for decimal
floating-point constants when decimal floating-point not
supported.

gcc/lto:
2019-11-22  Joseph Myers  

PR c/91985
* lto-lang.c (lto_type_for_mode): Handle decimal floating-point
types being NULL_TREE.

gcc/testsuite:
2019-11-22  Joseph Myers  

PR c/91985
* gcc.dg/c2x-no-dfp-1.c, gcc.dg/gnu2x-builtins-no-dfp-1.c: New
tests.
* gcc.dg/fltconst-pedantic-dfp.c: Expect errors when decimal
floating-point not supported.

Index: gcc/builtin-types.def
===
--- gcc/builtin-types.def   (revision 278603)
+++ gcc/builtin-types.def   (working copy)
@@ -136,12 +136,24 @@ DEF_PRIMITIVE_TYPE (BT_WINT, wint_type_node)
 DEF_PRIMITIVE_TYPE (BT_STRING, string_type_node)
 DEF_PRIMITIVE_TYPE (BT_CONST_STRING, const_string_type_node)
 
-DEF_PRIMITIVE_TYPE (BT_DFLOAT32, dfloat32_type_node)
-DEF_PRIMITIVE_TYPE (BT_DFLOAT64, dfloat64_type_node)
-DEF_PRIMITIVE_TYPE (BT_DFLOAT128, dfloat128_type_node)
-DEF_PRIMITIVE_TYPE (BT_DFLOAT32_PTR, dfloat32_ptr_type_node)
-DEF_PRIMITIVE_TYPE (BT_DFLOAT64_PTR, dfloat64_ptr_type_node)
-DEF_PRIMITIVE_TYPE (BT_DFLOAT128_PTR, dfloat128_ptr_type_node)
+DEF_PRIMITIVE_TYPE (BT_DFLOAT32, (dfloat32_type_node
+ ? dfloat32_type_node
+ : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_DFLOAT64, (dfloat64_type_node
+ ? dfloat64_type_node
+ : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_DFLOAT128, (dfloat128_type_node
+  ? dfloat128_type_node
+  : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_DFLOAT32_PTR, (dfloat32_ptr_type_node
+ ? dfloat32_ptr_type_node
+ : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_DFLOAT64_PTR, (dfloat64_ptr_type_node
+ ? dfloat64_ptr_type_node
+ : error_mark_node))
+DEF_PRIMITIVE_TYPE (BT_DFLOAT128_PTR, (dfloat128_ptr_type_node
+  ? dfloat128_ptr_type_node
+  : error_mark_node))
 
 DEF_PRIMITIVE_TYPE (BT_VALIST_REF, va_list_ref_type_node)
 DEF_PRIMITIVE_TYPE (BT_VALIST_ARG, va_list_arg_type_node)
Index: gcc/c/c-decl.c
===
--- gcc/c/c-decl.c  (revision 278603)
+++ gcc/c/c-decl.c  (working copy)
@@ -11619,7 +11619,9 @@ finish_declspecs (struct c_declspecs *specs)
 case cts_dfloat128:
   

Re: Add a new combine pass

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 21, 2019 at 08:32:14PM +, Richard Sandiford wrote:
> Segher Boessenkool  writes:
> > It's not great if every pass invents its own version of some common
> > infrastructure thing because that common one is not suitable.
> >
> > I.e., can this be fixed somehow?  Maybe just by having a restricted DU
> > chains df problem?
> 
> Well, it'd probably make sense to make fwprop.c's approach available
> as a "proper" df interface at some point.  Hopefully if anyone wants the
> same thing as fwprop.c, they'd do that rather than copy the code. :-)

> >> * Updating a full, ordered def-use chain after a move is a linear-time
> >>   operation, so whatever happens, we'd need to apply some kind of limit
> >>   on the number of uses we maintain, with something like that integer
> >>   point range for the rest.

Yeah.

> >> * Once we've analysed the insn and built its def-use chains, we don't
> >>   look at the df_refs again until we update the chains after a successful
> >>   combination.  So it should be more efficient to maintain a small array
> >>   of insn_info_rec pointers alongside the numerical range, rather than
> >>   walk and pollute chains of df_refs and then link back the insn uids
> >>   to the pass-local info.
> >
> > So you need something like combine's LOG_LINKS?  Not that handling those
> > is not quadratic in the worst case, but in practice it works well.  And
> > it *could* be made linear.
> 
> Not sure why what I've used isn't what I need though :-)

I am wondering the other way around :-)  Is what you do for combine2
something that would be more generally applicable/useful?  That's what
I'm trying to find out :-)

What combine does could use some improvement, if you want to hear a
more direct motivations.  LOG_LINKS just skip references we cannot
handle (and some more), so we always have to do modified_between etc.,
which hurts.

> >> Target Tests   DeltaBest   Worst  Median
> >> avr-elf 1341 -111401  -13824 680 -10
> >
> > Things like this are kind of suspicious :-)
> 
> Yeah.  This mostly seems to come from mopping up the extra moves created
> by make_more_copies.  So we have combinations like:
> 
>58: r70:SF=r94:SF
>   REG_DEAD r94:SF
>60: r22:SF=r70:SF
>   REG_DEAD r70:SF

Why didn't combine do this?  A target problem?

> So there's only one case in which it isn't a win, but the number of
> tests is tiny.  So I agree there's no justification for trying this in
> combine proper as things stand (and I wasn't arguing otherwise FWIW).
> I'd still like to keep it in the new pass because it does help
> *sometimes* and there's no sign yet that it has a noticeable
> compile-time cost.

So when does it help?  I can only think of cases where there are
problems elsewhere.

> It might also be interesting to see how much difference it makes for
> run-combine=4 (e.g. to see how much it makes up for the current 2-insn
> limit)...

Numbers are good :-)


Segher


Re: [PATCH] [og9] Fix libgomp.oacc-fortran/lib-16.f90 test

2019-11-22 Thread Julian Brown
On Fri, 22 Nov 2019 16:11:15 +
Kwok Cheung Yeung  wrote:

> libgomp.oacc-fortran/lib-16.f90 in the libgomp testsuite currently
> fails with:
> 
> Program aborted. Backtrace:
> #0  0x4013bf in MAIN__
>  at 
> src/gcc-gcn-master/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90:28
> #1  0x4013bf in main
>  at 
> src/gcc-gcn-master/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90:6
> 
> In commit 3a25e449d04d5768c3a799264ba0e5cab8ae181f ([og9] Fix OpenACC 
> "ephemeral" asynchronous host-to-device copies), lib-16-2.f90 has two 
> acc_waits added to wait for an asynchronous update to complete before 
> operating on the data. Since lib-16-2.f90 is supposed to be identical
> to lib-16.f90 except for using 'include "openacc_lib.h"' instead of
> 'use openacc', any changes to lib-16-2.f90 should also apply to
> lib-16.f90. When the changes are applied to lib-16.f90, the test
> passes.
> 
> Okay to commit to OG9?

Fine by me, FWIW!

Thanks,

Julian


[PATCH] ipa: Prevent materialization of clones with removed bodies (PR 92109)

2019-11-22 Thread Martin Jambor
Hi,

this bug is an interesting one.  The immediate cause of the ICE is that
IPA-SRA modification phase baked into clone materialization was
expecting to see a body already changed by IPA-CP while it was actually
looking at the original function.

That was because we were materializing the (offline) clone in an ltrans
from which it was called but where it did not belonged at all.
remove_unreachable_nodes realized this and wanted to keep just the
declaration and remove the body - and therefor also all traces of the
IPA-CP clone which was necessary just for materialization.  But
materialization code still attempted to resurrect the clone.

Honza proposed immediately removing any node which only needs to have a
declaration from the tree of clones to prevent materialization code from
even considering them which is what the following patch does.

Unfortunately, I was not able to come up with a small testcase, the bug
is triggered by fragile partitioning decisions.  I have bootstrapped,
LTO-bootstrapped and tested the patch on an x86_64-linux, OK for trunk?

Thanks,

Martin



2019-11-21  Martin Jambor  

PR ipa/92109
* cgraph.h (cgraph_node::remove_from_clone_tree): Declare.
* cgraphclones.c (cgraph_node::remove_from_clone_tree): New method.
(cgraph_materialize_clone): Move removel from clone tree to the
the new method and use it instead.
* ipa.c (symbol_table::remove_unreachable_nodes): When removing
bodies of clones, also remove it from the clone tree.
---
 gcc/cgraph.h   |  4 
 gcc/cgraphclones.c | 35 ++-
 gcc/ipa.c  | 11 ---
 3 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index a4f14743f00..0d2442c997c 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -967,6 +967,10 @@ struct GTY((tag ("SYMTAB_FUNCTION"))) cgraph_node : public 
symtab_node
 ipa_param_adjustments *param_adjustments,
 const char * suffix, unsigned num_suffix);
 
+  /* Remove the node from the tree of virtual and inline clones and make it a
+ standalone node - not a clone any more.  */
+  void remove_from_clone_tree ();
+
   /* cgraph node being removed from symbol table; see if its entry can be
replaced by other inline clone.  */
   cgraph_node *find_replacement (void);
diff --git a/gcc/cgraphclones.c b/gcc/cgraphclones.c
index bfcebb20495..ac5c57a47aa 100644
--- a/gcc/cgraphclones.c
+++ b/gcc/cgraphclones.c
@@ -1013,6 +1013,22 @@ cgraph_node::create_version_clone_with_body
   return new_version_node;
 }
 
+/* Remove the node from the tree of virtual and inline clones and make it a
+   standalone node - not a clone any more.  */
+
+void cgraph_node::remove_from_clone_tree ()
+{
+  if (next_sibling_clone)
+next_sibling_clone->prev_sibling_clone = prev_sibling_clone;
+  if (prev_sibling_clone)
+prev_sibling_clone->next_sibling_clone = next_sibling_clone;
+  else
+clone_of->clones = next_sibling_clone;
+  next_sibling_clone = NULL;
+  prev_sibling_clone = NULL;
+  clone_of = NULL;
+}
+
 /* Given virtual clone, turn it into actual clone.  */
 
 static void
@@ -1033,22 +1049,15 @@ cgraph_materialize_clone (cgraph_node *node)
   dump_function_to_file (node->decl, symtab->dump_file, dump_flags);
 }
 
+  cgraph_node *clone_of = node->clone_of;
   /* Function is no longer clone.  */
-  if (node->next_sibling_clone)
-node->next_sibling_clone->prev_sibling_clone = node->prev_sibling_clone;
-  if (node->prev_sibling_clone)
-node->prev_sibling_clone->next_sibling_clone = node->next_sibling_clone;
-  else
-node->clone_of->clones = node->next_sibling_clone;
-  node->next_sibling_clone = NULL;
-  node->prev_sibling_clone = NULL;
-  if (!node->clone_of->analyzed && !node->clone_of->clones)
+  node->remove_from_clone_tree ();
+  if (!clone_of->analyzed && !clone_of->clones)
 {
-  node->clone_of->release_body ();
-  node->clone_of->remove_callees ();
-  node->clone_of->remove_all_references ();
+  clone_of->release_body ();
+  clone_of->remove_callees ();
+  clone_of->remove_all_references ();
 }
-  node->clone_of = NULL;
   bitmap_obstack_release (NULL);
 }
 
diff --git a/gcc/ipa.c b/gcc/ipa.c
index 0c92980db46..2404024d722 100644
--- a/gcc/ipa.c
+++ b/gcc/ipa.c
@@ -520,9 +520,14 @@ symbol_table::remove_unreachable_nodes (FILE *file)
 reliably.  */
  if (node->alias || node->thunk.thunk_p)
;
- else if (!body_needed_for_clonning.contains (node->decl)
- && !node->alias && !node->thunk.thunk_p)
-   node->release_body ();
+ else if (!body_needed_for_clonning.contains (node->decl))
+   {
+ /* Make the node a non-clone so that we do not attempt to
+materialize it later.  */
+ if (node->clone_of)
+   node->remove_from_clone_tree ();
+ 

Re: Add a new combine pass

2019-11-22 Thread Segher Boessenkool
On Thu, Nov 21, 2019 at 07:41:56PM +, Richard Sandiford wrote:
> Nicholas Krause  writes:
> >> +/* The instructions we're combining, in program order.  */
> >> +insn_info_rec *sequence[MAX_COMBINE_INSNS];
> > Can't we can this a vec in order to grow to lengths and just loop through
> > merging on instructions in the vec as required?
> 
> Yeah, extending this to combining more than 2 instructions would be
> future work.  When that happens, this would likely end up becoming an
> auto_vec.  I imagine there would
> still be a fairly low compile-time limit on the number of combinations
> though.  E.g. current combine has a limit of 4, with even 4 being
> restricted to certain high-value cases.  I don't think I've ever
> seen a case where 5 or more would help.

And sometimes it looks like 4 would help, but often this is because of a
limitation elsewhere (like, it should have done a 2->2 before, for example).

4 _does_ help quite a bit with irregular instruction sets.  It could
sometimes help with RMW insns, too, but there are other problems with
that.

What you see a lot where 4 "helps" is where it really should combine
with just 3 of them, but something prevents that, often cost, while
throwing in a 4th insn tilts the balance just enough.  We used to have
a lot of that with 3-insn combinations as well, and probably still have
some.


Segher


[PATCH] Make IPA-SRA follow comdat-local rules (PR 91956)

2019-11-22 Thread Martin Jambor
Hi,

this verifier error happens because first IPA-split splits a comdat
function creating a comdat local callee and then IPA-SRA comes along,
modifies the caller, makes it a local clone in the process but because
it does not set the comdat group to the original, the verifier sees a
call from outside of the comdat group to a private comdat symbol, which
it does not like.

Fixed by doing what IPA-split does in the patch below.  Bootstrapped and
tested on x86_64-linux.  OK for trunk?

Thanks,

Martin


2019-11-21  Martin Jambor  

PR ipa/91956
* ipa-sra.c (process_isra_node_results): Put the new node to the
same comdat group as the original node.

testsuite/
* g++.dg/ipa/pr91956.C: New test.
---
 gcc/ipa-sra.c  |  3 +++
 gcc/testsuite/g++.dg/ipa/pr91956.C | 27 +++
 2 files changed, 30 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/ipa/pr91956.C

diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 08606aeded1..c6ed0f44b87 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -3759,6 +3759,9 @@ process_isra_node_results (cgraph_node *node,
 = node->create_virtual_clone (callers, NULL, new_adjustments, "isra",
  suffix_counter);
   suffix_counter++;
+  if (node->same_comdat_group)
+new_node->add_to_same_comdat_group (node);
+  new_node->calls_comdat_local = node->calls_comdat_local;
 
   if (dump_file)
 fprintf (dump_file, "  Created new node %s\n", new_node->dump_name ());
diff --git a/gcc/testsuite/g++.dg/ipa/pr91956.C 
b/gcc/testsuite/g++.dg/ipa/pr91956.C
new file mode 100644
index 000..6f6edc30a47
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ipa/pr91956.C
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -std=c++11 -fno-strict-aliasing -fno-tree-fre 
-fno-tree-vrp"  } */
+
+int count = 0;
+struct VB
+{
+  VB() {++count;}
+};
+
+struct B : virtual VB
+{
+  B() : B(42) {}
+  B(int)  {}
+};
+
+struct D : B
+{
+  D() {}
+  D(int) : D() {}
+};
+
+int main()
+{
+  D d{42};
+  if (count != 1)
+__builtin_abort();
+}
-- 
2.23.0



[PATCH] [og9] Fix libgomp.oacc-fortran/lib-16.f90 test

2019-11-22 Thread Kwok Cheung Yeung
libgomp.oacc-fortran/lib-16.f90 in the libgomp testsuite currently fails 
with:


Program aborted. Backtrace:
#0  0x4013bf in MAIN__
at 
src/gcc-gcn-master/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90:28

#1  0x4013bf in main
at 
src/gcc-gcn-master/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90:6


In commit 3a25e449d04d5768c3a799264ba0e5cab8ae181f ([og9] Fix OpenACC 
"ephemeral" asynchronous host-to-device copies), lib-16-2.f90 has two 
acc_waits added to wait for an asynchronous update to complete before 
operating on the data. Since lib-16-2.f90 is supposed to be identical to 
lib-16.f90 except for using 'include "openacc_lib.h"' instead of 'use 
openacc', any changes to lib-16-2.f90 should also apply to lib-16.f90. 
When the changes are applied to lib-16.f90, the test passes.


Okay to commit to OG9?

Thanks

Kwok


2019-11-22  Kwok Cheung Yeung  

libgomp/
* testsuite/libgomp.oacc-fortran/lib-16.f90: Fix async-safety issue.


diff --git a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90 
b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90

index 011f9cf..5e01099 100644
--- a/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
+++ b/libgomp/testsuite/libgomp.oacc-fortran/lib-16.f90
@@ -27,6 +27,9 @@ program main

   if (acc_is_present (h) .neqv. .TRUE.) call abort

+  ! We must wait for the update to be done.
+  call acc_wait (async)
+
   h(:) = 0

   call acc_copyout_async (h, sizeof (h), async)
@@ -45,6 +48,8 @@ program main

   if (acc_is_present (h) .neqv. .TRUE.) call abort

+  call acc_wait (async)
+
   do i = 1, N
 if (h(i) /= i + i) call abort
   end do


Re: [PATCH, rs6000] Refactor FP vector comparison operators

2019-11-22 Thread Segher Boessenkool
Hi!

On Thu, Nov 21, 2019 at 09:15:19PM +0800, Kewen.Lin wrote:
> >> +;; code iterators and attributes for vector FP comparison operators:
> >> +(define_code_iterator
> >> +  vector_fp_comparison_operator [lt le ne ungt unge unlt unle])
> > 
> > Can you change this name so it is clear it is just the simple ones?
> > 
> 
> Renamed as vector_fp_comparison_simple and vector_fp_comparison_complex,
> does it look better?

Sure.

>   4) Append "DONE" at the end of define_insn_and_split.

Ooh did I miss that?  That must have created some interesting code, heh.

> 2019-11-21 Kewen Lin  
> 
>   * config/rs6000/vector.md (vector_fp_comparison_simple):
>   New code iterator.
>   (vector_fp_comparison_complex): Likewise.
>   (vector_ for VEC_F and
>   vector_fp_comparison_simple): New define_and_split.

(You don't have to wrap so early...  Line length is 80 columns.)

> +;; code iterators and attributes for vector FP comparison operators:
> +(define_code_iterator
> + vector_fp_comparison_simple [lt le ne ungt unge unlt unle])
> +(define_code_iterator
> + vector_fp_comparison_complex [ltgt uneq unordered ordered])

Please indent by two spaces, not a tab.

> +; For lt/le/ne/ungt/unge/unlt/unle:
> +; lt(a,b)   = gt(b,a)
> +; le(a,b)   = ge(b,a)
> +; unge(a,b) = ~lt(a,b)
> +; unle(a,b) = ~gt(a,b)
> +; ne(a,b)   = ~eq(a,b)
> +; ungt(a,b) = ~le(a,b)
> +; unlt(a,b) = ~ge(a,b)
> +(define_insn_and_split "vector_"
>[(set (match_operand:VEC_F 0 "vfloat_operand")
> + (vector_fp_comparison_simple:VEC_F
> +(match_operand:VEC_F 1 "vfloat_operand")
> +(match_operand:VEC_F 2 "vfloat_operand")))]

Indent is weird here as well.

> +; For ltgt/uneq/ordered/unordered:
> +; ltgt: gt(a,b) | gt(b,a)
> +; uneq: ~(gt(a,b) | gt(b,a))
> +; ordered: ge(a,b) | ge(b,a)
> +; unordered: ~(ge(a,b) | ge(b,a))
> +(define_insn_and_split "vector_"
>[(set (match_operand:VEC_F 0 "vfloat_operand")
> + (vector_fp_comparison_complex:VEC_F
> +(match_operand:VEC_F 1 "vfloat_operand")
> +(match_operand:VEC_F 2 "vfloat_operand")))]
> +  "VECTOR_UNIT_ALTIVEC_OR_VSX_P (mode) && can_create_pseudo_p ()"
>"#"
> +  "&& can_create_pseudo_p ()"

So hrm, if we do that here we can as well do that in the previous
splitter as well (and not do the operands[0] thing) (sorry for going
back on this -- I have said that before haven't I?)

> +  switch (cond)
> +{
> +case LTGT:
> +  cond = GT;
> +  break;
> +case ORDERED:
> +  cond = GE;
> +  break;
> +default:
> +  gcc_unreachable ();
> +}

It feels a bit lighter (and is shorter) if you write

  if (cond == LTGT)
cond = GT;
  else if (cond == ORDERED)
cond = GE;
  else
gcc_unreachable ();


Okay for trunk with those trivialities taken care of one way or the
other.  Thanks!


Segher


[PATCH] PR c++/92439: Improve diagnostics for ill-formed requires-clauses

2019-11-22 Thread Andrew Sutton
This does a much better job identifying when an expression in a
requires-clause needs parentheses.

Andrew Sutton


0001-PR-c-92439.patch
Description: Binary data


Re: [patch,Fortran] PR 92050 - fix ICE with -fcheck=all

2019-11-22 Thread Steve Kargl
Just my $0.02.

Backporting a patch from trunk that fixes a regression is
always encouraged.  It is up to the person doing the backport
to determine the level of effort.  If it exceeds some threshold
the person can choose to not backport.

For patches that fix a bug, which is not a regresssion, but
allows gfortran to accept valid Fortran or reject invalid Fortran,
I think that this is up to the discretion of the person doing the
backport.  For example, I backported several patches that fixed
free-source code parsing issues with attribute statements (i.e.,
gfortran accepted "publica" as if it were "public a").

For new features, I think backporting is okay if the new feature
is isolated behind an option.

-- 
steve

On Fri, Nov 22, 2019 at 01:36:03PM +0100, Tobias Burnus wrote:
> Hi all,
> 
> I was asked to backport this fix to the GCC 9 branch – given that 
> -fcheck=bounds is a common option and it fails with code like 
> genecode.org. Given that ICEs with -fcheck are a regression and that the 
> patch is not that invasive, I am inclined to accept it.
> 
> Comments/(dis)approvals?
> 
> Tobias
> 
> On 10/11/19 12:17 PM, Tobias Burnus wrote:
> > Checking produces extra code (unsurprisingly); this code needs to end 
> > up in the program…
> >
> > Bootstrapped on x86-64_gnu-linux. OK for the trunk?
> >
> > Tobias
> >
> >
> > pr92050.diff
> >
> > diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
> > index 965ab7786a1..65238ff623d 100644
> > --- a/gcc/fortran/trans-expr.c
> > +++ b/gcc/fortran/trans-expr.c
> > @@ -7031,8 +7031,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * 
> > sym,
> > gfc_allocate_lang_decl (result);
> > GFC_DECL_SAVED_DESCRIPTOR (result) = parmse.expr;
> > gfc_free_expr (class_expr);
> > -  gcc_assert (parmse.pre.head == NULL_TREE
> > - && parmse.post.head == NULL_TREE);
> > +  /* -fcheck= can add diagnostic code, which has to be placed before
> > +the call. */
> > +  if (parmse.pre.head != NULL)
> > + gfc_add_expr_to_block (>pre, parmse.pre.head);
> > +  gcc_assert (parmse.post.head == NULL_TREE);
> >   }
> >   
> > /* Follow the function call with the argument post block.  */
> > diff --git a/gcc/testsuite/gfortran.dg/pr92050.f90 
> > b/gcc/testsuite/gfortran.dg/pr92050.f90
> > new file mode 100644
> > index 000..64193878d8f
> > --- /dev/null
> > +++ b/gcc/testsuite/gfortran.dg/pr92050.f90
> > @@ -0,0 +1,53 @@
> > +! { dg-do run }
> > +! { dg-options "-fcheck=all" }
> > +! { dg-shouldfail "above upper bound" }
> > +!
> > +! PR fortran/92050
> > +!
> > +!
> > +module buggy
> > +  implicit none (type, external)
> > +
> > +  type :: par
> > +  contains
> > +procedure, public :: fun => fun_par
> > +  end type par
> > +
> > +  type comp
> > +class(par), allocatable :: p
> > +  end type comp
> > +
> > +  type foo
> > +type(comp), allocatable :: m(:)
> > +  end type foo
> > +
> > +contains
> > +
> > +  function fun_par(this)
> > +class(par) :: this
> > +integer:: fun_par(1)
> > +fun_par = 42
> > +  end function fun_par
> > +
> > +  subroutine update_foo(this)
> > +class(foo) :: this
> > +write(*,*) this%m(1)%p%fun()
> > +  end subroutine update_foo
> > +
> > +  subroutine bad_update_foo(this)
> > +class(foo) :: this
> > +write(*,*) this%m(2)%p%fun()
> > +  end subroutine bad_update_foo
> > +end module buggy
> > +
> > +program main
> > +  use buggy
> > +  implicit none (type, external)
> > +  type(foo) :: x
> > +  allocate(x%m(1))
> > +  allocate(x%m(1)%p)
> > +  call update_foo(x)
> > +  call bad_update_foo(x)
> > +end program main
> > +
> > +! { dg-output "At line 39 of file .*pr92050.f90.*Fortran runtime error: 
> > Index '2' of dimension 1 of array 'this%m' above upper bound of 1" }

-- 
Steve
20170425 https://www.youtube.com/watch?v=VWUpyCsUKR4
20161221 https://www.youtube.com/watch?v=IbCHE-hONow


Re: [committed, amdgcn] Use GFX9 granulated sgprs count correctly

2019-11-22 Thread Tobias Burnus

Hi Andrew,

On 11/22/19 3:47 PM, Andrew Stubbs wrote:

--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -4922,6 +4922,14 @@ gcn_hsa_declare_function_name (FILE *file, const char 
*name, tree)
+  int granulated_sgprs;
+  if (TARGET_GCN3)
+granulated_sgprs = (sgpr + extra_regs + 7) / 8 - 1;
+  else if (TARGET_GCN5)
+granulated_sgprs = 2 * ((sgpr + extra_regs + 15) / 16 - 1);


Would it make sense to add here:

else
  gcc_unreachable ();

(TARGET_GCN5_PLUS would also work, but gcc_unreachable is probably better.)

Cheers,

Tobias



[committed, amdgcn] Use GFX9 granulated sgprs count correctly

2019-11-22 Thread Andrew Stubbs
I've committed the attached. The patch adjusts the GCN kernel metadata 
so that it is correct for GFX9 devices.


The existing implementation was correct for GFX8, and seems to work on 
GFX9, but wasn't technically correct.


--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Use GFX9 granulated sgprs count correctly.

2019-11-22  Andrew Stubbs  

	gcc/
	* config/gcn/gcn.c (gcn_hsa_declare_function_name): Calculate
	granulated_sgprs according to architecture.

diff --git a/gcc/config/gcn/gcn.c b/gcc/config/gcn/gcn.c
index 4401896d441..b34e8e7f5e2 100644
--- a/gcc/config/gcn/gcn.c
+++ b/gcc/config/gcn/gcn.c
@@ -4922,6 +4922,14 @@ gcn_hsa_declare_function_name (FILE *file, const char *name, tree)
 	sgpr = MAX_NORMAL_SGPR_COUNT - extra_regs;
 }
 
+  /* GFX8 allocates SGPRs in blocks of 8.
+ GFX9 uses blocks of 16.  */
+  int granulated_sgprs;
+  if (TARGET_GCN3)
+granulated_sgprs = (sgpr + extra_regs + 7) / 8 - 1;
+  else if (TARGET_GCN5)
+granulated_sgprs = 2 * ((sgpr + extra_regs + 15) / 16 - 1);
+
   fputs ("\t.align\t256\n", file);
   fputs ("\t.type\t", file);
   assemble_name (file, name);
@@ -4960,7 +4968,7 @@ gcn_hsa_declare_function_name (FILE *file, const char *name, tree)
 	   "\t\tcompute_pgm_rsrc2_excp_en = 0\n",
 	   (vgpr - 1) / 4,
 	   /* Must match wavefront_sgpr_count */
-	   (sgpr + extra_regs + 7) / 8 - 1,
+	   granulated_sgprs,
 	   /* The total number of SGPR user data registers requested.  This
 	  number must match the number of user data registers enabled.  */
 	   cfun->machine->args.nsgprs);


[PATCH][Arm] Enable CLI for Armv8.6-a: armv8.6-a, i8mm and bf16

2019-11-22 Thread Dennis Zhang
Hi all,

This patch is part of a series adding support for Armv8.6-A features.
It enables options including -march=armv8.6-a, +i8mm and +bf16.
The +i8mm and +bf16 features are optional for Armv8.2-a and onward.
Documents are at https://developer.arm.com/docs/ddi0596/latest

Regtested for arm-none-linux-gnueabi-armv8-a.

Please help to check if ready for trunk.

Many thanks!
Dennis

gcc/ChangeLog:

2019-11-15  Dennis Zhang  

* config/arm/arm-c.c (arm_cpu_builtins): Define
__ARM_FEATURE_MATMUL_INT8, __ARM_FEATURE_BF16_VECTOR_ARITHMETIC,
__ARM_FEATURE_BF16_SCALAR_ARITHMETIC, and
__ARM_BF16_FORMAT_ALTERNATIVE when enabled.
* config/arm/arm-cpus.in (armv8_6, i8mm, bf16): New features.
* config/arm/arm-tables.opt: Regenerated.
* config/arm/arm.c (arm_option_reconfigure_globals): Init
arm_arch_i8mm and arm_arch_bf16 to enable features.
* config/arm/arm.h (TARGET_I8MM): New macro.
(TARGET_BF16_FP, TARGET_BF16_SIMD): Likewise.
* config/arm/t-aprofile: Add matching rules for -march=armv8.6-a.
* config/arm/t-arm-elf (all_v8_archs): Add armv8.6-a.
* config/arm/t-multilib: Add matching rules for -march=armv8.6-a.
(v8_6_a_simd_variants): New.
(v8_*_a_simd_variants): Add i8mm and bf16.
* doc/invoke.texi (armv8.6-a, i8mm, bf16): Document new options.

gcc/testsuite/ChangeLog:

2019-11-15  Dennis Zhang  

* gcc.target/arm/multilib.exp: Add combination tests for armv8.6-a.
diff --git a/gcc/config/arm/arm-c.c b/gcc/config/arm/arm-c.c
index c4485ce7af1..b47e64c2151 100644
--- a/gcc/config/arm/arm-c.c
+++ b/gcc/config/arm/arm-c.c
@@ -225,6 +225,14 @@ arm_cpu_builtins (struct cpp_reader* pfile)
 
   builtin_define_with_int_value ("__ARM_FEATURE_COPROC", coproc_level);
 }
+
+  def_or_undef_macro (pfile, "__ARM_FEATURE_MATMUL_INT8", TARGET_I8MM);
+  def_or_undef_macro (pfile, "__ARM_FEATURE_BF16_SCALAR_ARITHMETIC",
+		  TARGET_BF16_FP);
+  def_or_undef_macro (pfile, "__ARM_FEATURE_BF16_VECTOR_ARITHMETIC",
+		  TARGET_BF16_SIMD);
+  def_or_undef_macro (pfile, "__ARM_BF16_FORMAT_ALTERNATIVE",
+		  TARGET_BF16_FP || TARGET_BF16_SIMD);
 }
 
 void
diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 50379a0a10a..d373406649c 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -123,6 +123,9 @@ define feature armv8_4
 # Architecture rel 8.5.
 define feature armv8_5
 
+# Architecture rel 8.6.
+define feature armv8_6
+
 # M-Profile security extensions.
 define feature cmse
 
@@ -191,6 +194,12 @@ define feature sb
 # v8-A architectures, added by default from v8.5-A
 define feature predres
 
+# 8-bit Integer Matrix Multiply extension. Optional from v8.2-A.
+define feature i8mm
+
+# Brain half-precision floating-point extension. Optional from v8.2-A.
+define feature bf16
+
 # Feature groups.  Conventionally all (or mostly) upper case.
 # ALL_FPU lists all the feature bits associated with the floating-point
 # unit; these will all be removed if the floating-point unit is disabled
@@ -213,7 +222,7 @@ define fgroup ALL_CRYPTO	crypto
 # strip off 32 D-registers, but does not remove support for
 # double-precision FP.
 define fgroup ALL_SIMD_INTERNAL	fp_d32 neon ALL_CRYPTO
-define fgroup ALL_SIMD	ALL_SIMD_INTERNAL dotprod fp16fml
+define fgroup ALL_SIMD	ALL_SIMD_INTERNAL dotprod fp16fml i8mm
 
 # List of all FPU bits to strip out if -mfpu is used to override the
 # default.  fp16 is deliberately missing from this list.
@@ -253,6 +262,7 @@ define fgroup ARMv8_2aARMv8_1a armv8_2
 define fgroup ARMv8_3aARMv8_2a armv8_3
 define fgroup ARMv8_4aARMv8_3a armv8_4
 define fgroup ARMv8_5aARMv8_4a armv8_5 sb predres
+define fgroup ARMv8_6aARMv8_5a armv8_6
 define fgroup ARMv8m_base ARMv6m armv8 cmse tdiv
 define fgroup ARMv8m_main ARMv7m armv8 cmse
 define fgroup ARMv8r  ARMv8a
@@ -560,6 +570,8 @@ begin arch armv8.2-a
  option dotprod add FP_ARMv8 DOTPROD
  option sb add sb
  option predres add predres
+ option i8mm add i8mm FP_ARMv8 NEON
+ option bf16 add bf16 FP_ARMv8 NEON
 end arch armv8.2-a
 
 begin arch armv8.3-a
@@ -577,6 +589,8 @@ begin arch armv8.3-a
  option dotprod add FP_ARMv8 DOTPROD
  option sb add sb
  option predres add predres
+ option i8mm add i8mm FP_ARMv8 NEON
+ option bf16 add bf16 FP_ARMv8 NEON
 end arch armv8.3-a
 
 begin arch armv8.4-a
@@ -592,6 +606,8 @@ begin arch armv8.4-a
  option nofp remove ALL_FP
  option sb add sb
  option predres add predres
+ option i8mm add i8mm FP_ARMv8 NEON
+ option bf16 add bf16 FP_ARMv8 NEON
 end arch armv8.4-a
 
 begin arch armv8.5-a
@@ -605,8 +621,25 @@ begin arch armv8.5-a
  option crypto add FP_ARMv8 CRYPTO DOTPROD
  option nocrypto remove ALL_CRYPTO
  option nofp remove ALL_FP
+ option i8mm add i8mm FP_ARMv8 NEON
+ option bf16 add bf16 FP_ARMv8 NEON
 end arch armv8.5-a
 
+begin arch armv8.6-a
+ tune for cortex-a53
+ tune flags CO_PROC
+ base 8A
+ profile A
+ isa ARMv8_6a
+ 

Re: Ping: [PATCH V6] Extend IPA-CP to support arithmetically-computed value-passing on by-ref argument (PR ipa/91682)

2019-11-22 Thread Jan Hubicka
> Hi Honza,
> > 
> > > I checked update_jump_functions_after_inlining(), and found one
> > suspicious place:
> > >
> > >   for (i = 0; i < count; i++)
> > > {
> > >   struct ipa_jump_func *dst = ipa_get_ith_jump_func (args, i);
> > >   if (!top)
> > > {
> > >   ipa_set_jf_unknown (dst);
> > >   <   we should also invalidate dst->agg.items.
> > This is a good catch. In meantime a smaller testcase surfaces in
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92528
> > I am re-building Firefox with the patch I attache to the PR.
> > 
> 
> Just curious how this went. Are the firefox issue sorted or is there 
> something still to be done?

Yes, Firefox is working now for me and in fact I am just in the progress
of updating my branch at mozilla try servers to do some benchmarking
(still have few warnings to cache etc.).

There was additional interesting consequence of the patch in making
inliner noticeably slower both on cc1 compilation and Firefox:
propagating a lot more values made it to consider a lot more inlining
contextes.  So I finally pushed out patches treating non-linearities
there.

What remains to do is to fix the value ranges - ipa_set_jf_unknown
should not invalidate them.   I will try to do that soon.

Honza
> 
> Thanks,
> Tamar
> 
> > Honza


[PATCH] Fix type handling in undistribute_bitref_for_vector (PR tree-optimization/92618)

2019-11-22 Thread Jakub Jelinek
Hi!

The undistribute_bitref_for_vector function assumes that BIT_FIELD_REFs
with vector first argument must have always type of the vector element,
but that is not the case, BIT_FIELD_REF can extract any type with the
corresponding size from any other vector.

So, without this patch, when it saw e.g. addition in unsigned long long
type, even when the vector was 2x long long, it computed the addition
in 2x long long vector (in theory could run into undefined overflows)
and fail on IL checking because conversion from long long to unsigned long
long is not useless.  Even worse, for double addition with the vector 2x
long long, it would again perform the addition in 2x long long vector type,
which is completely wrong.

The following patch determines the right vector type and adds VCE around
the operands when needed.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-11-22  Jakub Jelinek  

PR tree-optimization/92618
* tree-ssa-reassoc.c (v_info): Change from auto_vec to a struct
containing the auto_vec and a tree.
(undistribute_bitref_for_vector): Handle the case when element type
of vec is not the same as type of the BIT_FIELD_REF.  Formatting
fixes.

* gcc.c-torture/compile/pr92618.c: New test.
* gcc.c-torture/execute/pr92618.c: New test.

--- gcc/tree-ssa-reassoc.c.jj   2019-11-13 10:54:56.038884580 +0100
+++ gcc/tree-ssa-reassoc.c  2019-11-22 10:57:24.956931307 +0100
@@ -1775,7 +1775,10 @@ undistribute_ops_list (enum tree_code op
first: element index for each relevant BIT_FIELD_REF.
second: the index of vec ops* for each relevant BIT_FIELD_REF.  */
 typedef std::pair v_info_elem;
-typedef auto_vec v_info;
+struct v_info {
+  tree vec_type;
+  auto_vec vec;
+};
 typedef v_info *v_info_ptr;
 
 /* Comparison function for qsort on VECTOR SSA_NAME trees by machine mode.  */
@@ -1840,8 +1843,11 @@ undistribute_bitref_for_vector (enum tre
   if (ops->length () <= 1)
 return false;
 
-  if (opcode != PLUS_EXPR && opcode != MULT_EXPR && opcode != BIT_XOR_EXPR
-  && opcode != BIT_IOR_EXPR && opcode != BIT_AND_EXPR)
+  if (opcode != PLUS_EXPR
+  && opcode != MULT_EXPR
+  && opcode != BIT_XOR_EXPR
+  && opcode != BIT_IOR_EXPR
+  && opcode != BIT_AND_EXPR)
 return false;
 
   hash_map v_info_map;
@@ -1879,9 +1885,45 @@ undistribute_bitref_for_vector (enum tre
   if (!TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
continue;
 
+  if (VECTOR_TYPE_P (TREE_TYPE (rhs))
+ || !is_a  (TYPE_MODE (TREE_TYPE (rhs
+   continue;
+
+  /* The type of BIT_FIELD_REF might not be equal to the element type of
+the vector.  We want to use a vector type with element type the
+same as the BIT_FIELD_REF and size the same as TREE_TYPE (vec).  */
+  if (!useless_type_conversion_p (TREE_TYPE (rhs), TREE_TYPE (vec_type)))
+   {
+ machine_mode simd_mode;
+ unsigned HOST_WIDE_INT size, nunits;
+ unsigned HOST_WIDE_INT elem_size
+   = tree_to_uhwi (TYPE_SIZE (TREE_TYPE (rhs)));
+ if (!GET_MODE_BITSIZE (TYPE_MODE (vec_type)).is_constant ())
+   continue;
+ if (size <= elem_size || (size % elem_size) != 0)
+   continue;
+ nunits = size / elem_size;
+ if (!mode_for_vector (SCALAR_TYPE_MODE (TREE_TYPE (rhs)),
+   nunits).exists (_mode))
+   continue;
+ vec_type = build_vector_type_for_mode (TREE_TYPE (rhs), simd_mode);
+
+ /* Ignore it if target machine can't support this VECTOR type.  */
+ if (!VECTOR_MODE_P (TYPE_MODE (vec_type)))
+   continue;
+
+ /* Check const vector type, constrain BIT_FIELD_REF offset and
+size.  */
+ if (!TYPE_VECTOR_SUBPARTS (vec_type).is_constant ())
+   continue;
+
+ if (maybe_ne (GET_MODE_SIZE (TYPE_MODE (vec_type)),
+   GET_MODE_SIZE (TYPE_MODE (TREE_TYPE (vec)
+   continue;
+   }
+
   tree elem_type = TREE_TYPE (vec_type);
-  unsigned HOST_WIDE_INT elem_size
-   = TREE_INT_CST_LOW (TYPE_SIZE (elem_type));
+  unsigned HOST_WIDE_INT elem_size = tree_to_uhwi (TYPE_SIZE (elem_type));
   if (maybe_ne (bit_field_size (rhs), elem_size))
continue;
 
@@ -1898,8 +1940,13 @@ undistribute_bitref_for_vector (enum tre
   bool existed;
   v_info_ptr  = v_info_map.get_or_insert (vec, );
   if (!existed)
-   info = new v_info;
-  info->safe_push (std::make_pair (idx, i));
+   {
+ info = new v_info;
+ info->vec_type = vec_type;
+   }
+  else if (!types_compatible_p (vec_type, info->vec_type))
+   continue;
+  info->vec.safe_push (std::make_pair (idx, i));
 }
 
   /* At least two VECTOR to combine.  */
@@ -1919,14 +1966,15 @@ undistribute_bitref_for_vector (enum tre
 {
   tree cand_vec = (*it).first;
   v_info_ptr cand_info = 

[C++ PATCH] Fix ICE in build_ctor_subob_ref during replace_placeholders (PR c++/92524)

2019-11-22 Thread Jakub Jelinek
Hi!

On the following testcase, replace_placeholders is called on a CONSTRUCTOR
{ a, [1..63] = NON_LVALUE_EXPR<42> }
and call build_ctor_subob_ref in that case.  The problem is that index is
RANGE_EXPR and as it is not INTEGER_CST, we call
build_class_member_access_expr which is not appropriate for array indexes.
cp_build_array_ref will not really work with RANGE_EXPR either, the
following patch just usesthe low bound of the range.  At least when trying
to read it, it should be equivalent to any other value in the range.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-11-22  Jakub Jelinek  

PR c++/92524
* tree.c (build_ctor_subob_ref): For RANGE_EXPR index, build
array reference for the low bound.

* g++.dg/cpp0x/pr92524.C: New test.

--- gcc/cp/tree.c.jj2019-11-16 18:13:28.571821577 +0100
+++ gcc/cp/tree.c   2019-11-21 12:22:27.260660874 +0100
@@ -3064,6 +3064,10 @@ build_ctor_subob_ref (tree index, tree t
 obj = NULL_TREE;
   else if (TREE_CODE (index) == INTEGER_CST)
 obj = cp_build_array_ref (input_location, obj, index, tf_none);
+  else if (TREE_CODE (index) == RANGE_EXPR)
+/* Use low bound for ranges.  */
+obj = cp_build_array_ref (input_location, obj, TREE_OPERAND (index, 0),
+ tf_none);
   else
 obj = build_class_member_access_expr (obj, index, NULL_TREE,
  /*reference*/false, tf_none);
--- gcc/testsuite/g++.dg/cpp0x/pr92524.C.jj 2019-11-21 12:34:53.350406680 
+0100
+++ gcc/testsuite/g++.dg/cpp0x/pr92524.C2019-11-21 12:34:18.943925681 
+0100
@@ -0,0 +1,12 @@
+// PR c++/92524
+// { dg-do compile { target c++11 } }
+
+struct A { char a = '*'; };
+struct B { A b[64]; };
+
+void
+foo ()
+{
+  A a;
+  B{a};
+}

Jakub



[C/C++ PATCH] Fix up build of GCC 4.6 and earlier with GCC 9+ (PR c/90677, take 2)

2019-11-22 Thread Jakub Jelinek
On Wed, Nov 20, 2019 at 02:01:58PM -0500, Jason Merrill wrote:
> I would think that get_named_type should find struct or enum names that have
> been hidden by another declaration; that would fix this without
> special-casing cgraph_node.  For the C++ front-end, that would be
> 
> lookup_qualified_name (global_namespace, id, /*prefer_type*/2,
> /*complain*/false)

Ok, this patch does that, for C by going from I_TAG_BINDING to the global
level.  Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2019-11-22  Jakub Jelinek  

PR c/90677
* c-common.h (identifier_global_tag): Declare.
* c-format.c (get_pointer_to_named_type): Renamed to ...
(get_named_type): ... this.  Use identifier_global_tag instead of
identifier_global_value, handle the return value being a TYPE_P.
(init_dynamic_diag_info): Adjust get_pointer_to_named_type callers
to call get_named_type instead.  Formatting fixes.
c/
* c-decl.c (identifier_global_tag): Define.
cp/
* cp-objcp-common.c (identifier_global_tag): Define.
testsuite/
* c-c++-common/pr90677.c: New test.

--- gcc/c-family/c-common.h.jj  2019-11-15 09:28:56.904930082 +0100
+++ gcc/c-family/c-common.h 2019-11-22 12:47:45.261379415 +0100
@@ -811,6 +811,7 @@ extern void c_register_addr_space (const
 extern bool in_late_binary_op;
 extern const char *c_addr_space_name (addr_space_t as);
 extern tree identifier_global_value (tree);
+extern tree identifier_global_tag (tree);
 extern bool names_builtin_p (const char *);
 extern tree c_linkage_bindings (tree);
 extern void record_builtin_type (enum rid, const char *, tree);
--- gcc/c-family/c-format.c.jj  2019-11-19 22:26:45.842300595 +0100
+++ gcc/c-family/c-format.c 2019-11-22 13:12:12.765658215 +0100
@@ -4899,31 +4899,32 @@ init_dynamic_gfc_info (void)
 }
 }
 
-/* Lookup the type named NAME and return a pointer-to-NAME type if found.
-   Otherwise, return void_type_node if NAME has not been used yet, or 
NULL_TREE if
-   NAME is not a type (issuing an error).  */
+/* Lookup the type named NAME and return a NAME type if found.
+   Otherwise, return void_type_node if NAME has not been used yet,
+   or NULL_TREE if NAME is not a type (issuing an error).  */
 
 static tree
-get_pointer_to_named_type (const char *name)
+get_named_type (const char *name)
 {
-  tree result;
-  if ((result = maybe_get_identifier (name)))
+  if (tree result = maybe_get_identifier (name))
 {
-  result = identifier_global_value (result);
+  result = identifier_global_tag (result);
   if (result)
{
- if (TREE_CODE (result) != TYPE_DECL)
+ if (TYPE_P (result))
+   ;
+ else if (TREE_CODE (result) == TYPE_DECL)
+   result = TREE_TYPE (result);
+ else
{
  error ("%qs is not defined as a type", name);
  result = NULL_TREE;
}
- else
-   result = TREE_TYPE (result);
}
+  return result;
 }
   else
-result = void_type_node;
-  return result;
+return void_type_node;
 }
 
 /* Determine the types of "tree" and "location_t" in the code being
@@ -4953,23 +4954,24 @@ init_dynamic_diag_info (void)
 an extra type level.  */
   if ((local_tree_type_node = maybe_get_identifier ("tree")))
{
- local_tree_type_node = identifier_global_value (local_tree_type_node);
+ local_tree_type_node
+   = identifier_global_value (local_tree_type_node);
  if (local_tree_type_node)
{
  if (TREE_CODE (local_tree_type_node) != TYPE_DECL)
{
  error ("% is not defined as a type");
- local_tree_type_node = 0;
+ local_tree_type_node = NULL_TREE;
}
  else if (TREE_CODE (TREE_TYPE (local_tree_type_node))
   != POINTER_TYPE)
{
  error ("% is not defined as a pointer type");
- local_tree_type_node = 0;
+ local_tree_type_node = NULL_TREE;
}
  else
-   local_tree_type_node =
- TREE_TYPE (TREE_TYPE (local_tree_type_node));
+   local_tree_type_node
+ = TREE_TYPE (TREE_TYPE (local_tree_type_node));
}
}
   else
@@ -4979,12 +4981,12 @@ init_dynamic_diag_info (void)
   /* Similar to the above but for gimple*.  */
   if (!local_gimple_ptr_node
   || local_gimple_ptr_node == void_type_node)
-local_gimple_ptr_node = get_pointer_to_named_type ("gimple");
+local_gimple_ptr_node = get_named_type ("gimple");
 
   /* Similar to the above but for cgraph_node*.  */
   if (!local_cgraph_node_ptr_node
   || local_cgraph_node_ptr_node == void_type_node)
-local_cgraph_node_ptr_node = get_pointer_to_named_type ("cgraph_node");
+local_cgraph_node_ptr_node = get_named_type ("cgraph_node");
 
   

[C++ PATCH] Fix concepts vs. PCH (PR c++/92458, take 2)

2019-11-22 Thread Jakub Jelinek
On Wed, Nov 20, 2019 at 07:46:13PM -0500, Jason Merrill wrote:
> > If decl_tree_cache_map will be needed in more than one spot, I'll probably
> > need to move it to some generic header.
> 
> Most of them probably need it, for code that uses the relevant features.
> Except debug_type_map, which probably needs to use TYPE_UID.
> 
> Or we might make default_hash_traits use DECL_UID for decls and
> TYPE_UID for types even if it doesn't do the more complex analysis of
> tree_operand_hash.

Finding out what is a type and what is decl (and what to do with rest?)
at runtime would be more costly than necessary.

The following updated patch moves the decl_tree_cache_map to tree.h and
adds there type_tree_cache_map too and uses it in all the tree_cache_map
spots in the C++ FE.
tree-hash-traits.h can't be included from tree.h, because operand_equal_p is
declared elsewhere, so instead this patch moves some of the
tree-hash-traits.h traits from that header to tree.h.  I could move there
just tree_decl_hash if moving the other two doesn't look right to you.

Bootstrapped/regtested on x86_64-linux and i686-linux, additionally tested
by compiling stdc++.h as PCH with -std=c++2a and using it.  Ok for trunk?

2019-11-22  Jakub Jelinek  

PR c++/92458
* tree-hash-traits.h (tree_decl_hash, tree_ssa_name_hash,
tree_hash): Move to ...
* tree.h (tree_decl_hash, tree_ssa_name_hash, tree_hash): ... here.
(struct decl_tree_cache_traits, struct type_tree_cache_traits): New
types.
(decl_tree_cache_map, tree_tree_cache_map): New typedefs.

* init.c (nsdmi_inst): Change type to
decl_tree_cache_map * from tree_cache_map *.
* constraint.cc (decl_constraints): Likewise.
* decl.c (get_tuple_decomp_init): Likewise.
* pt.c (defarg_inst, explicit_specifier_map): Likewise.
(tsubst_default_argument, store_explicit_specifier): Use
decl_tree_cache_map::create_ggc rather than
tree_cache_map::create_ggc.
* cp-objcp-common.c (debug_type_map): Change type to
type_tree_cache_map * from tree_cache_map *.

* g++.dg/pch/pr92458.C: New test.
* g++.dg/pch/pr92458.Hs: New test.

--- gcc/tree-hash-traits.h.jj   2019-01-01 12:37:16.902979134 +0100
+++ gcc/tree-hash-traits.h  2019-11-22 12:00:01.538725844 +0100
@@ -41,44 +41,4 @@ tree_operand_hash::equal (const value_ty
   return operand_equal_p (t1, t2, 0);
 }
 
-/* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID
-   is a better hash than the pointer value and gives a predictable traversal
-   order.  */
-struct tree_decl_hash : ggc_ptr_hash 
-{
-  static inline hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_decl_hash::hash (tree t)
-{
-  return DECL_UID (t);
-}
-
-/* Hash for SSA_NAMEs in the same function.  Pointer equality is enough
-   here, but the SSA_NAME_VERSION is a better hash than the pointer
-   value and gives a predictable traversal order.  */
-struct tree_ssa_name_hash : ggc_ptr_hash 
-{
-  static inline hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_ssa_name_hash::hash (tree t)
-{
-  return SSA_NAME_VERSION (t);
-}
-
-/* Hasher for general trees, based on their TREE_HASH.  */
-struct tree_hash : ggc_ptr_hash 
-{
-  static hashval_t hash (tree);
-};
-
-inline hashval_t
-tree_hash::hash (tree t)
-{
-  return TREE_HASH (t);
-}
-
 #endif
--- gcc/tree.h.jj   2019-11-15 00:37:26.293070143 +0100
+++ gcc/tree.h  2019-11-22 12:00:17.479478830 +0100
@@ -5351,6 +5351,58 @@ struct tree_decl_map_cache_hasher : ggc_
 #define tree_vec_map_hash tree_decl_map_hash
 #define tree_vec_map_marked_p tree_map_base_marked_p
 
+/* Hasher for tree decls.  Pointer equality is enough here, but the DECL_UID
+   is a better hash than the pointer value and gives a predictable traversal
+   order.  Additionally it can be used across PCH save/restore.  */
+struct tree_decl_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_decl_hash::hash (tree t)
+{
+  return DECL_UID (t);
+}
+
+/* Similarly for types.  Uses TYPE_UID as hash function.  */
+struct tree_type_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_type_hash::hash (tree t)
+{
+  return TYPE_UID (t);
+}
+
+/* Hash for SSA_NAMEs in the same function.  Pointer equality is enough
+   here, but the SSA_NAME_VERSION is a better hash than the pointer
+   value and gives a predictable traversal order.  */
+struct tree_ssa_name_hash : ggc_ptr_hash 
+{
+  static inline hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_ssa_name_hash::hash (tree t)
+{
+  return SSA_NAME_VERSION (t);
+}
+
+/* Hasher for general trees, based on their TREE_HASH.  */
+struct tree_hash : ggc_ptr_hash 
+{
+  static hashval_t hash (tree);
+};
+
+inline hashval_t
+tree_hash::hash (tree t)
+{
+  return TREE_HASH (t);
+}
+
 /* A hash_map of two trees for use with GTY((cache)).  Garbage collection for
such a map will not 

Re: [PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Uros Bizjak
On Fri, Nov 22, 2019 at 1:58 PM Bernd Schmidt  wrote:
>
> A patch I posted recently fixes combine to take costs of JUMP_INSNs into
> account. That causes the pr30315 test to fail with -m32, since the cost
> of an add that sets the flags is estimated too high.
>
> The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?

I think that the intention of the code below  "The embedded comparison
operand is completely free." comment is to handle this case. It looks
that it should return the cost of the inside operation of COMPARE rtx.

Uros.


[PATCH ix86] Fix rtx_costs for flag-setting adds

2019-11-22 Thread Bernd Schmidt
A patch I posted recently fixes combine to take costs of JUMP_INSNs into
account. That causes the pr30315 test to fail with -m32, since the cost
of an add that sets the flags is estimated too high.

The following seems to fix it.  Bootstrapped and tested on x86_64-linux, ok?


Bernd

	* config/i386/i386.c (ix86_rtx_costs): For a PLUS inside a COMPARE,
	representing an add that sets the flags, count just the PLUS.

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 7115ec44c2a..6e48f5ccbde 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -19500,6 +19500,11 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
 		+ rtx_cost (const1_rtx, mode, outer_code, opno, speed));
 	  return true;
 	}
+  if (GET_CODE (XEXP (x, 0)) == PLUS)
+	{
+	  *total = rtx_cost (XEXP (x, 0), mode, COMPARE, 0, speed);
+	  return true;
+	}
 
   /* The embedded comparison operand is completely free.  */
   if (!general_operand (XEXP (x, 0), GET_MODE (XEXP (x, 0)))


RE: Ping: [PATCH V6] Extend IPA-CP to support arithmetically-computed value-passing on by-ref argument (PR ipa/91682)

2019-11-22 Thread Tamar Christina
Hi Honza,
> 
> > I checked update_jump_functions_after_inlining(), and found one
> suspicious place:
> >
> >   for (i = 0; i < count; i++)
> > {
> >   struct ipa_jump_func *dst = ipa_get_ith_jump_func (args, i);
> >   if (!top)
> > {
> >   ipa_set_jf_unknown (dst);
> >   <   we should also invalidate dst->agg.items.
> This is a good catch. In meantime a smaller testcase surfaces in
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92528
> I am re-building Firefox with the patch I attache to the PR.
> 

Just curious how this went. Are the firefox issue sorted or is there something 
still to be done?

Thanks,
Tamar

> Honza


Re: [patch,Fortran] PR 92050 - fix ICE with -fcheck=all

2019-11-22 Thread Tobias Burnus

Hi all,

I was asked to backport this fix to the GCC 9 branch – given that 
-fcheck=bounds is a common option and it fails with code like 
genecode.org. Given that ICEs with -fcheck are a regression and that the 
patch is not that invasive, I am inclined to accept it.


Comments/(dis)approvals?

Tobias

On 10/11/19 12:17 PM, Tobias Burnus wrote:
Checking produces extra code (unsurprisingly); this code needs to end 
up in the program…


Bootstrapped on x86-64_gnu-linux. OK for the trunk?

Tobias


pr92050.diff

diff --git a/gcc/fortran/trans-expr.c b/gcc/fortran/trans-expr.c
index 965ab7786a1..65238ff623d 100644
--- a/gcc/fortran/trans-expr.c
+++ b/gcc/fortran/trans-expr.c
@@ -7031,8 +7031,11 @@ gfc_conv_procedure_call (gfc_se * se, gfc_symbol * sym,
gfc_allocate_lang_decl (result);
GFC_DECL_SAVED_DESCRIPTOR (result) = parmse.expr;
gfc_free_expr (class_expr);
-  gcc_assert (parmse.pre.head == NULL_TREE
- && parmse.post.head == NULL_TREE);
+  /* -fcheck= can add diagnostic code, which has to be placed before
+the call. */
+  if (parmse.pre.head != NULL)
+ gfc_add_expr_to_block (>pre, parmse.pre.head);
+  gcc_assert (parmse.post.head == NULL_TREE);
  }
  
/* Follow the function call with the argument post block.  */

diff --git a/gcc/testsuite/gfortran.dg/pr92050.f90 
b/gcc/testsuite/gfortran.dg/pr92050.f90
new file mode 100644
index 000..64193878d8f
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr92050.f90
@@ -0,0 +1,53 @@
+! { dg-do run }
+! { dg-options "-fcheck=all" }
+! { dg-shouldfail "above upper bound" }
+!
+! PR fortran/92050
+!
+!
+module buggy
+  implicit none (type, external)
+
+  type :: par
+  contains
+procedure, public :: fun => fun_par
+  end type par
+
+  type comp
+class(par), allocatable :: p
+  end type comp
+
+  type foo
+type(comp), allocatable :: m(:)
+  end type foo
+
+contains
+
+  function fun_par(this)
+class(par) :: this
+integer:: fun_par(1)
+fun_par = 42
+  end function fun_par
+
+  subroutine update_foo(this)
+class(foo) :: this
+write(*,*) this%m(1)%p%fun()
+  end subroutine update_foo
+
+  subroutine bad_update_foo(this)
+class(foo) :: this
+write(*,*) this%m(2)%p%fun()
+  end subroutine bad_update_foo
+end module buggy
+
+program main
+  use buggy
+  implicit none (type, external)
+  type(foo) :: x
+  allocate(x%m(1))
+  allocate(x%m(1)%p)
+  call update_foo(x)
+  call bad_update_foo(x)
+end program main
+
+! { dg-output "At line 39 of file .*pr92050.f90.*Fortran runtime error: Index '2' 
of dimension 1 of array 'this%m' above upper bound of 1" }


[committed] Disable epilogue loop vectorisation for vect-widen-mult-u8-*.c

2019-11-22 Thread Richard Sandiford
vect-widen-mult-u8.c and vect-widen-mult-u8-u32.c were failing
on arm-linux-gnueabihf with epilogue vectorisation because we
print the expected messages twice rather than once.  We could
fix that either by removing the counts or by disabling epilogue
loop vectorisation.  The other vect-widen-mult-* tests do the
latter, so I did the same here.

Tested on arm-linux-gnueabihf, aarch64-linux-gnu and x86_64-linux-gnu.
Applied as obvious.

Richard


2019-11-22  Richard Sandiford  

gcc/testsuite/
* gcc.dg/vect/vect-widen-mult-u8.c: Disable epilogue loop
vectorization.
* gcc.dg/vect/vect-widen-mult-u8-u32.c: Likewise.

Index: gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8.c
===
--- gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8.c  2019-03-08 
18:15:02.272871214 +
+++ gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8.c  2019-11-22 
12:02:53.750626899 +
@@ -1,3 +1,4 @@
+/* { dg-additional-options "--param vect-epilogues-nomask=0" } */
 /* { dg-require-effective-target vect_int } */
 
 #include 
Index: gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c
===
--- gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c  2019-03-08 
18:15:02.300871108 +
+++ gcc/testsuite/gcc.dg/vect/vect-widen-mult-u8-u32.c  2019-11-22 
12:02:53.750626899 +
@@ -1,3 +1,4 @@
+/* { dg-additional-options "--param vect-epilogues-nomask=0" } */
 /* { dg-require-effective-target vect_int } */
 
 #include 


Re: [SVE] PR89007 - Implement generic vector average expansion

2019-11-22 Thread Prathamesh Kulkarni
On Wed, 20 Nov 2019 at 16:54, Richard Biener  wrote:
>
> On Wed, 20 Nov 2019, Richard Sandiford wrote:
>
> > Hi,
> >
> > Thanks for doing this.  Adding Richard on cc:, since the SVE subject
> > tag might have put him off.  There's not really anything SVE-specific
> > here apart from the testcase.
>
> Ah.
>
> > > 2019-11-19  Prathamesh Kulkarni  
> > >
> > > PR tree-optimization/89007
> > > * tree-vect-patterns.c (vect_recog_average_pattern): If there is no
> > > target support available, generate code to distribute rshift over plus
> > > and add one depending upon floor or ceil rounding.
> > >
> > > testsuite/
> > > * gcc.target/aarch64/sve/pr89007.c: New test.
> > >
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr89007-1.c 
> > > b/gcc/testsuite/gcc.target/aarch64/sve/pr89007-1.c
> > > new file mode 100644
> > > index 000..32095c63c61
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr89007-1.c
> > > @@ -0,0 +1,29 @@
> > > +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
> > > +/* { dg-options "-O -ftree-vectorize -march=armv8.2-a+sve --save-temps" 
> > > } */
> > > +/* { dg-final { check-function-bodies "**" "" } } */
> > > +
> > > +#define N 1024
> > > +unsigned char dst[N];
> > > +unsigned char in1[N];
> > > +unsigned char in2[N];
> > > +
> > > +/*
> > > +**  foo:
> > > +** ...
> > > +** lsr (z[0-9]+\.b), z[0-9]+\.b, #1
> > > +** lsr (z[0-9]+\.b), z[0-9]+\.b, #1
> > > +** add (z[0-9]+\.b), \1, \2
> > > +** orr (z[0-9]+)\.d, z[0-9]+\.d, z[0-9]+\.d
> > > +** and (z[0-9]+\.b), \4\.b, #0x1
> > > +** add z0.b, \3, \5
> >
> > It'd probably be more future-proof to allow (\1, \2|\2, \1) and
> > (\3, \5|\5, \3).  Same for the other testcase.
> >
> > > +** ...
> > > +*/
> > > +void
> > > +foo ()
> > > +{
> > > +  for( int x = 0; x < N; x++ )
> > > +dst[x] = (in1[x] + in2[x] + 1) >> 1;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not {\tuunpklo\t} } } */
> > > +/* { dg-final { scan-assembler-not {\tuunpkhi\t} } } */
> > > diff --git a/gcc/testsuite/gcc.target/aarch64/sve/pr89007-2.c 
> > > b/gcc/testsuite/gcc.target/aarch64/sve/pr89007-2.c
> > > new file mode 100644
> > > index 000..cc40f45046b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/aarch64/sve/pr89007-2.c
> > > @@ -0,0 +1,29 @@
> > > +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
> > > +/* { dg-options "-O -ftree-vectorize -march=armv8.2-a+sve --save-temps" 
> > > } */
> > > +/* { dg-final { check-function-bodies "**" "" } } */
> > > +
> > > +#define N 1024
> > > +unsigned char dst[N];
> > > +unsigned char in1[N];
> > > +unsigned char in2[N];
> > > +
> > > +/*
> > > +**  foo:
> > > +** ...
> > > +** lsr (z[0-9]+\.b), z[0-9]+\.b, #1
> > > +** lsr (z[0-9]+\.b), z[0-9]+\.b, #1
> > > +** add (z[0-9]+\.b), \1, \2
> > > +** and (z[0-9]+)\.d, z[0-9]+\.d, z[0-9]+\.d
> > > +** and (z[0-9]+\.b), \4\.b, #0x1
> > > +** add z0.b, \3, \5
> > > +** ...
> > > +*/
> > > +void
> > > +foo ()
> > > +{
> > > +  for( int x = 0; x < N; x++ )
> > > +dst[x] = (in1[x] + in2[x]) >> 1;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not {\tuunpklo\t} } } */
> > > +/* { dg-final { scan-assembler-not {\tuunpkhi\t} } } */
> > > diff --git a/gcc/tree-vect-patterns.c b/gcc/tree-vect-patterns.c
> > > index 8ebbcd76b64..7025a3b4dc2 100644
> > > --- a/gcc/tree-vect-patterns.c
> > > +++ b/gcc/tree-vect-patterns.c
> > > @@ -2019,22 +2019,59 @@ vect_recog_average_pattern (stmt_vec_info 
> > > last_stmt_info, tree *type_out)
> >
> > >/* Check for target support.  */
> > >tree new_vectype = get_vectype_for_scalar_type (vinfo, new_type);
> > > -  if (!new_vectype
> > > -  || !direct_internal_fn_supported_p (ifn, new_vectype,
> > > - OPTIMIZE_FOR_SPEED))
> > > +
> > > +  if (!new_vectype)
> > >  return NULL;
> > >
> > > +  bool ifn_supported
> > > += direct_internal_fn_supported_p (ifn, new_vectype, 
> > > OPTIMIZE_FOR_SPEED);
> > > +
> > >/* The IR requires a valid vector type for the cast result, even though
> > >   it's likely to be discarded.  */
> > >*type_out = get_vectype_for_scalar_type (vinfo, type);
> > >if (!*type_out)
> > >  return NULL;
> >  >
> > > -  /* Generate the IFN_AVG* call.  */
> > >tree new_var = vect_recog_temp_ssa_var (new_type, NULL);
> > >tree new_ops[2];
> > >vect_convert_inputs (last_stmt_info, 2, new_ops, new_type,
> > >unprom, new_vectype);
> > > +
> > > +  if (!ifn_supported)
> >
> > I guess this is personal preference, but I'm not sure there's much
> > benefit to splitting this out of the "if" into a separate variable.
> >
> > > +{
> > > +  /* If there is no target support available, generate code
> > > +to distribute rshift over plus and add one depending
> > > +upon floor or ceil rounding.  */
> >
> > Maybe "and add a carry"?  It'd be good to write out the expansion in
> > 

Re: [Patch, Fortran] dec comparisons - for approval

2019-11-22 Thread Tobias Burnus
First, can you also create some news quip for your GCC 10 changes for 
https://gcc.gnu.org/gcc-10/changes.html ? See 
https://gcc.gnu.org/about.html#git for the repository to diff against.


On 11/21/19 12:05 PM, Mark Eggleston wrote:
Please find attached an updated version of the patch originally 
submitted on 15th November. The comparisons with Holleriths are no 
only possible when using -fdec.

s/no/now/
The test cases have been revised to check for errors when -fdec is not 
used.

OK to commit in stage 3 or delay until stage 1 in the new year?


LGTM – The first version of the patch has been posted before Stage 3, 
which also has just started, and is very restricted – all are good 
reasons that GCC 10 trunk is fine.


Cheers,

Tobias


Change logs:
gcc/fortran

    Mark Eggleston 
    Jim MacArthur 

    * gfortran.texi: Update Hollerith constants support for character 
types

    and use in comparisons.
    * invoke.texi: Tidy up list of options. Update description of
    -fdec-char-conversions.
    * resolve.c (is_character_based): New.
    (Convert_hollerith_to_character): New. (convert_to_numeric): New.
    (resolve_operator): If both sides are character based and -fdec is
    enabled convert Hollerith to character. If an operand is 
Hollerith, the

    other is numeric and -fdec is enabled convert to numeric.
    (resolve_ordinary_assign): Add check for -fdec-char-conversions for
    assignment of character literals.

gcc/testsuite

    Mark Eggleston 
    Jim MacArthur 

    * gfortran.dg/dec-comparison-character_1.f90: New test.
    * gfortran.dg/dec-comparison-character_2.f90: New test.
    * gfortran.dg/dec-comparison-character_3.f90: New test.
    * gfortran.dg/dec-comparison-complex_1.f90: New test.
    * gfortran.dg/dec-comparison-complex_2.f90: New test.
    * gfortran.dg/dec-comparison-complex_3.f90: New test.
    * gfortran.dg/dec-comparison-int_1.f90: New test.
    * gfortran.dg/dec-comparison-int_2.f90: New test.
    * gfortran.dg/dec-comparison-int_3.f90: New test.
    * gfortran.dg/dec-comparison-real_1.f90: New test.
    * gfortran.dg/dec-comparison-real_2.f90: New test.
    * gfortran.dg/dec-comparison-real_3.f90: New test.
    * gfortran.dg/dec-comparison.f90: New test.

--
https://www.codethink.co.uk/privacy.html


0001-dec-comparisons.patch

 From 2fc6d83614d7f58620a9a9662e9972b5d4018ed1 Mon Sep 17 00:00:00 2001
From: Mark Eggleston
Date: Thu, 23 May 2019 09:42:26 +0100
Subject: [PATCH] dec comparisons

Allow comparison of Hollerith constants with numeric and character
expressions. Also allow comparison of character literalsa with numeric
expressions.

Enable using -fdec-comparisons or -fdec
---
  gcc/fortran/gfortran.texi  | 32 +
  gcc/fortran/invoke.texi| 24 +-
  gcc/fortran/resolve.c  | 54 +-
  .../gfortran.dg/dec-comparison-character_1.f90 | 18 
  .../gfortran.dg/dec-comparison-character_2.f90 | 18 
  .../gfortran.dg/dec-comparison-character_3.f90 | 26 +++
  .../gfortran.dg/dec-comparison-complex_1.f90   | 17 +++
  .../gfortran.dg/dec-comparison-complex_2.f90   | 14 ++
  .../gfortran.dg/dec-comparison-complex_3.f90   | 18 
  gcc/testsuite/gfortran.dg/dec-comparison-int_1.f90 | 22 +
  gcc/testsuite/gfortran.dg/dec-comparison-int_2.f90 | 18 
  gcc/testsuite/gfortran.dg/dec-comparison-int_3.f90 | 26 +++
  .../gfortran.dg/dec-comparison-real_1.f90  | 22 +
  .../gfortran.dg/dec-comparison-real_2.f90  | 18 
  .../gfortran.dg/dec-comparison-real_3.f90  | 26 +++
  gcc/testsuite/gfortran.dg/dec-comparison.f90   | 41 
  16 files changed, 371 insertions(+), 23 deletions(-)
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-character_1.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-character_2.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-character_3.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-complex_1.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-complex_2.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-complex_3.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-int_1.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-int_2.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-int_3.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-real_1.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-real_2.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison-real_3.f90
  create mode 100644 gcc/testsuite/gfortran.dg/dec-comparison.f90

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index a34ac5aa1bf..96be58b992d 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -1916,14 +1916,14 @@ in I/O 

Re: [PATCH] Support multi-versioning on self-recursive function (ipa/92133)

2019-11-22 Thread Martin Jambor
Hi,

On Fri, Nov 15 2019, Feng Xue OS wrote:
> Honza,
>
> I made some changes: do not penalize self-recursive function, and
> add --param ipa-cp-min-recursive-probability, similar to recursive
> inline. Please review this new one.

The patch and its effect on exchange is intriguing, I only have a few
comments, see below, otherwise I am happy about it.

> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index fe8daf40888..c84f4d73ed6 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,23 @@
> +2019-11-15  Feng Xue 
> +
> + PR ipa/92133
> + * doc/invoke.texi (ipa-cp-max-recursive-depth): Document new option.
> + (ipa-cp-min-recursive-probability): Likewise.
> + * params.opt (ipa-cp-max-recursive-depth): New.
> + (ipa-cp-min-recursive-probability): Likewise.
> + * ipa-cp.c (ipcp_lattice::add_value): Add two new parameters
> + val_pos_p and unlimited.
> + (self_recursively_generated_p): New function.
> + (get_val_across_arith_op): Likewise.
> + (propagate_vals_across_arith_jfunc): Add constant propagation for
> + self-recursive function.
> + (incorporate_penalties): Do not penalize pure self-recursive function.
> + (good_cloning_opportunity_p): Dump node_is_self_scc flag.
> + (propagate_constants_topo): Set node_is_self_scc flag for cgraph node.
> + (get_info_about_necessary_edges): Relax hotness check for edge to
> + self-recursive function.
> + * ipa-prop.h (ipa_node_params): Add new field node_is_self_scc.
> +

The least important comment: Thanks for providing the ChangeLog but
sending ChangeLog in the patch itself makes it difficult for people to
apply the patch because the ChangeLog hunks will never apply cleanly.
That's why people send them in the email body when they email
gcc-patches.  Hopefully we'll be putting ChangeLogs only into the commit
message soon and all of this will be a non-issue.

>  2019-11-15  Feng Xue  
>  
>   PR ipa/92528
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index fe79ca2247a..c30adfd31c2 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12045,6 +12045,13 @@ IPA-CP calculates its own score of cloning 
> profitability heuristics
>  and performs those cloning opportunities with scores that exceed
>  @option{ipa-cp-eval-threshold}.
>  
> +@item ipa-cp-max-recursive-depth
> +Maximum depth of recursive cloning for self-recursive function.
> +
> +@item ipa-cp-min-recursive-probability
> +Recursive cloning only when the probability of call being executed exceeds
> +the parameter.
> +
>  @item ipa-cp-recursion-penalty
>  Percentage penalty the recursive functions will receive when they
>  are evaluated for cloning.
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 8372dfaa771..bbf508bca6c 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -228,7 +228,9 @@ public:
>inline bool set_contains_variable ();
>bool add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val = NULL,
> -   int src_idx = 0, HOST_WIDE_INT offset = -1);
> +   int src_idx = 0, HOST_WIDE_INT offset = -1,
> +   ipcp_value **val_pos_p = NULL,
> +   bool unlimited = false);
>void print (FILE * f, bool dump_sources, bool dump_benefits);
>  };
>  
> @@ -1817,22 +1819,37 @@ allocate_and_init_ipcp_value 
> (ipa_polymorphic_call_context source)
>  /* Try to add NEWVAL to LAT, potentially creating a new ipcp_value for it.  
> CS,
> SRC_VAL SRC_INDEX and OFFSET are meant for add_source and have the same
> meaning.  OFFSET -1 means the source is scalar and not a part of an
> -   aggregate.  */
> +   aggregate.  If non-NULL, VAL_POS_P specifies position in value list,
> +   after which newly created ipcp_value will be inserted, and it is also
> +   used to record address of the added ipcp_value before function returns.
> +   UNLIMITED means whether value count should not exceed the limit given
> +   by PARAM_IPA_CP_VALUE_LIST_SIZE.  */
>  
>  template 
>  bool
>  ipcp_lattice::add_value (valtype newval, cgraph_edge *cs,
> ipcp_value *src_val,
> -   int src_idx, HOST_WIDE_INT offset)
> +   int src_idx, HOST_WIDE_INT offset,
> +   ipcp_value **val_pos_p,
> +   bool unlimited)
>  {
>ipcp_value *val;
>  
> +  if (val_pos_p)
> +{
> +  for (val = values; val && val != *val_pos_p; val = val->next);

Please put empty statements on a separate line (I think there is one
more in the patch IIRC).

> +  gcc_checking_assert (val);
> +}
> +
>if (bottom)
>  return false;
>  
>for (val = values; val; val = val->next)
>  if (values_equal_for_ipcp_p (val->value, newval))
>{
> + if (val_pos_p)
> +   *val_pos_p = val;
> +
>   if (ipa_edge_within_scc (cs))
> {
>   ipcp_value_source *s;
> @@ -1847,7 +1864,7 @@ ipcp_lattice::add_value 

[patch, openacc] Fix ICE verifying gimple

2019-11-22 Thread Andrew Stubbs

This test case causes an ICE (reformatted for email):

  void test(int k)
  {
unsigned int x = 1;
  #pragma acc parallel loop async(x)
for (int i = 0; i < k; i++) { }
  }

  t.c: In function 'test':
  t.c:4:9: error: invalid argument to gimple call
  4 | #pragma acc parallel loop async(x)
| ^~~
  (int) x
  __builtin_GOACC_parallel_keyed (-1, test._omp_fn.0, 1,
  &.omp_data_arr.3, &.omp_data_sizes.4,
  &.omp_data_kinds.5, 536936447,
 (int) x, 0);
  during GIMPLE pass: ompexp
  dump file: t.c.013t.ompexp
  t.c:4:9: internal compiler error: verify_gimple failed

The problem is that "x" needs to be cast to "int" (from "unsigned int") 
before calling the function, and that's not valid in a gimple call.


The attached patch assigns the "(int) x" to a temporary and passes that 
to the function instead.


OK to commit?

--
Andrew Stubbs
CodeSourcery / Mentor Graphics
Normalize GOACC_parallel_keyed async parameter.

2019-11-22  Andrew Stubbs  

	gcc/
	* omp-expand.c (expand_omp_target): Pass sync parameter to
	GOACC_parallel_keyed via a temporary variable.

diff --git a/gcc/omp-expand.c b/gcc/omp-expand.c
index 6f945011cf5..08f95587e95 100644
--- a/gcc/omp-expand.c
+++ b/gcc/omp-expand.c
@@ -8418,7 +8418,12 @@ expand_omp_target (struct omp_region *region)
 	  i_async));
 	  }
 	if (t_async)
-	  args.safe_push (t_async);
+	  {
+	tree tmp = create_tmp_var (TREE_TYPE (t_async));
+	gimple *stmt = gimple_build_assign (tmp, t_async);
+	gsi_insert_before (, stmt, GSI_SAME_STMT);
+	args.safe_push (tmp);
+	  }
 
 	/* Save the argument index, and ... */
 	unsigned t_wait_idx = args.length ();


Re: [PATCH] Multibyte awareness for diagnostics (PR 49973)

2019-11-22 Thread David Malcolm
On Thu, 2019-11-21 at 21:43 -0500, David Malcolm wrote:
> This is the column number as reported in the diagnostic i.e the
> COL_NUM
> when printing e.g.
>   warning: FILENAME:LINE_NUM:COL_NUM: some message
> 
> It seems to me that PR 49973 and this patch cover two separate
> things:

On re-reading, I think I worded this poorly
"both PR 49973 and this patch", I meant to say (I didn't want to say
that the patch isn't for PR 49973, it's just that that PR covers two
things, if that makes sense).

> (a) bytes vs display columns in diagnostic-show-locus.c
> (b) the "COL_NUM" mentioned above.
> 
> I'd prefer to omit (b) from the patch, and have the focus of the
> patch
> be (a), to tackle (b) in a separate patch.




[committed] Fix markup in gcc.dg/vect/vect-cond-reduc-3.c

2019-11-22 Thread Richard Sandiford
gcc.dg/vect/vect-cond-reduc-3.c had been failing on
arm-linux-gnueabihf since the test was added, because the test needs
support for VEC_COND_EXPR  whereas the target
only supports VEC_COND_EXPRs in which all modes are the same.  (I have
a fix for that, but it's not really stage 3 material.)

Tested on arm-linux-gnueabihf, aarch64-linux-gnu and x86_64-linux-gnu.
Applied as obvious.

Richard


2019-11-22  Richard Sandiford  

gcc/testsuite/
* gcc.dg/vect/vect-cond-reduc-3.c: Require vect_cond_mixed
rather than vect_condition.

Index: gcc/testsuite/gcc.dg/vect/vect-cond-reduc-3.c
===
--- gcc/testsuite/gcc.dg/vect/vect-cond-reduc-3.c   2019-10-31 
17:15:22.610537252 +
+++ gcc/testsuite/gcc.dg/vect/vect-cond-reduc-3.c   2019-11-22 
10:55:51.642277792 +
@@ -1,6 +1,6 @@
 /* Disabling epilogues until we find a better way to deal with scans.  */
 /* { dg-additional-options "--param vect-epilogues-nomask=0" } */
-/* { dg-require-effective-target vect_condition } */
+/* { dg-require-effective-target vect_cond_mixed } */
 /* { dg-require-effective-target vect_float } */
 
 #include "tree-vect.h"


[C++ Patch] Improve build_new_op_1, cp_build_indirect_ref_1, and cp_build_modify_expr locations

2019-11-22 Thread Paolo Carlini

Hi,

I would say most of the changes are straightforward or mechanical. 
Essentially, for build_new_op_1 and cp_build_modify_expr I'm simply 
consistently using the available location argument; for 
cp_build_indirect_ref_1 I'm adding the parameter but then using it in a 
completely straightforward way. Minor nit: I wondered for a while if 
cp_build_modify_expr should use cp_expr_loc_or_loc more - normally the 
passed loc points to the '=' - but eventually, given the actual texts of 
the messages, I used it only in one place, for "void value not ignored 
as it ought to be" which is mostly about the type of 'rhs'. All the 
other messages in one way or the other talk about both sides (the 
primary clang caret appears to agree).


Tested x86_64-linux.

Thanks, Paolo.

//

/gcc
2019-11-20  Paolo Carlini  

* typeck.c (cp_build_indirect_ref_1): Add location_t parameter
and use it in error messages.
(build_x_indirect_ref): Adjust call.
(build_indirect_ref): Likewise.
(cp_build_fold_indirect_ref): Likewise.
(cp_build_array_ref): Likewise.
* call.c (build_new_op_1): Likewise.
* semantics.c (finish_omp_clauses): Likewise.
(finish_omp_depobj): Likewise.
* typeck2.c (build_x_arrow): Likewise.
* cp-tree.h (cp_build_indirect_ref): Update declaration.

* call.c (build_new_op_1): Use location argument in warning_at.

* typeck.c (cp_build_modify_expr): Consistently use the
location_t argument.

/libcc1
2019-11-20  Paolo Carlini  

* libcp1plugin.cc (plugin_pragma_push_user_expression): Update
cp_build_indirect_ref call.

/testsuite
2019-11-20  Paolo Carlini  

* g++.dg/diagnostic/base-operand-non-pointer-1.C: New.
* g++.dg/pr53055.C: Check location too.
* g++.old-deja/g++.bugs/900213_02.C: Likewise.
* g++.old-deja/g++.bugs/900215_02.C: Likewise.
* g++.old-deja/g++.other/badarrow.C: Likewise.
* g++.old-deja/g++.other/deref1.C: Likewise.

* g++.dg/warn/Wenum-compare.C: Check location too.

* g++.dg/cpp0x/initlist26.C: Check location too.
* g++.dg/cpp0x/initlist28.C: Likewise.
* g++.dg/cpp0x/initlist29.C: Likewise.
* g++.dg/cpp0x/initlist33.C: Likewise.
* g++.dg/expr/string-2.C: Likewise.
* g++.dg/other/ptrmem5.C: Likewise.
* g++.old-deja/g++.benjamin/14664-1.C: Likewise.
* g++.old-deja/g++.benjamin/14664-2.C: Likewise.
* g++.old-deja/g++.brendan/init12.C: Likewise.
* g++.old-deja/g++.bugs/900324_04.C: Likewise.
* g++.old-deja/g++.ext/array1.C: Likewise.
* g++.old-deja/g++.jason/rfg17.C: Likewise.
Index: cp/call.c
===
--- cp/call.c   (revision 278549)
+++ cp/call.c   (working copy)
@@ -6354,11 +6354,9 @@ build_new_op_1 (const op_location_t , enum tre
  && (TYPE_MAIN_VARIANT (arg1_type)
  != TYPE_MAIN_VARIANT (arg2_type))
  && (complain & tf_warning))
-   {
- warning (OPT_Wenum_compare,
-  "comparison between %q#T and %q#T",
-  arg1_type, arg2_type);
-   }
+   warning_at (loc, OPT_Wenum_compare,
+   "comparison between %q#T and %q#T",
+   arg1_type, arg2_type);
  break;
default:
  break;
@@ -6416,7 +6414,7 @@ build_new_op_1 (const op_location_t , enum tre
   return cp_build_modify_expr (loc, arg1, code2, arg2, complain);
 
 case INDIRECT_REF:
-  return cp_build_indirect_ref (arg1, RO_UNARY_STAR, complain);
+  return cp_build_indirect_ref (loc, arg1, RO_UNARY_STAR, complain);
 
 case TRUTH_ANDIF_EXPR:
 case TRUTH_ORIF_EXPR:
@@ -6472,8 +6470,9 @@ build_new_op_1 (const op_location_t , enum tre
   return cp_build_array_ref (input_location, arg1, arg2, complain);
 
 case MEMBER_REF:
-  return build_m_component_ref (cp_build_indirect_ref (arg1, 
RO_ARROW_STAR, 
-   complain), 
+  return build_m_component_ref (cp_build_indirect_ref (loc, arg1,
+  RO_ARROW_STAR,
+   complain),
 arg2, complain);
 
   /* The caller will deal with these.  */
Index: cp/cp-tree.h
===
--- cp/cp-tree.h(revision 278549)
+++ cp/cp-tree.h(working copy)
@@ -7482,9 +7482,11 @@ extern tree build_class_member_access_expr  (c
 extern tree finish_class_member_access_expr (cp_expr, tree, bool,
 tsubst_flags_t);
 extern tree build_x_indirect_ref   (location_t, tree,
-

Re: [patch, fortran] Load scalar intent-in variables at the beginning of procedures

2019-11-22 Thread Tobias Burnus
I have now filled https://gcc.gnu.org/PR92628 as ME bug to track this 
(i.e. honor TYPE_RESTRICT for pointer-escape analysis).


Cheers,

Tobias

On 11/21/19 3:36 PM, Tobias Burnus wrote:

On 11/21/19 3:09 PM, Richard Biener wrote:
So I think what you'd need to do is make 'i' marked as TYPE_RESTRICT 
then. I think we don't yet handle it but it means that bar() may not 
modify 'i' but via 'i' (but it doesn't get 'i' as a parameter). 
Okay, that would be then the attached patch. – I can confirm that it 
does not work.


Re: [committed] [testsuite] Fix fp-int-convert-timode-1.c testism.

2019-11-22 Thread Tamar Christina
Hi Joseph,

> > Or do you want me to send them separately?
> 
> I think it's best to fix the test now not to have the #ifdef, then if you 
> have execution failures those can be addressed separately.  (If you want 
> to avoid the test FAILing before then, an XFAIL with a comment referencing 
> an open bug in Bugzilla would be appropriate, not #ifdef that makes the 
> test spuriously PASS.)
> 

Fair enough, found the issue and it wasn't with the test. I've attached the
new patch.

Ok for trunk?

Thanks,
Tamar

> -- 
> Joseph S. Myers
> jos...@codesourcery.com

-- 
diff --git a/gcc/testsuite/gcc.dg/torture/fp-int-convert-timode-1.c b/gcc/testsuite/gcc.dg/torture/fp-int-convert-timode-1.c
index bf7f3cedb294cc834437593dae3507005f0f6b56..971a5985357ce50e187d3dea2660804c8055e141 100644
--- a/gcc/testsuite/gcc.dg/torture/fp-int-convert-timode-1.c
+++ b/gcc/testsuite/gcc.dg/torture/fp-int-convert-timode-1.c
@@ -2,27 +2,22 @@
float.  */
 /* { dg-do run } */
 /* { dg-require-effective-target int128 } */
-/* { dg-require-effective-target fenv } */
 /* { dg-options "-frounding-math" } */
 
-#include 
 #include 
 
 int
 main (void)
 {
-#ifdef FE_TONEAREST
   volatile unsigned long long h = 0x8000LL;
   volatile unsigned long long l = 0xdLL;
   volatile unsigned __int128 u128 = (((unsigned __int128) h) << 64) | l;
   volatile __int128 s128 = u128;
-  fesetround (FE_TONEAREST);
   float fs = s128;
   if (fs != -0x1p+127)
 abort ();
   double ds = s128;
   if (ds != -0x1p+127)
 abort ();
-#endif
   exit (0);
 }



Re: [PATCH] [PATCH] [ARC] Fix ARC target specific tests.

2019-11-22 Thread Claudiu Zissulescu
Thank you for your review. Pushed,
Claudiu

On Thu, Nov 21, 2019 at 9:13 PM Jeff Law  wrote:
>
> On 11/21/19 9:35 AM, Claudiu Zissulescu wrote:
> > PING
> >
> > On Mon, Nov 11, 2019 at 4:42 PM Claudiu Zissulescu  
> > wrote:
> >>
> >> Hi,
> >>
> >> Fix ARC specific tests by improving the matching pattern and adding
> >> the missing functionality in arc.exp
> >>
> >>
> >> OK to appy?
> >> Claudiu
> >>
> >>
> >> gcc/tests
> >> -xx-xx  Claudiu Zissulescu  
> >>
> >> * gcc.target/arc/add_n-combine.c: Match add1/2/3 instruction in
> >> output assembly.
> >> * gcc.target/arc/arc.exp (check_effective_target_codedensity):
> >> Add.
> >> * gcc.target/arc/cmem-7.c: Fix matching patterns.
> >> * gcc.target/arc/cmem-bit-1.c: Likewise.
> >> * gcc.target/arc/cmem-bit-2.c: Likewise.
> >> * gcc.target/arc/cmem-bit-3.c: Likewise.
> >> * gcc.target/arc/cmem-bit-4.c: Likewise.
> >> * gcc.target/arc/interrupt-2.c: Match rtie insn for A7.
> >> * gcc.target/arc/store-merge-1.c: This test is only meaningful for
> >> architectures with double load/store operations.
> OK
> jeff
>