date:20201203

Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-12-03 Thread Martin Liška


On 12/4/20 7:17 AM, Jiufu Guo via Gcc-patches wrote:

Oh, this may indicate 'approval with comments', right? :)


Yes.  Honza, can you please review the patch?

Thanks,
Martin

Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-12-03 Thread Jiufu Guo via Gcc-patches

Jiufu Guo  writes:

> Jiufu Guo  writes:
>
>> Jeff Law  writes:
>>
>>> On 11/18/20 12:28 AM, Richard Biener wrote:
>>>> On Tue, 17 Nov 2020, Jeff Law wrote:

> Minor questions for Jan and Richi embedded below...
>
> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
>> When investigating the issue from 
>> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
>> I find that the BB COUNTs of a loop seem inaccurate in some cases.
>> For example:
>>
>> In the figure below:
>>
>>
>>                 COUNT:268435456  pre-header
>>                     |
>>                     |  .---------------------.
>>                     |  |                     |
>>                     V  v                     |
>>                 COUNT:805306369              |
>>                      / \                     |
>>                  33%/   \                    |
>>                    /     \                   |
>>                   v       v                  |
>>     COUNT:268435456   COUNT:536870911        |
>>       exit-edge           |   latch          |
>>                           .__________________.
>>
>> Those COUNTs satisfy the following equations:
>> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
>> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
>> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of latch:536870911
>>
>>
>> While after pcom:
>>
>>                 COUNT:268435456  pre-header
>>                     |
>>                     |  .---------------------.
>>                     |  |                     |
>>                     V  v                     |
>>                 COUNT:268435456              |
>>                      / \                     |
>>                  50%/   \                    |
>>                    /     \                   |
>>                   v       v                  |
>>     COUNT:134217728   COUNT:134217728        |
>>       exit-edge           |   latch          |
>>                           .__________________.
>>
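>> COUNT of header:268435456 != COUNT of pre-header:268435456 + COUNT of latch:134217728
>> COUNT of exit-edge:134217728 != COUNT of pre-header:268435456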
>>
>> In some cases, the probability of the exit edge is easy to estimate, and then
>> the COUNTs of the other BBs in the loop can be recalculated.
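
A minimal sketch of the arithmetic described above, assuming plain integer
counts instead of GCC's profile_count and profile_probability classes.  The
struct and function names are hypothetical and are not part of the patch.  It
only shows that, once the exit probability is known, the header count follows
from the pre-header count, which is the same as scaling the old header count
by old_exit_p / new_prob.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the counts of a single-exit loop; this is not
   GCC's profile_count API.  */
struct loop_counts
{
  uint64_t preheader;  /* count entering the loop */
  uint64_t header;     /* count of the loop header */
  uint64_t exit_edge;  /* count leaving through the single exit */
  uint64_t latch;      /* count flowing back to the header */
};

/* Recompute the header, exit and latch counts from the pre-header count
   and the exit probability NEW_PROB_NUM / NEW_PROB_DEN, so that
     exit_edge == preheader
     exit_edge == header * new_prob  (up to integer rounding)
     header    == preheader + latch.  */
struct loop_counts
rescale_loop_counts (uint64_t preheader, uint64_t new_prob_num,
                     uint64_t new_prob_den)
{
  assert (new_prob_num > 0 && new_prob_num <= new_prob_den);

  struct loop_counts c;
  c.preheader = preheader;
  /* header = preheader / new_prob.  */
  c.header = preheader * new_prob_den / new_prob_num;
  /* Everything that enters the loop must eventually leave it.  */
  c.exit_edge = preheader;
  c.latch = c.header - c.exit_edge;
  return c;
}
```

With the pre-header count from the second figure and a 50% exit probability,
rescale_loop_counts (268435456, 1, 2) gives a header count of 536870912 and a
latch count of 268435456, so both equations hold again.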
>>
>> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>>
>> Jiufu
>>
>> gcc/ChangeLog:
>> 2020-10-09  Jiufu Guo   
>>
>>  * cfgloopmanip.h (recompute_loop_frequencies): New function.
>>  * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>>  * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>>  recompute_loop_frequencies.
>>
>> ---
>>  gcc/cfgloopmanip.c| 53 +++
>>  gcc/cfgloopmanip.h|  2 +-
>>  gcc/tree-ssa-loop-manip.c | 28 +++--
>>  3 files changed, 57 insertions(+), 26 deletions(-)
>>
>> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
>> index 73134a20e33..b0ca82a67fd 100644
>> --- a/gcc/cfgloopmanip.c
>> +++ b/gcc/cfgloopmanip.c
>> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "gimplify-me.h"
>>  #include "tree-ssa-loop-manip.h"
>>  #include "dumpfile.h"
>> +#include "cfgrtl.h"
>>  
>>  static void copy_loops_to (class loop **, int,
>> class loop *);
>> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>>  
>>return nloop;
>>  }
>> +
>> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
>> +   is NEW_PROB.  */
>> +
>> +bool
>> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
>> +{
>> +  edge exit = single_exit (loop);
>> +  if (!exit)
>> +return false;
>> +
>> +  edge e;
>> +  edge_iterator ei;
>> +  edge non_exit;
>> +  basic_block * bbs;
>> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
>> +  profile_probability exit_p = exit_count.probability_in (loop->header->count);
>> +  profile_count base_count = loop->header->count;
>> +  profile_count after_num = base_count.apply_probability (exit_p);
>> +  profile_count after_den = base_count.apply_probability (new_prob);
>> +
>> +  /* Update BB counts in loop body.
>> +     COUNT of exit = COUNT of pre-header
>> +     COUNT of exit = COUNT of header * exit_edge_probability
>> +     The new COUNT of header = old COUNT of header * old_exit_p / new_prob.  */
>> +  bbs = get_loop_body (loop);
>> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
>> + after_den);
>> +  free (bbs);
>> +
>> +  /* Update pr

Ping: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move

2020-12-03 Thread Michael Meissner via Gcc-patches

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:53:20 -0500
| Subject: [PATCH 2/2] Power10: Add IEEE 128-bit fp conditional move
| Message-ID: <20201116045320.gb3...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559167.html

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797

Ping: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.

2020-12-03 Thread Michael Meissner via Gcc-patches

I haven't received a reply for this patch:

| Date: Sun, 15 Nov 2020 23:50:51 -0500
| Subject: [PATCH 1/2] Power10: Add IEEE 128-bit xsmaxcqp and xsmincqp support.
| Message-ID: <20201116045051.ga3...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559166.html


Ping: [PATCH] PowerPC: Add float128/Decimal conversions

2020-12-03 Thread Michael Meissner via Gcc-patches

I haven't received a reply for this patch:

| Date: Thu, 19 Nov 2020 19:05:24 -0500
| Subject: [PATCH] PowerPC: Add float128/Decimal conversions
| Message-ID: <2020112524.ga...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559661.html


Ping [PATCH] PowerPC: Map IEEE 128-bit long double built-in functions

2020-12-03 Thread Michael Meissner via Gcc-patches

I haven't received a response for this patch:

| Date: Thu, 19 Nov 2020 18:58:14 -0500
| Subject: [PATCH] PowerPC: Map IEEE 128-bit long double built-in functions
| Message-ID: <20201119235814.ga...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559659.html


Ping: [PATCH] PowerPC: Set long double size for IBM/IEEE.

2020-12-03 Thread Michael Meissner via Gcc-patches

I haven't received a reply for this patch:

| Date: Thu, 19 Nov 2020 19:00:11 -0500
| Subject: [PATCH] PowerPC: Set long double size for IBM/IEEE.
| Message-ID: <2020112011.ga...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559660.html


[PATCH, Committed] PowerPC PR libgcc/97543 and libgcc/97643, Fix long double issues

2020-12-03 Thread Michael Meissner via Gcc-patches

I committed the following patch today.  After a burn-in period, I plan to
commit the patch to older GCC releases, and close out the two PRs.

PowerPC: PR libgcc/97543 and libgcc/97643, fix long double issues

If you use a compiler with long double defaulting to 64-bit instead of 128-bit
with IBM extended double, you get linker warnings about mismatches in the GNU
attributes for long double (PR libgcc/97543).  You get the warnings even if the
compiler is configured with '--without-long-double-128' so that long double is
64-bit by default.

You also get the same issues if you use a compiler with long double defaulting
to IEEE 128-bit instead of IBM extended double (PR libgcc/97643).

The issue is the way libgcc.a/libgcc.so is built.  Right now, when libgcc is
built under Linux, the long double size is set to 128 bits.  However, the GNU
attributes are still set, which leads to the warnings.

One feature of the current GNU attribute implementation is that if you have a
shared library (such as libgcc_s.so), the GNU attributes for the shared library
are an inclusive OR of the attributes of all of the objects within the library.
This means that if any object file uses the -mlong-double-128 option and uses
long double, the GNU attributes for the library will indicate that it uses
128-bit IBM long doubles.  If you have a static library, you will get the
warning only if you actually reference an object file with the attribute set.
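
The difference between the shared and the static case can be sketched with a
small model.  This is only an illustration: the flag values, the struct and
the function names are hypothetical, they are not the real GNU attribute
encoding, and the warning condition is a simplification of what the linker
does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values for illustration only; they are not the real
   GNU attribute encoding.  */
enum ld_attr
{
  LD_ATTR_NONE = 0,     /* object built with -mno-gnu-attribute */
  LD_ATTR_IBM128 = 1,   /* uses IBM extended double */
  LD_ATTR_IEEE128 = 2,  /* uses IEEE 128-bit long double */
  LD_ATTR_64BIT = 4     /* uses 64-bit long double */
};

struct object_file
{
  unsigned attr;    /* long double attribute recorded in the object */
  bool referenced;  /* pulled in by the link (static library case) */
};

/* Shared library: every member contributes, referenced or not.  */
unsigned
shared_library_attr (const struct object_file *objs, size_t n)
{
  assert (objs != NULL || n == 0);
  unsigned attr = LD_ATTR_NONE;
  for (size_t i = 0; i < n; i++)
    attr |= objs[i].attr;
  return attr;
}

/* Static library: only members that the link pulls in contribute.  */
unsigned
static_link_attr (const struct object_file *objs, size_t n)
{
  assert (objs != NULL || n == 0);
  unsigned attr = LD_ATTR_NONE;
  for (size_t i = 0; i < n; i++)
    if (objs[i].referenced)
      attr |= objs[i].attr;
  return attr;
}

/* Simplified warning model: the library records a long double format that
   the program does not use.  */
bool
mismatch_warning (unsigned lib_attr, unsigned program_attr)
{
  return (lib_attr & ~program_attr) != 0;
}
```

In this model, a library with one unreferenced IBM 128-bit object warns for a
64-bit long double program when linked as a shared library, but not when
linked statically.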

This patch does two things:

1)  All of the object files that support IBM 128-bit long doubles
explicitly set the ABI to IBM extended double.

2)  I turned off GNU attributes for building the shared library or for
building the IBM 128-bit long double support.

libgcc/
2020-12-03  Michael Meissner  

PR libgcc/97543
PR libgcc/97643
* config/rs6000/t-linux (IBM128_STATIC_OBJS): New make variable.
(IBM128_SHARED_OBJS): New make variable.
(IBM128_OBJS): New make variable.  Set all objects to use the
explicit IBM format, and disable gnu attributes.
(IBM128_CFLAGS): New make variable.
(gcc_s_compile): Add -mno-gnu-attribute to all shared library
modules.
---
 libgcc/config/rs6000/t-linux | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/libgcc/config/rs6000/t-linux b/libgcc/config/rs6000/t-linux
index ed821947b66..72e9c2770a6 100644
--- a/libgcc/config/rs6000/t-linux
+++ b/libgcc/config/rs6000/t-linux
@@ -6,3 +6,25 @@ HOST_LIBGCC2_CFLAGS += -mlong-double-128
 # smaller and faster libgcc code.  Directly specifying -mcmodel=small
 # would need to take into account targets for which -mcmodel is invalid.
 HOST_LIBGCC2_CFLAGS += -mno-minimal-toc
+
+# On the modules that deal with IBM 128-bit values, make sure that TFmode uses
+# the IBM extended double format.  Also turn off gnu attributes on the static
+# modules.
+IBM128_STATIC_OBJS = ibm-ldouble$(objext) _powitf2$(objext) \
+ ppc64-fp$(objext) _divtc3$(objext) _multc3$(objext) \
+ _fixtfdi$(objext) _fixunstfdi$(objext) \
+ _floatditf$(objext) _floatunsditf$(objext)
+IBM128_SHARED_OBJS = $(IBM128_STATIC_OBJS:$(objext)=_s$(objext))
+IBM128_OBJS= $(IBM128_STATIC_OBJS) $(IBM128_SHARED_OBJS)
+
+IBM128_CFLAGS  = -Wno-psabi -mabi=ibmlongdouble -mno-gnu-attribute
+
+$(IBM128_OBJS) : INTERNAL_CFLAGS += $(IBM128_CFLAGS)
+
+# Turn off gnu attributes for long double size on all of the shared library
+# modules, but leave it on for the static modules, except for the functions
+# that explicitly process IBM 128-bit floating point.  Shared libraries only
+# have one gnu attribute for the whole library, and it can lead to warnings if
+# somebody changes the long double format.  We leave it on for the static
+# modules to catch mis-compilation errors.
+gcc_s_compile += -mno-gnu-attribute
-- 
2.22.0



[PATCH 1/3] PowerPC: Add long double target-supports.

2020-12-03 Thread Michael Meissner via Gcc-patches

PowerPC: Add long double target-supports.

I messed up posting this patch, using the wrong subject line.  This patch is
what I meant to post.

This patch replaces the patch submitted on November 21st:
| Date: Sat, 21 Nov 2020 00:33:52 -0500
| Subject: [PATCH 1/3] PowerPC: Add long double target-supports.
| Message-ID: <20201121053352.gc17...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559839.html

I expanded the target supports to include more options to select targets with
an appropriate long double format.  There are four options to check whether the
current long double is:

1)  128-bit using the IBM extended double format;
2)  128-bit using the IEEE format;
3)  Long double is 128-bits (i.e. either IBM or IEEE); (and)
4)  Long double is 64-bits.

I also added two new target supports:

1)  If you can switch the long double to IBM extended double via compiler
options and the GLIBC supports this change.  If you are using an
existing GLIBC with IBM long double, this should work since you aren't
switching the long double format.

2)  And likewise if you can switch the long double to IEEE 128-bit.
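
As a rough illustration of what the four format checks distinguish, here is a
small C sketch.  The __LONG_DOUBLE_IBM128__ and __LONG_DOUBLE_IEEE128__ macros
are the ones tested in the patch; the enum and function names are
hypothetical, and the fallback on LDBL_MANT_DIG is an assumption made so that
the sketch also compiles on targets other than PowerPC.

```c
#include <assert.h>
#include <float.h>

enum ld_format { LD_64BIT, LD_IBM128, LD_IEEE128, LD_OTHER };

/* Classify a long double format by the number of mantissa bits.  */
enum ld_format
classify_mant_dig (int mant_dig)
{
  assert (mant_dig > 0);
  switch (mant_dig)
    {
    case 53:  return LD_64BIT;    /* same as double */
    case 106: return LD_IBM128;   /* pair of doubles */
    case 113: return LD_IEEE128;  /* IEEE binary128 */
    default:  return LD_OTHER;    /* for example 64 for x87 extended */
    }
}

/* Format of long double for the current compilation.  */
enum ld_format
current_long_double_format (void)
{
#if defined (__LONG_DOUBLE_IBM128__)
  return LD_IBM128;
#elif defined (__LONG_DOUBLE_IEEE128__)
  return LD_IEEE128;
#else
  return classify_mant_dig (LDBL_MANT_DIG);
#endif
}
```

The "128-bit" check corresponds to a result of LD_IBM128 or LD_IEEE128, and
the "64-bit" check to LD_64BIT.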

This patch and the following 2 other patches were tested together on a power9
little endian server system.  I built 4 compilers:

1)  Compiler without modifications;
2)  Compiler using my other patches configured for IBM long double;
3)  Compiler using my other patches configured for IEEE long double; (and)
4)  Compiler using my other patches configured for 64-bit long double.

While I used the other patches to test the 64-bit and IEEE long double, these 3
patches should work for the default builds with IBM 128-bit, and they will
continue to work once all of the long double support patches are in.  The two
tests that are patched in the next two patches now work in all environments.
Can I apply these patches to the master branch?

gcc/testsuite/
2020-12-03  Michael Meissner  

* lib/target-supports.exp
(check_effective_target_ppc_long_double_ibm): New function.
(check_effective_target_ppc_long_double_ieee): New function.
(check_effective_target_ppc_long_double_override_ibm): New function.
(check_effective_target_ppc_long_double_override_ieee): New function.
(check_effective_target_ppc_long_double_128bit): New function.
(check_effective_target_ppc_long_double_64bit): New function.
---
 gcc/testsuite/lib/target-supports.exp | 122 ++
 1 file changed, 122 insertions(+)

diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index ff6bc5f4b92..01b82843bf5 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2348,6 +2348,128 @@ proc check_effective_target_ppc_ieee128_ok { } {
 }]
 }
 
+# See if the target is a powerpc with the long double format that uses the IBM
+# extended double format.
+
+proc check_effective_target_ppc_long_double_ibm { } {
+return [check_cached_effective_target ppc_long_double_ibm {
+   int main()
+   {
+ #if !defined(_ARCH_PPC) || !defined(__LONG_DOUBLE_IBM128__)
+   return 1;
+ #else
+   return 0;
+ #endif
+   }
+}]
+}
+
+# See if the target is a powerpc with the long double format that uses the IEEE
+# 128-bit format.
+
+proc check_effective_target_ppc_long_double_ieee { } {
+return [check_cached_effective_target ppc_long_double_ieee {
+   int main()
+   {
+ #if !defined(_ARCH_PPC) || !defined(__LONG_DOUBLE_IEEE128__)
+   return 1;
+ #else
+   return 0;
+ #endif
+   }
+}]
+}
+
+# Like check_effective_target_ppc_long_double_ibm, but check if we can
+# explicitly override the long double format to use the IBM 128-bit extended
+# double format, and GLIBC supports doing this override by switching the
+# sprintf to handle long double.
+
+proc check_effective_target_ppc_long_double_override_ibm { } {
+set options "-mlong-double-128 -mabi=ibmlongdouble -Wno-psabi"
+check_runtime_nocache ppc_long_double_override_ibm {
+   #include <string.h>
+   #include <stdio.h>
+   volatile __ibm128 a = (__ibm128) 3.0;
+   volatile long double one = 1.0L;
+   volatile long double two = 2.0L;
+   volatile long double b;
+   char buffer[20];
+   int main()
+   {
+ #if !defined(_ARCH_PPC) || !defined(__LONG_DOUBLE_IBM128__)
+   return 1;
+ #else
+   b = one + two;
+   if (memcmp ((void *)&a, (void *)&b, sizeof (long double)) != 0)
+ return 1;
+   sprintf (buffer, "%Lg", b);
+   return strcmp (buffer, "3") != 0;
+ #endif
+   }
+} $options
+}
+
+# Like check_effective_target_ppc_long_double_ieee, but check if we can
+# explicitly override the long double format to use the IEEE 128-bit format,
+# and GLIBC supports doing this override by

[PATCH 3/3] PowerPC: Force IBM long double for conversion test.

2020-12-03 Thread Michael Meissner via Gcc-patches

PowerPC: Force IBM long double for conversion test.

This patch replaces the following patch:
| Date: Sat, 21 Nov 2020 00:39:53 -0500
| Subject: [PATCH 3/3] PowerPC: Require IBM long double for conversion test.
| Message-ID: <20201121053953.ge17...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559841.html

The test c-c++-common/dfp/convert-bfp-11.c explicitly expects long double to
use the IBM 128-bit extended double format.  In particular, some of the tests
expect an infinity to be created when the decimal values being converted are
too large for IBM extended double.  However, those values do fit in the range
of the IEEE 128-bit format, since it has a larger exponent range than the IBM
128-bit format.  With IEEE 128-bit long double the test fails because no
infinity is generated.
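
To make the range difference concrete, here is a small sketch.  It only
considers exact powers of ten and uses the well-known decimal exponent limits
of the two formats: IBM extended double is a pair of doubles and so has the
range of double (about 1.8e308), while IEEE binary128 reaches about 1.19e4932.
The constant and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Largest N for which 10^N is still finite in each format.  */
enum
{
  IBM128_MAX_10_EXP = 308,    /* same as double */
  IEEE128_MAX_10_EXP = 4932   /* IEEE binary128 */
};

/* Return true if 10^DECIMAL_EXPONENT is too large for a format whose
   largest finite power of ten is 10^MAX_10_EXP, so that a conversion
   would produce an infinity.  */
bool
overflows_to_infinity (int decimal_exponent, int max_10_exp)
{
  assert (max_10_exp > 0);
  return decimal_exponent > max_10_exp;
}
```

A decimal value such as 1e400 overflows to infinity in the IBM format but is
an ordinary finite number in the IEEE 128-bit format, which is the situation
in which the test's expectation of an infinity fails.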

This patch uses the target support option that says we can change the long
double type.  If the default long double type already uses the IBM format, the
test works with older GLIBCs.  If the default is IEEE long double, the test
only runs if GLIBC is 2.32 or newer, which supports switching the long double
format.

I have checked this patch with 3 compilers built on a little endian power9
system, 1 compiler with IBM default long double, 1 compiler with IEEE default
long double, and 1 compiler with 64-bit long double.  This test passes with all
3 compilers.

Can I check this patch into the master branch?

gcc/testsuite/
2020-12-03  Michael Meissner  

* c-c++-common/dfp/convert-bfp-11.c: Force using IBM 128-bit long
double.  Remove check for 64-bit long double.
---
 .../c-c++-common/dfp/convert-bfp-11.c  | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
index 95c433d2c24..d19d6aa9220 100644
--- a/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
+++ b/gcc/testsuite/c-c++-common/dfp/convert-bfp-11.c
@@ -1,9 +1,14 @@
-/* { dg-skip-if "" { ! "powerpc*-*-linux*" } } */
+/* { dg-require-effective-target dfp } */
+/* { dg-require-effective-target ppc_long_double_override_ibm } */
+/* { dg-options "-O2 -mlong-double-128 -mabi=ibmlongdouble -Wno-psabi" } */
 
-/* Test decimal float conversions to and from IBM 128-bit long double. 
-   Checks are skipped at runtime if long double is not 128 bits.
-   Don't force 128-bit long doubles because runtime support depends
-   on glibc.  */
+/* We force the long double type to be IBM 128-bit because the CONVERT_TO_PINF
+   tests will fail if we use IEEE 128-bit floating point.  This is due to IEEE
+   128-bit having a larger exponent range than IBM 128-bit extended double.  So
+   tests that would generate an infinity with IBM 128-bit will generate a
+   normal number with IEEE 128-bit.  */
+
+/* Test decimal float conversions to and from IBM 128-bit long double.   */
 
 #include "convert.h"
 
@@ -36,9 +41,6 @@ CONVERT_TO_PINF (312, tf, sd, 1.6e+308L, d32)
 int
 main ()
 {
-  if (sizeof (long double) != 16)
-return 0;
-
   convert_101 ();
   convert_102 ();
 
-- 
2.22.0



[PATCH 2/3] PowerPC: require IBM long double for pr70117.

2020-12-03 Thread Michael Meissner via Gcc-patches

PowerPC: PR target/70117, Force long double to be IBM 128-bit.

This patch replaces the following patch:
| Date: Sat, 21 Nov 2020 00:37:10 -0500
| Subject: [PATCH 2/3] PowerPC: require IBM long double for pr70117.
| Message-ID: <20201121053710.gd17...@ibm-toto.the-meissners.org>
| https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559840.html

This patch uses the ppc_long_double_override_ibm target supports option that
was added in the previous patch to make sure pr70117.c uses the IBM extended
double format that this test was designed for.

I have run this test on a power9 little endian server system with 3 compilers,
one configured for IBM long double, 1 configured for IEEE long double, and 1
configured for 64-bit long double.  All of the compilers used GLIBC 2.32, which
is needed to support switching the long double type, but this patch should work
with older GLIBCs as long as the compiler long double format matches GLIBC's
long double.
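
The library side of that requirement can be sketched as a small runtime
probe, loosely modelled on the sprintf check in the target supports patch.
This is an assumption-laden sketch, not the testsuite code: the function name
is hypothetical, it uses snprintf, and it uses the "L" length modifier that
the C standard specifies for long double arguments.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the C library formats a long double computed by the
   compiler as expected, 0 otherwise.  If the compiler and the library
   disagree about the long double format, the printed text is wrong.  */
int
long_double_printf_works (void)
{
  /* volatile keeps the addition from being folded at compile time.  */
  volatile long double one = 1.0L;
  volatile long double two = 2.0L;
  char buffer[20];

  long double sum = one + two;
  int n = snprintf (buffer, sizeof buffer, "%Lg", sum);
  assert (n > 0 && (size_t) n < sizeof buffer);
  return strcmp (buffer, "3") == 0;
}
```

When the compiler's long double format matches the one the library expects,
the probe returns 1.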

Can I check this change into the master branch once the previous patch is
applied?

gcc/testsuite/
2020-12-03  Michael Meissner  

PR target/70117
* gcc.target/powerpc/pr70117.c: Force the long double type to use
the IBM 128-bit format.
---
 gcc/testsuite/gcc.target/powerpc/pr70117.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr70117.c b/gcc/testsuite/gcc.target/powerpc/pr70117.c
index 3bbd2c595e0..6efece1c7c8 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr70117.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr70117.c
@@ -1,5 +1,6 @@
-/* { dg-do run { target { powerpc*-*-linux* powerpc*-*-darwin* powerpc*-*-aix* rs6000-*-* } } } */
-/* { dg-options "-std=c99 -mlong-double-128 -O2" } */
+/* { dg-do run } */
+/* { dg-require-effective-target ppc_long_double_override_ibm } */
+/* { dg-options "-std=c99 -O2 -mlong-double-128 -mabi=ibmlongdouble -Wno-psabi" } */
 
 #include 
 
-- 
2.22.0



[PATCH 0/3] Updates for float128 tests

2020-12-03 Thread Michael Meissner via Gcc-patches

These patches update the test suite patches I posted on November 21.

There are 3 patches in this series.

1)  The first patch adds new target-support options;
2)  The second patch updates pr70117.c; (and)
3)  The third patch updates convert-bfp-11.c.

In the target supports patches, I expanded the target supports to include more
options to select targets with an appropriate long double format.  There are
four options to check whether the current long double is:

1)  128-bit using the IBM extended double format;
2)  128-bit using the IEEE format;
3)  Long double is 128-bits (i.e. either IBM or IEEE); (and)
4)  Long double is 64-bits.

I also added two new target supports:

1)  If you can switch the long double to IBM extended double via compiler
options and the GLIBC supports this change.  If you are using an
existing GLIBC with IBM long double, this should work since you aren't
switching the long double format.

2)  And likewise if you can switch the long double to IEEE 128-bit.

I modified the two tests to use the new target supports where we select IBM
128-bit at compile time.  This allows these two tests to run with the current
compilers, and to keep running when the default long double has been changed.
I have built compilers with each of the three long double formats as the
default, and all 3 compilers run these tests (provided I use GLIBC 2.32 or
later).


Re: [PATCH 1/2] correct BB frequencies after loop changed

2020-12-03 Thread Jiufu Guo via Gcc-patches

Jiufu Guo  writes:

> Jeff Law  writes:
>
>> On 11/18/20 12:28 AM, Richard Biener wrote:
>>> On Tue, 17 Nov 2020, Jeff Law wrote:
>>>
>>>> Minor questions for Jan and Richi embedded below...
>>>>
>>>> On 10/9/20 4:12 AM, guojiufu via Gcc-patches wrote:
> When investigating the issue from 
> https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549786.html
> I find that the BB COUNTs of a loop seem inaccurate in some cases.
> For example:
>
> In the figure below:
>
>
>                 COUNT:268435456  pre-header
>                     |
>                     |  .---------------------.
>                     |  |                     |
>                     V  v                     |
>                 COUNT:805306369              |
>                      / \                     |
>                  33%/   \                    |
>                    /     \                   |
>                   v       v                  |
>     COUNT:268435456   COUNT:536870911        |
>       exit-edge           |   latch          |
>                           .__________________.
>
> Those COUNTs satisfy the following equations:
> COUNT of exit-edge:268435456 = COUNT of pre-header:268435456
> COUNT of exit-edge:268435456 = COUNT of header:805306369 * 33%
> COUNT of header:805306369 = COUNT of pre-header:268435456 + COUNT of latch:536870911
>
>
> While after pcom:
>
>                 COUNT:268435456  pre-header
>                     |
>                     |  .---------------------.
>                     |  |                     |
>                     V  v                     |
>                 COUNT:268435456              |
>                      / \                     |
>                  50%/   \                    |
>                    /     \                   |
>                   v       v                  |
>     COUNT:134217728   COUNT:134217728        |
>       exit-edge           |   latch          |
>                           .__________________.
>
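> COUNT of header:268435456 != COUNT of pre-header:268435456 + COUNT of latch:134217728
> COUNT of exit-edge:134217728 != COUNT of pre-header:268435456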
>
> In some cases, the probability of the exit edge is easy to estimate, and then
> the COUNTs of the other BBs in the loop can be recalculated.
>
> Bootstrap and regtest pass on ppc64le. Is this ok for trunk?
>
> Jiufu
>
> gcc/ChangeLog:
> 2020-10-09  Jiufu Guo   
>
>   * cfgloopmanip.h (recompute_loop_frequencies): New function.
>   * cfgloopmanip.c (recompute_loop_frequencies): New implementation.
>   * tree-ssa-loop-manip.c (tree_transform_and_unroll_loop): Call
>   recompute_loop_frequencies.
>
> ---
>  gcc/cfgloopmanip.c| 53 +++
>  gcc/cfgloopmanip.h|  2 +-
>  gcc/tree-ssa-loop-manip.c | 28 +++--
>  3 files changed, 57 insertions(+), 26 deletions(-)
>
> diff --git a/gcc/cfgloopmanip.c b/gcc/cfgloopmanip.c
> index 73134a20e33..b0ca82a67fd 100644
> --- a/gcc/cfgloopmanip.c
> +++ b/gcc/cfgloopmanip.c
> @@ -31,6 +31,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "gimplify-me.h"
>  #include "tree-ssa-loop-manip.h"
>  #include "dumpfile.h"
> +#include "cfgrtl.h"
>  
>  static void copy_loops_to (class loop **, int,
>  class loop *);
> @@ -1773,3 +1774,55 @@ loop_version (class loop *loop,
>  
>return nloop;
>  }
> +
> +/* Recalculate the COUNTs of BBs in LOOP, if the probability of exit edge
> +   is NEW_PROB.  */
> +
> +bool
> +recompute_loop_frequencies (class loop *loop, profile_probability new_prob)
> +{
> +  edge exit = single_exit (loop);
> +  if (!exit)
> +return false;
> +
> +  edge e;
> +  edge_iterator ei;
> +  edge non_exit;
> +  basic_block * bbs;
> +  profile_count exit_count = loop_preheader_edge (loop)->count ();
> +  profile_probability exit_p = exit_count.probability_in (loop->header->count);
> +  profile_count base_count = loop->header->count;
> +  profile_count after_num = base_count.apply_probability (exit_p);
> +  profile_count after_den = base_count.apply_probability (new_prob);
> +
> +  /* Update BB counts in loop body.
> +     COUNT of exit = COUNT of pre-header
> +     COUNT of exit = COUNT of header * exit_edge_probability
> +     The new COUNT of header = old COUNT of header * old_exit_p / new_prob.  */
> +  bbs = get_loop_body (loop);
> +  scale_bbs_frequencies_profile_count (bbs, loop->num_nodes, after_num,
> +  after_den);
> +  free (bbs);
> +
> +  /* Update probability and count of the BB besides exit edge (maybe latch).  */
> +  FOR_EACH_EDGE (e, ei, exit->src->succs)
> +if (e != exit)
> +  break;

Merge from trunk to gccgo branch

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

I merged trunk revision 3089f5feef36810c625b5813370a97b4ecc841f8 to
the gccgo branch.

Ian

Re: [PATCH] v2: doc/implement-c.texi: About same-as-scalar-type volatile aggregate accesses, PR94600

2020-12-03 Thread Martin Sebor via Gcc-patches


On 12/3/20 12:14 PM, Hans-Peter Nilsson via Gcc-patches wrote:

Belatedly, here's an updated version, using Martin Sebor's
suggested wording from
"https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549580.html".
I added two commas, hopefully helpfully.  Albeit ok'd by Richard
Biener in
"https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549922.html",
better have this reviewed properly, including markup (none added).

Ok for trunk (gcc-11) and gcc-10?


Thanks for taking my suggestion!

These are just formatting nits but I would only further suggest
to enclose the name S (since it names a type) and the second
volatile in an @code{} directive (since it's a keyword).
(The volatile in volatile access is not one so it shouldn't
be formatted that way.)

Martin




---
We say very little about reads and writes to aggregate /
compound objects, just scalar objects (i.e. assignments don't
cause reads).  Let's say something safe about aggregate
objects, but only for those that are the same size as a scalar
type.

There's an equal-sounding section (Volatiles) in extend.texi,
but this seems a more appropriate place, as specifying the
behavior of a standard qualifier.

gcc:

2020-12-02  Hans-Peter Nilsson  
Martin Sebor  

PR middle-end/94600
* doc/implement-c.texi (Qualifiers implementation): Add blurb
about access to the whole of a volatile aggregate object, only for
same-size as a scalar object.
---
  gcc/doc/implement-c.texi | 5 +
  1 file changed, 5 insertions(+)

diff --git a/gcc/doc/implement-c.texi b/gcc/doc/implement-c.texi
index 692297b69c4..2e9158a2a45 100644
--- a/gcc/doc/implement-c.texi
+++ b/gcc/doc/implement-c.texi
@@ -576,6 +576,11 @@ are of scalar types, the expression is interpreted by GCC as a read of
  the volatile object; in the other cases, the expression is only evaluated
  for its side effects.
  
+When an object of an aggregate type, with the same size and alignment as a
+scalar type S, is the subject of a volatile access by an assignment
+expression or an atomic function, the access to it is performed as if the
+object's declared type were volatile S.
+
  @end itemize
  
  @node Declarators implementation

Re: [PATCH] libstdc++: Add C++ runtime support for new 128-bit long double format

2020-12-03 Thread Jonathan Wakely via Gcc-patches


On 03/12/20 20:07 -0300, Tulio Magno Quites Machado Filho via Libstdc++ wrote:

Jonathan Wakely via Libstdc++  writes:


diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index cbfdf4c6bad..d25842fef35 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -421,12 +425,43 @@ case "$target" in
 
port_specific_symbol_files="\$(top_srcdir)/config/os/gnu-linux/ldbl-extra.ver"
 case "$target" in
   powerpc*-*-linux*)
-   LONG_DOUBLE_COMPAT_FLAGS="$LONG_DOUBLE_COMPAT_FLAGS -mno-gnu-attribute" 
;;
+   LONG_DOUBLE_COMPAT_FLAGS="$LONG_DOUBLE_COMPAT_FLAGS -mno-gnu-attribute"
+# Check for IEEE128 support in libm:
+AC_CHECK_LIB(m, frexpf128,


I suggest replacing frexpf128 with __frexpieee128.

The former is available on a glibc that supports _Float128 (since glibc 2.26).
The latter is available on a glibc that supports binary128 long double (since
glibc 2.32).


Hmm, yes, you pointed me to __frexpieee128 a few months ago, but for
some reason I either didn't switch to using it, or lost a patch when
squashing and rebasing branches. Hopefully I just forgot to change it,
but I'll double check to make sure I haven't left any work on an old
branch. Thanks for suggesting it (again!)


Without this modification, the build fails on glibc between 2.26 and 2.31.

I'm also running a couple of tests here.
I believe this is showing a couple of glitches in glibc which I'm already
investigating.

Thanks!

--
Tulio Magno

[PATCH 2/2] Warn used and not used symbols in the same section

2020-12-03 Thread H.J. Lu via Gcc-patches

When SECTION_RETAIN is used, issue a warning if a symbol without the used
attribute is placed in the same section as a symbol with the used attribute,
like

int __attribute__((used,section(".data.foo"))) foo2 = 2;
int __attribute__((section(".data.foo"))) foo1 = 1;

since the assembler will put them in different sections with the same section
name.

gcc/

PR other/98121
* varasm.c (switch_to_section): Warn if symbol without used
attribute is placed in a section with symbol with used
attribute.

gcc/testsuite/

PR other/98121
* c-c++-common/attr-used-5.c: Updated.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.
---
 gcc/testsuite/c-c++-common/attr-used-5.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-6.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-7.c |  1 +
 gcc/testsuite/c-c++-common/attr-used-8.c |  1 +
 gcc/varasm.c | 19 +--
 5 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/c-c++-common/attr-used-5.c 
b/gcc/testsuite/c-c++-common/attr-used-5.c
index 9fc0d3834e9..f9199587c51 100644
--- a/gcc/testsuite/c-c++-common/attr-used-5.c
+++ b/gcc/testsuite/c-c++-common/attr-used-5.c
@@ -10,6 +10,7 @@ extern struct dtv_slotinfo_list *list;
 
 static int __attribute__ ((section ("__libc_freeres_fn")))
 free_slotinfo (struct dtv_slotinfo_list **elemp)
+/* { dg-warning "without 'used' attribute is placed in a section with" "" { 
target *-*-* } .-1 } */
 {
   if (!free_slotinfo (&(*elemp)->next))
 return 0;
diff --git a/gcc/testsuite/c-c++-common/attr-used-6.c 
b/gcc/testsuite/c-c++-common/attr-used-6.c
index 4526a692ee4..8f60c550be1 100644
--- a/gcc/testsuite/c-c++-common/attr-used-6.c
+++ b/gcc/testsuite/c-c++-common/attr-used-6.c
@@ -18,6 +18,7 @@ free_slotinfo (struct dtv_slotinfo_list **elemp)
 
 __attribute__ ((section ("__libc_freeres_fn")))
 void free_mem (void)
+/* { dg-warning "without 'used' attribute is placed in a section with" "" { 
target *-*-* } .-1 } */
 {
   free_slotinfo (&list);
 }
diff --git a/gcc/testsuite/c-c++-common/attr-used-7.c 
b/gcc/testsuite/c-c++-common/attr-used-7.c
index fba2706ffc1..cca1f2b8a33 100644
--- a/gcc/testsuite/c-c++-common/attr-used-7.c
+++ b/gcc/testsuite/c-c++-common/attr-used-7.c
@@ -3,6 +3,7 @@
 
 int __attribute__((used,section(".data.foo"))) foo2 = 2;
 int __attribute__((section(".data.foo"))) foo1 = 1;
+/* { dg-warning "without 'used' attribute is placed in a section with" "" { 
target *-*-* } .-1 } */
 
 /* { dg-final { scan-assembler ".data.foo,\"aw\"" { target R_flag_in_section } 
} } */
 /* { dg-final { scan-assembler ".data.foo,\"awR\"" { target R_flag_in_section 
} } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-8.c 
b/gcc/testsuite/c-c++-common/attr-used-8.c
index c8d65f65033..c797610e1d7 100644
--- a/gcc/testsuite/c-c++-common/attr-used-8.c
+++ b/gcc/testsuite/c-c++-common/attr-used-8.c
@@ -2,6 +2,7 @@
 /* { dg-options "-Wall -O2" } */
 
 int __attribute__((section(".data.foo"))) foo1 = 1;
+/* { dg-warning "without 'used' attribute is placed in a section with" "" { 
target *-*-* } .-1 } */
 int __attribute__((used,section(".data.foo"))) foo2 = 2;
 
 /* { dg-final { scan-assembler ".data.foo\n" { target R_flag_in_section } } } 
*/
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 7271705198c..fde00b5520e 100644
--- a/gcc/varasm.c
+++ b/gcc/varasm.c
@@ -7728,10 +7728,25 @@ switch_to_section (section *new_section, tree decl)
{
  /* If the SECTION_RETAIN bit doesn't match, switch to a new
 section.  */
+ tree used_decl, no_used_decl;
+
  if (DECL_PRESERVE_P (decl))
-   new_section->common.flags |= SECTION_RETAIN;
+   {
+ new_section->common.flags |= SECTION_RETAIN;
+ used_decl = decl;
+ no_used_decl = new_section->named.decl;
+   }
  else
-   new_section->common.flags &= ~SECTION_RETAIN;
+   {
+ new_section->common.flags &= ~SECTION_RETAIN;
+ used_decl = new_section->named.decl;
+ no_used_decl = decl;
+   }
+ warning_at (DECL_SOURCE_LOCATION (no_used_decl),
+ OPT_Wattributes,
+ "%+qD without %<used%> attribute is placed in a "
+ "section with %qD with %<used%> attribute",
+ no_used_decl, used_decl);
}
   else
return;
-- 
2.28.0

[PATCH 1/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-03 Thread H.J. Lu via Gcc-patches

When definitions marked with used attribute and unmarked definitions are
placed in the same section, switch to a new section if the SECTION_RETAIN
bit doesn't match.

gcc/

PR other/98121
* output.h (switch_to_section): Add a tree argument, default to
nullptr.
* varasm.c (get_section): If the SECTION_RETAIN bit doesn't match,
return and switch to a new section later.
(assemble_start_function): Pass decl to switch_to_section.
(assemble_variable): Likewise.
(switch_to_section): If the SECTION_RETAIN bit doesn't match,
switch to a new section.

gcc/testsuite/

PR other/98121
* c-c++-common/attr-used-5.c: New test.
* c-c++-common/attr-used-6.c: Likewise.
* c-c++-common/attr-used-7.c: Likewise.
* c-c++-common/attr-used-8.c: Likewise.
---
 gcc/output.h |  2 +-
 gcc/testsuite/c-c++-common/attr-used-5.c | 26 ++
 gcc/testsuite/c-c++-common/attr-used-6.c | 26 ++
 gcc/testsuite/c-c++-common/attr-used-7.c |  8 +++
 gcc/testsuite/c-c++-common/attr-used-8.c |  8 +++
 gcc/varasm.c | 28 
 6 files changed, 93 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-5.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-6.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-7.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-8.c

diff --git a/gcc/output.h b/gcc/output.h
index fa8ace1f394..1f9af46da1d 100644
--- a/gcc/output.h
+++ b/gcc/output.h
@@ -548,7 +548,7 @@ extern void switch_to_other_text_partition (void);
 extern section *get_cdtor_priority_section (int, bool);
 
 extern bool unlikely_text_section_p (section *);
-extern void switch_to_section (section *);
+extern void switch_to_section (section *, tree = nullptr);
 extern void output_section_asm_op (const void *);
 
 extern void record_tm_clone_pair (tree, tree);
diff --git a/gcc/testsuite/c-c++-common/attr-used-5.c 
b/gcc/testsuite/c-c++-common/attr-used-5.c
new file mode 100644
index 000..9fc0d3834e9
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-5.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+struct dtv_slotinfo_list
+{
+  struct dtv_slotinfo_list *next;
+};
+
+extern struct dtv_slotinfo_list *list;
+
+static int __attribute__ ((section ("__libc_freeres_fn")))
+free_slotinfo (struct dtv_slotinfo_list **elemp)
+{
+  if (!free_slotinfo (&(*elemp)->next))
+return 0;
+  return 1;
+}
+
+__attribute__ ((used, section ("__libc_freeres_fn")))
+static void free_mem (void)
+{
+  free_slotinfo (&list);
+}
+
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"ax\"" { target 
R_flag_in_section } } } */
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"axR\"" { target 
R_flag_in_section } } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-6.c 
b/gcc/testsuite/c-c++-common/attr-used-6.c
new file mode 100644
index 000..4526a692ee4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-6.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+struct dtv_slotinfo_list
+{
+  struct dtv_slotinfo_list *next;
+};
+
+extern struct dtv_slotinfo_list *list;
+
+static int __attribute__ ((used, section ("__libc_freeres_fn")))
+free_slotinfo (struct dtv_slotinfo_list **elemp)
+{
+  if (!free_slotinfo (&(*elemp)->next))
+return 0;
+  return 1;
+}
+
+__attribute__ ((section ("__libc_freeres_fn")))
+void free_mem (void)
+{
+  free_slotinfo (&list);
+}
+
+/* { dg-final { scan-assembler "__libc_freeres_fn\n" } } */
+/* { dg-final { scan-assembler "__libc_freeres_fn,\"axR\"" { target 
R_flag_in_section } } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-7.c 
b/gcc/testsuite/c-c++-common/attr-used-7.c
new file mode 100644
index 000..fba2706ffc1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-7.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+int __attribute__((used,section(".data.foo"))) foo2 = 2;
+int __attribute__((section(".data.foo"))) foo1 = 1;
+
+/* { dg-final { scan-assembler ".data.foo,\"aw\"" { target R_flag_in_section } 
} } */
+/* { dg-final { scan-assembler ".data.foo,\"awR\"" { target R_flag_in_section 
} } } */
diff --git a/gcc/testsuite/c-c++-common/attr-used-8.c 
b/gcc/testsuite/c-c++-common/attr-used-8.c
new file mode 100644
index 000..c8d65f65033
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/attr-used-8.c
@@ -0,0 +1,8 @@
+/* { dg-do compile } */
+/* { dg-options "-Wall -O2" } */
+
+int __attribute__((section(".data.foo"))) foo1 = 1;
+int __attribute__((used,section(".data.foo"))) foo2 = 2;
+
+/* { dg-final { scan-assembler ".data.foo\n" { target R_flag_in_section } } } 
*/
+/* { dg-final { scan-assembler ".data.foo,\"awR\"" { target R_flag_in_section 
} } } */
diff --git a/gcc/varasm.c b/gcc/varasm.c
index 961d2d6fe3b..7271705

[PATCH 0/2] Switch to a new section if the SECTION_RETAIN bit doesn't match

2020-12-03 Thread H.J. Lu via Gcc-patches

When SECTION_RETAIN is used, definitions marked with used attribute and
unmarked definitions are placed in the same section.  Instead of issuing
an error:

[hjl@gnu-cfl-2 gcc]$ /usr/gcc-11.0.0-x32/bin/gcc -S c.c 
-fdiagnostics-plain-output
c.c:2:49: error: ‘foo1’ causes a section type conflict with ‘foo2’
c.c:1:54: note: ‘foo2’ was declared here
[hjl@gnu-cfl-2 gcc]$

the first patch switches to a new section if the SECTION_RETAIN bit
doesn't match.  The second optional patch issues a warning:

[hjl@gnu-cfl-2 gcc]$ ./xgcc -B./ -S c.c -fdiagnostics-plain-output
c.c:2:49: warning: ‘foo1’ without ‘used’ attribute is placed in a section with 
‘foo2’ with ‘used’ attribute [-Wattributes]
[hjl@gnu-cfl-2 gcc]$

H.J. Lu (2):
  Switch to a new section if the SECTION_RETAIN bit doesn't match
  Warn used and not used symbols in the same section

 gcc/output.h |  2 +-
 gcc/testsuite/c-c++-common/attr-used-5.c | 27 +++
 gcc/testsuite/c-c++-common/attr-used-6.c | 27 +++
 gcc/testsuite/c-c++-common/attr-used-7.c |  9 +
 gcc/testsuite/c-c++-common/attr-used-8.c |  9 +
 gcc/varasm.c | 43 +---
 6 files changed, 112 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-5.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-6.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-7.c
 create mode 100644 gcc/testsuite/c-c++-common/attr-used-8.c

-- 
2.28.0

[r11-5719 Regression] FAIL: g++.dg/cpp2a/concepts-nodiscard1.C -std=c++2a (test for excess errors) on Linux/x86_64

2020-12-03 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

5ea36d20c352a3ca436aa764404f6210b090866b is the first bad commit
commit 5ea36d20c352a3ca436aa764404f6210b090866b
Author: Jason Merrill 
Date:   Thu Dec 3 13:55:51 2020 -0500

c++: Add testcase for PR98019

caused

FAIL: g++.dg/cpp2a/concepts-nodiscard1.C  -std=c++2a (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5719/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp2a/concepts-nodiscard1.C 
--target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp2a/concepts-nodiscard1.C 
--target_board='unix{-m32\ -march=cascadelake}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp2a/concepts-nodiscard1.C 
--target_board='unix{-m64}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/cpp2a/concepts-nodiscard1.C 
--target_board='unix{-m64\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH] libstdc++: Add C++ runtime support for new 128-bit long double format

2020-12-03 Thread Tulio Magno Quites Machado Filho via Gcc-patches

Jonathan Wakely via Libstdc++  writes:

> diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
> index cbfdf4c6bad..d25842fef35 100644
> --- a/libstdc++-v3/configure.ac
> +++ b/libstdc++-v3/configure.ac
> @@ -421,12 +425,43 @@ case "$target" in
>  
> port_specific_symbol_files="\$(top_srcdir)/config/os/gnu-linux/ldbl-extra.ver"
>  case "$target" in
>powerpc*-*-linux*)
> - LONG_DOUBLE_COMPAT_FLAGS="$LONG_DOUBLE_COMPAT_FLAGS -mno-gnu-attribute" 
> ;;
> + LONG_DOUBLE_COMPAT_FLAGS="$LONG_DOUBLE_COMPAT_FLAGS -mno-gnu-attribute"
> +# Check for IEEE128 support in libm:
> +AC_CHECK_LIB(m, frexpf128,

I suggest replacing frexpf128 with __frexpieee128.

The former is available on a glibc that supports _Float128 (since glibc 2.26).
The latter is available on a glibc that supports binary128 long double (since
glibc 2.32).

Without this modification, the build fails on glibc between 2.26 and 2.31.
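
The version window described above can be sketched as a small C++ helper.
This is only an illustration: the enum and function names are hypothetical,
and the two thresholds (2.26 for frexpf128, 2.32 for __frexpieee128) are
taken from this message, not from the libstdc++ configure logic.

```cpp
#include <cassert>

// Hypothetical classification of a glibc version by which of the two
// symbols discussed above it is expected to provide.
enum class ieee128_libm { none, float128_only, ieee128_long_double };

// True if MAJOR.MINOR is at least WANT_MAJOR.WANT_MINOR.
inline bool at_least (int major, int minor, int want_major, int want_minor)
{
  return major > want_major || (major == want_major && minor >= want_minor);
}

inline ieee128_libm classify_glibc (int major, int minor)
{
  assert (major >= 0 && minor >= 0);
  if (at_least (major, minor, 2, 32))
    return ieee128_libm::ieee128_long_double;  // __frexpieee128 expected
  if (at_least (major, minor, 2, 26))
    return ieee128_libm::float128_only;        // only frexpf128 expected
  return ieee128_libm::none;
}
```

Versions 2.26 through 2.31 land in the middle category: a check for
frexpf128 succeeds there even though binary128 long double support is
absent, which is the failing case reported above.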

I'm also running a couple of tests here.
I believe this is showing a couple of glitches in glibc which I'm already
investigating.

Thanks!

-- 
Tulio Magno

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Qing Zhao via Gcc-patches

Hi, Richard,

Thanks a lot for your suggestion.

Actually, I like this idea. 

My understanding of your suggestion is:

1. During gimplification phase:

For each auto-variable that does not have an explicit initializer, insert the 
following initializer for it:

X = DEFERRED_INIT (X, INIT)

In which, DEFERRED_INIT is an internal const function, which can be defined as:

DEF_INTERNAL_FN (DEFERRED_INIT, ECF_CONST | ECF_LEAF | ECF_NOTHROW, NULL)

Its two arguments are:

1st argument:   this uninitialized auto-variable;
2nd argument:  initialized pattern (zero | pattern);

2.  During tree to SSA phase:  

No change, the current tree to SSA phase should automatically change the above 
newly inserted statement to

X_2 = DEFERRED_INIT (X_1(D), INIT);
with all other uses of X_1(D) being replaced by X_2. 

3. During expanding phase:

Expand each call to “DEFERRED_INIT (X, INIT)” to zero or pattern, depending on 
“INIT”. 

Is the above understanding correct? Did I miss anything? 
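
As a rough user-level model of the three steps above, the effect of
DEFERRED_INIT can be sketched in C++.  Everything here is hypothetical: the
names init_kind, pattern_byte, deferred_init and demo are invented for
illustration, the 0xAA fill byte is an assumption (no pattern value is given
in this thread), and the real mechanism would be an internal function
expanded by the compiler rather than a library template.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// Hypothetical model of the INIT argument: which fill to use at expansion.
enum class init_kind { zero, pattern };

// Assumed fill byte for pattern mode; the thread does not specify one.
constexpr unsigned char pattern_byte = 0xAA;

// Models X = DEFERRED_INIT (X, INIT).  The first parameter is taken by
// reference and never read: it only keeps a use of the uninitialized
// variable visible, as the first argument does in the proposal.
template <typename T>
T deferred_init (T &, init_kind kind)
{
  static_assert (std::is_trivially_copyable<T>::value,
                 "sketch only handles trivially copyable types");
  T result;
  std::memset (&result, kind == init_kind::zero ? 0 : pattern_byte,
               sizeof result);
  return result;
}

// An auto variable without an explicit initializer, followed by the
// statement the gimplifier would insert in step 1.
inline unsigned demo (init_kind kind)
{
  unsigned x;
  x = deferred_init (x, kind);
  if (kind == init_kind::zero)
    assert (x == 0u);  // every byte is 0, so the value is 0
  return x;
}
```

The point of routing the value through a call is the same as in the
proposal: the fill value is not visible at the point of use until the call
is expanded.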

More comments and questions are embedded below:


> On Dec 3, 2020, at 11:32 AM, Richard Sandiford  
> wrote:
> 
> Richard Biener via Gcc-patches  writes:
>> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>>> Another issue is, in order to check whether an auto-variable has 
>>> initializer, I plan to add a new bit in “decl_common” as:
>>>  /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>>>  unsigned decl_is_initialized :1;
>>> 
>>> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
>>> #define DECL_IS_INITIALIZED(NODE) \
>>>  (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>>> 
>>> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
>>> even though DECL_INITIAL might be NULLed.
>> 
>> For locals it would be more reliable to set this flag during gimplification.
>> 
>>> Do you have any comment and suggestions?
>> 
>> As said above - do you want to cover registers as well as locals?  I'd do
>> the actual zeroing during RTL expansion instead since otherwise you
>> have to figure out yourself whether a local is actually used (see 
>> expand_stack_vars)
>> 
>> Note that optimization will already have made use of "uninitialized" state
>> of locals so depending on what the actual goal is here "late" may be too 
>> late.
> 
> Haven't thought about this much, so it might be a daft idea, but would a
> compromise be to use a const internal function:
> 
>  X1 = .DEFERRED_INIT (X0, INIT)
> 
> where the X0 argument is an uninitialised value and the INIT argument
> describes the initialisation pattern?  So for a decl we'd have:
> 
>  X = .DEFERRED_INIT (X, INIT)
> 
> and for an SSA name we'd have:
> 
>  X_2 = .DEFERRED_INIT (X_1(D), INIT)
> 
> with all other uses of X_1(D) being replaced by X_2.  The idea is that:
> 
> * Having the X0 argument would keep the uninitialised use of the
>  variable around for the later warning passes.
> 
> * Using a const function should still allow the UB to be deleted as dead
>  if X1 isn't needed.

So, current GCC will delete the UB as dead code when X1 is not needed.  With
the new option, should we keep this behavior? 

> 
> * Having a function in the way should stop passes from taking advantage
>  of direct uninitialised uses for optimisation.

This will resolve the issue we raised before with directly adding an “artificial” 
zero-initializer during gimplification. 

However, I am wondering whether the newly added const internal function will 
impact optimization and thereby change the behavior of the uninitialized 
analysis? 
> 
> This means we won't be able to optimise based on the actual init
> value at the gimple level, but that seems like a fair trade-off.

Yes, with this approach: 

At gimple level, we will not be able to optimize on the newly added init values;
At RTL level, we will optimize on the newly added init values;
RTL optimizations will be able to eliminate any redundancy introduced by these 
new initializations, reducing the cost of this option. 



> AIUI this is really a security feature or anti-UB hardening feature
> (in the sense that users are more likely to see predictable behaviour
> “in the field” even if the program has UB).

Yes, this option is for security purposes, and is currently used in 
production by Microsoft, Apple, Google, etc. 

Qing
> 
> Thanks,
> Richard

[pushed] c++: Fix bootstrap on 32-bit hosts [PR91828]

2020-12-03 Thread Jason Merrill via Gcc-patches

Using the releasing_vec op[] with an int index was breaking on 32-bit hosts
because of ambiguity with the built-in operator and the conversion
function.  Since the built-in operator has a ptrdiff_t, this was fine on
64-bit targets where ptrdiff_t is larger than int, but broke on 32-bit
targets where it's the same as int, making the conversion for that argument
better than the member function.  Fixed by changing the member function to
also use ptrdiff_t for the index.
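
The overload situation described above can be reproduced with a small
stand-in class.  This is a sketch under stated assumptions: vec_like and
element_at are hypothetical names, and the two arrays exist only so that it
is observable which operator[] was selected; it is not the releasing_vec
code itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for releasing_vec; only the overload shape matters.
struct vec_like
{
  int elems[4] = {10, 20, 30, 40};   // what the member operator[] indexes
  int raw[4] = {-1, -2, -3, -4};     // what the pointer conversion exposes

  // Pointer conversion, analogous to releasing_vec's operator vec_t *().
  // It makes the built-in operator[] (T *, ptrdiff_t) a candidate.
  operator int * () { return raw; }

  // Member subscript with the same index type as the built-in candidate.
  // Both candidates now convert an int index identically, so the member
  // wins because it needs no user-defined conversion on the object.
  int &operator[] (std::ptrdiff_t i)
  {
    assert (i >= 0 && i < 4);
    return elems[i];
  }
};

// Indexing with a plain int, the situation described in the message.
inline int element_at (vec_like &v, int i) { return v[i]; }
```

If the member took unsigned instead, a host where ptrdiff_t is int would
give the built-in candidate an exact match on the index while the member
still wins on the object argument, so neither candidate is better overall
and the call is ambiguous.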

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

* cp-tree.h (releasing_vec::operator[]): Change parameter type to
ptrdiff_t.
---
 gcc/cp/cp-tree.h | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index 081ede24e96..f28291e46d7 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -975,8 +975,10 @@ public:
   operator vec_t *() const { return v; }
   vec_t ** operator& () { return &v; }
 
-  /* Breaks pointer/value consistency for convenience.  */
-  tree& operator[] (unsigned i) const { return (*v)[i]; }
+  /* Breaks pointer/value consistency for convenience.  This takes ptrdiff_t
+ rather than unsigned to avoid ambiguity with the built-in operator[]
+ (bootstrap/91828).  */
+  tree& operator[] (ptrdiff_t i) const { return (*v)[i]; }
 
   ~releasing_vec() { release_tree_vector (v); }
 private:

base-commit: dce6c58db87ebf7f4477bd3126228e73e497
-- 
2.27.0

Re: [17/23] recog: Add a class for propagating into insns

2020-12-03 Thread Jeff Law via Gcc-patches




On 11/13/20 1:20 AM, Richard Sandiford via Gcc-patches wrote:
> This patch adds yet another way of propagating into an instruction and
> simplifying the result.  (The net effect of the series is to keep the
> total number of propagation approaches the same though, since a later
> patch removes the fwprop.c routines.)
>
> One of the drawbacks of the validate_replace_* routines is that
> they only do simple simplifications, mostly canonicalisations:
>
>   /* Do changes needed to keep rtx consistent.  Don't do any other
>  simplifications, as it is not our job.  */
>   if (simplify)
> simplify_while_replacing (loc, to, object, op0_mode);
>
> But substituting can often lead to real simplification opportunities.
> simplify-rtx.c:simplify_replace_rtx does fully simplify the result,
> but it only operates on specific rvalues rather than full instruction
> patterns.  It is also nondestructive, which means that it returns a
> new rtx whenever a substitution or simplification was possible.
> This can create quite a bit of garbage rtl in the context of a
> speculative recog, where changing the contents of a pointer is
> often enough.
>
> The new routines are therefore supposed to provide simplify_replace_rtx-
> style substitution in recog.  They go to some effort to prevent garbage
> rtl from being created.
>
> At the moment, the new routines fail if the pattern would still refer
> to the old "from" value in some way.  That might be unnecessary in
> some contexts; if so, it could be put behind a configuration parameter.
>
> gcc/
>   * recog.h (insn_propagation): New class.
>   * recog.c (insn_propagation::apply_to_mem_1): New function.
>   (insn_propagation::apply_to_rvalue_1): Likewise.
>   (insn_propagation::apply_to_lvalue_1): Likewise.
>   (insn_propagation::apply_to_pattern_1): Likewise.
>   (insn_propagation::apply_to_pattern): Likewise.
>   (insn_propagation::apply_to_rvalue): Likewise.
>

OK
jeff

Re: [PATCH v2 08/31] jump: Also handle jumps wrapped in UNSPEC or UNSPEC_VOLATILE

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/2/20 8:50 PM, Maciej W. Rozycki wrote:
> VAX has interlocked branch instructions used for atomic operations and
> we want to have them wrapped in UNSPEC_VOLATILE so as not to have code
> carried across.  This however breaks with jump optimization and leads
> to an ICE in the build of libbacktrace like:
>
> .../libbacktrace/mmap.c:190:1: internal compiler error: in 
> fixup_reorder_chain, at cfgrtl.c:3934
>   190 | }
>   | ^
> 0x1087d46b fixup_reorder_chain
>   .../gcc/cfgrtl.c:3934
> 0x1087f29f cfg_layout_finalize()
>   .../gcc/cfgrtl.c:4447
> 0x1087c74f execute
>   .../gcc/cfgrtl.c:3662
>
> on RTL like:
>
> (jump_insn 18 17 150 4 (unspec_volatile [
> (set (pc)
> (if_then_else (eq (zero_extract:SI (mem/v:SI (reg/f:SI 23 [ 
> _2 ]) [-1  S4 A32])
> (const_int 1 [0x1])
> (const_int 0 [0]))
> (const_int 1 [0x1]))
> (label_ref 20)
> (pc)))
> (set (zero_extract:SI (mem/v:SI (reg/f:SI 23 [ _2 ]) [-1  S4 A32])
> (const_int 1 [0x1])
> (const_int 0 [0]))
> (const_int 1 [0x1]))
> ] 101) ".../libbacktrace/mmap.c":135:14 158 {jbbssisi}
>  (nil)
>  -> 20)
>
> when those branches are enabled with a follow-up change.  Also showing
> with:
>
> FAIL: gcc.dg/pr61756.c (internal compiler error)
>
> Handle branches wrapped in UNSPEC_VOLATILE then and, for consistency,
> also in UNSPEC.  The presence of UNSPEC_VOLATILE will prevent such
> branches from being removed as they won't be accepted by `onlyjump_p',
> we just need to let them through.
>
>   gcc/
>   * jump.c (pc_set): Also accept a jump wrapped in UNSPEC or
>   UNSPEC_VOLATILE.
>   (any_uncondjump_p, any_condjump_p): Update comment accordingly.
> ---
> On Fri, 20 Nov 2020, Jeff Law wrote:
>
>>> Handle branches wrapped in UNSPEC_VOLATILE then and, for consistency,
>>> also in UNSPEC.  The presence of UNSPEC_VOLATILE will prevent such
>>> branches from being removed as they won't be accepted by `onlyjump_p',
>>> we just need to let them through.
>>>
>>> gcc/
>>> * jump.c (pc_set): Also accept a jump wrapped in UNSPEC or
>>> UNSPEC_VOLATILE.
>>> (any_uncondjump_p, any_condjump_p): Update comment accordingly.
>> I've got some concerns that there may be users of pc_set where handling
>> UNSPECs would be undesirable.  For example the uses in cfgcleanup.
>  I've gone through the use of `pc_set' and `any_condjump_p' in detail 
> then.  Mind that I wouldn't claim expertise with this stuff, just common 
> sense backed with documentation and source code available.
>
>  It appears safe to me though, as the really dangerous cases where a jump 
> is to be removed, in `thread_jump' and `outgoing_edges_match', are guarded 
> with calls to `onlyjump_p' (the reference to which from the description of 
> `any_condjump_p' making a promise it will guard the relevant cases is what 
> made me pretty confident with my change being technically correct), which 
> will reject any jumps wrapped into UNSPEC_VOLATILE or even UNSPEC (though 
> arguably the latter case might be an overkill, and we could loosen that 
> restriction) on the premise of failing `single_set', which only accepts a 
> SET, possibly wrapped into a PARALLEL.
>
>  Similarly with branch deletion in `cfg_layout_redirect_edge_and_branch' 
> and attempted transformations using `cond_exec_get_condition'.
>
>  Those `pc_set' calls that you mention are then only used once the 
> containing code has been qualified with `onlyjump_p', so they won't be 
> ever called with branches wrapped into UNSPEC_VOLATILE.  Likewise the 
> `pc_set' calls in `cprop_jump' and `bypass_block'.
>
>  The calls in `try_optimize_cfg' trying to convert a conditional branch 
> into a corresponding conditional return (necessary to analyse of course 
> though not relevant for the VAX; oh, how I long for the Z80) both rely on 
> `redirect_jump' and possibly `invert_jump', and they're supposed to fail 
> if a suitably modified original rtx cannot be matched with any RTL pattern 
> (though AFAICS `redirect_exp_1' will just fail due to the lack of explicit 
> UNSPEC_VOLATILE/UNSPEC support and so will `invert_jump' as it relies on 
> it too).
>
>  Except for one case the uses in ifcvt appear safe to me: all legs except 
> for ones calling into `dead_or_predicable' refer to `onlyjump_p'.  Then in 
> `dead_or_predicable' we have two cases, NCE and CE.  The NCE one there is 
> safe due to `can_move_insns_across' rejecting the move in the case of any 
> UNSPEC_VOLATILE insn in either block.  The CE one isn't because ultimately 
> it will delete the jump without validating it with `onlyjump_p' AFAICT.
>
>  This will not affect VAX as it doesn't have conditional execution, and is 
> not a fault in my change.  I think it needs fixing though, and I will post 
> a patch separately, along with a min

[r11-5712 Regression] FAIL: g++.dg/template/pr98116.C -std=c++2a (test for excess errors) on Linux/x86_64

2020-12-03 Thread sunil.k.pandey via Gcc-patches

On Linux/x86_64,

7254a78cf4c419a9b9361289d8c535130cf1dfd0 is the first bad commit
commit 7254a78cf4c419a9b9361289d8c535130cf1dfd0
Author: Nathan Sidwell 
Date:   Thu Dec 3 08:40:43 2020 -0800

c++: Testcases [PR 98115]

caused

FAIL: g++.dg/template/pr98116.C  -std=c++2a (internal compiler error)
FAIL: g++.dg/template/pr98116.C  -std=c++2a (test for excess errors)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r11-5712/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/template/pr98116.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="dg.exp=g++.dg/template/pr98116.C --target_board='unix{-m32\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 8:29 AM, Kumar, Venkataramanan via Gcc-patches wrote:
> [AMD Public Use]
>
>
> Hi Maintainers,
>
> PFA, the patch that enables support for the next generation AMD Zen3 CPU via 
> -march=znver3.
> This is a very basic enablement patch. As of now the cost, tuning and 
> scheduler changes are kept same as znver2.
> Further changes to the cost and tunings will be done later.
>
> Ok for trunk ?
>
> Regards,
> Venkat.
>
>
> X86_64-Enable-support-for-next-generation-AMD-Znver3.patch
>
> From ef7bd7d02e98d86ff32fa0dad6bc1d0802bd32aa Mon Sep 17 00:00:00 2001
> From: Venkataramanan Kumar 
> Date: Thu, 3 Dec 2020 17:32:53 +0530
> Subject: [PATCH] X86_64: Enable support for next generation AMD Zen3 CPU.
>
> 2020-12-03  Venkataramanan Kumar  
>   Sharavan Kumar  
>
> gcc/ChangeLog:
>
>   * common/config/i386/cpuinfo.h (get_amd_cpu) recognize znver3.
>   * common/config/i386/i386-common.c (processor_names): Add
>   znver3.
>   (processor_alias_table): Add znver3 and AMDFAM19H entry.
>   * common/config/i386/i386-cpuinfo.h (processor_types): Add
>   AMDFAM19H.
>   (processor_subtypes): AMDFAM19H_ZNVER3.
>   * config.gcc (i[34567]86-*-linux* | ...): Likewise.
>   * config/i386/driver-i386.c: (host_detect_local_cpu): Let
>   -march=native recognize znver3 processors.
>   * config/i386/i386-c.c (ix86_target_macros_internal): Add
>   znver3.
>   * config/i386/i386-options.c (m_znver3): New definition.
>   (m_ZNVER): Include m_znver3.
>   (processor_cost_table): Add znver3.
>   * config/i386/i386.c (ix86_reassociation_width): Likewise.
>   * config/i386/i386.h (TARGET_znver3): New definition.
>   (enum processor_type): Add PROCESSOR_ZNVER3.
>   * config/i386/i386.md (define_attr "cpu"): Add znver3.
>   * config/i386/x86-tune-sched.c: (ix86_issue_rate): Likewise.
>   (ix86_adjust_cost): Likewise.
>   * config/i386/x86-tune.def (X86_TUNE_AVOID_256FMA_CHAINS:
>   Likewise.
>   * config/i386/znver1.md: Add new reservations for znver3.
>   * doc/extend.texi: Add details about znver3.
>   * doc/invoke.texi: Likewise.
Normally I would consider this inappropriate for stage3, but AFAICT the
risk profile of this patch should be small.  Ultimately it's up to Uros
and I'll support whatever decision he makes.


Jeff

Re: [PATCH] c++: Distinguish unsatisfaction vs errors during satisfaction [PR97093]

2020-12-03 Thread Jason Merrill via Gcc-patches


On 12/3/20 9:24 AM, Patrick Palka wrote:

During satisfaction, the flag info.noisy() controls three things:
whether to diagnose fatal errors (such as the satisfaction value of an
atom being non-bool); whether to diagnose unsatisfaction; and whether to
bypass the satisfaction cache.

This flag turns out to be too coarse however, for sometimes we need to
diagnose fatal errors but not unsatisfaction, in particular when replaying
an erroneous satisfaction result from constraint_satisfaction_value,
evaluate_concept_check and tsubst_nested_requirement.

And we sometimes need to bypass the satisfaction cache but not diagnose
unsatisfaction, in particular when evaluating the branches of a
disjunction when info.noisy() is true.  Currently, satisfy_disjunction
first quietly evaluates each branch, but doing so causes satisfy_atom
to insert re-normalized atoms into the satisfaction cache when
diagnosing unsatisfaction of the overall constraint.  This is ultimately
the source of PR97093.

To that end, this patch adds the info.diagnose_unsatisfaction_p() flag
which refines the info.noisy() flag.  During satisfaction info.noisy()
now controls whether to diagnose fatal errors, and
info.diagnose_unsatisfaction_p() controls whether to additionally
diagnose unsatisfaction.  This enables us to address the above two
issues straightforwardly.
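
As a rough illustration of the flag split described above, here is a minimal
sketch.  It assumes a hypothetical toy_sat_info type and toy_diagnose helper;
neither is the actual constraint.cc code, they only model which flag gates
which kind of diagnostic.

```cpp
#include <string>

// Hypothetical stand-in for the patch's sat_info; not the real type.
struct toy_sat_info
{
  bool noisy_flag;      // diagnose fatal errors
  bool diagnose_unsat;  // additionally diagnose unsatisfaction

  bool noisy () const { return noisy_flag; }
  bool diagnose_unsatisfaction_p () const { return diagnose_unsat; }
};

enum class toy_result { satisfied, unsatisfied, erroneous };

// Return the kind of diagnostic that would be issued, or "" for none.
inline std::string
toy_diagnose (toy_result r, const toy_sat_info &info)
{
  // Fatal errors are reported whenever noisy () is set.
  if (r == toy_result::erroneous && info.noisy ())
    return "error";
  // Plain unsatisfaction is only explained when the finer flag is set.
  if (r == toy_result::unsatisfied && info.diagnose_unsatisfaction_p ())
    return "note";
  return "";
}
```

With noisy_flag set and diagnose_unsat clear, an erroneous result is still
reported while an unsatisfied one stays quiet, which is the combination the
single noisy() flag could not express.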



This flag refinement also allows us to fold the diagnose_foo_requirement
routines into the corresponding tsubst_foo_requirement ones.  Here, the
flags take on slightly different meanings: info.noisy() controls whether
to diagnose invalid types and expressions inside the requires-expression,
and info.diagnose_unsatisfaction_p() controls whether to diagnose the
overall unsatisfaction of the requires-expression.



Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
cmcstl2 and range-v3.  Does this look OK for trunk?

gcc/cp/ChangeLog:

PR c++/97093
* constraint.cc (struct sat_info): Define.
(tsubst_valid_expression_requirement): Take a sat_info instead
of subst_info.  Perform the substitution quietly first.  Fold in
error-replaying code from diagnose_valid_expression.
(tsubst_simple_requirement): Take a sat_info instead of
subst_info.
(tsubst_type_requirement_1): New.  Fold in error-replaying code
from diagnose_valid_type.
(tsubst_type_requirement): Use it. Take a sat_info instead of
subst_info.
(tsubst_compound_requirement): Likewise.  Fold in
error-replaying code from diagnose_compound_requirement.
(tsubst_nested_requirement): Take a sat_info instead of
subst_info.  Perform the substitution quietly first.  Fold in
error-replaying code from diagnose_nested_requirement.
(tsubst_requirement): Take a sat_info instead of subst_info.
(tsubst_requirement_body): Likewise.
(tsubst_requires_expr): Split into two versions, one that takes
a sat_info argument and another that takes a complain and
in_decl argument.  Remove outdated documentation.  Document the
effects of the sat_info argument.
(tsubst_parameter_mapping): Take a sat_info instead of a
subst_info.
(satisfy_conjunction): Likewise.
(satisfy_disjunction): Likewise.  Evaluate each branch with
unsatisfaction diagnostics disabled rather than completely
quietly, and short circuit when an erroneous branch is
encountered.
(satisfy_atom):  Take a sat_info instead of a subst_info.  Fix a
comment.  Use diagnose_unsatisfaction_p() instead of noisy() to
guard replaying of satisfaction failure.  Always check
constantness quietly first and consistently return
error_mark_node when the value is non-constant.
(satisfy_constraint_r): Document the effects of the sat_info
argument.  Take a sat_info instead of a subst_info.
(satisfy_constraint): Take a sat_info instead of a subst_info.
(satisfy_associated_constraints): Likewise.
(satisfy_constraint_expression): Likewise.
(satisfy_declaration_constraints): Likewise.
(constraint_satisfaction_value): Likewise.  Adjust.  XXX
(constraints_satisfied_p): Adjust.
(evaluate_concept_check): Adjust.
(diagnose_trait_expr): Make static.  Take a template args vector
instead of a parameter mapping.
(diagnose_atomic_constraint): Take a sat_info instead of a
subst_info.  Adjust call to diagnose_trait_expr.  Call
tsubst_requires_expr instead of diagnose_requires_expr.
(diagnose_constraints): Adjust calls to
constraint_satisfaction_value.
(diagnose_valid_expression): Remove.
(diagnose_valid_type): Likewise.
(diagnose_simple_requirement): Likewise.
(diagnose_compound_requirement): Likewise.
(diagnose_type_requirement): Likewise.
(diagnose_nested_requirement): Likewise.

Re: [PATCH 0/6] Add missing calls to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:34 AM, Maciej W. Rozycki wrote:
> Hi,
>
>  As discussed here: 
>
> 
>
> here is a small patch series adding missing calls to `onlyjump_p' around 
> `any_condjump_p' and `any_uncondjump_p' use where the jump in question is 
> about to be removed.
>
>  Note that I have included unrelated though contextually connected 6/6 as 
> an RFC to verify whether this potential anomaly I have spotted has been 
> intentional.  I'll be happy to drop it if that is the case.  The remaining 
> changes are I believe actual bug fixes.
I doubt it's intentional.  I'd tend to think this specific patch in the
series should wait until gcc-12 out of an abundance of caution.    I've
ACK'd the rest.

Jeff
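
To make the guard pattern in this series concrete, here is a toy model.  The
toy_insn type and the toy_* predicates are hypothetical and only mimic the
roles of `any_condjump_p'/`any_uncondjump_p' and `onlyjump_p'; they are not
GCC's RTL interfaces.

```cpp
// Toy model only: these are not GCC's rtx types or its real predicates.
struct toy_insn
{
  bool is_jump;           // the insn transfers control
  bool has_side_effects;  // it also does something else besides jumping
};

// Plays the role of the "is this some kind of jump?" check.
inline bool
toy_any_jump_p (const toy_insn &i)
{ return i.is_jump; }

// Plays the role of the "is this a jump and nothing else?" check.
inline bool
toy_onlyjump_p (const toy_insn &i)
{ return i.is_jump && !i.has_side_effects; }

// A jump may be removed only if it is a jump and nothing but a jump.
inline bool
toy_can_remove_jump (const toy_insn &i)
{ return toy_any_jump_p (i) && toy_onlyjump_p (i); }
```

The point of the second check is that a jump carrying side effects passes the
first test but must still be kept.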

Re: [PATCH 5/6] loop-doloop: Add missing call to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:35 AM, Maciej W. Rozycki wrote:
> Keep any jump that has side effects as those must not be removed.
>
>   gcc/
>   * loop-doloop.c (add_test): Only remove the jump if `onlyjump_p'.
OK
jeff

Re: [PATCH 4/6] cfgrtl: Add missing call to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:34 AM, Maciej W. Rozycki wrote:
> If any unconditional jumps within a block have side effects then the 
> block cannot be considered empty.
>
>   gcc/
>   * cfgrtl.c (rtl_block_empty_p): Return false if `!onlyjump_p' 
>   too.
OK
jeff

Re: [PATCH 3/6] sel-sched-ir: Add missing call to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:34 AM, Maciej W. Rozycki wrote:
> Do not try to remove a conditional jump if it has side effects.
>
>   gcc/
>   * sel-sched-ir.c (maybe_tidy_empty_bb): Only try to remove a 
>   conditional jump if `onlyjump_p'.
OK
jeff

Re: [PATCH 2/6] loop-iv: Add missing calls to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:34 AM, Maciej W. Rozycki wrote:
> Ignore jumps that have side effects in loop processing as pasting the 
> body of a loop multiple times within is semantically equivalent to jump 
> deletion (between the iterations unrolled) even if we do not physically 
> delete the jump RTL insn.
>
>   gcc/
>   * loop-iv.c (simplify_using_initial_values): Only process jumps 
>   that match `onlyjump_p'.
>   (check_simple_exit): Likewise.
OK
jeff

Re: [PATCH 1/6] ifcvt: Add missing call to `onlyjump_p'

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 4:34 AM, Maciej W. Rozycki wrote:
> Do not convert a conditional jump into conditional execution (and remove 
> the jump as a consequence) if the jump has side effects.
>
>   gcc/
>   * ifcvt.c (dead_or_predicable) [!IFCVT_MODIFY_TESTS]: Bail out 
>   if `!onlyjump_p'.
OK
jeff
> ---
>  gcc/ifcvt.c |6 ++
>  1 file changed, 6 insertions(+)
>
> gcc-ifcvt-dead-or-predicable-ce-only-jump.diff
> Index: gcc/gcc/ifcvt.c
> ===
> --- gcc.orig/gcc/ifcvt.c
> +++ gcc/gcc/ifcvt.c
> @@ -5127,6 +5127,11 @@ dead_or_predicable (basic_block test_bb,
>  
>rtx cond;
>  
> +  /* If the conditional jump is more than just a conditional jump,
> +  then we cannot do conditional execution conversion on this block.  */
> +  if (!onlyjump_p (jump))
> + goto nce;
> +
>cond = cond_exec_get_condition (jump);
>if (! cond)
>   return FALSE;
> @@ -5154,6 +5159,7 @@ dead_or_predicable (basic_block test_bb,
>  
>earliest = jump;
>  }
> + nce:
>  #endif
>  
>/* If we allocated new pseudos (e.g. in the conditional move
>

Re: [PATCH RFA] vec: Simplify use with C++11 range-based 'for'.

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 10:53 AM, Jason Merrill via Gcc-patches wrote:
> It looks cleaner if we can use a vec* directly as a range for the C++11
> range-based 'for' loop, without needing to indirect from it, and also works
> with null pointers.
>
> The change in cp_parser_late_parsing_default_args is an example of how this
> can be used to simplify many loops over vec*.
>
> I deliberately didn't format the new overloads for etags since they are
> trivial, but am open to changing that.
>
> Tested x86_64-pc-linux-gnu.  Is this OK for trunk now, or should I hold it for
> stage 1?
>
> gcc/ChangeLog:
>
>   * vec.h (begin, end): Add overloads for vec*.
>   * tree.c (build_constructor_from_vec): Remove *.
>
> gcc/cp/ChangeLog:
>
>   * decl2.c (clear_consteval_vfns): Remove *.
>   * pt.c (do_auto_deduction): Remove *.
>   * parser.c (cp_parser_late_parsing_default_args): Change loop
>   to use range 'for'.
I'd go forward with it now, it's simple enough and simplifies the code
we end up writing...


jeff
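
For readers unfamiliar with the idiom, a minimal sketch of free begin/end
overloads on a pointer to a container follows.  toy_vec is a hypothetical
stand-in, not the real vec.h interface; the sketch only shows how range-based
'for' can iterate a pointer directly and treat a null pointer as empty.

```cpp
#include <cstddef>

// Hypothetical stand-in for a vec-like container; not GCC's vec<T>.
template<typename T>
struct toy_vec
{
  T *data;
  std::size_t len;

  T *begin () { return data; }
  T *end () { return data + len; }
};

// Overloads taking a pointer to the container.  Range-based 'for' finds
// them by argument-dependent lookup; a null pointer yields an empty range.
template<typename T>
T *begin (toy_vec<T> *v) { return v ? v->begin () : nullptr; }

template<typename T>
T *end (toy_vec<T> *v) { return v ? v->end () : nullptr; }

// Sum the elements, iterating the pointer itself rather than *v.
inline int
toy_sum (toy_vec<int> *v)
{
  int total = 0;
  for (int x : v)
    total += x;
  return total;
}
```

Calling toy_sum with a null pointer simply skips the loop, so callers do not
need a separate null check before iterating.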

c++: Exported using decls

2020-12-03 Thread Nathan Sidwell



With modules we need to record whether a (namespace-scope) using decl
is exporting the named entities.  Record this on the OVERLOAD marking
the used decl.

gcc/cp/
* cp-tree.h (OVL_EXPORT): New.
(class ovl_iterator): Add get_using, exporting_p.
* tree.c (ovl_insert): Extend using_or_hidden meaning to include
an exported using.

pushed to trunk
--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 4db50128443..4720af2175a 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -503,6 +503,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   FUNCTION_RVALUE_QUALIFIED (in FUNCTION_TYPE, METHOD_TYPE)
   CALL_EXPR_REVERSE_ARGS (in CALL_EXPR, AGGR_INIT_EXPR)
   CONSTRUCTOR_PLACEHOLDER_BOUNDARY (in CONSTRUCTOR)
+  OVL_EXPORT_P (in OVERLOAD)
6: TYPE_MARKED_P (in _TYPE)
   DECL_NONTRIVIALLY_INITIALIZED_P (in VAR_DECL)
   RANGE_FOR_IVDEP (in RANGE_FOR_STMT)
@@ -780,6 +781,8 @@ typedef struct ptrmem_cst * ptrmem_cst_t;
 #define OVL_NESTED_P(NODE)	TREE_LANG_FLAG_3 (OVERLOAD_CHECK (NODE))
 /* If set, this overload was constructed during lookup.  */
 #define OVL_LOOKUP_P(NODE)	TREE_LANG_FLAG_4 (OVERLOAD_CHECK (NODE))
+/* If set, this OVL_USING_P overload is exported.  */
+#define OVL_EXPORT_P(NODE)	TREE_LANG_FLAG_5 (OVERLOAD_CHECK (NODE))
 
 /* The first decl of an overload.  */
 #define OVL_FIRST(NODE)	ovl_first (NODE)
@@ -839,6 +842,11 @@ class ovl_iterator {
 
 return fn;
   }
+  tree get_using () const
+  {
+gcc_checking_assert (using_p ());
+return ovl;
+  }
 
  public:
   /* Whether this overload was introduced by a using decl.  */
@@ -847,6 +855,12 @@ class ovl_iterator {
 return (TREE_CODE (ovl) == USING_DECL
 	|| (TREE_CODE (ovl) == OVERLOAD && OVL_USING_P (ovl)));
   }
+  /* Whether this using is being exported.  */
+  bool exporting_p () const
+  {
+return OVL_EXPORT_P (get_using ());
+  }
+  
   bool hidden_p () const
   {
 return TREE_CODE (ovl) == OVERLOAD && OVL_HIDDEN_P (ovl);
diff --git i/gcc/cp/tree.c w/gcc/cp/tree.c
index 8d7df60f963..d9fa505041f 100644
--- i/gcc/cp/tree.c
+++ w/gcc/cp/tree.c
@@ -2272,10 +2272,11 @@ ovl_make (tree fn, tree next)
   return result;
 }
 
-/* Add FN to the (potentially NULL) overload set OVL.  USING_OR_HIDDEN
-   is > 0, if FN is via a using declaration.  USING_OR_HIDDEN is < 0,
-   if FN is hidden.  (A decl cannot be both using and hidden.)  We
-   keep the hidden decls first, but remaining ones are unordered.  */
+/* Add FN to the (potentially NULL) overload set OVL.  USING_OR_HIDDEN is >
+   zero if this is a using-decl.  It is > 1 if we're exporting the
+   using decl.  USING_OR_HIDDEN is < 0, if FN is hidden.  (A decl
+   cannot be both using and hidden.)  We keep the hidden decls first,
+   but remaining ones are unordered.  */
 
 tree
 ovl_insert (tree fn, tree maybe_ovl, int using_or_hidden)
@@ -2299,7 +2300,11 @@ ovl_insert (tree fn, tree maybe_ovl, int using_or_hidden)
   if (using_or_hidden < 0)
 	OVL_HIDDEN_P (maybe_ovl) = true;
   if (using_or_hidden > 0)
-	OVL_DEDUP_P (maybe_ovl) = OVL_USING_P (maybe_ovl) = true;
+	{
+	  OVL_DEDUP_P (maybe_ovl) = OVL_USING_P (maybe_ovl) = true;
+	  if (using_or_hidden > 1)
+	OVL_EXPORT_P (maybe_ovl) = true;
+	}
 }
   else
 maybe_ovl = fn;
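
The three-way encoding of USING_OR_HIDDEN described in the comment above can
be sketched in isolation.  toy_overload_flags and toy_classify are
hypothetical names used only for illustration; the real code sets flags on an
OVERLOAD tree node instead.

```cpp
// Hypothetical flag record standing in for the bits set on an OVERLOAD.
struct toy_overload_flags
{
  bool hidden = false;      // using_or_hidden < 0
  bool using_decl = false;  // using_or_hidden > 0
  bool exported = false;    // using_or_hidden > 1
};

// Decode the integer the same way the patched ovl_insert does.
inline toy_overload_flags
toy_classify (int using_or_hidden)
{
  toy_overload_flags f;
  if (using_or_hidden < 0)
    f.hidden = true;
  if (using_or_hidden > 0)
    {
      f.using_decl = true;
      if (using_or_hidden > 1)
        f.exported = true;
    }
  return f;
}
```

An exported using therefore always has the using bit set as well, and a hidden
decl never has either.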

Re: Go testsuite patch committed: Add a bunch of new tests

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

On Thu, Dec 3, 2020 at 11:15 AM Ian Lance Taylor  wrote:
>
> This patch to the Go testsuite adds a bunch of new tests from the
> source repo.  These are mostly tests that were added specifically to
> test gccgo.  There are still other tests in the source repo that are
> not reflected in the gccgo copy.  Bootstrapped and ran Go testsuite on
> x86_64-pc-linux-gnu.  Committed to mainline.

Unfortunately I was working from an older copy of the tests, and
forgot to sync them up from the source repo.  This updates the new
tests to the current versions in the source repo.  Bootstrapped and
ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian

Re: [PATCH v2] rs6000, vector integer multiply/divide/modulo instructions

2020-12-03 Thread will schmidt via Gcc-patches

On Tue, 2020-12-01 at 15:48 -0800, Carl Love via Gcc-patches wrote:
> Segher, Pat:
> 
> I have updated the patch to address the comments below.
> 
> On Wed, 2020-11-25 at 20:30 -0600, Segher Boessenkool wrote:
> > On Tue, Nov 24, 2020 at 08:34:51PM -0600, Pat Haugen wrote:
> > > On 11/24/20 8:17 PM, Pat Haugen via Gcc-patches wrote:
> > > > On 11/24/20 12:59 PM, Carl Love via Gcc-patches wrote:
> > > > > +(define_insn "modu_"
> > > > > +  [(set (match_operand:VIlong 0 "vsx_register_operand" "=v")
> > > > > + (umod:VIlong (match_operand:VIlong 1
> > > > > "vsx_register_operand" "v")
> > > > > +  (match_operand:VIlong 2
> > > > > "vsx_register_operand" "v")))]
> > > > > +  "TARGET_POWER10"
> > > > > +  "vmodu %0,%1,%2"
> > > > > +  [(set_attr "type" "vecdiv")
> > > > > +   (set_attr "size" "128")])
> > > > 
> > > > We should only be setting "size" "128" for instructions that
> > > > operate on scalar 128-bit data items (i.e. 'vdivesq' etc).
> > > > Since
> > > > the above insns are either V2DI/V4SI (ala VIlong
> > > > mode_iterator),
> > > > they shouldn't be marked as size 128. If you want to set the
> > > > size
> > > > based on mode, (set_attr "size" "") should do the trick I
> > > > believe.
> > > 
> > > Well, after you update "(define_mode_attr bits" in rs6000.md for
> > > V2DI/V4SI.
> > 
> > So far,  was only used for scalars.  I agree that for vectors
> > it
> > makes most sense to do the element size (because the vector size
> > always
> > is 128 bits, and for scheduling the element size can matter).  But,
> > the
> > definitions of  and  now say
> > 
> > ;; What data size does this instruction work on?
> > ;; This is used for insert, mul and others as necessary.
> > (define_attr "size" "8,16,32,64,128" (const_string "32"))
> > 
> > and
> > 
> > ;; How many bits in this mode?
> > (define_mode_attr bits [(QI "8") (HI "16") (SI "32") (DI "64")
> >(SF "32") (DF "64")])
> > so those need a bit of update as well then :-)
> 
> I set the size based on the vector element size, extending the
> define_mode_attr bits definition.  Please take a look at the updated
> patch.  Hopefully I have this all correct.  Thanks.
> 
> Note, I retested the updated patch on 
> 
>   powerpc64le-unknown-linux-gnu (Power 9 LE)
>   powerpc64le-unknown-linux-gnu (Power 10 LE)
> 
> Thanks for the help.
> 
>  Carl 
> 

Continued from yesterday..  
Thanks
-Will

> ---
> 
> rs6000, vector integer multiply/divide/modulo instructions
> 
> 2020-12-01  Carl Love  
> 
> gcc/
>   * config/rs6000/altivec.h (vec_mulh, vec_div, vec_dive,
> vec_mod): New
>   defines.
>   * config/rs6000/altivec.md (VIlong): Move define to file
> vsx.md.
>   * config/rs6000/rs6000-builtin.def (DIVES_V4SI, DIVES_V2DI,
>   DIVEU_V4SI, DIVEU_V2DI, DIVS_V4SI, DIVS_V2DI, DIVU_V4SI,
>   DIVU_V2DI, MODS_V2DI, MODS_V4SI, MODU_V2DI, MODU_V4SI,
>   MULHS_V2DI, MULHS_V4SI, MULHU_V2DI, MULHU_V4SI, MULLD_V2DI):
>   Add builtin define.
>   (MULH, DIVE, MOD):  Add new BU_P10_OVERLOAD_2 definitions.
>   * config/rs6000/rs6000-call.c (VSX_BUILTIN_VEC_DIV,
>   P10_BUILTIN_VEC_VDIVE, P10_BUILTIN_VEC_VMOD, 
> P10_BUILTIN_VEC_VMULH):
No mentions of these three P10_BUILTIN_VEC_* in patch below.


>   New overloaded definitions.
>   (builtin_function_type) [P10V_BUILTIN_DIVEU_V4SI,
>   P10V_BUILTIN_DIVEU_V2DI, P10V_BUILTIN_DIVU_V4SI,
>   P10V_BUILTIN_DIVU_V2DI, P10V_BUILTIN_MODU_V2DI,
>   P10V_BUILTIN_MODU_V4SI, P10V_BUILTIN_MULHU_V2DI,
>   P10V_BUILTIN_MULHU_V4SI, P10V_BUILTIN_MULLD_V2DI]: Add case
>   statement for builtins.
>   * config/rs6000/vsx.md (VIlong_char): Add define_mod_attribute.

just VIlong 
Maybe s/define_mod_attribute/define_mod_attr /  ? 

>   (UNSPEC_VDIVES, UNSPEC_VDIVEU): Add enum for UNSPECs.



>   (vsx_mul_v2di, vsx_udiv_v2di): Add if TARGET_POWER10 statement.

I don't see vsx_mul_v2di or vsx_udiv_v2di in the patch contexts.  Looks
OK per a look at trunk's vsx.md.

>   (dives_, diveu_, div3, uvdiv3,
>   mods_, modu_, mulhs_, mulhu_,
> mulv2di3):
>   Add define_insn, mode is VIlong.
>   * doc/extend.texi (vec_mulh, vec_mul, vec_div, vec_dive,
> vec_mod): Add
>   builtin descriptions.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/builtins-1-p10-runnable.c: New test file.
> ---
>  gcc/config/rs6000/altivec.h   |   5 +
>  gcc/config/rs6000/altivec.md  |   2 -
>  gcc/config/rs6000/rs6000-builtin.def  |  22 +
>  gcc/config/rs6000/rs6000-call.c   |  49 +++
>  gcc/config/rs6000/rs6000.md   |   3 +-
>  gcc/config/rs6000/vsx.md  | 213 +++---
>  gcc/doc/extend.texi   | 120 ++
>  .../powerpc/builtins-1-p10-runnable.c | 398
> ++
>  8 files changed, 759 insertions(+), 53 deletion

Re: [committed] libstdc++: Add std::bit_cast for C++20 [PR 93121]

2020-12-03 Thread Jonathan Wakely via Gcc-patches


On 03/12/20 19:25 +, Jonathan Wakely wrote:

Thanks to Jakub's addition of the built-in, we can add this to the
library now. The compiler tests for the built-in are quite extensive,
including verifying the constraints, so this only adds minimal tests to
the library testsuite.

This doesn't add a new _GLIBCXX_HAVE_BUILTIN_BIT_CAST because using
__has_builtin(__builtin_bit_cast) works for GCC and versions of Clang
that provide the built-in.

libstdc++-v3/ChangeLog:

PR libstdc++/93121
* include/std/bit (__cpp_lib_bit_cast, bit_cast): Define.
* include/std/version (__cpp_lib_bit_cast): Define.
* testsuite/26_numerics/bit/bit.cast/bit_cast.cc: New test.
* testsuite/26_numerics/bit/bit.cast/version.cc: New test.


This fixes some typos in the new tests.

Tested x86_64-linux. Committed to trunk.


commit 656131e06aa76ba3cb50305c07cf5c8ee79fce44
Author: Jonathan Wakely 
Date:   Thu Dec 3 19:30:02 2020

libstdc++: Fix typos in #error strings

libstdc++-v3/ChangeLog:

* testsuite/26_numerics/bit/bit.cast/bit_cast.cc: Remove stray
word from copy&paste.
* testsuite/26_numerics/bit/bit.cast/version.cc: Likewise.

diff --git a/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc
index b451f152b47..138ceb1c4d6 100644
--- a/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc
+++ b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc
@@ -21,9 +21,9 @@
 #include 
 
 #ifndef __cpp_lib_bit_cast
-# error "Feature-test macro for bit_cast wait missing in "
+# error "Feature-test macro for bit_cast missing in "
 #elif __cpp_lib_bit_cast != 201806L
-# error "Feature-test macro for bit_cast wait has wrong value in "
+# error "Feature-test macro for bit_cast has wrong value in "
 #endif
 
 #include 
diff --git a/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/version.cc b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/version.cc
index 688d44bbb89..82e97543481 100644
--- a/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/version.cc
+++ b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/version.cc
@@ -21,7 +21,7 @@
 #include 
 
 #ifndef __cpp_lib_bit_cast
-# error "Feature-test macro for bit_cast wait missing in "
+# error "Feature-test macro for bit_cast missing in "
 #elif __cpp_lib_bit_cast != 201806L
-# error "Feature-test macro for bit_cast wait has wrong value in "
+# error "Feature-test macro for bit_cast has wrong value in "
 #endif

Re: [ PATCH ] [ C++ ] [ libstdc++ ] P1208r6 Merge source_location

2020-12-03 Thread Jonathan Wakely via Gcc-patches


On 02/01/20 17:20 -0500, JeanHeyd Meneide wrote:

On Thu, Jan 2, 2020 at 5:07 PM Jakub Jelinek  wrote:


On Thu, Jan 02, 2020 at 04:57:01PM -0500, JeanHeyd Meneide wrote:
> +#if defined(_GLIBCXX_HAVE_BUILTIN_SOURCE_LOCATION)
> +# define __cpp_lib_source_location 201907L
> +#elif defined(_GLIBCXX_HAVE_BUILTIN_LINE) && defined(_GLIBCXX_HAVE_BUILTIN_COLUMN)
> +# define __cpp_lib_is_constant_evaluated 201907L

How is __cpp_lib_is_constant_evaluated related to presence of __builtin_LINE
and __builtin_COLUMN?


Oops. Sorry; fat-fingered the diff a little!


I've committed a slightly reworked version of JeanHeyd's
 patch (see attached). Jakub has improved the
built-in so all the commented out tests work now. There's one more
compiler patch coming to use __PRETTY_FUNCTION__ for the function
name, so I'll tweak the new tests after that has gone in.

Tested powerpc64le-linux, committed to trunk.

Thanks, JeanHeyd!
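
For context, a small usage sketch of the feature follows.  caller_line is a
hypothetical helper, and the fallback branch using __builtin_LINE is an
assumption for toolchains where the library does not define
__cpp_lib_source_location; it is not part of the committed patch.

```cpp
#if defined(__has_include)
# if __has_include(<source_location>)
#  include <source_location>
# endif
#endif

#ifdef __cpp_lib_source_location
// The default argument is evaluated at the call site, so this reports
// the caller's line rather than the line of this declaration.
inline unsigned
caller_line (std::source_location loc = std::source_location::current ())
{ return loc.line (); }
#else
// Fallback when std::source_location is unavailable: __builtin_LINE used
// as a default argument is likewise evaluated at the call site.
inline unsigned
caller_line (unsigned line = __builtin_LINE ())
{ return line; }
#endif
```

Either branch lets a logging or assertion helper learn where it was called
from without the caller passing __LINE__ explicitly.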


commit 57d76ee9cf6265e012fad6286adfaeaba9414c11
Author: JeanHeyd Meneide 
Date:   Thu Dec 3 19:17:13 2020

libstdc++: Define std::source_location for C++20

This doesn't define a new _GLIBCXX_HAVE_BUILTIN_SOURCE_LOCATION macro
because using __has_builtin(__builtin_source_location) is sufficient.
Currently only GCC supports it, but if/when Clang and Intel add it the
__has_builtin check should work for them too.

Co-authored-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* doc/doxygen/user.cfg.in (INPUT): Add .
* include/Makefile.am: Add .
* include/Makefile.in: Regenerate.
* include/std/version (__cpp_lib_source_location): Define.
* include/std/source_location: New file.
* testsuite/18_support/source_location/1.cc: New test.
* testsuite/18_support/source_location/consteval.cc: New test.
* testsuite/18_support/source_location/srcloc.h: New test.
* testsuite/18_support/source_location/version.cc: New test.

diff --git a/libstdc++-v3/doc/doxygen/user.cfg.in b/libstdc++-v3/doc/doxygen/user.cfg.in
index 1966055e675..2261d572efb 100644
--- a/libstdc++-v3/doc/doxygen/user.cfg.in
+++ b/libstdc++-v3/doc/doxygen/user.cfg.in
@@ -891,6 +891,7 @@ INPUT  = @srcdir@/doc/doxygen/doxygroups.cc \
  include/semaphore \
  include/set \
  include/shared_mutex \
+ include/source_location \
  include/span \
  include/sstream \
  include/stack \
diff --git a/libstdc++-v3/include/Makefile.am b/libstdc++-v3/include/Makefile.am
index ca413b8fdfe..9dbc7dcb32d 100644
--- a/libstdc++-v3/include/Makefile.am
+++ b/libstdc++-v3/include/Makefile.am
@@ -73,6 +73,7 @@ std_headers = \
 	${std_srcdir}/semaphore \
 	${std_srcdir}/set \
 	${std_srcdir}/shared_mutex \
+	${std_srcdir}/source_location \
 	${std_srcdir}/span \
 	${std_srcdir}/sstream \
 	${std_srcdir}/syncstream \
diff --git a/libstdc++-v3/include/std/source_location b/libstdc++-v3/include/std/source_location
new file mode 100644
index 000..13d4bd48857
--- /dev/null
+++ b/libstdc++-v3/include/std/source_location
@@ -0,0 +1,92 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file include/source_location
+ *  This is a Standard C++ Library header.
+ */
+
+#ifndef _GLIBCXX_SRCLOC
+#define _GLIBCXX_SRCLOC 1
+
+#if __cplusplus > 201703L && __has_builtin(__builtin_source_location)
+#include 
+
+namespace std
+{
+_GLIBCXX_BEGIN_NAMESPACE_VERSION
+
+#define __cpp_lib_source_location 201907L
+
+  /// A class that describes a location in source code.
+  struct source_location
+  {
+  private:
+using uint_least32_t = __UINT_LEAST32_TYPE__;
+
+  public:
+
+// [support.srcloc.cons], creation
+static consteval source_location
+current(const void* __p = __builtin_source_location()) noexcept
+{
+  source_location __ret;
+

[committed] libstdc++: Update C++20 library implementation status

2020-12-03 Thread Jonathan Wakely via Gcc-patches

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml: Update C++20 status.
* doc/html/*: Regenerate.

Tested powerpc64le-linux. Committed to trunk.

commit 44ac1ea0e2244343b798ff1ccc7048029cb9fa02
Author: Jonathan Wakely 
Date:   Thu Dec 3 19:17:13 2020

libstdc++: Update C++20 library implementation status

libstdc++-v3/ChangeLog:

* doc/xml/manual/status_cxx2020.xml: Update C++20 status.
* doc/html/*: Regenerate.

diff --git a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
index e633365ab40..b62a432eed1 100644
--- a/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
+++ b/libstdc++-v3/doc/xml/manual/status_cxx2020.xml
@@ -357,24 +357,22 @@ or any notes about the implementation.
 
 
 
-  
 C++ Synchronized Buffered Ostream 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0053r7.pdf";>
 P0053R7 
   
-   
+   11 
__cpp_lib_syncbuf >= 201711L 
 
 
 
-  
 Manipulators for C++ Synchronized Buffered Ostream 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0753r2.pdf";>
 P0753R2 
   
-   
+   11 
__cpp_lib_syncbuf >= 201803L 
 
 
@@ -1024,13 +1022,12 @@ or any notes about the implementation.
 
 
 
-  
 Bit-casting object representations 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0476r2.html";>
 P0476R2 
   
-   
+   11 
__cpp_lib_bit_cast >= 201806L 
 
 
@@ -1411,26 +1408,24 @@ or any notes about the implementation.
 
 
 
-  
std::source_location 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1208r6.pdf";>
 P1208R6 
   
-   
+   11 
   
 __cpp_lib_source_location >= 201907L
   
 
 
 
-  
Efficient access to std::basic_stringbuf's Buffer 
   
 http://www.w3.org/1999/xlink"; 
xlink:href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0408r7.pdf";>
 P0408R7 
   
-   
+   11

[committed] libstdc++: Add std::bit_cast for C++20 [PR 93121]

2020-12-03 Thread Jonathan Wakely via Gcc-patches

Thanks to Jakub's addition of the built-in, we can add this to the
library now. The compiler tests for the built-in are quite extensive,
including verifying the constraints, so this only adds minimal tests to
the library testsuite.

This doesn't add a new _GLIBCXX_HAVE_BUILTIN_BIT_CAST because using
__has_builtin(__builtin_bit_cast) works for GCC and versions of Clang
that provide the built-in.
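
The __has_builtin approach can be sketched outside the library as follows.
toy_bit_cast is a hypothetical helper, and the std::memcpy fallback is an
assumption added so the sketch also compiles where the built-in is missing;
the library itself simply omits bit_cast in that case.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>

#ifndef __has_builtin
# define __has_builtin(x) 0  // compilers without __has_builtin take the fallback
#endif

// Reinterpret the object representation of FROM as a value of type To.
template<typename To, typename From>
To
toy_bit_cast (const From &from) noexcept
{
  static_assert (sizeof (To) == sizeof (From), "sizes must match");
  static_assert (std::is_trivially_copyable<To>::value
                 && std::is_trivially_copyable<From>::value,
                 "types must be trivially copyable");
#if __has_builtin(__builtin_bit_cast)
  return __builtin_bit_cast (To, from);
#else
  // Fallback: copy the bytes; requires To to be default constructible.
  To to;
  std::memcpy (&to, &from, sizeof (To));
  return to;
#endif
}
```

Converting a value to a same-sized type and back, for example float to
std::uint32_t and back to float, yields the original value on either path.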

libstdc++-v3/ChangeLog:

PR libstdc++/93121
* include/std/bit (__cpp_lib_bit_cast, bit_cast): Define.
* include/std/version (__cpp_lib_bit_cast): Define.
* testsuite/26_numerics/bit/bit.cast/bit_cast.cc: New test.
* testsuite/26_numerics/bit/bit.cast/version.cc: New test.

Tested powerpc64le-linux. Committed to trunk.

commit 9e433b3461ab64b38350817392a77efb67bb78b4
Author: Jonathan Wakely 
Date:   Thu Dec 3 19:17:13 2020

libstdc++: Add std::bit_cast for C++20 [PR 93121]

Thanks to Jakub's addition of the built-in, we can add this to the
library now. The compiler tests for the built-in are quite extensive,
including verifying the constraints, so this only adds minimal tests to
the library testsuite.

This doesn't add a new _GLIBCXX_HAVE_BUILTIN_BIT_CAST because using
__has_builtin(__builtin_bit_cast) works for GCC and versions of Clang
that provide the built-in.

libstdc++-v3/ChangeLog:

PR libstdc++/93121
* include/std/bit (__cpp_lib_bit_cast, bit_cast): Define.
* include/std/version (__cpp_lib_bit_cast): Define.
* testsuite/26_numerics/bit/bit.cast/bit_cast.cc: New test.
* testsuite/26_numerics/bit/bit.cast/version.cc: New test.

diff --git a/libstdc++-v3/include/std/bit b/libstdc++-v3/include/std/bit
index 16f7eba46d7..1d99c807c4a 100644
--- a/libstdc++-v3/include/std/bit
+++ b/libstdc++-v3/include/std/bit
@@ -49,6 +49,18 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* @{
*/
 
+#if __cplusplus > 201703l && __has_builtin(__builtin_bit_cast)
+#define __cpp_lib_bit_cast 201806L
+
+  /// Create a value of type `To` from the bits of `from`.
+  template
+constexpr _To
+bit_cast(const _From& __from) noexcept
+{
+  return __builtin_bit_cast(_To, __from);
+}
+#endif
+
   /// @cond undoc
 
   template
diff --git a/libstdc++-v3/include/std/version b/libstdc++-v3/include/std/version
index 25f628f399d..6e4bd99b361 100644
--- a/libstdc++-v3/include/std/version
+++ b/libstdc++-v3/include/std/version
@@ -201,6 +201,9 @@
 # define __cpp_lib_atomic_wait 201907L
 #endif
 #define __cpp_lib_bind_front 201907L
+#if __has_builtin(__builtin_bit_cast)
+# define __cpp_lib_bit_cast 201806L
+#endif
 // FIXME: #define __cpp_lib_execution 201902L
 #define __cpp_lib_integer_comparison_functions 202002L
 #define __cpp_lib_constexpr_algorithms 201806L
diff --git a/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc
new file mode 100644
index 000..b451f152b47
--- /dev/null
+++ b/libstdc++-v3/testsuite/26_numerics/bit/bit.cast/bit_cast.cc
@@ -0,0 +1,81 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include <bit>
+
+#ifndef __cpp_lib_bit_cast
+# error "Feature-test macro for bit_cast wait missing in <bit>"
+#elif __cpp_lib_bit_cast != 201806L
+# error "Feature-test macro for bit_cast wait has wrong value in <bit>"
+#endif
+
+#include <cstdint>
+#include <cstring>
+#include <testsuite_hooks.h>
+
+template<typename To, typename From>
+constexpr bool
+check(const From& from)
+{
+  return std::bit_cast<From>(std::bit_cast<To>(from)) == from;
+}
+
+void
+test01()
+{
+  static_assert( std::bit_cast<int>(123) == 123 );
+  static_assert( std::bit_cast<int>(123u) == 123 );
+  static_assert( std::bit_cast<int>(~0u) == ~0 );
+
+  if constexpr (sizeof(int) == sizeof(float))
+    static_assert( check<int>(12.34f) );
+  if constexpr (sizeof(unsigned long long) == sizeof(double))
+    static_assert( check<unsigned long long>(123.456) );
+  if constexpr (sizeof(std::intptr_t) == sizeof(void(*)()))
+    VERIFY( check<std::intptr_t>(&test01) );
+}
+
+void
+test02()
+{
+  struct S
+  {
+int i;
+
+bool operator==(const char* s) const
+{ return std::memcmp(&i, s, sizeof(i)) == 0; }
+  };
+
+  char arr[sizeof(int)];
+  char arr2[sizeof(int)];
+  for (int i = 0;
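
For readers without a C++20 compiler, here is a rough sketch of what the new function does, written with std::memcpy instead of the builtin. It is not the libstdc++ implementation: the name bits_as and the namespace demo are made up for illustration, and unlike std::bit_cast this version is not constexpr.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

namespace demo {

// Hypothetical stand-in for std::bit_cast: copy the object
// representation of `from` into a fresh object of type To.
template<typename To, typename From>
To
bits_as(const From& from) noexcept
{
  static_assert(sizeof(To) == sizeof(From), "sizes must match");
  static_assert(std::is_trivially_copyable<To>::value
		&& std::is_trivially_copyable<From>::value,
		"both types must be trivially copyable");
  To to;
  std::memcpy(&to, &from, sizeof(To));
  return to;
}

// Round trip through an integer of the same size, in the spirit of the
// check() helper in the test above.
inline bool
round_trips(float f)
{
  return bits_as<float>(bits_as<std::uint32_t>(f)) == f;
}

} // namespace demo
```

The size check mirrors the constraint that the source and destination types have the same size; the memcpy form only shows the bit-copying semantics.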

c++: uninstantiated template friends

2020-12-03 Thread Nathan Sidwell



Template friends need to be recognized by module streaming and
associated with the befriending class, but their context is that of
the friend (a namespace or other class).  This adds a flag to mark
such templates, and uses their DECL_CHAIN to point at the befriender.
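
To make the kind of declaration concrete, here is a small user-level sketch (the names Cell, holds and demo are invented for this note, and it says nothing about how the compiler represents the decl internally). The friend is a function template whose signature depends on the befriending class's parameter T, yet it is a member of the enclosing namespace rather than of the class:

```cpp
#include <cassert>

namespace demo {

template<typename T>
class Cell
{
  T value_;

public:
  explicit Cell(T v) : value_(v) {}

  // A template friend: it lives in namespace demo, not in Cell, but its
  // first parameter depends on Cell's own template parameter T.
  template<typename U>
  friend bool
  holds(const Cell& c, const U& u)
  { return c.value_ == static_cast<T>(u); }
};

inline void
example()
{
  Cell<int> c(3);
  // Found by argument-dependent lookup through Cell<int>.
  assert(holds(c, 3.0));
  assert(!holds(c, 4L));
}

} // namespace demo
```

Until Cell<int> is instantiated there is no concrete holds to declare, which is the situation the "uninstantiated" in the flag name refers to.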

gcc/cp
* cp-tree.h (DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P): New.
* pt.c (push_template_decl): Set it.
(tsubst_friend_function): Clear it.

pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index 69f8ed56e62..4db50128443 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -545,6 +545,7 @@ extern GTY(()) tree cp_global_trees[CPTI_MAX];
   DECL_ANON_UNION_VAR_P (in a VAR_DECL)
   DECL_SELF_REFERENCE_P (in a TYPE_DECL)
   DECL_INVALID_OVERRIDER_P (in a FUNCTION_DECL)
+  DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (in TEMPLATE_DECL)
5: DECL_INTERFACE_KNOWN.
6: DECL_THIS_STATIC (in VAR_DECL, FUNCTION_DECL or PARM_DECL)
   DECL_FIELD_IS_BASE (in FIELD_DECL)
@@ -3161,6 +3162,13 @@ struct GTY(()) lang_decl {
   (DECL_LANG_SPECIFIC (FUNCTION_DECL_CHECK (NODE)) \
->u.base.friend_or_tls)
 
+/* True of a TEMPLATE_DECL that is a template class friend.  Such
+   decls are not pushed until instantiated (as they may depend on
+   parameters of the befriending class).  DECL_CHAIN is the
+   befriending class.  */
+#define DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P(NODE) \
+  (DECL_LANG_FLAG_4 (TEMPLATE_DECL_CHECK (NODE)))
+
 /* Nonzero if the thread-local variable was declared with __thread as
opposed to thread_local.  */
 #define DECL_GNU_TLS_P(NODE)\
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 3ca28133d94..08931823d57 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -22,7 +22,9 @@ along with GCC; see the file COPYING3.  If not see
 /* Known bugs or deficiencies include:
 
  all methods must be provided in header files; can't use a source
- file that contains only the method templates and "just win".  */
+ file that contains only the method templates and "just win".
+
+ Fixed by: C++20 modules.  */
 
 #include "config.h"
 #include "system.h"
@@ -6044,6 +6046,14 @@ push_template_decl (tree decl, bool is_friend)
 	  tmpl = NULL_TREE;
 	}
 	}
+  else if (is_friend)
+	{
+	  /* Record this decl as belonging to the current class.  It's
+	 not chained onto anything else.  */
+	  DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (tmpl) = true;
+	  gcc_checking_assert (!DECL_CHAIN (tmpl));
+	  DECL_CHAIN (tmpl) = current_scope ();
+	}
 }
   else if (tmpl)
 /* The type may have been completed, or (erroneously) changed.  */
@@ -11053,6 +11063,7 @@ tsubst_friend_function (tree decl, tree args)
   DECL_USE_TEMPLATE (new_friend) = 0;
   if (TREE_CODE (new_friend) == TEMPLATE_DECL)
 {
+  DECL_UNINSTANTIATED_TEMPLATE_FRIEND_P (new_friend) = false;
   DECL_USE_TEMPLATE (DECL_TEMPLATE_RESULT (new_friend)) = 0;
   DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (new_friend))
 	= DECL_SAVED_TREE (DECL_TEMPLATE_RESULT (decl));

Go testsuite patch committed: Add a bunch of new tests

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go testsuite adds a bunch of new tests from the
source repo.  These are mostly tests that were added specifically to
test gccgo.  There are still other tests in the source repo that are
not reflected in the gccgo copy.  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian


patch.txt.bz2
Description: application/bzip

[PATCH] v2: doc/implement-c.texi: About same-as-scalar-type volatile aggregate accesses, PR94600

2020-12-03 Thread Hans-Peter Nilsson via Gcc-patches

Belatedly, here's an updated version, using Martin Sebor's
suggested wording from
"https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549580.html".
I added two commas, hopefully helpfully.  Albeit ok'd by Richard
Biener in
"https://gcc.gnu.org/pipermail/gcc-patches/2020-July/549922.html",
better have this reviewed properly, including markup (none added).

Ok for trunk (gcc-11) and gcc-10?


---
We say very little about reads and writes to aggregate /
compound objects, just scalar objects (i.e. assignments don't
cause reads).  Let's say something safe about aggregate
objects, but only for those that are the same size as a scalar
type.

There's an equal-sounding section (Volatiles) in extend.texi,
but this seems a more appropriate place, as specifying the
behavior of a standard qualifier.

gcc:

2020-12-02  Hans-Peter Nilsson  
Martin Sebor  

PR middle-end/94600
* doc/implement-c.texi (Qualifiers implementation): Add blurb
about access to the whole of a volatile aggregate object, only for
same-size as a scalar object.
---
 gcc/doc/implement-c.texi | 5 +
 1 file changed, 5 insertions(+)

diff --git a/gcc/doc/implement-c.texi b/gcc/doc/implement-c.texi
index 692297b69c4..2e9158a2a45 100644
--- a/gcc/doc/implement-c.texi
+++ b/gcc/doc/implement-c.texi
@@ -576,6 +576,11 @@ are of scalar types, the expression is interpreted by GCC 
as a read of
 the volatile object; in the other cases, the expression is only evaluated
 for its side effects.
 
+When an object of an aggregate type, with the same size and alignment as a
+scalar type S, is the subject of a volatile access by an assignment
+expression or an atomic function, the access to it is performed as if the
+object's declared type were volatile S.
+
 @end itemize
 
 @node Declarators implementation
-- 
2.11.0

[patch] Fix checking failure in IPA-SRA

2020-12-03 Thread Eric Botcazou

Hi,

this is a regression present on the mainline and 10 branch: on the one hand, 
IPA-SRA does *not* disqualify accesses with zero size but, on the other hand, 
it checks that accesses present in the tree have a (strictly) positive size,
thus trivially yielding an ICE, for example on the attached Ada testcase.

The attached fix relaxes the check, OK for mainline and 10 branch?


2020-12-03  Eric Botcazou  

* ipa-sra.c (verify_access_tree_1): Relax assertion on the size.


2020-12-03  Eric Botcazou  

* gnat.dg/opt91.ad[sb]: New test.
* gnat.dg/opt91_pkg.ad[sb]: New helper.

-- 
Eric Botcazou
diff --git a/gcc/ipa-sra.c b/gcc/ipa-sra.c
index 82acc6a21cb..7adc4b688f3 100644
--- a/gcc/ipa-sra.c
+++ b/gcc/ipa-sra.c
@@ -1480,7 +1480,7 @@ verify_access_tree_1 (gensum_param_access *access, HOST_WIDE_INT parent_offset,
 {
   while (access)
 {
-  gcc_assert (access->offset >= 0 && access->size > 0);
+  gcc_assert (access->offset >= 0 && access->size >= 0);
 
   if (parent_size != 0)
 	{
package body Opt91_Pkg is

   package body Pure_Relation is

  overriding function Custom_Image (Self : Rel) return String is
  begin
 return Custom_Image (Self.Rel);
  end Custom_Image;

   end Pure_Relation;

end Opt91_Pkg;
package Opt91_Pkg is

   type Base_Relation is abstract tagged null record;

   function Custom_Image (Self : Base_Relation) return String is abstract;

   generic
  type Ty is private;
  with function Custom_Image (Self : Ty) return String is <>;
   package Pure_Relation is

  type Rel is new Base_Relation with record
 Rel : Ty;
  end record;

  overriding function Custom_Image (Self : Rel) return String;
   end Pure_Relation;

end Opt91_Pkg;
-- { dg-do compile }
-- { dg-options "-O2 -fchecking=1" }

package body Opt91 is

   function Custom_Image (Self : True_Relation_Rec) return String is
   begin
  return "";
   end;

end Opt91;
with Opt91_Pkg; use Opt91_Pkg;

package Opt91 is

   type True_Relation_Rec is null record;
   function Custom_Image (Self : True_Relation_Rec) return String;

   package True_Relation is new Pure_Relation (Ty => True_Relation_Rec);

end Opt91;

Go testsuite patch committed: Add -I. when compiling in directory

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

This Go testsuite patch adds a -I. argument when compiling sources in
a directory.  This tells the compiler to import packages in the
current directory first, which is what we want in the testsuite.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian

* go.test/go-test.exp (go-gc-tests): Add -I. when building all
sources in a directory (errorcheckdir, compiledir, rundir,
rundircmpout).
5c642e8b52c0508730a6273f0828e8c10b1be057
diff --git a/gcc/testsuite/go.test/go-test.exp 
b/gcc/testsuite/go.test/go-test.exp
index 067f309a296..8f17cb3db71 100644
--- a/gcc/testsuite/go.test/go-test.exp
+++ b/gcc/testsuite/go.test/go-test.exp
@@ -648,7 +648,7 @@ proc go-gc-tests { } {
set last [lindex $packages end]
set packages [lreplace $packages end end]
foreach p $packages {
-   dg-test -keep-output [lrange $p 1 end] "-O" "-w 
$DEFAULT_GOCFLAGS"
+   dg-test -keep-output [lrange $p 1 end] "-O -I." "-w 
$DEFAULT_GOCFLAGS"
lappend del "[file rootname [file tail [lindex $p 1]]].o"
}
errchk [lindex $last 1] "[lrange $last 2 end]"
@@ -692,7 +692,7 @@ proc go-gc-tests { } {
if { [llength $packages] > 0 } {
set del [list]
foreach p $packages {
-   dg-test -keep-output [lindex $p 1] "[lrange $p 2 end] -O" 
"-w $DEFAULT_GOCFLAGS"
+   dg-test -keep-output [lindex $p 1] "[lrange $p 2 end] -O 
-I." "-w $DEFAULT_GOCFLAGS"
lappend del "[file rootname [file tail [lindex $p 1]]].o"
}
foreach f $del {
@@ -712,7 +712,7 @@ proc go-gc-tests { } {
set last [lindex $packages end]
set packages [lreplace $packages end end]
foreach p $packages {
-   dg-test -keep-output [lrange $p 1 end] "-O" "-w 
$DEFAULT_GOCFLAGS"
+   dg-test -keep-output [lrange $p 1 end] "-O -I." "-w 
$DEFAULT_GOCFLAGS"
lappend del "[file rootname [file tail [lindex $p 1]]].o"
}
set dg-do-what-default "link"
@@ -737,11 +737,11 @@ proc go-gc-tests { } {
set last [lindex $packages end]
set packages [lreplace $packages end end]
foreach p $packages {
-   dg-test -keep-output [lrange $p 1 end] "-O" "-w 
$DEFAULT_GOCFLAGS"
+   dg-test -keep-output [lrange $p 1 end] "-O -I." "-w 
$DEFAULT_GOCFLAGS"
lappend del "[file rootname [file tail [lindex $p 1]]].o"
}
set dg-do-what-default "link"
-   dg-test -keep-output [lrange $last 1 end] "$del -O" "-w 
$DEFAULT_GOCFLAGS"
+   dg-test -keep-output [lrange $last 1 end] "$del -O -I." "-w 
$DEFAULT_GOCFLAGS"
set base "[file rootname [file tail [lindex $last 1]]]"
set output_file "./$base.exe"
lappend del $output_file

Re: [gcc r11-4816] Fix Ada build failure for the SuSE PowerPC64/Linux compiler

2020-12-03 Thread Andreas Schwab

On Dec 03 2020, Eric Botcazou wrote:

> I'm afraid we cannot support both given the current setup, so you'll have to 
> make a choice for the powerpc64-suse-linux compiler.

I'm using the same configuration as the system compiler.  There is no
choice to make.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [gcc r11-4816] Fix Ada build failure for the SuSE PowerPC64/Linux compiler

2020-12-03 Thread Eric Botcazou

> Nope.  The default is 64-bit.

That's not what Richard said under PR ada/97504 though:

"Yes, we're building a 64bit compiler defaulting to -m32.  Our
"ppc" target was never a true 32bit system but a 64bit system with a
32bit "default" runtime and 64bit multilibs, so most binaries in the
system were 32bit binaries but select ones could be 64bit.

Note the GCC binaries themselves become 32bit binaries as well (the way
we use it the host compiler is the very same, powerpc64-linux compiler
defaulting to 32bit code generation)."

I'm afraid we cannot support both given the current setup, so you'll have to 
make a choice for the powerpc64-suse-linux compiler.

-- 
Eric Botcazou

Go testsuite: update existing tests to source repo

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

I've committed a patch to update the existing Go tests, the ones under
go.test, to the versions in the source repo at
https://go.googlesource.com/.  This does not include any of the new
tests, just updates the ones we already have and removes the ones that
no longer exist in the source.  The attached patch omits some of the
removed files, as they are large.  Bootstrapped and ran Go testsuite
on x86_64-pc-linux-gnu.  Committed to mainline.

Ian


patch.txt.bz2
Description: application/bzip

Re: [PATCH RFA] vec: Simplify use with C++11 range-based 'for'.

2020-12-03 Thread Marek Polacek via Gcc-patches

On Thu, Dec 03, 2020 at 12:53:22PM -0500, Jason Merrill via Gcc-patches wrote:
> It looks cleaner if we can use a vec* directly as a range for the C++11
> range-based 'for' loop, without needing to indirect from it, and also works
> with null pointers.

Nice.

> The change in cp_parser_late_parsing_default_args is an example of how this
> can be used to simplify many loops over vec*.
> 
> I deliberately didn't format the new overloads for etags since they are
> trivial, but am open to changing that.
> 
> Tested x86_64-pc-linux-gnu.  Is this OK for trunk now, or should I hold it for
> stage 1?

I'd vote for pushing this now as it's simple enough.

> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -29273,7 +29273,7 @@ do_auto_deduction (tree type, tree init, tree 
> auto_node,
>/* We don't recurse here because we can't deduce from a nested
>initializer_list.  */
>if (CONSTRUCTOR_ELTS (init))

Can this check go now or do we still need it?

> - for (constructor_elt &elt : *CONSTRUCTOR_ELTS (init))
> + for (constructor_elt &elt : CONSTRUCTOR_ELTS (init))
> elt.value = resolve_nondeduced_context (elt.value, complain);

Marek

[PATCH RFA] vec: Simplify use with C++11 range-based 'for'.

2020-12-03 Thread Jason Merrill via Gcc-patches

It looks cleaner if we can use a vec* directly as a range for the C++11
range-based 'for' loop, without needing to indirect from it; this also works
with null pointers.
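
As a rough illustration of the idea (a self-contained sketch, not the GCC code: mini_vec is a hypothetical stand-in for vec, and the namespace demo is invented), free begin/end overloads that take a pointer let range-based 'for' iterate over a possibly-null pointer, because the loop finds them by argument-dependent lookup:

```cpp
#include <cassert>
#include <cstddef>

namespace demo {

// Hypothetical, much simplified stand-in for GCC's vec<T>.
template<typename T>
struct mini_vec
{
  T *data;
  std::size_t len;

  T *begin () { return data; }
  T *end () { return data + len; }
};

// Overloads on the *pointer* type.  A null pointer yields an empty
// range (begin == end == nullptr), so the loop body never runs.
template<typename T>
T *begin (mini_vec<T> *v) { return v ? v->begin () : nullptr; }
template<typename T>
T *end (mini_vec<T> *v) { return v ? v->end () : nullptr; }

inline int
sum (mini_vec<int> *v)
{
  int total = 0;
  // No '*v' needed, and v may be null.
  for (int x : v)
    total += x;
  return total;
}

} // namespace demo
```

With this in place a caller no longer needs a separate null check before the loop, which is the same point the question about the CONSTRUCTOR_ELTS guard in do_auto_deduction raises.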

The change in cp_parser_late_parsing_default_args is an example of how this
can be used to simplify many loops over vec*.

I deliberately didn't format the new overloads for etags since they are
trivial, but am open to changing that.

Tested x86_64-pc-linux-gnu.  Is this OK for trunk now, or should I hold it for
stage 1?

gcc/ChangeLog:

* vec.h (begin, end): Add overloads for vec*.
* tree.c (build_constructor_from_vec): Remove *.

gcc/cp/ChangeLog:

* decl2.c (clear_consteval_vfns): Remove *.
* pt.c (do_auto_deduction): Remove *.
* parser.c (cp_parser_late_parsing_default_args): Change loop
to use range 'for'.
---
 gcc/vec.h   | 10 ++
 gcc/cp/decl2.c  |  2 +-
 gcc/cp/parser.c |  6 +-
 gcc/cp/pt.c |  2 +-
 gcc/tree.c  |  2 +-
 5 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/gcc/vec.h b/gcc/vec.h
index 90904515ea0..09166f1bce6 100644
--- a/gcc/vec.h
+++ b/gcc/vec.h
@@ -419,6 +419,16 @@ struct GTY((user)) vec
 {
 };
 
+/* Allow C++11 range-based 'for' to work directly on vec*.  */
+template
+T* begin (vec *v) { return v ? v->begin () : nullptr; }
+template
+T* end (vec *v) { return v ? v->end () : nullptr; }
+template
+const T* begin (const vec *v) { return v ? v->begin () : nullptr; }
+template
+const T* end (const vec *v) { return v ? v->end () : nullptr; }
+
 /* Generic vec<> debug helpers.
 
These need to be instantiated for each vec used throughout
diff --git a/gcc/cp/decl2.c b/gcc/cp/decl2.c
index 1bc7b7e0197..46069cb66a6 100644
--- a/gcc/cp/decl2.c
+++ b/gcc/cp/decl2.c
@@ -1928,7 +1928,7 @@ static void
 clear_consteval_vfns (vec &consteval_vtables)
 {
   for (tree vtable : consteval_vtables)
-for (constructor_elt &elt : *CONSTRUCTOR_ELTS (DECL_INITIAL (vtable)))
+for (constructor_elt &elt : CONSTRUCTOR_ELTS (DECL_INITIAL (vtable)))
   {
tree fn = cp_get_fndecl_from_callee (elt.value, /*fold*/false);
if (fn && DECL_IMMEDIATE_FUNCTION_P (fn))
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index 103567cd004..cc3da155032 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -30611,9 +30611,6 @@ cp_parser_late_parsing_default_args (cp_parser *parser, 
tree fn)
 {
   tree default_arg = TREE_PURPOSE (parm);
   tree parsed_arg;
-  vec *insts;
-  tree copy;
-  unsigned ix;
 
   tree parmdecl = parms[i];
   pushdecl (parmdecl);
@@ -30633,8 +30630,7 @@ cp_parser_late_parsing_default_args (cp_parser *parser, 
tree fn)
   TREE_PURPOSE (parm) = parsed_arg;
 
   /* Update any instantiations we've already created.  */
-  for (insts = DEFPARSE_INSTANTIATIONS (default_arg), ix = 0;
-  vec_safe_iterate (insts, ix, &copy); ix++)
+  for (tree copy : DEFPARSE_INSTANTIATIONS (default_arg))
TREE_PURPOSE (copy) = parsed_arg;
 }
 
diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 3ca28133d94..ac987682a58 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -29273,7 +29273,7 @@ do_auto_deduction (tree type, tree init, tree auto_node,
   /* We don't recurse here because we can't deduce from a nested
 initializer_list.  */
   if (CONSTRUCTOR_ELTS (init))
-   for (constructor_elt &elt : *CONSTRUCTOR_ELTS (init))
+   for (constructor_elt &elt : CONSTRUCTOR_ELTS (init))
  elt.value = resolve_nondeduced_context (elt.value, complain);
 }
   else
diff --git a/gcc/tree.c b/gcc/tree.c
index 52a145dd018..431bbc22e52 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -2185,7 +2185,7 @@ build_constructor_from_vec (tree type, const vec *vals)
 {
   vec *v = NULL;
 
-  for (tree t : *vals)
+  for (tree t : vals)
 CONSTRUCTOR_APPEND_ELT (v, NULL_TREE, t);
 
   return build_constructor (type, v);

base-commit: 3843fa2d75a76ca64d3366950f0ac3d7d4729c4c
-- 
2.27.0

Re: [PATCH] implement pre-c++20 contracts

2020-12-03 Thread Jason Merrill via Gcc-patches

On 12/3/20 12:07 PM, Andrew Sutton wrote:

 > Attached is a new squashed revision of the patch sans ChangeLogs. The
 > current work is now being done on github:
 > https://github.com/lock3/gcc/tree/contracts-jac-alt

I'm starting to review this now, sorry for the delay. Is this still the
branch you want me to consider for GCC 11?  I notice that the
-constexpr and -mangled-config branches are newer.

I think so. Jeff can answer more authoritatively. I know we had one set 
of changes to the design (how contracts work) aimed at improving the 
debugging experience for violated contracts. I'm not sure if that's in 
the jac-alt branch though.

The -constexpr branch checks for trivially satisfied contracts (e.g., 
[[assert: true]]) and issues warnings. It also preemptively checks 
preconditions against constant function arguments. It's probably worth 
reviewing that separately.

I'm not sure the -mangled-config branch is worth considering for merging 
at this point. It's trying to solve a problem that might not be worth 
solving.

OK, I'll start with -alt then, thanks.

Out of curiosity, are you concerned that future versions of contracts 
might have considerably different syntax or configurability? I'd hope it 
wouldn't, but who knows where SG21 is going :)

Not particularly; I figure that most of the implementation would be 
unaffected.

Jason

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Richard Sandiford via Gcc-patches

Richard Biener via Gcc-patches  writes:
> On Tue, Nov 24, 2020 at 4:47 PM Qing Zhao  wrote:
>> Another issue is, in order to check whether an auto-variable has 
>> initializer, I plan to add a new bit in “decl_common” as:
>>   /* In a VAR_DECL, this is DECL_IS_INITIALIZED.  */
>>   unsigned decl_is_initialized :1;
>>
>> /* IN VAR_DECL, set when the decl is initialized at the declaration.  */
>> #define DECL_IS_INITIALIZED(NODE) \
>>   (DECL_COMMON_CHECK (NODE)->decl_common.decl_is_initialized)
>>
>> set this bit when setting DECL_INITIAL for the variables in FE. then keep it
>> even though DECL_INITIAL might be NULLed.
>
> For locals it would be more reliable to set this flag during gimplification.
>
>> Do you have any comment and suggestions?
>
> As said above - do you want to cover registers as well as locals?  I'd do
> the actual zeroing during RTL expansion instead since otherwise you
> have to figure out yourself whether a local is actually used (see
> expand_stack_vars)
>
> Note that optimization will already have made use of "uninitialized" state
> of locals so depending on what the actual goal is here "late" may be too late.

Haven't thought about this much, so it might be a daft idea, but would a
compromise be to use a const internal function:

  X1 = .DEFERRED_INIT (X0, INIT)

where the X0 argument is an uninitialised value and the INIT argument
describes the initialisation pattern?  So for a decl we'd have:

  X = .DEFERRED_INIT (X, INIT)

and for an SSA name we'd have:

  X_2 = .DEFERRED_INIT (X_1(D), INIT)

with all other uses of X_1(D) being replaced by X_2.  The idea is that:

* Having the X0 argument would keep the uninitialised use of the
  variable around for the later warning passes.

* Using a const function should still allow the UB to be deleted as dead
  if X1 isn't needed.

* Having a function in the way should stop passes from taking advantage
  of direct uninitialised uses for optimisation.

This means we won't be able to optimise based on the actual init
value at the gimple level, but that seems like a fair trade-off.
AIUI this is really a security feature or anti-UB hardening feature
(in the sense that users are more likely to see predictable behaviour
“in the field” even if the program has UB).

Thanks,
Richard

Re: [pushed] c++: Push parms when late parsing default args

2020-12-03 Thread Rainer Orth

Hi Jason,

> In this testcase we weren't catching the error in A::f because the parameter
> 'I' wasn't in scope, so the default argument for 'b' found the global
> typedef I.  Fixed by pushing the parms before parsing.  This is a bit
> complicated because pushdecl clears DECL_CHAIN; do_push_parm_decls deals
> with this by nreversing first, but that doesn't work here because we only
> want to push them one at a time; if we pushed all of them before parsing,
> we'd wrongly reject A::g.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> gcc/cp/ChangeLog:
>
>   * parser.c (cp_parser_primary_expression): Distinguish
>   parms from vars in error.
>   (cp_parser_late_parsing_default_args): Pushdecl parms
>   as we go.

this patch broke i386-pc-solaris2.11 and sparc-sun-solaris2.11 bootstrap
with gcc 8.1.0 in stage 1:

/vol/gcc/src/hg/master/local/gcc/cp/parser.c: In function 'void 
cp_parser_late_parsing_default_args(cp_parser*, tree)':
/vol/gcc/src/hg/master/local/gcc/cp/parser.c:30618:28: error: ambiguous 
overload for 'operator[]' (operand types are 'releasing_vec' and 'int')
   tree parmdecl = parms[i];
^
/vol/gcc/src/hg/master/local/gcc/cp/parser.c:30618:28: note: candidate: 
'operator[](releasing_vec::vec_t* {aka vec*}, int)' 

In file included from /vol/gcc/src/hg/master/local/gcc/cp/parser.c:25:
/vol/gcc/src/hg/master/local/gcc/cp/cp-tree.h:965:9: note: candidate: 
'tree_node*& releasing_vec::operator[](unsigned int) const'
   tree& operator[] (unsigned i) const { return (*v)[i]; }
 ^~~~
In file included from /vol/gcc/src/hg/master/local/gcc/c-family/c-common.h:26,
 from /vol/gcc/src/hg/master/local/gcc/cp/cp-tree.h:40,
 from /vol/gcc/src/hg/master/local/gcc/cp/parser.c:25:
/vol/gcc/src/hg/master/local/gcc/cp/parser.c:30647:24: error: ambiguous 
overload for 'operator[]' (operand types are 'releasing_vec' and 'int')
   DECL_CHAIN (parms[i]) = parm;
^
/vol/gcc/src/hg/master/local/gcc/tree.h:286:26: note: in definition of macro 
'CONTAINS_STRUCT_CHECK'
 (contains_struct_check ((T), (STRUCT), __FILE__, __LINE__, __FUNCTION__))
  ^
/vol/gcc/src/hg/master/local/gcc/tree.h:2424:27: note: in expansion of macro 
'TREE_CHAIN'
 #define DECL_CHAIN(NODE) (TREE_CHAIN (DECL_MINIMAL_CHECK (NODE)))
[...]
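
For anyone puzzled by the diagnostic, here is a reduced sketch of the overload situation as I read it from the two candidates listed above. It is an interpretation, not a statement of the actual cause or fix; IntBuf and demo are hypothetical names, not GCC types. A class that converts implicitly to a pointer and also has a member operator[] offers two candidates for buf[i]: the member, and the built-in subscript on the converted pointer.

```cpp
#include <cassert>

namespace demo {

// Hypothetical stand-in for a wrapper in the style of releasing_vec.
struct IntBuf
{
  int *data;

  // The implicit conversion makes the built-in ptr[index] a candidate.
  operator int* () const { return data; }

  // Member subscript taking an unsigned index.
  int& operator[] (unsigned i) const { return data[i]; }
};

inline int
second (const IntBuf &buf)
{
  // An unsigned index matches the member exactly, so the member is at
  // least as good on every argument and is selected.  With a plain int
  // index the member needs an int -> unsigned conversion; on a target
  // where ptrdiff_t is int the built-in candidate matches the index
  // exactly, so neither candidate is better on both arguments
  // (assumption: this is what the 32-bit hosts above ran into).
  return buf[1u];
}

} // namespace demo
```

Writing the index as unsigned, as in second(), is one way to sidestep the ambiguity in user code; whether that is the right fix for parser.c is a separate question.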

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University

Re: [PATCH] 2/2 Remove debug/array

2020-12-03 Thread Jonathan Wakely via Gcc-patches

On 03/12/20 18:14 +0100, Daniel Krügler via Libstdc++ wrote:

Am Do., 3. Dez. 2020 um 18:10 Uhr schrieb Jonathan Wakely via
Libstdc++ :

[..]

>>Ok to commit ?
>
>Yes, this is a nice simplification, thanks.

This broke the C++11 constexpr support in std::array. Fixed with this
patch. Tested x86_64-linux, committed to trunk.

Wouldn't a transformation into a comma expression, such as

return __glibcxx_requires_subscript(__n), _AT_Type::_S_ref(_M_elems, __n);

realize the same thing but would still keep the assertion-like thing?

No, because the assertion is defined as  do { ... } while(false) so
can't be used in C++11 constexpr functions.

We could change that, or introduce new assertion macros just for this
case, but I don't care about C++11 enough to do it.
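
For context, here is a sketch of what an expression-form check can look like in a C++11 constexpr function. This is not the committed fix and not an existing libstdc++ macro; checked_at, table and demo are invented names. The whole body is a single return statement, and the failure path calls a non-constexpr function inside a comma expression:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace demo {

// Valid in C++11: the body is one return statement.  If n is out of
// range, the second operand is evaluated.  std::abort() is not
// constexpr, so in a constant expression that path is a compile-time
// error, and at run time it terminates the program.
template<std::size_t N>
constexpr int
checked_at (const int (&arr)[N], std::size_t n)
{
  return n < N ? arr[n] : (std::abort (), arr[0]);
}

constexpr int table[] = { 10, 20, 30 };

// In range, so this is usable in a constant expression.
static_assert (checked_at (table, 1) == 20, "constant-evaluated lookup");

} // namespace demo
```

A statement-form macro cannot be dropped into the place of std::abort () here, which is the limitation described above.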

[committed] libstdc++: Update powerpc-linux baselines for GCC 10.1

2020-12-03 Thread Jonathan Wakely via Gcc-patches

This should have been done before the GCC 10.1 release.

libstdc++-v3/ChangeLog:

* config/abi/post/powerpc-linux-gnu/baseline_symbols.txt:
Update.
* config/abi/post/powerpc64-linux-gnu/32/baseline_symbols.txt:
Update.

Tested powerpc64-linux. Committed to trunk. Backport to gcc-10 to
follow.

commit 3843fa2d75a76ca64d3366950f0ac3d7d4729c4c
Author: Jonathan Wakely 
Date:   Thu Dec 3 17:18:28 2020

libstdc++: Update powerpc-linux baselines for GCC 10.1

This should have been done before the GCC 10.1 release.

libstdc++-v3/ChangeLog:

* config/abi/post/powerpc-linux-gnu/baseline_symbols.txt:
Update.
* config/abi/post/powerpc64-linux-gnu/32/baseline_symbols.txt:
Update.

diff --git 
a/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt 
b/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt
index 5cb72bbfcb4..63a58b03596 100644
--- a/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/powerpc-linux-gnu/baseline_symbols.txt
@@ -2208,16 +2208,20 @@ FUNC:_ZNSt12__basic_fileIcED1Ev@@GLIBCXX_3.4
 FUNC:_ZNSt12__basic_fileIcED2Ev@@GLIBCXX_3.4
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1EOS5_@@GLIBCXX_3.4.26
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2EOS5_@@GLIBCXX_3.4.28
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev@@GLIBCXX_3.4.27
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1EOS4_@@GLIBCXX_3.4.26
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2EOS4_@@GLIBCXX_3.4.28
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev@@GLIBCXX_3.4.27
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEaSEOS4_@@GLIBCXX_3.4.26
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1EOS6_@@GLIBCXX_3.4.26
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2EOS6_@@GLIBCXX_3.4.28
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEC2Ev@@GLIBCXX_3.4.27
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1EOS5_@@GLIBCXX_3.4.26
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC1Ev@@GLIBCXX_3.4.26
+FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2EOS5_@@GLIBCXX_3.4.28
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEC2Ev@@GLIBCXX_3.4.27
 
FUNC:_ZNSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEaSEOS5_@@GLIBCXX_3.4.26
 FUNC:_ZNSt12bad_weak_ptrD0Ev@@GLIBCXX_3.4.15
@@ -3191,12 +3195,18 @@ FUNC:_ZNSt3_V214error_categoryD1Ev@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V214error_categoryD2Ev@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V215system_categoryEv@@GLIBCXX_3.4.21
 FUNC:_ZNSt3_V216generic_categoryEv@@GLIBCXX_3.4.21
+FUNC:_ZNSt3pmr15memory_resourceD0Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr15memory_resourceD1Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr15memory_resourceD2Ev@@GLIBCXX_3.4.28
 FUNC:_ZNSt3pmr19new_delete_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20get_default_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20null_memory_resourceEv@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr20set_default_resourceEPNS_15memory_resourceE@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr25monotonic_buffer_resource13_M_new_bufferEjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr25monotonic_buffer_resource18_M_release_buffersEv@@GLIBCXX_3.4.26
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD0Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD1Ev@@GLIBCXX_3.4.28
+FUNC:_ZNSt3pmr25monotonic_buffer_resourceD2Ev@@GLIBCXX_3.4.28
 FUNC:_ZNSt3pmr26synchronized_pool_resource11do_allocateEjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr26synchronized_pool_resource13do_deallocateEPvjj@@GLIBCXX_3.4.26
 FUNC:_ZNSt3pmr26synchronized_pool_resource7releaseEv@@GLIBCXX_3.4.26
@@ -4642,6 +4652,7 @@ OBJECT:0:GLIBCXX_3.4.24
 OBJECT:0:GLIBCXX_3.4.25
 OBJECT:0:GLIBCXX_3.4.26
 OBJECT:0:GLIBCXX_3.4.27
+OBJECT:0:GLIBCXX_3.4.28
 OBJECT:0:GLIBCXX_3.4.3
 OBJECT:0:GLIBCXX_3.4.4
 OBJECT:0:GLIBCXX_3.4.5
@@ -4680,6 +4691,7 @@ 
OBJECT:12:_ZTINSt17__gnu_cxx_ldbl1289money_getIcSt19istreambuf_iteratorIcSt11cha
 
OBJECT:12:_Z

[PATCH][GCC10][6/6] arm: Add vstN_lane_bf16 + vstNq_lane_bf16 intrisics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

last patch of the series to backport a number of bfloat16 intrinsics from
trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea

From 614211164b83a1cd426c10c8894cf0aa2837e070 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 29 Oct 2020 11:20:23 +0100
Subject: [PATCH 6/6] arm: Add vstN_lane_bf16 + vstNq_lane_bf16 intrisics

gcc/ChangeLog

2020-10-29  Andrea Corallo  

* config/arm/arm_neon.h (vst2_lane_bf16, vst2q_lane_bf16)
(vst3_lane_bf16, vst3q_lane_bf16, vst4_lane_bf16)
(vst4q_lane_bf16): New intrinsics.
* config/arm/arm_neon_builtins.def: Touch it for:
__builtin_neon_vst2_lanev4bf, __builtin_neon_vst2_lanev8bf,
__builtin_neon_vst3_lanev4bf, __builtin_neon_vst3_lanev8bf,
__builtin_neon_vst4_lanev4bf,__builtin_neon_vst4_lanev8bf.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  

* gcc.target/aarch64/advsimd-intrinsics/vst2_lane_bf16_indices_1.c:
Run it also for arm-*-*.
* gcc.target/aarch64/advsimd-intrinsics/vst2q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst3_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst3q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst4_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vst4q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/arm/simd/vstn_lane_bf16_1.c: New test.
---
 gcc/config/arm/arm_neon.h | 48 
 gcc/config/arm/arm_neon_builtins.def  | 12 +--
 .../vst2_lane_bf16_indices_1.c|  2 +-
 .../vst2q_lane_bf16_indices_1.c   |  2 +-
 .../vst3_lane_bf16_indices_1.c|  2 +-
 .../vst3q_lane_bf16_indices_1.c   |  2 +-
 .../vst4_lane_bf16_indices_1.c|  2 +-
 .../vst4q_lane_bf16_indices_1.c   |  2 +-
 .../gcc.target/arm/simd/vstn_lane_bf16_1.c| 73 +++
 9 files changed, 133 insertions(+), 12 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vstn_lane_bf16_1.c

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 4fee128ce8d..9569e1a4c9c 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19783,6 +19783,54 @@ vld4q_lane_bf16 (const bfloat16_t * __a, 
bfloat16x8x4_t __b, const int __c)
   return __rv.__i;
 }
 
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst2_lane_bf16 (bfloat16_t * __a, bfloat16x4x2_t __b, const int __c)
+{
+  union { bfloat16x4x2_t __i; __builtin_neon_ti __o; } __bu = { __b };
+  __builtin_neon_vst2_lanev4bf (__a, __bu.__o, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst2q_lane_bf16 (bfloat16_t * __a, bfloat16x8x2_t __b, const int __c)
+{
+  union { bfloat16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b };
+  __builtin_neon_vst2_lanev8bf (__a, __bu.__o, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst3_lane_bf16 (bfloat16_t * __a, bfloat16x4x3_t __b, const int __c)
+{
+  union { bfloat16x4x3_t __i; __builtin_neon_ei __o; } __bu = { __b };
+  __builtin_neon_vst3_lanev4bf (__a, __bu.__o, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst3q_lane_bf16 (bfloat16_t * __a, bfloat16x8x3_t __b, const int __c)
+{
+  union { bfloat16x8x3_t __i; __builtin_neon_ci __o; } __bu = { __b };
+  __builtin_neon_vst3_lanev8bf (__a, __bu.__o, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst4_lane_bf16 (bfloat16_t * __a, bfloat16x4x4_t __b, const int __c)
+{
+  union { bfloat16x4x4_t __i; __builtin_neon_oi __o; } __bu = { __b };
+  __builtin_neon_vst4_lanev4bf (__a, __bu.__o, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst4q_lane_bf16 (bfloat16_t * __a, bfloat16x8x4_t __b, const int __c)
+{
+  union { bfloat16x8x4_t __i; __builtin_neon_xi __o; } __bu = { __b };
+  __builtin_neon_vst4_lanev8bf (__a, __bu.__o, __c);
+}
+
 #pragma GCC pop_options
 
 #ifdef __cplusplus
diff --git a/gcc/config/arm/arm_neon_builtins.def 
b/gcc/config/arm/arm_neon_builtins.def
index 1cb8c8c23b4..0ff0494b5da 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -329,8 +329,8 @@ VAR11 (LOAD1LANE, vld2_lane,
 VAR8 (LOAD1, vld2_dup, v8qi, v4hi, v4hf, v2si, v2sf, di, v4bf, v8bf)
 VAR13 (S

Re: [PATCH] 2/2 Remove debug/array

2020-12-03 Thread Daniel Krügler via Gcc-patches

On Thu, 3 Dec 2020 at 18:10, Jonathan Wakely via
Libstdc++ wrote:
>
[..]
> >>Ok to commit ?
> >
> >Yes, this is a nice simplification, thanks.
>
> This broke the C++11 constexpr support in std::array. Fixed with this
> patch. Tested x86_64-linux, committed to trunk.

Wouldn't a transformation into a comma expression, such as

return __glibcxx_requires_subscript(__n), _AT_Type::_S_ref(_M_elems, __n);

achieve the same thing while still keeping the assertion-like check?
(Untested, just off the top of my head.)

- Daniel
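
A minimal sketch of the comma-expression idea, under stated assumptions:
demo_array and DEMO_REQUIRE are hypothetical names, and DEMO_REQUIRE is a
hand-written check that is itself an expression of type void. Whether
__glibcxx_requires_subscript can be used this way depends on its actual
expansion, which is not shown in this thread.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical check usable as an expression: yields void when the
// condition holds and calls std::abort() otherwise.
#define DEMO_REQUIRE(cond) ((cond) ? void() : std::abort())

template <typename T, std::size_t N>
struct demo_array
{
  T elems[N];

  // The body is a single return statement, so it satisfies the C++11
  // rules for constexpr functions while still performing the check.
  constexpr const T &
  operator[] (std::size_t n) const noexcept
  {
    return DEMO_REQUIRE (n < N), elems[n];
  }
};
```

For an in-range index the call remains a constant expression, because the
std::abort() branch is never evaluated; an out-of-range index aborts at
run time, or is rejected when used in a constant expression.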

[PATCH][GCC10][5/6] arm: Add vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the fifth patch of the series to backport a number of bfloat16
intrinsics from trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea

>From d3d58cbc90bc9bd4ac890f7871558bf52b5b6b37 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Mon, 26 Oct 2020 18:31:19 +0100
Subject: [PATCH 5/6] arm: Add vldN_lane_bf16 + vldNq_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  

* config/arm/arm_neon.h (vld2_lane_bf16, vld2q_lane_bf16)
(vld3_lane_bf16, vld3q_lane_bf16, vld4_lane_bf16)
(vld4q_lane_bf16): Add intrinsics.
* config/arm/arm_neon_builtins.def: Touch for:
__builtin_neon_vld2_lanev4bf, __builtin_neon_vld2_lanev8bf,
__builtin_neon_vld3_lanev4bf, __builtin_neon_vld3_lanev8bf,
__builtin_neon_vld4_lanev4bf, __builtin_neon_vld4_lanev8bf.
* config/arm/iterators.md (VQ_HS): Add V8BF to the iterator.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  

* gcc.target/aarch64/advsimd-intrinsics/vld2_lane_bf16_indices_1.c:
Run it also for the arm backend.
* gcc.target/aarch64/advsimd-intrinsics/vld2q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld3q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/aarch64/advsimd-intrinsics/vld4q_lane_bf16_indices_1.c:
Likewise.
* gcc.target/arm/simd/vldn_lane_bf16_1.c: New test.
---
 gcc/config/arm/arm_neon.h | 62 +++
 gcc/config/arm/arm_neon_builtins.def  | 12 +--
 gcc/config/arm/iterators.md   |  2 +-
 .../vld2_lane_bf16_indices_1.c|  2 +-
 .../vld2q_lane_bf16_indices_1.c   |  2 +-
 .../vld3_lane_bf16_indices_1.c|  2 +-
 .../vld3q_lane_bf16_indices_1.c   |  2 +-
 .../vld4_lane_bf16_indices_1.c|  2 +-
 .../vld4q_lane_bf16_indices_1.c   |  2 +-
 .../gcc.target/arm/simd/vldn_lane_bf16_1.c| 79 +++
 10 files changed, 154 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vldn_lane_bf16_1.c

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 24aad3370f6..4fee128ce8d 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19721,6 +19721,68 @@ vst1q_lane_bf16 (bfloat16_t * __a, bfloat16x8_t __b, 
const int __c)
   __builtin_neon_vst1_lanev8bf (__a, __b, __c);
 }
 
+__extension__ extern __inline bfloat16x4x2_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld2_lane_bf16 (const bfloat16_t * __a, bfloat16x4x2_t __b, const int __c)
+{
+  union { bfloat16x4x2_t __i; __builtin_neon_ti __o; } __bu = { __b };
+  union { bfloat16x4x2_t __i; __builtin_neon_ti __o; } __rv;
+  __rv.__o = __builtin_neon_vld2_lanev4bf ( __a, __bu.__o, __c);
+  return __rv.__i;
+}
+
+__extension__ extern __inline bfloat16x8x2_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld2q_lane_bf16 (const bfloat16_t * __a, bfloat16x8x2_t __b, const int __c)
+{
+  union { bfloat16x8x2_t __i; __builtin_neon_oi __o; } __bu = { __b };
+  union { bfloat16x8x2_t __i; __builtin_neon_oi __o; } __rv;
+  __rv.__o = __builtin_neon_vld2_lanev8bf (__a, __bu.__o, __c);
+  return __rv.__i;
+}
+
+__extension__ extern __inline bfloat16x4x3_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld3_lane_bf16 (const bfloat16_t * __a, bfloat16x4x3_t __b, const int __c)
+{
+  union { bfloat16x4x3_t __i; __builtin_neon_ei __o; } __bu = { __b };
+  union { bfloat16x4x3_t __i; __builtin_neon_ei __o; } __rv;
+  __rv.__o = __builtin_neon_vld3_lanev4bf (__a, __bu.__o, __c);
+  return __rv.__i;
+}
+
+__extension__ extern __inline bfloat16x8x3_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld3q_lane_bf16 (const bfloat16_t * __a, bfloat16x8x3_t __b, const int __c)
+{
+  union { bfloat16x8x3_t __i; __builtin_neon_ci __o; } __bu = { __b };
+  union { bfloat16x8x3_t __i; __builtin_neon_ci __o; } __rv;
+  __rv.__o = __builtin_neon_vld3_lanev8bf (__a, __bu.__o, __c);
+  return __rv.__i;
+}
+
+__extension__ extern __inline bfloat16x4x4_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld4_lane_bf16 (const bfloat16_t * __a, bfloat16x4x4_t __b, const int __c)
+{
+  union { bfloat16x4x4_t __i; __builtin_neon_oi __o; } __bu = { __b };
+  union { bfloat16x4x4_t __i; __builtin_neon_oi __o; } __rv;
+  __rv.__o = __builtin_neon_vld4_lanev4bf (__a,
+  __bu.__o, __c);
+  return __rv.__i;
+}
+
+__extension__ extern __inline bfloat16x8x4_t
+__

[PATCH][GCC10][4/6] arm: Add vst1_bf16 + vst1q_bf16 intrinsics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the fourth patch of the series to backport a number of bfloat16
intrinsics from trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea

>From c2b787d773ff51485d0fdc594596b0873beb59c5 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 29 Oct 2020 15:11:37 +0100
Subject: [PATCH 4/6] arm: Add vst1_bf16 + vst1q_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  

* config/arm/arm_neon.h (vst1_bf16, vst1q_bf16): Add intrinsics.
* config/arm/arm_neon_builtins.def : Touch for:
__builtin_neon_vst1v4bf, __builtin_neon_vst1v8bf.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  

* gcc.target/arm/simd/vst1_bf16_1.c: New test.
---
 gcc/config/arm/arm_neon.h | 14 +
 gcc/config/arm/arm_neon_builtins.def  |  5 ++--
 .../gcc.target/arm/simd/vst1_bf16_1.c | 29 +++
 3 files changed, 46 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_bf16_1.c

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index b77175eaa3e..24aad3370f6 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19509,6 +19509,20 @@ vbfmlaltq_laneq_f32 (float32x4_t __r, bfloat16x8_t 
__a, bfloat16x8_t __b,
   return __builtin_neon_vfmat_laneqv8bf (__r, __a, __b, __index);
 }
 
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst1_bf16 (bfloat16_t * __a, bfloat16x4_t __b)
+{
+  __builtin_neon_vst1v4bf (__a, __b);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst1q_bf16 (bfloat16_t * __a, bfloat16x8_t __b)
+{
+  __builtin_neon_vst1v8bf (__a, __b);
+}
+
 __extension__ extern __inline void
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 vst2_bf16 (bfloat16_t * __ptr, bfloat16x4x2_t __val)
diff --git a/gcc/config/arm/arm_neon_builtins.def 
b/gcc/config/arm/arm_neon_builtins.def
index 07eda44cc58..e3ab6281497 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -317,8 +317,9 @@ VAR12 (LOAD1LANE, vld1_lane,
v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di, v4bf, v8bf)
 VAR10 (LOAD1, vld1_dup,
v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di)
-VAR12 (STORE1, vst1,
-   v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di)
+VAR14 (STORE1, vst1,
+v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di,
+v4bf, v8bf)
 VAR14 (STORE1LANE, vst1_lane,
v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di, 
v4bf, v8bf)
 VAR13 (LOAD1, vld2,
diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vst1_bf16_1.c
new file mode 100644
index 000..06fb58ecd79
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vst1_bf16_1.c
@@ -0,0 +1,29 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-save-temps -O2 -mfloat-abi=hard" }  */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+/*
+**test_vst1_bf16:
+** vst1.16 {d0}, \[r0\]
+** bx  lr
+*/
+void
+test_vst1_bf16 (bfloat16_t *a, bfloat16x4_t b)
+{
+  vst1_bf16 (a, b);
+}
+
+/*
+**test_vst1q_bf16:
+** vst1.16 {d0-d1}, \[r0\]
+** bx  lr
+*/
+void
+test_vst1q_bf16 (bfloat16_t *a, bfloat16x8_t b)
+{
+  vst1q_bf16 (a, b);
+}
-- 
2.20.1

[PATCH][GCC10][3/6] arm: Add vld1_bf16 + vld1q_bf16 intrinsics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the third patch of the series to backport a number of bfloat16
intrinsics from trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea

>From 995779bf10731d00acf6701b57251aeb5d4e46b6 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 29 Oct 2020 13:56:17 +0100
Subject: [PATCH 3/6] arm: Add vld1_bf16 + vld1q_bf16 intrinsics

gcc/ChangeLog

2020-10-29  Andrea Corallo  

* config/arm/arm-builtins.c (VAR14): Define macro.
* config/arm/arm_neon_builtins.def: Touch for:
__builtin_neon_vld1v4bf, __builtin_neon_vld1v8bf.
* config/arm/arm_neon.h (vld1_bf16, vld1q_bf16): Add intrinsics.

gcc/testsuite/ChangeLog

2020-10-29  Andrea Corallo  

* gcc.target/arm/simd/vld1_bf16_1.c: New test.
---
 gcc/config/arm/arm-builtins.c |  3 ++
 gcc/config/arm/arm_neon.h | 14 +
 gcc/config/arm/arm_neon_builtins.def  |  5 ++--
 .../gcc.target/arm/simd/vld1_bf16_1.c | 29 +++
 4 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_bf16_1.c

diff --git a/gcc/config/arm/arm-builtins.c b/gcc/config/arm/arm-builtins.c
index 4716771d7e4..73650637e5e 100644
--- a/gcc/config/arm/arm-builtins.c
+++ b/gcc/config/arm/arm-builtins.c
@@ -946,6 +946,9 @@ typedef struct {
 #define VAR13(T, N, A, B, C, D, E, F, G, H, I, J, K, L, M) \
   VAR12 (T, N, A, B, C, D, E, F, G, H, I, J, K, L) \
   VAR1 (T, N, M)
+#define VAR14(T, N, A, B, C, D, E, F, G, H, I, J, K, L, M, O) \
+  VAR13 (T, N, A, B, C, D, E, F, G, H, I, J, K, L, M) \
+  VAR1 (T, N, O)
 
 /* The builtin data can be found in arm_neon_builtins.def, arm_vfp_builtins.def
and arm_acle_builtins.def.  The entries in arm_neon_builtins.def require
diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index 432d77fb272..b77175eaa3e 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19557,6 +19557,20 @@ vst4q_bf16 (bfloat16_t * __ptr, bfloat16x8x4_t __val)
   return __builtin_neon_vst4v8bf (__ptr, __bu.__o);
 }
 
+__extension__ extern __inline bfloat16x4_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld1_bf16 (bfloat16_t const * __ptr)
+{
+  return __builtin_neon_vld1v4bf (__ptr);
+}
+
+__extension__ extern __inline bfloat16x8_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld1q_bf16 (const bfloat16_t * __ptr)
+{
+  return __builtin_neon_vld1v8bf (__ptr);
+}
+
 __extension__ extern __inline bfloat16x4x2_t
 __attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
 vld2_bf16 (bfloat16_t const * __ptr)
diff --git a/gcc/config/arm/arm_neon_builtins.def 
b/gcc/config/arm/arm_neon_builtins.def
index 7a5dae0c4c0..07eda44cc58 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -310,8 +310,9 @@ VAR1 (TERNOP, vtbx1, v8qi)
 VAR1 (TERNOP, vtbx2, v8qi)
 VAR1 (TERNOP, vtbx3, v8qi)
 VAR1 (TERNOP, vtbx4, v8qi)
-VAR12 (LOAD1, vld1,
-v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di)
+VAR14 (LOAD1, vld1,
+v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di,
+v4bf, v8bf)
 VAR12 (LOAD1LANE, vld1_lane,
v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di, v4bf, v8bf)
 VAR10 (LOAD1, vld1_dup,
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_bf16_1.c
new file mode 100644
index 000..b6b00dc03c2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_bf16_1.c
@@ -0,0 +1,29 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-save-temps -O2 -mfloat-abi=hard" }  */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#include "arm_neon.h"
+
+/*
+**test_vld1_bf16:
+** vld1.16 {d0}, \[r0\]
+** bx  lr
+*/
+bfloat16x4_t
+test_vld1_bf16 (bfloat16_t const *p)
+{
+  return vld1_bf16 (p);
+}
+
+/*
+**test_vld1q_bf16:
+** vld1.16 {d0-d1}, \[r0\]
+** bx  lr
+*/
+bfloat16x8_t
+test_vld1q_bf16 (bfloat16_t const *p)
+{
+  return vld1q_bf16 (p);
+}
-- 
2.20.1
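
The VAR14 macro that this patch adds to arm-builtins.c follows the
pattern of the existing VAR13: each VARn forwards its first n-1 modes to
the previous macro and hands the last one to VAR1. The miniature below
illustrates only that chaining. It is a sketch, not GCC code: here VAR1
just produces a string such as "vld1_v8bf", which is an assumption made
for the demo and not what the real VAR1 expands to, and demo_builtins and
demo_count are hypothetical names.

```cpp
#include <cstddef>

// Demo-only VAR1: turn (name, mode) into one string table entry.
#define VAR1(T, N, A) #N "_" #A,
// Each VARn reuses VAR(n-1) for the leading modes and VAR1 for the last.
#define VAR2(T, N, A, B) VAR1 (T, N, A) VAR1 (T, N, B)
#define VAR3(T, N, A, B, C) VAR2 (T, N, A, B) VAR1 (T, N, C)

// One VAR3 line expands to three entries, one per mode.
constexpr const char *const demo_builtins[] = {
  VAR3 (LOAD1, vld1, v4hf, v4bf, v8bf)
};

constexpr std::size_t demo_count
  = sizeof demo_builtins / sizeof demo_builtins[0];
```

Adding two modes to an entry therefore means switching to a macro two
steps further along the chain, which is why the vld1 line moves from
VAR12 to VAR14.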

Re: [PATCH] 2/2 Remove debug/array

2020-12-03 Thread Jonathan Wakely via Gcc-patches


On 09/11/20 13:07 +, Jonathan Wakely wrote:

On 08/11/20 15:27 +0100, François Dumont via Libstdc++ wrote:
Following a recent fix on std::array, this test started to fail in
_GLIBCXX_DEBUG mode:

FAIL: 23_containers/array/comparison_operators/96851.cc (test for
excess errors)

Rather than fixing it, and now that __glibcxx_assert is constexpr
compatible, I would like to propose simply removing
__gnu_debug::array.

The only code we are losing with this change is the
_Array_check_nonempty/_Array_check_subscript types. I am not sure
about the purpose of this code, as I saw no impact on the tests. Maybe it
was to avoid an assertion in constexpr contexts where the value of the
expression is not used, but there is a test doing that and it does
produce an assertion.

Note that I am also moving std::array into the versioned namespace. It is
just for consistency, so it is no problem to drop that change.

I also manually edited include/Makefile.in because I do not have the
proper autoreconf version. Can you regenerate it on your side once the
patch is in?


    libstdc++: Remove 

    Add _GLIBCXX_ASSERTIONS assert in normal std::array and remove
    __gnu_debug::array implementation.

    libstdc++-v3/ChangeLog:

            * include/debug/array: Remove.
            * include/Makefile.am: Remove .
            * include/Makefile.in: Regenerate.
            * include/experimental/functional: Adapt.
            * include/std/array: Move to _GLIBCXX_INLINE_VERSION namespace.
            * include/std/functional: Adapt.
            * include/std/span: Adapt.
            * testsuite/23_containers/array/debug/back1_neg.cc:
            Remove dg-require-debug-mode. Add -D_GLIBCXX_ASSERTIONS option.
            * testsuite/23_containers/array/debug/back2_neg.cc: Likewise.
            * testsuite/23_containers/array/debug/front1_neg.cc: Likewise.
            * testsuite/23_containers/array/debug/front2_neg.cc: Likewise.
            * testsuite/23_containers/array/debug/square_brackets_operator1_neg.cc:
            Likewise.
            * testsuite/23_containers/array/debug/square_brackets_operator2_neg.cc:
            Likewise.
            * testsuite/23_containers/array/element_access/60497.cc
            * testsuite/23_containers/array/tuple_interface/get_debug_neg.cc:
            Remove.
            * testsuite/23_containers/array/tuple_interface/get_neg.cc
            * testsuite/23_containers/array/tuple_interface/tuple_element_debug_neg.cc
            * testsuite/23_containers/array/tuple_interface/tuple_element_neg.cc


Tested under Linux x86_64 normal and debug modes.

Ok to commit ?


Yes, this is a nice simplification, thanks.


This broke the C++11 constexpr support in std::array. Fixed with this
patch. Tested x86_64-linux, committed to trunk.


commit 91cfacc4b5d317b12a3bdcd798273a581568f645
Author: Jonathan Wakely 
Date:   Thu Dec 3 17:08:01 2020

libstdc++: Disable std::array assertions for C++11 constexpr

The recent changes to add assertions to std::array broke the functions
that need to be constexpr in C++11, because of the restrictive rules for
constexpr functions in C++11.

This simply disables the assertions for C++11 mode, so the functions can
be constexpr again.
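
The approach described above can be sketched as follows. This is only an
illustration with hypothetical names (demo_guarded_array, DEMO_ASSERT),
not the libstdc++ source: the check is a separate statement, which a
constexpr function body may only contain from C++14 on, so it is compiled
out for C++11.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical assertion: aborts when the condition does not hold.
#define DEMO_ASSERT(cond) ((cond) ? void() : std::abort())

template <typename T, std::size_t N>
struct demo_guarded_array
{
  T elems[N];

  constexpr const T &
  operator[] (std::size_t n) const noexcept
  {
#if __cplusplus >= 201402L
    // An extra statement before the return: fine since C++14, but it
    // would make the function non-constexpr under the C++11 rules.
    DEMO_ASSERT (n < N);
#endif
    return elems[n];
  }
};
```

In C++11 mode the body reduces to a single return statement, so the
function stays usable in constant expressions at the cost of losing the
check in that mode.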

libstdc++-v3/ChangeLog:

* include/std/array (array::operator[](size_t) const, array::front() const)
(array::back() const) [__cplusplus == 201103]: Disable
assertions.
* testsuite/23_containers/array/element_access/constexpr_element_access.cc:
Check for correct values.
* testsuite/23_containers/array/tuple_interface/get_neg.cc:
Adjust dg-error line numbers.
* testsuite/23_containers/array/debug/constexpr_c++11.cc: New test.

diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index 3c4c88a536e..80750994058 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -192,7 +192,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr const_reference
   operator[](size_type __n) const noexcept
   {
+#if __cplusplus >= 201402L
 	__glibcxx_requires_subscript(__n);
+#endif
 	return _AT_Type::_S_ref(_M_elems, __n);
   }
 
@@ -228,7 +230,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr const_reference
   front() const noexcept
   {
+#if __cplusplus >= 201402L
 	__glibcxx_requires_nonempty();
+#endif
 	return _AT_Type::_S_ref(_M_elems, 0);
   }
 
@@ -242,7 +246,9 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   constexpr const_reference
   back() const noexcept
   {
+#if __cplusplus >= 201402L
 	__glibcxx_requires_nonempty();
+#endif
 	return _Nm ? _AT_Type::_S_ref(_M_elems, _Nm - 1)

[PATCH][GCC10][2/6] arm: Add vst1_lane_bf16 + vstq_lane_bf16 intrinsics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the second patch of the series to backport a number of bfloat16
intrinsics from trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea
>From f138006c8e8bcc35f0bfea816dbc34d6256b0912 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Fri, 23 Oct 2020 14:21:56 +0200
Subject: [PATCH 2/6] arm: Add vst1_lane_bf16 + vstq_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-23  Andrea Corallo  

* config/arm/arm-builtins.c (VAR14): Define macro.
* config/arm/arm_neon.h (vst1_lane_bf16, vst1q_lane_bf16): Add
intrinsics.
* config/arm/arm_neon_builtins.def (STORE1LANE): Add v4bf, v8bf.

gcc/testsuite/ChangeLog

2020-10-23  Andrea Corallo  

* gcc.target/arm/simd/vst1_lane_bf16_1.c: New testcase.
* gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vst1_lane_bf16_indices_1.c: Likewise.
---
 gcc/config/arm/arm_neon.h | 14 
 gcc/config/arm/arm_neon_builtins.def  |  4 ++--
 .../gcc.target/arm/simd/vst1_lane_bf16_1.c| 22 +++
 .../arm/simd/vst1_lane_bf16_indices_1.c   | 17 ++
 .../arm/simd/vstq1_lane_bf16_indices_1.c  | 17 ++
 5 files changed, 72 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
 create mode 100644 
gcc/testsuite/gcc.target/arm/simd/vstq1_lane_bf16_indices_1.c

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index fcd8020425e..432d77fb272 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19679,6 +19679,20 @@ vld1q_lane_bf16 (const bfloat16_t * __a, bfloat16x8_t 
__b, const int __c)
   return __builtin_neon_vld1_lanev8bf (__a, __b, __c);
 }
 
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst1_lane_bf16 (bfloat16_t * __a, bfloat16x4_t __b, const int __c)
+{
+  __builtin_neon_vst1_lanev4bf (__a, __b, __c);
+}
+
+__extension__ extern __inline void
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vst1q_lane_bf16 (bfloat16_t * __a, bfloat16x8_t __b, const int __c)
+{
+  __builtin_neon_vst1_lanev8bf (__a, __b, __c);
+}
+
 #pragma GCC pop_options
 
 #ifdef __cplusplus
diff --git a/gcc/config/arm/arm_neon_builtins.def 
b/gcc/config/arm/arm_neon_builtins.def
index d0617a4695d..7a5dae0c4c0 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -318,8 +318,8 @@ VAR10 (LOAD1, vld1_dup,
v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di)
 VAR12 (STORE1, vst1,
v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di)
-VAR12 (STORE1LANE, vst1_lane,
-   v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di)
+VAR14 (STORE1LANE, vst1_lane,
+   v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di, 
v4bf, v8bf)
 VAR13 (LOAD1, vld2,
v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v4bf, 
v8bf)
 VAR9 (LOAD1LANE, vld2_lane,
diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
new file mode 100644
index 000..8564b8fa062
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_1.c
@@ -0,0 +1,22 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-O3 --save-temps -mfloat-abi=hard" } */
+
+#include "arm_neon.h"
+
+void
+test_vst1_lane_bf16 (bfloat16_t *a, bfloat16x4_t b)
+{
+  vst1_lane_bf16 (a, b, 1);
+}
+
+void
+test_vst1q_lane_bf16 (bfloat16_t *a, bfloat16x8_t b)
+{
+  vst1q_lane_bf16 (a, b, 2);
+}
+
+/* { dg-final { scan-assembler "vst1.16\t{d0\\\[1\\\]}, \\\[r0\\\]" } } */
+/* { dg-final { scan-assembler "vst1.16\t{d0\\\[2\\\]}, \\\[r0\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
new file mode 100644
index 000..1bd68718d10
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vst1_lane_bf16_indices_1.c
@@ -0,0 +1,17 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-mfloat-abi=hard" } */
+
+#include "arm_neon.h"
+
+void
+test_vst1_lane_bf16 (bfloat16_t *a, bfloat16x4_t b)
+{
+  vst1_lane_bf16 (a, b, -1);
+  vst1_lane_bf16 (a, b, 4);
+}
+
+/* { dg-error "lane -

[PATCH][GCC10][1/6] arm: Add vld1_lane_bf16 + vldq_lane_bf16 intrinsics

2020-12-03 Thread Andrea Corallo via Gcc-patches

Hi all,

This is the first patch of the series to backport a number of bfloat16
intrinsics from trunk to gcc-10.

These patches include the fixes to the tests that we have applied
to master.

Please refer to:
ACLE 
ISA  

The series has been bootstrapped on arm-linux-gnueabihf and regtested.

Okay for gcc-10?

Thanks

  Andrea
>From 7b2080b71405918769811174082646219d23163c Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Wed, 21 Oct 2020 11:16:01 +0200
Subject: [PATCH 1/6] arm: Add vld1_lane_bf16 + vldq_lane_bf16 intrinsics

gcc/ChangeLog

2020-10-21  Andrea Corallo  

* config/arm/arm_neon_builtins.def: Add to LOAD1LANE v4bf, v8bf.
* config/arm/arm_neon.h (vld1_lane_bf16, vld1q_lane_bf16): Add
intrinsics.

gcc/testsuite/ChangeLog

2020-10-21  Andrea Corallo  

* gcc.target/arm/simd/vld1_lane_bf16_1.c: New testcase.
* gcc.target/arm/simd/vld1_lane_bf16_indices_1.c: Likewise.
* gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c: Likewise.
---
 gcc/config/arm/arm_neon.h | 14 
 gcc/config/arm/arm_neon_builtins.def  |  4 ++--
 .../gcc.target/arm/simd/vld1_lane_bf16_1.c| 22 +++
 .../arm/simd/vld1_lane_bf16_indices_1.c   | 19 
 .../arm/simd/vld1q_lane_bf16_indices_1.c  | 19 
 5 files changed, 76 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
 create mode 100644 
gcc/testsuite/gcc.target/arm/simd/vld1q_lane_bf16_indices_1.c

diff --git a/gcc/config/arm/arm_neon.h b/gcc/config/arm/arm_neon.h
index aa21730dea0..fcd8020425e 100644
--- a/gcc/config/arm/arm_neon.h
+++ b/gcc/config/arm/arm_neon.h
@@ -19665,6 +19665,20 @@ vld4q_dup_bf16 (const bfloat16_t * __ptr)
   return __rv.__i;
 }
 
+__extension__ extern __inline bfloat16x4_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld1_lane_bf16 (const bfloat16_t * __a, bfloat16x4_t __b, const int __c)
+{
+  return __builtin_neon_vld1_lanev4bf (__a, __b, __c);
+}
+
+__extension__ extern __inline bfloat16x8_t
+__attribute__  ((__always_inline__, __gnu_inline__, __artificial__))
+vld1q_lane_bf16 (const bfloat16_t * __a, bfloat16x8_t __b, const int __c)
+{
+  return __builtin_neon_vld1_lanev8bf (__a, __b, __c);
+}
+
 #pragma GCC pop_options
 
 #ifdef __cplusplus
diff --git a/gcc/config/arm/arm_neon_builtins.def 
b/gcc/config/arm/arm_neon_builtins.def
index 34c1945c0a1..d0617a4695d 100644
--- a/gcc/config/arm/arm_neon_builtins.def
+++ b/gcc/config/arm/arm_neon_builtins.def
@@ -312,8 +312,8 @@ VAR1 (TERNOP, vtbx3, v8qi)
 VAR1 (TERNOP, vtbx4, v8qi)
 VAR12 (LOAD1, vld1,
 v8qi, v4hi, v4hf, v2si, v2sf, di, v16qi, v8hi, v8hf, v4si, v4sf, v2di)
-VAR10 (LOAD1LANE, vld1_lane,
-   v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di)
+VAR12 (LOAD1LANE, vld1_lane,
+   v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di, v4bf, v8bf)
 VAR10 (LOAD1, vld1_dup,
v8qi, v4hi, v2si, v2sf, di, v16qi, v8hi, v4si, v4sf, v2di)
 VAR12 (STORE1, vst1,
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
new file mode 100644
index 000..94fb38f32b8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_1.c
@@ -0,0 +1,22 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-O3 --save-temps -mfloat-abi=hard" } */
+
+#include "arm_neon.h"
+
+bfloat16x4_t
+test_vld1_lane_bf16 (bfloat16_t *a, bfloat16x4_t b)
+{
+  return vld1_lane_bf16 (a, b, 1);
+}
+
+bfloat16x8_t
+test_vld1q_lane_bf16 (bfloat16_t *a, bfloat16x8_t b)
+{
+  return vld1q_lane_bf16 (a, b, 2);
+}
+
+/* { dg-final { scan-assembler "vld1.16\t{d0\\\[1\\\]}, \\\[r0\\\]" } } */
+/* { dg-final { scan-assembler "vld1.16\t{d0\\\[2\\\]}, \\\[r0\\\]" } } */
diff --git a/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c 
b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
new file mode 100644
index 000..d9af512cf92
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/simd/vld1_lane_bf16_indices_1.c
@@ -0,0 +1,19 @@
+/* { dg-do assemble } */
+/* { dg-require-effective-target arm_v8_2a_bf16_neon_ok } */
+/* { dg-require-effective-target arm_hard_ok } */
+/* { dg-add-options arm_v8_2a_bf16_neon } */
+/* { dg-additional-options "-mfloat-abi=hard" } */
+
+#include "arm_neon.h"
+
+bfloat16x4_t
+test_vld1_lane_bf16 (bfloat16_t *a, bfloat16x4_t b)
+{
+  bfloat16x4_t res;
+  res = vld1_lane_bf16 (a, b, -1);
+  res = vld1_lane_bf16 (a, b, 4);
+  return res;
+}
+
+/* { dg-error "lane -1 out of range 0 - 3" "" { target *-*-* } 0 } */
+/* { dg-error "lane 4 out of rang

Re: [PATCH] implement pre-c++20 contracts

2020-12-03 Thread Andrew Sutton via Gcc-patches

>
>
> > Attached is a new squashed revision of the patch sans ChangeLogs. The
> > current work is now being done on github:
> > https://github.com/lock3/gcc/tree/contracts-jac-alt
>
> I'm starting to review this now, sorry for the delay. Is this still the
> branch you want me to consider for GCC 11?  I notice that the -constexpr
> and -mangled-config branches are newer.


I think so. Jeff can answer more authoritatively. I know we had one set of
changes to the design (how contracts work) aimed at improving the debugging
experience for violated contracts. I'm not sure if that's in the jac-alt
branch though.

The -constexpr branch checks for trivially satisfied contracts (e.g.,
[[assert: true]]) and issues warnings. It also preemptively checks
preconditions against constant function arguments. It's probably worth
reviewing that separately.

I'm not sure the -mangled-config branch is worth considering for merging at
this point. It's trying to solve a problem that might not be worth solving.

Out of curiosity, are you concerned that future versions of contracts might
have considerably different syntax or configurability? I'd hope it
wouldn't, but who knows where SG21 is going :)

Andrew

[PATCH v2]: i386: Fix up ix86_md_asm_adjust for TImode [PR98086]

2020-12-03 Thread Uros Bizjak via Gcc-patches

ix86_md_asm_adjust assumes that dest_mode can be only [QHSD]Imode
and nothing else.  The patch rewrites zero-extension part to use
convert_to_mode to handle TImode and hypothetically even wider modes.
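
For readers unfamiliar with the construct involved, here is a minimal sketch
(not part of the patch) of an x86 flag output operand, the "=@ccc" style of
constraint that ix86_md_asm_adjust expands.  The helper name carry_of_add is
hypothetical, and the non-x86 fallback exists only so the sketch builds on
other targets.  The destination here is a plain int; the PR concerns the case
where the destination is wider.

```c
#include <limits.h>

/* Hypothetical helper: return 1 if A + B carries out of an unsigned int.  */
int
carry_of_add (unsigned int a, unsigned int b)
{
#if defined (__GCC_ASM_FLAG_OUTPUTS__) \
    && (defined (__x86_64__) || defined (__i386__))
  unsigned int sum = a;
  int carry;
  /* "=@ccc" asks the compiler to copy the carry flag into CARRY after
     the asm; the value is materialized as 0 or 1.  */
  __asm__ ("addl %2, %0" : "+r" (sum), "=@ccc" (carry) : "r" (b));
  return carry;
#else
  /* Portable fallback for targets without x86 flag outputs.  */
  return a > UINT_MAX - b;
#endif
}
```

The flag value starts out as a QImode quantity and is then extended to the
mode of the destination operand, which is the step the patch rewrites.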

2020-12-03  Uroš Bizjak  
Jakub Jelinek  

gcc/
PR target/98086
* config/i386/i386.c (ix86_md_asm_adjust): Rewrite
zero-extension part to use convert_to_mode.

gcc/testsuite/
PR target/98086
* gcc.target/i386/pr98086.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master, will be backported to gcc-10 and gcc-9.

Uros.
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index c5db8c9712e..63216782430 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -21508,40 +21508,18 @@ ix86_md_asm_adjust (vec &outputs, vec 
&/*inputs*/,
  continue;
}
 
-  if (dest_mode == DImode && !TARGET_64BIT)
-   dest_mode = SImode;
-
-  if (dest_mode != QImode)
-   {
- rtx destqi = gen_reg_rtx (QImode);
- emit_insn (gen_rtx_SET (destqi, x));
-
- if (TARGET_ZERO_EXTEND_WITH_AND
- && optimize_function_for_speed_p (cfun))
-   {
- x = force_reg (dest_mode, const0_rtx);
-
- emit_insn (gen_movstrictqi (gen_lowpart (QImode, x), destqi));
-   }
- else
-   {
- x = gen_rtx_ZERO_EXTEND (dest_mode, destqi);
- if (dest_mode == GET_MODE (dest)
- && !register_operand (dest, GET_MODE (dest)))
-   x = force_reg (dest_mode, x);
-   }
-   }
-
-  if (dest_mode != GET_MODE (dest))
+  if (dest_mode == QImode)
+   emit_insn (gen_rtx_SET (dest, x));
+  else
{
- rtx tmp = gen_reg_rtx (SImode);
+ rtx reg = gen_reg_rtx (QImode);
+ emit_insn (gen_rtx_SET (reg, x));
 
- emit_insn (gen_rtx_SET (tmp, x));
- emit_insn (gen_zero_extendsidi2 (dest, tmp));
+ reg = convert_to_mode (dest_mode, reg, 1);
+ emit_move_insn (dest, reg);
}
-  else
-   emit_insn (gen_rtx_SET (dest, x));
 }
+
   rtx_insn *seq = get_insns ();
   end_sequence ();
 
diff --git a/gcc/testsuite/gcc.target/i386/pr98086.c 
b/gcc/testsuite/gcc.target/i386/pr98086.c
new file mode 100644
index 000..254a3b9bef6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr98086.c
@@ -0,0 +1,17 @@
+/* PR target/98086 */
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+#ifdef __x86_64__
+typedef __int128 T;
+#else
+typedef long long T;
+#endif
+
+T x;
+
+void
+foo (void)
+{
+  __asm ("" : "=@ccc" (x));
+}

c++: templatey type creation

2020-12-03 Thread Nathan Sidwell



This patch makes a couple of type-creation routines available to
modules.  That needs to create unbound template parms, and canonical
template parms.

gcc/cp/
* cp-tree.h (make_unbound_class_template_raw): Declare.
(canonical_type_parameter): Declare.
* decl.c (make_unbound_class_template_raw): Break out of ...
(make_unbound_class_template): ... here.  Call it.
* pt.c (canonical_type_parameter): Externalize.  Refactor & set
structural_equality for type parms.

pushing to trunk

--
Nathan Sidwell
diff --git i/gcc/cp/cp-tree.h w/gcc/cp/cp-tree.h
index de905dcf37c..69f8ed56e62 100644
--- i/gcc/cp/cp-tree.h
+++ w/gcc/cp/cp-tree.h
@@ -6542,6 +6542,7 @@ extern bool check_omp_return			(void);
 extern tree make_typename_type			(tree, tree, enum tag_types, tsubst_flags_t);
 extern tree build_typename_type			(tree, tree, tree, tag_types);
 extern tree make_unbound_class_template		(tree, tree, tree, tsubst_flags_t);
+extern tree make_unbound_class_template_raw	(tree, tree, tree);
 extern tree build_library_fn_ptr		(const char *, tree, int);
 extern tree build_cp_library_fn_ptr		(const char *, tree, int);
 extern tree push_library_fn			(tree, tree, tree, int);
@@ -6880,6 +6881,7 @@ extern void maybe_show_extern_c_location (void);
 extern bool literal_integer_zerop (const_tree);
 
 /* in pt.c */
+extern tree canonical_type_parameter		(tree);
 extern void push_access_scope			(tree);
 extern void pop_access_scope			(tree);
 extern bool check_template_shadow		(tree);
diff --git i/gcc/cp/decl.c w/gcc/cp/decl.c
index 0cf84a0750c..a28e7924869 100644
--- i/gcc/cp/decl.c
+++ w/gcc/cp/decl.c
@@ -4132,6 +4132,14 @@ make_unbound_class_template (tree context, tree name, tree parm_list,
   return tmpl;
 }
 
+  return make_unbound_class_template_raw (context, name, parm_list);
+}
+
+/* Build an UNBOUND_CLASS_TEMPLATE.  */
+
+tree
+make_unbound_class_template_raw (tree context, tree name, tree parm_list)
+{
   /* Build the UNBOUND_CLASS_TEMPLATE.  */
   tree t = cxx_make_type (UNBOUND_CLASS_TEMPLATE);
   TYPE_CONTEXT (t) = FROB_CONTEXT (context);
diff --git i/gcc/cp/pt.c w/gcc/cp/pt.c
index 66ac6473983..3ca28133d94 100644
--- i/gcc/cp/pt.c
+++ w/gcc/cp/pt.c
@@ -4432,7 +4432,7 @@ build_template_parm_index (int index,
parameter.  Returns the canonical type parameter, which may be TYPE
if no such parameter existed.  */
 
-static tree
+tree
 canonical_type_parameter (tree type)
 {
   int idx = TEMPLATE_TYPE_IDX (type);
@@ -13212,19 +13212,24 @@ tsubst_argument_pack (tree orig_arg, tree args, tsubst_flags_t complain,
 		  tree in_decl)
 {
   /* Substitute into each of the arguments.  */
-  tree new_arg = TYPE_P (orig_arg)
-? cxx_make_type (TREE_CODE (orig_arg))
-: make_node (TREE_CODE (orig_arg));
-
   tree pack_args = tsubst_template_args (ARGUMENT_PACK_ARGS (orig_arg),
 	 args, complain, in_decl);
-  if (pack_args == error_mark_node)
-new_arg = error_mark_node;
-  else
-SET_ARGUMENT_PACK_ARGS (new_arg, pack_args);
+  tree new_arg = error_mark_node;
+  if (pack_args != error_mark_node)
+{
+  if (TYPE_P (orig_arg))
+	{
+	  new_arg = cxx_make_type (TREE_CODE (orig_arg));
+	  SET_TYPE_STRUCTURAL_EQUALITY (new_arg);
+	}
+  else
+	{
+	  new_arg = make_node (TREE_CODE (orig_arg));
+	  TREE_CONSTANT (new_arg) = TREE_CONSTANT (orig_arg);
+	}
 
-  if (TREE_CODE (new_arg) == NONTYPE_ARGUMENT_PACK)
-TREE_CONSTANT (new_arg) = TREE_CONSTANT (orig_arg);
+  SET_ARGUMENT_PACK_ARGS (new_arg, pack_args);
+}
 
   return new_arg;
 }

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Richard Sandiford via Gcc-patches

Richard Biener via Gcc-patches  writes:
> On December 3, 2020 5:07:28 PM GMT+01:00, Qing Zhao  
> wrote:
>>
>>
>>> On Dec 3, 2020, at 2:45 AM, Richard Biener
>> wrote:
>>> 
>>> On Wed, Dec 2, 2020 at 4:36 PM Qing Zhao >> wrote:

>>>> On Dec 2, 2020, at 2:45 AM, Richard Biener
>> wrote:
>>>>
>>>> On Tue, Dec 1, 2020 at 8:49 PM Qing Zhao 
>>wrote:
>>>>
>>>> Hi, Richard,
>>>>
>>>> Could you please comment on the following approach:
>>>>
>>>> Instead of adding the zero-initializer quite late at the pass
>>“pass_expand”, we can add it as early as during gimplification.
>>>> However, we will mark these new added zero-initializers as
>>“artificial”. And passing this “artificial” information to
>>>> “pass_early_warn_uninitialized” and “pass_late_warn_uninitialized”,
>>in these two uninitialized variable analysis passes,
>>>> (i.e., in tree-ssa-uninit.c) We will update the checking on
>>“ssa_undefined_value_p”  to consider “artificial” zero-initializers.
>>>> (i.e, if the def_stmt is marked with “artificial”, then it’s a
>>undefined value).
>>>>
>>>> With such approach, we should be able to address all those
>>conflicts.
>>>>
>>>> Do you see any obvious issue with this approach?
>>>>
>>>> Yes, DSE will happily elide an explicit zero-init following the
>>>> artificial one leading to false uninit diagnostics.
>>>>
>>>> Indeed.  This is a big issue. And other optimizations might also be
>>impacted by the new zero-init, resulting changed behavior
>>>> of uninitialized analysis in the later stage.
>>> 
>>> I don't see how the issue can be resolved, you can't get both, uninit
>>> warnings and no uninitialized memory.
>>> People can compile twice, once without -fzero-init to get uninit
>>> warnings and once with -fzero-init to get
>>> the extra "security".
>>
>>So, for GCC, you think that it’s okay to get rid of the following
>>requirement:
>>
>>C. The implementation needs to keep the current static warning on
>>uninitialized
>>variables untouched in order to avoid "forking the language”.
>>
>>Then, we can add explanation in the user documentation of the new
>>-fzero-init and also 
>>that of the -Wuninitialized to inform users that -fzero-init will
>>change the behavior of -Wuninitialized.
>>In order to get the warnings, -fzero-init should not be added at the
>>same time?
>>
>>With this requirement being eliminated, implementation will be much
>>easier. 
>>
>>We can add the new initialization during gimplification phase. Then
>>this new option will work
>>for all languages.  Is this reasonable?
>
> I think that's reasonable indeed. Eventually doing the init after the early 
> uninit pass is possible as well.

Sorry to be awkward, but I kind-of disagree.  IIRC, clang was able to
give uninit warnings while implementing the initialisation as expected,
so I think this is a GCC restriction rather than a fundamental
incompatibility.

I don't think it's reasonable to expect people to read the documentation
of -ffoo for Clang and separately read the documentation of -ffoo for
GCC.  They'll at best read the documentation for one and (rightly)
expect the other compiler to behave in a compatible way.  I'm also not
sure people would build twice in practice.

I remember the issue of forking the language was discussed at length on
the Clang dev list at the time (but I haven't gone back and re-read the
thread, so I'm relying on memory here).  Not forking the language was an
important goal/requirement of the option and I don't think we should
drop it when implementing the option in GCC.

IMO, if we want to define a dialect of C/C++ in which uninitialised uses
are always well defined rather than UB, we should do that as a separate
option.  If we're implementing the Clang options, we should continue
to treat uninitialised uses as UB that triggers the same warnings as
if the option wasn't passed.
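
As a concrete illustration of the position above, here is a sketch only: the
function pick and its arguments are hypothetical, and -fzero-init is the
working name used in this thread rather than an existing GCC option.  The
variable below is initialised on one path only.

```c
/* Hypothetical example: RESULT is assigned only when SELECT is nonzero.  */
int
pick (int select, int value)
{
  int result;

  if (select)
    result = value;

  /* When SELECT is zero this reads an uninitialised variable, which is
     undefined behaviour.  GCC can diagnose it via -Wmaybe-uninitialized,
     typically only when optimizing.  Under the approach argued for here,
     that diagnostic would be unchanged even if an auto-initialisation
     option happened to make RESULT zero at run time.  */
  return result;
}
```

Only calls with a nonzero first argument, such as pick (1, 42), are well
defined as the code stands.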

So TBH I'd rather not add the option until it can be implemented in a
way that is compatible with Clang.

Thanks,
Richard

Re: introduce overridable clear_cache emitter

2020-12-03 Thread Jeff Law via Gcc-patches




On 12/3/20 7:08 AM, Alexandre Oliva wrote:
> On Dec  3, 2020, Christophe Lyon  wrote:
>
>> This patches causes a lot of regressions in fortran on arm and aarch64,
> On Dec  3, 2020, Andreas Schwab  wrote:
>
>> On Dez 03 2020, Andreas Schwab wrote:
>>> ../../../../libffi/src/aarch64/ffi.c: In function 'ffi_prep_closure_loc':
>>> ../../../../libffi/src/aarch64/ffi.c:67:3: error: both arguments to 
>>> '__builtin___clear_cache' must be pointers
>>> 67 |   __builtin___clear_cache (start, end);
>>> |   ^~~~
>> This happens when compiling with -mabi=ilp32.
> Thank you both.  Here's the patch I'm testing to fix both issues.
> Ok to install?
>
>
> fix __builtin___clear_cache overrider fallout
>
> From: Alexandre Oliva 
>
> Machines that had CLEAR_CACHE_INSN and that would thus issue calls to
> __clear_cache with the default call expander, would fail on languages
> that did not set up the __clear_cache builtin.  This patch arranges
> for all languages to set up this builtin.
>
> Machines or multilibs that had ptr_mode != Pmode, such as aarch64 with
> -mabi=ilp32, would fail the RTL mode test of the arguments passed to
> __clear_cache, because we'd insist on ptr_mode.  This patch arranges for
> Pmode to be accepted as well.
>
>
> for  gcc/ChangeLog
>
>   * tree.c (build_common_builtin_nodes): Declare
>   __builtin___clear_cache for all languages.
>   * builtins.c (maybe_emit_call_builtin___clear_cache): Accept
>   Pmode arguments.
OK
jeff
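
For context, a minimal sketch of the kind of call involved in the libffi
failure quoted above: __builtin___clear_cache takes two pointer arguments
delimiting the range to flush.  The wrapper name flush_range is hypothetical
and this is not the libffi code.

```c
#include <stddef.h>

/* Hypothetical wrapper: flush the instruction cache for the LEN bytes
   starting at START.  Both arguments handed to the builtin are pointers.
   Returns LEN unchanged for the caller's convenience.  */
size_t
flush_range (void *start, size_t len)
{
  char *begin = (char *) start;

  __builtin___clear_cache (begin, begin + len);
  return len;
}
```

On targets where no explicit flush is needed the builtin is expected to
expand to nothing, so the wrapper is harmless to call there.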

Re: c++: Fix array type dependency [PR 98107]

2020-12-03 Thread Nathan Sidwell




These two testcases provide coverage for 98115, which doesn't trigger on
all hosts.


PR c++/98115
PR c++/98116
gcc/testsuite/
* g++.dg/template/pr98115.C: New.
* g++.dg/template/pr98116.C: New.


nathan

--
Nathan Sidwell
diff --git c/gcc/testsuite/g++.dg/template/pr98115.C w/gcc/testsuite/g++.dg/template/pr98115.C
new file mode 100644
index 000..0bfc57ab88e
--- /dev/null
+++ w/gcc/testsuite/g++.dg/template/pr98115.C
@@ -0,0 +1,4 @@
+// PR 98115, dependent array types lead to specialization issues
+
+template  class Stringify;
+template  class Stringify;
diff --git c/gcc/testsuite/g++.dg/template/pr98116.C w/gcc/testsuite/g++.dg/template/pr98116.C
new file mode 100644
index 000..d3398d2238c
--- /dev/null
+++ w/gcc/testsuite/g++.dg/template/pr98116.C
@@ -0,0 +1,29 @@
+// PR 98116, ICE with stripping typedef array type
+// { dg-do compile { target c++11 } }
+namespace std {
+struct is_convertible;
+template  using remove_pointer_t = typename _Tp ::type;
+template  struct enable_if;
+template  void declval();
+template  using enable_if_t = typename enable_if<_Cond>::type;
+template  class Trans_NS___cxx11_basic_string {
+  long _M_string_length;
+};
+} // namespace std
+struct string16_char_traits;
+template class std::Trans_NS___cxx11_basic_string;
+template  using IsLegalDataConversion = std::is_convertible;
+template 
+using ContainerHasConvertibleData = IsLegalDataConversion<
+std::remove_pointer_t)>, T>;
+template 
+using EnableIfSpanCompatibleArray =
+std::enable_if_t::value>;
+template  class span {
+  template [N],
+std::Trans_NS___cxx11_basic_string, Extent>>
+  span();
+};

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Qing Zhao via Gcc-patches




> On Dec 3, 2020, at 10:36 AM, Richard Biener  
> wrote:
> 
> On December 3, 2020 5:07:28 PM GMT+01:00, Qing Zhao  > wrote:
>> 
>> 
>>>> of uninitialized analysis in the later stage.
>>> 
>>> I don't see how the issue can be resolved, you can't get both, uninit
>>> warnings and no uninitialized memory.
>>> People can compile twice, once without -fzero-init to get uninit
>>> warnings and once with -fzero-init to get
>>> the extra "security".
>> 
>> So, for GCC, you think that it’s okay to get rid of the following
>> requirement:
>> 
>> C. The implementation needs to keep the current static warning on
>> uninitialized
>> variables untouched in order to avoid "forking the language”.
>> 
>> Then, we can add explanation in the user documentation of the new
>> -fzero-init and also 
>> that of the -Wuninitialized to inform users that -fzero-init will
>> change the behavior of -Wuninitialized.
>> In order to get the warnings, -fzero-init should not be added at the
>> same time?
>> 
>> With this requirement being eliminated, implementation will be much
>> easier. 
>> 
>> We can add the new initialization during gimplification phase. Then
>> this new option will work
>> for all languages.  Is this reasonable?
> 
> I think that's reasonable indeed. Eventually doing the init after the early 
> uninit pass is possible as well.

You suggested to put the new pass after the early uninit pass? Why?

Qing
> 
> Richard. 
> 
>> thanks.
>> 
>> Qing
>> 
>> 
>> 
>>> 
>>>

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Richard Biener via Gcc-patches

On December 3, 2020 5:07:28 PM GMT+01:00, Qing Zhao  
wrote:
>
>
>> On Dec 3, 2020, at 2:45 AM, Richard Biener
> wrote:
>> 
>> On Wed, Dec 2, 2020 at 4:36 PM Qing Zhao > wrote:
>>> 
>>> 
>>> 
>>> On Dec 2, 2020, at 2:45 AM, Richard Biener
> wrote:
>>> 
>>> On Tue, Dec 1, 2020 at 8:49 PM Qing Zhao 
>wrote:
>>> 
>>> 
>>> Hi, Richard,
>>> 
>>> Could you please comment on the following approach:
>>> 
>>> Instead of adding the zero-initializer quite late at the pass
>“pass_expand”, we can add it as early as during gimplification.
>>> However, we will mark these new added zero-initializers as
>“artificial”. And passing this “artificial” information to
>>> “pass_early_warn_uninitialized” and “pass_late_warn_uninitialized”,
>in these two uninitialized variable analysis passes,
>>> (i.e., in tree-ssa-uninit.c) We will update the checking on
>“ssa_undefined_value_p”  to consider “artificial” zero-initializers.
>>> (i.e, if the def_stmt is marked with “artificial”, then it’s a
>undefined value).
>>> 
>>> With such approach, we should be able to address all those
>conflicts.
>>> 
>>> Do you see any obvious issue with this approach?
>>> 
>>> 
>>> Yes, DSE will happily elide an explicit zero-init following the
>>> artificial one leading to false uninit diagnostics.
>>> 
>>> 
>>> Indeed.  This is a big issue. And other optimizations might also be
>impacted by the new zero-init, resulting changed behavior
>>> of uninitialized analysis in the later stage.
>> 
>> I don't see how the issue can be resolved, you can't get both, uninit
>> warnings and no uninitialized memory.
>> People can compile twice, once without -fzero-init to get uninit
>> warnings and once with -fzero-init to get
>> the extra "security".
>
>So, for GCC, you think that it’s okay to get rid of the following
>requirement:
>
>C. The implementation needs to keep the current static warning on
>uninitialized
>variables untouched in order to avoid "forking the language”.
>
>Then, we can add explanation in the user documentation of the new
>-fzero-init and also 
>that of the -Wuninitialized to inform users that -fzero-init will
>change the behavior of -Wuninitialized.
>In order to get the warnings, -fzero-init should not be added at the
>same time?
>
>With this requirement being eliminated, implementation will be much
>easier. 
>
>We can add the new initialization during gimplification phase. Then
>this new option will work
>for all languages.  Is this reasonable?

I think that's reasonable indeed. Eventually doing the init after the early 
uninit pass is possible as well.

Richard. 

>thanks.
>
>Qing
>
>
>
>> 
>> Richard.
>> 
>>> 
>>> What's the intended purpose of the zero-init?
>>> 
>>> 
>>> 
>>> The purpose of this new option is: (from the original LLVM patch
>submission):
>>> 
>>> "Add an option to initialize automatic variables with either a
>pattern or with
>>> zeroes. The default is still that automatic variables are
>uninitialized. Also
>>> add attributes to request uninitialized on a per-variable basis,
>mainly to disable
>>> initialization of large stack arrays when deemed too expensive.
>>> 
>>> This isn't meant to change the semantics of C and C++. Rather, it's
>meant to be
>>> a last-resort when programmers inadvertently have some undefined
>behavior in
>>> their code. This patch aims to make undefined behavior hurt less,
>which
>>> security-minded people will be very happy about. Notably, this means
>that
>>> there's no inadvertent information leak when:
>>> 
>>> • The compiler re-uses stack slots, and a value is used
>uninitialized.
>>> • The compiler re-uses a register, and a value is used
>uninitialized.
>>> • Stack structs / arrays / unions with padding are copied.
>>> This patch only addresses stack and register information leaks.
>There's many
>>> more infoleaks that we could address, and much more undefined
>behavior that
>>> could be tamed. Let's keep this patch focused, and I'm happy to
>address related
>>> issues elsewhere."
>>> 
>>> For more details, please refer to the LLVM code review discussion on
>this patch:
>>> https://reviews.llvm.org/D54604
>>> 
>>> 
>>> I also wrote a simple writeup for this task based on my study and
>discussion with
>>> Kees Cook (cc’ing him) as following:
>>> 
>>> 
>>> thanks.
>>> 
>>> Qing
>>> 
>>> Support stack variables auto-initialization in GCC
>>> 
>>> 11/19/2020
>>> 
>>> Qing Zhao
>>> 
>>> ===
>>> 
>>> 
>>> ** Background of the task:
>>> 
>>> The corresponding GCC bugzilla RFE was created on 9/3/2018:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87210
>>> 
>>> A similar option for LLVM (around Nov, 2018)
>>> https://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html
>>> had invoked a lot of discussion before committed.
>>> 
>>> (The following are quoted from the comments of Alexander Potapenko
>in
>>> GCC bug 87210):
>>> 
>>> Finally, on Oct, 2019, upstream Clang supports force initialization
>>> of stack variab

Go patch committed: Convert comparison function result to expected bool type

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend converts the result type of calling a
comparison function to the expected bool type.  Otherwise cases like
type mybool bool
var b mybool = [10]string{} == [10]string{}
get an incorrect type checking error.  Bootstrapped and ran Go
testsuite on x86_64-pc-linux-gnu.  Committed to mainline.

Ian
17c9cf3c17651950bd4bfefcbe15440fa2155810
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index f55daf7562c..cd1a3961a06 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-5364d15082de77d2759a01f254208d4cb4f579e3
+b3a0b068f7fa2d65ba781271b2c0479d103b7d7b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 50574c2bc58..ebe1b36eb53 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -6287,8 +6287,21 @@ Binary_expression::lower_array_comparison(Gogo* gogo,
   args->push_back(this->operand_address(inserter, this->left_));
   args->push_back(this->operand_address(inserter, this->right_));
 
-  Expression* ret = Expression::make_call(func, args, false, loc);
-
+  Call_expression* ce = Expression::make_call(func, args, false, loc);
+
+  // Record that this is a call to a generated equality function.  We
+  // need to do this because a comparison returns an abstract boolean
+  // type, but the function necessarily returns "bool".  The
+  // difference shows up in code like
+  // type mybool bool
+  // var b mybool = [10]string{} == [10]string{}
+  // The comparison function returns "bool", but since a comparison
+  // has an abstract boolean type we need an implicit conversion to
+  // "mybool".  The implicit conversion is inserted in
+  // Call_expression::do_flatten.
+  ce->set_is_equal_function();
+
+  Expression* ret = ce;
   if (this->op_ == OPERATOR_NOTEQ)
 ret = Expression::make_unary(OPERATOR_NOT, ret, loc);
 
@@ -11163,6 +11176,13 @@ Call_expression::do_flatten(Gogo* gogo, Named_object*,
 return ret;
 }
 
+  // Add an implicit conversion to a boolean type, if needed.  See the
+  // comment in Binary_expression::lower_array_comparison.
+  if (this->is_equal_function_
+  && this->type_ != NULL
+  && this->type_ != Type::lookup_bool_type())
+return Expression::make_cast(this->type_, this, this->location());
+
   return this;
 }
 
@@ -11938,7 +11958,7 @@ Call_expression::do_type()
 // parameter types to set the types of the arguments.
 
 void
-Call_expression::do_determine_type(const Type_context*)
+Call_expression::do_determine_type(const Type_context* context)
 {
   if (!this->determining_types())
 return;
@@ -11985,6 +12005,22 @@ Call_expression::do_determine_type(const Type_context*)
(*pa)->determine_type_no_context();
}
 }
+
+  // If this is a call to a generated equality function, we determine
+  // the type based on the context.  See the comment in
+  // Binary_expression::lower_array_comparison.
+  if (this->is_equal_function_
+  && !context->may_be_abstract
+  && context->type != NULL
+  && context->type->is_boolean_type()
+  && context->type != Type::lookup_bool_type())
+{
+  go_assert(this->type_ == NULL
+   || this->type_ == Type::lookup_bool_type()
+   || this->type_ == context->type
+   || this->type_->is_error());
+  this->type_ = context->type;
+}
 }
 
 // Called when determining types for a Call_expression.  Return true
diff --git a/gcc/go/gofrontend/expressions.h b/gcc/go/gofrontend/expressions.h
index d2975238572..259eeb6027e 100644
--- a/gcc/go/gofrontend/expressions.h
+++ b/gcc/go/gofrontend/expressions.h
@@ -2326,8 +2326,8 @@ class Call_expression : public Expression
   fn_(fn), args_(args), type_(NULL), call_(NULL), call_temp_(NULL)
 , expected_result_count_(0), is_varargs_(is_varargs),
   varargs_are_lowered_(false), types_are_determined_(false),
-  is_deferred_(false), is_concurrent_(false), issued_error_(false),
-  is_multi_value_arg_(false), is_flattened_(false)
+  is_deferred_(false), is_concurrent_(false), is_equal_function_(false),
+  issued_error_(false), is_multi_value_arg_(false), is_flattened_(false)
   { }
 
   // The function to call.
@@ -2408,6 +2408,11 @@ class Call_expression : public Expression
   set_is_concurrent()
   { this->is_concurrent_ = true; }
 
+  // Note that this is a call to a generated equality function.
+  void
+  set_is_equal_function()
+  { this->is_equal_function_ = true; }
+
   // We have found an error with this call expression; return true if
   // we should report it.
   bool
@@ -2545,6 +2550,8 @@ class Call_expression : public Expression
   bool is_deferred_;
   // True if the call is an argument to a go statement.
   bool is_concurrent_;
+  // True if this is a call to a generated equality function.
+  bool is_equal_function_;
   //

Go patch committed: Defer to middle-end for complex division

2020-12-03 Thread Ian Lance Taylor via Gcc-patches

This patch to the Go frontend and libgo defers to the middle-end for
complex division.  Go used to use slightly different semantics than
C99 for complex division, so we used runtime routines to handle the
difference.  The gc compiler has changed its behavior to match C99, so
change ours as well.  This is for https://golang.org/issue/14644.
This requires updating a test as well; the patch attached here does
not include the changes to the generated file
gcc/testsuite/go.test/test/cmplxdivide1.go, as they are large.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.
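
As a rough illustration of the C99 behaviour referred to above (a sketch, not
taken from the patch; the helper name c99_div is hypothetical), plain complex
division in C looks like this, and the compiler's middle-end decides how to
expand it:

```c
#include <complex.h>

/* Hypothetical helper: ordinary C99 complex division.  With GCC's
   default floating-point options this follows the C99 Annex G rules
   for infinities and NaNs.  */
double complex
c99_div (double complex num, double complex den)
{
  return num / den;
}
```

Assuming default options (no -ffast-math or -fcx-limited-range), c99_div
(1.0, 0.0) is expected to produce a result whose real part is infinite rather
than a pair of NaNs.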

Ian
e9a4798d3c615d1aa576ff76af429a188c6cd90f
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 183e5cae9c9..f55daf7562c 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-6b01f8cdc11d86bd98165c91d6ae101bcf6b9e1a
+5364d15082de77d2759a01f254208d4cb4f579e3
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 23caf61db93..50574c2bc58 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -6979,27 +6979,6 @@ Binary_expression::do_get_backend(Translate_context* 
context)
   // been converted to a String_concat_expression in do_lower.
   go_assert(!left_type->is_string_type());
 
-  // For complex division Go might want slightly different results than the
-  // backend implementation provides, so we have our own runtime routine.
-  if (this->op_ == OPERATOR_DIV && this->left_->type()->complex_type() != NULL)
-{
-  Runtime::Function complex_code;
-  switch (this->left_->type()->complex_type()->bits())
-   {
-   case 64:
-  complex_code = Runtime::COMPLEX64_DIV;
- break;
-   case 128:
-  complex_code = Runtime::COMPLEX128_DIV;
- break;
-   default:
- go_unreachable();
-   }
-  Expression* complex_div =
-  Runtime::make_call(complex_code, loc, 2, this->left_, this->right_);
-  return complex_div->get_backend(context);
-}
-
   Bexpression* left = this->left_->get_backend(context);
   Bexpression* right = this->right_->get_backend(context);
 
diff --git a/gcc/go/gofrontend/runtime.def b/gcc/go/gofrontend/runtime.def
index 9a3c6809130..4b606a6c00c 100644
--- a/gcc/go/gofrontend/runtime.def
+++ b/gcc/go/gofrontend/runtime.def
@@ -62,12 +62,6 @@ DEF_GO_RUNTIME(STRINGTOSLICERUNE, 
"runtime.stringtoslicerune",
   P2(POINTER, STRING), R1(SLICE))
 
 
-// Complex division.
-DEF_GO_RUNTIME(COMPLEX64_DIV, "__go_complex64_div",
-  P2(COMPLEX64, COMPLEX64), R1(COMPLEX64))
-DEF_GO_RUNTIME(COMPLEX128_DIV, "__go_complex128_div",
-  P2(COMPLEX128, COMPLEX128), R1(COMPLEX128))
-
 // Make a slice.
 DEF_GO_RUNTIME(MAKESLICE, "runtime.makeslice", P3(TYPE, INT, INT),
   R1(POINTER))
diff --git a/gcc/testsuite/go.test/test/cmplxdivide.c 
b/gcc/testsuite/go.test/test/cmplxdivide.c
index 12dc4f1c0c9..89a2868b75b 100644
--- a/gcc/testsuite/go.test/test/cmplxdivide.c
+++ b/gcc/testsuite/go.test/test/cmplxdivide.c
@@ -1,8 +1,19 @@
-// Copyright 2010 The Go Authors.  All rights reserved.
+// Copyright 2010 The Go Authors. All rights reserved.
 // Use of this source code is governed by a BSD-style
 // license that can be found in the LICENSE file.
 
-// gcc '-std=c99' cmplxdivide.c && a.out >cmplxdivide1.go
+// This C program generates the file cmplxdivide1.go. It uses the
+// output of the operations by C99 as the reference to check
+// the implementation of complex numbers in Go.
+// The generated file, cmplxdivide1.go, is compiled along
+// with the driver cmplxdivide.go (the names are confusing
+// and unimaginative) to run the actual test. This is done by
+// the usual test runner.
+//
+// The file cmplxdivide1.go is checked in to the repository, but
+// if it needs to be regenerated, compile and run this C program
+// like this:
+// gcc '-std=c99' cmplxdivide.c && a.out >cmplxdivide1.go
 
 #include 
 #include 
@@ -12,50 +23,63 @@
 #define nelem(x) (sizeof(x)/sizeof((x)[0]))
 
 double f[] = {
-   0,
-   1,
-   -1,
-   2,
+   0.0,
+   -0.0,
+   1.0,
+   -1.0,
+   2.0,
NAN,
INFINITY,
-INFINITY,
 };
 
-char*
-fmt(double g)
-{
+char* fmt(double g) {
static char buf[10][30];
static int n;
char *p;
-   
+
p = buf[n++];
-   if(n == 10)
+   if(n == 10) {
n = 0;
+   }
+
sprintf(p, "%g", g);
-   if(strcmp(p, "-0") == 0)
-   strcpy(p, "negzero");
-   return p;
-}
 
-int
-iscnan(double complex d)
-{
-   return !isinf(creal(d)) && !isinf(cimag(d)) && (isnan(creal(d)) || 
isnan(cimag(d)));
-}
+   if(strcmp(p, "0") == 0) {
+   strcpy(p, "zero");
+   return p;
+   }
+
+   if(strcmp(p, "-0") == 0) {
+

Re: How to traverse all the local variables that declared in the current routine?

2020-12-03 Thread Qing Zhao via Gcc-patches




> On Dec 3, 2020, at 2:45 AM, Richard Biener  wrote:
> 
> On Wed, Dec 2, 2020 at 4:36 PM Qing Zhao  > wrote:
>> 
>> 
>> 
>> On Dec 2, 2020, at 2:45 AM, Richard Biener  
>> wrote:
>> 
>> On Tue, Dec 1, 2020 at 8:49 PM Qing Zhao  wrote:
>> 
>> 
>> Hi, Richard,
>> 
>> Could you please comment on the following approach:
>> 
>> Instead of adding the zero-initializer quite late at the pass “pass_expand”, 
>> we can add it as early as during gimplification.
>> However, we will mark these new added zero-initializers as “artificial”. And 
>> passing this “artificial” information to
>> “pass_early_warn_uninitialized” and “pass_late_warn_uninitialized”, in these 
>> two uninitialized variable analysis passes,
>> (i.e., in tree-ssa-uninit.c) We will update the checking on 
>> “ssa_undefined_value_p”  to consider “artificial” zero-initializers.
>> (i.e, if the def_stmt is marked with “artificial”, then it’s a undefined 
>> value).
>> 
>> With such approach, we should be able to address all those conflicts.
>> 
>> Do you see any obvious issue with this approach?
>> 
>> 
>> Yes, DSE will happily elide an explicit zero-init following the
>> artificial one leading to false uninit diagnostics.
>> 
>> 
>> Indeed.  This is a big issue. And other optimizations might also be impacted 
>> by the new zero-init, resulting changed behavior
>> of uninitialized analysis in the later stage.
> 
> I don't see how the issue can be resolved, you can't get both, uninit
> warnings and no uninitialized memory.
> People can compile twice, once without -fzero-init to get uninit
> warnings and once with -fzero-init to get
> the extra "security".

So, for GCC, you think that it’s okay to get rid of the following requirement:

C. The implementation needs to keep the current static warning on uninitialized
variables untouched in order to avoid "forking the language”.

Then, we can add explanation in the user documentation of the new -fzero-init 
and also 
that of the -Wuninitialized to inform users that -fzero-init will change the 
behavior of -Wuninitialized.
In order to get the warnings, -fzero-init should not be added at the same time?

With this requirement being eliminated, implementation will be much easier. 

We can add the new initialization during the gimplification phase. Then this new
option will work
for all languages.  Is this reasonable?

thanks.

Qing



> 
> Richard.
> 
>> 
>> What's the intended purpose of the zero-init?
>> 
>> 
>> 
>> The purpose of this new option is: (from the original LLVM patch submission):
>> 
>> "Add an option to initialize automatic variables with either a pattern or 
>> with
>> zeroes. The default is still that automatic variables are uninitialized. Also
>> add attributes to request uninitialized on a per-variable basis, mainly to 
>> disable
>> initialization of large stack arrays when deemed too expensive.
>> 
>> This isn't meant to change the semantics of C and C++. Rather, it's meant to 
>> be
>> a last-resort when programmers inadvertently have some undefined behavior in
>> their code. This patch aims to make undefined behavior hurt less, which
>> security-minded people will be very happy about. Notably, this means that
>> there's no inadvertent information leak when:
>> 
>> • The compiler re-uses stack slots, and a value is used uninitialized.
>> • The compiler re-uses a register, and a value is used uninitialized.
>> • Stack structs / arrays / unions with padding are copied.
>> This patch only addresses stack and register information leaks. There's many
>> more infoleaks that we could address, and much more undefined behavior that
>> could be tamed. Let's keep this patch focused, and I'm happy to address 
>> related
>> issues elsewhere."
>> 
>> For more details, please refer to the LLVM code review discussion on this 
>> patch:
>> https://reviews.llvm.org/D54604
>> 
>> 
>> I also wrote a simple writeup for this task based on my study and discussion 
>> with
>> Kees Cook (cc’ing him) as follows:
>> 
>> 
>> thanks.
>> 
>> Qing
>> 
>> Support stack variables auto-initialization in GCC
>> 
>> 11/19/2020
>> 
>> Qing Zhao
>> 
>> ===
>> 
>> 
>> ** Background of the task:
>> 
>> The corresponding GCC bugzilla RFE was created on 9/3/2018:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87210
>> 
>> A similar option for LLVM (around Nov, 2018)
>> https://lists.llvm.org/pipermail/cfe-dev/2018-November/060172.html
>> provoked a lot of discussion before it was committed.
>> 
>> (The following are quoted from the comments of Alexander Potapenko in
>> GCC bug 87210):
>> 
>> Finally, on Oct, 2019, upstream Clang supports force initialization
>> of stack variables under the -ftrivial-auto-var-init flag.
>> 
>> -ftrivial-auto-var-init=pattern initializes local variables with a 0xAA 
>> pattern
>> (actually it's more complicated, see https://reviews.llvm.org/D54604)
>> 
>> -ftrivial-auto-var-init=zero provides zero-initialization of locals.
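
To make the information-leak concern in the quoted write-up concrete, here is a minimal sketch of the "value is used uninitialized" case for a stack buffer. The helper names (encode_name, stale_bytes) and the 16-byte size are made up for illustration and do not come from the thread. The buffer is zeroed explicitly, which is the state an option like -ftrivial-auto-var-init=zero is meant to provide when the programmer forgets to.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not from the thread: copies a short name into a
// fixed-size stack buffer and returns the whole buffer, e.g. to send it out.
std::array<unsigned char, 16> encode_name(const char *name) {
  assert(name != nullptr);
  // Explicit zero-initialization.  Without the "{}" the bytes past the copied
  // name would be indeterminate; zero auto-initialization is meant to give
  // the same all-zero starting state when the programmer forgets it.
  std::array<unsigned char, 16> buf{};
  std::size_t n = std::strlen(name);
  if (n > buf.size())
    n = buf.size();
  std::memcpy(buf.data(), name, n);
  return buf;
}

// Count the non-zero bytes after the first `used` ones, i.e. the bytes that
// could carry stale stack contents if the buffer had not been initialized.
std::size_t stale_bytes(const std::array<unsigned char, 16> &buf,
                        std::size_t used) {
  std::size_t count = 0;
  for (std::size_t i = used; i < buf.size(); ++i)
    count += (buf[i] != 0);
  return count;
}
```

With the explicit initializer, stale_bytes (encode_name ("ab"), 2) is zero. The sketch deliberately never reads an uninitialized object, so it only shows the intended end state, not the leak itself.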

[Committed] IBM Z: Fix mode in probe_stack pattern

2020-12-03 Thread Andreas Krebbel via Gcc-patches

The probe pattern uses Pmode but the middle-end wants to emit a
word_mode probe check.  This - as usual - breaks on Z with -m31 -mzarch
where word_mode doesn't match Pmode.

Bootstrapped and regression-tested on s390x.

gcc/ChangeLog:

* config/s390/s390.md ("@probe_stack2"): Change mode
iterator to W.

gcc/testsuite/ChangeLog:

* gcc.target/s390/stack-clash-4.c: New test.
---
 gcc/config/s390/s390.md   |  6 +++---
 gcc/testsuite/gcc.target/s390/stack-clash-4.c | 10 ++
 2 files changed, 13 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/stack-clash-4.c

diff --git a/gcc/config/s390/s390.md b/gcc/config/s390/s390.md
index d4cfbdf6732..d6d8965a740 100644
--- a/gcc/config/s390/s390.md
+++ b/gcc/config/s390/s390.md
@@ -6,8 +6,8 @@ (define_expand "allocate_stack"
 
 (define_expand "@probe_stack2"
   [(set (reg:CCZ CC_REGNUM)
-   (compare:CCZ (reg:P 0)
-(match_operand 0 "memory_operand")))
+   (compare:CCZ (reg:W 0)
+(match_operand:W 0 "memory_operand")))
(unspec_volatile [(const_int 0)] UNSPECV_BLOCKAGE)]
   "")
 
@@ -11125,7 +11125,7 @@ (define_expand "probe_stack"
   [(match_operand 0 "memory_operand")]
   ""
 {
-  emit_insn (gen_probe_stack2 (Pmode, operands[0]));
+  emit_insn (gen_probe_stack2 (word_mode, operands[0]));
   DONE;
 })
 
diff --git a/gcc/testsuite/gcc.target/s390/stack-clash-4.c 
b/gcc/testsuite/gcc.target/s390/stack-clash-4.c
new file mode 100644
index 000..619d99ddf69
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/stack-clash-4.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O1 -m31 -mzarch -fstack-clash-protection" } */
+
+extern void c(char*);
+
+void
+a() {
+  char *b = __builtin_alloca(3);
+  c(b);
+}
-- 
2.28.0

c++: Fix array type dependency [PR 98107]

2020-12-03 Thread Nathan Sidwell


I'd missed some paths through build_cplus_array_type, plus, some
arrays come via the C-type builder.  This propagates dependency in
more places and asserts that in the cases where TYPE_DEPENDENT_P_VALID
is unset, the type is non-dependent.

PR c++/98107
gcc/cp/
* tree.c (build_cplus_array_type): Mark dependency of new variant.
(cp_build_qualified_type_real, strip_typedefs): Assert
TYPE_DEPENDENT_P_VALID, or not a dependent type.

--
Nathan Sidwell
diff --git c/gcc/cp/tree.c w/gcc/cp/tree.c
index 5932777be04..4f28e6d49fd 100644
--- c/gcc/cp/tree.c
+++ w/gcc/cp/tree.c
@@ -1076,6 +1076,9 @@ build_cplus_array_type (tree elt_type, tree index_type, int dependent)
 {
   bool typeless_storage = is_byte_access_type (elt_type);
   t = build_array_type (elt_type, index_type, typeless_storage);
+
+  /* Mark as non-dependenty now, this will save time later.  */
+  TYPE_DEPENDENT_P_VALID (t) = true;
 }
 
   /* Now check whether we already have this array variant.  */
@@ -1090,6 +1093,9 @@ build_cplus_array_type (tree elt_type, tree index_type, int dependent)
   if (!t)
 	{
 	  t = build_min_array_type (elt_type, index_type);
+	  /* Mark dependency now, this saves time later.  */
+	  TYPE_DEPENDENT_P_VALID (t) = true;
+	  TYPE_DEPENDENT_P (t) = dependent;
 	  set_array_type_canon (t, elt_type, index_type, dependent);
 	  if (!dependent)
 	{
@@ -1326,6 +1332,8 @@ cp_build_qualified_type_real (tree type,
 
   if (!t)
 	{
+	  gcc_checking_assert (TYPE_DEPENDENT_P_VALID (type)
+			   || !dependent_type_p (type));
 	  t = build_cplus_array_type (element_type, TYPE_DOMAIN (type),
   TYPE_DEPENDENT_P (type));
 
@@ -1563,6 +1571,8 @@ strip_typedefs (tree t, bool *remove_attributes, unsigned int flags)
 case ARRAY_TYPE:
   type = strip_typedefs (TREE_TYPE (t), remove_attributes, flags);
   t0  = strip_typedefs (TYPE_DOMAIN (t), remove_attributes, flags);
+  gcc_checking_assert (TYPE_DEPENDENT_P_VALID (t)
+			   || !dependent_type_p (t));
   result = build_cplus_array_type (type, t0, TYPE_DEPENDENT_P (t));
   break;
 case FUNCTION_TYPE:

Re: OpenMP patch ping ** 2

2020-12-03 Thread Tobias Burnus


Hi,

I would like to ping the following OpenMP-related patches:

On 27.11.20 17:09, Tobias Burnus wrote:


OpenMP-related patch pings:

Kwok's:
* Re: [PATCH] openmp: Implicit 'declare target' for C++ static
initializers
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559624.html

* openmp: Add OpenMP 5.0 task detach clause support
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558752.html

* RFC:
  Nested declare target support
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559758.html

Chung-Lin's:
* [PATCH v2, OpenMP 5, C++] Implement implicit mapping of this[:1]
(PR92120)
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/558975.html

My:
* [Patch] OpenMP: C/C++ parse 'omp allocate'
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559930.html

Julian's
* Re: [PATCH] nvptx: Cache stacks block for OpenMP kernel launch
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559044.html



Tobias


[committed] aarch64: Don't fold svundef* at the gimple level

2020-12-03 Thread Richard Sandiford via Gcc-patches

As the testcase shows, folding svundef*() at the gimple level
has the unfortunate side-effect of introducing -Wuninitialized
or -Wmaybe-uninitialized warnings.  We don't have a testcase
that relies on the fold, so the easiest fix seems to be to
remove it.

Tested on aarch64-linux-gnu and aarch64_be-elf, pushed.

Thanks,
Richard


gcc/
* config/aarch64/aarch64-sve-builtins-base.cc (svundef_impl::fold):
Delete.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/undef_1.c: New test.
---
 gcc/config/aarch64/aarch64-sve-builtins-base.cc  | 11 ---
 .../gcc.target/aarch64/sve/acle/general/undef_1.c| 12 
 2 files changed, 12 insertions(+), 11 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/undef_1.c

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 9b63ea76ecd..4223125cd5e 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -2295,17 +2295,6 @@ public:
   CONSTEXPR svundef_impl (unsigned int vectors_per_tuple)
 : quiet (vectors_per_tuple) {}
 
-  gimple *
-  fold (gimple_folder &f) const OVERRIDE
-  {
-/* Don't fold svundef at the gimple level.  There's no exact
-   correspondence for SSA_NAMEs, and we explicitly don't want
-   to generate a specific value (like an all-zeros vector).  */
-if (vectors_per_tuple () == 1)
-  return NULL;
-return gimple_build_assign (f.lhs, build_clobber (TREE_TYPE (f.lhs)));
-  }
-
   rtx
   expand (function_expander &e) const OVERRIDE
   {
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/undef_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/undef_1.c
new file mode 100644
index 000..793593b662c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/undef_1.c
@@ -0,0 +1,12 @@
+/* { dg-options "-O2 -W -Wall -Werror" } */
+
+#include 
+
+svfloat32x2_t
+foo (svfloat32_t x, svfloat32_t y)
+{
+  svfloat32x2_t res = svundef2_f32 ();
+  res = svset2 (res, 0, x);
+  res = svset2 (res, 1, y);
+  return res;
+}

[PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Kumar, Venkataramanan via Gcc-patches

[AMD Public Use]


Hi Maintainers,

PFA, the patch that enables support for the next generation AMD Zen3 CPU via 
-march=znver3.
This is a very basic enablement patch. As of now the cost, tuning and scheduler 
changes are kept same as znver2.
Further changes to the cost and tunings will be done later.

Ok for trunk ?

Regards,
Venkat.



X86_64-Enable-support-for-next-generation-AMD-Znver3.patch
Description: X86_64-Enable-support-for-next-generation-AMD-Znver3.patch

RE: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Kumar, Venkataramanan via Gcc-patches

[AMD Public Use]

Thanks Uros, I forgot to change.

Please ignore this thread. I will send a fresh one.

Regards,
Venkat.

-Original Message-
From: Uros Bizjak  
Sent: Thursday, December 3, 2020 8:44 PM
To: Kumar, Venkataramanan 
Cc: gcc-patches@gcc.gnu.org; Jan Hubicka 
Subject: Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

The message says that it is for internal distribution. Please repost.

Thanks,
Uros.

On Thu, Dec 3, 2020 at 4:11 PM Kumar, Venkataramanan 
 wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Hi Maintainers,
>
>
>
> PFA, the patch that enables support for the next generation AMD Zen3 CPU via 
> -march=znver3.
>
> This is a very basic enablement patch. As of now the cost, tuning and 
> scheduler changes are kept same as znver2.
>
> Further changes to the cost and tunings will be done later.
>
>
>
> Ok for trunk ?
>
>
>
> Regards,
>
> Venkat.

Re: [PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Uros Bizjak via Gcc-patches

The message says that it is for internal distribution. Please repost.

Thanks,
Uros.

On Thu, Dec 3, 2020 at 4:11 PM Kumar, Venkataramanan
 wrote:
>
> [AMD Official Use Only - Internal Distribution Only]
>
>
> Hi Maintainers,
>
>
>
> PFA, the patch that enables support for the next generation AMD Zen3 CPU via 
> -march=znver3.
>
> This is a very basic enablement patch. As of now the cost, tuning and 
> scheduler changes are kept same as znver2.
>
> Further changes to the cost and tunings will be done later.
>
>
>
> Ok for trunk ?
>
>
>
> Regards,
>
> Venkat.

[PATCH] [X86_64]: Enable support for next generation AMD Zen3 CPU

2020-12-03 Thread Kumar, Venkataramanan via Gcc-patches

[AMD Official Use Only - Internal Distribution Only]

Hi Maintainers,

PFA, the patch that enables support for the next generation AMD Zen3 CPU via 
-march=znver3.
This is a very basic enablement patch. As of now the cost, tuning and scheduler 
changes are kept same as znver2.
Further changes to the cost and tunings will be done later.

Ok for trunk ?

Regards,
Venkat.


X86_64-Enable-support-for-next-generation-AMD-Znver3.patch
Description: X86_64-Enable-support-for-next-generation-AMD-Znver3.patch

Re: [PATCH] c++: v3: Add __builtin_bit_cast to implement std::bit_cast [PR93121]

2020-12-03 Thread Jason Merrill via Gcc-patches


On 11/26/20 10:09 AM, Jakub Jelinek wrote:

Sorry, thought I had replied to this before.


The following patch removes the mask from the new native_interpret_aggregate
moved to fold-const.c altogether.  Instead, the code uses the
__builtin_clear_padding infrastructure (new entrypoint in there).
If native_interpret_aggregate succeeds (returns non-NULL), then
what we have in mask from the earlier native_encode_initializer call is
0 bits if they are well defined in argument and 1 bits for indeterminate
bits.  The new clear_type_padding_in_mask function then clears all bits
which are padding in the given type (the destination type), so if any bits
remain in the mask, it means we are using indeterminate bits and C++ FE
can diagnose it.  The bit-cast* tests didn't change at all.


Great, thanks.


+  if (can_native_interpret_type_p (TREE_TYPE (t)))
+if (tree r = native_interpret_expr (TREE_TYPE (t), ptr, len))
+  {
+   for (int i = 0; i < len; i++)
+ if (mask[i])
+   {
+ if (!ctx->quiet)
+   error_at (loc, "%qs accessing uninitialized byte at offset %d",
+ "__builtin_bit_cast", i);
+ *non_constant_p = true;
+ r = t;
+ break;
+   }
+   if (ptr != buf)
+ XDELETE (ptr);
+   return r;
+  }
+
+  if (TREE_CODE (TREE_TYPE (t)) == RECORD_TYPE)
+{
+  tree r = native_interpret_aggregate (TREE_TYPE (t), ptr, 0, len);
+  if (r != NULL_TREE)
+   {
+ clear_type_padding_in_mask (TREE_TYPE (t), mask);
+ for (int i = 0; i < len; i++)
+   if (mask[i])
+ {
+   if (!ctx->quiet)
+ error_at (loc, "%qs accessing uninitialized byte at offset "
+"%d", "__builtin_bit_cast", i);
+   *non_constant_p = true;
+   r = t;
+   break;
+ }
+ if (ptr != buf)
+   XDELETE (ptr);
+ return r;
+   }
+}


There's a lot of code that can be shared between these two cases now. 
The C++ bits are OK with that change.
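
As an illustration of the mask logic described above and of the loop that both quoted branches repeat, here is a small sketch with stand-in types. ByteMask, the padding ranges and both function names are assumptions made for this example; the real code works on GCC trees and on clear_type_padding_in_mask, whose interface is not reproduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Stand-in for the real mask: one entry per byte, non-zero meaning
// "this byte of the source is indeterminate".
using ByteMask = std::vector<unsigned char>;

// Stand-in for clearing the destination type's padding in the mask.  The
// padding is modeled as a list of half-open byte ranges [first, second).
void clear_padding_in_mask(
    ByteMask &mask,
    const std::vector<std::pair<std::size_t, std::size_t>> &padding) {
  for (const auto &range : padding) {
    assert(range.first <= range.second && range.second <= mask.size());
    for (std::size_t i = range.first; i < range.second; ++i)
      mask[i] = 0;
  }
}

// The loop that both branches of the quoted hunk repeat: return the offset
// of the first byte still marked indeterminate, or -1 if there is none.
long first_indeterminate_byte(const ByteMask &mask) {
  for (std::size_t i = 0; i < mask.size(); ++i)
    if (mask[i])
      return static_cast<long>(i);
  return -1;
}
```

In this model the scalar case calls first_indeterminate_byte directly, while the aggregate case first clears the padding ranges; an offset other than -1 corresponds to the "accessing uninitialized byte at offset" diagnostic.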


Jason

[PATCH] c++: Distinguish unsatisfaction vs errors during satisfaction [PR97093]

2020-12-03 Thread Patrick Palka via Gcc-patches

During satisfaction, the flag info.noisy() controls three things:
whether to diagnose fatal errors (such as the satisfaction value of an
atom being non-bool); whether to diagnose unsatisfaction; and whether to
bypass the satisfaction cache.

This flag turns out to be too coarse however, for sometimes we need to
diagnose fatal errors but not unsatisfaction, in particular when replaying
an erroneous satisfaction result from constraint_satisfaction_value,
evaluate_concept_check and tsubst_nested_requirement.

And we sometimes need to bypass the satisfaction cache but not diagnose
unsatisfaction, in particular when evaluating the branches of a
disjunction when info.noisy() is true.  Currently, satisfy_disjunction
first quietly evaluates each branch, but doing so causes satisfy_atom
to insert re-normalized atoms into the satisfaction cache when
diagnosing unsatisfaction of the overall constraint.  This is ultimately
the source of PR97093.

To that end, this patch adds the info.diagnose_unsatisfaction_p() flag
which refines the info.noisy() flag.  During satisfaction info.noisy()
now controls whether to diagnose fatal errors, and
info.diagnose_unsatisfaction_p() controls whether to additionally
diagnose unsatisfaction.  This enables us to address the above two
issues straightforwardly.

This flag refinement also allows us to fold the diagnose_foo_requirement
routines into the corresponding tsubst_foo_requirement ones.  Here, the
flags take on slightly different meanings: info.noisy() controls whether
to diagnose invalid types and expressions inside the requires-expression,
and info.diagnose_unsatisfaction_p() controls whether to diagnose the
overall unsatisfaction of the requires-expression.
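
The split between the two flags can be pictured with a small model. This is only a sketch: SatInfoSketch, Outcome, Report and both functions are hypothetical names, and the invariant in the constructor is an assumption read out of the word "additionally" above, not a quote of the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for the patch's sat_info; only the two flags are
// modeled and the constructor shape is an assumption.
class SatInfoSketch {
public:
  SatInfoSketch(bool noisy, bool diagnose_unsat)
      : noisy_(noisy), diagnose_unsat_(diagnose_unsat) {
    // Assumed invariant: unsatisfaction is only diagnosed "additionally",
    // i.e. when fatal errors are diagnosed as well.
    assert(!diagnose_unsat_ || noisy_);
  }
  bool noisy() const { return noisy_; }
  bool diagnose_unsatisfaction_p() const { return diagnose_unsat_; }

private:
  bool noisy_;
  bool diagnose_unsat_;
};

enum class Outcome { Satisfied, Unsatisfied, Error };
enum class Report { Nothing, FatalError, Unsatisfaction };

// Which diagnostic, if any, a satisfaction routine would emit in this model.
Report report_for(const SatInfoSketch &info, Outcome outcome) {
  if (outcome == Outcome::Error && info.noisy())
    return Report::FatalError;
  if (outcome == Outcome::Unsatisfied && info.diagnose_unsatisfaction_p())
    return Report::Unsatisfaction;
  return Report::Nothing;
}

// Flags for evaluating one branch of a disjunction: keep error diagnostics,
// drop unsatisfaction diagnostics.
SatInfoSketch for_disjunction_branch(const SatInfoSketch &info) {
  return SatInfoSketch(info.noisy(), false);
}
```

The disjunction helper mirrors the case described above, where a branch should still report fatal errors but stay silent about merely being unsatisfied.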

Bootstrapped and regtested on x86_64-pc-linux-gnu, and also tested on
cmcstl2 and range-v3.  Does this look OK for trunk?

gcc/cp/ChangeLog:

PR c++/97093
* constraint.cc (struct sat_info): Define.
(tsubst_valid_expression_requirement): Take a sat_info instead
of subst_info.  Perform the substitution quietly first.  Fold in
error-replaying code from diagnose_valid_expression.
(tsubst_simple_requirement): Take a sat_info instead of
subst_info.
(tsubst_type_requirement_1): New.  Fold in error-replaying code
from diagnose_valid_type.
(tsubst_type_requirement): Use it. Take a sat_info instead of
subst_info.
(tsubst_compound_requirement): Likewise.  Fold in
error-replaying code from diagnose_compound_requirement.
(tsubst_nested_requirement): Take a sat_info instead of
subst_info.  Perform the substitution quietly first.  Fold in
error-replaying code from diagnose_nested_requirement.
(tsubst_requirement): Take a sat_info instead of subst_info.
(tsubst_requirement_body): Likewise.
(tsubst_requires_expr): Split into two versions, one that takes
a sat_info argument and another that takes a complain and
in_decl argument.  Remove outdated documentation.  Document the
effects of the sat_info argument.
(tsubst_parameter_mapping): Take a sat_info instead of a
subst_info.
(satisfy_conjunction): Likewise.
(satisfy_disjunction): Likewise.  Evaluate each branch with
unsatisfaction diagnostics disabled rather than completely
quietly, and short circuit when an erroneous branch is
encountered.
(satisfy_atom):  Take a sat_info instead of a subst_info.  Fix a
comment.  Use diagnose_unsatisfaction_p() instead of noisy() to
guard replaying of satisfaction failure.  Always check
constantness quietly first and consistently return
error_mark_node when the value is non-constant.
(satisfy_constraint_r): Document the effects of the sat_info
argument.  Take a sat_info instead of a subst_info.
(satisfy_constraint): Take a sat_info instead of a subst_info.
(satisfy_associated_constraints): Likewise.
(satisfy_constraint_expression): Likewise.
(satisfy_declaration_constraints): Likewise.
(constraint_satisfaction_value): Likewise.  Adjust.  XXX
(constraints_satisfied_p): Adjust.
(evaluate_concept_check): Adjust.
(diagnose_trait_expr): Make static.  Take a template args vector
instead of a parameter mapping.
(diagnose_atomic_constraint): Take a sat_info instead of a
subst_info.  Adjust call to diagnose_trait_expr.  Call
tsubst_requires_expr instead of diagnose_requires_expr.
(diagnose_constraints): Adjust calls to
constraint_satisfaction_value.
(diagnose_valid_expression): Remove.
(diagnose_valid_type): Likewise.
(diagnose_simple_requirement): Likewise.
(diagnose_compound_requirement): Likewise.
(diagnose_type_requirement): Likewise.
(diagnose_nested_requirement): Likewise.
(diagnose_requirement): Likewise.

Re: [PATCH 3/4] rs6000: Enable vec_insert for P8 with rs6000_expand_vector_set_var_p8

2020-12-03 Thread Xionghu Luo via Gcc-patches


Ping. Thanks.


On 2020/11/27 09:04, Xionghu Luo via Gcc-patches wrote:

Hi Segher,
Thanks for the approval of [PATCH 1/4] and [PATCH 2/4], what's your
opinion of this [PATCH 3/4] for P8, please?  xxinsertw only exists since
v3.0, so we had to implement it another way.


Xionghu


On 2020/10/10 16:08, Xionghu Luo wrote:

gcc/ChangeLog:

2020-10-10  Xionghu Luo  

* config/rs6000/rs6000-c.c (altivec_resolve_overloaded_builtin):
Generate ARRAY_REF(VIEW_CONVERT_EXPR) for P8 and later
platforms.
* config/rs6000/rs6000.c (rs6000_expand_vector_set_var): Update
to call different path for P8 and P9.
(rs6000_expand_vector_set_var_p9): New function.
(rs6000_expand_vector_set_var_p8): New function.

gcc/testsuite/ChangeLog:

2020-10-10  Xionghu Luo  

* gcc.target/powerpc/pr79251.p8.c: New test.
---
   gcc/config/rs6000/rs6000-c.c  |  27 +++-
   gcc/config/rs6000/rs6000.c| 117 +-
   gcc/testsuite/gcc.target/powerpc/pr79251.p8.c |  17 +++
   3 files changed, 155 insertions(+), 6 deletions(-)
   create mode 100644 gcc/testsuite/gcc.target/powerpc/pr79251.p8.c

diff --git a/gcc/config/rs6000/rs6000-c.c b/gcc/config/rs6000/rs6000-c.c
index 5551a21d738..4bea8001ec6 100644
--- a/gcc/config/rs6000/rs6000-c.c
+++ b/gcc/config/rs6000/rs6000-c.c
@@ -1599,10 +1599,29 @@ altivec_resolve_overloaded_builtin (location_t loc, 
tree fndecl,
  SET_EXPR_LOCATION (stmt, loc);
  stmt = build1 (COMPOUND_LITERAL_EXPR, arg1_type, stmt);
}
-  stmt = build_array_ref (loc, stmt, arg2);
-  stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
- convert (TREE_TYPE (stmt), arg0));
-  stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+
+  if (TARGET_P8_VECTOR)
+   {
+ stmt = build_array_ref (loc, stmt, arg2);
+ stmt = fold_build2 (MODIFY_EXPR, TREE_TYPE (arg0), stmt,
+ convert (TREE_TYPE (stmt), arg0));
+ stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+   }
+  else
+   {
+ tree arg1_inner_type;
+ tree innerptrtype;
+ arg1_inner_type = TREE_TYPE (arg1_type);
+ innerptrtype = build_pointer_type (arg1_inner_type);
+
+ stmt = build_unary_op (loc, ADDR_EXPR, stmt, 0);
+ stmt = convert (innerptrtype, stmt);
+ stmt = build_binary_op (loc, PLUS_EXPR, stmt, arg2, 1);
+ stmt = build_indirect_ref (loc, stmt, RO_NULL);
+ stmt = build2 (MODIFY_EXPR, TREE_TYPE (stmt), stmt,
+convert (TREE_TYPE (stmt), arg0));
+ stmt = build2 (COMPOUND_EXPR, arg1_type, stmt, decl);
+   }
 return stmt;
   }
   
diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c

index 96f76c7a74c..33ca839cb28 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -6806,10 +6806,10 @@ rs6000_expand_vector_set (rtx target, rtx val, rtx 
elt_rtx)
   }
   
   /* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX

-   is variable and also counts by vector element size.  */
+   is variable and also counts by vector element size for p9 and above.  */
   
   void

-rs6000_expand_vector_set_var (rtx target, rtx val, rtx idx)
+rs6000_expand_vector_set_var_p9 (rtx target, rtx val, rtx idx)
   {
 machine_mode mode = GET_MODE (target);
   
@@ -6852,6 +6852,119 @@ rs6000_expand_vector_set_var (rtx target, rtx val, rtx idx)

 emit_insn (perml);
   }
   
+/* Insert VAL into IDX of TARGET, VAL size is same of the vector element, IDX

+   is variable and also counts by vector element size for p8.  */
+
+void
+rs6000_expand_vector_set_var_p8 (rtx target, rtx val, rtx idx)
+{
+  machine_mode mode = GET_MODE (target);
+
+  gcc_assert (VECTOR_MEM_VSX_P (mode) && !CONST_INT_P (idx));
+
+  gcc_assert (GET_MODE (idx) == E_SImode);
+
+  machine_mode inner_mode = GET_MODE (val);
+  HOST_WIDE_INT mode_mask = GET_MODE_MASK (inner_mode);
+
+  rtx tmp = gen_reg_rtx (GET_MODE (idx));
+  int width = GET_MODE_SIZE (inner_mode);
+
+  gcc_assert (width >= 1 && width <= 4);
+
+  if (!BYTES_BIG_ENDIAN)
+{
+  /*  idx = idx * width.  */
+  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
+  /*  idx = idx + 8.  */
+  emit_insn (gen_addsi3 (tmp, tmp, GEN_INT (8)));
+}
+  else
+{
+  emit_insn (gen_mulsi3 (tmp, idx, GEN_INT (width)));
+  emit_insn (gen_subsi3 (tmp, GEN_INT (24 - width), tmp));
+}
+
+  /*  lxv vs33, mask.
+  DImode: 0x
+  SImode: 0x
+  HImode: 0x.
+  QImode: 0x00ff.  */
+  rtx mask = gen_reg_rtx (V16QImode);
+  rtx mask_v2di = gen_reg_rtx (V2DImode);
+  rtvec v = rtvec_alloc (2);
+  if (!BYTES_BIG_ENDIAN)
+{
+  RTVEC_ELT (v, 0) = gen_rtx_CONST_INT (DImode, 0);
+  RTVEC_ELT (v, 1) = gen_r

Re: introduce overridable clear_cache emitter

2020-12-03 Thread Alexandre Oliva

On Dec  3, 2020, Christophe Lyon  wrote:

> This patch causes a lot of regressions in fortran on arm and aarch64,

On Dec  3, 2020, Andreas Schwab  wrote:

> On Dez 03 2020, Andreas Schwab wrote:

>> ../../../../libffi/src/aarch64/ffi.c: In function 'ffi_prep_closure_loc':
>> ../../../../libffi/src/aarch64/ffi.c:67:3: error: both arguments to 
>> '__builtin___clear_cache' must be pointers
>> 67 |   __builtin___clear_cache (start, end);
>> |   ^~~~

> This happens when compiling with -mabi=ilp32.

Thank you both.  Here's the patch I'm testing to fix both issues.
Ok to install?


fix __builtin___clear_cache overrider fallout

From: Alexandre Oliva 

Machines that had CLEAR_CACHE_INSN and that would thus issue calls to
__clear_cache with the default call expander, would fail on languages
that did not set up the __clear_cache builtin.  This patch arranges
for all languages to set up this builtin.

Machines or multilibs that had ptr_mode != Pmode, such as aarch64 with
-mabi=ilp32, would fail the RTL mode test of the arguments passed to
__clear_cache, because we'd insist on ptr_mode.  This patch arranges for
Pmode to be accepted as well.
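
A toy model of the second fix, for readers who do not have the two modes in their head.  The Mode enum, TargetModes and the function names are stand-ins invented for this sketch; the ilp32 target is modeled with a 32-bit ptr_mode and a 64-bit Pmode, which is the mismatch described above.

```cpp
#include <cassert>

// Hypothetical stand-ins for machine modes (32-bit and 64-bit integers).
enum class Mode { SI, DI };

struct TargetModes {
  Mode ptr_mode;  // mode of a pointer at the language level
  Mode Pmode;     // mode the backend uses for addresses
};

// Old check: only ptr_mode is accepted.
bool old_arg_ok(Mode arg, const TargetModes &t) { return arg == t.ptr_mode; }

// New check: either ptr_mode or Pmode is accepted.
bool new_arg_ok(Mode arg, const TargetModes &t) {
  return arg == t.ptr_mode || arg == t.Pmode;
}

// Small self-check for the ilp32-like case: an address that is already in
// Pmode is rejected by the old check and accepted by the new one.
void demo_ilp32() {
  const TargetModes ilp32 = {Mode::SI, Mode::DI};
  assert(!old_arg_ok(Mode::DI, ilp32));
  assert(new_arg_ok(Mode::DI, ilp32));
}
```

On a target where ptr_mode and Pmode coincide, both checks agree, so the relaxation only matters for the mismatched configurations.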


for  gcc/ChangeLog

* tree.c (build_common_builtin_nodes): Declare
__builtin___clear_cache for all languages.
* builtins.c (maybe_emit_call_builtin___clear_cache): Accept
Pmode arguments.
---
 gcc/builtins.c |3 ++-
 gcc/tree.c |6 ++
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/gcc/builtins.c b/gcc/builtins.c
index ecc12e69c1466..cd30de8bfb035 100644
--- a/gcc/builtins.c
+++ b/gcc/builtins.c
@@ -7793,7 +7793,8 @@ default_emit_call_builtin___clear_cache (rtx begin, rtx 
end)
 void
 maybe_emit_call_builtin___clear_cache (rtx begin, rtx end)
 {
-  if (GET_MODE (begin) != ptr_mode || GET_MODE (end) != ptr_mode)
+  if ((GET_MODE (begin) != ptr_mode && GET_MODE (begin) != Pmode)
+  || (GET_MODE (end) != ptr_mode && GET_MODE (end) != Pmode))
 {
   error ("both arguments to %<__builtin___clear_cache%> must be pointers");
   return;
diff --git a/gcc/tree.c b/gcc/tree.c
index 52a145dd01819..72311005f57b2 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -10733,6 +10733,12 @@ build_common_builtin_nodes (void)
 
   ftype = build_function_type_list (void_type_node,
ptr_type_node, ptr_type_node, NULL_TREE);
+  if (!builtin_decl_explicit_p (BUILT_IN_CLEAR_CACHE))
+local_define_builtin ("__builtin___clear_cache", ftype,
+ BUILT_IN_CLEAR_CACHE,
+ "__builtin___clear_cache",
+ ECF_NOTHROW);
+
   local_define_builtin ("__builtin_nonlocal_goto", ftype,
BUILT_IN_NONLOCAL_GOTO,
"__builtin_nonlocal_goto",


-- 
Alexandre Oliva, happy hacker  https://FSFLA.org/blogs/lxo/
   Free Software Activist GNU Toolchain Engineer
Vim, Vi, Voltei pro Emacs -- GNUlius Caesar

Re: [PATCH] c++: consteval-defarg1.C test variant for templates

2020-12-03 Thread Jason Merrill via Gcc-patches


On 12/3/20 4:29 AM, Jakub Jelinek wrote:

On Wed, Dec 02, 2020 at 10:15:25PM -0500, Jason Merrill wrote:

Jakub noticed that we weren't recognizing a default argument for a consteval
member function as being in immediate function context because there was no
function parameter scope to look at.

Note that this patch doesn't actually push the parameters into the scope,
that happens in a separate commit.


Shouldn't we also be testing how it behaves in templates?

The following testcase is an attempt to test both non-dependent and
dependent consteval calls in both function and class templates, and with
your committed patch it now passes.

Ok for trunk?


OK, thanks.


2020-12-03  Jakub Jelinek  

* g++.dg/cpp2a/consteval-defarg2.C: New test.

--- gcc/testsuite/g++.dg/cpp2a/consteval-defarg2.C.jj   2020-12-03 
10:26:16.340256056 +0100
+++ gcc/testsuite/g++.dg/cpp2a/consteval-defarg2.C  2020-12-03 
10:19:33.215853790 +0100
@@ -0,0 +1,29 @@
+// Test that late-parsed default args have the same consteval semantics.
+// { dg-do compile { target c++20 } }
+
+template 
+consteval bool foo (bool x) { if (x) throw N; return false; }
+consteval bool qux (bool x) { if (x) throw 1; return false; }
+template 
+consteval bool bar (bool x = foo (true)) { return true; }
+template 
+consteval bool corge (bool x = qux (true)) { return true; }
+template 
+struct S
+{
+  consteval static bool baz (bool x = foo (true)) { return true; }
+  consteval static bool garply (bool x = qux (true)) { return true; }
+};
+struct T
+{
+  template 
+  consteval static bool baz (bool x = foo (true)) { return true; }
+  template 
+  consteval static bool garply (bool x = qux (true)) { return true; }
+};
+constexpr bool a = bar<0> (true);
+constexpr bool b = corge<0> (true);
+constexpr bool c = S<0>::baz (true);
+constexpr bool d = S<0>::garply (true);
+constexpr bool e = T::baz<0> (true);
+constexpr bool f = T::garply<0> (true);


Jakub

C++ patch ping

2020-12-03 Thread Jakub Jelinek via Gcc-patches

Hi!

I'd like to ping
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/560372.html
- v3 of the __builtin_bit_cast (with (hopefully) all earlier feedback
incorporated).

Thanks

Jakub

[PATCH] tree-optimization/98113 - vectorize a sequence of BIT_INSERT_EXPRs

2020-12-03 Thread Richard Biener

This adds the capability to vectorize a sequence of vector BIT_INSERT_EXPRs
similarly to how we vectorize vector constructors.

Bootstrapped and tested on x86_64-unknown-linux-gnu, OK at this stage
or do we want to defer to stage1 and regress for the PR's testcase?

The testcase could be amended for other targets that support
vectorized popcount; I've refrained from second-guessing the needed
options and targets.
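
For illustration only, the idea of treating a chain of lane inserts like a constructor can be sketched as follows.  LaneDef and lanes_as_constructor are invented names, operands are plain ints rather than SSA names, std::sort stands in for the qsort comparator in the patch, and the "every lane written exactly once" condition is an assumption of this sketch rather than a statement about the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// (lane number, operand id) for one lane insert in the chain.
using LaneDef = std::pair<unsigned, int>;

// If DEFS writes every lane 0..NLANES-1 exactly once, store the operands in
// lane order into OPS and return true; otherwise leave OPS empty and return
// false.
bool lanes_as_constructor(std::vector<LaneDef> defs, unsigned nlanes,
                          std::vector<int> &ops) {
  assert(nlanes > 0);
  ops.clear();
  if (defs.size() != nlanes)
    return false;
  // Order the inserts by lane, as a comparator on the lane number would.
  std::sort(defs.begin(), defs.end(),
            [](const LaneDef &a, const LaneDef &b) { return a.first < b.first; });
  for (unsigned i = 0; i < nlanes; ++i)
    if (defs[i].first != i)
      return false;  // a lane is missing or written twice
  for (std::size_t i = 0; i < defs.size(); ++i)
    ops.push_back(defs[i].second);
  return true;
}
```

Fed the insert order of the new testcase (lanes 2, 1, 0, 3), the sketch returns the operands in lane order 0 to 3, which is the shape a vector constructor would have.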

Thanks,
Richard.

2020-12-03  Richard Biener  

PR tree-optimization/98113
* tree-vectorizer.h (struct slp_root): New.
(_bb_vec_info::roots): New member.
* tree-vect-slp.c (vect_analyze_slp): Also walk BB info
roots.
(_bb_vec_info::_bb_vec_info): Adjust.
(_bb_vec_info::~_bb_vec_info): Likewise.
(vld_cmp): New.
(vect_slp_is_lane_insert): Likewise.
(vect_slp_check_for_constructors): Match a series of
BIT_INSERT_EXPRs as vector constructor.
(vect_slp_analyze_bb_1): Continue if BB info roots is
not empty.
(vect_slp_analyze_bb_1): Mark the whole BIT_INSERT_EXPR root
sequence as pure_slp.

* gcc.dg/vect/bb-slp-70.c: New testcase.
---
 gcc/testsuite/gcc.dg/vect/bb-slp-70.c |  17 +++
 gcc/tree-vect-slp.c   | 192 +++---
 gcc/tree-vectorizer.h |  12 ++
 3 files changed, 200 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/vect/bb-slp-70.c

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-70.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-70.c
new file mode 100644
index 000..0eb70112bde
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-70.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-mavx512vl -mavx512vpopcntdq" { target 
avx512vpopcntdq } } */
+
+typedef unsigned uv4si __attribute__((vector_size(16)));
+
+uv4si __attribute__((noinline))
+vpopctf (uv4si a)
+{
+  uv4si r;
+  r[2] = __builtin_popcount (a[2]);
+  r[1] = __builtin_popcount (a[1]);
+  r[0] = __builtin_popcount (a[0]);
+  r[3] = __builtin_popcount (a[3]);
+  return r;
+}
+
+/* { dg-final { scan-tree-dump "optimized: basic block" "slp2" { target 
avx512vpopcntdq } } } */
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index be87475092d..8f709aa0271 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -3039,6 +3039,19 @@ vect_analyze_slp (vec_info *vinfo, unsigned 
max_tree_size)
   ? slp_inst_kind_store : slp_inst_kind_ctor,
   max_tree_size);
 
+  if (bb_vec_info bb_vinfo = dyn_cast  (vinfo))
+{
+  for (unsigned i = 0; i < bb_vinfo->roots.length (); ++i)
+   {
+ vect_location = bb_vinfo->roots[i].root->stmt;
+ if (vect_build_slp_instance (bb_vinfo, bb_vinfo->roots[i].kind,
+  bb_vinfo->roots[i].stmts,
+  bb_vinfo->roots[i].root,
+  max_tree_size, bst_map, NULL))
+   bb_vinfo->roots[i].stmts = vNULL;
+   }
+}
+
   if (loop_vec_info loop_vinfo = dyn_cast <loop_vec_info> (vinfo))
 {
   /* Find SLP sequences starting from reduction chains.  */
@@ -3797,7 +3810,7 @@ vect_detect_hybrid_slp (loop_vec_info loop_vinfo)
 /* Initialize a bb_vec_info struct for the statements in BBS basic blocks.  */
 
 _bb_vec_info::_bb_vec_info (vec<basic_block> _bbs, vec_info_shared *shared)
-  : vec_info (vec_info::bb, init_cost (NULL), shared), bbs (_bbs)
+  : vec_info (vec_info::bb, init_cost (NULL), shared), bbs (_bbs), roots (vNULL)
 {
   for (unsigned i = 0; i < bbs.length (); ++i)
 {
@@ -3844,6 +3857,10 @@ _bb_vec_info::~_bb_vec_info ()
  gimple_set_uid (stmt, -1);
}
 }
+
+  for (unsigned i = 0; i < roots.length (); ++i)
+roots[i].stmts.release ();
+  roots.release ();
 }
 
 /* Subroutine of vect_slp_analyze_node_operations.  Handle the root of NODE,
@@ -4556,6 +4573,38 @@ vect_bb_vectorization_profitable_p (bb_vec_info bb_vinfo,
   return true;
 }
 
+/* qsort comparator for lane defs.  */
+
+static int
+vld_cmp (const void *a_, const void *b_)
+{
+  auto *a = (const std::pair<unsigned, tree> *)a_;
+  auto *b = (const std::pair<unsigned, tree> *)b_;
+  return a->first - b->first;
+}
+
+/* Return true if USE_STMT is a vector lane insert into VEC and set
+   *THIS_LANE to the lane number that is set.  */
+
+static bool
+vect_slp_is_lane_insert (gimple *use_stmt, tree vec, unsigned *this_lane)
+{
+  gassign *use_ass = dyn_cast <gassign *> (use_stmt);
+  if (!use_ass
+  || gimple_assign_rhs_code (use_ass) != BIT_INSERT_EXPR
+  || (vec
+ ? gimple_assign_rhs1 (use_ass) != vec
+ : ((vec = gimple_assign_rhs1 (use_ass)), false))
+  || !useless_type_conversion_p (TREE_TYPE (TREE_TYPE (vec)),
+TREE_TYPE (gimple_assign_rhs2 (use_ass)))
+  || !constant_multiple_p
+   (tree_to_poly_uint64 (gimple_assign_rhs3 (use_ass)),
+tree_to_poly_uint64 (TYPE_SIZE (TREE_TYPE (TREE_TYP
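
[Editor's illustration] The ChangeLog above says a series of BIT_INSERT_EXPRs is matched as a vector constructor, and the patch adds a qsort comparator for lane defs. The patch text is cut off before the matching logic, so the following is only a minimal sketch of the idea in plain C, not GCC's code. The names lane_def, lane_def_cmp and inserts_form_constructor are hypothetical, and the requirement that every lane be written exactly once is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical record of one lane insert: the lane that is written and
   an integer id standing in for the scalar definition being inserted.  */
struct lane_def
{
  unsigned lane;
  int def;
};

/* qsort comparator ordering lane inserts by lane number.  */
static int
lane_def_cmp (const void *a_, const void *b_)
{
  const struct lane_def *a = a_;
  const struct lane_def *b = b_;
  return (a->lane > b->lane) - (a->lane < b->lane);
}

/* Sort DEFS by lane and return true if lanes 0 .. NLANES-1 are each
   written exactly once (assumed condition for a full constructor).  */
static bool
inserts_form_constructor (struct lane_def *defs, size_t n, unsigned nlanes)
{
  if (n != nlanes)
    return false;
  qsort (defs, n, sizeof *defs, lane_def_cmp);
  for (size_t i = 0; i < n; i++)
    if (defs[i].lane != i)
      return false;
  return true;
}
```

With the insert order used in bb-slp-70.c (lanes 2, 1, 0, 3), the sketch sorts the entries into lane order and accepts them; a sequence that writes one lane twice is rejected.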

Re: [stage1][PATCH] Change semantics of -frecord-gcc-switches and add -frecord-gcc-switches-format.

2020-12-03 Thread Richard Biener via Gcc-patches

On Thu, Nov 26, 2020 at 2:55 PM Martin Liška  wrote:
>
> On 11/25/20 2:48 PM, Richard Biener wrote:
> > On Mon, Nov 23, 2020 at 2:02 PM Martin Liška  wrote:
> >>
> >> On 11/23/20 12:00 PM, Richard Biener wrote:
> >>> Can you split out the unifying of -[gf]record-gcc-switches processing
> >>> and the target hook adjustment from the change introducing
> >>> -frecord-gcc-switches-format?
> >>
> >> Sure.
> >>
> >>>
> >>> dwarf2out.c seems to retain its gen_producer_string () even though
> >>> you duplicate it elsewhere and it is now unused?  Please retain
> >>> the gen_producer_string name since the function does actual
> >>> processing and not just fetch a precomputed string from somewhere.
> >>
> >> Yep, please take a look at the attached patch.
> >>
> >>>
> >>> I'd like somebody else to chime in on the -frecord-gcc-switches-format
> >>> driver handling but will happily work on getting the unification part
> >>> merged.
> >>
> >> May I apply the patch after it finishes regression tests and bootstrap?
> >
> > @@ -1523,8 +1378,9 @@ process_options (void)
> > if (version_flag)
> >   {
> > print_version (stderr, "", true);
> > -  if (! quiet_flag)
> > -   print_switch_values (print_to_stderr);
> > +  if (!quiet_flag)
> > +   fputs (gen_producer_string (lang_hooks.name, save_decoded_options,
> > +   save_decoded_options_count), stderr);
> >   }
> >
> >
> > so I wonder whether this regresses (no newlines anymore, no separate
> > printing of options passed/enabled).  Previously with -Q -v you'll get sth
> > like
> >
> > options passed:  -v
> >   -iprefix /home/rguenther/obj-gcc2-g/gcc/../lib64/gcc/x86_64-pc-linux-gnu/11.0.0/
> >   -isystem ./include -isystem ./include-fixed t.c -mtune=generic
> >   -march=x86-64 -O2
> > options enabled:  -faggressive-loop-optimizations -falign-functions
> >   -falign-jumps -falign-labels -falign-loops -fallocation-dce
> >   -fasynchronous-unwind-tables -fauto-inc-dec -fbranch-count-reg
> >   -fcaller-saves -fcode-hoisting -fcombine-stack-adjustments -fcompare-elim
> >   -fcprop-registers -fcrossjumping -fcse-follow-jumps -fdefer-pop
> >   -fdelete-null-pointer-checks -fdevirtualize -fdevirtualize-speculatively
> >   -fdwarf2-cfi-asm -fearly-inlining -feliminate-unused-debug-symbols
> > ...
>
> Oh, you are right that there's a significant change (I fixed the newline):
>
> BEFORE:
> GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
> options passed:  -v a.c -mtune=generic -march=x86-64
> options enabled:  -faggressive-loop-optimizations -fallocation-dce
>   -fasynchronous-unwind-tables -fauto-inc-dec -fdelete-null-pointer-checks
>   -fdwarf2-cfi-asm -fearly-inlining -feliminate-unused-debug-symbols
>   -feliminate-unused-debug-types -ffp-int-builtin-inexact -ffunction-cse
>   -fgcse-lm -fgnu-unique -fident -finline-atomics -fipa-stack-alignment
>   -fira-hoist-pressure -fira-share-save-slots -fira-share-spill-slots
>   -fivopts -fkeep-static-consts -fleading-underscore -flifetime-dse
>   -fmath-errno -fmerge-debug-strings -fpeephole -fplt -fprefetch-loop-arrays
>   -freg-struct-return -fsched-critical-path-heuristic
>   -fsched-dep-count-heuristic -fsched-group-heuristic -fsched-interblock
>   -fsched-last-insn-heuristic -fsched-rank-heuristic -fsched-spec
>   -fsched-spec-insn-heuristic -fsched-stalled-insns-dep -fschedule-fusion
>   -fsemantic-interposition -fshow-column -fshrink-wrap-separate
>   -fsigned-zeros -fsplit-ivs-in-unroller -fssa-backprop -fstdarg-opt
>   -fstrict-volatile-bitfields -fsync-libcalls -ftrapping-math -ftree-cselim
>   -ftree-forwprop -ftree-loop-if-convert -ftree-loop-im -ftree-loop-ivcanon
>   -ftree-loop-optimize -ftree-parallelize-loops= -ftree-phiprop
>   -ftree-reassoc -ftree-scev-cprop -funit-at-a-time -funwind-tables
>   -fvar-tracking -fvar-tracking-assignments -fzero-initialized-in-bss
>   -m128bit-long-double -m64 -m80387 -malign-stringops
>   -mavx256-split-unaligned-load -mavx256-split-unaligned-store
>   -mfancy-math-387 -mfp-ret-in-387 -mfxsr -mglibc -mieee-fp -mlong-double-80
>   -mmmx -mno-sse4 -mpush-args -mred-zone -msse -msse2 -mstv
>   -mtls-direct-seg-refs -mvzeroupper
> Compiler executable checksum: 
>
> AFTER:
> GGC heuristics: --param ggc-min-expand=30 --param ggc-min-heapsize=4096
> GNU C17 11.0.0 20201123 (experimental) -dumpbase-ext .c -mtune=generic -march=x86-64
> Compiler executable checksum: 36558a9ca3607b6821d78562377c56da
>
> For -fverbose-asm we get something similar:
>
> BEFORE:
> # GNU C17 (SUSE Linux) version 10.2.1 20201117 [revision 98ba03ffe0b9f37b4916ce6238fad754e00d720b] (x86_64-suse-linux)
> #   compiled by GNU C version 10.2.1 20201117 [revision 98ba03ffe0b9f37b4916ce6238fad754e00d720b], GMP version 6.2.0, MPFR version 4.1.0, MPC version 1.2.1, isl version isl-0.22.1-GMP
>
> # GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072
> # options passed:
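
[Editor's illustration] The AFTER output quoted above is a single line made of the language name, the version and the options. A minimal sketch of assembling such a string in C follows. It is not GCC's gen_producer_string; the helper name build_producer_string and its parameters are hypothetical and chosen only for this example.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write "<language> <version> <opt> <opt> ..." into
   BUF (at most SIZE bytes including the terminating NUL) and return the
   length the complete string needs, in the manner of snprintf.
   Returns 0 on an encoding error.  */
static size_t
build_producer_string (char *buf, size_t size, const char *language,
                       const char *version, const char *const *opts,
                       size_t n_opts)
{
  int n = snprintf (buf, size, "%s %s", language, version);
  if (n < 0)
    return 0;
  size_t len = (size_t) n;

  for (size_t i = 0; i < n_opts; i++)
    {
      /* Once the buffer is full, keep counting but stop writing.  */
      char *dst = len < size ? buf + len : NULL;
      size_t room = len < size ? size - len : 0;
      n = snprintf (dst, room, " %s", opts[i]);
      if (n < 0)
        return 0;
      len += (size_t) n;
    }
  return len;
}
```

Called with "GNU C17", "11.0.0" and the two options -mtune=generic and -march=x86-64, the sketch yields one space-separated line, which is the shape that loses the separate "options passed" and "options enabled" sections discussed in this thread.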

Re: introduce overridable clear_cache emitter

2020-12-03 Thread Andreas Schwab

On Dez 03 2020, Andreas Schwab wrote:

> ../../../../libffi/src/aarch64/ffi.c: In function 'ffi_prep_closure_loc':
> ../../../../libffi/src/aarch64/ffi.c:67:3: error: both arguments to '__builtin___clear_cache' must be pointers
>67 |   __builtin___clear_cache (start, end);
>   |   ^~~~

This happens when compiling with -mabi=ilp32.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [gcc r11-4816] Fix Ada build failure for the SuSE PowerPC64/Linux compiler

2020-12-03 Thread Andreas Schwab

On Dez 03 2020, Eric Botcazou wrote:

>> This breaks build of libada, it is missing all of $(GNATRTL_128BIT_OBJS).
>
> In the default multilib?  Yes, that's the point, since it's 32-bit apparently.

Nope.  The default is 64-bit.

Andreas.


Re: [patch] Fix PR middle-end/98082

2020-12-03 Thread Richard Biener via Gcc-patches

On Thu, Dec 3, 2020 at 11:49 AM Eric Botcazou  wrote:
>
> Hi,
>
> this fixes an ICE introduced on the mainline by my fix for PR middle-end/97078
> where I changed use_register_for_decl to return true at -O0 for a parameter of
> a thunk.  It turns out that we need to do the same for a result in this case.
>
> Tested on x86-64/Linux, OK for the mainline?

OK.

Richard.

>
> 2020-12-03  Eric Botcazou  
>
> PR middle-end/98082
> * function.c (use_register_for_decl): Also return true for a result
> if cfun->tail_call_marked is true.
>
>
> 2020-12-03  Eric Botcazou  
>
> * g++.dg/cpp2a/pr98082.C: New test.
>
> --
> Eric Botcazou

Re: [patch] Fix PR middle-end/98099

2020-12-03 Thread Richard Biener via Gcc-patches

On Thu, Dec 3, 2020 at 11:49 AM Eric Botcazou  wrote:
>
> Hi,
>
> this replaces the ICE by a sorry message for the use of reverse scalar storage
> order with a 128-bit decimal floating-point type on 32-bit platforms.
>
> Tested on x86-64/Linux, OK for the mainline?

OK.

Richard.

>
> 2020-12-03  Eric Botcazou  
>
> * expmed.c (flip_storage_order): In the case of a non-integer mode,
> sorry out if the integer mode to be used instead is not supported.
>
>
> 2020-12-03  Eric Botcazou  
>
> * gcc.dg/pr98099.c: New test.
>
> --
> Eric Botcazou

Re: introduce overridable clear_cache emitter

2020-12-03 Thread Andreas Schwab

../../../../libffi/src/aarch64/ffi.c: In function 'ffi_prep_closure_loc':
../../../../libffi/src/aarch64/ffi.c:67:3: error: both arguments to '__builtin___clear_cache' must be pointers
   67 |   __builtin___clear_cache (start, end);
  |   ^~~~

Andreas.


[BUG/PATCH] ppc64 g5 and cell optimizations result in .machine power7

2020-12-03 Thread René Rebe

Hi,

Since the rework of the rs6000 .machine output selection in commit
e154242724b084380e3221df7c08fcdbd8460674 (22 May 2019), compiling glibc for
either G5 or Cell causes the power7 assembly optimizations to be chosen, which
then crash with illegal instructions. This is because GCC's .machine output
was accidentally changed, as OPTION_MASK_ALTIVEC is otherwise only present in
IBM CPUs since power7.

Bug & patch already at:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97367

Best regards,
René Rebe

-- 
 ExactCODE GmbH, Lietzenburger Str. 42, DE-10789 Berlin, https://exactcode.com
 https://exactscan.com | https://ocrkit.com | https://t2sde.org | https://rene.rebe.de
