date:20210820

On Thu, 19 Aug 2021, Jirui Wu wrote:

> Hi all,
> 
> This patch generates FRINTZ instruction to optimize type casts.
> 
> The changes in this patch covers:
> * Generate FRINTZ for (double)(int) casts.
> * Add new test cases.
> 
> The intermediate type is not checked according to the C99 spec. 
> Overflow of the integral part when casting floats to integers causes 
> undefined behavior.
> As a result, optimization to trunc() is not invalid. 
> I've confirmed that Boolean type does not match the matching condition.
> 
> Regtested on aarch64-none-linux-gnu and no issues.
> 
> Ok for master? If OK can it be committed for me, I have no commit rights.

+/* Detected a fix_trunc cast inside a float type cast,
+   use IFN_TRUNC to optimize.  */
+#if GIMPLE
+(simplify
+  (float (fix_trunc @0))
+  (if (direct_internal_fn_supported_p (IFN_TRUNC, type,
+  OPTIMIZE_FOR_BOTH)
+   && flag_unsafe_math_optimizations
+   && type == TREE_TYPE (@0))

types_match (type, TREE_TYPE (@0))

please.  Please perform cheap tests first (the flag test).

+ (IFN_TRUNC @0)))
+#endif

why only for GIMPLE?  I'm not sure flag_unsafe_math_optimizations
is a good test here.  If you say we can use undefined behavior
of any overflow of the fix_trunc operation what do we guard here?
If it's Inf/NaN input then flag_finite_math_only would be more
appropriate, if it's behavior for -0. (I suppose trunc (-0.0) == -0.0
and thus "wrong") then a && !HONOR_SIGNED_ZEROS (type) is missing
instead.  If it's setting of FENV state and possibly trapping on
overflow (but it's undefined?!) then flag_trapping_math covers
the latter but we don't have any flag for eliding FENV state affecting
transforms, so there the kitchen-sink flag_unsafe_math_optimizations
might apply.

So - which is it?

Note there's also the pattern

/* Handle cases of two conversions in a row.  */
(for ocvt (convert float fix_trunc)
 (for icvt (convert float)
  (simplify
   (ocvt (icvt@1 @0))
   (with
{
...

which is related so please put the new pattern next to that (the
set of conversions handled there does not include (float (fix_trunc @0)))

Thanks,
Richard.

> Thanks,
> Jirui
> 
> gcc/ChangeLog:
> 
> * match.pd: Generate IFN_TRUNC.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.target/aarch64/merge_trunc1.c: New test.
> 
> > -Original Message-
> > From: Richard Biener 
> > Sent: Tuesday, August 17, 2021 9:13 AM
> > To: Andrew Pinski 
> > Cc: Jirui Wu ; Richard Sandiford
> > ; i...@airs.com; gcc-patches@gcc.gnu.org;
> > rguent...@suse.de
> > Subject: Re: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int)
> > under -ffast-math on aarch64
> > 
> > On Mon, Aug 16, 2021 at 8:48 PM Andrew Pinski via Gcc-patches  > patc...@gcc.gnu.org> wrote:
> > >
> > > On Mon, Aug 16, 2021 at 9:15 AM Jirui Wu via Gcc-patches
> > >  wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This patch generates FRINTZ instruction to optimize type casts.
> > > >
> > > > The changes in this patch covers:
> > > > * Opimization of a FIX_TRUNC_EXPR cast inside a FLOAT_EXPR using
> > IFN_TRUNC.
> > > > * Change of corresponding test cases.
> > > >
> > > > Regtested on aarch64-none-linux-gnu and no issues.
> > > >
> > > > Ok for master? If OK can it be committed for me, I have no commit 
> > > > rights.
> > >
> > > Is there a reason why you are doing the transformation manually inside
> > > forwprop rather than handling it inside match.pd?
> > > Also can't this only be done for -ffast-math case?
> > 
> > You definitely have to look at the intermediate type - that could be a 
> > uint8_t
> > or even a boolean type.  So unless the intermediate type can represent all
> > float values optimizing to trunc() is invalid.  Also if you emit IFN_TRUNC 
> > you
> > have to make sure there's target support - we don't emit calls to a library
> > trunc() from an internal function call (and we wouldn't want to optimize it
> > that way).
> > 
> > Richard.
> > 
> > >
> > > Thanks,
> > > Andrew Pinski
> > >
> > > >
> > > > Thanks,
> > > > Jirui
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > * tree-ssa-forwprop.c (pass_forwprop::execute): Optimize with 
> > > > frintz.
> > > >
> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > * gcc.target/aarch64/fix_trunc1.c: Update to new expectation.
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH, Maxfeldstrasse 5, 90409 Nuernberg,
Germany; GF: Felix Imendörffer; HRB 36809 (AG Nuernberg)

Re: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.

On Thu, Aug 19, 2021 at 6:01 PM Roger Sayle  wrote:
>
>
> Doh!  ENOPATCH.
>
> -Original Message-
> From: Roger Sayle 
> Sent: 19 August 2021 16:59
> To: 'GCC Patches' 
> Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
>
>
> Back in June I briefly mentioned in one of my gcc-patches posts that a
> change that should have always reduced code size, would mysteriously
> occasionally result in slightly larger code (according to CSiBE):
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
>
> Investigating further, the cause turns out to be that x86_64's
> scalar-to-vector (stv) pass is relying on poor estimates of the size
> costs/benefits.  This patch tweaks the backend's compute_convert_gain method
> to provide slightly more accurate values when compiling with -Os.
> Compilation without -Os is (should be) unaffected.  And for completeness,
> I'll mention that the stv pass is a net win for code size so it's much
> better to improve its heuristics than simply gate the pass on
> !optimize_for_size.
>
> The net effect of this change is to save 1399 bytes on the CSiBE code size
> benchmark when compiling with -Os.
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> and "make -k check" with no new failures.
>
> Ok for mainline?

+   /* xor (2 bytes) vs. xorps (3 bytes).  */
+   if (src == const0_rtx)
+ igain -= COSTS_N_BYTES (1);
+   /* movdi_internal vs. movv2di_internal.  */
+   /* => mov (5 bytes) vs. movaps (7 bytes).  */
+   else if (x86_64_immediate_operand (src, SImode))
+ igain -= COSTS_N_BYTES (2);

doesn't it need two GPR xor for 32bit DImode and two mov?  Thus
the non-SSE cost should be times 'm'?  For const0_rtx we may
eventually re-use the zero reg for the high part so that is eventually
correct.

Also I'm missing a 'else' - in the default case there's no cost/benefit
of using SSE vs. GPR regs?  For SSE it would be a constant pool
load.

I also wonder, since I now see COSTS_N_BYTES for the first time (heh),
whether with -Os we'd need to replace all COSTS_N_INSNS (1)
scaling with COSTS_N_BYTES scaling?  OTOH costs_add_n_insns
uses COSTS_N_INSNS for the size part as well.

That said, it looks like we're eventually mixing apples and oranges
now or even previously?

Thanks,
Richard.

>
>
> 2021-08-19  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-features.c (compute_convert_gain): Provide
> more accurate values for CONST_INT, when optimizing for size.
> * config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
> * config/i386/i386.h (COSTS_N_BYTES): to here.
>
> Roger
> --
>

Re: [PATCH] libgfortran : Use the libtool macro to determine libm availability.

On Thu, Aug 19, 2021 at 10:10 PM Iain Sandoe  wrote:
>
> Hi,
>
> A while ago had a report of build failure against a Darwin branch on
> the latest OS release.  This was because (temporarily) the symlink
> from libm.dylib => libSystem.dylib had been removed/omitted.
>
> libm is not needed on Darwin, and should not be added unconditionally
> even if that is (mostly) harmless since it is a symlink to libc.
>
> There could be cases where the addition was not completely harmless
> because the presentation of the symlink would cause the symbols exposed
> in libSystem to be considered ahead of ones presented in convenience
> libraries.
>
> tested on x86_64, i686-darwin, x86_64-linux,
> OK for master?

OK.

Richard.

> thanks
> Iain
>
> libgfortran/ChangeLog:
>
> * Makefile.am: Use configured libm availability.
> * Makefile.in: Regenerate.
> * configure: Regenerate.
> * configure.ac: Use libtool macro to find libm availability.
> * libgfortran.spec.in: Use configured libm availability.
> ---
>  libgfortran/Makefile.am |   2 +-
>  libgfortran/Makefile.in |   3 +-
>  libgfortran/configure   | 142 
>  libgfortran/configure.ac|   1 +
>  libgfortran/libgfortran.spec.in |   2 +-
>  5 files changed, 147 insertions(+), 3 deletions(-)
>
> diff --git a/libgfortran/Makefile.am b/libgfortran/Makefile.am
> index 8d104321567..6fc4b465c6e 100644
> --- a/libgfortran/Makefile.am
> +++ b/libgfortran/Makefile.am
> @@ -42,7 +42,7 @@ libgfortran_la_LINK = $(LINK) $(libgfortran_la_LDFLAGS)
>  libgfortran_la_LDFLAGS = -version-info `grep -v '^\#' 
> $(srcdir)/libtool-version` \
> $(LTLDFLAGS) $(LIBQUADLIB) ../libbacktrace/libbacktrace.la \
> $(HWCAP_LDFLAGS) \
> -   -lm $(extra_ldflags_libgfortran) \
> +   $(LIBM) $(extra_ldflags_libgfortran) \
> $(version_arg) -Wc,-shared-libgcc
>  libgfortran_la_DEPENDENCIES = $(version_dep) libgfortran.spec 
> $(LIBQUADLIB_DEP)
>
> diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
> index 523eb24bca1..513fd80b936 100644
> --- a/libgfortran/configure.ac
> +++ b/libgfortran/configure.ac
> @@ -260,6 +260,7 @@ AC_PROG_INSTALL
>  #AC_MSG_NOTICE([== Starting libtool configuration])
>  AC_LIBTOOL_DLOPEN
>  AM_PROG_LIBTOOL
> +AC_CHECK_LIBM
>  ACX_LT_HOST_FLAGS
>  AC_SUBST(enable_shared)
>  AC_SUBST(enable_static)
> diff --git a/libgfortran/libgfortran.spec.in b/libgfortran/libgfortran.spec.in
> index 95aa3f842a3..b870e78c151 100644
> --- a/libgfortran/libgfortran.spec.in
> +++ b/libgfortran/libgfortran.spec.in
> @@ -5,4 +5,4 @@
>  #
>
>  %rename lib liborig
> -*lib: @LIBQUADSPEC@ -lm %(libgcc) %(liborig)
> +*lib: @LIBQUADSPEC@  @LIBM@ %(libgcc) %(liborig)
> --
> 2.24.3 (Apple Git-128)
>
>

Re: [ping] Re-unify 'omp_build_component_ref' and 'oacc_build_component_ref'

On Thu, Aug 19, 2021 at 10:14 PM Thomas Schwinge
 wrote:
>
> Hi!
>
> Richard, maybe you have an opinion here, in particular about my
> "SLP vectorizer" comment below?  Please see
> <87r1f2puss.fsf@euler.schwinge.homeip.net">http://mid.mail-archive.com/87r1f2puss.fsf@euler.schwinge.homeip.net>
> for the full context.
>
> On 2021-08-16T10:21:04+0200, Jakub Jelinek  wrote:
> > On Mon, Aug 16, 2021 at 10:08:42AM +0200, Thomas Schwinge wrote:
> >>  /* Build COMPONENT_REF and set TREE_THIS_VOLATILE and TREE_READONLY on it
> >> as appropriate.  */
> >>
> >>  tree
> >>  omp_build_component_ref (tree obj, tree field)
> >>  {
> >> +  tree field_type = TREE_TYPE (field);
> >> +  tree obj_type = TREE_TYPE (obj);
> >> +  if (!ADDR_SPACE_GENERIC_P (TYPE_ADDR_SPACE (obj_type)))
> >> +field_type
> >> +  = build_qualified_type (field_type,
> >> +  KEEP_QUAL_ADDR_SPACE (TYPE_QUALS (obj_type)));
>
> (For later reference: "Kwok's new code" here is to propagate to
> 'field_type' any non-generic address space of 'obj_type'.)
>
> |> Concerning the current 'gcc/omp-low.c:omp_build_component_ref', for the
> |> current set of offloading testcases, we never see a
> |> '!ADDR_SPACE_GENERIC_P' there, so the address space handling doesn't seem
> |> to be necessary there (but also won't do any harm: no-op).
> >
> > Are you sure this can't trigger?
> > Say
> > extern int __seg_fs a;
> >
> > void
> > foo (void)
> > {
> >   #pragma omp parallel private (a)
> >   a = 2;
> > }
>
> That test case doesn't run into 'omp_build_component_ref' at all,
> but I'm attaching an altered and extended variant that does,
> "Add 'libgomp.c/address-space-1.c'".  OK to push to master branch?
>
> In this case, 'omp_build_component_ref' called via host compilation
> 'pass_lower_omp', it's the 'field_type' that has 'address-space-1', not
> 'obj_type', so indeed Kwok's new code is a no-op:
>
> (gdb) call debug_tree(field_type)
>   type  size 
> unit-size 
> align:32 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77686498 precision:32 min  max 
> 
> pointer_to_this >
> unsigned DI
> size  bitsizetype> constant 64>
> unit-size  0x77559000 sizetype> constant 8>
> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77686b28>
>
> (gdb) call debug_tree(obj_type)
>   size  bitsizetype> constant 64>
> unit-size  0x77559000 sizetype> constant 8>
> align:64 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77686bd0
> fields  type  0x77686498 int address-space-1>
> unsigned DI size  unit-size 
> 
> align:64 warn_if_not_align:0 symtab:0 alias-set -1 
> canonical-type 0x77686b28>
> unsigned DI /home/thomas/shared/gcc/omp/as.c:4:14 size 
>  unit-size 
> align:64 warn_if_not_align:0 offset_align 128
> offset 
> bit-offset  context 
> > reference_to_this  0x77686c78>>
>
> The case that Kwok's new code handles, however, is when 'obj_type' has a
> non-generic address space, and then propagates that one to 'field_type'.
>
> For a similar OpenACC example, 'omp_build_component_ref' called via GCN
> offloading compilation 'pass_omp_oacc_neuter_broadcast', we've got
> without Kwok's new code:
>
> (gdb) call debug_tree(field_type)
>   size  bitsizetype> constant 8>
> unit-size  0x7755 sizetype> constant 1>
> align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77550b28 precision:1 min  max 
> >
>
> (gdb) call debug_tree(obj_type)
>   size  bitsizetype> constant 8>
> unit-size  0x7755 sizetype> constant 1>
> align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77631000
> fields  type   unit-size 
> align:8 warn_if_not_align:0 symtab:0 alias-set -1 
> canonical-type 0x77550b28 precision:1 min  
> max >
> unsigned QI :0:0 size  
> unit-size 
> align:8 warn_if_not_align:0 offset_align 64
> offset 
> bit-offset  context 
> >
> pointer_to_this >
>
> ..., and with Kwok's new code the 'address-space-4' of 'obj_type' is
> propagated to 'field_type':
>
> (gdb) call debug_tree(field_type)
>   size  bitsizetype> constant 8>
> unit-size  0x7755 sizetype> constant 1>
> align:8 warn_if_not_align:0 symtab:0 alias-set -1 canonical-type 
> 0x77631540 precision:1 min  max 
> >
>
> I'm not familiar enough with these bits to tell whether Kwok's new code
> is the right solution to this problem -- or if, for example, the problem
> is rather in the SLP vectorizer, where the ICE seems to ultimately
> emerge?
>
> Without (ICEs later) vs. with (works) Kwok's new code, we see the
> 'a.xamdgcn-amdhsa.mkoffload.175t.slp1'

Re: [PATCH] libgfortran : Use the libtool macro to determine libm availability.


On 20.08.21 09:34, Richard Biener via Fortran wrote:


On Thu, Aug 19, 2021 at 10:10 PM Iain Sandoe  wrote:

libm is not needed on Darwin, and should not be added unconditionally
even if that is (mostly) harmless since it is a symlink to libc.

tested on x86_64, i686-darwin, x86_64-linux,
OK for master?

OK.
Richard.


Looks also good to me – but for completeness and as background, I also
want to remark the following.

(At least as I understand it, I did not dig deeper.)


--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
...
+AC_CHECK_LIBM


This CHECK_LIBM has a hard-coded list of targets and for some (like
Darwin) it simply does not set $LIBM.


+++ b/libgfortran/Makefile.am
@@ -42,7 +42,7 @@ libgfortran_la_LINK = $(LINK) $(libgfortran_la_LDFLAGS)
  libgfortran_la_LDFLAGS = -version-info `grep -v '^\#' 
$(srcdir)/libtool-version` \
 $(LTLDFLAGS) $(LIBQUADLIB) ../libbacktrace/libbacktrace.la \
 $(HWCAP_LDFLAGS) \
-   -lm $(extra_ldflags_libgfortran) \
+   $(LIBM) $(extra_ldflags_libgfortran) \


I think usage like this one do not actually link '-lm' as
gcc/config/darwin.h contains:

#define LINK_SPEC  \
   ...
   %:remove-outfile(-lm) \

However, this spec change comes too early for:

--- a/libgfortran/libgfortran.spec.in
+++ b/libgfortran/libgfortran.spec.in
@@ -5,4 +5,4 @@
  #

  %rename lib liborig
-*lib: @LIBQUADSPEC@ -lm %(libgcc) %(liborig)
+*lib: @LIBQUADSPEC@  @LIBM@ %(libgcc) %(liborig)

NIT: There are two spaces instead of one before @LIBM@.

Thus, it makes sense to the unconditional '-lm' from the .spec file.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: [PATCH] libgfortran : Use the libtool macro to determine libm availability.

2021-08-20 Thread Iain Sandoe




> On 20 Aug 2021, at 08:52, Tobias Burnus  wrote:
> 
> On 20.08.21 09:34, Richard Biener via Fortran wrote:
> 
>> On Thu, Aug 19, 2021 at 10:10 PM Iain Sandoe  wrote:
>>> libm is not needed on Darwin, and should not be added unconditionally
>>> even if that is (mostly) harmless since it is a symlink to libc.
>>> 
>>> tested on x86_64, i686-darwin, x86_64-linux,
>>> OK for master?
>> OK.
>> Richard.
> 
> Looks also good to me – but for completeness and as background, I also
> want to remark the following.
> 
> (At least as I understand it, I did not dig deeper.)
> 
>> --- a/libgfortran/configure.ac
>> +++ b/libgfortran/configure.ac
>> ...
>> +AC_CHECK_LIBM
> 
> This CHECK_LIBM has a hard-coded list of targets and for some (like
> Darwin) it simply does not set $LIBM.

however, it’s one place to make changes (and, probably, I suppose a table is 
about the
only way to do it, reliably)?

>>> +++ b/libgfortran/Makefile.am
>>> @@ -42,7 +42,7 @@ libgfortran_la_LINK = $(LINK) $(libgfortran_la_LDFLAGS)
>>>  libgfortran_la_LDFLAGS = -version-info `grep -v '^\#' 
>>> $(srcdir)/libtool-version` \
>>> $(LTLDFLAGS) $(LIBQUADLIB) ../libbacktrace/libbacktrace.la \
>>> $(HWCAP_LDFLAGS) \
>>> -   -lm $(extra_ldflags_libgfortran) \
>>> +   $(LIBM) $(extra_ldflags_libgfortran) \
> 
> I think usage like this one do not actually link '-lm' as
> gcc/config/darwin.h contains:
> 
> #define LINK_SPEC  \
>   ...
>   %:remove-outfile(-lm) \

right, I have tried to be defensive w.r.t unneeded libs in the Darwin specs, 
but as
you note it doesn’t work for libraries added as infiles…

It’s also consistent to use the configure-discovered value, right?

> However, this spec change comes too early for:
>> --- a/libgfortran/libgfortran.spec.in
>> +++ b/libgfortran/libgfortran.spec.in
>> @@ -5,4 +5,4 @@
>>  #
>> 
>>  %rename lib liborig
>> -*lib: @LIBQUADSPEC@ -lm %(libgcc) %(liborig)
>> +*lib: @LIBQUADSPEC@  @LIBM@ %(libgcc) %(liborig)
> NIT: There are two spaces instead of one before @LIBM@.

… thanks.
> 
> Thus, it makes sense to the unconditional '-lm' from the .spec file.

I’ll hold off committing until this evening in case there are other comments.

thanks
Iain

> 
> Tobias
> 
> -
> Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
> München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
> Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
> München, HRB 106955

[PATCH] gcov: fix output location for JSON mode.


Hello.

The patch adds support for -x and -p when being used with JSON output format.
Apart from that, I utilized std::string mode in string building.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
I'm going to install the patch,
Martin

PR gcov-profile/89961

gcc/ChangeLog:

* gcov.c (make_gcov_file_name): Rewrite using std::string.
(mangle_name): Simplify, do not used the second argument.
(strip_extention): New function.
(get_md5sum): Likewise.
(get_gcov_intermediate_filename): Handle properly -p and -x
options.
(output_gcov_file): Use string type.
(generate_results): Likewise.
(md5sum_to_hex): Remove.
---
 gcc/gcov.c | 158 ++---
 1 file changed, 79 insertions(+), 79 deletions(-)

diff --git a/gcc/gcov.c b/gcc/gcov.c
index 5c651a9bdce..cf0a49d8c30 100644
--- a/gcc/gcov.c
+++ b/gcc/gcov.c
@@ -662,8 +662,8 @@ static void accumulate_line_counts (source_info *);
 static void output_gcov_file (const char *, source_info *);
 static int output_branch_count (FILE *, int, const arc_info *);
 static void output_lines (FILE *, const source_info *);
-static char *make_gcov_file_name (const char *, const char *);
-static char *mangle_name (const char *, char *);
+static string make_gcov_file_name (const char *, const char *);
+static char *mangle_name (const char *);
 static void release_structures (void);
 extern int main (int, char **);
 
@@ -1134,6 +1134,41 @@ output_intermediate_json_line (json::array *object,

   object->append (lineo);
 }
 
+/* Strip filename extension in STR.  */

+
+static string
+strip_extention (string str)
+{
+  string::size_type pos = str.rfind ('.');
+  if (pos != string::npos)
+str = str.substr (0, pos);
+
+  return str;
+}
+
+/* Calcualte md5sum for INPUT string and return it in hex string format.  */
+
+static string
+get_md5sum (const char *input)
+{
+  md5_ctx ctx;
+  char md5sum[16];
+  string str;
+
+  md5_init_ctx (&ctx);
+  md5_process_bytes (input, strlen (input), &ctx);
+  md5_finish_ctx (&ctx, md5sum);
+
+  for (unsigned i = 0; i < 16; i++)
+{
+  char b[3];
+  sprintf (b, "%02x", (unsigned char)md5sum[i]);
+  str += b;
+}
+
+  return str;
+}
+
 /* Get the name of the gcov file.  The return value must be free'd.
 
It appends the '.gcov' extension to the *basename* of the file.

@@ -1143,20 +1178,26 @@ output_intermediate_json_line (json::array *object,
input: foo.da,   output: foo.da.gcov
input: a/b/foo.cc,   output: foo.cc.gcov  */
 
-static char *

-get_gcov_intermediate_filename (const char *file_name)
+static string
+get_gcov_intermediate_filename (const char *input_file_name)
 {
-  const char *gcov = ".gcov.json.gz";
-  char *result;
-  const char *cptr;
+  string base = basename (input_file_name);
+  string str = strip_extention (base);
 
-  /* Find the 'basename'.  */

-  cptr = lbasename (file_name);
-
-  result = XNEWVEC (char, strlen (cptr) + strlen (gcov) + 1);
-  sprintf (result, "%s%s", cptr, gcov);
+  if (flag_hash_filenames)
+{
+  str += "##";
+  str += get_md5sum (input_file_name);
+}
+  else if (flag_preserve_paths && base != input_file_name)
+{
+  str += "##";
+  str += mangle_path (input_file_name);
+  str = strip_extention (str);
+}
 
-  return result;

+  str += ".gcov.json.gz";
+  return str.c_str ();
 }
 
 /* Output the result in JSON intermediate format.

@@ -1416,7 +1457,9 @@ process_all_functions (void)
 static void
 output_gcov_file (const char *file_name, source_info *src)
 {
-  char *gcov_file_name = make_gcov_file_name (file_name, src->coverage.name);
+  string gcov_file_name_str
+= make_gcov_file_name (file_name, src->coverage.name);
+  const char *gcov_file_name = gcov_file_name_str.c_str ();
 
   if (src->coverage.lines)

 {
@@ -1438,13 +1481,12 @@ output_gcov_file (const char *file_name, source_info 
*src)
   unlink (gcov_file_name);
   fnotice (stdout, "Removing '%s'\n", gcov_file_name);
 }
-  free (gcov_file_name);
 }
 
 static void

 generate_results (const char *file_name)
 {
-  char *gcov_intermediate_filename;
+  string gcov_intermediate_filename;
 
   for (vector::iterator it = functions.begin ();

it != functions.end (); it++)
@@ -1547,11 +1589,13 @@ generate_results (const char *file_name)
  root->print (&pp);
  pp_formatted_text (&pp);
 
-	  gzFile output = gzopen (gcov_intermediate_filename, "w");

+ fnotice (stdout, "Creating '%s'\n",
+  gcov_intermediate_filename.c_str ());
+ gzFile output = gzopen (gcov_intermediate_filename.c_str (), "w");
  if (output == NULL)
{
  fnotice (stderr, "Cannot open JSON output file %s\n",
-  gcov_intermediate_filename);
+  gcov_intermediate_filename.c_str ());
  return;
}
 
@@ -1559,7 +1603,7 @@ generate_

Re: [PATCH] configure: Allow a host makefile fragment to override PIE flag settings.

2021-08-20 Thread Richard Sandiford via Gcc-patches

Iain Sandoe  writes:
> Hi,
>
> This concerns the settings of flags (using the host makefile fragment) for
> tools that will run on the host.
>
> At present the (no)PIE flags are computed in gcc/configure but it is not
> possible to override them (either from higher level Makefile or from the
> command line).  Secondly the ordering of placement of flags assumes
> ELF semantics are OK for ordering of -fno-PIE and -fPIC.
>
> For Darwin, this introduces problems if fno-PIE causes PIC to be switched
> off and the bootstrap compiler does not support mdynamic-no-pic (which
> is the case when we bootstrap a 32b toolchain with clang).  This causes
> the host files to be built '-static' which is not legal for user-space
> code, and the build terminates with illegal relocations (so that bootstrap
> fails).
>
> This patch:
>
> 1. Allows the computed PIE flags to be overridden by the top level
>configure.
>
> 2. Allows a host fragment to set values on the basis of the configured
>host platform/version.
>
> 3. Sets suitable values for the Darwin cases that currently fail.
>
> tested on i686,powerpc,x86-64-darwin, x86_64, powerpc64-linux,
> OK for master?
>
> thanks
> Iain
>
> Signed-off-by: Iain Sandoe 
>
> ChangeLog:
>
>   * Makefile.in: Regenerated.
>   * Makefile.tpl: Add PIE flags to HOST_EXPORTS and
>   POST_STAGE1_HOST_EXPORTS as specified by the host makefile
>   fragment.
>
> config/ChangeLog:
>
>   * mh-darwin: Specify suitable PIE/PIC flags for 32b Darwin
>   hosts.
>
> gcc/ChangeLog:
>
>   * configure: Regenerate.
>   * configure.ac: Allow top level configure to override assumed
>   PIE flags.
> ---
>  Makefile.in  | 14 ++
>  Makefile.tpl | 14 ++
>  config/mh-darwin | 20 
>  gcc/configure|  4 ++--
>  gcc/configure.ac |  4 ++--
>  5 files changed, 48 insertions(+), 8 deletions(-)
>
> diff --git a/Makefile.tpl b/Makefile.tpl
> index 9adf4f94728..9523be5a761 100644
> --- a/Makefile.tpl
> +++ b/Makefile.tpl
> @@ -576,6 +576,20 @@ all:
>  @host_makefile_frag@
>  ###
>  
> +# Allow host makefile fragment to override PIE settings.
> +ifneq ($(STAGE1_NO_PIE_CFLAGS),)
> +  HOST_EXPORTS += export NO_PIE_CFLAGS="$(STAGE1_NO_PIE_CFLAGS)";
> +endif
> +ifneq ($(STAGE1_NO_PIE_FLAG),)
> +  HOST_EXPORTS += export NO_PIE_FLAG="$(STAGE1_NO_PIE_FLAG)";
> +endif
> +ifneq ($(BOOT_NO_PIE_CFLAGS),)
> +  POSTSTAGE1_HOST_EXPORTS += export NO_PIE_CFLAGS="$(BOOT_NO_PIE_CFLAGS)";
> +endif
> +ifneq ($(BOOT_NO_PIE_FLAG),)
> +  POSTSTAGE1_HOST_EXPORTS += export NO_PIE_FLAG="$(BOOT_NO_PIE_FLAG)";
> +endif

Not really my area, but:

Is the NO_PIE_FLAG just for completeness?  It didn't look like the
patch overrode that.  Might be easier to leave it out if so, to avoid
any risk of accidentally having an empty variable.

Maybe it would be easier to have the makefile fragments determine
something like CODE_MODEL_CFLAGS, which can be "-fPIC", "-mdynamic-no-pic",
etc., and use:

COMPILER += $(NO_PIE_CFLAGS) $(CODE_MODEL_CFLAGS)

This avoids mixing the autoconf-derived variable with the host override.

I think we need to stick to the traditional sh

  foo=X; export foo

style used elsewhere.  "export foo=X" is a bash extension.

It also might be better to have:

  STAGE1_CODE_MODEL_CFLAGS =
  BOOT_CODE_MODEL_CFLAGS =
  CODE_MODEL_CFLAGS =

in Makefile.tpl (just before the config/ includes) and then override
them in mh-darwin.  We can then include the exports unconditionally,
like with other variables.

I think we should export them for the host too (via HOST_EXPORTS).

We could add {STAGE1,BOOT}_CODE_MODEL_CFLAGS to the main
{STAGE1,BOOT}_{C,CXX}FLAGS in Makefile.tpl.  It looks like
that might simplify some of the existing mh-darwin code a bit.

Like I say, not my area, so feel free to ignore :-)

Thanks,
Richard

> +
>  # This is the list of directories that may be needed in RPATH_ENVVAR
>  # so that programs built for the target machine work.
>  TARGET_LIB_PATH = [+ FOR target_modules +][+
> diff --git a/config/mh-darwin b/config/mh-darwin
> index b72835ae953..d3c357a0574 100644
> --- a/config/mh-darwin
> +++ b/config/mh-darwin
> @@ -7,6 +7,13 @@
>  # We use Werror, since some versions of clang report unknown command line 
> flags
>  # as a warning only.
>  
> +# In addition, all versions of clang released to date treat -fno-PIE in -m32
> +# compilations  as switching PIC code off too, which means that -fno-PIE,
> +# without -mdynamic-no-pic produces broken relocations (and we cannot enable
> +# -mdynamic-no-pic because the inverse setting doesn't work).  To work around
> +# this, we need to ensure that the no-PIE option is followed by -fPIE, when
> +# the compiler does not support mdynamic-no-pic.
> +
>  # We only need to determine this for the host tool used to build stage1 (or a
>  # non-bootstrapped compiler), later stages will be built by GCC which 
> supports
>  # the required flags.
> @@ -24,23 +31,28 @@ endif
>  @if gcc-bootstrap
>  ifeq (${

Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

On Fri, Aug 20, 2021 at 08:48:57AM +0200, Harald Anlauf via Gcc-patches wrote:
> Hi!
> 
> > Gesendet: Freitag, 20. August 2021 um 02:21 Uhr
> > Von: "H.J. Lu" 
> 
> > This may have broken bootstrap on 32-bit hosts:
> > 
> > https://gcc.gnu.org/pipermail/gcc-regression/2021-August/075209.html
> 
> I do not understand the error message:
> 
> ../../src-master/gcc/fortran/simplify.c: In function ‘bool 
> substring_has_constant_len(gfc_expr*)’:
> ../../src-master/gcc/fortran/simplify.c:4557:22: error: unknown conversion 
> type character ‘l’ in format [-Werror=format=]
>  4557 |   gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
>   |  ^
>  4558 |  ") at %L below 1",
>   |  ~
> ../../src-master/gcc/fortran/simplify.c:4557:22: error: format ‘%L’ expects 
> argument of type ‘locus*’, but argument 2 has type ‘long long int’ 
> [-Werror=format=]
>  4557 |   gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
>   |  ^
>  4558 |  ") at %L below 1",
>   |  ~
>  4559 |  istart, &ref->u.ss.start->where);
>   |  ~~
>   |  |
>   |  long long int
> ../../src-master/gcc/fortran/simplify.c:4557:22: error: too many arguments 
> for format [-Werror=format-extra-args]
> 
> Is there an issue with HOST_WIDE_INT_PRINT_DEC on 32-bit hosts?
> What is the right way to print a HOST_WIDE_INT?
> 
> It works on 64-bit without any warning.

gfc_error etc. aren't *printf family, it has its own format specifier
handling (e.g. it handles %L and %C), and it is even different
from the error/warning handling in middle-end/c-family FEs/backends.

HOST_WIDE_INT_PRINT_DEC is wrong in the above spot for 2 reasons:
1) gfc_error etc. argument is automatically marked for translation
   and translated, having a macro in there that expands to various things
   depending on host makes it impossible to mark for translation and
   a lottery for translation.
2) hwint.h defines:
#define HOST_LONG_FORMAT "l"
#define HOST_LONG_LONG_FORMAT "ll"
#if INT64_T_IS_LONG
# define GCC_PRI64 HOST_LONG_FORMAT
#else
# define GCC_PRI64 HOST_LONG_LONG_FORMAT
#endif
#define PRId64 GCC_PRI64 "d"
#define HOST_WIDE_INT_PRINT_DEC "%" PRId64
   but xm-mingw32.h overrides
#define HOST_LONG_LONG_FORMAT "I64"
   so effectively HOST_WIDE_INT_PRINT_DEC is "%ld", "%lld" or "%I64d"
   Now, gfc_error does handle %d or %ld, but doesn't handle %lld nor
   %I64d, so even if the 1) issue didn't exist, this explains why
   it works only on some hosts (e.g. x86_64-linux where %ld is used).
   But e.g. on i686-linux or many other hosts it is %lld which
   gfc_error doesn't handle and on Windows that %I64d.

Now, the non-Fortran FE diagnostic code actually has %wd for this (w
modifier like l modifier), which takes HOST_WIDE_INT/unsigned HOST_WIDE_INT
argument and prints it.

So, either you get through the hops to support that, unfortunately it isn't
just adding support for that in fortran/error.c (error_print) and some
helper functions, which wouldn't be that hard, just add 'w' next to 'l'
handling, TYPE_* for that and union member etc., but one needs to modify
c-family/c-format.c too to register the modifier so that gcc doesn't warn
about it and knows the proper argument type etc.

The other much easier but uglier option is to use a temporary buffer:
  char buffer[21];
  sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, hwint_val);
  gfc_error ("... %s ...", ... buffer ...);
This works, as it uses the host sprintf i.e. *printf family for which
HOST_WIDE_INT_PRINT_DEC macro is designed.

Jakub

Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514


On 20.08.21 02:21, H.J. Lu wrote:

This may have broken bootstrap on 32-bit hosts:
https://gcc.gnu.org/pipermail/gcc-regression/2021-August/075209.html

The latter has:

../../src-master/gcc/fortran/simplify.c:4557:22: error: unknown conversion type 
character ‘l’ in format [-Werror=format=]
  4557 |   gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
   |  ^


HOST_WIDE_INT_PRINT_DEC is:
  #define HOST_WIDE_INT_PRINT_DEC "%" PRId64
and the latter is something like:
  "ld"  or "lld"

I fail to see why that shouldn't work – and also building with:

 CC="gcc-10 -m32 -fno-lto -O2" CXX="g++-10 -m32 -fno-lto -O2" .../configure 
--prefix=... --disable-multilib --disable-libstdcxx-pch --enable-multiarch 
--enable-languages=c,c++,fortran --disable-lto

did succeed.


Looking at the format checking:
  gfortran.h:void gfc_error (const char *, ...) ATTRIBUTE_GCC_GFC(1,2);
with
  gfortran.h:#define ATTRIBUTE_GCC_GFC(m, n) __attribute__ ((__format__ 
(__gcc_gfc__, m, n))) ATTRIBUTE_NONNULL(m)

I do see that the C/C++ part (which also uses
HOST_WIDE_INT_PRINT_DEC) have some special code
for HWI, which is used for via
   init_dynamic_diag_info
contains support for hwi not for gfc_{error,warning} via
   init_dynamic_gfc_info

I did wonder whether that causes the differences (→ attached patch).
On the other hand, %ld and %lld are pretty standard – and the
comment in the function talks about %w – which is not used
(except in spec files and in 'go').

Hence: No idea what's the problem or goes wrong.

Maybe the patch is still useful – at least it gives an
error when HWI is neither long nor long long ...
(Build with and without that patch with the options
shown aboved (and 'error:' in stage 1, 2, 3 plus
the usual bootstrap on x86-64.)

Comments?

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
c-family/c-format.c: Support HOST_WIDE_INT in __gcc_gfc__

This patch fixes x86 -m32 builds as gcc/fortran now uses HOST_WIDE_INT
with gfc_error.

gcc/c-family/ChangeLog:

	* c-format.c (init_dynamic_diag_hwi): New. Moved from ...
	(init_dynamic_diag_info): ... here; call it.
	(init_dynamic_gfc_info): Call it.

 gcc/c-family/c-format.c | 135 ++--
 1 file changed, 74 insertions(+), 61 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 6fd0bb33d21..1db4fd33dbc 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -4843,12 +4843,85 @@ init_dynamic_asm_fprintf_info (void)
 }
 }
 
+/* Called by init_dynamic_gfc_info and init_dynamic_diag_info.
+   Determine the type of "HOST_WIDE_INT" in the code being compiled for
+   use in GCC's diagnostic custom format attributes.  You
+   must have set dynamic_format_types before calling this function.  */
+static void
+init_dynamic_diag_hwi (void)
+{
+  static tree hwi;
+
+  if (!hwi)
+{
+  static format_length_info *diag_ls;
+  unsigned int i;
+
+  /* Find the underlying type for HOST_WIDE_INT.  For the 'w'
+	 length modifier to work, one must have issued: "typedef
+	 HOST_WIDE_INT __gcc_host_wide_int__;" in one's source code
+	 prior to using that modifier.  */
+  if ((hwi = maybe_get_identifier ("__gcc_host_wide_int__")))
+	{
+	  hwi = identifier_global_value (hwi);
+	  if (hwi)
+	{
+	  if (TREE_CODE (hwi) != TYPE_DECL)
+		{
+		  error ("%<__gcc_host_wide_int__%> is not defined as a type");
+		  hwi = 0;
+		}
+	  else
+		{
+		  hwi = DECL_ORIGINAL_TYPE (hwi);
+		  gcc_assert (hwi);
+		  if (hwi != long_integer_type_node
+		  && hwi != long_long_integer_type_node)
+		{
+		  error ("%<__gcc_host_wide_int__%> is not defined"
+			 " as % or %");
+		  hwi = 0;
+		}
+		}
+	}
+	}
+
+  /* Assign the new data for use.  */
+
+  /* All the GCC diag formats use the same length specs.  */
+  if (!diag_ls)
+	dynamic_format_types[gcc_diag_format_type].length_char_specs =
+	  dynamic_format_types[gcc_tdiag_format_type].length_char_specs =
+	  dynamic_format_types[gcc_cdiag_format_type].length_char_specs =
+	  dynamic_format_types[gcc_cxxdiag_format_type].length_char_specs =
+	  dynamic_format_types[gcc_dump_printf_format_type].length_char_specs =
+	  diag_ls = (format_length_info *)
+		xmemdup (gcc_diag_length_specs,
+			 sizeof (gcc_diag_length_specs),
+			 sizeof (gcc_diag_length_specs));
+  if (hwi)
+	{
+	  /* HOST_WIDE_INT must be one of 'long' or 'long long'.  */
+	  i = find_length_info_modifier_index (diag_ls, 'w');
+	  if (hwi == long_integer_type_node)
+	diag_ls[i].index = FMT_LEN_l;
+	  else if (hwi == long_long_integer_type_node)
+	diag_ls[i].index = FMT_LEN_ll;
+	  else
+	gcc_unreachable ();
+	}
+}

[committed] openmp: Diagnose some superfluous commas in OpenMP parsing

Hi!

While working on error directive, I've noticed a few spots in OpenMP
parsing where we consume and don't diagnose superfluous commas at the end
(either of depend sink arguments or at the end of requires pragma).

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2021-08-20  Jakub Jelinek  

gcc/c/
* c-parser.c (c_parser_omp_clause_depend_sink): Reject spurious
comma at the end of list.
(c_parser_omp_requires): Likewise.
gcc/cp/
* parser.c (cp_parser_omp_clause_depend_sink): Reject spurious
comma at the end of list.  Don't parse closing paren here...
(cp_parser_omp_clause_depend): ... but here instead.
gcc/testsuite/
* c-c++-common/gomp/sink-5.c: New test.
* c-c++-common/gomp/requires-3.c: Add test for spurious comma
at the end of pragma line.

--- gcc/c/c-parser.c.jj 2021-08-19 12:53:44.692106632 +0200
+++ gcc/c/c-parser.c2021-08-19 13:19:56.217288105 +0200
@@ -15481,7 +15481,9 @@ c_parser_omp_clause_depend_sink (c_parse
OMP_CLAUSE_DEPEND_SINK_NEGATIVE (vec) = 1;
}
 
-  if (c_parser_next_token_is_not (parser, CPP_COMMA))
+  if (c_parser_next_token_is_not (parser, CPP_COMMA)
+ || c_parser_peek_2nd_token (parser)->type != CPP_NAME
+ || c_parser_peek_2nd_token (parser)->id_kind != C_ID_ID)
break;
 
   c_parser_consume_token (parser);
@@ -21663,7 +21665,9 @@ c_parser_omp_requires (c_parser *parser)
   location_t loc = c_parser_peek_token (parser)->location;
   while (c_parser_next_token_is_not (parser, CPP_PRAGMA_EOL))
 {
-  if (!first && c_parser_next_token_is (parser, CPP_COMMA))
+  if (!first
+ && c_parser_next_token_is (parser, CPP_COMMA)
+ && c_parser_peek_2nd_token (parser)->type == CPP_NAME)
c_parser_consume_token (parser);
 
   first = false;
--- gcc/cp/parser.c.jj  2021-08-19 11:42:27.335422818 +0200
+++ gcc/cp/parser.c 2021-08-19 13:40:26.387178416 +0200
@@ -38465,13 +38465,14 @@ cp_parser_omp_clause_depend_sink (cp_par
OMP_CLAUSE_DEPEND_SINK_NEGATIVE (vec) = 1;
}
 
-  if (cp_lexer_next_token_is_not (parser->lexer, CPP_COMMA))
+  if (cp_lexer_next_token_is_not (parser->lexer, CPP_COMMA)
+ || !cp_lexer_nth_token_is (parser->lexer, 2, CPP_NAME))
break;
 
   cp_lexer_consume_token (parser->lexer);
 }
 
-  if (cp_parser_require (parser, CPP_CLOSE_PAREN, RT_CLOSE_PAREN) && vec)
+  if (vec)
 {
   tree u = build_omp_clause (clause_loc, OMP_CLAUSE_DEPEND);
   OMP_CLAUSE_DEPEND_KIND (u) = OMP_CLAUSE_DEPEND_SINK;
@@ -38791,7 +38792,13 @@ cp_parser_omp_clause_depend (cp_parser *
 goto resync_fail;
 
   if (kind == OMP_CLAUSE_DEPEND_SINK)
-nlist = cp_parser_omp_clause_depend_sink (parser, loc, list);
+{
+  nlist = cp_parser_omp_clause_depend_sink (parser, loc, list);
+  if (!parens.require_close (parser))
+   cp_parser_skip_to_closing_parenthesis (parser, /*recovering=*/true,
+  /*or_comma=*/false,
+  /*consume_paren=*/true);
+}
   else
 {
   nlist = cp_parser_omp_var_list_no_open (parser, OMP_CLAUSE_DEPEND,
--- gcc/testsuite/c-c++-common/gomp/sink-5.c.jj 2021-08-19 13:04:38.934031130 
+0200
+++ gcc/testsuite/c-c++-common/gomp/sink-5.c2021-08-19 13:26:43.522623672 
+0200
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+
+void bar (int);
+
+void
+foo ()
+{
+  int i;
+#pragma omp parallel for ordered(1)
+  for (i = 0; i < 100; ++i)
+{
+#pragma omp ordered depend(sink:i-1,)  /* { dg-error "expected '\\\)' before 
',' token" } */
+  bar (i);
+#pragma omp ordered depend(source)
+}
+}
--- gcc/testsuite/c-c++-common/gomp/requires-3.c.jj 2021-08-19 
11:42:27.393422011 +0200
+++ gcc/testsuite/c-c++-common/gomp/requires-3.c2021-08-19 
13:23:53.217992113 +0200
@@ -3,3 +3,4 @@
 #pragma omp requires atomic_default_mem_order(foobar)  /* { dg-error "expected 
'seq_cst', 'relaxed' or 'acq_rel'" } */
 #pragma omp requires atomic_default_mem_order (/* { dg-error "expected 
'seq_cst', 'relaxed' or 'acq_rel'" } */
 /* { dg-error "expected '\\\)' before end of line" "" { target *-*-* } .-1 } */
+#pragma omp requires atomic_default_mem_order(seq_cst),/* { dg-error 
"expected end of line before ',' token" } */

Jakub

Aw: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

Hi Jakob,

thanks for the detailed explanation!

> The other much easier but uglier option is to use a temporary buffer:
>   char buffer[21];
>   sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, hwint_val);
>   gfc_error ("... %s ...", ... buffer ...);
> This works, as it uses the host sprintf i.e. *printf family for which
> HOST_WIDE_INT_PRINT_DEC macro is designed.

The attached followup patch implements this.

Can anybody affected by current HEAD confirm that this fixes bootstrap?

Thanks,
Harald
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 492867e12cb..eaabbffca4d 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4552,11 +4552,12 @@ substring_has_constant_len (gfc_expr *e)

   if (istart <= iend)
 {
+  char buffer[21];
   if (istart < 1)
 	{
-	  gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
-		 ") at %L below 1",
-		 istart, &ref->u.ss.start->where);
+	  sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, istart);
+	  gfc_error ("Substring start index (%s) at %L below 1",
+		 buffer, &ref->u.ss.start->where);
 	  return false;
 	}

@@ -4567,9 +4568,9 @@ substring_has_constant_len (gfc_expr *e)
 	length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
   if (iend > length)
 	{
-	  gfc_error ("Substring end index (" HOST_WIDE_INT_PRINT_DEC
-		 ") at %L exceeds string length",
-		 iend, &ref->u.ss.end->where);
+	  sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, iend);
+	  gfc_error ("Substring end index (%s) at %L exceeds string length",
+		 buffer, &ref->u.ss.end->where);
 	  return false;
 	}
   length = iend - istart + 1;

[committed] openmp: Implement the error directive

Hi!

This patch implements the error directive.  Depending on clauses it is either
a compile time diagnostics (in that case diagnosed right away) or runtime
diagnostics (libgomp API call that diagnoses at runtime), and either fatal
or warning (error or warning at compile time or fatal error vs. error at
runtime) and either has no message or user supplied message (this kind of
e.g. deprecated attribute).  The directive is also stand-alone directive
when at runtime while utility (thus disappears from the IL as if it wasn't
there for parsing like nothing directive) at compile time.

There are some clarifications in the works ATM, so this patch doesn't yet
require that for compile time diagnostics the user message must be a constant
string literal, there are uncertainities on what exactly is valid argument
of message clause (whether just const char * type, convertible to const char *,
qualified/unqualified const char * or char * or what else) and what to do
in templates.  Currently even in templates it is diagnosed right away for
compile time diagnostics, if we'll need to substitute it, we'd need to queue
something into the IL, have pt.c handle it and diagnose only later.

Bootstrapped/regtested on x86_64-linux, committed to trunk.

2021-08-20  Jakub Jelinek  

gcc/
* omp-builtins.def (BUILT_IN_GOMP_WARNING, BUILT_IN_GOMP_ERROR): New
builtins.
gcc/c-family/
* c-pragma.h (enum pragma_kind): Add PRAGMA_OMP_ERROR.
* c-pragma.c (omp_pragmas): Add error directive.
* c-omp.c (omp_directives): Uncomment error directive entry.
gcc/c/
* c-parser.c (c_parser_omp_error): New function.
(c_parser_pragma): Handle PRAGMA_OMP_ERROR.
gcc/cp/
* parser.c (cp_parser_handle_statement_omp_attributes): Determine if
PRAGMA_OMP_ERROR directive is C_OMP_DIR_STANDALONE.
(cp_parser_omp_error): New function.
(cp_parser_pragma): Handle PRAGMA_OMP_ERROR.
gcc/fortran/
* types.def (BT_FN_VOID_CONST_PTR_SIZE): New DEF_FUNCTION_TYPE_2.
* f95-lang.c (ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST): Define.
gcc/testsuite/
* c-c++-common/gomp/error-1.c: New test.
* c-c++-common/gomp/error-2.c: New test.
* c-c++-common/gomp/error-3.c: New test.
* g++.dg/gomp/attrs-1.C (bar): Add error directive test.
* g++.dg/gomp/attrs-2.C (bar): Add error directive test.
* g++.dg/gomp/attrs-13.C: New test.
* g++.dg/gomp/error-1.C: New test.
libgomp/
* libgomp.map (GOMP_5.1): Add GOMP_error and GOMP_warning.
* libgomp_g.h (GOMP_warning, GOMP_error): Declare.
* error.c (GOMP_warning, GOMP_error): New functions.
* testsuite/libgomp.c-c++-common/error-1.c: New test.

--- gcc/omp-builtins.def.jj 2021-08-19 12:53:44.693106618 +0200
+++ gcc/omp-builtins.def2021-08-19 17:46:45.960368837 +0200
@@ -463,3 +463,7 @@ DEF_GOMP_BUILTIN (BUILT_IN_GOMP_ALLOC,
  ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST)
 DEF_GOMP_BUILTIN (BUILT_IN_GOMP_FREE,
  "GOMP_free", BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_WARNING, "GOMP_warning",
+ BT_FN_VOID_CONST_PTR_SIZE, ATTR_NOTHROW_LEAF_LIST)
+DEF_GOMP_BUILTIN (BUILT_IN_GOMP_ERROR, "GOMP_error",
+ BT_FN_VOID_CONST_PTR_SIZE, 
ATTR_COLD_NORETURN_NOTHROW_LEAF_LIST)
--- gcc/c-family/c-pragma.h.jj  2021-08-19 12:53:44.690106660 +0200
+++ gcc/c-family/c-pragma.h 2021-08-19 14:37:58.004167196 +0200
@@ -53,6 +53,7 @@ enum pragma_kind {
   PRAGMA_OMP_DECLARE,
   PRAGMA_OMP_DEPOBJ,
   PRAGMA_OMP_DISTRIBUTE,
+  PRAGMA_OMP_ERROR,
   PRAGMA_OMP_END_DECLARE_TARGET,
   PRAGMA_OMP_FLUSH,
   PRAGMA_OMP_FOR,
--- gcc/c-family/c-pragma.c.jj  2021-08-19 12:53:44.690106660 +0200
+++ gcc/c-family/c-pragma.c 2021-08-19 14:37:58.004167196 +0200
@@ -1326,6 +1326,7 @@ static const struct omp_pragma_def omp_p
   { "cancellation", PRAGMA_OMP_CANCELLATION_POINT },
   { "critical", PRAGMA_OMP_CRITICAL },
   { "depobj", PRAGMA_OMP_DEPOBJ },
+  { "error", PRAGMA_OMP_ERROR },
   { "end", PRAGMA_OMP_END_DECLARE_TARGET },
   { "flush", PRAGMA_OMP_FLUSH },
   { "nothing", PRAGMA_OMP_NOTHING },
--- gcc/c-family/c-omp.c.jj 2021-08-19 12:53:44.690106660 +0200
+++ gcc/c-family/c-omp.c2021-08-19 14:37:58.004167196 +0200
@@ -2991,8 +2991,8 @@ static const struct c_omp_directive omp_
   /* { "end", "metadirective", nullptr, PRAGMA_OMP_END,
 C_OMP_DIR_???, ??? },  */
   /* error with at(execution) is C_OMP_DIR_STANDALONE.  */
-  /* { "error", nullptr, nullptr, PRAGMA_OMP_ERROR,
-C_OMP_DIR_UTILITY, false },  */
+  { "error", nullptr, nullptr, PRAGMA_OMP_ERROR,
+C_OMP_DIR_UTILITY, false },
   { "flush", nullptr, nullptr, PRAGMA_OMP_FLUSH,
 C_OMP_DIR_STANDALONE, false },
   { "for", nullptr, nullptr, PRAGMA_OMP_FOR,
--- gcc/c/c-parser.c.jj 2021-08-19 13:19:56.217288105 +0200
+++ gcc/c/c-parser.c2021-08-19 15:22:10.146195464 +0200
@@ -1588,6 +1588,7 @@ stati

[Patch] Fortran: Add OpenMP's error directive (was: [committed] openmp: Implement the error directive)


Hi Jakub, hi all,

On 20.08.21 11:45, Jakub Jelinek wrote:

This patch implements the error directive.  Depending on clauses it is either
a compile time diagnostics (in that case diagnosed right away) or runtime
diagnostics (libgomp API call that diagnoses at runtime),


The attached patch does likewise for Fortran. Compared to yesterday, _()
was replaced by G_().


There are some clarifications in the works ATM, so this patch doesn't yet
require that for compile time diagnostics the user message must be a constant
string literal,[...]


For Fortran, it does require a constant character expression; I think we
did converge on this. With at(execution), it is regarded as execution
expression - otherwise, as it belongs to 'untility' and can be placed
everywhere (ST_NONE).

There is on-going discussion/wordsmithing regarding the PURE restriction
with 'at(compilation)' – currently, it is rejected in pure procedures.

Tobias
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: Add OpenMP's error directive

Fortran part to the C/C++ implementation of
commit r12-3040-g0d973c0a0d90a0a302e7eda1a4d9709be3c5b102

gcc/fortran/ChangeLog:

	* dump-parse-tree.c (show_omp_clauses): Handle 'at', 'severity'
	and 'message' clauses.
	(show_omp_node, show_code_node): Handle EXEC_OMP_ERROR.
	* gfortran.h (gfc_statement): Add ST_OMP_ERROR.
	(gfc_omp_severity_type, gfc_omp_at_type): New.
	(gfc_omp_clauses): Add 'at', 'severity' and 'message' clause;
	use more bitfields + ENUM_BITFIELD.
	(gfc_exec_op): Add EXEC_OMP_ERROR.
	* match.h (gfc_match_omp_error): New.
	* openmp.c (enum omp_mask1): Add OMP_CLAUSE_(AT,SEVERITY,MESSAGE).
	(gfc_match_omp_clauses): Handle new clauses.
	(OMP_ERROR_CLAUSES, gfc_match_omp_error): New.
	(resolve_omp_clauses): Resolve new clauses.
	(omp_code_to_statement, gfc_resolve_omp_directive): Handle
	EXEC_OMP_ERROR.
	* parse.c (decode_omp_directive, next_statement,
	gfc_ascii_statement): Handle 'omp error'.
	* resolve.c (gfc_resolve_blocks): Likewise.
	* st.c (gfc_free_statement): Likewise.
	* trans-openmp.c (gfc_trans_omp_error): Likewise.
	(gfc_trans_omp_directive): Likewise.
	* trans.c (trans_code): Likewise.

libgomp/ChangeLog:

	* testsuite/libgomp.fortran/error-1.f90: New test.

gcc/testsuite/ChangeLog:

	* gfortran.dg/gomp/error-1.f90: New test.
	* gfortran.dg/gomp/error-2.f90: New test.
	* gfortran.dg/gomp/error-3.f90: New test.

 gcc/fortran/dump-parse-tree.c |  27 +-
 gcc/fortran/gfortran.h|  58 
 gcc/fortran/match.h   |   1 +
 gcc/fortran/openmp.c  | 124 +-
 gcc/fortran/parse.c   |  10 ++-
 gcc/fortran/resolve.c |   2 +
 gcc/fortran/st.c  |   1 +
 gcc/fortran/trans-openmp.c|  34 +++
 gcc/fortran/trans.c   |   1 +
 gcc/testsuite/gfortran.dg/gomp/error-1.f90|  51 +++
 gcc/testsuite/gfortran.dg/gomp/error-2.f90|  15 
 gcc/testsuite/gfortran.dg/gomp/error-3.f90|  88 ++
 libgomp/testsuite/libgomp.fortran/error-1.f90 |  78 
 13 files changed, 465 insertions(+), 25 deletions(-)

diff --git a/gcc/fortran/dump-parse-tree.c b/gcc/fortran/dump-parse-tree.c
index 92d9f9e054d..c75a0a9d095 100644
--- a/gcc/fortran/dump-parse-tree.c
+++ b/gcc/fortran/dump-parse-tree.c
@@ -1908,6 +1908,26 @@ show_omp_clauses (gfc_omp_clauses *omp_clauses)
   fputc (' ', dumpfile);
   fputs (memorder, dumpfile);
 }
+  if (omp_clauses->at != OMP_AT_UNSET)
+{
+  if (omp_clauses->at != OMP_AT_COMPILATION)
+	fputs (" AT (COMPILATION)", dumpfile);
+  else
+	fputs (" AT (EXECUTION)", dumpfile);
+}
+  if (omp_clauses->severity != OMP_SEVERITY_UNSET)
+{
+  if (omp_clauses->severity != OMP_SEVERITY_FATAL)
+	fputs (" SEVERITY (FATAL)", dumpfile);
+  else
+	fputs (" SEVERITY (WARNING)", dumpfile);
+}
+  if (omp_clauses->message)
+{
+  fputs (" ERROR (", dumpfile);
+  show_expr (omp_clauses->message);
+  fputc (')', dumpfile);
+}
 }
 
 /* Show a single OpenMP or OpenACC directive node and everything underneath it
@@ -1950,8 +1970,9 @@ show_omp_node (int level, gfc_code *c)
 case EXEC_OMP_DISTRIBUTE_SIMD: name = "DISTRIBUTE SIMD"; break;
 case EXEC_OMP_DO: name = "DO"; break;
 case EXEC_OMP_DO_SIMD: name = "DO SIMD"; break;
-case EXEC_OMP_LOOP: name = "LOOP"; break;
+case EXEC_OMP_ERROR: name = "ERROR"; break;
 case EXEC_OMP_FLUSH: name = "FLUSH"; break;
+case EXEC_OMP_LOOP: name = "LOOP"; break;
 case EXEC_OMP_MASKED: name = "MASKED"; break;
 case EXEC_OMP_MASKED_TASKLOOP: name = "MASKED TASKLOOP"; break;
 case EXEC_OMP_MASK

Re: [Patch] Fortran: Add OpenMP's error directive (was: [committed] openmp: Implement the error directive)

On Fri, Aug 20, 2021 at 12:00:10PM +0200, Tobias Burnus wrote:
> gcc/fortran/ChangeLog:
> 
>   * dump-parse-tree.c (show_omp_clauses): Handle 'at', 'severity'
>   and 'message' clauses.
>   (show_omp_node, show_code_node): Handle EXEC_OMP_ERROR.
>   * gfortran.h (gfc_statement): Add ST_OMP_ERROR.
>   (gfc_omp_severity_type, gfc_omp_at_type): New.
>   (gfc_omp_clauses): Add 'at', 'severity' and 'message' clause;
>   use more bitfields + ENUM_BITFIELD.
>   (gfc_exec_op): Add EXEC_OMP_ERROR.
>   * match.h (gfc_match_omp_error): New.
>   * openmp.c (enum omp_mask1): Add OMP_CLAUSE_(AT,SEVERITY,MESSAGE).
>   (gfc_match_omp_clauses): Handle new clauses.
>   (OMP_ERROR_CLAUSES, gfc_match_omp_error): New.
>   (resolve_omp_clauses): Resolve new clauses.
>   (omp_code_to_statement, gfc_resolve_omp_directive): Handle
>   EXEC_OMP_ERROR.
>   * parse.c (decode_omp_directive, next_statement,
>   gfc_ascii_statement): Handle 'omp error'.
>   * resolve.c (gfc_resolve_blocks): Likewise.
>   * st.c (gfc_free_statement): Likewise.
>   * trans-openmp.c (gfc_trans_omp_error): Likewise.
>   (gfc_trans_omp_directive): Likewise.
>   * trans.c (trans_code): Likewise.
> 
> libgomp/ChangeLog:
> 
>   * testsuite/libgomp.fortran/error-1.f90: New test.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gfortran.dg/gomp/error-1.f90: New test.
>   * gfortran.dg/gomp/error-2.f90: New test.
>   * gfortran.dg/gomp/error-3.f90: New test.

LGTM, thanks.

Jakub

Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

On Fri, Aug 20, 2021 at 11:45:33AM +0200, Harald Anlauf wrote:
> Hi Jakob,
> 
> thanks for the detailed explanation!
> 
> > The other much easier but uglier option is to use a temporary buffer:
> >   char buffer[21];
> >   sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, hwint_val);
> >   gfc_error ("... %s ...", ... buffer ...);
> > This works, as it uses the host sprintf i.e. *printf family for which
> > HOST_WIDE_INT_PRINT_DEC macro is designed.
> 
> The attached followup patch implements this.
> 
> Can anybody affected by current HEAD confirm that this fixes bootstrap?

I have verified it fixes i686-linux bootstrap.
But the new testcase doesn't trigger any of those new errors, is something
else in the testsuite covering those or do you have some short snippet that
could verify the errors work properly?

> diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
> index 492867e12cb..eaabbffca4d 100644
> --- a/gcc/fortran/simplify.c
> +++ b/gcc/fortran/simplify.c
> @@ -4552,11 +4552,12 @@ substring_has_constant_len (gfc_expr *e)
> 
>if (istart <= iend)
>  {
> +  char buffer[21];
>if (istart < 1)
>   {
> -   gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
> -  ") at %L below 1",
> -  istart, &ref->u.ss.start->where);
> +   sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, istart);
> +   gfc_error ("Substring start index (%s) at %L below 1",
> +  buffer, &ref->u.ss.start->where);
> return false;
>   }
> 
> @@ -4567,9 +4568,9 @@ substring_has_constant_len (gfc_expr *e)
>   length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
>if (iend > length)
>   {
> -   gfc_error ("Substring end index (" HOST_WIDE_INT_PRINT_DEC
> -  ") at %L exceeds string length",
> -  iend, &ref->u.ss.end->where);
> +   sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, iend);
> +   gfc_error ("Substring end index (%s) at %L exceeds string length",
> +  buffer, &ref->u.ss.end->where);
> return false;
>   }
>length = iend - istart + 1;


Jakub

Re: [PATCH] configure: Allow a host makefile fragment to override PIE flag settings.

2021-08-20 Thread Iain Sandoe

Hi Richard,

> On 20 Aug 2021, at 09:39, Richard Sandiford  wrote:
> 
> Iain Sandoe  writes:

>> This concerns the settings of flags (using the host makefile fragment) for
>> tools that will run on the host.
>> 
>> At present the (no)PIE flags are computed in gcc/configure but it is not
>> possible to override them (either from higher level Makefile or from the
>> command line).  Secondly the ordering of placement of flags assumes
>> ELF semantics are OK for ordering of -fno-PIE and -fPIC.
>> 
>> For Darwin, this introduces problems if fno-PIE causes PIC to be switched
>> off and the bootstrap compiler does not support mdynamic-no-pic (which
>> is the case when we bootstrap a 32b toolchain with clang).  This causes
>> the host files to be built '-static' which is not legal for user-space
>> code, and the build terminates with illegal relocations (so that bootstrap
>> fails).
>> 
>> This patch:
>> 
>> 1. Allows the computed PIE flags to be overridden by the top level
>>   configure.
>> 
>> 2. Allows a host fragment to set values on the basis of the configured
>>   host platform/version.
>> 
>> 3. Sets suitable values for the Darwin cases that currently fail.
>> 
>> tested on i686,powerpc,x86-64-darwin, x86_64, powerpc64-linux,
>> OK for master?
>> 
>> thanks
>> Iain
>> 
>> Signed-off-by: Iain Sandoe 
>> 
>> ChangeLog:
>> 
>>  * Makefile.in: Regenerated.
>>  * Makefile.tpl: Add PIE flags to HOST_EXPORTS and
>>  POST_STAGE1_HOST_EXPORTS as specified by the host makefile
>>  fragment.
>> 
>> config/ChangeLog:
>> 
>>  * mh-darwin: Specify suitable PIE/PIC flags for 32b Darwin
>>  hosts.
>> 
>> gcc/ChangeLog:
>> 
>>  * configure: Regenerate.
>>  * configure.ac: Allow top level configure to override assumed
>>  PIE flags.
>> ---
>> Makefile.in  | 14 ++
>> Makefile.tpl | 14 ++
>> config/mh-darwin | 20 
>> gcc/configure|  4 ++--
>> gcc/configure.ac |  4 ++--
>> 5 files changed, 48 insertions(+), 8 deletions(-)
>> 
>> diff --git a/Makefile.tpl b/Makefile.tpl
>> index 9adf4f94728..9523be5a761 100644
>> --- a/Makefile.tpl
>> +++ b/Makefile.tpl
>> @@ -576,6 +576,20 @@ all:
>> @host_makefile_frag@
>> ###
>> 
>> +# Allow host makefile fragment to override PIE settings.
>> +ifneq ($(STAGE1_NO_PIE_CFLAGS),)
>> +  HOST_EXPORTS += export NO_PIE_CFLAGS="$(STAGE1_NO_PIE_CFLAGS)";
>> +endif
>> +ifneq ($(STAGE1_NO_PIE_FLAG),)
>> +  HOST_EXPORTS += export NO_PIE_FLAG="$(STAGE1_NO_PIE_FLAG)";
>> +endif
>> +ifneq ($(BOOT_NO_PIE_CFLAGS),)
>> +  POSTSTAGE1_HOST_EXPORTS += export NO_PIE_CFLAGS="$(BOOT_NO_PIE_CFLAGS)";
>> +endif
>> +ifneq ($(BOOT_NO_PIE_FLAG),)
>> +  POSTSTAGE1_HOST_EXPORTS += export NO_PIE_FLAG="$(BOOT_NO_PIE_FLAG)";
>> +endif
> 
> Not really my area, but:

thanks for a very helpful review :)

> Is the NO_PIE_FLAG just for completeness?

Well, the NO_PIE_FLAG is used (the flags that are passed down lose the 
distinction of stage1/post-stage1), 
it’s the BOOT_NO_PIE_FLAG that is not used in this initial patch - however, it 
is used when we consider
the enable_host_shared in a later patch.

The idea was to produce a general facility even if the first user of it did not 
need all elements.

>  It didn't look like the
> patch overrode that.  Might be easier to leave it out if so, to avoid
> any risk of accidentally having an empty variable.

Yeah, perhaps trying to be general without a use-case, fails to test adequately.

… since the host shared stuff is now applied I’ll revise this consider that too 
so that both the
stage1 and post-stage1 cases should be exercised.

> Maybe it would be easier to have the makefile fragments determine
> something like CODE_MODEL_CFLAGS, which can be "-fPIC", "-mdynamic-no-pic",
> etc., and use:
> 
> COMPILER += $(NO_PIE_CFLAGS) $(CODE_MODEL_CFLAGS)

OK. I have misgivings about this - the problem is that:

-fPIC -fno-PIE != -fno-PIE -fPIC,  which is not obvious to many folks - who 
expect that
the “last edition of a flag will be the one in force”.

So the PIE-ness and the PIC-ness are decoupled in the configury but they need 
to be
ordered specifically for targets that want PIC code by default (FWIW, I don’t 
think Darwin
is the only default-PIC case here, from discussions on irc).

Thus the mechanism I adopted was to ensure that the no-PIE setting could be 
adjusted
to make sure that PIC was enforced where needed (the probably non-obvious 
aspect of
this is that mdynamic-no-pic only happens to work because it doesn’t exist for 
ELF and
thus doesn’t get switched off by the no-PIE logic).  All very fragile, sadly.

In fact, (on Darwin) PIE implies PIC - it is one case where ELF semantics are 
at odds with
Mach-O in a non-easy manner (I have caught most cases in 
darwin_override_options, but
that obv. doesn’t work at configure-time since the bootstrap complier is 
outside our control).

I would be concerned that any decoupled flag could become reordered w.r.t the 
no-PIE
and break things again, but I

RE: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.

2021-08-20 Thread Roger Sayle

Hi Richard,
Fortunately, it's (moderately) safe to mix COSTS_N_INSNS and 
COSTS_N_BYTES in the i386 backend.  The average length of an
x86_64 instruction in typical code is between 2 and 3 bytes, so
the definition of N*4 for COSTS_N_INSNS(N) and N*2 for
COSTS_N_BYTES(N) allows these be mixed, with no less approximation
error than the more common problem, that rtx_costs usually
encodes cycle counts (but converted to units of COSTS_N_INSNS).

The thing that I like about STV's use of a "gain calculator" is that
it allows much more accurate fine tuning.  Many passes, particularly
combine, base their decisions of obtaining total cost estimates (with
large approximate values) and then comparing those two totals.
As any physicist will confirm, it's much better to parameterize on
the observed delta between those two large approximate values.

But you're also right, that I need to run CSiBE with -m32 and get back
to you with the results.

Roger
--

-Original Message-
From: Richard Biener  
Sent: 20 August 2021 08:29
To: Roger Sayle 
Cc: GCC Patches 
Subject: Re: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.

On Thu, Aug 19, 2021 at 6:01 PM Roger Sayle  wrote:
>
>
> Doh!  ENOPATCH.
>
> -Original Message-
> From: Roger Sayle 
> Sent: 19 August 2021 16:59
> To: 'GCC Patches' 
> Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
>
>
> Back in June I briefly mentioned in one of my gcc-patches posts that a 
> change that should have always reduced code size, would mysteriously 
> occasionally result in slightly larger code (according to CSiBE):
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
>
> Investigating further, the cause turns out to be that x86_64's 
> scalar-to-vector (stv) pass is relying on poor estimates of the size 
> costs/benefits.  This patch tweaks the backend's compute_convert_gain 
> method to provide slightly more accurate values when compiling with -Os.
> Compilation without -Os is (should be) unaffected.  And for 
> completeness, I'll mention that the stv pass is a net win for code 
> size so it's much better to improve its heuristics than simply gate 
> the pass on !optimize_for_size.
>
> The net effect of this change is to save 1399 bytes on the CSiBE code 
> size benchmark when compiling with -Os.
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> and "make -k check" with no new failures.
>
> Ok for mainline?

+   /* xor (2 bytes) vs. xorps (3 bytes).  */
+   if (src == const0_rtx)
+ igain -= COSTS_N_BYTES (1);
+   /* movdi_internal vs. movv2di_internal.  */
+   /* => mov (5 bytes) vs. movaps (7 bytes).  */
+   else if (x86_64_immediate_operand (src, SImode))
+ igain -= COSTS_N_BYTES (2);

doesn't it need two GPR xor for 32bit DImode and two mov?  Thus the non-SSE 
cost should be times 'm'?  For const0_rtx we may eventually re-use the zero reg 
for the high part so that is eventually correct.

Also I'm missing a 'else' - in the default case there's no cost/benefit of 
using SSE vs. GPR regs?  For SSE it would be a constant pool load.

I also wonder, since I now see COSTS_N_BYTES for the first time (heh), whether 
with -Os we'd need to replace all COSTS_N_INSNS (1) scaling with COSTS_N_BYTES 
scaling?  OTOH costs_add_n_insns uses COSTS_N_INSNS for the size part as well.

That said, it looks like we're eventually mixing apples and oranges now or even 
previously?

Thanks,
Richard.

>
>
> 2021-08-19  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-features.c (compute_convert_gain): Provide
> more accurate values for CONST_INT, when optimizing for size.
> * config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
> * config/i386/i386.h (COSTS_N_BYTES): to here.
>
> Roger
> --
>

Re: [PATCH] configure: Allow a host makefile fragment to override PIE flag settings.

2021-08-20 Thread Richard Sandiford via Gcc-patches

Iain Sandoe  writes:
> Hi Richard,
>> Maybe it would be easier to have the makefile fragments determine
>> something like CODE_MODEL_CFLAGS, which can be "-fPIC", "-mdynamic-no-pic",
>> etc., and use:
>> 
>> COMPILER += $(NO_PIE_CFLAGS) $(CODE_MODEL_CFLAGS)
>
> OK. I have misgivings about this - the problem is that:
>
> -fPIC -fno-PIE != -fno-PIE -fPIC,  which is not obvious to many folks - who 
> expect that
> the “last edition of a flag will be the one in force”.
>
> So the PIE-ness and the PIC-ness are decoupled in the configury but they need 
> to be
> ordered specifically for targets that want PIC code by default (FWIW, I don’t 
> think Darwin
> is the only default-PIC case here, from discussions on irc).

Yeah, that's what the above was supposed to achieve.  In other words,
if you force non-PIE, you also need to follow that by $(CODE_MODEL_CFLAGS),
which restates whatever the base code model is.

If it's the decoupling you're worried about, then an alternative would
be to have:

  NO_PIE_CFLAGS="-fno-PIE \$(CODE_MODEL_CFLAGS)"

I was going to suggest that originally, but thought the above might
be simpler to follow later.

Thanks,
Richard

Re: [RFC] ldist: Recognize rawmemchr loop patterns

On Fri, Jun 25, 2021 at 12:23 PM Stefan Schulze Frielinghaus
 wrote:
>
> On Wed, Jun 16, 2021 at 04:22:35PM +0200, Richard Biener wrote:
> > On Mon, Jun 14, 2021 at 7:26 PM Stefan Schulze Frielinghaus
> >  wrote:
> > >
> > > On Thu, May 20, 2021 at 08:37:24PM +0200, Stefan Schulze Frielinghaus 
> > > wrote:
> > > [...]
> > > > > but we won't ever arrive here because of the niters condition.  But
> > > > > yes, doing the pattern matching in the innermost loop processing code
> > > > > looks good to me - for the specific case it would be
> > > > >
> > > > >   /* Don't distribute loop if niters is unknown.  */
> > > > >   tree niters = number_of_latch_executions (loop);
> > > > >   if (niters == NULL_TREE || niters == chrec_dont_know)
> > > > > ---> here?
> > > > > continue;
> > > >
> > > > Right, please find attached a new version of the patch where everything
> > > > is included in the loop distribution pass.  I will do a bootstrap and
> > > > regtest on IBM Z over night.  If you give me green light I will also do
> > > > the same on x86_64.
> > >
> > > Meanwhile I gave it a shot on x86_64 where the testsuite runs fine (at
> > > least the ldist-strlen testcase).  If you are Ok with the patch, then I
> > > would rebase and run the testsuites again and post a patch series
> > > including the rawmemchr implementation for IBM Z.
> >
> > @@ -3257,6 +3261,464 @@ find_seed_stmts_for_distribution (class loop
> > *loop, vec *work_list)
> >return work_list->length () > 0;
> >  }
> >
> > +static void
> > +generate_rawmemchr_builtin (loop_p loop, tree reduction_var,
> > +   data_reference_p store_dr, tree base, tree 
> > pattern,
> > +   location_t loc)
> > +{
> >
> > this new function needs a comment.  Applies to all of the new ones, btw.
>
> Done.
>
> > +  gcc_checking_assert (POINTER_TYPE_P (TREE_TYPE (base))
> > +  && TREE_TYPE (TREE_TYPE (base)) == TREE_TYPE 
> > (pattern));
> >
> > this looks fragile and is probably unnecessary as well.
> >
> > +  gcc_checking_assert (TREE_TYPE (reduction_var) == TREE_TYPE (base));
> >
> > in general you want types_compatible_p () checks which for pointers means
> > all pointers are compatible ...
>
> True, I removed both asserts.
>
> > (skipping stuff)
> >
> > @@ -3321,10 +3783,20 @@ loop_distribution::execute (function *fun)
> >   && !optimize_loop_for_speed_p (loop)))
> > continue;
> >
> > -  /* Don't distribute loop if niters is unknown.  */
> > +  /* If niters is unknown don't distribute loop but rather try to 
> > transform
> > +it to a call to a builtin.  */
> >tree niters = number_of_latch_executions (loop);
> >if (niters == NULL_TREE || niters == chrec_dont_know)
> > -   continue;
> > +   {
> > + if (transform_reduction_loop (loop))
> > +   {
> > + changed = true;
> > + loops_to_be_destroyed.safe_push (loop);
> > + if (dump_file)
> > +   fprintf (dump_file, "Loop %d transformed into a
> > builtin.\n", loop->num);
> > +   }
> > + continue;
> > +   }
> >
> > please look at
> >
> >   if (nb_generated_loops + nb_generated_calls > 0)
> > {
> >   changed = true;
> >   if (dump_enabled_p ())
> > dump_printf_loc (MSG_OPTIMIZED_LOCATIONS,
> >  loc, "Loop%s %d distributed: split to
> > %d loops "
> >  "and %d library calls.\n", str, loop->num,
> >  nb_generated_loops, nb_generated_calls);
> >
> > and follow the use of dump_* and MSG_OPTIMIZED_LOCATIONS so the
> > transforms are reported with -fopt-info-loop
>
> Done.
>
> > +
> > +  return transform_reduction_loop_1 (loop, load_dr, store_dr, 
> > reduction_var);
> > +}
> >
> > what's the point in tail-calling here and visually splitting the
> > function in half?
>
> In the first place I thought that this is more pleasant since in
> transform_reduction_loop_1 it is settled that we have a single load,
> store, and reduction variable.  After refactoring this isn't true
> anymore and I inlined the function and made this clear via a comment.
>
> >
> > (sorry for picking random pieces now ;))
> >
> > +  for (gphi_iterator bsi = gsi_start_phis (bb); !gsi_end_p (bsi);
> > +  gsi_next (&bsi), ++ninsns)
> > +   {
> >
> > this counts debug insns, I guess you want gsi_next_nondebug at least.
> > not sure why you are counting PHIs at all btw - for the loops you match
> > you are expecting at most two, one IV and eventually one for the virtual
> > operand of the store?
>
> Yes, I removed the counting for the phi loop and changed to
> gsi_next_nondebug for both loops.
>
> >
> > + if (gimple_has_volatile_ops (phi))
> > +   return false;
> >
> > PHIs never have volatile ops.
> >
> > + if (gimple_clobber_p (phi))
> > +

[PATCH] Refactor BB splitting of DRs for SLP group analysis

This uses the group_id computed to ensure DRs in different BBs do
not get merged into a DR group.  To achieve this we seed the
group from the BB index when group_ids are not computed and we
make sure to bump the group_id when advancing to the next BB for
BB SLP analysis.

This paves the way for relaxing the grouping for BB vectorization
by adjusting its group_id computation.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

Richard.

2021-08-20  Richard Biener  

* tree-vect-data-refs.c (dr_group_sort_cmp): Do not compare
BBs.
(vect_analyze_data_ref_accesses): Likewise.  Assign the BB
index as group_id when dataref_groups were not computed.
* tree-vect-slp.c (vect_slp_bbs): Bump current_group when
we advace to the next BB.
---
 gcc/tree-vect-data-refs.c | 21 -
 gcc/tree-vect-slp.c   |  2 ++
 2 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
index b7dde07fc5e..37f46d1aaa3 100644
--- a/gcc/tree-vect-data-refs.c
+++ b/gcc/tree-vect-data-refs.c
@@ -2831,12 +2831,6 @@ dr_group_sort_cmp (const void *dra_, const void *drb_)
   if (dra == drb)
 return 0;
 
-  /* DRs in different basic-blocks never belong to the same group.  */
-  int bb_index1 = gimple_bb (DR_STMT (dra))->index;
-  int bb_index2 = gimple_bb (DR_STMT (drb))->index;
-  if (bb_index1 != bb_index2)
-return bb_index1 < bb_index2 ? -1 : 1;
-
   /* Different group IDs lead never belong to the same group.  */
   if (dra_pair.second != drb_pair.second)
 return dra_pair.second < drb_pair.second ? -1 : 1;
@@ -2963,7 +2957,13 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
   datarefs_copy.create (datarefs.length ());
   for (unsigned i = 0; i < datarefs.length (); i++)
 {
-  int group_id = dataref_groups ? (*dataref_groups)[i] : 0;
+  int group_id;
+  /* If the caller computed DR grouping use that, otherwise group by
+basic blocks.  */
+  if (dataref_groups)
+   group_id = (*dataref_groups)[i];
+  else
+   group_id = gimple_bb (DR_STMT (datarefs[i]))->index;
   datarefs_copy.quick_push (data_ref_pair (datarefs[i], group_id));
 }
   datarefs_copy.qsort (dr_group_sort_cmp);
@@ -2999,13 +2999,8 @@ vect_analyze_data_ref_accesses (vec_info *vinfo,
 matters we can push those to a worklist and re-iterate
 over them.  The we can just skip ahead to the next DR here.  */
 
- /* DRs in a different BBs should not be put into the same
+ /* DRs in a different DR group should not be put into the same
 interleaving group.  */
- int bb_index1 = gimple_bb (DR_STMT (dra))->index;
- int bb_index2 = gimple_bb (DR_STMT (drb))->index;
- if (bb_index1 != bb_index2)
-   break;
-
  if (dra_group_id != drb_group_id)
break;
 
diff --git a/gcc/tree-vect-slp.c b/gcc/tree-vect-slp.c
index 4dcc70c5dd7..d2f6a16f220 100644
--- a/gcc/tree-vect-slp.c
+++ b/gcc/tree-vect-slp.c
@@ -5985,6 +5985,8 @@ vect_slp_bbs (const vec &bbs)
  &dataref_groups, current_group))
++current_group;
}
+  /* New BBs always start a new DR group.  */
+  ++current_group;
 }
 
   return vect_slp_region (bbs, datarefs, &dataref_groups, insns);
-- 
2.31.1

Aw: Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

> Gesendet: Freitag, 20. August 2021 um 12:12 Uhr
> Von: "Jakub Jelinek" 

> I have verified it fixes i686-linux bootstrap.
> But the new testcase doesn't trigger any of those new errors, is something
> else in the testsuite covering those or do you have some short snippet that
> could verify the errors work properly?

The testcase was designed to verify correct simplification of
substring length for a variety of cases.

I played with variations of the testcase by specifying illegal
substring bounds, but all these cases were caught in a different
spot with similar error messages.

I put those checks into substring_has_constant_len as sanity checks.
Do you want me to remove them?

Or are you trying to say that it is important to exercise them?

Thanks,
Harald

Re: Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

On Fri, Aug 20, 2021 at 12:53:33PM +0200, Harald Anlauf wrote:
> > Gesendet: Freitag, 20. August 2021 um 12:12 Uhr
> > Von: "Jakub Jelinek" 
> 
> > I have verified it fixes i686-linux bootstrap.
> > But the new testcase doesn't trigger any of those new errors, is something
> > else in the testsuite covering those or do you have some short snippet that
> > could verify the errors work properly?
> 
> The testcase was designed to verify correct simplification of
> substring length for a variety of cases.
> 
> I played with variations of the testcase by specifying illegal
> substring bounds, but all these cases were caught in a different
> spot with similar error messages.
> 
> I put those checks into substring_has_constant_len as sanity checks.
> Do you want me to remove them?
> 
> Or are you trying to say that it is important to exercise them?

Not a big deal, just wanted to verify that it actually works on i686-linux.
Please just go ahead and commit the patch to avoid the bootstrap failures.

Thanks.

Jakub

[PATCH, committed] Fix bootstrap breakage caused by r12-3033 (PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514)

Dear all,

I've just pushed the fix for the bootstrap breakage as confirmed by Jakub.


commit r12-3043-g12f22906d3c025e7edb60e3264dc9cd27a49e3e1
Author: Harald Anlauf 
Date:   Fri Aug 20 13:38:00 2021 +0200

Fortran - use temporary char buffer for passing HOST_WIDE_INT to gfc_error

gcc/fortran/ChangeLog:

PR fortran/100950
* simplify.c (substring_has_constant_len): Fix format string of
gfc_error, pass HOST_WIDE_INT bounds values via char buffer.


Sorry for the inconvenience.

Harald

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 492867e12cb..eaabbffca4d 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4552,11 +4552,12 @@ substring_has_constant_len (gfc_expr *e)

   if (istart <= iend)
 {
+  char buffer[21];
   if (istart < 1)
 	{
-	  gfc_error ("Substring start index (" HOST_WIDE_INT_PRINT_DEC
-		 ") at %L below 1",
-		 istart, &ref->u.ss.start->where);
+	  sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, istart);
+	  gfc_error ("Substring start index (%s) at %L below 1",
+		 buffer, &ref->u.ss.start->where);
 	  return false;
 	}

@@ -4567,9 +4568,9 @@ substring_has_constant_len (gfc_expr *e)
 	length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
   if (iend > length)
 	{
-	  gfc_error ("Substring end index (" HOST_WIDE_INT_PRINT_DEC
-		 ") at %L exceeds string length",
-		 iend, &ref->u.ss.end->where);
+	  sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, iend);
+	  gfc_error ("Substring end index (%s) at %L exceeds string length",
+		 buffer, &ref->u.ss.end->where);
 	  return false;
 	}
   length = iend - istart + 1;

[Patch] c-format.c/Fortran: Support %wd / host-wide integer in gfc_error (Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514)


On 20.08.21 11:16, Jakub Jelinek wrote:


Now, the non-Fortran FE diagnostic code actually has %wd for this (w
modifier like l modifier), which takes HOST_WIDE_INT/unsigned HOST_WIDE_INT
argument and prints it.

So, either you get through the hops to support that, unfortunately it isn't
just adding support for that in fortran/error.c (error_print) and some
helper functions, which wouldn't be that hard, just add 'w' next to 'l'
handling, TYPE_* for that and union member etc., but one needs to modify
c-family/c-format.c too to register the modifier so that gcc doesn't warn
about it and knows the proper argument type etc.


That's what the attached patch does.

Build on x86-64 GNU Linux; I tried to build it with -m32 (cf. my
previous email) but as I did not run into the original issue, this does
not proof much.

Comments? OK?

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
c-format.c/Fortran: Support %wd / host-wide integer in gfc_error

This patch adds support for the 'll' (long double)
and 'w' (HOST_WIDE_INT) length modifiers to the
Fortran FE diagnostic function (gfc_error, gfc_warning, ...)

gcc/c-family/ChangeLog:

	* c-format.c (gcc_gfc_length_specs): Add 'll' and 'w'.
	(gcc_gfc_char_table): Add T9L_LL and T9L_ULL to
	"di" and "u", respecitively; fill with BADLEN to match
	size of 'types'.
	(get_init_dynamic_hwi): Split off from ...
	(init_dynamic_diag_info): ... here. Call it.
	(init_dynamic_gfc_info): Call it.

gcc/fortran/ChangeLog:

	* error.c
	(error_uinteger): Take 'long long unsigned' instead
	of 'long unsigned' as argumpent.
	(error_integer): Take 'long long' instead of 'long'.
	(error_hwuint, error_hwint): New.
	(error_print): Update to handle 'll' and 'w'
	length modifiers.
	* simplify.c (substring_has_constant_len): Replace
	HOST_WIDE_INT_PRINT_DEC by '%wd'.

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 6fd0bb33d21..b4cb765a9d3 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -546,10 +546,11 @@ static const format_length_info strfmon_length_specs[] =
 };
 
 
-/* For now, the Fortran front-end routines only use l as length modifier.  */
+/* Length modifiers used by the fortran/error.c routines.  */
 static const format_length_info gcc_gfc_length_specs[] =
 {
-  { "l", FMT_LEN_l, STD_C89, NO_FMT, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "w", FMT_LEN_w, STD_C89, NO_FMT, 0 },
   { NO_FMT, NO_FMT, 0 }
 };
 
@@ -821,10 +822,10 @@ static const format_char_info gcc_cxxdiag_char_table[] =
 static const format_char_info gcc_gfc_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "cR", NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   T9L_LL,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  T9L_ULL,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN  }, "q", "", NULL },
+  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
+  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "cR", NULL },
 
   /* gfc conversion specifiers.  */
 
@@ -4843,12 +4844,73 @@ init_dynamic_asm_fprintf_info (void)
 }
 }
 
+static const format_length_info*
+get_init_dynamic_hwi (void)
+{
+  static tree hwi;
+  static format_length_info *diag_ls;
+
+  if (!hwi)
+{
+  unsigned int i;
+
+  /* Find the underlying type for HOST_WIDE_INT.  For the 'w'
+	 length modifier to work, one must have issued: "typedef
+	 HOST_WIDE_INT __gcc_host_wide_int__;" in one's source code
+	 prior to using that modifier.  */
+  if ((hwi = maybe_get_identifier ("__gcc_host_wide_int__")))
+	{
+	  hwi = identifier_global_value (hwi);
+	  if (hwi)
+	{
+	  if (TREE_CODE (hwi) != TYPE_DECL)
+		{
+		  error ("%<__gcc_host_wide_int__%> is not defined as a type");
+		  hwi = 0;
+		}
+	  else
+		{
+		  hwi = DECL_ORIGINAL_TYPE (hwi);
+		  gcc_assert (hwi);
+		  if (hwi != long_integer_type_node
+		  && hwi != long_long_integer_type_node)
+		{
+		  error ("%<__gcc_host_wide_int__%> is not defined"
+			 " as % or %");
+		  hwi = 0;
+		}

Re: [Patch] c-format.c/Fortran: Support %wd / host-wide integer in gfc_error (Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514)

On Fri, Aug 20, 2021 at 01:50:00PM +0200, Tobias Burnus wrote:
> On 20.08.21 11:16, Jakub Jelinek wrote:
> 
> > Now, the non-Fortran FE diagnostic code actually has %wd for this (w
> > modifier like l modifier), which takes HOST_WIDE_INT/unsigned HOST_WIDE_INT
> > argument and prints it.
> > 
> > So, either you get through the hops to support that, unfortunately it isn't
> > just adding support for that in fortran/error.c (error_print) and some
> > helper functions, which wouldn't be that hard, just add 'w' next to 'l'
> > handling, TYPE_* for that and union member etc., but one needs to modify
> > c-family/c-format.c too to register the modifier so that gcc doesn't warn
> > about it and knows the proper argument type etc.
> 
> That's what the attached patch does.
> 
> Build on x86-64 GNU Linux; I tried to build it with -m32 (cf. my
> previous email) but as I did not run into the original issue, this does
> not proof much.
> 
> Comments? OK?

LGTM (except that the last hunk won't apply anymore).

> c-format.c/Fortran: Support %wd / host-wide integer in gfc_error
> 
> This patch adds support for the 'll' (long double)
> and 'w' (HOST_WIDE_INT) length modifiers to the
> Fortran FE diagnostic function (gfc_error, gfc_warning, ...)
> 
> gcc/c-family/ChangeLog:
> 
>   * c-format.c (gcc_gfc_length_specs): Add 'll' and 'w'.
>   (gcc_gfc_char_table): Add T9L_LL and T9L_ULL to
>   "di" and "u", respecitively; fill with BADLEN to match
>   size of 'types'.
>   (get_init_dynamic_hwi): Split off from ...
>   (init_dynamic_diag_info): ... here. Call it.
>   (init_dynamic_gfc_info): Call it.
> 
> gcc/fortran/ChangeLog:
> 
>   * error.c
>   (error_uinteger): Take 'long long unsigned' instead
>   of 'long unsigned' as argumpent.
>   (error_integer): Take 'long long' instead of 'long'.
>   (error_hwuint, error_hwint): New.
>   (error_print): Update to handle 'll' and 'w'
>   length modifiers.
>   * simplify.c (substring_has_constant_len): Replace
>   HOST_WIDE_INT_PRINT_DEC by '%wd'.
> 
> diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
> index 6fd0bb33d21..b4cb765a9d3 100644
> --- a/gcc/c-family/c-format.c
> +++ b/gcc/c-family/c-format.c
> @@ -546,10 +546,11 @@ static const format_length_info strfmon_length_specs[] =
>  };
>  
>  
> -/* For now, the Fortran front-end routines only use l as length modifier.  */
> +/* Length modifiers used by the fortran/error.c routines.  */
>  static const format_length_info gcc_gfc_length_specs[] =
>  {
> -  { "l", FMT_LEN_l, STD_C89, NO_FMT, 0 },
> +  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
> +  { "w", FMT_LEN_w, STD_C89, NO_FMT, 0 },
>{ NO_FMT, NO_FMT, 0 }
>  };
>  
> @@ -821,10 +822,10 @@ static const format_char_info gcc_cxxdiag_char_table[] =
>  static const format_char_info gcc_gfc_char_table[] =
>  {
>/* C89 conversion specifiers.  */
> -  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
> -  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
> -  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
> -  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "cR", NULL },
> +  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   T9L_LL,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
> +  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  T9L_ULL,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN  }, "q", "", NULL },
> +  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
> +  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  
> BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "cR", NULL 
> },
>  
>/* gfc conversion specifiers.  */
>  
> @@ -4843,12 +4844,73 @@ init_dynamic_asm_fprintf_info (void)
>  }
>  }
>  
> +static const format_length_info*
> +get_init_dynamic_hwi (void)
> +{
> +  static tree hwi;
> +  static format_length_info *diag_ls;
> +
> +  if (!hwi)
> +{
> +  unsigned int i;
> +
> +  /* Find the underlying type for HOST_WIDE_INT.  For the 'w'
> +  length modifier to work, one must have issued: "typedef
> +  HOST_WIDE_INT __gcc_host_wide_int__;" in one's source code
> +  prior to using that modifier.  */
> +  if ((hwi = maybe_get_identifier ("__gcc_host_wide_int__")))
> + {
> +   hwi = identifier_global_value (hwi);
> +   if (hwi)
> + {
> +   if (TREE_CODE (hwi) != TYPE_DECL)
> + {
> +   error ("%<__gcc_host_wide_int__%> is not defined as a type");
> +   hwi = 0;
> + }
> +   else

Re: Aw: Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514


On 20.08.21 12:53, Harald Anlauf wrote:


I have verified it fixes i686-linux bootstrap.
But the new testcase doesn't trigger any of those new errors, is something
else in the testsuite covering those or do you have some short snippet that
could verify the errors work properly?

The testcase was designed to verify correct simplification of
substring length for a variety of cases.

I played with variations of the testcase by specifying illegal
substring bounds, but all these cases were caught in a different
spot with similar error messages.


I can confirm this. – I think in order to reduce the clutter, the
diagnostic probably should be removed.

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

[PATCH] ipa: add debug counter for IPA MODREF PTA


Hi.

We already have a IPA modref debug counter, but it's only used in 
tree-ssa-alias,
which is only a part of what IPA modref does. I used the dbg counter in 
isolation
of PR101949.

Ready for master?

gcc/ChangeLog:

* dbgcnt.def (DEBUG_COUNTER): New counter.
* gimple.c (gimple_call_arg_flags): Use it in IPA PTA.
---
 gcc/dbgcnt.def | 1 +
 gcc/gimple.c   | 5 +++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/dbgcnt.def b/gcc/dbgcnt.def
index 2345899ba68..c2bcc4eef5e 100644
--- a/gcc/dbgcnt.def
+++ b/gcc/dbgcnt.def
@@ -175,6 +175,7 @@ DEBUG_COUNTER (ipa_cp_bits)
 DEBUG_COUNTER (ipa_cp_values)
 DEBUG_COUNTER (ipa_cp_vr)
 DEBUG_COUNTER (ipa_mod_ref)
+DEBUG_COUNTER (ipa_mod_ref_pta)
 DEBUG_COUNTER (ipa_sra_params)
 DEBUG_COUNTER (ipa_sra_retvalues)
 DEBUG_COUNTER (ira_move)
diff --git a/gcc/gimple.c b/gcc/gimple.c
index 4e2653cab2f..bed7ff9e71c 100644
--- a/gcc/gimple.c
+++ b/gcc/gimple.c
@@ -48,7 +48,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "attr-fnspec.h"
 #include "ipa-modref-tree.h"
 #include "ipa-modref.h"
-
+#include "dbgcnt.h"
 
 /* All the tuples have their operand vector (if present) at the very bottom

of the structure.  Therefore, the offset required to find the
@@ -1601,7 +1601,8 @@ gimple_call_arg_flags (const gcall *stmt, unsigned arg)
  if ((modref_flags & EAF_DIRECT) && !(flags & EAF_DIRECT))
modref_flags &= ~EAF_DIRECT;
}
- flags |= modref_flags;
+ if (dbg_cnt (ipa_mod_ref_pta))
+   flags |= modref_flags;
}
 }
   return flags;
--
2.32.0

Aw: Re: Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

> Gesendet: Freitag, 20. August 2021 um 14:01 Uhr
> Von: "Tobias Burnus" 
> On 20.08.21 12:53, Harald Anlauf wrote:
> 
> > I played with variations of the testcase by specifying illegal
> > substring bounds, but all these cases were caught in a different
> > spot with similar error messages.
> 
> I can confirm this. – I think in order to reduce the clutter, the
> diagnostic probably should be removed.

I am unable to prove that we will never that check.  So how about:

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index eaabbffca4d..04722a8640c 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4552,27 +4552,13 @@ substring_has_constant_len (gfc_expr *e)
 
   if (istart <= iend)
 {
-  char buffer[21];
-  if (istart < 1)
-   {
- sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, istart);
- gfc_error ("Substring start index (%s) at %L below 1",
-buffer, &ref->u.ss.start->where);
- return false;
-   }
-
   /* For deferred strings use end index as proxy for length.  */
   if (e->ts.deferred)
length = iend;
   else
length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
-  if (iend > length)
-   {
- sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, iend);
- gfc_error ("Substring end index (%s) at %L exceeds string length",
-buffer, &ref->u.ss.end->where);
- return false;
-   }
+
+  gcc_assert (istart >= 1 && iend <= length);
   length = iend - istart + 1;
 }
   else

Or skip the gcc_assert and cross fingers... (we then would not even need to
verify ref->u.ss.length in that depth).

Harald

Re: [PATCH] enable ranger and caching in pass_waccess

2021-08-20 Thread Andrew MacLeod via Gcc-patches


On 8/19/21 7:09 PM, Martin Sebor via Gcc-patches wrote:

The attached patch changes the new access warning pass to use
the per-function ranger instance.  To do that it makes a number
of the global static functions members of the pass (that involved
moving one to a later point in the file, increasing the diff;
the body of the function hasn't changed otherwise).  Still more
functions remain.  At the same time, the patch also enables
the simple pointer_query cache to avoid repeatedly recomputing
the properties of related pointers into the same objects, and
makes the cache more effective (trunk fails to cache a bunch of
intermediate results).  Finally, the patch enhances the debugging
support for the cache.

Other than the ranger/caching the changes have no user-visible
effect.



Why are you calling enable/disable ranger if you are passing a ranger 
instance around instead of using the get_range_query (cfun)->range* calls?


Are you planning to transition to using the get_range_query() interface 
instead of keeping a range_query pointer in the pointer_query class?


Andrew

Re: [committed] openmp: Implement the error directive

Hi!

On 2021-08-20T11:45:29+0200, Jakub Jelinek via Gcc-patches 
 wrote:
> --- libgomp/error.c.jj2021-08-19 12:53:44.693106618 +0200
> +++ libgomp/error.c   2021-08-19 17:58:55.633203432 +0200

> +void
> +GOMP_warning (const char *msg, size_t msglen)
> +{
> +  if (msg && msglen == (size_t) -1)
> +gomp_error ("error directive encountered: %s", msg);
> +  else if (msg)
> +{
> +  fputs ("\nlibgomp: error directive encountered: ", stderr);
> +  fwrite (msg, 1, msglen, stderr);
> +  fputc ('\n', stderr);
> +}
> +  else
> +gomp_error ("error directive encountered");
> +}
> +
> +void
> +GOMP_error (const char *msg, size_t msglen)
> +{
> +  if (msg && msglen == (size_t) -1)
> +gomp_fatal ("fatal error: error directive encountered: %s", msg);
> +  else if (msg)
> +{
> +  fputs ("\nlibgomp: fatal error: error directive encountered: ", 
> stderr);
> +  fwrite (msg, 1, msglen, stderr);
> +  fputc ('\n', stderr);
> +  exit (EXIT_FAILURE);
> +}
> +  else
> +gomp_fatal ("fatal error: error directive encountered");
> +}

At least for nvptx offloading, and at least given the newlib sources I'm
using, the 'fputs'/'fwrite' calls here drag in 'isatty', which isn't
provided by my nvptx newlib at present, so we get, for example:

[...]
FAIL: libgomp.c/../libgomp.c-c++-common/declare_target-1.c (test for excess 
errors)
Excess errors:
unresolved symbol isatty
mkoffload: fatal error: 
[...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit 
status
[...]

..., and many more.

Now, there are many ways of addressing this...  The most simple one:
conditionalize these 'GOMP_warning'/'GOMP_error' definitions on
'#ifndef LIBGOMP_OFFLOADED_ONLY' is not possible here, because it's
permissible to use the 'error' directive also inside 'target' regions, as
far as I can tell?


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: [committed] openmp: Implement the error directive

Hi!

On 2021-08-20T15:11:45+0200, I wrote:
> On 2021-08-20T11:45:29+0200, Jakub Jelinek via Gcc-patches 
>  wrote:
>> --- libgomp/error.c.jj   2021-08-19 12:53:44.693106618 +0200
>> +++ libgomp/error.c  2021-08-19 17:58:55.633203432 +0200
>
>> +void
>> +GOMP_warning (const char *msg, size_t msglen)
>> +{
>> +  if (msg && msglen == (size_t) -1)
>> +gomp_error ("error directive encountered: %s", msg);
>> +  else if (msg)
>> +{
>> +  fputs ("\nlibgomp: error directive encountered: ", stderr);
>> +  fwrite (msg, 1, msglen, stderr);
>> +  fputc ('\n', stderr);
>> +}
>> +  else
>> +gomp_error ("error directive encountered");
>> +}
>> +
>> +void
>> +GOMP_error (const char *msg, size_t msglen)
>> +{
>> +  if (msg && msglen == (size_t) -1)
>> +gomp_fatal ("fatal error: error directive encountered: %s", msg);
>> +  else if (msg)
>> +{
>> +  fputs ("\nlibgomp: fatal error: error directive encountered: ", 
>> stderr);
>> +  fwrite (msg, 1, msglen, stderr);
>> +  fputc ('\n', stderr);
>> +  exit (EXIT_FAILURE);
>> +}
>> +  else
>> +gomp_fatal ("fatal error: error directive encountered");
>> +}
>
> At least for nvptx offloading, and at least given the newlib sources I'm
> using, the 'fputs'/'fwrite' calls here drag in 'isatty', which isn't
> provided by my nvptx newlib at present, so we get, for example:
>
> [...]
> FAIL: libgomp.c/../libgomp.c-c++-common/declare_target-1.c (test for 
> excess errors)
> Excess errors:
> unresolved symbol isatty
> mkoffload: fatal error: 
> [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 
> exit status
> [...]
>
> ..., and many more.
>
> Now, there are many ways of addressing this...  The most simple one:
> conditionalize these 'GOMP_warning'/'GOMP_error' definitions on
> '#ifndef LIBGOMP_OFFLOADED_ONLY' is not possible here, because it's
> permissible to use the 'error' directive also inside 'target' regions, as
> far as I can tell?

Ah, I just re-discovered 'libgomp/config/nvptx/error.c' -- I'll cook
something up.


Grüße
 Thomas
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

Re: [committed] openmp: Implement the error directive

On Fri, Aug 20, 2021 at 03:11:45PM +0200, Thomas Schwinge wrote:
> > --- libgomp/error.c.jj2021-08-19 12:53:44.693106618 +0200
> > +++ libgomp/error.c   2021-08-19 17:58:55.633203432 +0200
> 
> > +void
> > +GOMP_warning (const char *msg, size_t msglen)
> > +{
> > +  if (msg && msglen == (size_t) -1)
> > +gomp_error ("error directive encountered: %s", msg);
> > +  else if (msg)
> > +{
> > +  fputs ("\nlibgomp: error directive encountered: ", stderr);
> > +  fwrite (msg, 1, msglen, stderr);
> > +  fputc ('\n', stderr);
> > +}
> > +  else
> > +gomp_error ("error directive encountered");
> > +}
> > +
> > +void
> > +GOMP_error (const char *msg, size_t msglen)
> > +{
> > +  if (msg && msglen == (size_t) -1)
> > +gomp_fatal ("fatal error: error directive encountered: %s", msg);
> > +  else if (msg)
> > +{
> > +  fputs ("\nlibgomp: fatal error: error directive encountered: ", 
> > stderr);
> > +  fwrite (msg, 1, msglen, stderr);
> > +  fputc ('\n', stderr);
> > +  exit (EXIT_FAILURE);
> > +}
> > +  else
> > +gomp_fatal ("fatal error: error directive encountered");
> > +}
> 
> At least for nvptx offloading, and at least given the newlib sources I'm
> using, the 'fputs'/'fwrite' calls here drag in 'isatty', which isn't
> provided by my nvptx newlib at present, so we get, for example:

fputs/fputc/vfprintf/exit/stderr have been in use by error.c already before,
so this must be the fwrite call.
The above is for Fortran which doesn't have zero terminated strings.
Initially I wanted to use instead ... encountered: %.*s", (int) msglen, stderr);
which doesn't handle > 2GB messages, but with offloading who cares, nobody
sane would be trying to print > 2GB messages from offloading regions.

The question is if it should be achieved through copy of error.c in
config/nvptx/, or just include_next there with say fwrite redefined as a
macro that does fprintf ("%.*s", (int) msglen, msg, file)?

> Now, there are many ways of addressing this...  The most simple one:
> conditionalize these 'GOMP_warning'/'GOMP_error' definitions on
> '#ifndef LIBGOMP_OFFLOADED_ONLY' is not possible here, because it's
> permissible to use the 'error' directive also inside 'target' regions, as
> far as I can tell?

!$omp error at(execution) message('whatever')
can be used in offloading regions.

Jakub

Re: [Patch] c-format.c/Fortran: Support %wd / host-wide integer in gfc_error (Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514)


On 20.08.21 13:56, Jakub Jelinek wrote:


On Fri, Aug 20, 2021 at 01:50:00PM +0200, Tobias Burnus wrote:

Comments? OK?

LGTM (except that the last hunk won't apply anymore).


Now applied as r12-3044; I have now changed it to %wd ...

... but as discussed in another email in the thread, I think the
gfc_error should be removed (as unused); possibly with some additional
testcases running into the resolve errors for those (out of bound).

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
commit 1b507b1e3c58c063b9cf803dff80c28d4626cb5d
Author: Tobias Burnus 
Date:   Fri Aug 20 15:43:32 2021 +0200

c-format.c/Fortran: Support %wd / host-wide integer in gfc_error

This patch adds support for the 'll' (long double)
and 'w' (HOST_WIDE_INT) length modifiers to the
Fortran FE diagnostic function (gfc_error, gfc_warning, ...)

gcc/c-family/ChangeLog:

* c-format.c (gcc_gfc_length_specs): Add 'll' and 'w'.
(gcc_gfc_char_table): Add T9L_LL and T9L_ULL to
"di" and "u", respecitively; fill with BADLEN to match
size of 'types'.
(get_init_dynamic_hwi): Split off from ...
(init_dynamic_diag_info): ... here. Call it.
(init_dynamic_gfc_info): Call it.

gcc/fortran/ChangeLog:

* error.c
(error_uinteger): Take 'long long unsigned' instead
of 'long unsigned' as argumpent.
(error_integer): Take 'long long' instead of 'long'.
(error_hwuint, error_hwint): New.
(error_print): Update to handle 'll' and 'w'
length modifiers.
* simplify.c (substring_has_constant_len): Use '%wd'
in gfc_error.
---
 gcc/c-family/c-format.c | 142 +---
 gcc/fortran/error.c | 106 +---
 gcc/fortran/simplify.c  |  11 ++--
 3 files changed, 178 insertions(+), 81 deletions(-)

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 6fd0bb33d21..b4cb765a9d3 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -546,10 +546,11 @@ static const format_length_info strfmon_length_specs[] =
 };
 
 
-/* For now, the Fortran front-end routines only use l as length modifier.  */
+/* Length modifiers used by the fortran/error.c routines.  */
 static const format_length_info gcc_gfc_length_specs[] =
 {
-  { "l", FMT_LEN_l, STD_C89, NO_FMT, 0 },
+  { "l", FMT_LEN_l, STD_C89, "ll", FMT_LEN_ll, STD_C89, 0 },
+  { "w", FMT_LEN_w, STD_C89, NO_FMT, 0 },
   { NO_FMT, NO_FMT, 0 }
 };
 
@@ -821,10 +822,10 @@ static const format_char_info gcc_cxxdiag_char_table[] =
 static const format_char_info gcc_gfc_char_table[] =
 {
   /* C89 conversion specifiers.  */
-  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "", NULL },
-  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN  }, "q", "cR", NULL },
+  { "di",  0, STD_C89, { T89_I,   BADLEN,  BADLEN,  T89_L,   T9L_LL,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
+  { "u",   0, STD_C89, { T89_UI,  BADLEN,  BADLEN,  T89_UL,  T9L_ULL,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN  }, "q", "", NULL },
+  { "c",   0, STD_C89, { T89_I,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "", NULL },
+  { "s",   1, STD_C89, { T89_C,   BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN,  BADLEN, BADLEN, BADLEN, BADLEN }, "q", "cR", NULL },
 
   /* gfc conversion specifiers.  */
 
@@ -4843,12 +4844,73 @@ init_dynamic_asm_fprintf_info (void)
 }
 }
 
+static const format_length_info*
+get_init_dynamic_hwi (void)
+{
+  static tree hwi;
+  static format_length_info *diag_ls;
+
+  if (!hwi)
+{
+  unsigned int i;
+
+  /* Find the underlying type for HOST_WIDE_INT.  For the 'w'
+	 length modifier to work, one must have issued: "typedef
+	 HOST_WIDE_INT __gcc_host_wide_int__;" in one's source code
+	 prior to using that modifier.  */
+  if ((hwi = maybe_get_identifier ("__gcc_host_wide_int__")))
+	{
+	  hwi = identifier_global_value (hwi);
+	  if (hwi)
+	{
+	  if (TREE_CODE (hwi) != TYPE_DECL)
+		{
+		  error ("%<__gcc_host_wide_int__%> is not defined as a type");
+		  hwi = 0;
+		}
+	  else
+		{
+		  hwi = DECL_ORIGINAL_TYPE (hwi);
+		  gcc_assert (hwi);
+		  if (hwi != long_integer_type_node
+

Re: [Patch] c-format.c/Fortran: Support %wd / host-wide integer in gfc_error (Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514)

On Fri, Aug 20, 2021 at 03:47:54PM +0200, Tobias Burnus wrote:
> On 20.08.21 13:56, Jakub Jelinek wrote:
> 
> > On Fri, Aug 20, 2021 at 01:50:00PM +0200, Tobias Burnus wrote:
> > > Comments? OK?
> > LGTM (except that the last hunk won't apply anymore).
> 
> Now applied as r12-3044; I have now changed it to %wd ...
> 
> ... but as discussed in another email in the thread, I think the
> gfc_error should be removed (as unused); possibly with some additional
> testcases running into the resolve errors for those (out of bound).

Yeah, it makes sense that if there are errors in user code, they are
diagnosed during resolve, no matter whether simplification is possible
or not and simplification either asserts the errors aren't there or just
punts on the simplification, but doesn't repeat those diagnostics.

Jakub

Re: [committed] openmp: Implement the error directive

Hi Jakub!

On 2021-08-20T15:21:12+0200, Jakub Jelinek  wrote:
> On Fri, Aug 20, 2021 at 03:11:45PM +0200, Thomas Schwinge wrote:
>> > --- libgomp/error.c.jj2021-08-19 12:53:44.693106618 +0200
>> > +++ libgomp/error.c   2021-08-19 17:58:55.633203432 +0200
>>
>> > +void
>> > +GOMP_warning (const char *msg, size_t msglen)
>> > +{
>> > +  if (msg && msglen == (size_t) -1)
>> > +gomp_error ("error directive encountered: %s", msg);
>> > +  else if (msg)
>> > +{
>> > +  fputs ("\nlibgomp: error directive encountered: ", stderr);
>> > +  fwrite (msg, 1, msglen, stderr);
>> > +  fputc ('\n', stderr);
>> > +}
>> > +  else
>> > +gomp_error ("error directive encountered");
>> > +}
>> > +
>> > +void
>> > +GOMP_error (const char *msg, size_t msglen)
>> > +{
>> > +  if (msg && msglen == (size_t) -1)
>> > +gomp_fatal ("fatal error: error directive encountered: %s", msg);
>> > +  else if (msg)
>> > +{
>> > +  fputs ("\nlibgomp: fatal error: error directive encountered: ", 
>> > stderr);
>> > +  fwrite (msg, 1, msglen, stderr);
>> > +  fputc ('\n', stderr);
>> > +  exit (EXIT_FAILURE);
>> > +}
>> > +  else
>> > +gomp_fatal ("fatal error: error directive encountered");
>> > +}
>>
>> At least for nvptx offloading, and at least given the newlib sources I'm
>> using, the 'fputs'/'fwrite' calls here drag in 'isatty', which isn't
>> provided by my nvptx newlib at present, so we get, for example:
>
> fputs/fputc/vfprintf/exit/stderr have been in use by error.c already before,
> so this must be the fwrite call.

ACK.

> The above is for Fortran which doesn't have zero terminated strings.
> Initially I wanted to use instead ... encountered: %.*s", (int) msglen, 
> stderr);
> which doesn't handle > 2GB messages, but with offloading who cares, nobody
> sane would be trying to print > 2GB messages from offloading regions.

(... likewise from the host...)  ;-)

> The question is if it should be achieved through copy of error.c in
> config/nvptx/, or just include_next there with say fwrite redefined as a
> macro that does fprintf ("%.*s", (int) msglen, msg, file)?

(Right, that was also my plan.)

| Ah, I just re-discovered 'libgomp/config/nvptx/error.c' -- I'll cook
| something up.

So, guess what this newlib 'printf ("%.*s", [...]);' prints?
Yes: literal '%.*s'...  Next try: a 'fputc' loop?

See attached "[WIP] Make the OpenMP 'error' directive work for nvptx
offloading".

>> Now, there are many ways of addressing this...  The most simple one:
>> conditionalize these 'GOMP_warning'/'GOMP_error' definitions on
>> '#ifndef LIBGOMP_OFFLOADED_ONLY' is not possible here, because it's
>> permissible to use the 'error' directive also inside 'target' regions, as
>> far as I can tell?
>
> !$omp error at(execution) message('whatever')
> can be used in offloading regions.

Yes, generally works, but at least for Fortran, 'severity (fatal)' seems
to cause a hang, so another thing to be looked into...


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
>From d524726ff5f319658ef317afaf0077f43b8eccf8 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 20 Aug 2021 15:12:56 +0200
Subject: [PATCH] [WIP] Make the OpenMP 'error' directive work for nvptx
 offloading

The 'fputs'/'fwrite' calls in 'libgomp/error.c:GOMP_warning',
'libgomp/error.c:GOMP_error' drag in 'isatty', which isn't provided by my nvptx
newlib build at present, so we get, for example:

[...]
FAIL: libgomp.c/../libgomp.c-c++-common/declare_target-1.c (test for excess errors)
Excess errors:
unresolved symbol isatty
mkoffload: fatal error: [...]/build-gcc/./gcc/x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status
[...]

..., and many more.

Fix up for recent commit 0d973c0a0d90a0a302e7eda1a4d9709be3c5b102
"openmp: Implement the error directive".
---
 libgomp/config/nvptx/error.c | 2 ++
 libgomp/error.c  | 6 ++
 libgomp/testsuite/libgomp.c-c++-common/error-1.c | 5 +
 libgomp/testsuite/libgomp.fortran/error-1.f90| 6 ++
 4 files changed, 19 insertions(+)

diff --git a/libgomp/config/nvptx/error.c b/libgomp/config/nvptx/error.c
index dfa75da354f..40c907e0c9e 100644
--- a/libgomp/config/nvptx/error.c
+++ b/libgomp/config/nvptx/error.c
@@ -34,9 +34,11 @@
 #undef vfprintf
 #undef fputs
 #undef fputc
+#undef fwrite
 
 #define vfprintf(stream, fmt, list) vprintf (fmt, list)
 #define fputs(s, stream) printf ("%s", s)
 #define fputc(c, stream) printf ("%c", c)
+#define fwrite(ptr, size, nmemb, stream) printf ("%.*s", (int) (size * nmemb), ptr)
 
 #include "../../error.c"
diff --git a/libgomp/error.c b/libgomp/error.c
index 9b69a4b33fe..35fc823abf9 100644
--- a/libgomp/error.c
+++ b/libgomp/error.c
@@ -

[committed] libstdc++: Skip filesystem tests that depend on permissions [PR90787]

2021-08-20 Thread Jonathan Wakely via Gcc-patches

Tests that depend on filesystem permissions FAIL if run on Windows or as
root. Add a helper function to detect those cases, so the tests can skip
those checks gracefully.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/90787
* testsuite/27_io/filesystem/iterators/directory_iterator.cc:
Use new __gnu_test::permissions_are_testable() function.
* testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc: Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/iterators/directory_iterator.cc:
Likewise.
* 
testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/util/testsuite_fs.h (__gnu_test::permissions_are_testable):
New function to guess whether testing permissions will work.

Tested x86_64-linux. Committed to trunk.

commit 29b2fd371f18169141e20b90effa7205db68fb11
Author: Jonathan Wakely 
Date:   Fri Aug 20 14:51:06 2021

libstdc++: Skip filesystem tests that depend on permissions [PR90787]

Tests that depend on filesystem permissions FAIL if run on Windows or as
root. Add a helper function to detect those cases, so the tests can skip
those checks gracefully.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

PR libstdc++/90787
* testsuite/27_io/filesystem/iterators/directory_iterator.cc:
Use new __gnu_test::permissions_are_testable() function.
* 
testsuite/27_io/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/27_io/filesystem/operations/exists.cc: Likewise.
* testsuite/27_io/filesystem/operations/is_empty.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove.cc: Likewise.
* testsuite/27_io/filesystem/operations/remove_all.cc: Likewise.
* testsuite/27_io/filesystem/operations/status.cc: Likewise.
* testsuite/27_io/filesystem/operations/symlink_status.cc:
Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/experimental/filesystem/iterators/directory_iterator.cc:
Likewise.
* 
testsuite/experimental/filesystem/iterators/recursive_directory_iterator.cc:
Likewise.
* testsuite/experimental/filesystem/operations/exists.cc:
Likewise.
* testsuite/experimental/filesystem/operations/is_empty.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove.cc:
Likewise.
* testsuite/experimental/filesystem/operations/remove_all.cc:
Likewise.
* 
testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
* testsuite/util/testsuite_fs.h 
(__gnu_test::permissions_are_testable):
New function to guess whether testing permissions will work.

diff --git 
a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
index b49b4ae2fea..b432e8833ba 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/iterators/directory_iterator.cc
@@ -56,24 +56,26 @@ test01()
   ++iter;
   VERIFY( iter == end(iter) );
 
-#if !(defined(__MINGW32__) || defined(__MINGW64__))
-  // Test inaccessible directory.
-  ec = bad_ec;
-  permissions(p, fs::perms::none, ec);
-  VERIFY( !ec );
-  iter = fs::directory_iterator(p, ec);
-  VERIFY( ec );
-  VERIFY( iter == end(iter) );
+  if (__gnu_test::permissions_are_testable())
+  {
+// Test inaccessible directory.
+ec = bad_ec;
+permissions(p, fs::perms::none, ec);
+VERIFY( !ec );
+iter = fs::directory_iterator(p, ec);
+VERIFY( ec );
+VERIFY( iter == end(iter) );
 
-  // Test inaccessible directory, skipping permission d

Re: Aw: Re: Re: Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514


Hi Harald,

On 20.08.21 14:17, Harald Anlauf wrote:

I can confirm this. – I think in order to reduce the clutter, the
diagnostic probably should be removed.

I am unable to prove that we will never that check.  So how about:
diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index eaabbffca4d..04722a8640c 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4552,27 +4552,13 @@ substring_has_constant_len (gfc_expr *e)

if (istart <= iend)
  {
-  char buffer[21];
-  if (istart < 1)
-   {
- sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, istart);
- gfc_error ("Substring start index (%s) at %L below 1",
-buffer, &ref->u.ss.start->where);
- return false;
-   }
-
/* For deferred strings use end index as proxy for length.  */
if (e->ts.deferred)
 length = iend;
else
 length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
-  if (iend > length)
-   {
- sprintf (buffer, HOST_WIDE_INT_PRINT_DEC, iend);
- gfc_error ("Substring end index (%s) at %L exceeds string length",
-buffer, &ref->u.ss.end->where);
- return false;
-   }
+
+  gcc_assert (istart >= 1 && iend <= length);
length = iend - istart + 1;
  }
else

Or skip the gcc_assert and cross fingers... (we then would not even need to
verify ref->u.ss.length in that depth).


LGTM – I am fine with either variant, but I am slightly inclined to
removing the gcc_assert*
– as I believe that the existing checks come early enough and do seem to
work well.

Can you check ('grep') whether we already have sufficient coverage of
substring out of bounds?
We presumably have, but if you spot something, I think it makes sense to
add those to the testsuite.

Tobias
*I think GCC won't complain but if ENABLE_ASSERT_CHECKING is not defined,
there is then a pointless 'length =' assignment, overridden before it is
ever used.

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955

[PATCH] IPA: MODREF should skip EAF_* flags for indirect calls


Hello.

As showed in the PR, returning (EAF_NOCLOBBER | EAF_NOESCAPE) for an argument
that is a function pointer is problematic. Doing such a function call is a 
clobber.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR 101949

gcc/ChangeLog:

* ipa-modref.c (analyze_ssa_name_flags): Do not propagate EAF
  flags arguments for indirect functions.

gcc/testsuite/ChangeLog:

* gcc.dg/lto/pr101949_0.c: New test.
* gcc.dg/lto/pr101949_1.c: New test.

Co-Authored-By: Richard Biener 
---
 gcc/ipa-modref.c  |  3 +++
 gcc/testsuite/gcc.dg/lto/pr101949_0.c | 20 
 gcc/testsuite/gcc.dg/lto/pr101949_1.c |  4 
 3 files changed, 27 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr101949_0.c
 create mode 100644 gcc/testsuite/gcc.dg/lto/pr101949_1.c

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index fafd804d4ba..380ba6926b9 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -1715,6 +1715,9 @@ analyze_ssa_name_flags (tree name, vec 
&lattice, int depth,
  else if (callee && !ipa && recursive_call_p (current_function_decl,
  callee))
lattice[index].merge (0);
+ /* Ignore indirect calls (PR101949).  */
+ else if (callee == NULL_TREE)
+   lattice[index].merge (0);
  else
{
  int ecf_flags = gimple_call_flags (call);
diff --git a/gcc/testsuite/gcc.dg/lto/pr101949_0.c 
b/gcc/testsuite/gcc.dg/lto/pr101949_0.c
new file mode 100644
index 000..142dffe8780
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr101949_0.c
@@ -0,0 +1,20 @@
+/* { dg-lto-do run } */
+/* { dg-lto-options { "-O2 -fipa-pta -flto -flto-partition=1to1" } } */
+
+extern int bar (int (*)(int *), int *);
+
+static int x;
+
+static int __attribute__ ((noinline)) foo (int *p)
+{
+  *p = 1;
+  x = 0;
+  return *p;
+}
+
+int main ()
+{
+  if (bar (foo, &x) != 0)
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.dg/lto/pr101949_1.c 
b/gcc/testsuite/gcc.dg/lto/pr101949_1.c
new file mode 100644
index 000..871d15c9bfb
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/lto/pr101949_1.c
@@ -0,0 +1,4 @@
+int __attribute__((noinline,noclone)) bar (int (*fn)(int *), int *p)
+{
+  return fn (p);
+}
--
2.32.0

Re: [PATCH] more warning code refactoring

2021-08-20 Thread Martin Sebor via Gcc-patches


On 8/19/21 7:38 PM, Kewen.Lin wrote:

Hi Martin,

on 2021/8/20 上午12:30, Martin Sebor wrote:

On 8/19/21 9:03 AM, Martin Sebor wrote:

On 8/18/21 11:56 PM, Kewen.Lin wrote:

Hi David,

on 2021/8/19 上午11:26, David Edelsohn via Gcc-patches wrote:

Hi, Martin

A few PowerPC-specific testcases started failing yesterday on AIX with
a strange failure mode: the compiler runs out of memory.  As you may
expect from telling you this in an email reply to your patch, I have
bisected the failure and landed on your commit.  I can alternate
between the previous commit and your commit, and the failure
definitely appears with your patch, although I'm unsure how your patch
affected memory allocation in the compiler.  Maybe moving the code
changed a type of allocation or some memory no longer is being freed?




To get rid of GTY variable alloc_object_size_limit looks suspicious,
maybe tree objects returned by alloc_max_size after the change are out
of GC's tracking?

If the suspicion holds, the attached explorative diff may help.


I wouldn't expect that to make a difference.  There are thousands
of similar calls to build_int_cst() throughout the middle end.

Looking at the original patch, the change that I'm not sure about
and that shouldn't have been part of the refactoring is the call
to enable_ranger() in pass_waccess::execute().  It's something
I was planning to do next.  But even that I wouldn't expect to
eat up a whole 1GB or memory.


I have reproduced the excessive memory consumption with
the rlwimi-0.c test and a powerpc-linux cross-compiler, and
confirmed that it is indeed caused by the call to enable_ranger().
The test defines some six thousand functions so it seems that
unless each call enable_ranger() is paired with some call to
release the memory it allocates the memory leaks.

The removal of the alloc_object_size_limit global variable doesn't
have any effect on the test case.  The function that used it (and
now calls build_int_cst () instead) isn't called when the test
is compiled  (It's only called for calls to allocation functions
in the source and the test case has none.



Thanks for the clarification and sorry for noisy suspicion!


No worries.  It was a reasonable first guess.

Martin



BR,
Kewen


Let me take care of releasing the ranger memory.

Martin






BR,
Kewen


Previously, compiler bootstrap and all testcases ran with a data size
of 1GB.  After your change, the data size required for those
particular testcases jumped to 2GB.

The testcases are

gcc/testsuite/gcc.target/powerpc/rlwimi-[012].c

The failure is

cc1: out of memory allocating 65536 bytes after a total of 1608979296

This seems like a significant memory use regression.  Any ideas what happened?


Not really.  The patch just moved code around.  I didn't make any
changes that I'd expect to impact memory allocation to an appreciable
extent, at least not intentionally.  Let me look into it and get back
to you.

Martin



Thanks, David

Re: [ping] Re-unify 'omp_build_component_ref' and 'oacc_build_component_ref'

On Thu, Aug 19, 2021 at 10:13:56PM +0200, Thomas Schwinge wrote:
>   libgomp/
>   * testsuite/libgomp.c/address-space-1.c: New file.
> 
> Co-authored-by: Jakub Jelinek 
> ---
>  libgomp/testsuite/libgomp.c/address-space-1.c | 24 +++
>  1 file changed, 24 insertions(+)
>  create mode 100644 libgomp/testsuite/libgomp.c/address-space-1.c
> 
> diff --git a/libgomp/testsuite/libgomp.c/address-space-1.c 
> b/libgomp/testsuite/libgomp.c/address-space-1.c
> new file mode 100644
> index 000..90244db03b1
> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.c/address-space-1.c
> @@ -0,0 +1,24 @@
> +/* Verify OMP instances of variables with address space.  */
> +
> +/* { dg-do run { target i?86-*-* x86_64-*-* } } */
> +/* { dg-require-effective-target offload_device_nonshared_as } */
> +
> +#include 
> +
> +int __seg_fs a;
> +
> +int
> +main (void)
> +{
> +  // a = 123; // SIGSEGV
> +  int b;
> +#pragma omp target map(alloc: a) map(from: b)
> +  {
> +a = 321; // no SIGSEGV (given 'offload_device_nonshared_as')
> +asm volatile ("" : : : "memory");

Maybe better asm volatile ("" : : "g" (&a) : "memory");
so that the compiler doesn't think it could optimize it away to
just b = 321;
Ok with that change.

> +b = a;
> +  }
> +  assert (b == 321);
> +
> +  return 0;
> +}
> -- 
> 2.30.2
> 


Jakub

Re: [patch][version 6] add -ftrivial-auto-var-init and variable attribute "uninitialized" to gcc

2021-08-20 Thread Qing Zhao via Gcc-patches



> On Aug 19, 2021, at 8:54 AM, Qing Zhao via Gcc-patches 
>  wrote:
> 
> Breakpoint 1, expand_DEFERRED_INIT (stmt=0x7fffe96ae348) at 
> ../../latest-gcc/gcc/internal-fn.c:3021
> 3021mark_addressable (lhs);
> (gdb) call debug_tree(lhs)
>   type   size 
>  unit-size 
>  align:32 warn_if_not_align:0 symtab:0 alias-set 2 canonical-type 
> 0x7fffe959b2a0 precision:32
>  pointer_to_this >
>  visited var 
>  def_stmt temp1_5 = .DEFERRED_INIT (4, 2, 0, &"temp1"[0]);
>  version:5>
> 
> when I deleted:
> 
> if (TREE_CODE (lhs) == SSA_NAME
> lhs = SSA_NAME_VAR (lhs);
 
 but then using SSA_NAME_VAR is broken.  I suspect use_register_for_decl
 isn't the correct thing to look at.  I think we need to look at what
 the LHS expanded to if it is a SSA_VAR_P (that includes SSA names
 but also plain DECLs but not what we get from VLAs where we'd see
 *ptr).  So sth like
 
 bool reg_lhs;
 if (SSA_VAR_P (lhs))
  {
rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
reg_lhs = !MEM_P (tem);
/* If not MEM_P reg_lhs should be REG_P or SUBREG_P (but maybe
   also CONCAT or lowpart...?)  */
  }
 else
  {
gcc_assert (is_vla);
reg_lhs = false;
  }
 
 if (!reg_lhs)
  memset path
 else
  expand_assignment path
>>> 
>>> After making the following change:
>>> 
>>> +  bool reg_lhs = true;
>>> 
>>>  tree var_type = TREE_TYPE (lhs);
>>>  gcc_assert (init_type > AUTO_INIT_UNINITIALIZED);
>>> 
>>> -  if (is_vla || (!use_register_for_decl (lhs)))
>>> +  if (SSA_VAR_P (lhs))
>>> +{
>>> +  rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>>> +  reg_lhs = !MEM_P (tem);
>>> +}
>>> +  else
>>> +{
>>> +  gcc_assert (is_vla);
>>> +  reg_lhs = false;
>>> +}
>>> +
>>> +  if (!reg_lhs)
>>>{
>>> 
>>> I got exactly the same internal error that failed at expr.c:
>>> 
>>> 8436   /* We must have made progress.  */
>>> 8437   gcc_assert (inner != exp);
>>> 
>>> 
>>> Looks like for the following code:
>>> 
>>> 3026   if (!reg_lhs)
>>> 3027 {
>>> 3028 /* If this is a VLA or the variable is not in register,
>>> 3029expand to a memset to initialize it.  */
>>> 3030   mark_addressable (lhs);
>>> 3031   tree var_addr = build_fold_addr_expr (lhs);
>>> 3032 
>>> 3033   tree value = (init_type == AUTO_INIT_PATTERN) ?
>>> 3034 build_int_cst (integer_type_node,
>>> 3035INIT_PATTERN_VALUE) :
>>> 3036 integer_zero_node;
>>> 3037   tree m_call = build_call_expr (builtin_decl_implicit 
>>> (BUILT_IN_MEMSET),
>>> 3038  3, var_addr, value, var_size);
>>> 3039   /* Expand this memset call.  */
>>> 3040   expand_builtin_memset (m_call, NULL_RTX, TYPE_MODE (var_type));
>>> 3041 }
>>> 
>>> At line 3030, “lhs” could be a SSA_NAME.
>>> 
>>> My questions are:
>>> 
>>> 1. Could the routine “mark_addressable” and “build_fold_addr_expr” be 
>>> applied on SSA_NAME?
>> 
>> No.
>> 
>>> 2. Could the routine “expand_builtin_memset” be applied on the memset call 
>>> whose “DEST” is
>>>   an address expression on SSA_NAME? 
>> 
>> No.
>> 
>>> 3. Within “expand_DEFERRED_INIT”, can I call “expand_builtin_memset” to 
>>> expand .DEFERRED_INIT?
>> 
>> Well, not with "invalid" GENERIC I fear (address of a SSA name).
>> 
>>> I suspect that one of the above 3 might be the issue, but not sure which 
>>> one?
>> 
>> All of the above ;)  So while reg_lhs is now precise as to how the
>> variable will end up (the SSA name will end up as a stack variable in this
>> case, for whatever reason), expansion via memcpy only works when
>> working on the RTL representation.  The usual "workaround" (ugh)
>> is to use make_tree (), so in the !reg_lhs path you'd do
>> 
>> /* Get a new GENERIC representation for the RTL.  That's necesary
>>in case LHS is an SSA name.  */
>> lhs = make_tree (TREE_TYPE (lhs), tem);
> 
> This resolved the issue.

However, there was two runtime errors with CPU2017 with this change.

I guess that there might be some issue with “expand_expr” + “make_tree” here. 

So, I changed the code to use the alternative solution that you suggested:

> 
>> 
>> alternatively you could maybe do
>> 
>> if (DECL_P (lhs))
>>   {
>> +  rtx tem = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
>> +  reg_lhs = !MEM_P (tem);
>>   }
>> else if (TREE_CODE (lhs) == SSA_NAME)
>>   reg_lhs = true;
>> else
>>   reg_lhs = false;
>> 
>> thus treat SSA names as register storage always (even if it will end
>> up on the stack).

This did resolve all the issues including CPU2017 runtime errors.
With this solution, all SSA_NAMEs will be expanded through “expand_assignment” 
even though they are in stack.
the generated code will be correct. But the performance might be

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)


On 8/16/21 3:58 PM, Martin Liška wrote:

PING^2

@Honza: Can you please review the change?


I've tested the patch and apparently it's not enough for 
{gimple,generic}-match.o not clashing
in symbol names. Apparently there are more IPA clones that collide.

Leaving that for now.

Martin



Martin

On 6/23/21 3:53 PM, Martin Liška wrote:

On 5/21/21 10:29 AM, Martin Liška wrote:

On 5/20/21 5:55 PM, Jan Hubicka wrote:

Quick solution is to also modify partitioner to use the local symbol
names when doing incremental linking (those mixing in source code and
random seeds) to avoid clashes.


Good hint. I added hash based on object file name (I don't want to handle
proper string escaping) and -frandom-seed.

What do you think about the patch?
Thanks,
Martin


@Honza: Can you please take a look at this patch?

Cheers,
Martin

Re: [PATCH] Try LTO partial linking. (Was: Speed of compiling gimple-match.c)

On 6/1/21 3:19 PM, Richard Biener wrote:

On Tue, Jun 1, 2021 at 1:25 PM Martin Liška wrote:

On 6/1/21 9:42 AM, Richard Biener wrote:

On Tue, Jun 1, 2021 at 9:33 AM Martin Liška wrote:

@Richi: Can you please reply to this email?

Not sure what I should add here? Honza suggested to mangle the
promoted symbol names.

Sure and I sent a patch for that.

I don't
really like the idea to compile multiple TUs into one object. Also

What's problematic is that we'll have to wait for one another release to make
it useful
(if you don't want to build the current master with a snapshot compiler).

IMHO it's a bugfix. Note that I'm not sure what the intent of the change is.
If it is to speedup bootstrap then using LTO bootstrap would do the trick
as well (and better) if we'd simply process all of libbackend.a this way
(and thus avoid re-linking that once for each frontend).

When building a GCC package, we intentionally re-link them with all FEs.

If it is to speedup
dev (re-)builds then dragging in more files will make it build longer since
for example insn-recog.c may be unchanged but gimple-match.c not.

Yes, the original motivation was a speed up of a dev. build and yes, the shown
example is problematic. Right now, I'm leaving that as I'm not interested enough
in the parallel build of a simple source file.

Martin

+LTO_LINKER_FLAGS = -flto=auto --param=lto-partitions=16
-flinker-output=nolto-rel -r

why hard-code to 16 partitions? You're side-stepping the driver
diagnostic by doing
compile & link separately, but in the end we're going to want sth like Giulianos
-fparallel-compile that works transparently from within the driver, so
the "manual"
operation should try to follow that or alternatively a driver-only
wrapper around the
"manual" processing could be added whose implementation can be optimized later.

All right. Do you want me refreshing his -fparallel-compile option introduction?

I'm not sure if we've arrived at mergeable state - but if it's
reasonably possible
to hide s/-fparallel-compile/-flto -r -flinker-output=nolto-rel/ split
into compile & link
parts (avoiding the diagnostic on -flinker-output) in the driver then
I think that's
a very reasonable first step (after fixing the symbol privatization issue). The
GSOC project then was to elide the IL streaming from the high-level operation.

Richard,

Why do you use -flto=auto? There should be a jobserver active.

Yes, that should not be needed.

Martin

On 5/21/21 10:43 AM, Martin Liška wrote:

On 5/20/21 2:54 PM, Richard Biener wrote:

On Thu, May 20, 2021 at 2:34 PM Martin Liška wrote:

Hello.

I've got a patch candidate that leverages partial linking for a couple of
selected object files.

I'm sending make all-host- jX results for my machine:

before: 3m18s (user 32m52s)
https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/1dd5eae5001295ba0230a689f7edc67284c9b742/gcc-all-host.svg

after: 2m57m (user 35m)
https://gist.githubusercontent.com/marxin/223890df4d8d8e490b6b2918b77dacad/raw/d659b2187cf622167841efbbe6bc93cb33855fa9/gcc-all-host-partial-lto.svg

One can utilize it with:
make -j16 all-host PARTIAL_LTO=1

@Segher, Andrew: Can you please measure time improvement for your slow
bootstrap?
One can also tweak --param=lto-partitions=16 param value.

Thoughts?

You're LTO linking multiple objects here - that's almost as if you
were doing this
for the whole of libbackend.a ... so $(OBJS)_CLFAGS += -flto and in the
libbackend.a rule do a similar partial link trick.

Yeah, apart from that one can't likely do partial linking for an archive:

$ g++ -no-pie -flto=auto --param=lto-partitions=16 -flinker-output=nolto-rel -r
libbackend.a
collect2: fatal error: ld terminated with signal 11 [Segmentation fault], core
dumped
compilation terminated.

while ld.bfd immediately finishes.

That gets you half of a LTO bootstrap then.

So why did you go from applying this per-file to multiple files? Does $(LINKER)
have a proper rule to pick up a jobserver?

When upstreaming in any form you probably have to gate it on bootstrap-lto
being not active.

Sure, that's reasonable, we can likely detect a -flto option in $(COMPILE),
right?

One more thing I face is broken dependency:
$ make clean && make -j32 PARTIAL_LTO=1

g++ -fcf-protection -fno-PIE -c -g -DIN_GCC -fPIC-fno-exceptions
-fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings
-Wcast-qual -Wno-error=format-diag -Wmissing-format-attribute
-Woverloaded-virtual -pedantic -Wno-long-long -Wno-variadic-macros
-Wno-overlength-strings -fno-common -Wno-unused -DHAVE_CONFIG_H -I. -I.
-I/home/marxin/Programming/gcc/gcc -I/home/marxin/Programming/gcc/gcc/.
-I/home/marxin/Programming/gcc/gcc/../include
-I/home/marxin/Programming/gcc/gcc/../libcpp/include
-I/home/marxin/Programming/gcc/gcc/../libcody
-I/home/marxin/Programming/gcc/gcc/../libdecnumber
-I/home/marxin/Programming/gcc/gcc/../libdecnumber/bid -I../libdecnumbe

[committed] Further improvements to constant shifts on H8

2021-08-20 Thread Jeff Law via Gcc-patches


A few more shift-by-constant improvements on the H8.


For H8/300H arithmetic right shift 15 bits, we'd conceptually like to 
use the shift-by-16 idiom where we move half-words around.  Of course 
that loses a bit.  But we can save away that bit into C, shift-by-16, 
sign-extend, then rotate through the carry one bit the other way.  That 
saves nearly 100 cycles and is the same size.  We were already using 
this concept elsewhere.


For H8/300H SImode arithmetic rightshift by 28, 29 or 30 bits we use the 
shift-by-16 + shift-by-8 sequences, sign extend, then handle the 
residuals inline.  This saves on the order of 200 cycles vs a loop.  THe 
sequences are larger, but we're stil talking about burning at most 10 
bytes of instruction space to save 200 cycles.  Obviously with -Os we 
revert to the loop sequence.


For H8/S HImode logical shift by 12 (right or left) we switch from using 
a shift-by-8 idiom, clear bits, shift-by-2, shift-by-2 to a 2 rotate by 
2 positions + masking sequence which is the same size/speed on the 
hardware, but should simulate faster as it's one less instruction.


For H8/S SImode arithmetic right by 15 bits, we use the same trick as on 
the H8/300H.


For H8/S SImode logical shifts by 27 bits we can shave off 2 more bytes 
and 2 cycles by using a rotation based sequence over shift-by-16, 
shift-by-8, shift-by-2, shift-by-1 style sequence.


This has gone through the usual simulator testing without regressions.

Jeff
commit 5f80c6270de6ac79d819de50048b32351a6b97c3
Author: Jeff Law 
Date:   Fri Aug 20 11:19:05 2021 -0400

Further improvements to constant shifts for the H8

gcc/
* config/h8300/h8300.c (shift_alg_hi): Improve arithmetic shift 
right
by 15 bits for H8/300H and H8/S.  Improve logical shifts by 12
bits for H8/S.
(shift_alg_si): Improve arithmetic right shift by 28-30 bits for
H8/300H.  Improve arithmetic shift right by 15 bits for H8/S.
Improve logical shifts by 27 bits for H8/S.
(get_shift_alg): Corresponding changes.
(h8300_option_override): Revert to loops for -Os when profitable.

diff --git a/gcc/config/h8300/h8300.c b/gcc/config/h8300/h8300.c
index 0c4e5089791..8ccacecba79 100644
--- a/gcc/config/h8300/h8300.c
+++ b/gcc/config/h8300/h8300.c
@@ -213,9 +213,9 @@ static enum shift_alg shift_alg_hi[2][3][16] = {
 /*  01234567  */
 /*  89   10   11   12   13   14   15  */
 { INL, INL, INL, INL, INL, INL, INL, INL,
-  SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_ASHIFT   */
+  SPC, SPC, SPC, SPC, ROT, ROT, ROT, ROT }, /* SHIFT_ASHIFT   */
 { INL, INL, INL, INL, INL, INL, INL, INL,
-  SPC, SPC, SPC, SPC, SPC, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
+  SPC, SPC, SPC, SPC, ROT, ROT, ROT, ROT }, /* SHIFT_LSHIFTRT */
 { INL, INL, INL, INL, INL, INL, INL, INL,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
   }
@@ -237,9 +237,9 @@ static enum shift_alg shift_alg_si[2][3][32] = {
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_LSHIFTRT */
 { INL, INL, INL, INL, INL, INL, INL, LOP,
-  SPC, LOP, LOP, LOP, LOP, LOP, LOP, LOP,
+  SPC, LOP, LOP, LOP, LOP, LOP, LOP, SPC,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
-  SPC, SPC, SPC, SPC, LOP, LOP, LOP, SPC }, /* SHIFT_ASHIFTRT */
+  SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
   },
   {
 /* TARGET_H8300S  */
@@ -256,7 +256,7 @@ static enum shift_alg shift_alg_si[2][3][32] = {
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_LSHIFTRT */
 { INL, INL, INL, INL, INL, INL, INL, INL,
-  INL, INL, INL, INL, INL, INL, INL, LOP,
+  INL, INL, INL, INL, INL, INL, INL, SPC,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC,
   SPC, SPC, SPC, SPC, SPC, SPC, SPC, SPC }, /* SHIFT_ASHIFTRT */
   }
@@ -372,6 +372,9 @@ h8300_option_override (void)
   shift_alg_si[H8_300H][SHIFT_ASHIFTRT][25] = SHIFT_LOOP;
   shift_alg_si[H8_300H][SHIFT_ASHIFTRT][26] = SHIFT_LOOP;
   shift_alg_si[H8_300H][SHIFT_ASHIFTRT][27] = SHIFT_LOOP;
+  shift_alg_si[H8_300H][SHIFT_ASHIFTRT][28] = SHIFT_LOOP;
+  shift_alg_si[H8_300H][SHIFT_ASHIFTRT][29] = SHIFT_LOOP;
+  shift_alg_si[H8_300H][SHIFT_ASHIFTRT][30] = SHIFT_LOOP;
 
   /* H8S */
   shift_alg_hi[H8_S][SHIFT_ASHIFTRT][14] = SHIFT_LOOP;
@@ -3830,6 +3833,10 @@ get_shift_alg (enum shift_type shift_type, enum 
shift_mode shift_mode,
}
   else if (count == 15)
{
+ /* The basic idea here is to use the shift-by-16 idiom to make things
+small and efficient.  Of course, that loses one bit that we need,
+so we stuff the bit into C, shift by 16, then rotate the bit
+back in.  */
  switch (shift_type)
{
case SHIFT_ASHIFT:
@@ -3841,7 +3848,9

RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64

2021-08-20 Thread Jirui Wu via Gcc-patches

> -Original Message-
> From: Richard Biener 
> Sent: Friday, August 20, 2021 8:15 AM
> To: Jirui Wu 
> Cc: Richard Biener ; Andrew Pinski
> ; Richard Sandiford ;
> i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers
> 
> Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int)
> under -ffast-math on aarch64
> 
> On Thu, 19 Aug 2021, Jirui Wu wrote:
> 
> > Hi all,
> >
> > This patch generates FRINTZ instruction to optimize type casts.
> >
> > The changes in this patch covers:
> > * Generate FRINTZ for (double)(int) casts.
> > * Add new test cases.
> >
> > The intermediate type is not checked according to the C99 spec.
> > Overflow of the integral part when casting floats to integers causes
> undefined behavior.
> > As a result, optimization to trunc() is not invalid.
> > I've confirmed that Boolean type does not match the matching condition.
> >
> > Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master? If OK can it be committed for me, I have no commit rights.
> 
> +/* Detected a fix_trunc cast inside a float type cast,
> +   use IFN_TRUNC to optimize.  */
> +#if GIMPLE
> +(simplify
> +  (float (fix_trunc @0))
> +  (if (direct_internal_fn_supported_p (IFN_TRUNC, type,
> +  OPTIMIZE_FOR_BOTH)
> +   && flag_unsafe_math_optimizations
> +   && type == TREE_TYPE (@0))
> 
> types_match (type, TREE_TYPE (@0))
> 
> please.  Please perform cheap tests first (the flag test).
> 
> + (IFN_TRUNC @0)))
> +#endif
> 
> why only for GIMPLE?  I'm not sure flag_unsafe_math_optimizations is a good
> test here.  If you say we can use undefined behavior of any overflow of the
> fix_trunc operation what do we guard here?
> If it's Inf/NaN input then flag_finite_math_only would be more appropriate, if
> it's behavior for -0. (I suppose trunc (-0.0) == -0.0 and thus "wrong") then a
> && !HONOR_SIGNED_ZEROS (type) is missing instead.  If it's setting of FENV
> state and possibly trapping on overflow (but it's undefined?!) then
> flag_trapping_math covers the latter but we don't have any flag for eliding
> FENV state affecting transforms, so there the kitchen-sink
> flag_unsafe_math_optimizations might apply.
> 
> So - which is it?
> 
This change is only for GIMPLE because we can't test for the optab support 
without being in GIMPLE. direct_internal_fn_supported_p is defined 
only for GIMPLE. 

IFN_TRUNC's documentation mentions nothing for zero, NaNs/inf inputs.
So I think the correct guard is just flag_fp_int_builtin_inexact.
!flag_trapping_math because the operation can only still raise inexacts.

The new pattern is moved next to the place you mentioned.

Ok for master? If OK can it be committed for me, I have no commit rights.

Thanks,
Jirui
> Note there's also the pattern
> 
> /* Handle cases of two conversions in a row.  */ (for ocvt (convert float
> fix_trunc)  (for icvt (convert float)
>   (simplify
>(ocvt (icvt@1 @0))
>(with
> {
> ...
> 
> which is related so please put the new pattern next to that (the set of
> conversions handled there does not include (float (fix_trunc @0)))
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Jirui
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Generate IFN_TRUNC.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/merge_trunc1.c: New test.
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Tuesday, August 17, 2021 9:13 AM
> > > To: Andrew Pinski 
> > > Cc: Jirui Wu ; Richard Sandiford
> > > ; i...@airs.com; gcc-patches@gcc.gnu.org;
> > > rguent...@suse.de
> > > Subject: Re: [Patch][GCC][middle-end] - Generate FRINTZ for
> > > (double)(int) under -ffast-math on aarch64
> > >
> > > On Mon, Aug 16, 2021 at 8:48 PM Andrew Pinski via Gcc-patches  > > patc...@gcc.gnu.org> wrote:
> > > >
> > > > On Mon, Aug 16, 2021 at 9:15 AM Jirui Wu via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This patch generates FRINTZ instruction to optimize type casts.
> > > > >
> > > > > The changes in this patch covers:
> > > > > * Opimization of a FIX_TRUNC_EXPR cast inside a FLOAT_EXPR using
> > > IFN_TRUNC.
> > > > > * Change of corresponding test cases.
> > > > >
> > > > > Regtested on aarch64-none-linux-gnu and no issues.
> > > > >
> > > > > Ok for master? If OK can it be committed for me, I have no commit
> rights.
> > > >
> > > > Is there a reason why you are doing the transformation manually
> > > > inside forwprop rather than handling it inside match.pd?
> > > > Also can't this only be done for -ffast-math case?
> > >
> > > You definitely have to look at the intermediate type - that could be
> > > a uint8_t or even a boolean type.  So unless the intermediate type
> > > can represent all float values optimizing to trunc() is invalid.
> > > Also if you emit IFN_TRUNC you have to make sure there's target
> > > support - we don't emit calls to a library
> > > trunc() from an internal function call (and we wouldn't want to
> > > optimi

Re: [PATCH] analyzer: Fix PR analyzer/101980

2021-08-20 Thread David Malcolm via Gcc-patches

On Fri, 2021-08-20 at 21:55 +0530, Ankur Saini wrote:
> The patch fixes the test failures introduced by :
> 
> aef703cf982072427e74034f4c460a11c5e04b8e
> 1b34248527472496ca3fe2a07183beac8cf69041
> 
> Thanks 
> - Ankur

Thanks for fixing this.

The patch looks OK, apart from some minor whitespace issues - I think
it's using spaces rather than tabs, as columns aren't lining up as
expected in some places.  (does your code editor support visualizing
whitespace and support GNU indentation styles?).

Ideally these nits should be fixed - but assuming this passes bootstrap
it's OK to push to trunk (and don't try to fix indentation in places
where it's already broken; best to focus on fixing the test suite).

Dave

[PATCH] c++: error message for dependent template members [PR70417]

2021-08-20 Thread Anthony Sharp via Gcc-patches

Hi, hope everyone is well. I have a patch here for issue 70417
(https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70417). I'm still a GCC
noob, and this is probably the hardest thing I have ever coded in my
life, so please forgive any initial mistakes!

TLDR

This patch introduces a helpful error message when the user attempts
to put a template-id after a member access token (., -> or ::) in a
dependent context without having put the "template" keyword after the
access token.

CONTEXT

In C++, when referencing a class member (using ., -> or ::) in a
dependent context, if that member is a template, the template keyword
is required after the access token. For example:

template void MyDependentTemplate(T t)
{
  t.DoSomething(); /* Error - t is dependent. "<" is treated as a
less-than sign because DoSomething is assumed to be a non-template
identifier. */
  t.template DoSomething(); // Good.
}

PATCH EXPLANATION

In order to throw an appropriate error message we need to identify
and/or create a point in the compiler where the following conditions
are all simultaneously satisfied:

A) a class member access token (., ->, ::)
B) a dependent context
C) the thing being accessed is a template-id
D) No template keyword

A, B and D are relatively straightforward and I won't go into detail
about how that was achieved. The real challenge is C - figuring out
whether a name is a template-id.

Usually, we would look up the name and use that to determine whether
the name is a template or not. But, we cannot do that. We are in a
dependent context, so lookup cannot (in theory) find anything at the
point the access expression is parsed.

We (maybe) could defer checking until the template is actually
instantiated. But this is not the approach I have gone for, since this
to me sounded unnecessarily awkward and clunky. In my mind this also
seems like a syntax error, and IMO it seems more natural that syntax
errors should get caught as soon as they are parsed.

So, the solution I went for was to figure out whether the name was a
template by looking at the surrounding tokens. The key to this lies in
being able to differentiate between the start and end of a template
parameter list (< and >) and greater-than and less-than tokens (and
perhaps also things like << or >>). If the final > (if we find one at
all) does indeed mark the end of a class or function template, then it
will be followed by something like (, ::, or even just ;. On the other
hand, a greater-than operator would be followed by a name.

PERFORMANCE IMPACT

My concern with this approach was that checking this would actually
slow the compiler down. In the end it seemed the opposite was true. By
parsing the tokens to determine whether the name looks like a
template-id, we can cut out a lot of the template-checking code that
gets run trying to find a template when there is none, making compile
time generally faster. That being said, I'm not sure if the
improvement will be that substantial, so I did not feel it necessary
to introduce the template checking method everywhere where we could
possibly have run into a template.

I ran an ABAB test with the following code repeated enough times to
fill up about 10,000 lines:

ai = 1;
bi = 2;
ci = 3;
c.x = 4;
(&c)->x = 5;
mytemplateclass::y = 6;
c.template class_func();
(&c)->template class_func();
mytemplateclass::template class_func_static();
c2.class_func();
(&c2)->class_func();
myclass::class_func_static();
ai = 6;
bi = 5;
ci = 4;
c.x = 3;
(&c)->x = 2;
mytemplateclass::y = 1;
c.template class_func();
(&c)->template class_func();
mytemplateclass::template class_func_static();
c2.class_func();
(&c2)->class_func();
myclass::class_func_static();

It basically just contains a mixture of class access expressions plus
some regular assignments for good measure. The compiler was compiled
without any optimisations (and so slower than usual). In hindsight,
perhaps this test case assumes the worst-ish-case scenario since it
contains a lot of templates; most of the benefits of this patch occur
when parsing non-templates.

The compiler (clean) and the compiler with the patch applied (patched)
compiled the above code 20 times each in ABAB fashion. On average, the
clean compiler took 2.26295 seconds and the patched compiler took
2.25145 seconds (about 0.508% faster). Whether this improvement
remains or disappears when the compiler is built with optimisations
turned on I haven't tested. My main concern was just making sure it
didn't get any slower.

AWKWARD TEST CASES

You will see from the patch that it required the updating of a few
testcases (as well as one place within the compiler). I believe this
is because up until now, the compiler allowed the omission of the
template keyword in some places it shouldn't have. Take decltype34.C
for example:

struct impl
{
  template  static T create();
};

template()->impl::create())>
struct tester{};

GCC currently accepts this code. My patch causes it to be rejected.
For what it's worth, Clang also rejects

[PATCH] mips: msa: truncate immediate shift amount [PR101922]

2021-08-20 Thread Xi Ruoyao via Gcc-patches

When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off.
This causes untruncated immediate shift amount outputed into the asm,
and the GNU assembler refuses to assemble it.

Truncate immediate shift amount when outputing the asm instruction to
make GAS happy again.

gcc/

PR target/101922
* config/mips/mips-protos.h (mips_msa_output_shift_immediate):
  Declare.
* config/mips/mips.c (mips_msa_output_shift_immediate): New
  function.
* config/mips/mips-msa.md (vashl3, vashr3,
  vlshr3): Call it.

gcc/testsuite/

PR target/101922
* gcc.target/mips/pr101922.c: New test.
---
 gcc/config/mips/mips-msa.md  | 27 
 gcc/config/mips/mips-protos.h|  1 +
 gcc/config/mips/mips.c   | 21 ++
 gcc/testsuite/gcc.target/mips/pr101922.c | 19 +
 4 files changed, 59 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr101922.c

diff --git a/gcc/config/mips/mips-msa.md b/gcc/config/mips/mips-msa.md
index 3a67f25be56..d3b27d132ad 100644
--- a/gcc/config/mips/mips-msa.md
+++ b/gcc/config/mips/mips-msa.md
@@ -870,9 +870,12 @@ (define_insn "vlshr3"
  (match_operand:IMSA 1 "register_operand" "f,f")
  (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
   "ISA_HAS_MSA"
-  "@
-   srl.\t%w0,%w1,%w2
-   srli.\t%w0,%w1,%E2"
+{
+  if (which_alternative == 0)
+return "srl.\t%w0,%w1,%w2";
+
+  return mips_msa_output_shift_immediate("srli.\t%w0,%w1,%E2", 
operands);
+}
   [(set_attr "type" "simd_shift")
(set_attr "mode" "")])
 
@@ -882,9 +885,12 @@ (define_insn "vashr3"
  (match_operand:IMSA 1 "register_operand" "f,f")
  (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
   "ISA_HAS_MSA"
-  "@
-   sra.\t%w0,%w1,%w2
-   srai.\t%w0,%w1,%E2"
+{
+  if (which_alternative == 0)
+return "sra.\t%w0,%w1,%w2";
+
+  return mips_msa_output_shift_immediate("srai.\t%w0,%w1,%E2", 
operands);
+}
   [(set_attr "type" "simd_shift")
(set_attr "mode" "")])
 
@@ -894,9 +900,12 @@ (define_insn "vashl3"
  (match_operand:IMSA 1 "register_operand" "f,f")
  (match_operand:IMSA 2 "reg_or_vector_same_uimm6_operand" "f,Uuv6")))]
   "ISA_HAS_MSA"
-  "@
-   sll.\t%w0,%w1,%w2
-   slli.\t%w0,%w1,%E2"
+{
+  if (which_alternative == 0)
+return "sll.\t%w0,%w1,%w2";
+
+  return mips_msa_output_shift_immediate("slli.\t%w0,%w1,%E2", 
operands);
+}
   [(set_attr "type" "simd_shift")
(set_attr "mode" "")])
 
diff --git a/gcc/config/mips/mips-protos.h b/gcc/config/mips/mips-protos.h
index a5e4151b9e6..8d97eb36125 100644
--- a/gcc/config/mips/mips-protos.h
+++ b/gcc/config/mips/mips-protos.h
@@ -317,6 +317,7 @@ extern const char *mips_output_sync_loop (rtx_insn *, rtx 
*);
 extern unsigned int mips_sync_loop_insns (rtx_insn *, rtx *);
 extern const char *mips_output_division (const char *, rtx *);
 extern const char *mips_msa_output_division (const char *, rtx *);
+extern const char *mips_msa_output_shift_immediate (const char *, rtx *);
 extern const char *mips_output_probe_stack_range (rtx, rtx);
 extern bool mips_hard_regno_rename_ok (unsigned int, unsigned int);
 extern bool mips_linked_madd_p (rtx_insn *, rtx_insn *);
diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 89d1be6cea6..3d5be369b1c 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -14495,6 +14495,27 @@ mips_msa_output_division (const char *division, rtx 
*operands)
 }
   return s;
 }
+
+/* Return the assembly code for MSA immediate shift instructions,
+   which has the operands given by OPERANDS.  Truncate the shift amount
+   to make GAS happy.  */
+
+const char *
+mips_msa_output_shift_immediate (const char *shift, rtx *operands)
+{
+  rtx amount = operands[2];
+  machine_mode mode = amount->mode;
+
+  unsigned val = UINTVAL (CONST_VECTOR_ELT (amount, 0));
+  val &= GET_MODE_UNIT_BITSIZE (mode) - 1;
+  if (!val)
+return "";
+
+  rtx c = gen_int_mode (val, GET_MODE_INNER (mode));
+  operands[2] = gen_const_vec_duplicate (mode, c);
+
+  return shift;
+}
 
 /* Return true if destination of IN_INSN is used as add source in
OUT_INSN. Both IN_INSN and OUT_INSN are of type fmadd. Example:
diff --git a/gcc/testsuite/gcc.target/mips/pr101922.c 
b/gcc/testsuite/gcc.target/mips/pr101922.c
new file mode 100644
index 000..00a6e495ba2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr101922.c
@@ -0,0 +1,19 @@
+/* PR target/101922
+   This was triggering an assembler error with -O3 -mmsa -mloongson-mmi. */
+
+/* { dg-do assemble } */
+/* { dg-options "-mmsa -mloongson-mmi" } */
+
+typedef __INT8_TYPE__ i8;
+typedef __INT32_TYPE__ i32;
+
+i8 d[16];
+
+i32 f(i32 x) {
+  int i;
+  for (i = 0; i < 16; i++) {
+i32 t = (i32) d[i] >> 31;
+x &= t;
+  }
+  return x;
+}
-- 
2.33.0

[Patch] Fortran: Fix Bind(C) char-len check, add ptr-contiguous check


The following is about interoperability (BIND(C)) only.


* The patch adds a missing check for pointer + contiguous.
(Rejected to avoid copy-in issues? Or checking issues?)


* And it corrects an issue regarding len > 1 characters. While

 subroutine foo(x)
character(len=2) :: x(*)

is valid Fortran code (the argument can be "abce" or ['a','b','c','d'] or ...)
– and would work also with bind(C) as the len=2 does not need to be passed
as hidden argument as len is a constant.
However, it is not valid nonetheless.


OK? Comments?

Tobias


PS: Referenced locations in the standard (F2018):

C1554 If proc-language-binding-spec is specified for a procedure,
each of its dummy arguments shall be an interoperable procedure (18.3.6)
or a variable that is interoperable (18.3.4, 18.3.5), assumed-shape,
assumed-rank, assumed-type, of type CHARACTER with assumed length,
or that has the ALLOCATABLE or POINTER attribute.

18.3.1: "... If the type is character, the length type parameter is
interoperable if and only if its value is one. ..."

"18.3.4 Interoperability of scalar variables":
"... A named scalar Fortran variable is interoperable ... if it
is of type character12its length is not assumed or declared by
an expression that is not a constant expression."

18.3.5: Likewise but for arrays.

18.3.6 "... Fortran procedure interface is interoperable with a C function prototype 
..."
"(5) any dummy argument without the VALUE attribute corresponds
 to a formal parameter of the prototype that is of a pointer type, and 
either
 • the dummy argument is interoperable with an entity of the referenced type 
..."
(Remark: those are passed as byte stream)
 "• the dummy argument is a nonallocatable nonpointer variable of type
CHARACTER with assumed character length and the formal parameter is
a pointer to CFI_cdesc_t,
  • the dummy argument is allocatable, assumed-shape, assumed-rank, or
a pointer without the CONTIGUOUS attribute, and the formal parameter
is a pointer to CFI_cdesc_t, or
(Remark: those two use an array descriptor, also for explicit-size/assumed-size
arrays or for scalars.)
  • the dummy argument is assumed-type ..."

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Fortran: Fix Bind(C) char-len check, add ptr-contiguous check

Add F2018, 18.3.6 (5), pointer + contiguous is not permitted
check for dummies in BIND(C) procs.

Fix misreading of F2018, 18.3.4/18.3.5 + 18.3.6 (5) regarding
character dummies passed as byte stream to a bind(C) dummy arg:
Per F2018, 18.3.1 only len=1 is interoperable (since F2003).
F2008 added 'constant expression' for vars (F2018, 18.3.4/18.3.5),
applicable to dummy args per F2018, C1554. I misread this such
that len > 1 is permitted if len is a constant expr.

While the latter would work as character len=1 a(10) and len=2 a(5)
have the same storage sequence and len is fixed, it is still invalid.
Hence, it is now rejected again.

gcc/fortran/ChangeLog:

	* decl.c (gfc_verify_c_interop_param): Reject pointer with
	CONTIGUOUS attributes as dummy arg. Reject character len > 1
	when passed as byte stream.

gcc/testsuite/ChangeLog:

	* gfortran.dg/bind_c_char_6.f90:
	* gfortran.dg/bind_c_char_7.f90:
	* gfortran.dg/bind_c_char_8.f90:
	* gfortran.dg/bind_c_char_9.f90:
	* gfortran.dg/iso_c_binding_char_1.f90:
	* gfortran.dg/pr32599.f03:
	* gfortran.dg/bind_c_contiguous.f90: New test.

 gcc/fortran/decl.c |  39 ++---
 gcc/testsuite/gfortran.dg/bind_c_char_6.f90|  22 ++-
 gcc/testsuite/gfortran.dg/bind_c_char_7.f90|  15 +-
 gcc/testsuite/gfortran.dg/bind_c_char_8.f90|  12 +-
 gcc/testsuite/gfortran.dg/bind_c_char_9.f90| 161 -
 gcc/testsuite/gfortran.dg/bind_c_contiguous.f90|  33 +
 gcc/testsuite/gfortran.dg/iso_c_binding_char_1.f90 |   1 +
 gcc/testsuite/gfortran.dg/pr32599.f03  |   2 +-
 8 files changed, 164 insertions(+), 121 deletions(-)

diff --git a/gcc/fortran/decl.c b/gcc/fortran/decl.c
index 05081c40f1e..3ecffe79d9f 100644
--- a/gcc/fortran/decl.c
+++ b/gcc/fortran/decl.c
@@ -1551,11 +1551,15 @@ gfc_verify_c_interop_param (gfc_symbol *sym)
 			 sym->ns->proc_name->name);
 	}
 
+	  /* Per F2018, 18.3.6 (5), pointer + contiguous is not permitted.  */
+	  if (sym->attr.pointer && sym->attr.contiguous)
+	gfc_error ("Dummy argument %qs at %L may not be a pointer with "
+		   "CONTIGUOUS attribute as procedure %qs is BIND(C)",
+		   sym->name, &sym->declared_at, sym->ns->proc_name->name);
+
   /* Character strings are only C interoperable if they have a
-	 length of 1.  However, as argument they are either iteroperable
-	 when passed as descriptor (which requires len=: or len=*) or
-	 when h

Re: [PATCH] mips: msa: truncate immediate shift amount [PR101922]

2021-08-20 Thread Xi Ruoyao via Gcc-patches

On Sat, 2021-08-21 at 01:07 +0800, Xi Ruoyao via Gcc-patches wrote:
> When -mloongson-mmi is enabled, SHIFT_COUNT_TRUNCATED is turned off.
> This causes untruncated immediate shift amount outputed into the asm,
> and the GNU assembler refuses to assemble it.
> 
> Truncate immediate shift amount when outputing the asm instruction to
> make GAS happy again.
> 
> gcc/
> 
> PR target/101922
> * config/mips/mips-protos.h (mips_msa_output_shift_immediate):
>   Declare.
> * config/mips/mips.c (mips_msa_output_shift_immediate): New
>   function.
> * config/mips/mips-msa.md (vashl3, vashr3,
>   vlshr3): Call it.
> 
> gcc/testsuite/
> 
> PR target/101922
> * gcc.target/mips/pr101922.c: New test.

Forgot to mention: tested on mips64el-linux-gnu, OK for trunk?

Re: [PATCH v2] rs6000: Avoid buffer overruns

2021-08-20 Thread Segher Boessenkool

Hi!

On Thu, Aug 19, 2021 at 04:40:42PM -0500, Bill Schmidt wrote:
> I totally biffed the previous version of this patch, as it was built
> against an experimental tree instead of trunk.  Trying again...
> 
> Although safe_inc_pos avoids buffer overruns in rs6000-gen-builtins.c,
> there are some other routines where we fail to detect the possibility.
> Clean those up!  (Also, safe_inc_pos is not quite right itself.)

>   PR target/101830
>   * config/rs6000/rs6000-gen-builtins.c (consume_whitespace):
>   Diagnose buffer overrun.

Please don't break changelog lines unnecessary.

>   (safe_inc_pos): Fix overrun detection.
>   (match_identifier): Diagnose buffer overrun.
>   (match_integer): Likewise.
>   (match_to_right_bracket): Likewise.

Okay for trunk.  Thanks!


Segher

Re: [Patch v2] C, C++, Fortran, OpenMP: Add support for device-modifiers for 'omp target device'

2021-08-20 Thread Marcel Vollweiler


Hi Jakub,

this is the second version of the patch for the device-modifiers for
'omp target device'.

Am 20.07.2021 um 15:30 schrieb Jakub Jelinek:

On Wed, Jul 07, 2021 at 07:59:58PM +0200, Marcel Vollweiler wrote:

OpenMP: Add support for device-modifiers for 'omp target device'

gcc/c/ChangeLog:

 * c-parser.c (c_parser_omp_clause_device): Add support for
 device-modifiers for 'omp target device'.

gcc/cp/ChangeLog:

 * parser.c (cp_parser_omp_clause_device): Add support for
 device-modifiers for 'omp target device'.

gcc/fortran/ChangeLog:

 * openmp.c (gfc_match_omp_clauses): Add support for
 device-modifiers for 'omp target device'.

gcc/testsuite/ChangeLog:

 * c-c++-common/gomp/target-device-1.c: New test.
 * c-c++-common/gomp/target-device-2.c: New test.
 * gfortran.dg/gomp/target-device-1.f90: New test.
 * gfortran.dg/gomp/target-device-2.f90: New test.



  static tree
  c_parser_omp_clause_device (c_parser *parser, tree list)
  {
location_t clause_loc = c_parser_peek_token (parser)->location;
+  location_t expr_loc;
+  c_expr expr;
+  tree c, t;
+
matching_parens parens;
-  if (parens.require_open (parser))
+  if (!parens.require_open (parser))
+return list;
+
+  int pos = 1;
+  int pos_colon = 0;
+  while (c_parser_peek_nth_token_raw (parser, pos)->type == CPP_NAME
+ || c_parser_peek_nth_token_raw (parser, pos)->type == CPP_COLON
+ || c_parser_peek_nth_token_raw (parser, pos)->type == CPP_COMMA)


Why CPP_COMMA?  The OpenMP 5.0/5.1/5.2 grammar only supports a single device
modifier.
So please simplify it to just an
   if (c_parser_next_token_is (parser, CPP_NAME)
   && c_parser_peek_2nd_token (parser, 2)->type == CPP_COLON)
{
and check there just for the two modifiers.
   const char *p
  = IDENTIFIER_POINTER (c_parser_peek_token (parser)->value);
   if (strcmp ("ancestor", p) == 0)
 ...
   else if (strcmp ("device-num", p) == 0)
  ;
   else
 error_at (..., "expected % or %");
 }
Similarly for C++.


The parser files for C and C++ are simplyfied accordingly.



Also, even if we sorry on device(ancestor: ...), it would be nice if you
in tree.h define OMP_CLAUSE_DEVICE_ANCESTOR macro (with
   (OMP_CLAUSE_SUBCODE_CHECK (NODE, OMP_CLAUSE_DEVICE)->base.public_flag)
definition), set it, sorry later on it (e.g. omp-expand.c) only if it
survived till then (wasn't removed because of other errors) and diagnose
the various restrictions/requirements on device(ancestor:).


I changed it as you proposed. I marked the tests for "sorry,
unimplemented: 'ancestor' not yet supported" with xfail because a
previous sorry for "requires reverse_offload" suppresses the message for
'ancestor'. "reverse_offload" is explicitly needed due to the
specificated ancestor restrictions (OpenMP specification p. 175, l. 1).


In particular:
1) that OMP_CLAUSE_DEVICE clauses with OMP_CLAUSE_DEVICE_ANCESTOR
only appear on OMP_TARGET and not on other constructs
(this can be easily tested e.g. during gimplification, when
gimplify_scan_omp_clauses sees OMP_CLAUSE_DEVICE with
OMP_CLAUSE_DEVICE_ANCESTOR and code != OMP_TARGET, diagnose)
2) that if after the usual fully folding the argument is INTEGER_CST,
it is equal to 1 (the spec says must evaluate to 1, but doesn't say
it has to be a constant, so it can evaluate to 1 at runtime but if it is
a constant other than 1, we know it will not evaluate to 1); this can be
done in *finish_omp_clauses
3) that omp_requires_mask has OMP_REQUIRES_REVERSE_OFFLOAD set; this should
be checked during the parsing
4) only the device, firstprivate, private, defaultmap, and map clauses may
appear on the construct; can be also done during gimplification, there is
at most one device clause, so walking all clauses when we see
OMP_CLAUSE_DEVICE_ANCESTOR is still linear complexity
5) no OpenMP constructs or calls to OpenMP API runtime routines are allowed 
inside
the corresponding target region (this is something that should be checked
in omp-low.c region nesting code, we already have similar restrictions
for e.g. the loop construct)
Everything should be covered by testcases.


Tests were added for all cases.



  Jakub



I tested on x86_64-linux with nvptx offloading with no regressions.

Marcel
-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
Add support for device-modifiers for 'omp target device'.

'device_num' and 'ancestor' are now parsed on target device constructs for C,
C++, and Fortran (see OpenMP specification 5.0, p. 170). When 'ancestor' is
 used, then 'sorry, not supported' is output. Moreover, the restrictions for
'ancestor' are implemented (see OpenMP specification 5.0, p. 174f).

gcc/c/ChangeLog:

* c-

[PATCH v2 3/6] rs6000: Simplify some SSE4.1 "test" intrinsics

Copy some simple redirections from i386 , for:
- _mm_test_all_zeros
- _mm_test_all_ones
- _mm_test_mix_ones_zeros

2021-08-20  Paul A. Clarke  

gcc
* config/rs6000/smmintrin.h (_mm_test_all_zeros,
_mm_test_all_ones, _mm_test_mix_ones_zeros): Replace.
---
v2:
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h | 30 --
 1 file changed, 4 insertions(+), 26 deletions(-)

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index 505fe4ce22a8..363534cb06a2 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -379,34 +379,12 @@ _mm_testnzc_si128 (__m128i __A, __m128i __B)
   return _mm_testz_si128 (__A, __B) == 0 && _mm_testc_si128 (__A, __B) == 0;
 }
 
-__inline int
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_test_all_zeros (__m128i __A, __m128i __mask)
-{
-  const __v16qu __zero = {0};
-  return vec_all_eq (vec_and ((__v16qu) __A, (__v16qu) __mask), __zero);
-}
+#define _mm_test_all_zeros(M, V) _mm_testz_si128 ((M), (V))
 
-__inline int
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_test_all_ones (__m128i __A)
-{
-  const __v16qu __ones = vec_splats ((unsigned char) 0xff);
-  return vec_all_eq ((__v16qu) __A, __ones);
-}
+#define _mm_test_all_ones(V) \
+  _mm_testc_si128 ((V), _mm_cmpeq_epi32 ((V), (V)))
 
-__inline int
-__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-_mm_test_mix_ones_zeros (__m128i __A, __m128i __mask)
-{
-  const __v16qu __zero = {0};
-  const __v16qu __Amasked = vec_and ((__v16qu) __A, (__v16qu) __mask);
-  const int any_ones = vec_any_ne (__Amasked, __zero);
-  const __v16qu __notA = vec_nor ((__v16qu) __A, (__v16qu) __A);
-  const __v16qu __notAmasked = vec_and ((__v16qu) __notA, (__v16qu) __mask);
-  const int any_zeros = vec_any_ne (__notAmasked, __zero);
-  return any_ones * any_zeros;
-}
+#define _mm_test_mix_ones_zeros(M, V) _mm_testnzc_si128 ((M), (V))
 
 extern __inline __m128i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
-- 
2.27.0

[PATCH v2 5/6] rs6000: Support more SSE4.1 "cmp", "mul", "pack" intrinsics

Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for:
- _mm_cmpeq_epi64, _mm_cmpgt_epi64
- _mm_mullo_epi32, _mm_mul_epi32
- _mm_packus_epi32

from gcc/testsuite/gcc.target/i386.

2021-08-20  Paul A. Clarke  

gcc
* config/rs6000/smmintrin.h (_mm_cmpeq_epi64, _mm_cmpgt_epi64,
_mm_mullo_epi32, _mm_mul_epi32, _mm_packus_epi32): New.

gcc/testsuite
* gcc.target/powerpc/pr78102.c: Copy from gcc.target/i386,
adjust dg directives to suit.
* gcc.target/powerpc/sse4_1-packusdw.c: Same.
* gcc.target/powerpc/sse4_1-pcmpeqq.c: Same.
* gcc.target/powerpc/sse4_1-pmuldq.c: Same.
* gcc.target/powerpc/sse4_1-pmulld.c: Same.
* gcc.target/powerpc/sse4_2-pcmpgtq.c: Same.
---
v2:
- Added "extern" to functions to maintain compatible decorations with
  like implementations in gcc/config/i386.
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h | 41 +++
 gcc/testsuite/gcc.target/powerpc/pr78102.c| 23 ++
 .../gcc.target/powerpc/sse4_1-packusdw.c  | 73 +++
 .../gcc.target/powerpc/sse4_1-pcmpeqq.c   | 46 
 .../gcc.target/powerpc/sse4_1-pmuldq.c| 51 +
 .../gcc.target/powerpc/sse4_1-pmulld.c| 46 
 .../gcc.target/powerpc/sse4_2-pcmpgtq.c   | 46 
 7 files changed, 326 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr78102.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pcmpeqq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmuldq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmulld.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_2-pcmpgtq.c

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index fdef6674d16c..c04d2bb5b6d3 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -386,6 +386,15 @@ _mm_testnzc_si128 (__m128i __A, __m128i __B)
 
 #define _mm_test_mix_ones_zeros(M, V) _mm_testnzc_si128 ((M), (V))
 
+#ifdef _ARCH_PWR8
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpeq_epi64 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_cmpeq ((__v2di)__X, (__v2di)__Y);
+}
+#endif
+
 extern __inline __m128i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_min_epi8 (__m128i __X, __m128i __Y)
@@ -444,6 +453,22 @@ _mm_max_epu32 (__m128i __X, __m128i __Y)
 
 extern __inline __m128i
 __attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mullo_epi32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_mul ((__v4su)__X, (__v4su)__Y);
+}
+
+#ifdef _ARCH_PWR8
+__inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_mul_epi32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_mule ((__v4si)__X, (__v4si)__Y);
+}
+#endif
+
+__inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
 _mm_cvtepi8_epi16 (__m128i __A)
 {
   return (__m128i) vec_unpackh ((__v16qi)__A);
@@ -607,4 +632,20 @@ _mm_minpos_epu16 (__m128i __A)
   return __r.__m;
 }
 
+__inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_packus_epi32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_packsu ((__v4si)__X, (__v4si)__Y);
+}
+
+#ifdef _ARCH_PWR8
+__inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cmpgt_epi64 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_cmpgt ((__v2di)__X, (__v2di)__Y);
+}
+#endif
+
 #endif
diff --git a/gcc/testsuite/gcc.target/powerpc/pr78102.c 
b/gcc/testsuite/gcc.target/powerpc/pr78102.c
new file mode 100644
index ..56a2d497bbff
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr78102.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mvsx" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+#include 
+
+__m128i
+foo (const __m128i x, const __m128i y)
+{
+  return _mm_cmpeq_epi64 (x, y);
+}
+
+__v2di
+bar (const __v2di x, const __v2di y)
+{
+  return x == y;
+}
+
+__v2di
+baz (const __v2di x, const __v2di y)
+{
+  return x != y;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c 
b/gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c
new file mode 100644
index ..15b8ca418f54
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c
@@ -0,0 +1,73 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mvsx" } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+
+#include CHECK_H
+
+#include 
+
+#define NUM 64
+
+static unsigned short
+int_to_ushort (int iVal)
+{
+  unsigned short sVal;
+
+  if (iVal < 0)
+sVa

[PATCH v2 0/6] rs6000: Support more SSE4.1 intrinsics

Tested ppc64le (POWER9) and ppc64/32 (POWER7).

OK for trunk?

Paul A. Clarke (6):
  rs6000: Support SSE4.1 "round" intrinsics
  rs6000: Support SSE4.1 "min" and "max" intrinsics
  rs6000: Simplify some SSE4.1 "test" intrinsics
  rs6000: Support SSE4.1 "cvt" intrinsics
  rs6000: Support more SSE4.1 "cmp", "mul", "pack" intrinsics
  rs6000: Guard some x86 intrinsics implementations

 gcc/config/rs6000/emmintrin.h |  12 +-
 gcc/config/rs6000/pmmintrin.h |   4 +
 gcc/config/rs6000/smmintrin.h | 427 --
 gcc/config/rs6000/tmmintrin.h |  12 +
 gcc/testsuite/gcc.target/powerpc/pr78102.c|  23 +
 .../gcc.target/powerpc/sse4_1-packusdw.c  |  73 +++
 .../gcc.target/powerpc/sse4_1-pcmpeqq.c   |  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxsb.c|  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxsd.c|  46 ++
 .../gcc.target/powerpc/sse4_1-pmaxud.c|  47 ++
 .../gcc.target/powerpc/sse4_1-pmaxuw.c|  47 ++
 .../gcc.target/powerpc/sse4_1-pminsb.c|  46 ++
 .../gcc.target/powerpc/sse4_1-pminsd.c|  46 ++
 .../gcc.target/powerpc/sse4_1-pminud.c|  47 ++
 .../gcc.target/powerpc/sse4_1-pminuw.c|  47 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbd.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbw.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxdq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwd.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbd.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbq.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbw.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxdq.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwd.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwq.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmuldq.c|  51 +++
 .../gcc.target/powerpc/sse4_1-pmulld.c|  46 ++
 .../gcc.target/powerpc/sse4_1-round3.h|  81 
 .../gcc.target/powerpc/sse4_1-roundpd.c   | 143 ++
 .../gcc.target/powerpc/sse4_1-roundps.c   |  98 
 .../gcc.target/powerpc/sse4_1-roundsd.c   | 256 +++
 .../gcc.target/powerpc/sse4_1-roundss.c   | 208 +
 .../gcc.target/powerpc/sse4_2-pcmpgtq.c   |  46 ++
 35 files changed, 2349 insertions(+), 59 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr78102.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-packusdw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pcmpeqq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxuw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminuw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmuldq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmulld.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_2-pcmpgtq.c

-- 
2.27.0

[PATCH v2 2/6] rs6000: Support SSE4.1 "min" and "max" intrinsics

Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for _mm_min_epi8, _mm_min_epu16, _mm_min_epi32,
_mm_min_epu32, _mm_max_epi8, _mm_max_epu16, _mm_max_epi32, _mm_max_epu32
from gcc/testsuite/gcc.target/i386.

sse4_1-pmaxsb.c and sse4_1-pminsb.c were modified from using
"char" types to "signed char" types, because the default is unsigned on
powerpc.

2021-08-20  Paul A. Clarke  

gcc
* config/rs6000/smmintrin.h (_mm_min_epi8, _mm_min_epu16,
_mm_min_epi32, _mm_min_epu32, _mm_max_epi8, _mm_max_epu16,
_mm_max_epi32, _mm_max_epu32): New.

gcc/testsuite
* gcc.target/powerpc/sse4_1-pmaxsb.c: Copy from gcc.target/i386.
* gcc.target/powerpc/sse4_1-pmaxsd.c: Same.
* gcc.target/powerpc/sse4_1-pmaxud.c: Same.
* gcc.target/powerpc/sse4_1-pmaxuw.c: Same.
* gcc.target/powerpc/sse4_1-pminsb.c: Same.
* gcc.target/powerpc/sse4_1-pminsd.c: Same.
* gcc.target/powerpc/sse4_1-pminud.c: Same.
* gcc.target/powerpc/sse4_1-pminuw.c: Same.
---
v2:
- Added "extern" to functions to maintain compatible decorations with
  like implementations in gcc/config/i386.
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h | 56 +++
 .../gcc.target/powerpc/sse4_1-pmaxsb.c| 46 +++
 .../gcc.target/powerpc/sse4_1-pmaxsd.c| 46 +++
 .../gcc.target/powerpc/sse4_1-pmaxud.c| 47 
 .../gcc.target/powerpc/sse4_1-pmaxuw.c| 47 
 .../gcc.target/powerpc/sse4_1-pminsb.c| 46 +++
 .../gcc.target/powerpc/sse4_1-pminsd.c| 46 +++
 .../gcc.target/powerpc/sse4_1-pminud.c| 47 
 .../gcc.target/powerpc/sse4_1-pminuw.c| 47 
 9 files changed, 428 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxuw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsb.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminud.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pminuw.c

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index a6b88d313ad0..505fe4ce22a8 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -408,6 +408,62 @@ _mm_test_mix_ones_zeros (__m128i __A, __m128i __mask)
   return any_ones * any_zeros;
 }
 
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epi8 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_min ((__v16qi)__X, (__v16qi)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epu16 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_min ((__v8hu)__X, (__v8hu)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epi32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_min ((__v4si)__X, (__v4si)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_min_epu32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_min ((__v4su)__X, (__v4su)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epi8 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_max ((__v16qi)__X, (__v16qi)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epu16 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_max ((__v8hu)__X, (__v8hu)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epi32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_max ((__v4si)__X, (__v4si)__Y);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_max_epu32 (__m128i __X, __m128i __Y)
+{
+  return (__m128i) vec_max ((__v4su)__X, (__v4su)__Y);
+}
+
 /* Return horizontal packed word minimum and its index in bits [15:0]
and bits [18:16] respectively.  */
 __inline __m128i
diff --git a/gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c 
b/gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c
new file mode 100644
index ..7a465b01dd11
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/sse4_1-pmaxsb.c
@@ -0,0 +1,46 @@
+/* { dg-do run } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+#ifndef CHECK_H
+#define CHECK_H "sse4_1-check.h"
+#endif
+
+#ifndef TEST
+#define TEST sse4_1_test
+#endif
+

[PATCH v2 4/6] rs6000: Support SSE4.1 "cvt" intrinsics

Function signatures and decorations match gcc/config/i386/smmintrin.h.

Also, copy tests for:
- _mm_cvtepi8_epi16, _mm_cvtepi8_epi32, _mm_cvtepi8_epi64
- _mm_cvtepi16_epi32, _mm_cvtepi16_epi64
- _mm_cvtepi32_epi64,
- _mm_cvtepu8_epi16, _mm_cvtepu8_epi32, _mm_cvtepu8_epi64
- _mm_cvtepu16_epi32, _mm_cvtepu16_epi64
- _mm_cvtepu32_epi64

from gcc/testsuite/gcc.target/i386.

sse4_1-pmovsxbd.c, sse4_1-pmovsxbq.c, and sse4_1-pmovsxbw.c were
modified from using "char" types to "signed char" types, because
the default is unsigned on powerpc.

2021-08-20  Paul A. Clarke  

gcc
* config/rs6000/smmintrin.h (_mm_cvtepi8_epi16, _mm_cvtepi8_epi32,
_mm_cvtepi8_epi64, _mm_cvtepi16_epi32, _mm_cvtepi16_epi64,
_mm_cvtepi32_epi64, _mm_cvtepu8_epi16, _mm_cvtepu8_epi32,
_mm_cvtepu8_epi64, _mm_cvtepu16_epi32, _mm_cvtepu16_epi64,
_mm_cvtepu32_epi64): New.

gcc/testsuite
* gcc.target/powerpc/sse4_1-pmovsxbd.c: Copy from gcc.target/i386,
adjust dg directives to suit.
* gcc.target/powerpc/sse4_1-pmovsxbq.c: Same.
* gcc.target/powerpc/sse4_1-pmovsxbw.c: Same.
* gcc.target/powerpc/sse4_1-pmovsxdq.c: Same.
* gcc.target/powerpc/sse4_1-pmovsxwd.c: Same.
* gcc.target/powerpc/sse4_1-pmovsxwq.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxbd.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxbq.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxbw.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxdq.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxwd.c: Same.
* gcc.target/powerpc/sse4_1-pmovzxwq.c: Same.
---
v2:
- Added "extern" to functions to maintain compatible decorations with
  like implementations in gcc/config/i386.
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h | 138 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbd.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxbw.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxdq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwd.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovsxwq.c  |  42 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbd.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbq.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxbw.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxdq.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwd.c  |  43 ++
 .../gcc.target/powerpc/sse4_1-pmovzxwq.c  |  43 ++
 13 files changed, 648 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovsxwq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxbw.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxdq.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-pmovzxwq.c

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index 363534cb06a2..fdef6674d16c 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -442,6 +442,144 @@ _mm_max_epu32 (__m128i __X, __m128i __Y)
   return (__m128i) vec_max ((__v4su)__X, (__v4su)__Y);
 }
 
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi8_epi16 (__m128i __A)
+{
+  return (__m128i) vec_unpackh ((__v16qi)__A);
+}
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi8_epi32 (__m128i __A)
+{
+  __A = (__m128i) vec_unpackh ((__v16qi)__A);
+  return (__m128i) vec_unpackh ((__v8hi)__A);
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi8_epi64 (__m128i __A)
+{
+  __A = (__m128i) vec_unpackh ((__v16qi)__A);
+  __A = (__m128i) vec_unpackh ((__v8hi)__A);
+  return (__m128i) vec_unpackh ((__v4si)__A);
+}
+#endif
+
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi16_epi32 (__m128i __A)
+{
+  return (__m128i) vec_unpackh ((__v8hi)__A);
+}
+
+#ifdef _ARCH_PWR8
+extern __inline __m128i
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+_mm_cvtepi16_epi64 (__m128i __A)
+{
+  __A = (__m128i) vec_unpackh ((__v8hi)__A);
+  return (__m128i) vec_unpackh ((__v4si)__A);
+}
+#endif
+
+#ifdef _ARCH_P

[PATCH v2 1/6] rs6000: Support SSE4.1 "round" intrinsics

Suppress exceptions (when specified), by saving, manipulating, and
restoring the FPSCR.  Similarly, save, set, and restore the floating-point
rounding mode when required.

No attempt is made to optimize writing the FPSCR (by checking if the new
value would be the same), other than using lighter weight instructions
when possible.

The scalar versions naively use the parallel versions to compute the
single scalar result and then construct the remainder of the result.

Of minor note, the values of _MM_FROUND_TO_NEG_INF and _MM_FROUND_TO_ZERO
are swapped from the corresponding values on x86 so as to match the
corresponding rounding mode values in the Power ISA.

Move implementations of _mm_ceil* and _mm_floor* into _mm_round*, and
convert _mm_ceil* and _mm_floor* into macros. This matches the current
analogous implementations in config/i386/smmintrin.h.

Function signatures match the analogous functions in config/i386/smmintrin.h.

Add tests for _mm_round_pd, _mm_round_ps, _mm_round_sd, _mm_round_ss,
modeled after the very similar "floor" and "ceil" tests.

Include basic tests, plus tests at the boundaries for floating-point
representation, positive and negative, test all of the parameterized
rounding modes as well as the C99 rounding modes and interactions
between the two.

Exceptions are not explicitly tested.

2021-08-20  Paul A. Clarke  

gcc
* config/rs6000/smmintrin.h (_mm_round_pd, _mm_round_ps,
_mm_round_sd, _mm_round_ss, _MM_FROUND_TO_NEAREST_INT,
_MM_FROUND_TO_ZERO, _MM_FROUND_TO_POS_INF, _MM_FROUND_TO_NEG_INF,
_MM_FROUND_CUR_DIRECTION, _MM_FROUND_RAISE_EXC, _MM_FROUND_NO_EXC,
_MM_FROUND_NINT, _MM_FROUND_FLOOR, _MM_FROUND_CEIL, _MM_FROUND_TRUNC,
_MM_FROUND_RINT, _MM_FROUND_NEARBYINT): New.
* config/rs6000/smmintrin.h (_mm_ceil_pd, _mm_ceil_ps, _mm_ceil_sd,
_mm_ceil_ss, _mm_floor_pd, _mm_floor_ps, _mm_floor_sd, _mm_floor_ss):
Convert from function to macro.

gcc/testsuite
* gcc.target/powerpc/sse4_1-round3.h: New.
* gcc.target/powerpc/sse4_1-roundpd.c: New.
* gcc.target/powerpc/sse4_1-roundps.c: New.
* gcc.target/powerpc/sse4_1-roundsd.c: New.
* gcc.target/powerpc/sse4_1-roundss.c: New.
---
v2:
- Replaced clever (and broken) exception masking with more straightforward
  implementation, per v1 review and closer inspection. mtfsf was only
  writing the final nybble (1) instead of the final two nybbles (2), so
  not all of the exception-enable bits were cleared.
- Renamed some variables from cryptic "tmp" and "save" to
  "fpscr_save" and "enables_save".
- Retained use of __builtin_mffsl, since that is supported pre-POWER8
  (with an alternate instruction sequence).
- Added "extern" to functions to maintain compatible decorations with
  like implementations in gcc/config/i386.
- Added some additional text to the commit message about some of the
  (unpleasant?) implementations and decorations coming from
  like implementations in gcc/config/i386, per v1 review.
- Removed "-Wno-psabi" from tests as unnecessary, per v1 review.
- Fixed indentation and other minor formatting changes, per v1 review.
- Noted testing in patch series cover letter.

 gcc/config/rs6000/smmintrin.h | 240 +++-
 .../gcc.target/powerpc/sse4_1-round3.h|  81 ++
 .../gcc.target/powerpc/sse4_1-roundpd.c   | 143 ++
 .../gcc.target/powerpc/sse4_1-roundps.c   |  98 +++
 .../gcc.target/powerpc/sse4_1-roundsd.c   | 256 ++
 .../gcc.target/powerpc/sse4_1-roundss.c   | 208 ++
 6 files changed, 962 insertions(+), 64 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-round3.h
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundpd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundps.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundsd.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/sse4_1-roundss.c

diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index 3767a67eada7..a6b88d313ad0 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -42,6 +42,182 @@
 #include 
 #include 
 
+/* Rounding mode macros. */
+#define _MM_FROUND_TO_NEAREST_INT   0x00
+#define _MM_FROUND_TO_ZERO  0x01
+#define _MM_FROUND_TO_POS_INF   0x02
+#define _MM_FROUND_TO_NEG_INF   0x03
+#define _MM_FROUND_CUR_DIRECTION0x04
+
+#define _MM_FROUND_NINT\
+  (_MM_FROUND_TO_NEAREST_INT | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_FLOOR   \
+  (_MM_FROUND_TO_NEG_INF | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_CEIL\
+  (_MM_FROUND_TO_POS_INF | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_TRUNC   \
+  (_MM_FROUND_TO_ZERO | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_RINT\
+  (_MM_FROUND_CUR_DIRECTION | _MM_FROUND_RAISE_EXC)
+#define _MM_FROUND_NEARBYINT   \
+  (_MM_FROUND_CUR

[PATCH v2 6/6] rs6000: Guard some x86 intrinsics implementations

Some compatibility implementations of x86 intrinsics include
Power intrinsics which require POWER8.  Guard them.

emmintrin.h:
- _mm_cmpord_pd: Remove code which was ostensibly for pre-POWER8,
  but which indeed depended on POWER8 (vec_cmpgt(v2du)/vcmpgtud).
  The "POWER8" version works fine on pre-POWER8.
- _mm_mul_epu32: vec_mule(v4su) uses vmuleuw.
pmmintrin.h:
- _mm_movehdup_ps: vec_mergeo(v4su) uses vmrgow.
- _mm_moveldup_ps: vec_mergee(v4su) uses vmrgew.
smmintrin.h:
- _mm_cmpeq_epi64: vec_cmpeq(v2di) uses vcmpequd.
- _mm_mul_epi32: vec_mule(v4si) uses vmuluwm.
- _mm_cmpgt_epi64: vec_cmpgt(v2di) uses vcmpgtsd.
tmmintrin.h:
- _mm_sign_epi8: vec_neg(v4si) uses vsububm.
- _mm_sign_epi16: vec_neg(v4si) uses vsubuhm.
- _mm_sign_epi32: vec_neg(v4si) uses vsubuwm.
  Note that the above three could actually be supported pre-POWER8,
  but current GCC does not support them before POWER8.
- _mm_sign_pi8: depends on _mm_sign_epi8.
- _mm_sign_pi16: depends on _mm_sign_epi16.
- _mm_sign_pi32: depends on _mm_sign_epi32.

2021-08-20  Paul A. Clarke  

gcc
PR target/101893
* config/rs6000/emmintrin.h: Guard POWER8 intrinsics.
* config/rs6000/pmmintrin.h: Same.
* config/rs6000/smmintrin.h: Same.
* config/rs6000/tmmintrin.h: Same.
---
v2:
- Ensured that new "#ifdef _ARCH_PWR8" bracket each function so
  impacted, rather than groups of functions, per v1 review.
- Noted testing in patch series cover letter.
- Added PR number to commit message.

 gcc/config/rs6000/emmintrin.h | 12 ++--
 gcc/config/rs6000/pmmintrin.h |  4 
 gcc/config/rs6000/smmintrin.h |  4 
 gcc/config/rs6000/tmmintrin.h | 12 
 4 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/gcc/config/rs6000/emmintrin.h b/gcc/config/rs6000/emmintrin.h
index ce1287edf782..32ad72b4cc35 100644
--- a/gcc/config/rs6000/emmintrin.h
+++ b/gcc/config/rs6000/emmintrin.h
@@ -430,20 +430,10 @@ _mm_cmpnge_pd (__m128d __A, __m128d __B)
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_cmpord_pd (__m128d __A, __m128d __B)
 {
-#if _ARCH_PWR8
   __v2du c, d;
   /* Compare against self will return false (0's) if NAN.  */
   c = (__v2du)vec_cmpeq (__A, __A);
   d = (__v2du)vec_cmpeq (__B, __B);
-#else
-  __v2du a, b;
-  __v2du c, d;
-  const __v2du double_exp_mask  = {0x7ff0, 0x7ff0};
-  a = (__v2du)vec_abs ((__v2df)__A);
-  b = (__v2du)vec_abs ((__v2df)__B);
-  c = (__v2du)vec_cmpgt (double_exp_mask, a);
-  d = (__v2du)vec_cmpgt (double_exp_mask, b);
-#endif
   /* A != NAN and B != NAN.  */
   return ((__m128d)vec_and(c, d));
 }
@@ -1472,6 +1462,7 @@ _mm_mul_su32 (__m64 __A, __m64 __B)
   return ((__m64)a * (__m64)b);
 }
 
+#ifdef _ARCH_PWR8
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_mul_epu32 (__m128i __A, __m128i __B)
 {
@@ -1498,6 +1489,7 @@ _mm_mul_epu32 (__m128i __A, __m128i __B)
   return (__m128i) vec_mule ((__v4su)__A, (__v4su)__B);
 #endif
 }
+#endif
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_slli_epi16 (__m128i __A, int __B)
diff --git a/gcc/config/rs6000/pmmintrin.h b/gcc/config/rs6000/pmmintrin.h
index eab712fdfa66..83dff1d85666 100644
--- a/gcc/config/rs6000/pmmintrin.h
+++ b/gcc/config/rs6000/pmmintrin.h
@@ -123,17 +123,21 @@ _mm_hsub_pd (__m128d __X, __m128d __Y)
vec_mergel ((__v2df) __X, (__v2df)__Y));
 }
 
+#ifdef _ARCH_PWR8
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_movehdup_ps (__m128 __X)
 {
   return (__m128)vec_mergeo ((__v4su)__X, (__v4su)__X);
 }
+#endif
 
+#ifdef _ARCH_PWR8
 extern __inline __m128 __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_moveldup_ps (__m128 __X)
 {
   return (__m128)vec_mergee ((__v4su)__X, (__v4su)__X);
 }
+#endif
 
 extern __inline __m128d __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_loaddup_pd (double const *__P)
diff --git a/gcc/config/rs6000/smmintrin.h b/gcc/config/rs6000/smmintrin.h
index c04d2bb5b6d3..29719367e205 100644
--- a/gcc/config/rs6000/smmintrin.h
+++ b/gcc/config/rs6000/smmintrin.h
@@ -272,6 +272,7 @@ _mm_extract_ps (__m128 __X, const int __N)
   return ((__v4si)__X)[__N & 3];
 }
 
+#ifdef _ARCH_PWR8
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_blend_epi16 (__m128i __A, __m128i __B, const int __imm8)
 {
@@ -283,6 +284,7 @@ _mm_blend_epi16 (__m128i __A, __m128i __B, const int __imm8)
   #endif
   return (__m128i) vec_sel ((__v8hu) __A, (__v8hu) __B, __shortmask);
 }
+#endif
 
 extern __inline __m128i __attribute__((__gnu_inline__, __always_inline__, 
__artificial__))
 _mm_blendv_epi8 (__m128i __A, __m128i __B, __m128i __mask)
@@ -343,6 +345,7 @@ _mm_blend_pd (__m128d __A, __m128d __B, const int __imm8)
   return (__m128d) __r;
 }
 
+#ifdef _ARCH_PWR8
 __inline __m128d
 __at

Re: [PATCH] PR fortran/100950 - ICE in output_constructor_regular_field, at varasm.c:5514

Hi Tobias,

> LGTM – I am fine with either variant, but I am slightly inclined to
> removing the gcc_assert*
> – as I believe that the existing checks come early enough and do seem to
> work well.

I played some more and found additional cases that we hadn't discussed
before.  (At least I hadn't thought of them because of the focus on
deferred length strings.)

- automatic string variables / arrays
- assumed length strings
- PDTs with character components.

The last one actually turned out sort of "hopeless" for now, so I opened

  https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102003

to track this.

I added the other cases to testcase pr100950.f90 and reduced the checks
and code within "substring_has_constant_len" to the bare minimum.
See the attached follow-up patch.

> Can you check ('grep') whether we already have sufficient coverage of
> substring out of bounds?
> We presumably have, but if you spot something, I think it makes sense to
> add those to the testsuite.

We do have some checks on substring indices (e.g. substr_10.f90),
but not really extensive coverage.

> Tobias
> *I think GCC won't complain but if ENABLE_ASSERT_CHECKING is not defined,
> there is then a pointless 'length =' assignment, overridden before it is
> ever used.

Yes.  This is fixed below.

I guess I have to apologize for things getting a bit out of control for
this PR, but the results are on the other hand way beyond my initial
expectations...

Re-regtested on x86_64-pc-linux-gnu.  Should be safe elsewhere...

OK?

Thanks,
Harald


Fortran - extend set of substring expressions handled in length simplification

gcc/fortran/ChangeLog:

PR fortran/100950
* simplify.c (substring_has_constant_len): Minimize checks for
substring expressions being allowed.

gcc/testsuite/ChangeLog:

PR fortran/100950
* gfortran.dg/pr100950.f90: Extend coverage.

diff --git a/gcc/fortran/simplify.c b/gcc/fortran/simplify.c
index 4cb73e836c7..b46cbfa90ab 100644
--- a/gcc/fortran/simplify.c
+++ b/gcc/fortran/simplify.c
@@ -4533,14 +4533,7 @@ substring_has_constant_len (gfc_expr *e)
   || !ref->u.ss.start
   || ref->u.ss.start->expr_type != EXPR_CONSTANT
   || !ref->u.ss.end
-  || ref->u.ss.end->expr_type != EXPR_CONSTANT
-  || !ref->u.ss.length)
-return false;
-
-  /* For non-deferred strings the given length shall be constant.  */
-  if (!e->ts.deferred
-  && (!ref->u.ss.length->length
-	  || ref->u.ss.length->length->expr_type != EXPR_CONSTANT))
+  || ref->u.ss.end->expr_type != EXPR_CONSTANT)
 return false;

   /* Basic checks on substring starting and ending indices.  */
@@ -4551,27 +4544,7 @@ substring_has_constant_len (gfc_expr *e)
   iend = gfc_mpz_get_hwi (ref->u.ss.end->value.integer);

   if (istart <= iend)
-{
-  if (istart < 1)
-	{
-	  gfc_error ("Substring start index (%wd) at %L below 1",
-		 istart, &ref->u.ss.start->where);
-	  return false;
-	}
-
-  /* For deferred strings use end index as proxy for length.  */
-  if (e->ts.deferred)
-	length = iend;
-  else
-	length = gfc_mpz_get_hwi (ref->u.ss.length->length->value.integer);
-  if (iend > length)
-	{
-	  gfc_error ("Substring end index (%wd) at %L exceeds string length",
-		 iend, &ref->u.ss.end->where);
-	  return false;
-	}
-  length = iend - istart + 1;
-}
+length = iend - istart + 1;
   else
 length = 0;

diff --git a/gcc/testsuite/gfortran.dg/pr100950.f90 b/gcc/testsuite/gfortran.dg/pr100950.f90
index cb9d126bc18..a19409c2507 100644
--- a/gcc/testsuite/gfortran.dg/pr100950.f90
+++ b/gcc/testsuite/gfortran.dg/pr100950.f90
@@ -46,6 +46,18 @@ program p
 integer, parameter :: l9 = len (r(1)%u(:)(3:4))
 if (l9 /= 2) stop 13
   end block
+
+  call sub (42, "abcde")
+contains
+  subroutine sub (m, c)
+integer,  intent(in) :: m
+character(len=*), intent(in) :: c
+character(len=m):: p, o(3)
+integer, parameter  :: l10 = len (p(6:7))
+integer, parameter  :: l11 = len (o(:)(6:7))
+integer, parameter  :: l12 = len (c(2:3))
+if (l10 /= 2 .or. l11 /= 2 .or. l12 /= 2) stop 14
+  end subroutine sub
 end

 ! { dg-final { scan-tree-dump-times "_gfortran_stop_numeric" 2 "original" } }

RE: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.

2021-08-20 Thread Roger Sayle

Hi Richard,

Benchmarking this patch using CSiBE on x86_64-pc-linux-gnu with -Os -m32 saves 
2432 bytes.
Of the 893 tests, 34 have size differences, 30 are improvements, 4 are 
regressions (of a few bytes).

> Also I'm missing a 'else' - in the default case there's no cost/benefit of 
> using SSE vs. GPR regs?
> For SSE it would be a constant pool load.

The code size regression  I primarily wanted to tackle was the zero vs. 
non-zero case when
dealing with immediate operands, which was the piece affected by my and Jakub's 
xor
improvements.

Alas my first attempt to specify a non-zero gain in the default (doesn't fit in 
SImode) case,
increased the code size slightly.  The use of the constant pool complicates 
things, as the number
of times the same value is used becomes an issue.  If the constant being loaded 
is unique, then
clearly the increase in constant pool size should (ideally) be taken into 
account.  But if the same
constant is used multiple times in a chain (or is already in the constant 
pool), the observed cost
is much cheaper.  Empirically, a value of zero isn't a poor choice, so the 
decision on whether to
use vector instructions is shifted to the gains from operations being 
performed, rather than the
loading of integer constants.  No doubt, like rtx_costs, these are free 
parameters that future
generations will continue to tweak and refine.

Given that this patch reduces code size with -Os, both with and without -m32, 
ok for mainline?

Thanks in advance,
Roger
--

-Original Message-
From: Richard Biener  
Sent: 20 August 2021 08:29
To: Roger Sayle 
Cc: GCC Patches 
Subject: Re: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.

On Thu, Aug 19, 2021 at 6:01 PM Roger Sayle  wrote:
>
>
> Doh!  ENOPATCH.
>
> -Original Message-
> From: Roger Sayle 
> Sent: 19 August 2021 16:59
> To: 'GCC Patches' 
> Subject: [x86_64 PATCH] Tweak -Os costs for scalar-to-vector pass.
>
>
> Back in June I briefly mentioned in one of my gcc-patches posts that a 
> change that should have always reduced code size, would mysteriously 
> occasionally result in slightly larger code (according to CSiBE):
> https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573233.html
>
> Investigating further, the cause turns out to be that x86_64's 
> scalar-to-vector (stv) pass is relying on poor estimates of the size 
> costs/benefits.  This patch tweaks the backend's compute_convert_gain 
> method to provide slightly more accurate values when compiling with -Os.
> Compilation without -Os is (should be) unaffected.  And for 
> completeness, I'll mention that the stv pass is a net win for code 
> size so it's much better to improve its heuristics than simply gate 
> the pass on !optimize_for_size.
>
> The net effect of this change is to save 1399 bytes on the CSiBE code 
> size benchmark when compiling with -Os.
>
> This patch has been tested on x86_64-pc-linux-gnu with "make bootstrap"
> and "make -k check" with no new failures.
>
> Ok for mainline?

+   /* xor (2 bytes) vs. xorps (3 bytes).  */
+   if (src == const0_rtx)
+ igain -= COSTS_N_BYTES (1);
+   /* movdi_internal vs. movv2di_internal.  */
+   /* => mov (5 bytes) vs. movaps (7 bytes).  */
+   else if (x86_64_immediate_operand (src, SImode))
+ igain -= COSTS_N_BYTES (2);

doesn't it need two GPR xor for 32bit DImode and two mov?  Thus the non-SSE 
cost should be times 'm'?  For const0_rtx we may eventually re-use the zero reg 
for the high part so that is eventually correct.

Also I'm missing a 'else' - in the default case there's no cost/benefit of 
using SSE vs. GPR regs?  For SSE it would be a constant pool load.

I also wonder, since I now see COSTS_N_BYTES for the first time (heh), whether 
with -Os we'd need to replace all COSTS_N_INSNS (1) scaling with COSTS_N_BYTES 
scaling?  OTOH costs_add_n_insns uses COSTS_N_INSNS for the size part as well.

That said, it looks like we're eventually mixing apples and oranges now or even 
previously?

Thanks,
Richard.

>
>
> 2021-08-19  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-features.c (compute_convert_gain): Provide
> more accurate values for CONST_INT, when optimizing for size.
> * config/i386/i386.c (COSTS_N_BYTES): Move definition from here...
> * config/i386/i386.h (COSTS_N_BYTES): to here.
>
> Roger
> --
>

[PATCH] libstdc++: Fix compare_three_way for constexpr and Clang

2021-08-20 Thread Paul Keir via Gcc-patches

Hi,

The current compare_three_way implementation makes provision for constant 
evaluation contexts (avoiding reinterpret_cast etc.), but the approach fails 
with Clang; when it compares two const volatile void pointers: "comparison 
between unequal pointers to void has unspecified result". I include a fix and 
test.

Could someone commit the attached patch for me?

Thanks,
Paul


Please consider the environment and think before you print.

The University of the West of Scotland is a registered Scottish charity. 
Charity number SC002520.

This e-mail and any attachment is for authorised use by the intended 
recipient(s) only. It may contain proprietary material, confidential 
information and/or be subject to legal privilege. It should not be copied, 
disclosed to, retained or used by, any other party. If you are not an intended 
recipient then please promptly delete this e-mail and any attachment and all 
copies and inform the sender.

Please note that any views or opinions presented in this email are solely those 
of the author and do not necessarily represent those of the University of the 
West of Scotland.

As a public body, the University of the West of Scotland may be required to 
make available emails as well as other written forms of information as a result 
of a request made under the Freedom of Information (Scotland) Act 2002.
diff --git a/libstdc++-v3/ChangeLog b/libstdc++-v3/ChangeLog
index 07cc83d98f4..638f00716c8 100644
--- a/libstdc++-v3/ChangeLog
+++ b/libstdc++-v3/ChangeLog
@@ -1,3 +1,10 @@
+2021-08-20  Paul Keir  
+
+	* libsupc++/compare: Avoid constexpr pointer comparison failure
+	in std::compare_three_way with Clang.
+	* testsuite/18_support/comparisons/pointers/constexpr.cc:
+	New test.
+
 2021-08-19  Jonathan Wakely  
 
 	* doc/xml/manual/status_cxx2020.xml: Move row  earlier in table.
diff --git a/libstdc++-v3/libsupc++/compare b/libstdc++-v3/libsupc++/compare
index 5aee89e3a6e..4081a3f2315 100644
--- a/libstdc++-v3/libsupc++/compare
+++ b/libstdc++-v3/libsupc++/compare
@@ -553,10 +553,10 @@ namespace std
   {
 	if constexpr (__detail::__3way_builtin_ptr_cmp<_Tp, _Up>)
 	  {
+	if (__builtin_is_constant_evaluated())
+	  return static_cast<_Tp&&>(__t) <=> static_cast<_Up&&>(__u);
 	auto __pt = static_cast(__t);
 	auto __pu = static_cast(__u);
-	if (__builtin_is_constant_evaluated())
-	  return __pt <=> __pu;
 	auto __it = reinterpret_cast<__UINTPTR_TYPE__>(__pt);
 	auto __iu = reinterpret_cast<__UINTPTR_TYPE__>(__pu);
 	return __it <=> __iu;
diff --git a/libstdc++-v3/testsuite/18_support/comparisons/pointers/constexpr.cc b/libstdc++-v3/testsuite/18_support/comparisons/pointers/constexpr.cc
new file mode 100644
index 000..8e1dc2ed6d1
--- /dev/null
+++ b/libstdc++-v3/testsuite/18_support/comparisons/pointers/constexpr.cc
@@ -0,0 +1,43 @@
+// Copyright (C) 2021 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include 
+
+constexpr bool check01()
+{
+  int arr[2];
+  bool b1 = &arr[0] < &arr[1];
+  bool b2 = std::less{}(&arr[0], &arr[1]);
+  bool b3 = std::compare_three_way{}(&arr[0],&arr[1]) < 0;
+  return b1 && b2 && b3;
+}
+
+constexpr bool check02()
+{
+  int *p = new int[2];
+  bool b1 = &p[0] < &p[1];
+  bool b2 = std::less{}(&p[0], &p[1]);
+  bool b3 = std::compare_three_way{}(&p[0],&p[1]) < 0;
+  delete [] p;
+  return b1 && b2 && b3;
+}
+
+static_assert(check01());
+static_assert(check02());

Re: [PATCH 0/4] drop version checks for in-tree gas [PR91602]

2021-08-20 Thread Serge Belyshev via Gcc-patches

Jeff Law  writes:

> This set is approved.    Push them to the trunk when it's convenient
> for you.
>
> Thanks for your patience,

Thanks! Committed as r12-3047 .. r12-3050.

Re: [PATCH] enable ranger and caching in pass_waccess

2021-08-20 Thread Martin Sebor via Gcc-patches


On 8/20/21 7:09 AM, Andrew MacLeod wrote:

On 8/19/21 7:09 PM, Martin Sebor via Gcc-patches wrote:

The attached patch changes the new access warning pass to use
the per-function ranger instance.  To do that it makes a number
of the global static functions members of the pass (that involved
moving one to a later point in the file, increasing the diff;
the body of the function hasn't changed otherwise).  Still more
functions remain.  At the same time, the patch also enables
the simple pointer_query cache to avoid repeatedly recomputing
the properties of related pointers into the same objects, and
makes the cache more effective (trunk fails to cache a bunch of
intermediate results).  Finally, the patch enhances the debugging
support for the cache.

Other than the ranger/caching the changes have no user-visible
effect.



Why are you calling enable/disable ranger if you are passing a ranger 
instance around instead of using the get_range_query (cfun)->range* calls?


The pass stores an instance of the pointer_query class which in
turn stores a pointer to range_query (which is a copy of the ranger).
So storing it also in pass_waccess isn't necessary and can be
removed.  I've made that change in the attached update.  I'm not
sure the corresponding pointer should at some point also be removed
from the pointer_query class and replaced by calls to get_range_query
(cfun).  If so, that would take some surgery to the strlen pass which
also uses pointer_query and isn't quite ready to make this switch.




Are you planning to transition to using the get_range_query() interface 
instead of keeping a range_query pointer in the pointer_query class?


This pass and to a smaller extent the pointer_query class that's
used by it and the strlen pass are still a work in progress.
I also still need to convert the strlen pass to use Ranger and
I expect it will take some changes to pointer_query.  So at that
point, if going through get_range_query (cfun) everywhere is what
you recommend, I'm happy to do it.

Anyway, attached is an updated revision with the m_ranger member
removed and a few helpers changed to take a range_query argument
to use the pointer_query member instead.  It was retested on
x86_64-linux.

Martin

PS There has been an effort to get rid of global variables from GCC,
or, as the first step, to avoid accessing them directly(*).  If and
when that happens, it seems like each pass will have to store either
the ranger instance as a member (directly or indirectly, via a member
of a class that stores it) or the function passed to pass::execute()
if it wants to access either.

[*] https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573975.html
The patch at the link above wasn't approved but IIUC removing globals
from GCC is still a goal.
gcc/ChangeLog:

	* gimple-ssa-warn-access.cc (get_size_range): Add argument.
	(check_access): Pass additional argument.
	(check_memop_access): Remove template and make a member function.
	(maybe_check_dealloc_call): Make a pass_waccess member function.
	(class pass_waccess): Add, rename, and remove members.
	(pass_waccess::pass_waccess): Adjust to name change.
	(pass_waccess::~pass_waccess): Same.
	(check_alloca): Make a member function.
	(check_alloc_size_call): Same.
	(check_strcat): Same.
	(check_strncat): Same.
	(check_stxcpy): Same.
	(check_stxncpy): Same.
	(check_strncmp): Same.
	(maybe_warn_rdwr_sizes): Rename...
	(pass_waccess::maybe_check_access_sizes): ...to this.
	(pass_waccess::check_call): Adjust to name changes.
	(pass_waccess::maybe_check_dealloc_call): Make a pass_waccess member
	function.
	(pass_waccess::execute): Adjust to name changes.
	* gimple-ssa-warn-access.h (check_memop_access): Remove.
	* pointer-query.cc (access_ref::phi): Handle null pointer.
	(access_ref::inform_access): Same.
	(pointer_query::put_ref): Modify a cached value, not a copy of it.
	(pointer_query::dump): New function.
	(compute_objsize_r): Avoid overwriting access_ref::bndrng.  Cache
	more results.
	* pointer-query.h (pointer_query::dump): Declare.
	* tree-ssa-strlen.c (get_range): Simplify.  Use function query.
	(dump_strlen_info): Use function query.
	(printf_strlen_execute): Factor code out into pointer_query::put_ref.

gcc/testsuite/ChangeLog:

	* gcc.dg/Wstringop-overflow-11.c: Remove xfails.
	* gcc.dg/Wstringop-overflow-12.c: Same.
	* gcc.dg/Wstringop-overflow-43.c: Add xfails.
	* gcc.dg/Wstringop-overflow-73.c: New test.

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index 4a2dd9ade77..68dcf06804a 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -1172,10 +1172,11 @@ warn_for_access (location_t loc, tree func, tree expr, int opt,
by BNDRNG if nonnull and valid.  */
 
 static void
-get_size_range (tree bound, tree range[2], const offset_int bndrng[2])
+get_size_range (range_query *query, tree bound, tree range[2],
+		const offset_int bndrng[2])
 {
   if (bound)
-get_size_range (bound, range);
+get_size_range (query, b

Re: [PATCH] Fix tests that require IBM 128-bit long double

2021-08-20 Thread Segher Boessenkool

On Fri, Aug 13, 2021 at 12:09:24AM -0400, Michael Meissner wrote:
> This patch adds 3 more selections to target-supports.exp to see if we can
> specify to use a particular long double format (IEEE 128-bit, IBM extended
> double, 64-bit), and the library support will track the changes for the long
> double.  This is needed because two of the tests in the test suite use long
> double, and they are actually testing IBM extended double.
> 
> This patch also forces the two tests that explicitly require long double
> to use the IBM double-double encoding to explicitly run the test.  This
> requires GLIBC 2.32 or greater in order to do the switch.

> I did not remove the void * casts in calling memcmp, because not having those
> casts will cause the test to fail.  This is because the variables are declared
> volatile, and GCC now complains that you are discarding the volatile in doing
> the call.  Having this warning makes the test fail.

The explicit cast removes the volatile just the same, and that isn't
valid C, as the warning said when you did it implicitly.  Make a copy
from the volatile to some non-volatile storage, and do mem* on *that*?

> +proc add_options_for_long_double_ibm128 { flags } {

> +proc check_effective_target_long_double_ibm128 { } {

Swap these two maybe?  It is more obvious to read it that way (the
add_options is only used if the check returns true, that isn't so
obvious if you read it in the written order).


Okay for trunk with the casts removed.  Thanks!


Segher

Re: [committed] openmp: Implement the error directive

Hi!

On 2021-08-20T15:54:34+0200, I wrote:
> On 2021-08-20T15:21:12+0200, Jakub Jelinek  wrote:
>> On Fri, Aug 20, 2021 at 03:11:45PM +0200, Thomas Schwinge wrote:
>>> > --- libgomp/error.c.jj2021-08-19 12:53:44.693106618 +0200
>>> > +++ libgomp/error.c   2021-08-19 17:58:55.633203432 +0200
>>>
>>> > +void
>>> > +GOMP_warning (const char *msg, size_t msglen)
>>> > +{
>>> > +  if (msg && msglen == (size_t) -1)
>>> > +gomp_error ("error directive encountered: %s", msg);
>>> > +  else if (msg)
>>> > +{
>>> > +  fputs ("\nlibgomp: error directive encountered: ", stderr);
>>> > +  fwrite (msg, 1, msglen, stderr);
>>> > +  fputc ('\n', stderr);
>>> > +}
>>> > +  else
>>> > +gomp_error ("error directive encountered");
>>> > +}
>>> > +
>>> > +void
>>> > +GOMP_error (const char *msg, size_t msglen)
>>> > +{
>>> > +  if (msg && msglen == (size_t) -1)
>>> > +gomp_fatal ("fatal error: error directive encountered: %s", msg);
>>> > +  else if (msg)
>>> > +{
>>> > +  fputs ("\nlibgomp: fatal error: error directive encountered: ", 
>>> > stderr);
>>> > +  fwrite (msg, 1, msglen, stderr);
>>> > +  fputc ('\n', stderr);
>>> > +  exit (EXIT_FAILURE);
>>> > +}
>>> > +  else
>>> > +gomp_fatal ("fatal error: error directive encountered");
>>> > +}
>>>
>>> At least for nvptx offloading, and at least given the newlib sources I'm
>>> using, the 'fputs'/'fwrite' calls here drag in 'isatty', which isn't
>>> provided by my nvptx newlib at present, so we get, for example:
>>
>> fputs/fputc/vfprintf/exit/stderr have been in use by error.c already before,
>> so this must be the fwrite call.
>
> ACK.
>
>> The above is for Fortran which doesn't have zero terminated strings.
>> Initially I wanted to use instead ... encountered: %.*s", (int) msglen, 
>> stderr);
>> which doesn't handle > 2GB messages, but with offloading who cares, nobody
>> sane would be trying to print > 2GB messages from offloading regions.
>
> (... likewise from the host...)  ;-)
>
>> The question is if it should be achieved through copy of error.c in
>> config/nvptx/, or just include_next there with say fwrite redefined as a
>> macro that does fprintf ("%.*s", (int) msglen, msg, file)?
>
> (Right, that was also my plan.)
>
> | Ah, I just re-discovered 'libgomp/config/nvptx/error.c' -- I'll cook
> | something up.
>
> So, guess what this newlib 'printf ("%.*s", [...]);' prints?
> Yes: literal '%.*s'...  Next try: a 'fputc' loop?

Did that; "works".  But actually, I think that's good enough for the
intended purpose: there's not much point in optimizing the OpenMP 'error'
directive as long as we still have more than enough real
correctness/performance tasks to be worked on.

Tobias suggested using 'fputc_unlocked', "avoiding repreated locks and
locking for a single stderr char is also pointless", but it's not clear
to me if that's safe to do given that a ton of threads may be hammering
on this in parallel; it's not clear to me if there isn't any
newlib-internal state that needs to be accessed in a serialized way (even
if no actual 'FILE *stream' is involved here)?


>>> permissible to use the 'error' directive also inside 'target' regions, as
>>> far as I can tell?
>>
>> !$omp error at(execution) message('whatever')
>> can be used in offloading regions.

(Also should add test cases for OpenMP 'error' with 'at (execution)' from
deep inside parallelized loop nests, etc., offloaded and non-offloaded?)


> Yes, generally works, but at least for Fortran, 'severity (fatal)' seems
> to cause a hang, so another thing to be looked into...

We thus additionally acquired in 'libgomp/config/nvptx/error.c':

+/* The 'exit (EXIT_FAILURE);' of an Fortran (only, huh?) OpenMP 'error'
+   directive with 'severity (fatal)' causes a hang, so 'abort' instead of
+   'exit'.  */
+#undef exit
+#define exit(status) abort ()

... which is another thing to be resolved incrementally.  (Plus adding
corresponding test cases for OpenMP 'error' with 'at (execution)' and
'severity (fatal)' inside OpenMP 'target'.)


Is the attached "Make the OpenMP 'error' directive work for nvptx
offloading" OK to push for now?


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
>From 0762945a17c3ff1a0268edc76f87c0063714a0fc Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Fri, 20 Aug 2021 15:12:56 +0200
Subject: [PATCH] Make the OpenMP 'error' directive work for nvptx offloading

... and add a minimum amount of offloading testing.

(Leaving aside that 'fwrite' to 'stderr' probably wouldn't work anyway) the
'fwrite' calls in 'libgomp/error.c:GOMP_warning', 'libgomp/error.c:GOMP_error'
drag in 'isatty', which isn't provided by my nvptx newlib build at present, so
we get, for example:

[...]
FAIL: l

fix latent bootstrap-debug issue (modref, tree-inline, tree jump-threading)

2021-08-20 Thread Alexandre Oliva



I've hit a bootstrap-debug error involving large subprograms in
gcc/ada/sem_ch12.adb.  I'm afraid I couldn't narrow it down to a
reasonable testcase.

thread1 made different decisions about a block containing a
builtin_eh_filter call because in one compilation, estimate_num_insns
found a cgraph_node for the builtin and could thus get to the
is_simple_builtin test, but in the other it didn't.  With different
insn counts, one stage jump-threaded and the other didn't, and the
resulting code diverged quite a bit.

The reason the builtin had a cgraph_node in one case but not the other
was that modref got a chance to analyze the builtin call when it was
the first stmt in the block, and that created the cgraph_node.
However, when it was preceded by debug stmts, the loop in
analyze_function was cut short after the first debug stmt, because the
summary so far was not useful.

This patch fixes both issues: skip debug stmts in the analyze_function
loop, so as to prevent them from affecting any decisions in the loop,
and enable the insn count estimator to get to the is_simple_builtin
test when a cgraph_node has not been created for the builtin.

Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* ipa-modref.c (analyze_function): Skip debug stmts.
* tree-inline.c (estimate_num_insn): Consider builtins even
without a cgraph_node.
---
 gcc/ipa-modref.c  |3 ++-
 gcc/tree-inline.c |4 ++--
 2 files changed, 4 insertions(+), 3 deletions(-)

diff --git a/gcc/ipa-modref.c b/gcc/ipa-modref.c
index fafd804d4bae4..f0cddbb077aaa 100644
--- a/gcc/ipa-modref.c
+++ b/gcc/ipa-modref.c
@@ -2108,7 +2108,8 @@ analyze_function (function *f, bool ipa)
   FOR_EACH_BB_FN (bb, f)
 {
   gimple_stmt_iterator si;
-  for (si = gsi_after_labels (bb); !gsi_end_p (si); gsi_next (&si))
+  for (si = gsi_start_nondebug_after_labels_bb (bb);
+  !gsi_end_p (si); gsi_next_nondebug (&si))
{
  if (!analyze_stmt (summary, summary_lto,
 gsi_stmt (si), ipa, &recursive_calls)
diff --git a/gcc/tree-inline.c b/gcc/tree-inline.c
index d0e9f52d5f138..636130fe0019e 100644
--- a/gcc/tree-inline.c
+++ b/gcc/tree-inline.c
@@ -4436,8 +4436,8 @@ estimate_num_insns (gimple *stmt, eni_weights *weights)
/* Do not special case builtins where we see the body.
   This just confuse inliner.  */
struct cgraph_node *node;
-   if (!(node = cgraph_node::get (decl))
-   || node->definition)
+   if ((node = cgraph_node::get (decl))
+   && node->definition)
  ;
/* For buitins that are likely expanded to nothing or
   inlined do not account operand costs.  */


-- 
Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about

[PATCH] c++: Fix cp_tree_equal for template value args using dependent sizeof/alignof/noexcept expressions

2021-08-20 Thread Barrett Adair via Gcc-patches

This patch fixes AST comparison for trailing return types using dependent
sizeof/alignof/noexcept expressions as template value arguments. I believe
this bug is over a decade old, hailing from GCC 4.6. I found it over 5
years ago and sat on the repro until I had time to fix it myself.

The new test cases demonstrate redeclarations mistakenly parsed as
ambiguous overloads. These fail with gcc, but pass with clang. The last
test case concerns a GNU extension for alignof (hence the -Wno-pedantic).
After applying this patch, bootstrap succeeds, the new tests pass, and my
native "make -k check" outputs look essentially similar to these based on
the same commit:
https://gcc.gnu.org/pipermail/gcc-testresults/2021-August/715032.html
(except I only waited for a few thousand of the libstdc++ tests to pass).

I think it makes sense to now split the switch-case blocks here since there
isn't much shared logic left, and I have a patch to do that if desired, but
I think the patch below is the cleanest for initial review.

I studied the history and churn of the comparing_specializations hacks.
While I am still not entirely certain this is the correct change to make
there, the change seems at least superficially reasonable to me. I haven't
been able to break it yet. Perhaps someone more familiar with this area of
the compiler can weigh in. See also the existing cp_unevaluated_operand
push/pop for these EXPR codes in cp_walk_tree.

I am awaiting a response to the copyright assignment request sent to
ass...@gnu.org.

commit fe64ca143dfb9021b4c47abb6382827c13e1f0cd
Author: Barrett Adair 
Date:   Fri Aug 20 15:37:36 2021 -0500

Fix cp_tree_equal for template value args using
dependent sizeof/alignof/noexcept expressions

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 3c62dd74380..fce2a46cfa8 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -3903,40 +3903,41 @@ cp_tree_equal (tree t1, tree t2)
as being equivalent to anything.  */
  if (VAR_P (o1) && DECL_NAME (o1) == NULL_TREE
 && !DECL_RTL_SET_P (o1))
   /*Nop*/;
  else if (VAR_P (o2) && DECL_NAME (o2) == NULL_TREE
  && !DECL_RTL_SET_P (o2))
   /*Nop*/;
  else if (!cp_tree_equal (o1, o2))
   return false;

  return cp_tree_equal (TREE_OPERAND (t1, 1), TREE_OPERAND (t2, 1));
   }

 case PARM_DECL:
   /* For comparing uses of parameters in late-specified return types
  with an out-of-class definition of the function, but can also come
  up for expressions that involve 'this' in a member function
  template.  */

   if (comparing_specializations
+  && !cp_unevaluated_operand
   && DECL_CONTEXT (t1) != DECL_CONTEXT (t2))
  /* When comparing hash table entries, only an exact match is
good enough; we don't want to replace 'this' with the
version from another function.  But be more flexible
with parameters with identical contexts.  */
  return false;

   if (same_type_p (TREE_TYPE (t1), TREE_TYPE (t2)))
  {
   if (DECL_ARTIFICIAL (t1) ^ DECL_ARTIFICIAL (t2))
 return false;
   if (CONSTRAINT_VAR_P (t1) ^ CONSTRAINT_VAR_P (t2))
 return false;
   if (DECL_ARTIFICIAL (t1)
   || (DECL_PARM_LEVEL (t1) == DECL_PARM_LEVEL (t2)
   && DECL_PARM_INDEX (t1) == DECL_PARM_INDEX (t2)))
 return true;
  }
   return false;

@@ -3974,63 +3975,68 @@ cp_tree_equal (tree t1, tree t2)
   return true;

 case CONSTRAINT_INFO:
   return cp_tree_equal (CI_ASSOCIATED_CONSTRAINTS (t1),
 CI_ASSOCIATED_CONSTRAINTS (t2));

 case CHECK_CONSTR:
   return (CHECK_CONSTR_CONCEPT (t1) == CHECK_CONSTR_CONCEPT (t2)
   && comp_template_args (CHECK_CONSTR_ARGS (t1),
  CHECK_CONSTR_ARGS (t2)));

 case TREE_VEC:
   /* These are template args.  Really we should be getting the
  caller to do this as it knows it to be true.  */
   if (!comp_template_args (t1, t2, NULL, NULL, false))
  return false;
   return true;

 case SIZEOF_EXPR:
 case ALIGNOF_EXPR:
+case NOEXCEPT_EXPR:
   {
  tree o1 = TREE_OPERAND (t1, 0);
  tree o2 = TREE_OPERAND (t2, 0);

  if (code1 == SIZEOF_EXPR)
   {
 if (SIZEOF_EXPR_TYPE_P (t1))
   o1 = TREE_TYPE (o1);
 if (SIZEOF_EXPR_TYPE_P (t2))
   o2 = TREE_TYPE (o2);
   }
- else if (ALIGNOF_EXPR_STD_P (t1) != ALIGNOF_EXPR_STD_P (t2))
+ else if (code1 == ALIGNOF_EXPR
+  && ALIGNOF_EXPR_STD_P (t1) != ALIGNOF_EXPR_STD_P (t2))
   return false;

  if (TREE_CODE (o1) != TREE_CODE (o2))
   return false;

- if (ARGUMENT_PACK_P (o1))
-  return template_args_equal (o1, o2);
- else if (TYPE_P (o1))
-  return same_type_p (o1, o2);
- else
-  return cp_tree_equal (o1, o2);
+ {
+  cp_unevaluated ev;
+  if (code1 == SIZEOF_EXPR && ARGUMENT_PACK_P (o1))
+return template_args_equal (o1, o2);
+  else if (code1 != NOEXCEPT_EXPR && TYPE_P (o1))
+return same_type_p (o1, o2);
+  else
+return cp_tree_equal (o1, o2);
+ }
   }

 case MODOP_EXPR:
   {
  tree t1_op1, t2_op1;

  if (!cp_tree_equal (TREE_OPERAND (t1, 0), TREE_OPERAND (t2,

Re: [committed] openmp: Implement the error directive