date:20200127

Re: [PATCH] calls.c: refactor special_function_p for use by analyzer

2020-01-27 Thread Jakub Jelinek

On Mon, Jan 27, 2020 at 09:09:37PM -0500, David Malcolm wrote:
> > Please see calls.c (special_function_p), you should treat certainly
> > also sigsetjmp as a setjmp call, and similarly to special_function_p,
> > skip over _ or __ prefixes before the setjmp or sigsetjmp name.
> > Similarly for longjmp/siglongjmp.
> 
> This patch refactors some code in special_function_p that checks for
> the function being sane to match by name, splitting it out into a new
> maybe_special_function_p, and using it in two places in the analyzer.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
> OK for master?

Not sure it is worth it to factor out the
  DECL_NAME (fndecl)
  && (DECL_CONTEXT (fndecl) == NULL_TREE
  || TREE_CODE (DECL_CONTEXT (fndecl)) == TRANSLATION_UNIT_DECL)
  && TREE_PUBLIC (fndecl)
check; it seems simple enough that it could be duplicated.
And, even if there is a strong reason not to, at least it ought to be
defined inline in the header: not everyone will use LTO, and without LTO it
will need to be an out-of-line call.
Ack on removing the fndecl && check from special_function_p, the callers
ensure it is non-NULL already, and even if they didn't, after the if (fndecl
&& ...) guarded if there is unconditional dereferencing of fndecl.
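
The name matching discussed above (skipping a leading _ or __ before
comparing against setjmp or sigsetjmp) can be sketched in plain C. This is
a simplified stand-alone sketch, not the code in calls.c: the helper names
skip_underscore_prefix and is_setjmp_like_name are hypothetical, and it
works on strings rather than on a fndecl.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: skip one or two leading underscores, so that
   "_setjmp" and "__sigsetjmp" are compared as "setjmp" and "sigsetjmp".  */
const char *
skip_underscore_prefix (const char *name)
{
  if (name[0] == '_')
    return name + (name[1] == '_' ? 2 : 1);
  return name;
}

/* Hypothetical helper: return true if NAME looks like a setjmp-family
   function once any "_" or "__" prefix is ignored.  */
bool
is_setjmp_like_name (const char *name)
{
  const char *tname = skip_underscore_prefix (name);
  return strcmp (tname, "setjmp") == 0 || strcmp (tname, "sigsetjmp") == 0;
}
```

A real implementation would additionally need the file-scope and
TREE_PUBLIC checks quoted above before trusting the name at all.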

Jakub

Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-27 Thread Alexander Monakov

On Tue, 28 Jan 2020, Uecker, Martin wrote:

> > (*) this also shows the level of "obfuscation" needed to fool compilers
> > to lose provenance knowledge is hard to predict.
> 
> Well, this is exactly the problem we want to address by defining
> a clear way to do this. Casting to an integer would be the way
> to state: "consider the pointer as escaped and forget the
> provenance" and casting an integer to a pointer would
> mean "this pointer may point to all objects whose pointer has
> escaped". This would give the programmer explicit control about
> this aspect and make most existing code using pointer-to-integer
> casts well-defined. At the same time, this should be simple
> to add to existing points-to analysis.

Can you explain why you make it required for the compiler to treat the
points-to set unnecessarily broader than it could prove? In the Matlab
example, there's a simple chain of computations that the compiler is
following to prove that the pointer resulting from the final cast is
derived from exactly one other pointer (no other pointers have
participated in the computations).

Or, in other words:

is there an example where a programmer can distinguish between the
requirement you explain above vs. the fine-grained interpretation
that GIMPLE aims to implement (at least as I understand it), which is:

  when the program creates a pointer by means of non-pointer computations
  (casts, representation access, etc), the resulting pointer may point to:

* any object whose address could have participated in the computation
  (which is at worst the entire set of "exposed" objects up to that
   program point, but can be much narrower if the compiler can see
   the entire chain of computations)

* any object which is not "exposed" but could have a known address, e.g.
  because it is placed at a specific address during linking
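
To make the two readings concrete, here is a minimal sketch (not taken from
the Matlab testcase; the function names are hypothetical). Only one address
ever takes part in the integer computation. Under the fine-grained reading
the resulting pointer can only point to that object; under the coarser rule
it may point to any object whose address has been converted to an integer so
far. Both readings give this program the same behaviour, so it does not
settle the question; it only shows where the two points-to sets would
differ.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical example: P is converted to an integer, the integer is
   scrambled and unscrambled, and the result is converted back.  No other
   address takes part in the computation.  */
int
read_through_integer (int *p)
{
  const uintptr_t key = (uintptr_t) 0x5a5a5a5a;
  uintptr_t bits = (uintptr_t) p;   /* the pointed-to object is now "exposed" */
  bits ^= key;
  bits ^= key;                      /* bits holds the original value again */
  int *q = (int *) bits;
  return *q;
}

/* Two objects; only X's address is ever converted to an integer.  */
int
demo (void)
{
  int x = 1;
  int y = 2;
  int result = read_through_integer (&x);
  return result + y;
}
```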

Thanks.
Alexander

Re: [PATCH] i386: Don't use ix86_tune_ctrl_string in parse_mtune_ctrl_str

2020-01-27 Thread Uros Bizjak

On Mon, Jan 27, 2020 at 3:13 PM H.J. Lu  wrote:
>
> There are
>
> static void
> parse_mtune_ctrl_str (bool dump)
> {
>   if (!ix86_tune_ctrl_string)
> return;
>
> parse_mtune_ctrl_str is only called from set_ix86_tune_features, which
> is only called from ix86_function_specific_restore and
> ix86_option_override_internal.  parse_mtune_ctrl_str shouldn't use
> ix86_tune_ctrl_string which is defined with global_options.  Instead,
> opts should be passed to parse_mtune_ctrl_str.
>
> PR target/91399
> * config/i386/i386-options.c (set_ix86_tune_features): Add an
> argument of a pointer to struct gcc_options and pass it to
> parse_mtune_ctrl_str.
> (ix86_function_specific_restore): Pass opts to
> set_ix86_tune_features.
> (ix86_option_override_internal): Likewise.
> (parse_mtune_ctrl_str): Add an argument of a pointer to struct
> gcc_options and use it for x_ix86_tune_ctrl_string.

Rubberstamp OK. I don't know this part well, but looks somehow obvious.
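
The shape of the change can be sketched outside GCC as follows. This is a
stand-alone illustration only: struct options and global_opts are
hypothetical stand-ins for struct gcc_options and global_options, and the
functions merely measure the string instead of parsing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct gcc_options.  */
struct options
{
  const char *tune_ctrl_string;
};

/* Hypothetical stand-in for global_options.  */
struct options global_opts = { "global_feature" };

/* Before: always reads the global, whatever the caller is working on.  */
size_t
ctrl_length_from_global (void)
{
  if (!global_opts.tune_ctrl_string)
    return 0;
  return strlen (global_opts.tune_ctrl_string);
}

/* After: reads the option set the caller passes in.  */
size_t
ctrl_length_from_opts (const struct options *opts)
{
  if (!opts->tune_ctrl_string)
    return 0;
  return strlen (opts->tune_ctrl_string);
}
```

With the second form, a caller that restores per-function options gets the
string from those options rather than from whatever the global currently
holds.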

Thanks,
Uros.

> ---
>  gcc/config/i386/i386-options.c | 18 ++
>  1 file changed, 10 insertions(+), 8 deletions(-)
>
> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 2acc9fb0cfe..e0be4932534 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -741,7 +741,8 @@ ix86_option_override_internal (bool main_args_p,
>struct gcc_options *opts,
>struct gcc_options *opts_set);
>  static void
> -set_ix86_tune_features (enum processor_type ix86_tune, bool dump);
> +set_ix86_tune_features (struct gcc_options *opts,
> +   enum processor_type ix86_tune, bool dump);
>
>  /* Restore the current options */
>
> @@ -810,7 +811,7 @@ ix86_function_specific_restore (struct gcc_options *opts,
>
>/* Recreate the tune optimization tests */
>if (old_tune != ix86_tune)
> -set_ix86_tune_features (ix86_tune, false);
> +set_ix86_tune_features (opts, ix86_tune, false);
>  }
>
>  /* Adjust target options after streaming them in.  This is mainly about
> @@ -1538,13 +1539,13 @@ ix86_parse_stringop_strategy_string (char 
> *strategy_str, bool is_memset)
> print the features that are explicitly set.  */
>
>  static void
> -parse_mtune_ctrl_str (bool dump)
> +parse_mtune_ctrl_str (struct gcc_options *opts, bool dump)
>  {
> -  if (!ix86_tune_ctrl_string)
> +  if (!opts->x_ix86_tune_ctrl_string)
>  return;
>
>char *next_feature_string = NULL;
> -  char *curr_feature_string = xstrdup (ix86_tune_ctrl_string);
> +  char *curr_feature_string = xstrdup (opts->x_ix86_tune_ctrl_string);
>char *orig = curr_feature_string;
>int i;
>do
> @@ -1583,7 +1584,8 @@ parse_mtune_ctrl_str (bool dump)
> processor type.  */
>
>  static void
> -set_ix86_tune_features (enum processor_type ix86_tune, bool dump)
> +set_ix86_tune_features (struct gcc_options *opts,
> +   enum processor_type ix86_tune, bool dump)
>  {
>unsigned HOST_WIDE_INT ix86_tune_mask = HOST_WIDE_INT_1U << ix86_tune;
>int i;
> @@ -1605,7 +1607,7 @@ set_ix86_tune_features (enum processor_type ix86_tune, 
> bool dump)
>   ix86_tune_features[i] ? "on" : "off");
>  }
>
> -  parse_mtune_ctrl_str (dump);
> +  parse_mtune_ctrl_str (opts, dump);
>  }
>
>
> @@ -2364,7 +2366,7 @@ ix86_option_override_internal (bool main_args_p,
>XDELETEVEC (s);
>  }
>
> -  set_ix86_tune_features (ix86_tune, opts->x_ix86_dump_tunes);
> +  set_ix86_tune_features (opts, ix86_tune, opts->x_ix86_dump_tunes);
>
>ix86_recompute_optlev_based_flags (opts, opts_set);
>
> --
> 2.24.1
>

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread Uros Bizjak

On Mon, Jan 27, 2020 at 11:17 PM H.J. Lu  wrote:
>
> On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
> >
> > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> > >
> > > > movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > for TARGET_AVX.
> > >
> > > gcc/
> > >
> > > PR target/91461
> > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > TARGET_AVX.
> > > * config/i386/i386.md (*movoi_internal_avx): Remove
> > > TARGET_SSE_TYPELESS_STORES check.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/91461
> > > * gcc.target/i386/pr91461-1.c: New test.
> > > * gcc.target/i386/pr91461-2.c: Likewise.
> > > * gcc.target/i386/pr91461-3.c: Likewise.
> > > * gcc.target/i386/pr91461-4.c: Likewise.
> > > * gcc.target/i386/pr91461-5.c: Likewise.
> > > ---
> > >  gcc/config/i386/i386.h|  4 +-
> > >  gcc/config/i386/i386.md   |  4 +-
> > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
> > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > >
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index 943e9a5c783..c134b04c5c4 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -516,8 +516,10 @@ extern unsigned char 
> > > ix86_tune_features[X86_TUNE_LAST];
> > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > >  #define TARGET_SSE_SPLIT_REGS  
> > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
> > > +   isn't the case for AVX nor AVX512.  */
> > >  #define TARGET_SSE_TYPELESS_STORES \
> > > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > +   (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> >
> > This is wrong place to disable the feature.
>
> Like this?

No.

There is a mode attribute in i386.md/sse.md for relevant patterns.
Please adapt calculation of mode attributes instead.

Uros.

> diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
> index 2acc9fb0cfe..639969d736d 100644
> --- a/gcc/config/i386/i386-options.c
> +++ b/gcc/config/i386/i386-options.c
> @@ -1597,6 +1597,11 @@ set_ix86_tune_features (enum processor_type
> ix86_tune, bool dump)
>  = !!(initial_ix86_tune_features[i] & ix86_tune_mask);
>  }
>
> +  /* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
> + isn't the case for AVX nor AVX512.  */
> +  if (TARGET_AVX)
> +ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES] = 0;
> +
>if (dump)
>  {
>fprintf (stderr, "List of x86 specific tuning parameter names:\n");
>
>
> --
> H.J.

[PATCH, libphobos] Fix compilation dependencies on s390x-linux-musl

2020-01-27 Thread Mathias Lang

Hi,

This patch fixes GDC on s390x-linux-musl targets. It was specifically
tested under Alpine Linux (see
https://gitlab.alpinelinux.org/alpine/aports/commit/c123e0f14ab73976a36c651d47d134f249413f29
).
The patch fixes two issues: First, Musl always provides
`__tls_get_addr`, so we can always use it to get the TLS range instead
of the internal function (which is glibc-specific).
Second, druntime provides an ASM implementation of
`fiber_switchContext` for most platforms under
libphobos/libdruntime/config/$ARCH/switchcontext.S, and defaults to
`swapcontext` when one is not available, which is the case on s390x.
However, the configure script did not check for `swapcontext` being
present, as it's part of glibc, but not Musl (there is a libucontext
available on Alpine for this); that check is added here.
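
For readers unfamiliar with the fallback, here is a minimal C sketch of a
`swapcontext`-based context switch. It is not druntime's code; the names
fiber_body and run_fiber_demo are hypothetical, and it assumes a libc that
provides the ucontext functions (on Musl this would presumably need linking
against libucontext, which is what the configure check looks for).

```c
#include <assert.h>
#include <ucontext.h>

static ucontext_t main_ctx, fiber_ctx;
static int steps;
static char fiber_stack[64 * 1024];

/* Runs on its own stack; yields once, then finishes.  */
static void
fiber_body (void)
{
  steps = 1;
  swapcontext (&fiber_ctx, &main_ctx);   /* yield back to the caller */
  steps = 2;
  /* Returning resumes the context named by uc_link.  */
}

/* Returns 12 if the fiber ran to its yield point and then to completion.  */
int
run_fiber_demo (void)
{
  steps = 0;
  if (getcontext (&fiber_ctx) != 0)
    return -1;
  fiber_ctx.uc_stack.ss_sp = fiber_stack;
  fiber_ctx.uc_stack.ss_size = sizeof fiber_stack;
  fiber_ctx.uc_link = &main_ctx;
  makecontext (&fiber_ctx, fiber_body, 0);

  if (swapcontext (&main_ctx, &fiber_ctx) != 0)   /* run until the yield */
    return -1;
  int after_first = steps;

  if (swapcontext (&main_ctx, &fiber_ctx) != 0)   /* resume until the end */
    return -1;
  return after_first * 10 + steps;
}
```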

@Iain: Any chance those could be backported to v9 ?

---
Mathias Lang
---
libphobos/ChangeLog:

* libdruntime/gcc/sections/elf_shared.d: Always use
__tls_get_addr on Musl.
* configure.ac: Search libraries for swapcontext when
LIBDRUNTIME_NEEDS_UCONTEXT is yes.
* configure.tgt: Set LIBDRUNTIME_NEEDS_UCONTEXT on s390*-linux*.
* configure: Regenerate.
---
diff -Nurp a/libphobos/libdruntime/gcc/sections/elf_shared.d
b/libphobos/libdruntime/gcc/sections/elf_shared.d
--- a/libphobos/libdruntime/gcc/sections/elf_shared.d
+++ b/libphobos/libdruntime/gcc/sections/elf_shared.d
@@ -1084,7 +1084,9 @@ void[] getTLSRange(size_t mod, size_t sz) nothrow @nogc

 // base offset
 auto ti = tls_index(mod, 0);
-version (IBMZ_Any)
+version (CRuntime_Musl)
+return (__tls_get_addr()-TLS_DTV_OFFSET)[0 .. sz];
+else version (IBMZ_Any)
 {
 auto idx = cast(void *)__tls_get_addr_internal()
 + cast(ulong)__builtin_thread_pointer();
diff -Nurp a/libphobos/configure.ac b/libphobos/configure.ac
--- a/libphobos/configure.ac
+++ b/libphobos/configure.ac
@@ -140,6 +140,14 @@ case ${host} in
 esac
 AC_MSG_RESULT($LIBPHOBOS_SUPPORTED)

+AC_MSG_CHECKING([if target needs to link in swapcontext])
+AC_MSG_RESULT($LIBDRUNTIME_NEEDS_UCONTEXT)
+AS_IF([test "x$LIBDRUNTIME_NEEDS_UCONTEXT" = xyes], [
+   AC_SEARCH_LIBS([swapcontext], [c ucontext], [], [
+   AC_MSG_ERROR([[can't find library providing swapcontext]])
+  ])
+])
+
 # Decide if it's usable.
 case $LIBPHOBOS_SUPPORTED:$enable_libphobos in
 *:no)  use_libphobos=no  ;;
diff -Nurp a/libphobos/configure.tgt b/libphobos/configure.tgt
--- a/libphobos/configure.tgt
+++ b/libphobos/configure.tgt
@@ -22,6 +22,13 @@
 # Disable the libphobos or libdruntime components on untested or known
 # broken systems.  More targets shall be added after testing.
 LIBPHOBOS_SUPPORTED=no
+
+# Check if we require 'ucontext' or if we have a custom solution.
+# Most platform uses a custom assembly solution for context switches,
+# see `core.thread` and grep for `AsmExternal`.
+# Definitions are in config/ARCH/
+LIBDRUNTIME_NEEDS_UCONTEXT=no
+
 case "${target}" in
   aarch64*-*-linux*)
LIBPHOBOS_SUPPORTED=yes
@@ -37,6 +44,7 @@ case "${target}" in
;;
   s390*-linux*)
LIBPHOBOS_SUPPORTED=yes
+   LIBDRUNTIME_NEEDS_UCONTEXT=yes
;;
   x86_64-*-kfreebsd*-gnu | i?86-*-kfreebsd*-gnu)
LIBPHOBOS_SUPPORTED=yes

[PATCH] calls.c: refactor special_function_p for use by analyzer

2020-01-27 Thread David Malcolm

On Wed, 2020-01-22 at 20:40 +0100, Jakub Jelinek wrote:
> On Wed, Jan 22, 2020 at 02:35:13PM -0500, David Malcolm wrote:
> > PR analyzer/93316 reports various testsuite failures where I
> > accidentally relied on properties of x86_64-pc-linux-gnu.
> > 
> > The following patch fixes them on sparc-sun-solaris2.11 (gcc211 in
> > the
> > GCC compile farm), and, I hope, the other configurations showing
> > failures.
> > 
> > There may still be other failures for pattern-test-2.c, which I'm
> > tracking separately as PR analyzer/93291.
> > 
> > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
> > tested on stage 1 on sparc-sun-solaris2.11.
> > 
> > gcc/analyzer/ChangeLog:
> > PR analyzer/93316
> > * analyzer.cc (is_setjmp_call_p): Check for "setjmp" as well as
> > "_setjmp".
> 
> Please see calls.c (special_function_p), you should treat certainly
> also sigsetjmp as a setjmp call, and similarly to special_function_p,
> skip over _ or __ prefixes before the setjmp or sigsetjmp name.
> Similarly for longjmp/siglongjmp.
> 
>   Jakub

This patch refactors some code in special_function_p that checks for
the function being sane to match by name, splitting it out into a new
maybe_special_function_p, and using it in two places in the analyzer.

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
OK for master?

Thanks
Dave

gcc/analyzer/ChangeLog:
* analyzer.cc (is_named_call_p): Replace tests for fndecl being
extern at file scope and having a non-NULL DECL_NAME with a call
to maybe_special_function_p.
* function-set.cc (function_set::contains_decl_p): Add call to
maybe_special_function_p.

gcc/ChangeLog:
* calls.c (maybe_special_function_p): New function, splitting out
the check for DECL_NAME being non-NULL and fndecl being extern at
file scope from...
(special_function_p): ...here.  Drop check for fndecl being
non-NULL that was after a usage of DECL_NAME (fndecl).
* tree.h (maybe_special_function_p): New decl.
---
 gcc/analyzer/analyzer.cc | 10 +
 gcc/analyzer/function-set.cc |  2 ++
 gcc/calls.c  | 40 +---
 gcc/tree.h   |  2 ++
 4 files changed, 33 insertions(+), 21 deletions(-)

diff --git a/gcc/analyzer/analyzer.cc b/gcc/analyzer/analyzer.cc
index 1b5e4c9ecf8..5cf745ea632 100644
--- a/gcc/analyzer/analyzer.cc
+++ b/gcc/analyzer/analyzer.cc
@@ -65,18 +65,10 @@ is_named_call_p (tree fndecl, const char *funcname)
   gcc_assert (fndecl);
   gcc_assert (funcname);
 
-  /* Exclude functions not at the file scope, or not `extern',
- since they are not the magic functions we would otherwise
- think they are.  */
-  if (!((DECL_CONTEXT (fndecl) == NULL_TREE
-|| TREE_CODE (DECL_CONTEXT (fndecl)) == TRANSLATION_UNIT_DECL)
-   && TREE_PUBLIC (fndecl)))
+  if (!maybe_special_function_p (fndecl))
 return false;
 
   tree identifier = DECL_NAME (fndecl);
-  if (identifier == NULL)
-return false;
-
   const char *name = IDENTIFIER_POINTER (identifier);
   const char *tname = name;
 
diff --git a/gcc/analyzer/function-set.cc b/gcc/analyzer/function-set.cc
index 6ed15ae95ad..1b6b5d9f9c1 100644
--- a/gcc/analyzer/function-set.cc
+++ b/gcc/analyzer/function-set.cc
@@ -59,6 +59,8 @@ bool
 function_set::contains_decl_p (tree fndecl) const
 {
   gcc_assert (fndecl && DECL_P (fndecl));
+  if (!maybe_special_function_p (fndecl))
+return false;
   return contains_name_p (IDENTIFIER_POINTER (DECL_NAME (fndecl)));
 }
 
diff --git a/gcc/calls.c b/gcc/calls.c
index 1336f49ea5e..447a36c3707 100644
--- a/gcc/calls.c
+++ b/gcc/calls.c
@@ -572,22 +572,18 @@ emit_call_1 (rtx funexp, tree fntree ATTRIBUTE_UNUSED, 
tree fndecl ATTRIBUTE_UNU
 anti_adjust_stack (gen_int_mode (n_popped, Pmode));
 }
 
-/* Determine if the function identified by FNDECL is one with
-   special properties we wish to know about.  Modify FLAGS accordingly.
-
-   For example, if the function might return more than one time (setjmp), then
-   set ECF_RETURNS_TWICE.
+/* Determine if the function identified by FNDECL is one that
+   makes sense to match by name, for those places where we detect
+   "magic" functions by name.
 
-   Set ECF_MAY_BE_ALLOCA for any memory allocation function that might allocate
-   space from the stack such as alloca.  */
+   Return true if FNDECL has a name and is an extern fndecl at file scope.
+   FNDECL must be a non-NULL decl.  */
 
-static int
-special_function_p (const_tree fndecl, int flags)
+bool
+maybe_special_function_p (const_tree fndecl)
 {
   tree name_decl = DECL_NAME (fndecl);
-
-  if (fndecl && name_decl
-  && IDENTIFIER_LENGTH (name_decl) <= 11
+  if (name_decl
   /* Exclude functions not at the file scope, or not `extern',
 since they are not the magic functions we would otherwise
 think they are.
@@ -598,6 +594,26 @@ special_function_p (const_tree fndecl, int flags)

[committed] analyzer: fix ICE when canonicalizing NaN (PR 93451)

2020-01-27 Thread David Malcolm

PR analyzer/93451 reports an ICE when canonicalizing the constants
in a region_model, with a failed qsort_chk when attempting to sort
the constants within the region_model.

The svalues in the model were:
  sv0: {poisoned: uninit}
  sv1: {type: ‘double’, ‘0.0’}
  sv2: {type: ‘double’, ‘1.0e+0’}
  sv3: {type: ‘double’, ‘ Nan’}

The qsort_chk of the 3 constants fails due to tree_cmp using the
LT_EXPR ordering of the REAL_CSTs, which doesn't work for NaN.

This patch adjusts tree_cmp to impose an arbitrary ordering during
canonicalization for UNORDERED_EXPR cases w/o relying on the LT_EXPR
ordering, fixing the ICE.
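
The same idea can be shown on plain doubles. The sketch below is an
analogy, not the tree_cmp code: it uses the C library's isnan and signbit
instead of real_isnan and real_isneg, it does not distinguish signaling
from quiet NaNs, and the names cmp_double_for_sort and sort_doubles are
hypothetical. The point is that a comparator built only on "<" reports NaN
as equal to everything, which is not a consistent ordering for qsort.

```c
#include <assert.h>
#include <math.h>
#include <stdlib.h>

/* qsort comparator that stays consistent when NaNs are present.  */
int
cmp_double_for_sort (const void *pa, const void *pb)
{
  double a = *(const double *) pa;
  double b = *(const double *) pb;
  int a_nan = isnan (a) != 0;
  int b_nan = isnan (b) != 0;

  if (a_nan || b_nan)
    {
      /* Unordered case: impose an arbitrary order.  Non-NaNs sort
         before NaNs...  */
      if (a_nan != b_nan)
        return a_nan - b_nan;
      /* ...and two NaNs are ordered by their sign bit.  */
      return (signbit (a) != 0) - (signbit (b) != 0);
    }
  if (a < b)
    return -1;
  if (a > b)
    return 1;
  return 0;
}

void
sort_doubles (double *v, size_t n)
{
  qsort (v, n, sizeof *v, cmp_double_for_sort);
}
```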

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
committed to master as r10-6271-g8c08c983015e675f555d57a30e15d918abef2b93.

gcc/analyzer/ChangeLog:
PR analyzer/93451
* region-model.cc (tree_cmp): For the REAL_CST case, impose an
arbitrary order on NaNs relative to other NaNs and to non-NaNs;
const-correctness tweak.
(ana::selftests::build_real_cst_from_string): New function.
(ana::selftests::append_interesting_constants): New function.
(ana::selftests::test_tree_cmp_on_constants): New test.
(ana::selftests::test_canonicalization_4): New test.
(ana::selftests::analyzer_region_model_cc_tests): Call the new
tests.

gcc/testsuite/ChangeLog:
PR analyzer/93451
* gcc.dg/analyzer/torture/pr93451.c: New test.
---
 gcc/analyzer/region-model.cc  | 90 ++-
 .../gcc.dg/analyzer/torture/pr93451.c | 14 +++
 2 files changed, 101 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/analyzer/torture/pr93451.c

diff --git a/gcc/analyzer/region-model.cc b/gcc/analyzer/region-model.cc
index a986549b597..62c96a6ceea 100644
--- a/gcc/analyzer/region-model.cc
+++ b/gcc/analyzer/region-model.cc
@@ -1811,11 +1811,22 @@ tree_cmp (const_tree t1, const_tree t2)
 
 case REAL_CST:
   {
-   real_value *rv1 = TREE_REAL_CST_PTR (t1);
-   real_value *rv2 = TREE_REAL_CST_PTR (t2);
+   const real_value *rv1 = TREE_REAL_CST_PTR (t1);
+   const real_value *rv2 = TREE_REAL_CST_PTR (t2);
+   if (real_compare (UNORDERED_EXPR, rv1, rv2))
+ {
+   /* Impose an arbitrary order on NaNs relative to other NaNs
+  and to non-NaNs.  */
+   if (int cmp_isnan = real_isnan (rv1) - real_isnan (rv2))
+ return cmp_isnan;
+   if (int cmp_issignaling_nan
+ = real_issignaling_nan (rv1) - real_issignaling_nan (rv2))
+ return cmp_issignaling_nan;
+   return real_isneg (rv1) - real_isneg (rv2);
+ }
if (real_compare (LT_EXPR, rv1, rv2))
  return -1;
-   if (real_compare (LT_EXPR, rv2, rv1))
+   if (real_compare (GT_EXPR, rv1, rv2))
  return 1;
return 0;
   }
@@ -6927,6 +6938,58 @@ namespace ana {
 
 namespace selftest {
 
+/* Build a constant tree of the given type from STR.  */
+
+static tree
+build_real_cst_from_string (tree type, const char *str)
+{
+  REAL_VALUE_TYPE real;
+  real_from_string (&real, str);
+  return build_real (type, real);
+}
+
+/* Append various "interesting" constants to OUT (e.g. NaN).  */
+
+static void
+append_interesting_constants (auto_vec<tree> *out)
+{
+  out->safe_push (build_int_cst (integer_type_node, 0));
+  out->safe_push (build_int_cst (integer_type_node, 42));
+  out->safe_push (build_int_cst (unsigned_type_node, 0));
+  out->safe_push (build_int_cst (unsigned_type_node, 42));
+  out->safe_push (build_real_cst_from_string (float_type_node, "QNaN"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "-QNaN"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "SNaN"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "-SNaN"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "0.0"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "-0.0"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "Inf"));
+  out->safe_push (build_real_cst_from_string (float_type_node, "-Inf"));
+}
+
+/* Verify that tree_cmp is a well-behaved comparator for qsort, even
+   if the underlying constants aren't comparable.  */
+
+static void
+test_tree_cmp_on_constants ()
+{
+  auto_vec<tree> csts;
+  append_interesting_constants (&csts);
+
+  /* Try sorting every triple. */
+  const unsigned num = csts.length ();
+  for (unsigned i = 0; i < num; i++)
+for (unsigned j = 0; j < num; j++)
+  for (unsigned k = 0; k < num; k++)
+   {
+ auto_vec<tree> v (3);
+ v.quick_push (csts[i]);
+ v.quick_push (csts[j]);
+ v.quick_push (csts[k]);
+ v.qsort (tree_cmp);
+   }
+}
+
 /* Implementation detail of the ASSERT_CONDITION_* macros.  */
 
 void
@@ -7577,6 +7640,25 @@ test_canonicalization_3 ()
   ASSERT_EQ (model0, model1);
 }
 
+/* Verify that we can canonicalize a model containing NaN and other real
+

Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-27 Thread Uecker, Martin


Hi Richard,

thank you for your response. 


Am Montag, den 27.01.2020, 15:42 +0100 schrieb Richard Biener:
> On Fri, Jan 24, 2020 at 12:46 AM Uecker, Martin
>  wrote:
> > 
> > Am Donnerstag, den 23.01.2020, 14:18 +0100 schrieb Richard Biener:
> > > On Wed, Jan 22, 2020 at 12:40 PM Martin Sebor  wrote:
> > > > 
> > > > On 1/22/20 8:32 AM, Richard Biener wrote:
> > > > > On Tue, 21 Jan 2020, Alexander Monakov wrote:
> > > > > 
> > > > > > On Tue, 21 Jan 2020, Richard Biener wrote:
> > > > > > 
> > > > > > > Fourth.  That PNVI (I assume it's the whole pointer-provenance 
> > > > > > > stuff)
> > > > > > > wants to get the "best" of both which can never be done since a 
> > > > > > > compiler
> > > > > > > needs to have a way to be conservative - in this area it's 
> > > > > > > conflicting
> > > > > > > conservative treatment which is impossible.
> > > > > > 
> > > > > > This paragraph is unclear, I don't immediately see what the 
> > > > > > conflicting goals
> > > > > > are. The rest is clear enough given the previous discussions I saw.
> > > > > > 
> > > > > > Did you mean the restriction that you cannot do arithmetic 
> > > > > > involving two
> > > > > > integers based on pointers, get a value corresponding to one of 
> > > > > > them,
> > > > > > cast it back and get a pointer suitable for accessing either of two
> > > > > > originally pointed-to objects? I don't see that as a conflict 
> > > > > > because
> > > > > > it places a restriction on users, not the compiler.
> > > > > 
> > > > > As far as I remember the discussions PNVI requires to track
> > > > > provenance for correctness, you may not miss or attach wrong 
> > > > > provenance
> > > > > to a pointer and there's only "single" provenance, not "many"
> > > > > (aka, may point to A and B).  I don't see how you can ever implement 
> > > > > that.
> > 
> > I have no idea how you came to that conclusion. PNVI is perfectly
> > compatible with a naive compiler who does not track provenance at
> > all as well as an abstract machine that actually carries run-time
> > provenance around with each pointer and checks every operation.
> > It was designed specifically to allow both cases and everything
> > in between (especially compilers who track provenance during
> > compile time but the programs then do not track provenance at
> > run-time).
> > 
> > You may be confused by the abstract formulation that indeed
> > assigns a single provenance to each pointer. A compiler would
> > track its *knowledge about provenance*, which would be a set
> > of possible targets.
> 
> Well, the question is whether PNVI allows the compiler to put any
> additional restriction on what the provenance of an integer is.  It
> appears not, so any attempt to track provenance through integers
> is doomed until the cases are very simple.  I'm not sure that's desirable (*).

PNVI makes something which is now implementation-defined
have well-defined semantics. This is not possible to do
without putting additional restrictions on the compiler and
this will make some optimizations non-compliant. In both
committees (C and C++) there is a consensus that the
rules about provenance should be made clear and explicit.
So far, this is the only well-developed proposal.



> > > Well, PNVI limits optimization opportunities of GCC which currently
> > > _does_ track provenance through integers and thus only allows
> > > a very limited set(*) of "unrelated" pointers to appear here (documented
> > > is that none are, implementation details differ from version to version).
> > > 
> > > There are no optimization "opportunities" by making pointer <-> integer
> > > conversions lose information.
> > 
> > You are right: It is meant to constrain optimizations.
> > 
> > The reasoning behind this is that currently all compilers behave
> > inconsistently and this is not terribly useful to anybody.
> > 
> > At the same time, there does not appear to
> > be any reasonable way how integers can have provenance.
> > Any rules we came up with really got complicated and are
> > also fundamentally at odds with the usual mathematical
> > properties of integers one would naively expect.
> 
> So the original point where GCC started to track provenance through
> non-pointers (PNVI should really be PNVNP since I guess tracking
> provenance through floats isn't to be done either ;)) was a testcase
> showing that Matlab (IIRC) generated C code funneled pointers through
> a pair of floats (obviously a single float isn't enough for 64bit pointers 
> ...)
> and that prevents a good deal of optimization due to missed alias analysis.
> I fixed that and now GCC happily tracks provenance through a pair of
> floats ...

Yes, this is impressive. And yes, the proposal would
forbid this (i.e. doing aliasing analysis based on the provenance
tracked through integers).

PNVI would make the Matlab code well-defined (under some
conditions), but break the optimization based on tracking
provenance through floats. But as the

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu

On Mon, Jan 27, 2020 at 2:17 PM H.J. Lu  wrote:
>
> On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
> >
> > On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> > >
> > > > movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> > > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > > for TARGET_AVX.
> > >
> > > gcc/
> > >
> > > PR target/91461
> > > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > > TARGET_AVX.
> > > * config/i386/i386.md (*movoi_internal_avx): Remove
> > > TARGET_SSE_TYPELESS_STORES check.
> > >
> > > gcc/testsuite/
> > >
> > > PR target/91461
> > > * gcc.target/i386/pr91461-1.c: New test.
> > > * gcc.target/i386/pr91461-2.c: Likewise.
> > > * gcc.target/i386/pr91461-3.c: Likewise.
> > > * gcc.target/i386/pr91461-4.c: Likewise.
> > > * gcc.target/i386/pr91461-5.c: Likewise.
> > > ---
> > >  gcc/config/i386/i386.h|  4 +-
> > >  gcc/config/i386/i386.md   |  4 +-
> > >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> > >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> > >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
> > >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> > >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> > >  7 files changed, 203 insertions(+), 4 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> > >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> > >
> > > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > > index 943e9a5c783..c134b04c5c4 100644
> > > --- a/gcc/config/i386/i386.h
> > > +++ b/gcc/config/i386/i386.h
> > > @@ -516,8 +516,10 @@ extern unsigned char 
> > > ix86_tune_features[X86_TUNE_LAST];
> > >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> > >  #define TARGET_SSE_SPLIT_REGS  
> > > ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > > +/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
> > > +   isn't the case for AVX nor AVX512.  */
> > >  #define TARGET_SSE_TYPELESS_STORES \
> > > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > > +   (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
> >
> > This is wrong place to disable the feature.
>

Here is the updated patch on top of

https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01742.html

so that set_ix86_tune_features can access per function setting.

OK for master branch?

Thanks.

-- 
H.J.
From 61482a7d4dff07075f2534840040bafa420e9f36 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Mon, 27 Jan 2020 09:35:11 -0800
Subject: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
for TARGET_AVX and adjust vmovups checks in assembly outputs.

gcc/

	PR target/91461
	* config/i386/i386-options.c (set_ix86_tune_features): Disable
	TARGET_SSE_TYPELESS_STORES for TARGET_AVX.
	* config/i386/i386.md (*movoi_internal_avx): Remove
	TARGET_SSE_TYPELESS_STORES check.

gcc/testsuite/

	PR target/91461
	* gcc.target/i386/avx256-unaligned-store-3.c: Don't check
	vmovups.
	* gcc.target/i386/pieces-memcpy-4.c: Likewise.
	* gcc.target/i386/pieces-memcpy-5.c: Likewise.
	* gcc.target/i386/pieces-memcpy-6.c: Likewise.
	* gcc.target/i386/pieces-strcpy-2.c: Likewise.
	* gcc.target/i386/pr90980-1.c: Likewise.
	* gcc.target/i386/pr87317-4.c: Check "\tvmovd\t" instead of
	"vmovd" to avoid matching "vmovdqu".
	* gcc.target/i386/pr87317-5.c: Likewise.
	* gcc.target/i386/pr87317-7.c: Likewise.
	* gcc.target/i386/pr91461-1.c: New test.
	* gcc.target/i386/pr91461-2.c: Likewise.
	* gcc.target/i386/pr91461-3.c: Likewise.
	* gcc.target/i386/pr91461-4.c: Likewise.
	* gcc.target/i386/pr91461-5.c: Likewise.
	* gcc.target/i386/pr91461-6.c: Likewise.
---
 gcc/config/i386/i386-options.c|  5 ++
 gcc/config/i386/i386.md   |  4 +-
 .../i386/avx256-unaligned-store-3.c   |  4 +-
 .../gcc.target/i386/pieces-memcpy-4.c |  3 +-
 .../gcc.target/i386/pieces-memcpy-5.c |  3 +-
 .../gcc.target/i386/pieces-memcpy-6.c |  3 +-
 .../gcc.target/i386/pieces-strcpy-2.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-4.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-5.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr87317-7.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr90980-1.c |  2 +-
 gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
 gcc/testsuite/gcc.target/i386/pr91461-2.c | 19

[PATCH 4/4] SRA: Do not ignore padding when totally scalarizing [PR92486]

2020-01-27 Thread Martin Jambor

Hi,

PR 92486 shows that DSE, when seeing a "normal" gimple aggregate
assignment coming from a C struct assignment and one representing a
folded memcpy, can kill the latter and keep in place only the former,
which does not copy padding - at least when SRA decides to totally
scalarize at least one of the aggregates (when not doing total
scalarization, SRA cares about padding).
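
To make the padding issue concrete, here is a minimal sketch in plain C.
The struct and the function names are hypothetical and not taken from
the PR; the sketch only assumes an ABI where int is more strictly
aligned than char.  A field-by-field copy, which is roughly what a
totally scalarized assignment amounts to, leaves the padding bytes of
the destination untouched, while a byte-wise copy transfers them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical example type: on most ABIs there are padding bytes
   between C and I so that I is suitably aligned.  */
struct padded { char c; int i; };

/* Number of bytes in the struct that belong to no field.  */
size_t
padding_bytes (void)
{
  return sizeof (struct padded) - sizeof (char) - sizeof (int);
}

/* Field-by-field copy: padding in *DST keeps whatever it held before.  */
void
copy_fields (struct padded *dst, const struct padded *src)
{
  dst->c = src->c;
  dst->i = src->i;
}

/* Byte-wise copy: every byte, padding included, is transferred.  */
void
copy_bytes (struct padded *dst, const struct padded *src)
{
  memcpy (dst, src, sizeof *dst);
}

/* Return 1 if a byte-wise copy reproduces every byte of the source.  */
int
bytewise_copy_is_exact (void)
{
  struct padded src, dst;
  memset (&src, 0xAA, sizeof src);
  memset (&dst, 0x55, sizeof dst);
  copy_bytes (&dst, &src);
  return memcmp (&dst, &src, sizeof src) == 0;
}
```

Code that later inspects the object with memcmp or through a may_alias
type can tell the two copies apart, which is why dropping the
memcpy-like assignment matters.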

SRA would not totally scalarize an aggregate if it saw that it takes
part in a gimple assignment which is a folded memcpy (see how
type_changing_p is set in contains_vce_or_bfcref_p) but it never sees
that assignment because DSE has already removed it.

I was asked to modify SRA to take padding into account - and to copy
it around - when totally scalarizing, which is what the patch below
does.  I am not very happy about this, I am afraid it will lead to
performance regressions, but this has all been discussed (see
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01185.html and
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00218.html).

I tried to alleviate the problem by not only inserting accesses for
padding but also by enlarging existing accesses whenever possible to
extend over padding - the extended access would get copied when in the
original IL an aggregate copy is replaced with SRA copies and a
BIT_FIELD_REF would be generated to replace a scalar access to a part
of the aggregate in the original IL.  I have made it work in the sense
that the patch passed bootstrap and testing (see the git branch
refs/users/jamborm/heads/sra-later_total-bfr-20200127 or look at
https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;a=shortlog;h=refs/users/jamborm/heads/sra-later_total-bfr-20200127
if you are interested), but this approach meant that each such
extended replacement which was written to (so all of them) could
potentially become only partially assigned to and so had to be marked
as addressable and could not become gimple register, meaning that
total scalarization would be creating addressable variables.  Detecting
such cases is not easy; it would mean introducing yet another type of
write flag (written to exactly this access) and propagating that flag
across assignment sub-accesses.

So I decided that was not the way to go and instead only extended
integer accesses, and that is what the patch below does.  Like in the
previous attempt, whatever padding could not be covered by extending
an access would be covered by extra artificial accesses.  As you can
see, it adds a little complexity to various places of the pass which
are already not trivial, but hopefully it is manageable.
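
The idea of covering padding either by widening an integer access or
by adding an artificial one can be sketched as follows.  This is a
simplified model under stated assumptions, not the tree-sra.c code:
struct acc, is_int_size and cover_padding are hypothetical names,
sizes are in bytes, and the rule "an integer access may be widened
only if the widened size is 1, 2, 4 or 8 bytes" is an assumption made
for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct acc
{
  unsigned offset, size;  /* in bytes */
  int is_integer;         /* nonzero for integer-typed accesses */
  int artificial;         /* nonzero if added only to cover padding */
};

/* Assumed rule: only these sizes have a matching integer type.  */
static int
is_int_size (unsigned size)
{
  return size == 1 || size == 2 || size == 4 || size == 8;
}

/* IN holds N accesses sorted by offset, non-overlapping, all inside
   [0, TOTAL).  Write a padding-free cover of [0, TOTAL) to OUT, which
   must have room for 2 * N + 1 entries.  Return the number written.  */
size_t
cover_padding (const struct acc *in, size_t n, unsigned total,
               struct acc *out)
{
  size_t m = 0;
  unsigned pos = 0;  /* first byte not yet covered */

  for (size_t k = 0; k < n; k++)
    {
      assert (in[k].offset >= pos);
      if (in[k].offset > pos)
        {
          /* Padding nobody could be widened over: artificial access.  */
          struct acc gap = { pos, in[k].offset - pos, 0, 1 };
          out[m++] = gap;
        }

      struct acc a = in[k];
      unsigned next = k + 1 < n ? in[k + 1].offset : total;
      assert (a.offset + a.size <= next);
      if (a.is_integer && is_int_size (next - a.offset))
        a.size = next - a.offset;  /* widen over the padding behind it */
      out[m++] = a;
      pos = a.offset + a.size;
    }

  if (pos < total)
    {
      struct acc gap = { pos, total - pos, 0, 1 };
      out[m++] = gap;
    }
  return m;
}
```

For a char at offset 0 followed by non-integer accesses at offsets 4
and 12 in a 16-byte aggregate, the char is widened to four bytes and
the hole at offset 8 gets an artificial access.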

Bootstrapped and tested on x86_64-linux, I am curious about the
feedback.

Thanks,

Martin


2020-01-27  Martin Jambor  

PR tree-optimization/92486
* tree-sra.c: Include langhooks.h
(struct access): New fields reg_size and reg_acc_type.
(dump_access): Print new fields.
(acc_size): New function.
(find_access_in_subtree): Use it, new parameter reg.
(get_var_base_offset_size_access): Pass true to
find_access_in_subtree.
(create_access_1): Initialize reg_size.
(create_artificial_child_access): Likewise.
(create_total_scalarization_access): Likewise.
(build_ref_for_model): Do not use model expr if reg_acc_size is
non-NULL.
(get_reg_access_replacement): New function.
(verify_sra_access_forest): Adjust verification for presence of
extended accesses covering padding.
(analyze_access_subtree): Undo extension over padding if total
scalarization failed, set grp_partial_lhs if we are going to introduce
a partial store to the new replacement, do not ignore holes when
totally scalarizing.
(sra_type_for_size): New function.
(total_scalarization_fill_padding): Likewise.
(total_should_skip_creating_access): Use it.
(totally_scalarize_subtree): Likewise.
(sra_modify_expr): Use get_reg_access_replacement instead of
get_access_replacement, adjust for reg_acc_type.
(sra_modify_assign): Likewise.
(load_assign_lhs_subreplacements): Pass false to
find_access_in_subtree.

testsuite/
* gcc.dg/tree-ssa/pr92486.c: New test.
---
 gcc/ChangeLog   |  32 +++
 gcc/testsuite/ChangeLog |   5 +
 gcc/testsuite/gcc.dg/tree-ssa/pr92486.c |  38 +++
 gcc/tree-sra.c  | 368 +---
 4 files changed, 396 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr92486.c

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 3541c6638f9..34c60e6f2a3 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,35 @@
+2020-01-26  Martin Jambor  
+
+   PR tree-optimization/92486
+   * tree-sra.c: Include langhooks.h
+   (struct access): New fields reg_size and reg_acc_type.
+   (dump_access): Print new fields.
+   (acc_size): New function.
+   (find_access_in_subtree): U

[PATCH 2/4] SRA: Total scalarization after access propagation [PR92706]

2020-01-27 Thread Martin Jambor

Hi,

this patch fixes the second testcase in PR 92706 by performing total
scalarization only quite a bit later, when we already have access
trees constructed and even done propagation of accesses from RHSs of
assignment to LHSs.

The new code simultaneously traverses the existing access tree and the
declared variable type and adds artificial accesses whenever they can
fit in between the existing ones.  This prevents us from creating
accesses based on the type which then clash with those which have
propagated here from another access tree describing an aggregate on a
RHS of an assignment, which means that both sides of the assignment
will be scalarized differently, leading to bad code and the aggregate
most likely not going away.
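
As a rough illustration of "adding accesses only where they fit in
between the existing ones", the following sketch uses a deliberately
simplified rule: a type-derived candidate is accepted only if it does
not overlap any existing access at all.  The names struct range,
fits_between and add_type_accesses are hypothetical, and the real
traversal in tree-sra.c works on access trees rather than flat arrays.

```c
#include <assert.h>
#include <stddef.h>

struct range { unsigned offset, size; };  /* in bits */

/* Return 1 if CAND overlaps none of the N EXISTING ranges.  */
int
fits_between (const struct range *existing, size_t n, struct range cand)
{
  assert (cand.size > 0);
  unsigned c_end = cand.offset + cand.size;
  for (size_t k = 0; k < n; k++)
    {
      unsigned e_end = existing[k].offset + existing[k].size;
      if (cand.offset < e_end && existing[k].offset < c_end)
        return 0;  /* would clash with an existing access */
    }
  return 1;
}

/* Copy to OUT those of the NF type-derived FIELDS that fit between the
   N EXISTING accesses.  OUT needs room for NF entries.  Return the
   number of accepted fields.  */
size_t
add_type_accesses (const struct range *existing, size_t n,
                   const struct range *fields, size_t nf,
                   struct range *out)
{
  size_t m = 0;
  for (size_t k = 0; k < nf; k++)
    if (fits_between (existing, n, fields[k]))
      out[m++] = fields[k];
  return m;
}
```

With an existing 64-bit access at offset 0 (as when struct S from the
testcase is accessed through my_int64) and two 32-bit fields from the
type, no field is added under this rule, so both sides of an
assignment keep the same single access.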

This new version is hopefully slightly easier to read and review and
also fixed one potential bug, but otherwise does pretty much the same
thing as the first one.

Bootstrapped and LTO-bootstrapped and tested on x86_64-linux, where
it causes two new guality XPASSes.  I expect that review will lead to
requests to change things but provided we want to fix PR 92706 now, I
believe this is the way to go.  The fix for PR 92486 which I am
sending as a follow-up also depends on this patch.

Thanks,

Martin

2019-12-20  Martin Jambor  

PR tree-optimization/92706
* tree-sra.c (struct access): Adjust comment of
grp_total_scalarization.
(find_access_in_subtree): Look for single children spanning an entire
access.
(scalarizable_type_p): Allow register accesses, adjust callers.
(completely_scalarize): Remove function.
(scalarize_elem): Likewise.
(create_total_scalarization_access): Likewise.
(sort_and_splice_var_accesses): Do not track total scalarization
flags.
(analyze_access_subtree): New parameter totally, adjust to new meaning
of grp_total_scalarization.
(analyze_access_trees): Pass new parameter to analyze_access_subtree.
(can_totally_scalarize_forest_p): New function.
(create_total_scalarization_access): Likewise.
(create_total_access_and_reshape): Likewise.
(total_should_skip_creating_access): Likewise.
(totally_scalarize_subtree): Likewise.
(analyze_all_variable_accesses): Perform total scalarization after
subaccess propagation using the new functions above.
(initialize_constant_pool_replacements): Output initializers by
traversing the access tree.

testsuite/
* gcc.dg/tree-ssa/pr92706-2.c: New test.
* gcc.dg/guality/pr59776.c: Xfail tests for s2.g.
---
 gcc/testsuite/gcc.dg/guality/pr59776.c|   4 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr92706-2.c |  19 +
 gcc/tree-sra.c| 666 --
 3 files changed, 505 insertions(+), 184 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr92706-2.c

diff --git a/gcc/testsuite/gcc.dg/guality/pr59776.c 
b/gcc/testsuite/gcc.dg/guality/pr59776.c
index 382abb622bb..6c1c8165b70 100644
--- a/gcc/testsuite/gcc.dg/guality/pr59776.c
+++ b/gcc/testsuite/gcc.dg/guality/pr59776.c
@@ -12,11 +12,11 @@ foo (struct S *p)
   struct S s1, s2; /* { dg-final { gdb-test pr59776.c:17 
"s1.f" "5.0" } } */
   s1 = *p; /* { dg-final { gdb-test pr59776.c:17 
"s1.g" "6.0" } } */
   s2 = s1; /* { dg-final { gdb-test pr59776.c:17 
"s2.f" "0.0" } } */
-  *(int *)  = 0;  /* { dg-final { gdb-test pr59776.c:17 
"s2.g" "6.0" } } */
+  *(int *)  = 0;  /* { dg-final { gdb-test pr59776.c:17 
"s2.g" "6.0" { xfail *-*-* } } } */
   asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s1.f" "5.0" } } */
   asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s1.g" "6.0" } } */
   s2 = s1; /* { dg-final { gdb-test pr59776.c:20 
"s2.f" "5.0" } } */
-  asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s2.g" "6.0" } } */
+  asm volatile (NOP : : : "memory");   /* { dg-final { gdb-test pr59776.c:20 
"s2.g" "6.0" { xfail *-*-* } } } */
   asm volatile (NOP : : : "memory");
 }
 
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92706-2.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-2.c
new file mode 100644
index 000..37ab9765db0
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-esra" } */
+
+typedef __UINT64_TYPE__ uint64_t;
+typedef __UINT32_TYPE__ uint32_t;
+struct S { uint32_t i[2]; } __attribute__((aligned(__alignof__(uint64_t))));
+typedef uint64_t my_int64 __attribute__((may_alias));
+uint64_t load (void *p)
+{
+  struct S u, v, w;
+  uint64_t tem;
+  tem = *(my_int64 *)p;
+  *(my_int64 *)&v = tem;
+  u = v;
+  w = u;
+  return *(my_int64 *)&w;
+}
+
+/* { dg-final { scan-tree-dump "Created a replacement for v" "esra" } } */
diff --git

[PATCH 3/4] SRA: Also propagate accesses from LHS to RHS [PR92706]

2020-01-27 Thread Martin Jambor

Hi,

the previous patch unfortunately does not fix the first testcase in PR
92706 and since I am afraid it might be the important one, I also
focused on that.  The issue here is again total scalarization accesses
clashing with those representing accesses in the IL - on another
aggregate but here the sides are switched.  Whereas in the previous
case the total scalarization accesses prevented propagation along
assignments, here we have the user accesses on the LHS, so even though
we do not create anything there, we totally scalarize the RHS and
again end up with assignments with different scalarizations leading to
bad code.

So we clearly need to propagate information about accesses from LHSs
to RHSs too, which the patch below does.  Because the intent is only
to prevent bad total scalarization - which the previous patch now
performs late enough - and we do not care about the grp_write flag and
so forth, the propagation is a bit simpler and so I did not try to
unify all of the code for both directions.

More information and some discussion is in the thread from the initial
submission, the code has not changed in any (substantial) way.  See
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01184.html and
https://gcc.gnu.org/ml/gcc-patches/2020-01/msg00698.html.

Bootstrapped and tested on x86_64-linux.

Thanks,

Martin

2019-12-11  Martin Jambor  

PR tree-optimization/92706
* tree-sra.c (struct access): Fields first_link, last_link,
next_queued and grp_queued renamed to first_rhs_link, last_rhs_link,
next_rhs_queued and grp_rhs_queued respectively, new fields
first_lhs_link, last_lhs_link, next_lhs_queued and grp_lhs_queued.
(struct assign_link): Field next renamed to next_rhs, new field
next_lhs.  Updated comment.
(work_queue_head): Renamed to rhs_work_queue_head.
(lhs_work_queue_head): New variable.
(add_link_to_lhs): New function.
(relink_to_new_repr): Also relink LHS lists.
(add_access_to_work_queue): Renamed to add_access_to_rhs_work_queue.
(add_access_to_lhs_work_queue): New function.
(pop_access_from_work_queue): Renamed to
pop_access_from_rhs_work_queue.
(pop_access_from_lhs_work_queue): New function.
(build_accesses_from_assign): Also add links to LHS lists and to LHS
work_queue.
(child_would_conflict_in_lacc): Renamed to
child_would_conflict_in_acc.  Adjusted parameter names.
(create_artificial_child_access): New parameter set_grp_read, use it.
(subtree_mark_written_and_enqueue): Renamed to
subtree_mark_written_and_rhs_enqueue.
(propagate_subaccesses_across_link): Renamed to
propagate_subaccesses_from_rhs.
(propagate_subaccesses_from_lhs): New function.
(propagate_all_subaccesses): Also propagate subaccesses from LHSs to
RHSs.

testsuite/
* gcc.dg/tree-ssa/pr92706-1.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c |  17 ++
 gcc/tree-sra.c| 306 --
 2 files changed, 248 insertions(+), 75 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
new file mode 100644
index 000..c36d103798e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
@@ -0,0 +1,17 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-esra-details" } */
+
+struct S { int i[4]; } __attribute__((aligned(128)));
+typedef __int128_t my_int128 __attribute__((may_alias));
+__int128_t load (void *p)
+{
+  struct S v;
+  __builtin_memcpy (&v, p, sizeof (struct S));
+  struct S u;
+  u = v;
+  struct S w;
+  w = u;
+  return *(my_int128 *)&w;
+}
+
+/* { dg-final { scan-tree-dump-not "Created a replacement for u offset: 
\[^0\]" "esra" } } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 2b0849858de..ea8594db193 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -167,11 +167,15 @@ struct access
   struct access *next_sibling;
 
   /* Pointers to the first and last element in the linked list of assign
- links.  */
-  struct assign_link *first_link, *last_link;
+ links for propagation from LHS to RHS.  */
+  struct assign_link *first_rhs_link, *last_rhs_link;
 
-  /* Pointer to the next access in the work queue.  */
-  struct access *next_queued;
+  /* Pointers to the first and last element in the linked list of assign
+ links for propagation from LHS to RHS.  */
+  struct assign_link *first_lhs_link, *last_lhs_link;
+
+  /* Pointer to the next access in the work queues.  */
+  struct access *next_rhs_queued, *next_lhs_queued;
 
   /* Replacement variable for this access "region."  Never to be accessed
  directly, always only by the means of get_access_replacement() and only
@@ -184,8 +188,11 @@ struct access
   /* Is this particular access write access? */
   unsigned write : 1;
 
-  /* Is this access

[PATCH 1/4] SRA: Add verification of accesses

2020-01-27 Thread Martin Jambor

Hi,

this patch has not changed since the last submission at all, in fact
it got approved but without the follow-up fix of the reverse flag, it
would introduce a regression, so it should not be committed on its own.

Because the follow-up patches perform some non-trivial operations on
SRA accesses, I wrote myself a verifier.  And sure enough, it has
spotted two issues, one of which is fixed in this patch too - we did
not correctly set the parent link when creating artificial accesses
for propagation across assignments.  The second one is the (not)
setting of reverse flag when creating accesses for total scalarization
but since the following patch removes the offending function, this
patch does not fix it.
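
For readers who want the invariants in isolation, here is a small
stand-alone sketch of this kind of check on a simplified tree.  It is
not the verifier from the patch: struct node and verify_tree are
hypothetical, the traversal is recursive rather than iterative, and
the containment test compares end offsets, which is slightly stricter
than comparing sizes.

```c
#include <stddef.h>

struct node
{
  unsigned offset, size;
  struct node *parent, *first_child, *next_sibling;
};

/* Return 1 if N, its following siblings and all their descendants
   satisfy the invariants, given that their parent should be PARENT.  */
int
verify_tree (const struct node *n, const struct node *parent)
{
  for (; n; n = n->next_sibling)
    {
      /* Parent links must be set consistently.  */
      if (n->parent != parent)
        return 0;
      /* A child must lie within its parent.  */
      if (parent
          && (n->offset < parent->offset
              || n->offset + n->size > parent->offset + parent->size))
        return 0;
      /* Siblings must be sorted and must not overlap.  */
      if (n->next_sibling
          && n->next_sibling->offset < n->offset + n->size)
        return 0;
      if (!verify_tree (n->first_child, n))
        return 0;
    }
  return 1;
}
```

A child whose parent pointer was never set, which is the kind of bug
described above, makes the check fail at the first comparison.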

Bootstrapped and tested on x86_64, I consider this a pre-requisite for
the followup patches (and the parent link fix really is).

Thanks,

Martin

2019-12-10  Martin Jambor  

* tree-sra.c (verify_sra_access_forest): New function.
(verify_all_sra_access_forests): Likewise.
(create_artificial_child_access): Set parent.
(analyze_all_variable_accesses): Call the verifier.
---
 gcc/tree-sra.c | 86 ++
 1 file changed, 86 insertions(+)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 875d5b21763..36106fecaf1 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -2321,6 +2321,88 @@ build_access_trees (struct access *access)
   return true;
 }
 
+/* Traverse the access forest where ROOT is the first root and verify that
+   various important invariants hold true.  */
+
+DEBUG_FUNCTION void
+verify_sra_access_forest (struct access *root)
+{
+  struct access *access = root;
+  tree first_base = root->base;
+  gcc_assert (DECL_P (first_base));
+  do
+{
+  gcc_assert (access->base == first_base);
+  if (access->parent)
+   gcc_assert (access->offset >= access->parent->offset
+   && access->size <= access->parent->size);
+  if (access->next_sibling)
+   gcc_assert (access->next_sibling->offset
+   >= access->offset + access->size);
+
+  poly_int64 poffset, psize, pmax_size;
+  bool reverse;
+  tree base = get_ref_base_and_extent (access->expr, &poffset, &psize,
+  &pmax_size, &reverse);
+  HOST_WIDE_INT offset, size, max_size;
+  if (!poffset.is_constant (&offset)
+ || !psize.is_constant (&size)
+ || !pmax_size.is_constant (&max_size))
+   gcc_unreachable ();
+  gcc_assert (base == first_base);
+  gcc_assert (offset == access->offset);
+  gcc_assert (access->grp_unscalarizable_region
+ || size == max_size);
+  gcc_assert (max_size == access->size);
+  gcc_assert (reverse == access->reverse);
+
+  if (access->first_child)
+   {
+ gcc_assert (access->first_child->parent == access);
+ access = access->first_child;
+   }
+  else if (access->next_sibling)
+   {
+ gcc_assert (access->next_sibling->parent == access->parent);
+ access = access->next_sibling;
+   }
+  else
+   {
+ while (access->parent && !access->next_sibling)
+   access = access->parent;
+ if (access->next_sibling)
+   access = access->next_sibling;
+ else
+   {
+ gcc_assert (access == root);
+ root = root->next_grp;
+ access = root;
+   }
+   }
+}
+  while (access);
+}
+
+/* Verify access forests of all candidates with accesses by calling
+   verify_access_forest on each on them.  */
+
+DEBUG_FUNCTION void
+verify_all_sra_access_forests (void)
+{
+  bitmap_iterator bi;
+  unsigned i;
+  EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
+{
+  tree var = candidate (i);
+  struct access *access = get_first_repr_for_decl (var);
+  if (access)
+   {
+ gcc_assert (access->base == var);
+ verify_sra_access_forest (access);
+   }
+}
+}
+
 /* Return true if expr contains some ARRAY_REFs into a variable bounded
array.  */
 
@@ -2566,6 +2648,7 @@ create_artificial_child_access (struct access *parent, 
struct access *model,
   access->offset = new_offset;
   access->size = model->size;
   access->type = model->type;
+  access->parent = parent;
   access->grp_write = set_grp_write;
   access->grp_read = false;
   access->reverse = model->reverse;
@@ -2850,6 +2933,9 @@ analyze_all_variable_accesses (void)
 
   propagate_all_subaccesses ();
 
+  if (flag_checking)
+verify_all_sra_access_forests ();
+
   bitmap_copy (tmp, candidate_bitmap);
   EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
 {
-- 
2.24.1

Re: [PATCH] i386: Fix ix86_fold_builtin shift folding [PR93418]

2020-01-27 Thread Jeff Law

On Tue, 2020-01-28 at 00:41 +0100, Jakub Jelinek wrote:
> Hi!
> 
> The following testcase is miscompiled, because the variable shift left
> operand, { -1, -1, -1, -1 } is represented as a VECTOR_CST with
> VECTOR_CST_NPATTERNS 1 and VECTOR_CST_NELTS_PER_PATTERN 1, so when
> we call builder.new_unary_operation, builder.encoded_nelts () will be just 1
> and thus we encode the resulting vector as if all the elements were the
> same.
> For non-masked is_vshift, we could perhaps call builder.new_binary_operation
> (TREE_TYPE (args[0]), args[0], args[1], false), but then there are masked
> shifts, for non-is_vshift we could perhaps call it too but with args[2]
> instead of args[1], but there is no builder.new_ternary_operation.
> All this stuff is primarily for aarch64 anyway, on x86 we don't have any
> variable length vectors, and it is not a big deal to compute all elements
> and just let builder.finalize () find the most efficient VECTOR_CST
> representation of the vector.  So, instead of doing too much, this just
> keeps using new_unary_operation only if only one VECTOR_CST is involved
> (i.e. non-masked shift by constant) and for the rest just compute all elts.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2020-01-28  Jakub Jelinek  
> 
>   PR target/93418
>   * config/i386/i386.c (ix86_fold_builtin) : If mask is not
>   -1 or is_vshift is true, use new_vector with number of elts npatterns
>   rather than new_unary_operation.
> 
>   * gcc.target/i386/avx2-pr93418.c: New test.
OK.
Jeff

>

Re: [PATCH] gimple-fold: Fix buffer overflow in fold_array_ctor_reference [PR93454]

2020-01-27 Thread Jeff Law

On Tue, 2020-01-28 at 00:33 +0100, Jakub Jelinek wrote:
> Hi!
> 
> libgcrypt FAILs to build on aarch64-linux with
> *** stack smashing detected ***: terminated
> when gcc is compiled with -D_FORTIFY_SOURCE=2.  The problem is if
> fold_array_ctor_reference is called with size equal to or very close to
> MAX_BITSIZE_MODE_ANY_MODE bits and non-zero inner_offset.
> The first native_encode_expr is called with that inner_offset and bufoff 0,
> the subsequent ones with offset of 0, and bufoff elt_size - inner_offset,
> 2 * elt_size - inner_offset etc.  So, e.g. on the testcase where we start
> with inner_offset 1 and size is e.g. 256 bytes and elt_size 4 bytes
> we then call native_encode_expr at bufoff 251 and then 255, but that one
> overwrites 3 bytes beyond the buf array.
> The following patch fixes that.  In addition, it avoids calling
> elt_size.to_uhwi () all the time, and punts if elt_sz would be too large.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2020-01-28  Jakub Jelinek  
> 
>   PR tree-optimization/93454
>   * gimple-fold.c (fold_array_ctor_reference): Perform
>   elt_size.to_uhwi () just once, instead of calling it in every
>   iteration.  Punt if that value is above size of the temporary
>   buffer.  Decrease third native_encode_expr argument when
>   bufoff + elt_sz is above size of buf.
> 
>   * gcc.dg/pr93454.c: New test.
OK
jeff
>

Re: [PATCH] libiberty/hashtab: More const parameters

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 22:32 +, Andrew Burgess wrote:
> I know that the tree's currently closed to non-bugfix changes, but I
> was hoping this might be accepted anyway so it can be backported to
> binutils-gdb.
> 
> ---
> 
> Makes some parameters const in libiberty's hashtab library.
> 
> include/ChangeLog:
> 
> * hashtab.h (htab_remove_elt): Make a parameter const.
> (htab_remove_elt_with_hash): Likewise.
> 
> libiberty/ChangeLog:
> 
> * hashtab.c (htab_remove_elt): Make a parameter const.
> (htab_remove_elt_with_hash): Likewise.
OK
jeff
>

[patch, fortran] PR93473 - ICE on valid with long module + submodule names

2020-01-27 Thread Andrew Benson

I created PR93473 for this problem: The following code causes a bogus "symbol 
is already defined" error (using git commit 
472dc648ce3e7661762931d584d239611ddca964):

module aModestlyLongModuleName
  
  type :: aTypeWithASignificantlyLongNameButStillAllowedOK
  end type aTypeWithASignificantlyLongNameButStillAllowedOK
  
  interface
     module function aFunctionWithALongButStillAllowedName(parameters) result(self)
   type(aTypeWithASignificantlyLongNameButStillAllowedOK) :: self
 end function aFunctionWithALongButStillAllowedName
  end interface
  
end module aModestlyLongModuleName

submodule (aModestlyLongModuleName) aTypeWithASignificantlyLongNameButStillAllowedOK_

contains

  module procedure aFunctionWithALongButStillAllowedName
 class(*), pointer :: genericObject
  end procedure aFunctionWithALongButStillAllowedName

end submodule aTypeWithASignificantlyLongNameButStillAllowedOK_

submodule (aModestlyLongModuleName:aTypeWithASignificantlyLongNameButStillAllowedOK_) aSubmoduleWithASignificantlyLongButStillAllowedName__
end submodule aSubmoduleWithASignificantlyLongButStillAllowedName__



$ gfortran -v 
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data001/abenson/Galacticus/Tools_Devel_Install/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure --prefix=/home/abenson/Galacticus/Tools_Devel --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200127 (experimental) (GCC) 


$ gfortran -c 1057.F90 -o test.o  -ffree-line-length-none
f951: internal compiler error: Segmentation fault
0xe1021f crash_signal
../../gcc-git/gcc/toplev.c:328
0x7fd1480c91ef ???
/data001/abenson/Galacticus/Tools/glibc-2.12.1/signal/../sysdeps/unix/sysv/linux/x86_64/sigaction.c:0
0x891106 do_traverse_symtree
../../gcc-git/gcc/fortran/symbol.c:4173
0x85739b parse_module
../../gcc-git/gcc/fortran/parse.c:6111
0x85782d gfc_parse_file()
../../gcc-git/gcc/fortran/parse.c:6427
0x8a7f2f gfc_be_parse_file
../../gcc-git/gcc/fortran/f95-lang.c:210
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See <https://gcc.gnu.org/bugs/> for instructions.


The problem occurs in set_syms_host_assoc() where the "parent1" and "parent2" 
variables have a maximum length of GFC_MAX_SYMBOL_LEN+1. This is insufficient 
when the parent names are a module+submodule name concatenated with a ".". The 
attached patch fixes this by increasing their length to 2*GFC_MAX_SYMBOL_LEN+2.

The patch regression tests cleanly - ok to 
commit?
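
The buffer arithmetic can be checked with a small stand-alone sketch.
MAX_SYMBOL_LEN and join_names are hypothetical stand-ins, and the
value 63 for the limit is an assumption of this example.  Two names of
maximum length plus the "." separator plus the terminating NUL need
2 * MAX_SYMBOL_LEN + 2 bytes.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for GFC_MAX_SYMBOL_LEN; the value is assumed here.  */
#define MAX_SYMBOL_LEN 63

/* Write "MODULE.SUBMODULE" to OUT, which has room for OUT_SIZE bytes.
   Return 1 if the whole string fit, 0 if it had to be truncated.  */
int
join_names (char *out, size_t out_size, const char *module,
            const char *submodule)
{
  int needed = snprintf (out, out_size, "%s.%s", module, submodule);
  return needed >= 0 && (size_t) needed < out_size;
}
```

With two 63-character names the joined string is 127 characters long,
so it does not fit in MAX_SYMBOL_LEN + 1 bytes but does fit in
2 * MAX_SYMBOL_LEN + 2.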

-Andrew
diff --git a/gcc/fortran/parse.c b/gcc/fortran/parse.c
index 4bff0c8..cbace25 100644
--- a/gcc/fortran/parse.c
+++ b/gcc/fortran/parse.c
@@ -6045,8 +6045,8 @@ set_syms_host_assoc (gfc_symbol *sym)
 {
   gfc_component *c;
   const char dot[2] = ".";
-  char parent1[GFC_MAX_SYMBOL_LEN + 1];
-  char parent2[GFC_MAX_SYMBOL_LEN + 1];
+  char parent1[2 * GFC_MAX_SYMBOL_LEN + 2];
+  char parent2[2 * GFC_MAX_SYMBOL_LEN + 2];
 
   if (sym == NULL)
 return;
diff --git a/gcc/testsuite/gfortran.dg/pr93473.f90 b/gcc/testsuite/gfortran.dg/pr93473.f90
new file mode 100644
index 000..dda8525
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93473.f90
@@ -0,0 +1,28 @@
+! { dg-do compile }
+! { dg-options "-ffree-line-length-none" }
+! PR fortran/93473
+module aModestlyLongModuleName
+  
+  type :: aTypeWithASignificantlyLongNameButStillAllowedOK
+  end type aTypeWithASignificantlyLongNameButStillAllowedOK
+  
+  interface
+ module function aFunctionWithALongButStillAllowedName(parameters) result(self)
+   type(aTypeWithASignificantlyLongNameButStillAllowedOK) :: self
+ end function aFunctionWithALongButStillAllowedName
+  end interface
+  
+end module aModestlyLongModuleName
+
+submodule (aModestlyLongModuleName) aTypeWithASignificantlyLongNameButStillAllowedOK_
+
+contains
+
+  module procedure aFunctionWithALongButStillAllowedName
+ class(*), pointer :: genericObject
+  end procedure aFunctionWithALongButStillAllowedName
+
+end submodule aTypeWithASignificantlyLongNameButStillAllowedOK_
+
+submodule (aModestlyLongModuleName:aTypeWithASignificantlyLongNameButStillAllowedOK_) aSubmoduleWithASignificantlyLongButStillAllowedName__
+end submodule aSubmoduleWithASignificantlyLongButStillAllowedName__
2020-01-27  Andrew Benson  

	PR fortran/93473
	* parse.c: Increase length of char variables to allow them to hold
	a concatenated module + submodule name.

	* gfortran.dg/pr93473.f90: New test.

[PATCH] i386: Fix ix86_fold_builtin shift folding [PR93418]

2020-01-27 Thread Jakub Jelinek

Hi!

The following testcase is miscompiled, because the variable shift left
operand, { -1, -1, -1, -1 } is represented as a VECTOR_CST with
VECTOR_CST_NPATTERNS 1 and VECTOR_CST_NELTS_PER_PATTERN 1, so when
we call builder.new_unary_operation, builder.encoded_nelts () will be just 1
and thus we encode the resulting vector as if all the elements were the
same.
For non-masked is_vshift, we could perhaps call builder.new_binary_operation
(TREE_TYPE (args[0]), args[0], args[1], false), but then there are masked
shifts, for non-is_vshift we could perhaps call it too but with args[2]
instead of args[1], but there is no builder.new_ternary_operation.
All this stuff is primarily for aarch64 anyway, on x86 we don't have any
variable length vectors, and it is not a big deal to compute all elements
and just let builder.finalize () find the most efficient VECTOR_CST
representation of the vector.  So, instead of doing too much, this just
keeps using new_unary_operation only if only one VECTOR_CST is involved
(i.e. non-masked shift by constant) and for the rest just compute all elts.
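
A scalar model of the per-element shift shows why one encoded element
cannot stand for the whole result here.  This is only a sketch:
sllv_lane and sllv4 are hypothetical helpers that model a variable
logical left shift of 32-bit lanes, where a count of 32 or more,
taken as unsigned, yields 0.

```c
#include <stdint.h>

/* One lane of a variable logical left shift of 32-bit elements.  */
uint32_t
sllv_lane (uint32_t value, uint32_t count)
{
  return count < 32 ? value << count : 0;
}

/* Apply the shift to all four lanes.  */
void
sllv4 (const uint32_t a[4], const uint32_t b[4], uint32_t out[4])
{
  for (int k = 0; k < 4; k++)
    out[k] = sllv_lane (a[k], b[k]);
}
```

For a = { -1, -1, -1, -1 } and counts { 16, 31, -34, 3 } every lane of
the result differs, even though the first operand is a uniform vector
that is encoded with a single element.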

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-01-28  Jakub Jelinek  

PR target/93418
* config/i386/i386.c (ix86_fold_builtin) : If mask is not
-1 or is_vshift is true, use new_vector with number of elts npatterns
rather than new_unary_operation.

* gcc.target/i386/avx2-pr93418.c: New test.

--- gcc/config/i386/i386.c.jj   2020-01-22 09:49:27.375413362 +0100
+++ gcc/config/i386/i386.c  2020-01-27 18:22:34.986577375 +0100
@@ -17278,8 +17278,13 @@ ix86_fold_builtin (tree fndecl, int n_ar
countt = build_int_cst (integer_type_node, count);
}
  tree_vector_builder builder;
- builder.new_unary_operation (TREE_TYPE (args[0]), args[0],
-  false);
+ if (mask != HOST_WIDE_INT_M1U || is_vshift)
+   builder.new_vector (TREE_TYPE (args[0]),
+   TYPE_VECTOR_SUBPARTS (TREE_TYPE (args[0])),
+   1);
+ else
+   builder.new_unary_operation (TREE_TYPE (args[0]), args[0],
+false);
  unsigned int cnt = builder.encoded_nelts ();
  for (unsigned int i = 0; i < cnt; ++i)
{
--- gcc/testsuite/gcc.target/i386/avx2-pr93418.c.jj 2020-01-27 
18:27:53.461799372 +0100
+++ gcc/testsuite/gcc.target/i386/avx2-pr93418.c2020-01-27 
18:26:11.592327685 +0100
@@ -0,0 +1,20 @@
+/* PR target/93418 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx2 -fdump-tree-optimized" } */
+/* { dg-final { scan-tree-dump-not "link_error" "optimized" } } */
+
+#include 
+
+void link_error (void);
+
+void
+foo (void)
+{
+  __m128i a = _mm_set1_epi32 (0xffffffffU);
+  __m128i b = _mm_setr_epi32 (16, 31, -34, 3);
+  __m128i c = _mm_sllv_epi32 (a, b);
+  __v4su d = (__v4su) c;
+  if (d[0] != 0xffff0000U || d[1] != 0x80000000U
+  || d[2] != 0 || d[3] != 0xfffffff8U)
+link_error ();
+}

Jakub

[PATCH] gimple-fold: Fix buffer overflow in fold_array_ctor_reference [PR93454]

2020-01-27 Thread Jakub Jelinek

Hi!

libgcrypt FAILs to build on aarch64-linux with
*** stack smashing detected ***: terminated
when gcc is compiled with -D_FORTIFY_SOURCE=2.  The problem is if
fold_array_ctor_reference is called with size equal to or very close to
MAX_BITSIZE_MODE_ANY_MODE bits and non-zero inner_offset.
The first native_encode_expr is called with that inner_offset and bufoff 0,
the subsequent ones with offset of 0, and bufoff elt_size - inner_offset,
2 * elt_size - inner_offset etc.  So, e.g. on the testcase where we start
with inner_offset 1 and size is e.g. 256 bytes and elt_size 4 bytes
we then call native_encode_expr at bufoff 251 and then 255, but that one
overwrites 3 bytes beyond the buf array.
The following patch fixes that.  In addition, it avoids calling
elt_size.to_uhwi () all the time, and punts if elt_sz would be too large.
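
The arithmetic of the overflow and of the clamp can be reproduced with
a small model.  It is a sketch under assumptions: BUF_SIZE, encode_elt,
fill_clamped and unclamped_overrun are hypothetical, and encode_elt
merely imitates an encoder that writes at most LEN bytes of an
element, skipping its first OFF bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 256

/* Imitate an encoder: write at most LEN bytes of an ELT_SIZE-byte
   element to DST, skipping its first OFF bytes.  Return the number of
   bytes written.  */
static size_t
encode_elt (unsigned char *dst, size_t len, size_t elt_size, size_t off)
{
  size_t n = elt_size - off;
  if (n > len)
    n = len;
  memset (dst, 0xEE, n);
  return n;
}

/* Fill a BUF_SIZE-byte buffer element by element, clamping the length
   passed to the encoder to the room that is left.  Return the total
   number of bytes written.  */
size_t
fill_clamped (unsigned char *buf, size_t elt_size, size_t inner_off)
{
  assert (elt_size > 0 && elt_size <= BUF_SIZE && inner_off < elt_size);
  size_t bufoff = 0;
  while (bufoff < BUF_SIZE)
    {
      size_t room = BUF_SIZE - bufoff;
      size_t len = elt_size < room ? elt_size : room;  /* the clamp */
      bufoff += encode_elt (buf + bufoff, len, elt_size, inner_off);
      inner_off = 0;
    }
  return bufoff;
}

/* How many bytes past the buffer the last write would reach if every
   call were given the full ELT_SIZE.  Pure arithmetic, no writes.  */
size_t
unclamped_overrun (size_t elt_size, size_t inner_off)
{
  size_t bufoff = elt_size - inner_off;
  while (bufoff < BUF_SIZE)
    bufoff += elt_size;
  return bufoff - BUF_SIZE;
}
```

With elt_size 4 and inner_offset 1 the writes start at 0, 3, 7, ...,
251 and 255; unclamped, the last one would end 3 bytes past the
buffer, while the clamped loop stops exactly at 256.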

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2020-01-28  Jakub Jelinek  

PR tree-optimization/93454
* gimple-fold.c (fold_array_ctor_reference): Perform
elt_size.to_uhwi () just once, instead of calling it in every
iteration.  Punt if that value is above size of the temporary
buffer.  Decrease third native_encode_expr argument when
bufoff + elt_sz is above size of buf.

* gcc.dg/pr93454.c: New test.

--- gcc/gimple-fold.c.jj2020-01-12 11:54:36.0 +0100
+++ gcc/gimple-fold.c   2020-01-27 15:54:51.188830178 +0100
@@ -6665,12 +6665,14 @@ fold_array_ctor_reference (tree type, tr
   /* And offset within the access.  */
   inner_offset = offset % (elt_size.to_uhwi () * BITS_PER_UNIT);
 
-  if (size > elt_size.to_uhwi () * BITS_PER_UNIT)
+  unsigned HOST_WIDE_INT elt_sz = elt_size.to_uhwi ();
+  if (size > elt_sz * BITS_PER_UNIT)
 {
   /* native_encode_expr constraints.  */
   if (size > MAX_BITSIZE_MODE_ANY_MODE
  || size % BITS_PER_UNIT != 0
- || inner_offset % BITS_PER_UNIT != 0)
+ || inner_offset % BITS_PER_UNIT != 0
+ || elt_sz > MAX_BITSIZE_MODE_ANY_MODE / BITS_PER_UNIT)
return NULL_TREE;
 
   unsigned ctor_idx;
@@ -6701,10 +6703,11 @@ fold_array_ctor_reference (tree type, tr
   index = wi::umax (index, access_index);
   do
{
- int len = native_encode_expr (val, buf + bufoff,
-   elt_size.to_uhwi (),
+ if (bufoff + elt_sz > sizeof (buf))
+   elt_sz = sizeof (buf) - bufoff;
+ int len = native_encode_expr (val, buf + bufoff, elt_sz,
inner_offset / BITS_PER_UNIT);
- if (len != elt_size - inner_offset / BITS_PER_UNIT)
+ if (len != (int) elt_sz - inner_offset / BITS_PER_UNIT)
return NULL_TREE;
  inner_offset = 0;
  bufoff += len;
--- gcc/testsuite/gcc.dg/pr93454.c.jj   2020-01-27 16:04:22.420430555 +0100
+++ gcc/testsuite/gcc.dg/pr93454.c  2020-01-27 16:03:24.734278795 +0100
@@ -0,0 +1,25 @@
+/* PR tree-optimization/93454 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -g" } */
+
+#if __SIZEOF_INT__ == 4 && __CHAR_BIT__ == 8
+#define A(n) n, n + 0x01010101, n + 0x02020202, n + 0x03030303
+#define B(n) A (n), A (n + 0x04040404), A (n + 0x08080808), A (n + 0x0c0c0c0c)
+#define C(n) B (n), B (n + 0x10101010), B (n + 0x20202020), B (n + 0x30303030)
+#define D(n) C (n), C (n + 0x40404040), C (n + 0x80808080U), C (n + 0xc0c0c0c0U)
+const unsigned int a[64] = { C (0) };
+const unsigned int b[256] = { D (0) };
+const unsigned int c[32] = { B (0), B (0x10101010) };
+const unsigned int d[16] = { B (0) };
+const unsigned int e[8] = { A (0), A (0x04040404) };
+
+void
+foo (void)
+{
+  const unsigned char *s = ((const unsigned char *) a) + 1;
+  const unsigned char *t = ((const unsigned char *) b) + 1;
+  const unsigned char *u = ((const unsigned char *) c) + 1;
+  const unsigned char *v = ((const unsigned char *) d) + 1;
+  const unsigned char *w = ((const unsigned char *) e) + 1;
+}
+#endif

Jakub

Re: [cris-decc0 8/9] cris: Move trivially from cc0 to reg:CC model, removing most optimizations.

2020-01-27 Thread Segher Boessenkool

Hi!

On Wed, Jan 22, 2020 at 07:11:27AM +0100, Hans-Peter Nilsson wrote:
> I intend to put back as many as I find use for, of those
> anonymous patterns in a controlled manner, with self-contained
> test-cases proving their usability, rather than symmetry with
> other instructions and similar addressing modes, which guided
> the original introduction.  I've entered prX to track code
> performance regressions related to this transition, with focus
> on target-side causes and fixes; besides the function prologue
> special-case, there were some checking presence of the bit-test
> (btstq) instruction.

That's PR93372 (not X :-) ).

Do you have any estimate how much removing cc0 this way costs in
performance (or code size, or any other metric)?


Segher

[PATCH] libiberty/hashtab: More const parameters

2020-01-27 Thread Andrew Burgess

I know that the tree's currently closed to non-bugfix changes, but I
was hoping this might be accepted anyway so it can be backported to
binutils-gdb.

---

Makes some parameters const in libiberty's hashtab library.

include/ChangeLog:

* hashtab.h (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.

libiberty/ChangeLog:

* hashtab.c (htab_remove_elt): Make a parameter const.
(htab_remove_elt_with_hash): Likewise.
---
 include/ChangeLog   | 5 +
 include/hashtab.h   | 4 ++--
 libiberty/ChangeLog | 5 +
 libiberty/hashtab.c | 4 ++--
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/include/hashtab.h b/include/hashtab.h
index d94b54c3c41..6cca342b989 100644
--- a/include/hashtab.h
+++ b/include/hashtab.h
@@ -173,8 +173,8 @@ extern void *   htab_find_with_hash (htab_t, const void 
*, hashval_t);
 extern void ** htab_find_slot_with_hash (htab_t, const void *,
  hashval_t, enum insert_option);
 extern void htab_clear_slot (htab_t, void **);
-extern void htab_remove_elt (htab_t, void *);
-extern void htab_remove_elt_with_hash (htab_t, void *, hashval_t);
+extern void htab_remove_elt (htab_t, const void *);
+extern void htab_remove_elt_with_hash (htab_t, const void *, hashval_t);
 
 extern void htab_traverse (htab_t, htab_trav, void *);
 extern void htab_traverse_noresize (htab_t, htab_trav, void *);
diff --git a/libiberty/hashtab.c b/libiberty/hashtab.c
index 26c98ce2d68..225e9e540a7 100644
--- a/libiberty/hashtab.c
+++ b/libiberty/hashtab.c
@@ -709,7 +709,7 @@ htab_find_slot (htab_t htab, const PTR element, enum 
insert_option insert)
element in the hash table, this function does nothing.  */
 
 void
-htab_remove_elt (htab_t htab, PTR element)
+htab_remove_elt (htab_t htab, const PTR element)
 {
   htab_remove_elt_with_hash (htab, element, (*htab->hash_f) (element));
 }
@@ -720,7 +720,7 @@ htab_remove_elt (htab_t htab, PTR element)
function does nothing.  */
 
 void
-htab_remove_elt_with_hash (htab_t htab, PTR element, hashval_t hash)
+htab_remove_elt_with_hash (htab_t htab, const PTR element, hashval_t hash)
 {
   PTR *slot;
 
-- 
2.14.5

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu

On Mon, Jan 27, 2020 at 12:26 PM Uros Bizjak  wrote:
>
> On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
> >
> > movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> > case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> > for TARGET_AVX.
> >
> > gcc/
> >
> > PR target/91461
> > * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> > TARGET_AVX.
> > * config/i386/i386.md (*movoi_internal_avx): Remove
> > TARGET_SSE_TYPELESS_STORES check.
> >
> > gcc/testsuite/
> >
> > PR target/91461
> > * gcc.target/i386/pr91461-1.c: New test.
> > * gcc.target/i386/pr91461-2.c: Likewise.
> > * gcc.target/i386/pr91461-3.c: Likewise.
> > * gcc.target/i386/pr91461-4.c: Likewise.
> > * gcc.target/i386/pr91461-5.c: Likewise.
> > ---
> >  gcc/config/i386/i386.h|  4 +-
> >  gcc/config/i386/i386.md   |  4 +-
> >  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
> >  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
> >  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
> >  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
> >  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
> >  7 files changed, 203 insertions(+), 4 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
> >  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
> >
> > diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> > index 943e9a5c783..c134b04c5c4 100644
> > --- a/gcc/config/i386/i386.h
> > +++ b/gcc/config/i386/i386.h
> > @@ -516,8 +516,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
> >  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> > ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
> >  #define TARGET_SSE_SPLIT_REGS  ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> > +/* NB: movaps/movups is one byte shorter than movdqa/movdqu.  But it
> > +   isn't the case for AVX nor AVX512.  */
> >  #define TARGET_SSE_TYPELESS_STORES \
> > -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> > +   (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
>
> This is the wrong place to disable the feature.

Like this?

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 2acc9fb0cfe..639969d736d 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -1597,6 +1597,11 @@ set_ix86_tune_features (enum processor_type
ix86_tune, bool dump)
 = !!(initial_ix86_tune_features[i] & ix86_tune_mask);
 }

+  /* NB: movaps/movups is one byte shorter than movdqa/movdqu.  But it
+ isn't the case for AVX nor AVX512.  */
+  if (TARGET_AVX)
+ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES] = 0;
+
   if (dump)
 {
   fprintf (stderr, "List of x86 specific tuning parameter names:\n");

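The idea in the hunk above (derive the per-CPU tuning flags from a mask table, then force one flag off when AVX is enabled) can be sketched standalone. Everything here is hypothetical and for illustration only: the enum, the mask table and set_tune_features are not the i386 back end's definitions.

```c
#include <stdbool.h>

/* Hypothetical, reduced set of tuning flags.  */
enum tune_index
{
  TUNE_SSE_TYPELESS_STORES,
  TUNE_SSE_SPLIT_REGS,
  TUNE_LAST
};

/* Hypothetical table: bit N set means the flag is on for CPU N.  */
static const unsigned int initial_tune_masks[TUNE_LAST] = {
  0x3u, /* TUNE_SSE_TYPELESS_STORES: on for CPUs 0 and 1.  */
  0x1u  /* TUNE_SSE_SPLIT_REGS: on for CPU 0 only.  */
};

/* Fill FEATURES for the CPU selected by CPU_MASK, then drop the
   typeless-store preference when HAVE_AVX is set, since the size
   advantage only applies to the legacy SSE encodings.  */
static void
set_tune_features (unsigned char features[TUNE_LAST],
                   unsigned int cpu_mask, bool have_avx)
{
  for (int i = 0; i < TUNE_LAST; i++)
    features[i] = !!(initial_tune_masks[i] & cpu_mask);

  if (have_avx)
    features[TUNE_SSE_TYPELESS_STORES] = 0;
}
```

Doing the override once at option-processing time keeps the macro a plain array lookup, which seems to be the point of moving it out of i386.h.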

-- 
H.J.

Re: [PATCH] gcc: Add new configure options to allow static libraries to be selected

2020-01-27 Thread Andrew Burgess

* Jeff Law  [2020-01-22 13:52:27 -0700]:

> On Wed, 2020-01-22 at 15:39 +, Andrew Burgess wrote:
> > The motivation behind this change is to make it easier for a user to
> > link against static libraries on a target where dynamic libraries are
> > the default library type (for example GNU/Linux).
> > 
> > Further, my motivation is really for linking libraries into GDB,
> > however, the binutils-gdb/config/ directory is a copy of gcc/config/
> > so changes for GDB need to be approved by the GCC project first.
> > 
> > After making this change in the gcc/config/ directory I've run
> > autoreconf on all of the configure scripts in the GCC tree and a
> > couple have been updated, so I'll use one of these to describe what my
> > change does.
> > 
> > Consider libcpp, this library links against libiconv.  Currently if
> > the user builds on a system with both static and dynamic libiconv
> > installed then autotools will pick up the dynamic libiconv by
> > default.  This is almost certainly the right thing to do.
> > 
> > However, if the user wants to link against static libiconv then things
> > are a little harder, they could remove the dynamic libiconv from their
> > system, but this is probably a bad idea (other things might depend on
> > that library), or the user can build their own version of libiconv,
> > install it into a unique prefix, and then configure gcc using the
> > --with-libiconv-prefix=DIR flag.  This works fine, but is somewhat
> > annoying: the static library is available, I just can't get autotools to
> > use it.
> > 
> > My change then adds a new flag --with-libiconv-type=TYPE, where type
> > is either auto, static, or shared.  The default auto, ensures we keep
> > the existing behaviour unchanged.
> > 
> > If the user configures with --with-libiconv-type=static then the
> > configure script will ignore any dynamic libiconv it finds, and will
> > only look for a static libiconv, if no static libiconv is found then
> > the configure will continue as though there is no libiconv at all
> > available.
> > 
> > Similarly a user can specify --with-libiconv-type=shared and force the
> > use of shared libiconv, any static libiconv will be ignored.
> > 
> > As I've implemented this change within the AC_LIB_LINKFLAGS_BODY macro
> > then only libraries configured using the AC_LIB_LINKFLAGS or
> > AC_LIB_HAVE_LINKFLAGS macros will gain the new configure flag.
> > 
> > If this is accepted into GCC then there will be follow on patches for
> > binutils and GDB to regenerate some configure scripts in those
> > projects.
> > 
> > For GCC only two configure scripts needed updating after this commit,
> > libcpp and libstdc++-v3, both of which link against libiconv.
> > 
> > config/ChangeLog:
> > 
> > * lib-link.m4 (AC_LIB_LINKFLAGS_BODY): Add new
> > --with-libXXX-type=... option.  Use this to guide the selection of
> > either a shared library or a static library.
> > 
> > libcpp/ChangeLog:
> > 
> > * configure: Regnerate.
> > 
> > libstdc++-v3/ChangeLog:
> > 
> > * configure: Regnerate.
> s/Regnerate/Regenerate/
> 
> This isn't strictly a regression bugfix.  But given the nature of these
> files I think we probably need to be a bit more lax and allow safe
> changes so that downstream uses can move forward independent of the gcc
> development and release schedule.
> 
> So, OK.

Thanks for the flexibility.  Now pushed.

Thanks,
Andrew

Re: Support gnu_unique_object symbols on MIPS

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 18:23 +, Joseph Myers wrote:
> mips_declare_object_name is missing the support for declaring symbols
> as gnu_unique_object that is present in the generic
> ASM_DECLARE_OBJECT_NAME in elfos.h.  I'm not aware of any
> MIPS-specific reason for that support to be absent;
> mips_declare_object_name predates the addition of gnu_unique_object
> support and as far as I can tell this was simply an oversight when
> that support was added.  This patch adds the missing support,
> following the code in elfos.h.
> 
> Tested with no regressions with cross to mips-linux-gnu.  In
> particular, this fixes the failure of the recently-added glibc test
> elf/tst-dlopen-nodelete-reloc, which relies on the compiler generating
> such symbols, for MIPS.
> 
> 2020-01-27  Joseph Myers  
> 
>   * config/mips/mips.c (mips_declare_object_name)
>   [USE_GNU_UNIQUE_OBJECT]: Support use of gnu_unique_object.
LGTM.
jeff
>

Re: [PATCH v2][ARM] Disable code hoisting with -O3 (PR80155)

2020-01-27 Thread Segher Boessenkool

Hi!

On Tue, Jan 21, 2020 at 02:10:21PM +, Wilco Dijkstra wrote:
> While code hoisting generally improves codesize, it can affect performance
> negatively. Benchmarking shows it doesn't help SPEC and negatively affects
> embedded benchmarks. Since the impact is relatively small with -O2 and mainly
> affects -O3, the simplest option is to disable code hoisting for -O3 and 
> higher.

Should this be a generic thing, not target-specific?


Segher

Re: [PATCH 0/2] Make C front end share the C++ tree representation of loops and switches

2020-01-27 Thread Jeff Law

On Thu, 2019-12-12 at 15:44 -0500, Jason Merrill wrote:
> Here are the dumps from ssa-dom-thread-7.c made to compile as C++; cx-current 
> is the dumps with current trunk; cx-old is changed to use the old goto-based 
> lowering like C.
Sorry this has taken so long to get back to.

For ssa-dom-thread-7.c it looks like the differences we're encountering start 
at the thread1 pass.

While both cc1 and cc1plus optimized 16 jump threading paths, the final targets 
differ in some cases.  I guess somewhat ironically cc1plus actually does a 
better job threading deeper through the CFG.   I suspect, but have not actually 
confirmed, that by threading deeper through the CFG there are just fewer things 
for subsequent passes to detect and optimize.

I'm pretty sure cc1 doesn't thread as deeply simply due to the ordering of the 
jump thread paths that have been recorded.  Essentially we only optimize *one* 
path starting at any given edge even though we may have multiple potential jump 
threading paths that start at that edge.   This clearly argues that we should 
sort the vector of jump threading paths so that we find the longest paths first.
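
As a rough sketch of what "longest paths first" could look like (not the threader's actual data structures: struct thread_path and select_paths are made up for illustration), sort the recorded paths by length and then keep only the first path seen for each starting edge:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a recorded jump threading path.  */
struct thread_path
{
  int start_edge; /* Identifier of the edge the path starts at.  */
  int length;     /* Number of edges in the path.  */
};

/* qsort comparator: longer paths sort first.  */
static int
by_length_desc (const void *a, const void *b)
{
  const struct thread_path *pa = a;
  const struct thread_path *pb = b;
  return (pb->length > pa->length) - (pb->length < pa->length);
}

/* Sort PATHS longest-first, then keep one path per starting edge.
   The kept paths are compacted to the front of the array; returns
   how many were kept.  */
static size_t
select_paths (struct thread_path *paths, size_t n)
{
  qsort (paths, n, sizeof *paths, by_length_desc);

  size_t kept = 0;
  for (size_t i = 0; i < n; i++)
    {
      int seen = 0;
      for (size_t j = 0; j < kept; j++)
        if (paths[j].start_edge == paths[i].start_edge)
          {
            seen = 1;
            break;
          }
      if (!seen)
        paths[kept++] = paths[i];
    }
  return kept;
}
```

Since only one path per starting edge survives, sorting first means the survivor is the longest candidate rather than whichever happened to be recorded first.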

While I think we've missed the boat for gcc-10, I think these patches should go 
forward in gcc-11.  I'll own getting the paths sorted so that this problem is 
avoided.

Jeff

[patch, fortran] PR93461 - Bogus "symbol is already defined" with long subroutine names in submodule

2020-01-27 Thread Andrew Benson

I created PR93461 for this issue: The following code causes a bogus "symbol is 
already defined" error (using git commit 
73380abd6b2783215c7950a2ade5e3f4b271e2bc):

module aModuleWithAnAllowedName

  interface
 module subroutine aShortName()
 end subroutine aShortName
  end interface
 
end module aModuleWithAnAllowedName

submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName

contains

  subroutine aShortName()
call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
  end subroutine aShortName
  
  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem

  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso
  
end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName



$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/data001/abenson/Galacticus/Tools_Devel_Install/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure --prefix=/home/abenson/Galacticus/Tools_Devel --enable-languages=c,c++,fortran --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200124 (experimental) (GCC) 


$ gfortran  -c symlength.F90 -o symlength.o -ffree-line-length-none -frecursive  -pthread -Wall -fbacktrace -ffpe-trap=invalid,zero,overflow -fdump-core -O3 -ffinite-math-only -fno-math-errno -fopenmp -g
/tmp/cc8B4Hmp.s: Assembler messages:
/tmp/cc8B4Hmp.s:20: Error: symbol `__amodulewithanallowedname.asubmodulewithaveryveryverylongbutentirelylegalname_MOD_asubroutinewithaverylongnamethatwillcauseaprobl' is already defined



The problem occurs because GFC_MAX_MANGLED_SYMBOL_LEN is set to 
GFC_MAX_SYMBOL_LEN*2+4, which is sufficient for a module name plus function 
name (plus the additional "_"'s that get prepended), but insufficient if a 
submodule name is included. The name then gets truncated and can lead to two 
different functions having the same (truncated) symbol name.

The fix is to increase this length to GFC_MAX_SYMBOL_LEN*3+5 - which allows for 
the submodule name plus the "." added between module and submodule names.
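
The collision can be reproduced outside the compiler with a small model. This is a sketch under assumptions: MAX_SYMBOL_LEN is taken to be 63 (the value I believe GFC_MAX_SYMBOL_LEN has), the "__module.submodule_MOD_name" layout is read off the assembler error above, and mangle and names_collide are hypothetical helpers, not gfortran functions.

```c
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOL_LEN 63 /* Assumed value of GFC_MAX_SYMBOL_LEN.  */
#define OLD_LIMIT (MAX_SYMBOL_LEN * 2 + 4)
#define NEW_LIMIT (MAX_SYMBOL_LEN * 3 + 5)

/* Names from the test case, lower-cased as in the assembler output.  */
#define MOD "amodulewithanallowedname"
#define SUBMOD "asubmodulewithaveryveryverylongbutentirelylegalname"
#define NAME1 "asubroutinewithaverylongnamethatwillcauseaproblem"
#define NAME2 "asubroutinewithaverylongnamethatwillcauseaproblemalso"

/* Write "__MOD.SUBMOD_MOD_NAME" into OUT, keeping at most LIMIT
   characters.  OUT must have room for LIMIT + 1 bytes.  */
static void
mangle (char *out, size_t limit, const char *mod, const char *submod,
        const char *name)
{
  char full[512];
  int n = snprintf (full, sizeof full, "__%s.%s_MOD_%s", mod, submod, name);
  size_t len = n < 0 ? 0 : (size_t) n;
  if (len >= sizeof full)
    len = sizeof full - 1;
  if (len > limit)
    len = limit;
  memcpy (out, full, len);
  out[len] = '\0';
}

/* Return nonzero if the two test-case procedures get the same symbol
   when mangled names are cut off at LIMIT characters.  */
static int
names_collide (size_t limit)
{
  char a[512], b[512];
  mangle (a, limit, MOD, SUBMOD, NAME1);
  mangle (b, limit, MOD, SUBMOD, NAME2);
  return strcmp (a, b) == 0;
}
```

In this model the two names collide at the old limit of 130 characters, which is exactly the length of the symbol in the error message, and stay distinct at the new limit of 194.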

I've attached a patch for this which includes a new test case for this PR. The 
patch regression tests cleanly.

OK to commit?

-Andrew

-- 

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
diff --git a/gcc/fortran/trans.h b/gcc/fortran/trans.h
index 52bc045..5942320 100644
--- a/gcc/fortran/trans.h
+++ b/gcc/fortran/trans.h
@@ -24,7 +24,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "predict.h"  /* For enum br_predictor and PRED_*.  */
 
 /* Mangled symbols take the form __module__name.  */
-#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*2+4)
+#define GFC_MAX_MANGLED_SYMBOL_LEN  (GFC_MAX_SYMBOL_LEN*3+5)
 
 /* Struct for holding a block of statements.  It should be treated as an
opaque entity and not modified directly.  This allows us to change the
diff --git a/gcc/testsuite/gfortran.dg/pr93461.f90 b/gcc/testsuite/gfortran.dg/pr93461.f90
new file mode 100644
index 000..3bef326
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr93461.f90
@@ -0,0 +1,22 @@
+! { dg-do compile }
+! PR fortran/93461
+module aModuleWithAnAllowedName
+  interface
+ module subroutine aShortName()
+ end subroutine aShortName
+  end interface
+end module aModuleWithAnAllowedName
+
+submodule (aModuleWithAnAllowedName) aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
+contains
+  subroutine aShortName()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+call aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aShortName
+  
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblem
+
+  subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso()
+  end subroutine aSubroutineWithAVeryLongNameThatWillCauseAProblemAlso  
+end submodule aSubmoduleWithAVeryVeryVeryLongButEntirelyLegalName
2020-01-27  Andrew Benson  

	* trans.h: Increase GFC_MAX_MANGLED_SYMBOL_LEN to
	GFC_MAX_SYMBOL_LEN*3+5 to allow for inclusion of submodule name,
	plus the "." between module and submodule names.

Go patch committed: Cleanups for MPFR 3.1.0

2020-01-27 Thread Ian Lance Taylor

This patch to the Go frontend adds cleanups now that MPFR 3.1.0 is
required.  For MPFR functions, change from GMP_RND* to MPFR_RND*.
Also change mp_exp_t to mpfr_exp_t.  This fixes GCC PR 92463.
Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu (with MPFR
4.0.2).  Committed to mainline.

Ian
b1321526a1de86d4feadef1bc9b1a64f45ceebb8
diff --git a/gcc/go/gofrontend/MERGE b/gcc/go/gofrontend/MERGE
index 38872c44eab..49312fa10f7 100644
--- a/gcc/go/gofrontend/MERGE
+++ b/gcc/go/gofrontend/MERGE
@@ -1,4 +1,4 @@
-197381c6364431a7a05e32df683874b7cadcc4b4
+132e0e61d59aaa52f8fdb03a925300c1ced2a0f2
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
diff --git a/gcc/go/gofrontend/expressions.cc b/gcc/go/gofrontend/expressions.cc
index 14bec9a427f..42ad93b9830 100644
--- a/gcc/go/gofrontend/expressions.cc
+++ b/gcc/go/gofrontend/expressions.cc
@@ -2580,11 +2580,11 @@ Integer_expression::do_import(Import_expression* imp, 
Location loc)
  return Expression::make_error(loc);
}
   if (pos == std::string::npos)
-   mpfr_set_ui(real, 0, GMP_RNDN);
+   mpfr_set_ui(real, 0, MPFR_RNDN);
   else
{
  std::string real_str = num.substr(0, pos);
- if (mpfr_init_set_str(real, real_str.c_str(), 10, GMP_RNDN) != 0)
+ if (mpfr_init_set_str(real, real_str.c_str(), 10, MPFR_RNDN) != 0)
{
  go_error_at(imp->location(), "bad number in import data: %qs",
  real_str.c_str());
@@ -2599,7 +2599,7 @@ Integer_expression::do_import(Import_expression* imp, 
Location loc)
imag_str = num.substr(pos);
   imag_str = imag_str.substr(0, imag_str.size() - 1);
   mpfr_t imag;
-  if (mpfr_init_set_str(imag, imag_str.c_str(), 10, GMP_RNDN) != 0)
+  if (mpfr_init_set_str(imag, imag_str.c_str(), 10, MPFR_RNDN) != 0)
{
  go_error_at(imp->location(), "bad number in import data: %qs",
  imag_str.c_str());
@@ -2639,7 +2639,7 @@ Integer_expression::do_import(Import_expression* imp, 
Location loc)
   else
 {
   mpfr_t val;
-  if (mpfr_init_set_str(val, num.c_str(), 10, GMP_RNDN) != 0)
+  if (mpfr_init_set_str(val, num.c_str(), 10, MPFR_RNDN) != 0)
{
  go_error_at(imp->location(), "bad number in import data: %qs",
  num.c_str());
@@ -2753,7 +2753,7 @@ class Float_expression : public Expression
 : Expression(EXPRESSION_FLOAT, location),
   type_(type)
   {
-mpfr_init_set(this->val_, *val, GMP_RNDN);
+mpfr_init_set(this->val_, *val, MPFR_RNDN);
   }
 
   // Write VAL to export data.
@@ -2923,8 +2923,8 @@ Float_expression::do_get_backend(Translate_context* 
context)
 void
 Float_expression::export_float(String_dump *exp, const mpfr_t val)
 {
-  mp_exp_t exponent;
-  char* s = mpfr_get_str(NULL, &exponent, 10, 0, val, GMP_RNDN);
+  mpfr_exp_t exponent;
+  char* s = mpfr_get_str(NULL, &exponent, 10, 0, val, MPFR_RNDN);
   if (*s == '-')
 exp->write_c_string("-");
   exp->write_c_string("0.");
@@ -4781,7 +4781,7 @@ Unary_expression::eval_constant(Operator op, const 
Numeric_constant* unc,
  unc->get_float();
  mpfr_t val;
  mpfr_init(val);
- mpfr_neg(val, uval, GMP_RNDN);
+ mpfr_neg(val, uval, MPFR_RNDN);
  nc->set_float(unc->type(), val);
  mpfr_clear(uval);
  mpfr_clear(val);
@@ -5613,8 +5613,8 @@ Binary_expression::compare_float(const Numeric_constant* 
left_nc,
   if (!type->is_abstract() && type->float_type() != NULL)
 {
   int bits = type->float_type()->bits();
-  mpfr_prec_round(left_val, bits, GMP_RNDN);
-  mpfr_prec_round(right_val, bits, GMP_RNDN);
+  mpfr_prec_round(left_val, bits, MPFR_RNDN);
+  mpfr_prec_round(right_val, bits, MPFR_RNDN);
 }
 
   *cmp = mpfr_cmp(left_val, right_val);
@@ -5649,10 +5649,10 @@ Binary_expression::compare_complex(const 
Numeric_constant* left_nc,
   if (!type->is_abstract() && type->complex_type() != NULL)
 {
   int bits = type->complex_type()->bits();
-  mpfr_prec_round(mpc_realref(left_val), bits / 2, GMP_RNDN);
-  mpfr_prec_round(mpc_imagref(left_val), bits / 2, GMP_RNDN);
-  mpfr_prec_round(mpc_realref(right_val), bits / 2, GMP_RNDN);
-  mpfr_prec_round(mpc_imagref(right_val), bits / 2, GMP_RNDN);
+  mpfr_prec_round(mpc_realref(left_val), bits / 2, MPFR_RNDN);
+  mpfr_prec_round(mpc_imagref(left_val), bits / 2, MPFR_RNDN);
+  mpfr_prec_round(mpc_realref(right_val), bits / 2, MPFR_RNDN);
+  mpfr_prec_round(mpc_imagref(right_val), bits / 2, MPFR_RNDN);
 }
 
   *cmp = mpc_cmp(left_val, right_val) != 0;
@@ -5899,10 +5899,10 @@ Binary_expression::eval_float(Operator op, const 
Numeric_constant* left_nc,
   switch (op)
 {
 case OPERATOR_PLUS:
-  mpfr_add(val, left_val, right_val, GMP_RNDN);
+  mpfr_add(val, left_val, right_val, MPFR_RNDN);
   break;
 case OPERATOR_MINUS:
-

Re: [PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread Uros Bizjak

On Mon, Jan 27, 2020 at 7:23 PM H.J. Lu  wrote:
>
> movaps/movups is one byte shorter than movdqa/movdqu.  But it isn't the
> case for AVX nor AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
> for TARGET_AVX.
>
> gcc/
>
> PR target/91461
> * config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
> TARGET_AVX.
> * config/i386/i386.md (*movoi_internal_avx): Remove
> TARGET_SSE_TYPELESS_STORES check.
>
> gcc/testsuite/
>
> PR target/91461
> * gcc.target/i386/pr91461-1.c: New test.
> * gcc.target/i386/pr91461-2.c: Likewise.
> * gcc.target/i386/pr91461-3.c: Likewise.
> * gcc.target/i386/pr91461-4.c: Likewise.
> * gcc.target/i386/pr91461-5.c: Likewise.
> ---
>  gcc/config/i386/i386.h|  4 +-
>  gcc/config/i386/i386.md   |  4 +-
>  gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
>  gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
>  gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
>  gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
>  gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
>  7 files changed, 203 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c
>
> diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
> index 943e9a5c783..c134b04c5c4 100644
> --- a/gcc/config/i386/i386.h
> +++ b/gcc/config/i386/i386.h
> @@ -516,8 +516,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
>  #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
> ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
>  #define TARGET_SSE_SPLIT_REGS  ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
> +/* NB: movaps/movups is one byte shorter than movdqa/movdqu.  But it
> +   isn't the case for AVX nor AVX512.  */
>  #define TARGET_SSE_TYPELESS_STORES \
> -   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
> +   (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])

This is the wrong place to disable the feature.

Uros.

>  #define TARGET_SSE_LOAD0_BY_PXOR 
> ix86_tune_features[X86_TUNE_SSE_LOAD0_BY_PXOR]
>  #define TARGET_MEMORY_MISMATCH_STALL \
> ix86_tune_features[X86_TUNE_MEMORY_MISMATCH_STALL]
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 6e9c9bd2fb6..bb096133880 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -1980,9 +1980,7 @@
>(and (eq_attr "alternative" "1")
> (match_test "TARGET_AVX512VL"))
>  (const_string "XI")
> -  (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
> -   (and (eq_attr "alternative" "3")
> -(match_test "TARGET_SSE_TYPELESS_STORES")))
> +  (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
>  (const_string "V8SF")
>   ]
>   (const_string "OI")))])
> diff --git a/gcc/testsuite/gcc.target/i386/pr91461-1.c 
> b/gcc/testsuite/gcc.target/i386/pr91461-1.c
> new file mode 100644
> index 000..0c94b8e2b76
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr91461-1.c
> @@ -0,0 +1,66 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx" } */
> +/* { dg-final { scan-assembler "\tvmovdqa\t" } } */
> +/* { dg-final { scan-assembler "\tvmovdqu\t" } } */
> +/* { dg-final { scan-assembler "\tvmovapd\t" } } */
> +/* { dg-final { scan-assembler "\tvmovupd\t" } } */
> +/* { dg-final { scan-assembler-not "\tvmovaps\t" } } */
> +/* { dg-final { scan-assembler-not "\tvmovups\t" } } */
> +
> +#include 
> +
> +void
> +foo1 (__m128i *p, __m128i x)
> +{
> +  *p = x;
> +}
> +
> +void
> +foo2 (__m128d *p, __m128d x)
> +{
> +  *p = x;
> +}
> +
> +void
> +foo3 (__float128 *p, __float128 x)
> +{
> +  *p = x;
> +}
> +
> +void
> +foo4 (__m128i_u *p, __m128i x)
> +{
> +  *p = x;
> +}
> +
> +void
> +foo5 (__m128d_u *p, __m128d x)
> +{
> +  *p = x;
> +}
> +
> +typedef __float128 __float128_u __attribute__ ((__aligned__ (1)));
> +
> +void
> +foo6 (__float128_u *p, __float128 x)
> +{
> +  *p = x;
> +}
> +
> +#ifdef __x86_64__
> +typedef __int128 __int128_u __attribute__ ((__aligned__ (1)));
> +
> +extern __int128 int128;
> +
> +void
> +foo7 (__int128 *p)
> +{
> +  *p = int128;
> +}
> +
> +void
> +foo8 (__int128_u *p)
> +{
> +  *p = int128;
> +}
> +#endif
> diff --git a/gcc/testsuite/gcc.target/i386/pr91461-2.c 
> b/gcc/testsuite/gcc.target/i386/pr91461-2.c
> new file mode 100644
> index 000..921cfaf9780
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr91461-2.c
> @@ -0,0 +1,19 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx" } */
> +/* { dg-final { scan-assembler

[COMMITTED] c++: Fix array of char typedef in template (PR90966).

2020-01-27 Thread Jason Merrill

Since Martin Sebor's patch for PR 71625 to change braced array initializers
to STRING_CST in some cases, we need to be ready for STRING_CST with types
that are changed by tsubst.  fold_convert doesn't know how to deal with
STRING_CST, which is reasonable; we really shouldn't expect it to here.  So
let's handle STRING_CST separately.
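
The shape of the fix is "reuse the node if the substituted type is unchanged, otherwise copy it and retag the copy". A minimal C sketch of that pattern, using a made-up struct node and retype_constant rather than GCC's tree API:

```c
#include <stdlib.h>

/* Hypothetical stand-in for a constant node that carries a type.  */
struct node
{
  int type_id;      /* Stands in for TREE_TYPE.  */
  const char *text; /* Payload, shared between copies.  */
};

/* Return T itself when it already has NEW_TYPE; otherwise return a
   heap-allocated shallow copy of T with its type changed, leaving T
   untouched.  Returns NULL if allocation fails.  */
static struct node *
retype_constant (struct node *t, int new_type)
{
  if (t->type_id == new_type)
    return t;

  struct node *r = malloc (sizeof *r);
  if (r == NULL)
    return NULL;
  *r = *t;
  r->type_id = new_type;
  return r;
}
```

Copying instead of modifying in place matters because the original node may still be referenced from the uninstantiated template.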

Tested x86_64-pc-linux-gnu, applying to trunk.

PR c++/90966
* pt.c (tsubst_copy) [STRING_CST]: Don't use fold_convert.
---
 gcc/cp/pt.c   | 13 -
 gcc/testsuite/g++.dg/cpp0x/initlist-array10.C | 14 ++
 2 files changed, 26 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/initlist-array10.C

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 45c204e4269..6e614d5a058 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -16772,7 +16772,6 @@ tsubst_copy (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
 
 case INTEGER_CST:
 case REAL_CST:
-case STRING_CST:
 case COMPLEX_CST:
   {
/* Instantiate any typedefs in the type.  */
@@ -16782,6 +16781,18 @@ tsubst_copy (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
return r;
   }
 
+case STRING_CST:
+  {
+   tree type = tsubst (TREE_TYPE (t), args, complain, in_decl);
+   r = t;
+   if (type != TREE_TYPE (t))
+ {
+   r = copy_node (t);
+   TREE_TYPE (r) = type;
+ }
+   return r;
+  }
+
 case PTRMEM_CST:
   /* These can sometimes show up in a partial instantiation, but never
 involve template parms.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/initlist-array10.C 
b/gcc/testsuite/g++.dg/cpp0x/initlist-array10.C
new file mode 100644
index 000..fb9e136b56e
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/initlist-array10.C
@@ -0,0 +1,14 @@
+// PR c++/90966
+// { dg-do compile { target c++11 } }
+
+template
+void f()
+{
+  using S = signed char;
+  constexpr const S v[]{0};
+}
+
+int main()
+{
+  f();
+}

base-commit: 1f2e84238c9f079747804026b6225ec8c1d0e4b7
-- 
2.18.1

Re: [PATCH] get source line for diagnostic from preprocessed file / PR preprocessor/79106

2020-01-27 Thread Jeff Law

On Mon, 2019-12-16 at 11:18 +, Bader, Lucas wrote:
> Hello,
> 
> within a compile cluster, only the preprocessed output of GCC is transferred 
> to remote nodes for compilation. 
> When GCC produces advanced diagnostics (with -fdiagnostics-show-caret), e.g. 
> prints out the affected source
> line and fixit hints, it attempts to read the source file again, even when 
> compiling a preprocessed file (-fpreprocessed). 
> This leads to wrong diagnostics when building with a compile cluster, or, 
> more generally, when changing or deleting the original source file.
> 
> This patch attempts to alter the behavior by implementing a 
> location_get_source_line_preprocessed 
> function that can be used in diagnostic-show-locus.c in case a preprocessed 
> file is compiled.
> There was some previous discussion on this behavior on PR preprocessor/79106.
> 
> This is my first patch to GCC, so in case something is wrong with the format, 
> please let me know.
> 
> Best regards
> Lucas
> 
> 2019-12-16  Lucas Bader  
> 
>   PR preprocessor/79106
>   * c-opts.c (c_common_handle_option): pass -fpreprocessed 
>   option value to global diagnostic configuration
>   
>   * diagnostic-show-locus.c (layout::layout): read line from source or 
> preprocessed
>   file based on -fpreprocessed value
>   (source_line::source_line): read line from source or preprocessed
>   file based on -fpreprocessed value
>   (layout::print_line): read line from source or preprocessed
>   file based on -fpreprocessed value
>   
>   * diagnostic.h (diagnostic_context): new members for reading
>   source lines from preprocessed files
>   * diagnostic.c (diagnostic_initialize): initialize new members
>   
>   * input.c (location_get_source_line_preprocessed): new function
>   to read source lines from preprocessed files
>   (test_reading_source_line_preprocessed): new test case
>   (input_c_tests): execute new test case
>   
>   * opts-global.c (read_cmdline_options): pass input filename to global
>   diagnostic context
Note this is not forgotten.  But it seems more appropriate for gcc-11
rather than gcc-10.

Jeff
>

Re: [PATCH] Clean up references to Subversion in documentation sources.

2020-01-27 Thread Segher Boessenkool

On Mon, Jan 13, 2020 at 01:12:15PM -0500, Eric S. Raymond wrote:
> Jonathan Wakely :
> > Email the patches to gcc-patches@gcc.gnu.org, that's how things get
> > merged.
> > 
> > We're not looking to change any workflows now.
> 
> Roger that.
> 
> Once the dust from the conversion has settled, though, there is a
> related issue I intend to bring up on the main list.
> 
> You've only collected about 60% of the potential benefits from git
> by adopting git itself.  The other 40% would come from moving
> to one of the modern git-centric forges like GitHub or GitLab.

NAK.

Our development model fits our needs well, even with all its warts.
A "pull request" model would not fit well *at all*.

The "everything passes through email" model is *good*, not least
because it puts everyone on a level playing field.  Everyone can see
everything, and comment on everything.

And if it slows you down, well, that is a good thing as well probably!
Thought and carefulness and looking at things from multiple angles is
what we need, not raw speed: we need good changes, we do not need making
it easier to get your changes included at the cost of basic quality.

Anyway, 90% of the advantages of using Git come from using it *locally*,
which many of us have been doing since forever and a day already.

Segher

PING^5: [PATCH] i386: Properly encode xmm16-xmm31/ymm16-ymm31 for vector move

2020-01-27 Thread H.J. Lu

On Mon, Jul 8, 2019 at 8:19 AM H.J. Lu  wrote:
>
> On Tue, Jun 18, 2019 at 8:59 AM H.J. Lu  wrote:
> >
> > On Fri, May 31, 2019 at 10:38 AM H.J. Lu  wrote:
> > >
> > > On Tue, May 21, 2019 at 2:43 PM H.J. Lu  wrote:
> > > >
> > > > On Fri, Feb 22, 2019 at 8:25 AM H.J. Lu  wrote:
> > > > >
> > > > > Hi Jan, Uros,
> > > > >
> > > > > This patch fixes the wrong code bug:
> > > > >
> > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89229
> > > > >
> > > > > Tested on AVX2 and AVX512 with and without --with-arch=native.
> > > > >
> > > > > OK for trunk?
> > > > >
> > > > > Thanks.
> > > > >
> > > > > H.J.
> > > > > --
> > > > > i386 backend has
> > > > >
> > > > > INT_MODE (OI, 32);
> > > > > INT_MODE (XI, 64);
> > > > >
> > > > > So, XI_MODE represents 64 INTEGER bytes = 64 * 8 = 512 bit operation,
> > > > > in case of const_1, all 512 bits set.
> > > > >
> > > > > We can load zeros with narrower instruction, (e.g. 256 bit by inherent
> > > > > zeroing of highpart in case of 128 bit xor), so TImode in this case.
> > > > >
> > > > > Some targets prefer V4SF mode, so they will emit float xorps for 
> > > > > zeroing.
> > > > >
> > > > > sse.md has
> > > > >
> > > > > > (define_insn "mov<mode>_internal"
> > > > >   [(set (match_operand:VMOVE 0 "nonimmediate_operand"
> > > > >  "=v,v ,v ,m")
> > > > > (match_operand:VMOVE 1 "nonimmediate_or_sse_const_operand"
> > > > >  " C,BC,vm,v"))]
> > > > > 
> > > > > >   /* There is no evex-encoded vmov* for sizes smaller than 64-bytes
> > > > > >  in avx512f, so we need to use workarounds, to access sse registers
> > > > > >  16-31, which are evex-only. In avx512vl we don't need workarounds.  */
> > > > > >   if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
> > > > > >   && (EXT_REX_SSE_REG_P (operands[0])
> > > > > >   || EXT_REX_SSE_REG_P (operands[1])))
> > > > > > {
> > > > > >   if (memory_operand (operands[0], <MODE>mode))
> > > > > > {
> > > > > >   if (<MODE_SIZE> == 32)
> > > > > > return "vextract<shuffletype>64x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
> > > > > >   else if (<MODE_SIZE> == 16)
> > > > > > return "vextract<shuffletype>32x4\t{$0x0, %g1, %0|%0, %g1, 0x0}";
> > > > >   else
> > > > > gcc_unreachable ();
> > > > > }
> > > > > ...
> > > > >
> > > > > However, since ix86_hard_regno_mode_ok has
> > > > >
> > > > >  /* TODO check for QI/HI scalars.  */
> > > > >   /* AVX512VL allows sse regs16+ for 128/256 bit modes.  */
> > > > >   if (TARGET_AVX512VL
> > > > >   && (mode == OImode
> > > > >   || mode == TImode
> > > > >   || VALID_AVX256_REG_MODE (mode)
> > > > >   || VALID_AVX512VL_128_REG_MODE (mode)))
> > > > > return true;
> > > > >
> > > > >   /* xmm16-xmm31 are only available for AVX-512.  */
> > > > >   if (EXT_REX_SSE_REGNO_P (regno))
> > > > > return false;
> > > > >
> > > > > >   if (TARGET_AVX512F && <MODE_SIZE> < 64 && !TARGET_AVX512VL
> > > > >   && (EXT_REX_SSE_REG_P (operands[0])
> > > > >   || EXT_REX_SSE_REG_P (operands[1])))
> > > > >
> > > > > > is dead code.
> > > > >
> > > > > Also for
> > > > >
> > > > > long long *p;
> > > > > volatile __m256i yy;
> > > > >
> > > > > void
> > > > > foo (void)
> > > > > {
> > > > >_mm256_store_epi64 (p, yy);
> > > > > }
> > > > >
> > > > > with AVX512VL, we should generate
> > > > >
> > > > > vmovdqa %ymm0, (%rax)
> > > > >
> > > > > not
> > > > >
> > > > > vmovdqa64   %ymm0, (%rax)
> > > > >
> > > > > All TYPE_SSEMOV vector moves are consolidated to ix86_output_ssemov:
> > > > >
> > > > > 1. If xmm16-xmm31/ymm16-ymm31 registers aren't used, SSE/AVX vector
> > > > > moves will be generated.
> > > > > 2. If xmm16-xmm31/ymm16-ymm31 registers are used:
> > > > >a. With AVX512VL, AVX512VL vector moves will be generated.
> > > > >b. Without AVX512VL, xmm16-xmm31/ymm16-ymm31 register to register
> > > > >   move will be done with zmm register move.
> > > > >
> > > > > ext_sse_reg_operand is removed since it is no longer needed.
> > > > >
> > > > > Tested on AVX2 and AVX512 with and without --with-arch=native.
> > > > >
> > > > > gcc/
> > > > >
> > > > > PR target/89229
> > > > > PR target/89346
> > > > > * config/i386/i386-protos.h (ix86_output_ssemov): New 
> > > > > prototype.
> > > > > * config/i386/i386.c (ix86_get_ssemov): New function.
> > > > > (ix86_output_ssemov): Likewise.
> > > > > * config/i386/i386.md (*movxi_internal_avx512f): Call
> > > > > ix86_output_ssemov for TYPE_SSEMOV.
> > > > > (*movoi_internal_avx): Call ix86_output_ssemov for 
> > > > > TYPE_SSEMOV.
> > > > > Remove ext_sse_reg_operand and TARGET_AVX512VL check.
> > > > > (*movti_internal): Likewise.
> > > > > (*movdi_internal): Call ix86_output_ssemov

Support gnu_unique_object symbols on MIPS

2020-01-27 Thread Joseph Myers

mips_declare_object_name is missing the support for declaring symbols
as gnu_unique_object that is present in the generic
ASM_DECLARE_OBJECT_NAME in elfos.h.  I'm not aware of any
MIPS-specific reason for that support to be absent;
mips_declare_object_name predates the addition of gnu_unique_object
support and as far as I can tell this was simply an oversight when
that support was added.  This patch adds the missing support,
following the code in elfos.h.

Tested with no regressions with cross to mips-linux-gnu.  In
particular, this fixes the failure of the recently-added glibc test
elf/tst-dlopen-nodelete-reloc, which relies on the compiler generating
such symbols, for MIPS.

2020-01-27  Joseph Myers  

* config/mips/mips.c (mips_declare_object_name)
[USE_GNU_UNIQUE_OBJECT]: Support use of gnu_unique_object.

diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index dae189ed20d..513fc5fe295 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -9775,7 +9775,14 @@ mips_declare_object_name (FILE *stream, const char *name,
  tree decl ATTRIBUTE_UNUSED)
 {
 #ifdef ASM_OUTPUT_TYPE_DIRECTIVE
-  ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "object");
+#ifdef USE_GNU_UNIQUE_OBJECT
+  /* As in elfos.h.  */
+  if (USE_GNU_UNIQUE_OBJECT && DECL_ONE_ONLY (decl)
+  && (!DECL_ARTIFICIAL (decl) || !TREE_READONLY (decl)))
+ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "gnu_unique_object");
+  else
+#endif
+ASM_OUTPUT_TYPE_DIRECTIVE (stream, name, "object");
 #endif
 
   size_directive_output = 0;

-- 
Joseph S. Myers
jos...@codesourcery.com

[PATCH] i386: Disable TARGET_SSE_TYPELESS_STORES for TARGET_AVX

2020-01-27 Thread H.J. Lu

movaps/movups is one byte shorter than movdqa/movdqu.  But that isn't the
case for AVX or AVX512.  We should disable TARGET_SSE_TYPELESS_STORES
for TARGET_AVX.
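
To make the size argument concrete, here is a small sketch using
hand-assembled encodings of an aligned 16-byte store of %xmm0 to (%rax).
The array names are made up for illustration, and the byte values are my
own encoding of these four instructions, not something taken from the
patch; please check them against an assembler before relying on them.

```cpp
// Hand-assembled encodings of "store %xmm0 to (%rax)"; illustrative only.

// Legacy SSE: the integer form carries a mandatory 0x66 prefix.
constexpr unsigned char movaps_store[] = {0x0F, 0x29, 0x00};        // movaps  %xmm0, (%rax)
constexpr unsigned char movdqa_store[] = {0x66, 0x0F, 0x7F, 0x00};  // movdqa  %xmm0, (%rax)

// VEX: the 0x66 prefix is folded into the "pp" bits of the VEX prefix
// (0xF8 vs 0xF9 below), so both forms end up the same length.
constexpr unsigned char vmovaps_store[] = {0xC5, 0xF8, 0x29, 0x00}; // vmovaps %xmm0, (%rax)
constexpr unsigned char vmovdqa_store[] = {0xC5, 0xF9, 0x7F, 0x00}; // vmovdqa %xmm0, (%rax)

static_assert(sizeof(movdqa_store) == sizeof(movaps_store) + 1,
              "SSE: the typeless store is one byte shorter");
static_assert(sizeof(vmovdqa_store) == sizeof(vmovaps_store),
              "AVX: no size difference between the two forms");
```

With the size advantage gone under VEX, there is no encoding reason left
to prefer the typeless form there.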

gcc/

PR target/91461
* config/i386/i386.h (TARGET_SSE_TYPELESS_STORES): Disable for
TARGET_AVX.
* config/i386/i386.md (*movoi_internal_avx): Remove
TARGET_SSE_TYPELESS_STORES check.

gcc/testsuite/

PR target/91461
* gcc.target/i386/pr91461-1.c: New test.
* gcc.target/i386/pr91461-2.c: Likewise.
* gcc.target/i386/pr91461-3.c: Likewise.
* gcc.target/i386/pr91461-4.c: Likewise.
* gcc.target/i386/pr91461-5.c: Likewise.
---
 gcc/config/i386/i386.h|  4 +-
 gcc/config/i386/i386.md   |  4 +-
 gcc/testsuite/gcc.target/i386/pr91461-1.c | 66 
 gcc/testsuite/gcc.target/i386/pr91461-2.c | 19 ++
 gcc/testsuite/gcc.target/i386/pr91461-3.c | 76 +++
 gcc/testsuite/gcc.target/i386/pr91461-4.c | 21 +++
 gcc/testsuite/gcc.target/i386/pr91461-5.c | 17 +
 7 files changed, 203 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr91461-5.c

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index 943e9a5c783..c134b04c5c4 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -516,8 +516,10 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL \
ix86_tune_features[X86_TUNE_SSE_PACKED_SINGLE_INSN_OPTIMAL]
 #define TARGET_SSE_SPLIT_REGS  ix86_tune_features[X86_TUNE_SSE_SPLIT_REGS]
+/* NB: movaps/movups is one byte shorter than movdaq/movdqu.  But it
+   isn't the case for AVX nor AVX512.  */
 #define TARGET_SSE_TYPELESS_STORES \
-   ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES]
+   (!TARGET_AVX && ix86_tune_features[X86_TUNE_SSE_TYPELESS_STORES])
 #define TARGET_SSE_LOAD0_BY_PXOR ix86_tune_features[X86_TUNE_SSE_LOAD0_BY_PXOR]
 #define TARGET_MEMORY_MISMATCH_STALL \
ix86_tune_features[X86_TUNE_MEMORY_MISMATCH_STALL]
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6e9c9bd2fb6..bb096133880 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1980,9 +1980,7 @@
   (and (eq_attr "alternative" "1")
(match_test "TARGET_AVX512VL"))
 (const_string "XI")
-  (ior (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
-   (and (eq_attr "alternative" "3")
-(match_test "TARGET_SSE_TYPELESS_STORES")))
+  (match_test "TARGET_SSE_PACKED_SINGLE_INSN_OPTIMAL")
 (const_string "V8SF")
  ]
  (const_string "OI")))])
diff --git a/gcc/testsuite/gcc.target/i386/pr91461-1.c 
b/gcc/testsuite/gcc.target/i386/pr91461-1.c
new file mode 100644
index 000..0c94b8e2b76
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91461-1.c
@@ -0,0 +1,66 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+/* { dg-final { scan-assembler "\tvmovdqa\t" } } */
+/* { dg-final { scan-assembler "\tvmovdqu\t" } } */
+/* { dg-final { scan-assembler "\tvmovapd\t" } } */
+/* { dg-final { scan-assembler "\tvmovupd\t" } } */
+/* { dg-final { scan-assembler-not "\tvmovaps\t" } } */
+/* { dg-final { scan-assembler-not "\tvmovups\t" } } */
+
+#include 
+
+void
+foo1 (__m128i *p, __m128i x)
+{
+  *p = x;
+}
+
+void
+foo2 (__m128d *p, __m128d x)
+{
+  *p = x;
+}
+
+void
+foo3 (__float128 *p, __float128 x)
+{
+  *p = x;
+}
+
+void
+foo4 (__m128i_u *p, __m128i x)
+{
+  *p = x;
+}
+
+void
+foo5 (__m128d_u *p, __m128d x)
+{
+  *p = x;
+}
+
+typedef __float128 __float128_u __attribute__ ((__aligned__ (1)));
+
+void
+foo6 (__float128_u *p, __float128 x)
+{
+  *p = x;
+}
+
+#ifdef __x86_64__
+typedef __int128 __int128_u __attribute__ ((__aligned__ (1)));
+
+extern __int128 int128;
+
+void
+foo7 (__int128 *p)
+{
+  *p = int128;
+}
+
+void
+foo8 (__int128_u *p)
+{
+  *p = int128;
+}
+#endif
diff --git a/gcc/testsuite/gcc.target/i386/pr91461-2.c 
b/gcc/testsuite/gcc.target/i386/pr91461-2.c
new file mode 100644
index 000..921cfaf9780
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr91461-2.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx" } */
+/* { dg-final { scan-assembler "\tvmovdqa\t" } } */
+/* { dg-final { scan-assembler "\tvmovapd\t" } } */
+/* { dg-final { scan-assembler-not "\tvmovaps\t" } } */
+
+#include 
+
+void
+foo1 (__m256i *p, __m256i x)
+{
+  *p = x;
+}
+
+void
+foo2 (__m256d *p, __m256d x)
+{
+  *p = x;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr91461-3.c 
b/gcc/testsuite/gcc.target/i386/pr91461-3.c
new file mode 100644
index

[committed][GCC][ARM] Update __fp16 test to fix regression caused by Bfloat optimisation.

2020-01-27 Thread Stam Markianos-Wright

Hi all,

This was committed following offline approval by Kyryl.

One minor optimisation introduced by:

https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01237.html

was to set a preference for both __fp16 types and __bf16 types to be
loaded/stored directly into/from the FP/NEON registers (if they are available
and if the vld1.16 is compatible), rather than be passed through the regular
r-registers.

This would convert many observed instances of:

**  ldrh    r3, [r3]    @ __fp16
**  vmov.f16    s15, r3 @ __fp16

Into a single:

**  vld1.16 {d7[2]}, [r3]

This resulted in a regression of a dg-scan-assembler in a __fp16 test.

This patch updates the test to the same testing standard used by the BFloat
tests (use check-function-bodies to explicitly check for correct assembler
generated by each function) and updates it for the latest optimisation.

Cheers,
Stam

gcc/testsuite/ChangeLog:

2020-01-27  Stam Markianos-Wright  

* gcc.target/arm/armv8_2-fp16-move-1.c: Update following load/store
 optimisation.
diff --git a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
index 2321dd38cc6..009bb8d1575 100644
--- a/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
+++ b/gcc/testsuite/gcc.target/arm/armv8_2-fp16-move-1.c
@@ -3,39 +3,78 @@
 /* { dg-options "-O2" }  */
 /* { dg-add-options arm_v8_2a_fp16_scalar }  */
 /* { dg-additional-options "-mfloat-abi=hard" } */
-
+/* { dg-final { check-function-bodies "**" "" } } */
+
+/*
+**test_load_1:
+**	...
+**	vld1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 __fp16
 test_load_1 (__fp16* a)
 {
   return *a;
 }
 
+/*
+**test_load_2:
+**	...
+**	vld1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 __fp16
 test_load_2 (__fp16* a, int i)
 {
   return a[i];
 }
 
-
+/*
+**test_store_1:
+**	...
+**	vst1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 void
 test_store_1 (__fp16* a, __fp16 b)
 {
   *a = b;
 }
 
+/*
+**test_store_2:
+**	...
+**	vst1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 void
 test_store_2 (__fp16* a, int i, __fp16 b)
 {
   a[i] = b;
 }
 
-
+/*
+**test_load_store_1:
+**	...
+**	vld1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+**	vst1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 __fp16
 test_load_store_1 (__fp16* a, int i, __fp16* b)
 {
   a[i] = b[i];
 }
 
+/*
+**test_load_store_2:
+**	...
+**	vld1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+**	vst1.16	{d[0-9]+\[[0-9]+\]}, \[r[0-9]+\]
+**	...
+*/
 __fp16
 test_load_store_2 (__fp16* a, int i, __fp16* b)
 {
@@ -43,9 +82,6 @@ test_load_store_2 (__fp16* a, int i, __fp16* b)
   return a[i];
 }
 
-/* { dg-final { scan-assembler-times {vst1\.16\t\{d[0-9]+\[[0-9]+\]\}, \[r[0-9]+\]} 3 } }  */
-/* { dg-final { scan-assembler-times {vld1\.16\t\{d[0-9]+\[[0-9]+\]\}, \[r[0-9]+\]} 3 } }  */
-
 __fp16
 test_select_1 (int sel, __fp16 a, __fp16 b)
 {

Re: Deprecating cc0 (and consequently cc0 targets)

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 15:01 +0100, Hans-Peter Nilsson wrote:
> > From: Jeff Law 
> > Date: Fri, 20 Sep 2019 17:38:38 +0200
> 
> Hi.  I'm not going to question
> 
> > The first step in that process is to drop support for cc0.
> 
> but could you please elaborate on...
> 
> > [cc0 support in gcc core]
> > code is broken in various ways,
> > particularly WRT exceptions.
> 
> ...that last part?
See pr49847 as a good starting point.




> If you mean asynchronous exceptions then perhaps in theory,
> except there's no need to (and no state to) "unwind" to
> in-between cc0 setter and user.  But I guess that goes for
> "MODE_CC" targets too; exception information isn't that precise.
It's less about the need to unwind, but the fact that the cc0-setter
and cc0-user end up in different basic blocks with non-call exceptions
and how that interacts with the assumptions the optimizers make. 
Essentially you get ICEs.  We've papered over this stuff when it's
popped up, but it's time to stop :-)



> > This patch deprecates the affected targets.
> 
> (Not applied yet?  Before the gcc-10 branch?  Can you please
> consider dropping cris* from that part when rebasing it, as per
> contents on master and my pledge to merge axis/cris-decc0?)
Not applied.   But should be prior to gcc-10 release (with edits now
that m68k has been converted).  Given that you've got a conversion for
cris that looks ready to go for gcc-11 I'd suggest we pull it out of
the deprecated list as well.

> 

Jeff

Re: Return slot optimization for stack protector strong

2020-01-27 Thread Jakub Jelinek

On Mon, Jan 27, 2020 at 06:49:23PM +0100, Stefan Schulze Frielinghaus wrote:
> some function calls trigger the stack-protector-strong although such
> calls are later on implemented via calls to internal functions.
> Consider the following example:
> 
> long double
> rintl_wrapper (long double x)
> {
>   return rintl (x);
> }
> 
> On s390x a return value of type `long double` is passed via a return
> slot.  Thus according to function `stack_protect_return_slot_p` a
> function call like `rintl (x)` triggers the stack-protector-strong since
> rintl is not an internal function.  However, in a later stage, during
> `expand_call_stmt`, such a call is implemented via a call to an internal
> function.  This means in the example, the call `rintl (x)` is expanded
> into an assembler instruction with register operands only.  Thus this
> late time decision renders the usage of the stack protector superfluous.

I doubt your predicate gives any guarantees that the builtin will be
expanded inline rather than a library call.  Some builtins might be expanded
inline or as a library call depending on various options, or depending on
particular arguments etc.

Jakub

Return slot optimization for stack protector strong

2020-01-27 Thread Stefan Schulze Frielinghaus

Hi,

some function calls trigger the stack-protector-strong although such
calls are later on implemented via calls to internal functions.
Consider the following example:

long double
rintl_wrapper (long double x)
{
  return rintl (x);
}

On s390x a return value of type `long double` is passed via a return
slot.  Thus according to function `stack_protect_return_slot_p` a
function call like `rintl (x)` triggers the stack-protector-strong since
rintl is not an internal function.  However, in a later stage, during
`expand_call_stmt`, such a call is implemented via a call to an internal
function.  This means in the example, the call `rintl (x)` is expanded
into an assembler instruction with register operands only.  Thus this
late time decision renders the usage of the stack protector superfluous.

I am wondering if we can prevent this at least for calls to builtin
functions which are finally implemented via calls to internal functions.
The attached patch solves this for me. Any thoughts?

Successfully bootstrapped and regtested on s390x.

Cheers,
Stefan
diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 9864e4344d2..452813efef1 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -2003,6 +2003,20 @@ estimated_stack_frame_size (struct cgraph_node *node)
   return estimated_poly_value (size);
 }
 
+/* Return true, if CALL is a call to a BUILT_IN_NORMAL function which has no
+ * other effect than setting the lhs and which could be replaced on the current
+ * target by a call to an internal function. Otherwise, return false.  */
+
+static bool
+builtin_as_internal_p (gcall *call)
+{
+  tree decl = gimple_call_fndecl (call);
+  return (gimple_call_lhs (call)
+ && !gimple_has_side_effects (call)
+ && (optimize || (decl && called_as_built_in (decl)))
+ && replacement_internal_fn (call) != IFN_LAST);
+}
+
 /* Check if the current function has calls that use a return slot.  */
 
 static bool
@@ -2018,7 +2032,8 @@ stack_protect_return_slot_p ()
/* This assumes that calls to internal-only functions never
   use a return slot.  */
if (is_gimple_call (stmt)
-   && !gimple_call_internal_p (stmt)
+   && !(gimple_call_internal_p (stmt)
+|| builtin_as_internal_p (as_a <gcall *> (stmt)))
&& aggregate_value_p (TREE_TYPE (gimple_call_fntype (stmt)),
  gimple_call_fndecl (stmt)))
  return true;
@@ -2613,25 +2628,19 @@ expand_call_stmt (gcall *stmt)
   return;
 }
 
-  /* If this is a call to a built-in function and it has no effect other
- than setting the lhs, try to implement it using an internal function
- instead.  */
-  decl = gimple_call_fndecl (stmt);
-  if (gimple_call_lhs (stmt)
-  && !gimple_has_side_effects (stmt)
-  && (optimize || (decl && called_as_built_in (decl))))
+  /* If this is a call to a built-in function which could be implemented as a
+   * call to an internal function for the current target, then do so.  */
+  if (builtin_as_internal_p (stmt))
 {
   internal_fn ifn = replacement_internal_fn (stmt);
-  if (ifn != IFN_LAST)
-   {
- expand_internal_call (ifn, stmt);
- return;
-   }
+  expand_internal_call (ifn, stmt);
+  return;
 }
 
   exp = build_vl_exp (CALL_EXPR, gimple_call_num_args (stmt) + 3);
 
   CALL_EXPR_FN (exp) = gimple_call_fn (stmt);
+  decl = gimple_call_fndecl (stmt);
   builtin_p = decl && fndecl_built_in_p (decl);
 
   /* If this is not a builtin function, the function type through which the
diff --git a/gcc/testsuite/gcc.target/s390/ssp-builtin-return-slot.c 
b/gcc/testsuite/gcc.target/s390/ssp-builtin-return-slot.c
new file mode 100644
index 000..a3c5eac7065
--- /dev/null
+++ b/gcc/testsuite/gcc.target/s390/ssp-builtin-return-slot.c
@@ -0,0 +1,39 @@
+/* Check that we do not trigger the Stack Smashing Protector unnecessarily.
+ * Some built-ins as e.g. rintl make use of a return slot and are therefore
+ * subject to stack protection. However, some of those built-ins (including
+ * rintl) are finally implemented via internal functions which should not
+ * trigger the Stack Smashing Protector Strong.  */
+
+/* { dg-do compile } */
+/* { dg-options "-O1 -march=zEC12 -fstack-protector-strong" } */
+/* { dg-final { scan-assembler-not "brasl\t%r14,__stack_chk_fail" } } */
+
+long double ceill(long double x);
+long double copysignl(long double x, long double y);
+long double floorl(long double x);
+long double nearbyintl(long double x);
+long double rintl(long double x);
+long double roundl(long double x);
+long double truncl(long double x);
+
+#define UNARY(F) \
+  long double F ## _wrapper(long double x) \
+{ return F(x); } \
+  long double F ## _wrapper_builtin(long double x) \
+{ return __builtin_ ## F(x); }
+
+#define BINARY(F) \
+  long double F ## _wrapper(long double x, long double y) \
+{ return F(x, y); } \
+  long double F ## _wrapper_builtin(long

Re: [PATCH] vect: Pattern-matched calls in reduction chains

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 17:02 +, Richard Sandiford wrote:
> gcc.dg/pr56350.c started ICEing for SVE in GCC 10 because we
> pattern-matched a division reduction:
> 
>   a /= 8;
> 
> into a signed shift with division semantics:
> 
>   ... = IFN_SDIV_POW2 (..., 3);
> 
> whereas the reduction code expected it still to be a gassign.
> 
> One fix would be to check for a reduction in the pattern matcher
> (but current patterns don't generally do that).  Another would be
> to fail gracefully for reductions involving calls.  Since we can't
> vectorise the reduction either way, and probably have a better shot
> with the shift form, this patch goes for the "fail gracefully" approach.
> 
> Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?
> 
> Richard
> 
> 
> 2020-01-27  Richard Sandiford  
> 
> gcc/
>   * tree-vect-loop.c (vectorizable_reduction): Fail gracefully
>   for reduction chains that (now) include a call.
OK
jeff
>

Re: [PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 16:41 +, Richard Sandiford wrote:
> In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:
> 
> Failed to match this instruction:
> (set (reg:SI 95)
> (ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> If we perform the natural simplification to:
> 
> (set (reg:SI 95)
> (ashift:SI (sign_extract:SI (reg:SI 97)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> then the pattern matches.  And it turns out that we do have a
> simplification like that already, but it would only kick in for
> extractions from a reg, not a subreg.  E.g.:
Yea.  I ran into similar problems with the extract/extend bits in
combine.  And we know it's a fairly general problem that we don't
handle SUBREGs anywhere near as consistently as REGs.


> 
> (set (reg:SI 95)
> (ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> would simplify to:
> 
> (set (reg:SI 95)
> (ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
> (const_int 3 [0x3])
> (const_int 0 [0])) 0)
> (const_int 19 [0x13])))
> 
> IMO the subreg case is even more obviously a simplification
> than the bare reg case, since the net effect is to remove
> either one or two subregs, rather than simply change the
> position of a subreg/truncation.
> 
> However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
> for -m32 on x86_64-linux-gnu, because we could then simplify
> a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
> pattern did already seem to want to handle QImode extractions:
> 
>   "ix86_match_ccmode (insn, CCNOmode)
>&& ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
>|| GET_MODE (operands[2]) == SImode
>|| GET_MODE (operands[2]) == HImode
>|| GET_MODE (operands[2]) == QImode)
> 
> but I'm not sure how often the QI case would trigger in practice,
> since the zero_extract mode was restricted to HI and above.  I checked
> the other x86 patterns and couldn't see any other instances of this.
> 
> Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
> OK to install?
> 
> Richard
> 
> 
> 2020-01-27  Richard Sandiford  
> 
> gcc/
>   PR rtl-optimization/87763
>   * simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
>   simplification to handle subregs as well as bare regs.
>   * config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
Do you need to check for and reject paradoxicals here?  If not, OK as-
is.  If you need to check, then that's pre-approved as well.

jeff
>

[PATCH] vect: Pattern-matched calls in reduction chains

2020-01-27 Thread Richard Sandiford

gcc.dg/pr56350.c started ICEing for SVE in GCC 10 because we
pattern-matched a division reduction:

  a /= 8;

into a signed shift with division semantics:

  ... = IFN_SDIV_POW2 (..., 3);

whereas the reduction code expected it still to be a gassign.

One fix would be to check for a reduction in the pattern matcher
(but current patterns don't generally do that).  Another would be
to fail gracefully for reductions involving calls.  Since we can't
vectorise the reduction either way, and probably have a better shot
with the shift form, this patch goes for the "fail gracefully" approach.
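
For readers less familiar with the "division semantics" part, the
following sketch shows the arithmetic as I read it: a signed division by a
power of two written as a biased shift.  The name sdiv_pow2 is
hypothetical and this is not the IFN_SDIV_POW2 implementation, only an
illustration.  It assumes that >> on a negative int is an arithmetic
shift, which GCC documents but which is implementation-defined before
C++20.

```cpp
// Signed division by 2^shift, truncating toward zero like C's "/",
// expressed with a shift.  Assumes >> on a negative int is arithmetic.
constexpr int sdiv_pow2(int x, int shift)
{
  // Negative inputs get a bias of 2^shift - 1 so that the shift rounds
  // toward zero rather than toward minus infinity.
  return (x + (x < 0 ? (1 << shift) - 1 : 0)) >> shift;
}

// The biased shift agrees with "a /= 8" ...
static_assert(sdiv_pow2(-9, 3) == -9 / 8, "biased shift matches division");
// ... whereas a bare arithmetic shift does not for negative inputs.
static_assert((-9 >> 3) != -9 / 8, "a plain shift rounds toward -infinity");
```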

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  OK to install?

Richard


2020-01-27  Richard Sandiford  

gcc/
* tree-vect-loop.c (vectorizable_reduction): Fail gracefully
for reduction chains that (now) include a call.
---
 gcc/tree-vect-loop.c | 14 +++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index b4cfad875ab..53fccb715ef 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -6026,10 +6026,18 @@ vectorizable_reduction (stmt_vec_info stmt_info, 
slp_tree slp_node,
 info_for_reduction to work.  */
   if (STMT_VINFO_LIVE_P (vdef))
STMT_VINFO_REDUC_DEF (def) = phi_info;
-  if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (vdef->stmt)))
+  gassign *assign = dyn_cast <gassign *> (vdef->stmt);
+  if (!assign)
{
- if (!tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (vdef->stmt)),
- TREE_TYPE (gimple_assign_rhs1 (vdef->stmt))))
+ if (dump_enabled_p ())
+   dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
+"reduction chain includes calls.\n");
+ return false;
+   }
+  if (CONVERT_EXPR_CODE_P (gimple_assign_rhs_code (assign)))
+   {
+ if (!tree_nop_conversion_p (TREE_TYPE (gimple_assign_lhs (assign)),
+ TREE_TYPE (gimple_assign_rhs1 (assign))))
{
  if (dump_enabled_p ())
dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,

Re: [PATCH] Fix missed IPA-CP on by-ref argument directly passed through (PR ipa/93429)

2020-01-27 Thread Jeff Law

On Mon, 2020-01-27 at 04:53 +, Feng Xue OS wrote:
> Current IPA does not propagate aggregate constant for by-ref argument
> if it is simple pass-through of caller parameter. Here is an example,
> 
>f1 (int *p)
>{
>  ... = *p;
>  ...
>}
> 
>f2 (int *p)
>{
>   *p = 2;
>   f1 (p);
>}
> 
> It is easy to know that in f1(), *p should be 2 after a simple propagation
> from f2() to f1(). But this is missed due to some bug, which is targeted
> by this patch.
> 
> Bootstrapped/regtested on x86_64-linux and aarch64-linux.
> 
> Feng
> ---
> 2020-01-27  Feng Xue  
> 
> PR ipa/93429
> * ipa-cp.c (propagate_aggs_across_jump_function): Further
> check aggregate jump functions if by-ref argument is simple
> pass through.
> (intersect_aggregates_with_edge): Likewise.
I'm deferring this to gcc-11 as it's a missed-optimization and we're in
stage4.

jeff
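
For readers who want to reproduce the shape of the case quoted above, the following is a minimal sketch, not the PR's actual testcase. The bodies of f1 and f2 are filled in for illustration only, and the noinline attribute (a GNU extension) is an assumption added here so that inlining does not hide the interprocedural propagation being discussed.

```cpp
// Callee: reads through the by-reference argument.  With the
// propagation described in the quoted mail, *p is known to be 2 here.
__attribute__((noinline)) int f1(int *p)
{
  return *p + 1;
}

// Caller: stores a constant through its own parameter and then passes
// that parameter straight through to f1 (a simple pass-through).
__attribute__((noinline)) int f2(int *p)
{
  *p = 2;
  return f1(p);
}
```

Whether a given compiler version folds the load in f1 depends on the state of PR ipa/93429; the sketch only shows the source pattern.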

[PATCH] simplify-rtx: Extend (truncate (*extract ...)) fold [PR87763]

2020-01-27 Thread Richard Sandiford

In the gcc.target/aarch64/lsl_asr_sbfiz.c part of this PR, we have:

Failed to match this instruction:
(set (reg:SI 95)
(ashift:SI (subreg:SI (sign_extract:DI (subreg:DI (reg:SI 97) 0)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))

If we perform the natural simplification to:

(set (reg:SI 95)
(ashift:SI (sign_extract:SI (reg:SI 97)
(const_int 3 [0x3])
(const_int 0 [0]))
(const_int 19 [0x13])))

then the pattern matches.  And it turns out that we do have a
simplification like that already, but it would only kick in for
extractions from a reg, not a subreg.  E.g.:

(set (reg:SI 95)
(ashift:SI (subreg:SI (sign_extract:DI (reg:DI X)
(const_int 3 [0x3])
(const_int 0 [0])) 0)
(const_int 19 [0x13])))

would simplify to:

(set (reg:SI 95)
(ashift:SI (sign_extract:SI (subreg:SI (reg:DI X) 0)
(const_int 3 [0x3])
(const_int 0 [0]))
(const_int 19 [0x13])))

IMO the subreg case is even more obviously a simplification
than the bare reg case, since the net effect is to remove
either one or two subregs, rather than simply change the
position of a subreg/truncation.
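
The equivalence behind this fold can be checked with ordinary integer arithmetic. The sketch below is only a model: the helper names are hypothetical, bit positions are counted from the least significant bit (so any BITS_BIG_ENDIAN position adjustment is ignored), and the undefined upper bits of the paradoxical subreg are modelled as zero.

```cpp
#include <cstdint>

// Hypothetical model of (sign_extract:DI x len pos).
// Requires 0 < len < 64 and pos + len <= 64.
std::int64_t sign_extract_di(std::uint64_t x, unsigned len, unsigned pos)
{
  std::uint64_t field = (x >> pos) & ((std::uint64_t{1} << len) - 1);
  std::uint64_t sign = std::uint64_t{1} << (len - 1);
  // Flip the field's top bit, then subtract it again to sign-extend.
  return static_cast<std::int64_t>(field ^ sign)
         - static_cast<std::int64_t>(sign);
}

// Hypothetical model of (sign_extract:SI x len pos).
// Requires 0 < len < 32 and pos + len <= 32.
std::int32_t sign_extract_si(std::uint32_t x, unsigned len, unsigned pos)
{
  std::uint32_t field = (x >> pos) & ((std::uint32_t{1} << len) - 1);
  std::uint32_t sign = std::uint32_t{1} << (len - 1);
  return static_cast<std::int32_t>(field ^ sign)
         - static_cast<std::int32_t>(sign);
}

// Model of
//   (subreg:SI (sign_extract:DI (subreg:DI (reg:SI x) 0) len pos) 0)
// i.e. widen, extract in the wide mode, then truncate.
// Requires 0 < len < 32 and pos + len <= 32, so the field lies
// entirely within the low 32 bits.
std::int32_t extract_wide_then_truncate(std::uint32_t x, unsigned len,
                                        unsigned pos)
{
  std::uint64_t wide = x;
  return static_cast<std::int32_t>(sign_extract_di(wide, len, pos));
}
```

As long as the extracted field lies within the low 32 bits, both routes give the same SImode value, which is why the narrower extraction is a valid replacement.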

However, doing that regressed gcc.dg/tree-ssa/pr64910-2.c
for -m32 on x86_64-linux-gnu, because we could then simplify
a :HI zero_extract to a :QI one.  The associated *testqi_ext_3
pattern did already seem to want to handle QImode extractions:

  "ix86_match_ccmode (insn, CCNOmode)
   && ((TARGET_64BIT && GET_MODE (operands[2]) == DImode)
   || GET_MODE (operands[2]) == SImode
   || GET_MODE (operands[2]) == HImode
   || GET_MODE (operands[2]) == QImode)

but I'm not sure how often the QI case would trigger in practice,
since the zero_extract mode was restricted to HI and above.  I checked
the other x86 patterns and couldn't see any other instances of this.

Tested on aarch64-linux-gnu, x86_64-linux-gnu and powerpc64le-linux-gnu,
OK to install?

Richard


2020-01-27  Richard Sandiford  

gcc/
PR rtl-optimization/87763
* simplify-rtx.c (simplify_truncation): Extend sign/zero_extract
simplification to handle subregs as well as bare regs.
* config/i386/i386.md (*testqi_ext_3): Match QI extracts too.
---
 gcc/config/i386/i386.md | 2 +-
 gcc/simplify-rtx.c  | 4 +++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6e9c9bd2fb6..a125ab350bb 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -8927,7 +8927,7 @@ (define_insn "*testqi_ext_2"
 (define_insn_and_split "*testqi_ext_3"
   [(set (match_operand 0 "flags_reg_operand")
 (match_operator 1 "compare_operator"
- [(zero_extract:SWI248
+ [(zero_extract:SWI
 (match_operand 2 "nonimmediate_operand" "rm")
 (match_operand 3 "const_int_operand" "n")
 (match_operand 4 "const_int_operand" "n"))
diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
index eff1d07a253..db4f9339c15 100644
--- a/gcc/simplify-rtx.c
+++ b/gcc/simplify-rtx.c
@@ -736,7 +736,9 @@ simplify_truncation (machine_mode mode, rtx op,
  (*_extract:M1 (truncate:M1 (reg:M2)) (len) (pos')) if possible without
  changing len.  */
   if ((GET_CODE (op) == ZERO_EXTRACT || GET_CODE (op) == SIGN_EXTRACT)
-  && REG_P (XEXP (op, 0))
+  && (REG_P (XEXP (op, 0))
+ || (SUBREG_P (XEXP (op, 0))
+ && REG_P (SUBREG_REG (XEXP (op, 0)))))
   && GET_MODE (XEXP (op, 0)) == GET_MODE (op)
   && CONST_INT_P (XEXP (op, 1))
   && CONST_INT_P (XEXP (op, 2)))
-- 
2.17.1

Fwd: Re: [Patch][Fortran] On unformatted read, convert != 0 logical to 1

2020-01-27 Thread Tobias Burnus

Just saw that gcc-patches@ wasn't included in the list. See: 
https://gcc.gnu.org/ml/fortran/2020-01/threads.html#00088 for the thread.


Tobias

-------- Forwarded Message --------
Subject: 	Re: [Patch][Fortran] On unformatted read, convert != 0 logical to 1

Date:   Mon, 27 Jan 2020 17:29:10 +0100
From:   Tobias Burnus 
To: 	Tobias Burnus , Thomas Koenig 
, Janne Blomqvist 
CC: 	Richard Biener , Marco Jacopo 
Ferrarotti , fort...@gcc.gnu.org 
, Jerry DeLisle 




On 1/27/20 9:58 AM, Tobias Burnus wrote:
I think (3) with (a) and only (iii) is my preferred combination, but I 
am also open for other suggestions.


That's now what the attached patch does.

RFC: Should this option use "!= 0" as .true. or "(var % 2) == 1" as 
.true.? Either works for the ubiquitous 0 = .false. plus both 1 
(gfortran, ifort -standard-semantics, …) and -1 (ifort, PGI, …) as 
.true. [Other values can only occur when modifying the value directly, 
which should not be done in a proper program, or if interop goes wrong with 
.not.. (".true.(1) xor -1" or ".true.(-1) xor 1")]


I have used != 0 – and placed it before the endian conversion ("else 
if"). For the even/odd check, it has to be after the endian conversion.


Besides != 0 and even/odd, one could also change the Boolean flag into a 
three-state flag, using != 0 or even/odd at the user's discretion but 
that seems to be overkill.
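
To make the two candidate rules concrete, here is a small sketch that is not taken from the patch. The function names are hypothetical, the raw LOGICAL is modelled as a 32-bit unsigned bit pattern, and the odd/even rule is written as a low-bit test because in C++ (-1) % 2 evaluates to -1 rather than 1.

```cpp
#include <cstdint>

// Rule "!= 0": any nonzero bit pattern reads as .true. (canonical 1).
std::uint32_t logical_from_nonzero(std::uint32_t raw)
{
  return raw != 0 ? 1u : 0u;
}

// Rule "odd": only the least significant bit decides.
std::uint32_t logical_from_low_bit(std::uint32_t raw)
{
  return raw & 1u;
}

// Byte swap standing in for the endian conversion of a 4-byte value.
std::uint32_t swap_bytes32(std::uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u)
         | ((v << 8) & 0x00ff0000u) | (v << 24);
}
```

Both rules agree on 0, 1 and -1 (all bits set) and differ on values such as 2. The nonzero test is unaffected by byte order, whereas the low-bit test gives the wrong answer on a value that has not been byte-swapped yet, which matches the remark about where each check has to sit relative to the endian conversion.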


What do you think?

Tobias

PS: Minor changes: libgomp.texi: I removed some trailing "." in the 
@menu for consistency. And in libgfortran.h, I put "optional_plus" into 
another line to avoid mixing Boolean and integer items. One could change 
optional_plus, locus, all_unbuffered, unbuffered_preconnect, backtrace, 
legacy_logical_read, backtrace to "bool" and move the bool and the 
char item together, saving 8*4 - 8*1 = 24 bytes. [However, the type is 
only used once, for a static variable.]
	* gfortran.texi (Internal representation of LOGICAL variables):
	Add @ref.
	(GFORTRAN_LEGACY_LOGICAL_READ): Document new env variable.

	* gfortran.dg/read_logical_1.f90: New.
	* gfortran.dg/read_logical_2.f90: New.

	* libgfortran.h (options_t): Add legacy_logical_read.
	* runtime/environ.c (variable_table): Add entry for
	GFORTRAN_LEGACY_LOGICAL_READ.
	* io/transfer.c (unformatted_read): If options.legacy_logical_read,
	convert bitvalue != 0 to canonical .true. (= 1) for BT_LOGICAL.

 gcc/fortran/gfortran.texi|  34 -
 gcc/testsuite/gfortran.dg/read_logical_1.f90 | 194 +++
 gcc/testsuite/gfortran.dg/read_logical_2.f90 |  66 +
 libgfortran/io/transfer.c|  60 -
 libgfortran/libgfortran.h|   5 +-
 libgfortran/runtime/environ.c|   4 +
 6 files changed, 355 insertions(+), 8 deletions(-)

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index a50634ab9d2..b0e0077e80e 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -604,15 +604,16 @@ Malformed environment variables are silently ignored.
 * GFORTRAN_STDIN_UNIT:: Unit number for standard input
 * GFORTRAN_STDOUT_UNIT:: Unit number for standard output
 * GFORTRAN_STDERR_UNIT:: Unit number for standard error
-* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units.
+* GFORTRAN_UNBUFFERED_ALL:: Do not buffer I/O for all units
 * GFORTRAN_UNBUFFERED_PRECONNECTED:: Do not buffer I/O for preconnected units.
 * GFORTRAN_SHOW_LOCUS::  Show location for runtime errors
 * GFORTRAN_OPTIONAL_PLUS:: Print leading + where permitted
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
+* GFORTRAN_LEGACY_LOGICAL_READ:: Nonzero, nonone unformatted reads of logicals
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
-* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files.
-* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files.
+* GFORTRAN_FORMATTED_BUFFER_SIZE:: Buffer size for formatted files
+* GFORTRAN_UNFORMATTED_BUFFER_SIZE:: Buffer size for unformatted files
 @end menu
 
 @node TMPDIR
@@ -784,6 +785,30 @@ the backtracing, set the variable to @samp{n}, @samp{N}, @samp{0}.
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_LEGACY_LOGICAL_READ
+@section @env{GFORTRAN_LEGACY_LOGICAL_READ}--Nonzero, nonone unformatted reads of logicals
+
+GNU Fortran uses @code{0} and @code{1} as internal representation for
+logical @code{.false.} and @code{.true.}, respectively.  However, some other
+compilers use different representations; the most common other representation
+is @code{-1} for @code{.true.}.
+
+The different internal representation affects procedure calls plus writing and
+reading unformatted files.  This option only affects the latter.  If the first
+character of the @env{GFORTRAN_LEGACY_LOGICAL_READ} environment variable is
+@samp{y}, @samp{Y} or @samp{1}, all nonzero values in unformatted reads of

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-27 Thread Wilco Dijkstra

Hi Segher,

> On Thu, Jan 16, 2020 at 12:50:14PM +, Wilco Dijkstra wrote:
>> The separate shrinkwrapping pass may insert stores in the middle
>> of atomics loops which can cause issues on some implementations.
>> Avoid this by delaying splitting of atomic patterns until after
>> prolog/epilog generation.
>
> Note that this isn't specific to sws at all: there isn't anything
> stopping later passes from doing this either.  Is there anything that
> protects us from sched2 doing similar here, for example?

The expansions create extra basic blocks and insert various barriers
that would stop any reasonable scheduler from doing it. And the
current scheduler is basic block based.

Wilco

Re: [PATCH][AArch64] Fix shrinkwrapping interactions with atomics (PR92692)

2020-01-27 Thread Segher Boessenkool

Hi!

On Thu, Jan 16, 2020 at 12:50:14PM +, Wilco Dijkstra wrote:
> The separate shrinkwrapping pass may insert stores in the middle
> of atomics loops which can cause issues on some implementations.
> Avoid this by delaying splitting of atomic patterns until after
> prolog/epilog generation.

Note that this isn't specific to sws at all: there isn't anything
stopping later passes from doing this either.  Is there anything that
protects us from sched2 doing similar here, for example?

Segher

Re: [PATCH Coroutines] Handle type deduction of auto and decltype(auto) with indirect_ref expression

2020-01-27 Thread Nathan Sidwell


On 1/21/20 6:21 AM, JunMa wrote:

Hi
When testing GCC with cppcoro, I found a case like:

   int& awaitable::await_resume()
   auto x1 = co_await awaitable;
   decltype(auto) x2 = co_await awaitable;

According to the standard, typeof(x1) should be int and typeof(x2) should
be int&.  However, we get int for both x1 and x2, because we do not
consider the indirect_ref that wraps the await_resume call expression
(convert_from_reference) and that takes part in the type deduction
of auto and decltype(auto).

This patch wraps the co_await expression with an indirect_ref, matching
the await_resume expression, and sinks the await_resume expression
to the real call_expr when replacing the co_await expression.  It fixes
type deduction of auto and decltype(auto) in coroutines.

Bootstrap and test on X86_64, is it OK?


+  /* Wrap co_await_expr.  */
+  if (TREE_CODE (awrs_call) == INDIRECT_REF)
+await_expr = build1_loc (loc, INDIRECT_REF, TREE_TYPE (awrs_call),
+await_expr);

I think all you want here is:
  await_expr = convert_from_reference (await_expr);
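
The deduction rule at stake can be seen without coroutines at all. The sketch below assumes C++14 or later and uses a hypothetical plain function, await_resume_like, in place of awaitable::await_resume; it only illustrates what auto and decltype(auto) should deduce from an expression that returns int&.

```cpp
#include <type_traits>

int storage = 42;

// Stand-in for a member like `int& awaitable::await_resume()`.
int &await_resume_like() { return storage; }

bool deduction_demo()
{
  storage = 42;

  auto x1 = await_resume_like();            // copy: deduced as int
  decltype(auto) x2 = await_resume_like();  // alias: deduced as int&

  static_assert(std::is_same<decltype(x1), int>::value,
                "auto drops the reference");
  static_assert(std::is_same<decltype(x2), int &>::value,
                "decltype(auto) keeps the reference");

  x2 = 7;  // writes through to storage; x1 is an independent copy
  return storage == 7 && x1 == 42;
}
```

The co_await case should behave the same way once the await expression carries the reference type that convert_from_reference expects.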



Regards
JunMa

gcc/cp
2020-01-21  Jun Ma 

     * coroutines.cc (build_co_await): Wrap co_await with
     indirect_ref when needed.
     (co_await_expander):  Sink to call_expr if await_resume
     is wrapped by indirect_ref.

gcc/testsuite
2020-01-21  Jun Ma 

     * g++.dg/coroutines/co-await-14-return-ref-to-auto.C: Add label.



--
Nathan Sidwell

[PINGx2][PATCH][GCC][ARM] Arm generates out of range conditional branches in Thumb2 (PR91816)

2020-01-27 Thread Stam Markianos-Wright



On 1/16/20 4:06 PM, Stam Markianos-Wright wrote:
> 
> 
> On 1/8/20 3:18 PM, Stam Markianos-Wright wrote:
>>
>>
>> On 12/10/19 5:03 PM, Kyrill Tkachov wrote:
>>> Hi Stam,
>>>
>>> On 11/15/19 5:26 PM, Stam Markianos-Wright wrote:
 Pinging with more correct maintainers this time :)

 Also would need to backport to gcc7,8,9, but need to get this approved
 first!

>>>
>>> Sorry for the delay.
>>
>> Same here now! Sorry, I totally forgot about this in the lead-up to Xmas!
>>
>> Done the changes marked below and also removed the unnecessary extra 
>> #defines 
>> from the test.
> 
> Ping :)
> 
> Cheers,
> Stam
> 
>>
>>>
>>>
 Thank you,
 Stam


  Forwarded Message 
 Subject: Re: [PATCH][GCC][ARM] Arm generates out of range conditional
 branches in Thumb2 (PR91816)
 Date: Mon, 21 Oct 2019 10:37:09 +0100
 From: Stam Markianos-Wright 
 To: Ramana Radhakrishnan 
 CC: gcc-patches@gcc.gnu.org , nd ,
 James Greenhalgh , Richard Earnshaw
 



 On 10/13/19 4:23 PM, Ramana Radhakrishnan wrote:
 >>
 >> Patch bootstrapped and regression tested on arm-none-linux-gnueabihf,
 >> however, on my native Aarch32 setup the test times out when run as part
 >> of a big "make check-gcc" regression, but not when run individually.
 >>
 >> 2019-10-11  Stamatis Markianos-Wright 
 >>
 >>   * config/arm/arm.md: Update b for Thumb2 range checks.
 >>   * config/arm/arm.c: New function arm_gen_far_branch.
 >>   * config/arm/arm-protos.h: New function arm_gen_far_branch
 >>   prototype.
 >>
 >> gcc/testsuite/ChangeLog:
 >>
 >> 2019-10-11  Stamatis Markianos-Wright 
 >>
 >>   * testsuite/gcc.target/arm/pr91816.c: New test.
 >
 >> diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
 >> index f995974f9bb..1dce333d1c3 100644
 >> --- a/gcc/config/arm/arm-protos.h
 >> +++ b/gcc/config/arm/arm-protos.h
 >> @@ -570,4 +570,7 @@ void arm_parse_option_features (sbitmap, const 
 cpu_arch_option *,
 >>
 >>   void arm_initialize_isa (sbitmap, const enum isa_feature *);
 >>
 >> +const char * arm_gen_far_branch (rtx *, int,const char * , const char 
 >> *);
 >> +
 >> +
 >
 > Lets get the nits out of the way.
 >
 > Unnecessary extra new line, need a space between int and const above.
 >
 >

 .Fixed!

 >>   #endif /* ! GCC_ARM_PROTOS_H */
 >> diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
 >> index 39e1a1ef9a2..1a693d2ddca 100644
 >> --- a/gcc/config/arm/arm.c
 >> +++ b/gcc/config/arm/arm.c
 >> @@ -32139,6 +32139,31 @@ arm_run_selftests (void)
 >>   }
 >>   } /* Namespace selftest.  */
 >>
 >> +
 >> +/* Generate code to enable conditional branches in functions over 1 
 MiB.  */
 >> +const char *
 >> +arm_gen_far_branch (rtx * operands, int pos_label, const char * dest,
 >> +    const char * branch_format)
 >
 > Not sure if this is some munging from the attachment but check
 > vertical alignment of parameters.
 >

 .Fixed!

 >> +{
 >> +  rtx_code_label * tmp_label = gen_label_rtx ();
 >> +  char label_buf[256];
 >> +  char buffer[128];
 >> +  ASM_GENERATE_INTERNAL_LABEL (label_buf, dest , \
 >> +    CODE_LABEL_NUMBER (tmp_label));
 >> +  const char *label_ptr = arm_strip_name_encoding (label_buf);
 >> +  rtx dest_label = operands[pos_label];
 >> +  operands[pos_label] = tmp_label;
 >> +
 >> +  snprintf (buffer, sizeof (buffer), "%s%s", branch_format , 
 >> label_ptr);
 >> +  output_asm_insn (buffer, operands);
 >> +
 >> +  snprintf (buffer, sizeof (buffer), "b\t%%l0%d\n%s:", pos_label, 
 label_ptr);
 >> +  operands[pos_label] = dest_label;
 >> +  output_asm_insn (buffer, operands);
 >> +  return "";
 >> +}
 >> +
 >> +
 >
 > Unnecessary extra newline.
 >

 .Fixed!

 >>   #undef TARGET_RUN_TARGET_SELFTESTS
 >>   #define TARGET_RUN_TARGET_SELFTESTS selftest::arm_run_selftests
 >>   #endif /* CHECKING_P */
 >> diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
 >> index f861c72ccfc..634fd0a59da 100644
 >> --- a/gcc/config/arm/arm.md
 >> +++ b/gcc/config/arm/arm.md
 >> @@ -6686,9 +6686,16 @@
 >>   ;; And for backward branches we have
 >>   ;;   (neg_range - neg_base_offs + pc_offs) = (neg_range - (-2 or -4) 
 >>+ 4).
 >>   ;;
 >> +;; In 16-bit Thumb these ranges are:
 >>   ;; For a 'b'   pos_range = 2046, neg_range = -2048 giving 
 (-2040->2048).
 >>   ;; For a 'b' pos_range = 254, neg_range = -256  giving (-250 
 >>->256).
 >>
 >> +;; In 32-bit Thumb these ranges are:
 >> +;; For a 'b'   +/- 16MB is not checked for.
 >> +;; For a 'b'

[Pingx3][GCC][PATCH][ARM]Add ACLE intrinsics for dot product (vusdot - vector, vdot - by element) for AArch32 AdvSIMD ARMv8.6 Extension

2020-01-27 Thread Stam Markianos-Wright

On 1/16/20 4:05 PM, Stam Markianos-Wright wrote:
> 
> 
> On 1/10/20 6:48 PM, Stam Markianos-Wright wrote:
>>
>>
>> On 12/18/19 1:25 PM, Stam Markianos-Wright wrote:
>>>
>>>
>>> On 12/13/19 10:22 AM, Stam Markianos-Wright wrote:
 Hi all,

 This patch adds the ARMv8.6 Extension ACLE intrinsics for dot product
 operations (vector/by element) to the ARM back-end.

 These are:
 usdot (vector), dot (by element).

 The functions are optional from ARMv8.2-a as -march=armv8.2-a+i8mm and
 for ARM they remain optional as of ARMv8.6-a.

 The functions are declared in arm_neon.h, RTL patterns are defined to
 generate assembler and tests are added to verify and perform adequate 
 checks.

 Regression testing on arm-none-eabi passed successfully.

 This patch depends on:

 https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02195.html

 for ARM CLI updates, and on:

 https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html

 for testsuite effective_target update.

 Ok for trunk?
>>>
>>> .Ping :)
>>>
>> Ping :)
>>
>> New diff addressing review comments from Aarch64 version of the patch.
>>
>> _Change of order of operands in RTL patterns.
>> _Change tests to use check-function-bodies, compile with optimisation and 
>> check for exact registers.
>> _Rename tests to remove "-compile-" in filename.
>>
> 
> Ping!
> 
> Cheers,
> Stam
> 

 Cheers,
 Stam

 ACLE documents are at https://developer.arm.com/docs/101028/latest
 ISA documents are at https://developer.arm.com/docs/ddi0596/latest

 PS. I don't have commit rights, so if someone could commit on my behalf,
 that would be great :)

 gcc/ChangeLog:

 2019-11-28  Stam Markianos-Wright  

  * config/arm/arm-builtins.c (enum arm_type_qualifiers):
  (USTERNOP_QUALIFIERS): New define.
  (USMAC_LANE_QUADTUP_QUALIFIERS): New define.
  (SUMAC_LANE_QUADTUP_QUALIFIERS): New define.
  (arm_expand_builtin_args):
  Add case ARG_BUILTIN_LANE_QUADTUP_INDEX.
  (arm_expand_builtin_1): Add qualifier_lane_quadtup_index.
  * config/arm/arm_neon.h (vusdot_s32): New.
  (vusdot_lane_s32): New.
  (vusdotq_lane_s32): New.
  (vsudot_lane_s32): New.
  (vsudotq_lane_s32): New.
  * config/arm/arm_neon_builtins.def
  (usdot,usdot_lane,sudot_lane): New.
  * config/arm/iterators.md (DOTPROD_I8MM): New.
  (sup, opsuffix): Add .
     * config/arm/neon.md (neon_usdot, dot_lane: New.
  * config/arm/unspecs.md (UNSPEC_DOT_US, UNSPEC_DOT_SU): New.

 gcc/testsuite/ChangeLog:

 2019-12-12  Stam Markianos-Wright  

  * gcc.target/arm/simd/vdot-compile-2-1.c: New test.
  * gcc.target/arm/simd/vdot-compile-2-2.c: New test.
  * gcc.target/arm/simd/vdot-compile-2-3.c: New test.
  * gcc.target/arm/simd/vdot-compile-2-4.c: New test.

>>

Re: [PATCH Coroutines]Fix an ICE case in expanding co_await expression

2020-01-27 Thread Nathan Sidwell


On 1/22/20 6:38 AM, bin.cheng wrote:

Hi,

Though function co_await_expander may need to be further revised, this simple
patch fixes an ICE case in co_await_expander,

 Handle CO_AWAIT_EXPR in conversion in co_await_expander.
 
 Function co_await_expander expands CO_AWAIT_EXPR and inserts expanded

 code before result of co_await is used, however, it doesn't cover the
 type conversion case and leads to gimplify ICE.  This patch fixes it.

Bootstrap and test on x86_64.  Is it OK?

Thanks,
bin

gcc/cp
2020-01-22  Bin Cheng  

 * coroutines.cc (co_await_expander): Handle type conversion case.

gcc/testsuite
2020-01-22  Bin Cheng  

 * g++.dg/coroutines/co-await-syntax-09-convert.C: New test.



ok, thanks

--
Nathan Sidwell

[COMMITTED] aarch64: Fix pr71727.c failure

2020-01-27 Thread Richard Sandiford

This test started failing after the switch to -fno-common because we can
now force the array to be aligned to 16 bytes, which in turn lets us use
SIMD accesses.  Locally restoring -fcommon seems the most faithful to
the original PR.

Tested on aarch64-linux-gnu & pushed.

Richard


2020-01-27  Richard Sandiford  

gcc/testsuite/
PR testsuite/71727
* gcc.target/aarch64/pr71727.c: Add -fcommon.
---
 gcc/testsuite/gcc.target/aarch64/pr71727.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/aarch64/pr71727.c b/gcc/testsuite/gcc.target/aarch64/pr71727.c
index 05eef3e9191..41fa72bc67e 100644
--- a/gcc/testsuite/gcc.target/aarch64/pr71727.c
+++ b/gcc/testsuite/gcc.target/aarch64/pr71727.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-mstrict-align -O3" } */
+/* { dg-options "-mstrict-align -O3 -fcommon" } */
 
 struct test_struct_s
 {

[Ping][GCC][IRA] Revert 11b8091fb to fix Bug 93221

2020-01-27 Thread Joel Hutton

Ping! Eric, do you have any objections to reverting?

On 21/01/2020 19:16, Vladimir Makarov wrote:
> I am in favour of reverting the patch now.  But maybe Eric can provide
> another version of the patch not causing the arm problem.  I am ready to
> reconsider this too.  So I guess the decision is up to Eric.

Eric did previously say "Feel free to eventually revert it.", but I
hoped he would reply on this thread.

--- Comment #7 from Eric Botcazou  ---
Probably missing live range splitting or somesuch, as envisioned by
Vladimir in his approval message.  Feel free to eventually revert it.


Changelog:

2020-01-21  Joel Hutton  

 PR target/93221
 * ira.c (ira): Revert use of simplified LRA algorithm.

gcc/testsuite/ChangeLog:

2020-01-21  Joel Hutton  

 PR target/93221
 * gcc.target/aarch64/pr93221.c: New test.
From 1a2980ef6eeb76dbf0556f806a85a4f49ad3ebdd Mon Sep 17 00:00:00 2001
From: Joel Hutton 
Date: Tue, 21 Jan 2020 09:37:48 +
Subject: [PATCH] [IRA] Fix bug 93221 by reverting 11b8091fb

11b8091fb introduced a simplified LRA algorithm for -O0 that turned off
hard register splitting.  This causes a problem for parameters passed in
multiple registers on aarch64.  Reverting it fixes bug 93221.
---
 gcc/ira.c  | 38 +-
 gcc/testsuite/gcc.target/aarch64/pr93221.c | 10 ++
 2 files changed, 25 insertions(+), 23 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/pr93221.c

diff --git a/gcc/ira.c b/gcc/ira.c
index 46091adf8109263c72343dccfe4913857b5c74ae..c8b5f869da121506f0414901271eae9810689316 100644
--- a/gcc/ira.c
+++ b/gcc/ira.c
@@ -5205,35 +5205,27 @@ ira (FILE *f)
   /* Perform target specific PIC register initialization.  */
   targetm.init_pic_reg ();
 
-  if (optimize)
-{
-  ira_conflicts_p = true;
-
-  /* Determine the number of pseudos actually requiring coloring.  */
-  unsigned int num_used_regs = 0;
-  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
-	if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
-	  num_used_regs++;
-
-  /* If there are too many pseudos and/or basic blocks (e.g. 10K
-	 pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
-	 use simplified and faster algorithms in LRA.  */
-  lra_simple_p
-	= ira_use_lra_p
-	  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
-}
-  else
-{
-  ira_conflicts_p = false;
-  lra_simple_p = ira_use_lra_p;
-}
+  ira_conflicts_p = optimize > 0;
+
+  /* Determine the number of pseudos actually requiring coloring.  */
+  unsigned int num_used_regs = 0;
+  for (unsigned int i = FIRST_PSEUDO_REGISTER; i < DF_REG_SIZE (df); i++)
+if (DF_REG_DEF_COUNT (i) || DF_REG_USE_COUNT (i))
+  num_used_regs++;
+
+  /* If there are too many pseudos and/or basic blocks (e.g. 10K
+ pseudos and 10K blocks or 100K pseudos and 1K blocks), we will
+ use simplified and faster algorithms in LRA.  */
+  lra_simple_p
+= ira_use_lra_p
+  && num_used_regs >= (1U << 26) / last_basic_block_for_fn (cfun);
 
   if (lra_simple_p)
 {
   /* It permits to skip live range splitting in LRA.  */
   flag_caller_saves = false;
   /* There is no sense to do regional allocation when we use
-	 simplified LRA.  */
+	simplified LRA.  */
   flag_ira_region = IRA_REGION_ONE;
   ira_conflicts_p = false;
 }
diff --git a/gcc/testsuite/gcc.target/aarch64/pr93221.c b/gcc/testsuite/gcc.target/aarch64/pr93221.c
new file mode 100644
index ..4dc2c3d0149423dd3d666f7428277ffa9eb765c4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/pr93221.c
@@ -0,0 +1,10 @@
+/* PR target/93221 */
+/* { dg-do compile } */
+/* { dg-options "-O0 -mno-omit-leaf-frame-pointer" } */
+
+struct S { __Int32x4_t b[2]; };
+
+void
+foo (struct S x)
+{
+}
-- 
2.17.1
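
The size heuristic that the revert restores for all optimization levels can be read as a standalone predicate. This is a sketch only: use_simple_lra and its parameters are hypothetical stand-ins for ira_use_lra_p, num_used_regs and last_basic_block_for_fn (cfun) in the diff above, and it assumes a nonzero block count.

```cpp
// True when the function is large enough that the simplified, faster
// LRA algorithms would be chosen.  Requires num_basic_blocks > 0.
bool use_simple_lra(bool ira_use_lra, unsigned num_used_regs,
                    unsigned num_basic_blocks)
{
  // 1U << 26 is 67108864, so the pseudo threshold shrinks as the
  // number of basic blocks grows.
  return ira_use_lra && num_used_regs >= (1U << 26) / num_basic_blocks;
}
```

With these numbers, the two examples in the source comment (10K pseudos with 10K blocks, 100K pseudos with 1K blocks) both cross the threshold, while a small function does not.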

Re: [PATCH Coroutines] Change context of label_decl in original function

2020-01-27 Thread Nathan Sidwell


On 1/20/20 11:08 PM, JunMa wrote:

Hi
This patch makes a minor fix: it changes the context of label_decl from the
original function to the actor function, which avoids an assertion in the
gimplify pass.

Bootstrap and test on X86_64, is it OK?


ok, thanks


gcc/cp
2020-01-21  Jun Ma 

     * coroutines.cc (transform_await_wrapper): Set actor function as
     new context of label_decl.
     (build_actor_fn): Fill new field of await_xform_data.

gcc/testsuite
2020-01-21  Jun Ma 

     * g++.dg/coroutines/co-await-04-control-flow.C: Add label.



--
Nathan Sidwell

Re: [PATCH Coroutines]Access promise via actor function's frame pointer argument

2020-01-27 Thread Nathan Sidwell


On 1/21/20 5:19 AM, Iain Sandoe wrote:

Hi Nathan, Bin,

bin.cheng  wrote:




Nathan, is this OK for trunk as-is?
thanks
Iain



ok, with one nit I noticed:

+act_des_fn (tree orig, tree fn_type, tree coro_frame_ptr, const char* name)


that final parameter should be 'const char *name' -- the '*' clings to 
the name not the type.




Patch updated as attached.

Thanks,
bin

gcc/cp
2020-01-20  Bin Cheng  
 * coroutines.cc (act_des_fn): New.
 (morph_fn_to_coro): Call act_des_fn to build actor/destroy decls.
 Access promise via actor function's frame pointer argument.
 (build_actor_fn, build_destroy_fn): Use frame pointer argument.






--
Nathan Sidwell

Re: [PATCH] Replace one error with inform.

2020-01-27 Thread David Malcolm

On Mon, 2020-01-27 at 16:23 +0100, Martin Liška wrote:
> On 1/27/20 2:38 PM, David Malcolm wrote:
> > Please add an
> >auto_diagnostic_group d;
> > here, so that -fdiagnostics-format=json can nest the note below the
> > error.
> > 
> > OK with that change.
> 
> Sure, here's another patch that does that for all error+inform
> pairs in the function.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression
> tests.
> 
> Ready to be installed?
> Thanks,
> Martin

LGTM, though a nit in the ChangeLog message: "couple of" to me means 2
and the number is 4, so "several" is probably better here.  [1] 

Thanks
Dave

[1] though I'm probably being overly pedantic; see 
https://xkcd.com/1070/

Re: [PATCH] Replace one error with inform.

2020-01-27 Thread Martin Liška


On 1/27/20 2:38 PM, David Malcolm wrote:

Please add an
   auto_diagnostic_group d;
here, so that -fdiagnostics-format=json can nest the note below the
error.

OK with that change.


Sure, here's another patch that does that for all error+inform
pairs in the function.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From 8c2facb9b82519b10e8a7a279cf85f3ab88de1d5 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 27 Jan 2020 15:03:55 +0100
Subject: [PATCH] Add couple of auto_diagnostic_group in
 redeclare_class_template.

gcc/cp/ChangeLog:

2020-01-27  Martin Liska  

	PR c++/92440
	* pt.c (redeclare_class_template): Group couple of
	errors and inform messages with auto_diagnostic_group.
---
 gcc/cp/pt.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index b8acedeaa5a..ed5df1e9841 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6148,6 +6148,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
 	  && (TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (tmpl_parm))
 		  != TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm)
 	{
+	  auto_diagnostic_group d;
 	  error ("template parameter %q+#D", tmpl_parm);
 	  inform (input_location, "redeclared here as %q#D", parm);
 	  return false;
@@ -6159,6 +6160,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
   tree p2 = TREE_VEC_ELT (parms, i);
   if (!template_parameter_constraints_equivalent_p (p1, p2))
 	{
+	  auto_diagnostic_group d;
 	  error ("declaration of template parameter %q+#D with different "
 		 "constraints", parm);
 	  inform (DECL_SOURCE_LOCATION (tmpl_parm),
@@ -6172,6 +6174,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
 
 	 A template-parameter may not be given default arguments
 	 by two different declarations in the same scope.  */
+	  auto_diagnostic_group d;
 	  error_at (input_location, "redefinition of default argument for %q#D", parm);
 	  inform (DECL_SOURCE_LOCATION (tmpl_parm),
 		  "original definition appeared here");
@@ -6206,6 +6209,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
   /* Two classes with different constraints declare different entities.  */
   if (!cp_tree_equal (req1, req2))
 {
+  auto_diagnostic_group d;
   error_at (input_location, "redeclaration %q#D with different "
 "constraints", tmpl);
   inform (DECL_SOURCE_LOCATION (tmpl),
-- 
2.25.0

Re: [PATCH] analyzer: fixes to tree_cmp and other comparators

2020-01-27 Thread David Malcolm

On Fri, 2020-01-24 at 11:16 +0100, Stefan Schulze Frielinghaus wrote:
> On Thu, Jan 23, 2020 at 05:13:20PM -0500, David Malcolm wrote:
> [...]
> > Fixes build on s390x-ibm-linux-gnu for stage 1, at least, with no
> > testsuite regressions.  Full bootstrap and regression test run
> > in progress.
> 
> Thank you for taking care of this! With your new patch I can
> successfully bootstrap + regtest.

Thanks.  I've committed this to master as
6a81cabc14426b642271647b03218a3af19d600f.

[PATCH] forwprop: Tweak choice of VEC_PERM_EXPR filler [PR92822]

2020-01-27 Thread Richard Sandiford

For the 2s failures in the PR, we have a V4SF VEC_PERM_EXPR in
which the first two elements are duplicates of one element and
the other two are don't-care:

v4sf_b = VEC_PERM_EXPR ;

The heuristic was to extend this with a blend:

v4sf_b = VEC_PERM_EXPR ;

but it seems better to extend a partial duplicate to a full duplicate:

v4sf_b = VEC_PERM_EXPR ;

Obviously this is still just a heuristic though.

I wondered whether to restrict this to two elements or more
but couldn't find any examples in which it made a difference.
Either way should be fine for the purposes of fixing this PR.
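
The filler heuristic can be modelled outside GCC as plain index arithmetic. The sketch below is an illustration under assumptions: build_selector is a hypothetical name, std::vector stands in for vec_perm_builder, each element of elts is an (input vector, lane) pair as in the patch, and elts must be non-empty.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Build a permute selector of refnelts entries from the significant
// (input, lane) pairs in elts, filling the don't-care tail as the
// patch describes.  Requires !elts.empty().
std::vector<unsigned>
build_selector(const std::vector<std::pair<unsigned, unsigned>> &elts,
               unsigned refnelts)
{
  std::vector<unsigned> sel;
  bool all_same = true;
  for (const auto &e : elts)
    {
      sel.push_back(e.second + e.first * refnelts);
      all_same = all_same && sel.back() == sel.front();
    }

  for (std::size_t i = elts.size(); i < refnelts; ++i)
    {
      if (all_same)
        // (a) The significant part duplicates one element: keep
        // duplicating it in the tail.
        sel.push_back(sel[0]);
      else
        // (b) Otherwise mimic a blend, preserving a uniform orig[0]
        // when the first element is lane 0 of input 0.
        sel.push_back(((elts[0].second == 0 && elts[0].first == 0)
                       ? 0u : refnelts)
                      + static_cast<unsigned>(i));
    }
  return sel;
}
```

For a four-element vector whose two significant elements both select lane 1 of the first input, this produces a full duplicate of lane 1 rather than a blend.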

Tested on aarch64-linux-gnu and x86_64-linux-gnu.  Richard mentioned
in the PR that he had a different fix in mind, but since I'd tested
this overnight, I thought I might as well post it anyway as a possible
belt-and-braces fix.  OK to install?

Richard


2020-01-27  Richard Sandiford  

gcc/
PR tree-optimization/92822
* tree-ssa-forwprop.c (simplify_vector_constructor): When filling
out the don't-care elements of a vector whose significant elements
are duplicates, make the don't-care elements duplicates too.
---
 gcc/tree-ssa-forwprop.c | 22 --
 1 file changed, 16 insertions(+), 6 deletions(-)

diff --git a/gcc/tree-ssa-forwprop.c b/gcc/tree-ssa-forwprop.c
index d63e87c8a5b..5203891950a 100644
--- a/gcc/tree-ssa-forwprop.c
+++ b/gcc/tree-ssa-forwprop.c
@@ -2455,16 +2455,26 @@ simplify_vector_constructor (gimple_stmt_iterator *gsi)
 it and its source indexes to make the permutation supported.
 For now it mimics a blend.  */
   vec_perm_builder sel (refnelts, refnelts, 1);
+  bool all_same_p = true;
   for (i = 0; i < elts.length (); ++i)
-   sel.quick_push (elts[i].second + elts[i].first * refnelts);
+   {
+ sel.quick_push (elts[i].second + elts[i].first * refnelts);
+ all_same_p &= known_eq (sel[i], sel[0]);
+   }
   /* And fill the tail with "something".  It's really don't care,
  and ideally we'd allow VEC_PERM to have a smaller destination
-vector.  As heuristic try to preserve a uniform orig[0] which
-facilitates later pattern-matching VEC_PERM_EXPR to a
-BIT_INSERT_EXPR.  */
+vector.  As a heuristic:
+
+(a) if what we have so far duplicates a single element, make the
+tail do the same
+
+(b) otherwise preserve a uniform orig[0].  This facilitates
+later pattern-matching of VEC_PERM_EXPR to a BIT_INSERT_EXPR.  */
   for (; i < refnelts; ++i)
-   sel.quick_push ((elts[0].second == 0 && elts[0].first == 0
-? 0 : refnelts) + i);
+   sel.quick_push (all_same_p
+   ? sel[0]
+   : (elts[0].second == 0 && elts[0].first == 0
+  ? 0 : refnelts) + i);
   vec_perm_indices indices (sel, orig[1] ? 2 : 1, refnelts);
   if (!can_vec_perm_const_p (TYPE_MODE (perm_type), indices))
return false;

[COMMITTED] aarch64: Add vector/vector vec_extract patterns [PR92822]

2020-01-27 Thread Richard Sandiford

Part of the problem in this PR is that we don't provide patterns
to extract a 64-bit vector from one half of a 128-bit vector.
Adding them fixes:

FAIL: gcc.target/aarch64/fmul_intrinsic_1.c scan-assembler-times fmul\\td[0-9]+, d[0-9]+, d[0-9]+ 1
FAIL: gcc.target/aarch64/fmul_intrinsic_1.c scan-assembler-times fmul\\tv[0-9]+.2d, v[0-9]+.2d, v[0-9]+.d\\[[0-9]+\\] 3

The 2s failures need target-independent changes, after which they rely
on these patterns too.

Tested on aarch64-linux-gnu & pushed.

Richard


2020-01-27  Richard Sandiford  

gcc/
PR target/92822
* config/aarch64/aarch64-simd.md (aarch64_get_half): New
expander.
(@aarch64_split_simd_mov): Use it.
(aarch64_simd_mov_from_low): Add a GPR alternative.
Leave the vec_extract patterns to handle 2-element vectors.
(aarch64_simd_mov_from_high): Likewise.
(vec_extract): New expander.
(vec_extractv2dfv1df): Likewise.
---
 gcc/config/aarch64/aarch64-simd.md | 87 ++
 1 file changed, 65 insertions(+), 22 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 97f46f96968..5a58051cf7e 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -282,37 +282,51 @@ (define_expand "@aarch64_split_simd_mov"
 rtx dst_high_part = gen_highpart (mode, dst);
rtx lo = aarch64_simd_vect_par_cnst_half (mode, , false);
rtx hi = aarch64_simd_vect_par_cnst_half (mode, , true);
-
-emit_insn
-  (gen_aarch64_simd_mov_from_low (dst_low_part, src, lo));
-emit_insn
-  (gen_aarch64_simd_mov_from_high (dst_high_part, src, hi));
+emit_insn (gen_aarch64_get_half (dst_low_part, src, lo));
+emit_insn (gen_aarch64_get_half (dst_high_part, src, hi));
   }
 DONE;
   }
 )
 
-(define_insn "aarch64_simd_mov_from_low"
-  [(set (match_operand: 0 "register_operand" "=r")
+(define_expand "aarch64_get_half"
+  [(set (match_operand: 0 "register_operand")
 (vec_select:
-  (match_operand:VQMOV 1 "register_operand" "w")
-  (match_operand:VQMOV 2 "vect_par_cnst_lo_half" "")))]
-  "TARGET_SIMD && reload_completed"
-  "umov\t%0, %1.d[0]"
-  [(set_attr "type" "neon_to_gp")
-   (set_attr "length" "4")
-  ])
+  (match_operand:VQMOV 1 "register_operand")
+  (match_operand 2 "ascending_int_parallel")))]
+  "TARGET_SIMD"
+)
+
+(define_insn_and_split "aarch64_simd_mov_from_low"
+  [(set (match_operand: 0 "register_operand" "=w,?r")
+(vec_select:
+  (match_operand:VQMOV_NO2E 1 "register_operand" "w,w")
+  (match_operand:VQMOV_NO2E 2 "vect_par_cnst_lo_half" "")))]
+  "TARGET_SIMD"
+  "@
+   #
+   umov\t%0, %1.d[0]"
+  "&& reload_completed && aarch64_simd_register (operands[0], mode)"
+  [(set (match_dup 0) (match_dup 1))]
+  {
+operands[1] = aarch64_replace_reg_mode (operands[1], mode);
+  }
+  [(set_attr "type" "mov_reg,neon_to_gp")
+   (set_attr "length" "4")]
+)
 
 (define_insn "aarch64_simd_mov_from_high"
-  [(set (match_operand: 0 "register_operand" "=r")
+  [(set (match_operand: 0 "register_operand" "=w,?r")
 (vec_select:
-  (match_operand:VQMOV 1 "register_operand" "w")
-  (match_operand:VQMOV 2 "vect_par_cnst_hi_half" "")))]
-  "TARGET_SIMD && reload_completed"
-  "umov\t%0, %1.d[1]"
-  [(set_attr "type" "neon_to_gp")
-   (set_attr "length" "4")
-  ])
+  (match_operand:VQMOV_NO2E 1 "register_operand" "w,w")
+  (match_operand:VQMOV_NO2E 2 "vect_par_cnst_hi_half" "")))]
+  "TARGET_SIMD"
+  "@
+   dup\\t%d0, %1.d[1]
+   umov\t%0, %1.d[1]"
+  [(set_attr "type" "neon_dup,neon_to_gp")
+   (set_attr "length" "4")]
+)
 
 (define_insn "orn3"
  [(set (match_operand:VDQ_I 0 "register_operand" "=w")
@@ -6140,6 +6154,35 @@ (define_expand "vec_extract"
 DONE;
 })
 
+;; Extract a 64-bit vector from one half of a 128-bit vector.
+(define_expand "vec_extract"
+  [(match_operand: 0 "register_operand")
+   (match_operand:VQMOV_NO2E 1 "register_operand")
+   (match_operand 2 "immediate_operand")]
+  "TARGET_SIMD"
+{
+  int start = INTVAL (operands[2]);
+  if (start != 0 && start !=  / 2)
+FAIL;
+  rtx sel = aarch64_gen_stepped_int_parallel ( / 2, start, 1);
+  emit_insn (gen_aarch64_get_half (operands[0], operands[1], sel));
+  DONE;
+})
+
+;; Extract a single-element 64-bit vector from one half of a 128-bit vector.
+(define_expand "vec_extractv2dfv1df"
+  [(match_operand:V1DF 0 "register_operand")
+   (match_operand:V2DF 1 "register_operand")
+   (match_operand 2 "immediate_operand")]
+  "TARGET_SIMD"
+{
+  /* V1DF is rarely used by other patterns, so it should be better to hide
+ it in a subreg destination of a normal DF op.  */
+  rtx scalar0 = gen_lowpart (DFmode, operands[0]);
+  emit_insn (gen_vec_extractv2dfdf (scalar0, operands[1], operands[2]));
+  DONE;
+})
+
 ;; aes
 
 (define_insn "aarch64_crypto_aesv16qi"

Re: [PATCH] analyzer: fix setjmp-detection and support sigsetjmp

2020-01-27 Thread David Malcolm

On Thu, 2020-01-23 at 17:16 -0500, David Malcolm wrote:
> On Wed, 2020-01-22 at 20:40 +0100, Jakub Jelinek wrote:
> > On Wed, Jan 22, 2020 at 02:35:13PM -0500, David Malcolm wrote:
> > > PR analyzer/93316 reports various testsuite failures where I
> > > accidentally relied on properties of x86_64-pc-linux-gnu.
> > > 
> > > The following patch fixes them on sparc-sun-solaris2.11 (gcc211 in
> > > the GCC compile farm), and, I hope, the other configurations showing
> > > failures.
> > > 
> > > There may still be other failures for pattern-test-2.c, which I'm
> > > tracking separately as PR analyzer/93291.
> > > 
> > > Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu;
> > > tested on stage 1 on sparc-sun-solaris2.11.
> > > 
> > > gcc/analyzer/ChangeLog:
> > >   PR analyzer/93316
> > >   * analyzer.cc (is_setjmp_call_p): Check for "setjmp" as well as
> > >   "_setjmp".
> > 
> > Please see calls.c (special_function_p), you should treat certainly
> > also sigsetjmp as a setjmp call, and similarly to special_function_p,
> > skip over _ or __ prefixes before the setjmp or sigsetjmp name.
> > Similarly for longjmp/siglongjmp.
> > 
> > Jakub
> 
> Thanks.
> 
> This patch removes the hack in is_setjmp_call_p of looking for
> "setjmp" and "_setjmp", replacing it with some logic adapted from
> special_function_p in calls.c, ignoring up to 2 leading underscores
> from the fndecl's name when checking for a function by name.
> 
> It also requires that such functions are "extern" and at file scope
> for them to be matched.
> 
> The patch also generalizes the setjmp/longjmp handling in the analyzer
> to also work with sigsetjmp/siglongjmp.  Doing so requires generalizing
> some hardcoded functions in diagnostics (which were hardcoded to avoid
> user-facing messages referring to "_setjmp", which is an implementation
> detail) - the patch adds a new function, get_user_facing_name for this,
> for use on calls that matched is_named_call_p and
> is_special_named_call_p.
> 
> Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
> 
> OK for master?

I've gone ahead and committed this, based on Jeff's blanket approval
here:
  https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01689.html

I'm working on a followup which would refactor it to share more code
with calls.c (which would need review, as it would touch calls.c).
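
As a rough illustration of the name matching described in the quoted patch description (skip at most two leading underscores, then compare against the user-facing name), here is a minimal standalone sketch.  It is not the analyzer's code: matches_ignoring_underscores and is_setjmp_like are hypothetical names, and the "extern at file scope" check on the fndecl is left out because it needs GCC's tree API.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if NAME equals TARGET after dropping at most two
   leading underscores from NAME.  */
bool
matches_ignoring_underscores (const char *name, const char *target)
{
  if (name[0] == '_')
    name += (name[1] == '_') ? 2 : 1;
  return strcmp (name, target) == 0;
}

/* Hypothetical helper: treat setjmp and sigsetjmp, with or without
   an underscore prefix, as setjmp-like calls.  */
bool
is_setjmp_like (const char *name)
{
  return matches_ignoring_underscores (name, "setjmp")
         || matches_ignoring_underscores (name, "sigsetjmp");
}
```

Under this sketch "setjmp", "_setjmp" and "__sigsetjmp" all match, while a name with three leading underscores does not.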

Dave

[COMMITTED] aarch64: Fix failure in cmpimm_branch_1.c

2020-01-27 Thread Richard Sandiford

gcc.target/aarch64/cmpimm_branch_1.c started failing after Bernd's
fix to make combine take the costs of jumps into account
(g:391500af1932e696a007).  This is because the rtx costs
of *compare_condjump were higher than the costs
of the instructions it combines.
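
For readers who want to see the arithmetic, the cost that the fix assigns to the new SUB + SUBS case can be sketched outside GCC as below.  This is only a model: sub_subs_cost and alu_arith_extra are hypothetical names (the latter stands in for extra_cost->alu.arith), and COSTS_N_INSNS is redefined locally with what I believe is the same scaling rtl.h uses.

```c
#include <stdbool.h>

/* Local stand-in for GCC's macro: one simple instruction costs 4 units.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Add the cost of a SUB followed by a SUBS to BASE_COST.  When
   optimizing for speed, also add the per-CPU extra cost of an
   arithmetic ALU operation once per instruction.  */
int
sub_subs_cost (int base_cost, int alu_arith_extra, bool speed)
{
  int cost = base_cost + COSTS_N_INSNS (2);
  if (speed)
    cost += alu_arith_extra * 2;
  return cost;
}
```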

Tested on aarch64-linux-gnu & pushed.

Richard


2020-01-27  Richard Sandiford  

gcc/
* config/aarch64/aarch64.c (aarch64_if_then_else_costs): Match
jump conditions for *compare_condjump.
---
 gcc/config/aarch64/aarch64.c | 15 +--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 3437fff6811..11197bd033e 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11020,6 +11020,8 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, int *cost, bool speed)
   rtx inner;
   rtx comparator;
   enum rtx_code cmpcode;
+  const struct cpu_cost_table *extra_cost
+= aarch64_tune_params.insn_extra_cost;
 
   if (COMPARISON_P (op0))
 {
@@ -11054,8 +11056,17 @@ aarch64_if_then_else_costs (rtx op0, rtx op1, rtx op2, int *cost, bool speed)
/* CBZ/CBNZ.  */
*cost += rtx_cost (inner, VOIDmode, cmpcode, 0, speed);
 
-   return true;
- }
+ return true;
+   }
+ if (register_operand (inner, VOIDmode)
+ && aarch64_imm24 (comparator, VOIDmode))
+   {
+ /* SUB and SUBS.  */
+ *cost += COSTS_N_INSNS (2);
+ if (speed)
+   *cost += extra_cost->alu.arith * 2;
+ return true;
+   }
}
  else if (cmpcode == LT || cmpcode == GE)
{

[PATCH] testsuite/91171 no longer needed XFAIL

2020-01-27 Thread Richard Biener

Pushed.

2020-01-27  Richard Biener  

PR testsuite/91171
* gcc.dg/graphite/scop-21.c: un-XFAIL.
---
 gcc/testsuite/ChangeLog | 5 +
 gcc/testsuite/gcc.dg/graphite/scop-21.c | 3 +--
 2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 852d0ed3649..bd76fa78e91 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2020-01-27  Richard Biener  
+
+   PR testsuite/91171
+   * gcc.dg/graphite/scop-21.c: un-XFAIL.
+
 2020-01-27  Claudiu Zissulescu  
 
* gcc.target/arc/interrupt-6.c: Update test.
diff --git a/gcc/testsuite/gcc.dg/graphite/scop-21.c b/gcc/testsuite/gcc.dg/graphite/scop-21.c
index 304e0792b2b..be5d8ced991 100644
--- a/gcc/testsuite/gcc.dg/graphite/scop-21.c
+++ b/gcc/testsuite/gcc.dg/graphite/scop-21.c
@@ -30,5 +30,4 @@ int test ()
 
   return a[20];
 }
-/* XFAILed by the fix for PR86865.  */
-/* { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" { xfail *-*-* } } } */
+/* { dg-final { scan-tree-dump-times "number of SCoPs: 1" 1 "graphite" } } */
-- 
2.23.0

Re: [PATCH] Add __gcov_indirect_call_profiler_v4_atomic.

2020-01-27 Thread Jan Hubicka

> Hi.
> 
> The patch is about missing atomic profiler function for indirect calls.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2020-01-27  Martin Liska  
> 
>   PR gcov-profile/93403
>   * tree-profile.c (gimple_init_gcov_profiler): Generate
>   both __gcov_indirect_call_profiler_v4 and
>   __gcov_indirect_call_profiler_v4_atomic.
> 
> libgcc/ChangeLog:
> 
> 2020-01-27  Martin Liska  
> 
>   PR gcov-profile/93403
>   * libgcov-profiler.c (__gcov_indirect_call_profiler_v4):
>   Call __gcov_indirect_call_profiler_body.
>   (__gcov_indirect_call_profiler_body): New.
>   (__gcov_indirect_call_profiler_v4_atomic): New.
>   * libgcov.h (__gcov_indirect_call_profiler_v4_atomic):
>   New declaration.
OK, thanks!

Honza

Re: [PATCH] doc: clarify the situation with pointer arithmetic

2020-01-27 Thread Richard Biener

On Fri, Jan 24, 2020 at 12:46 AM Uecker, Martin
 wrote:
>
> Am Donnerstag, den 23.01.2020, 14:18 +0100 schrieb Richard Biener:
> > On Wed, Jan 22, 2020 at 12:40 PM Martin Sebor  wrote:
> > >
> > > On 1/22/20 8:32 AM, Richard Biener wrote:
> > > > On Tue, 21 Jan 2020, Alexander Monakov wrote:
> > > >
> > > > > On Tue, 21 Jan 2020, Richard Biener wrote:
> > > > >
> > > > > > Fourth.  That PNVI (I assume it's the whole pointer-provenance 
> > > > > > stuff)
> > > > > > wants to get the "best" of both which can never be done since a 
> > > > > > compiler
> > > > > > needs to have a way to be conservative - in this area it's 
> > > > > > conflicting
> > > > > > conservative treatment which is impossible.
> > > > >
> > > > > This paragraph is unclear, I don't immediately see what the 
> > > > > conflicting goals
> > > > > are. The rest is clear enough given the previous discussions I saw.
> > > > >
> > > > > Did you mean the restriction that you cannot do arithmetic involving 
> > > > > two
> > > > > integers based on pointers, get a value corresponding to one of them,
> > > > > cast it back and get a pointer suitable for accessing either of two
> > > > > originally pointed-to objects? I don't see that as a conflict because
> > > > > it places a restriction on users, not the compiler.
> > > >
> > > > As far as I remember the discussions PNVI requires to track
> > > > provenance for correctness, you may not miss or attach wrong provenance
> > > > to a pointer and there's only "single" provenance, not "many"
> > > > (aka, may point to A and B).  I don't see how you can ever implement 
> > > > that.
>
> I have no idea how you came to that conclusion. PNVI is perfectly
> compatible with a naive compiler who does not track provenance at
> all as well as an abstract machine that actually carries run-time
> provenance around with each pointer and checks every operation.
> It was designed specifically to allow both cases and everything
> in between (especially compilers who track provenance during
> compile time but the programs then do not track provenance at
> run-time).
>
> You may be confused by the abstract formulation that indeed
> assigns a single provenance to each pointer. A compiler would
> track its *knowledge about provenance*, which would be a set
> of possible targets.

Well, the question is whether PNVI allows the compiler to put any
additional restriction on what the provenance of an integer is.  It
appears not, so any attempt to track provenance through integers
is doomed unless the cases are very simple.  I'm not sure that's desirable (*).
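
For readers following along, the pointer -> integer -> pointer round trip this subthread keeps coming back to looks like the sketch below.  round_trip is a hypothetical name; the only guarantee relied on is the one for uintptr_t in C's <stdint.h> (an optional type, though mainstream hosted implementations provide it): converting a valid void pointer to uintptr_t and back yields a pointer that compares equal to the original.  What provenance the result carries once the integer has been modified in between is exactly what is being debated, so the sketch deliberately leaves the integer untouched.

```c
#include <stdint.h>

/* Convert P to an integer and back without changing the value.  */
void *
round_trip (void *p)
{
  uintptr_t bits = (uintptr_t) p;  /* pointer -> integer */
  return (void *) bits;            /* integer -> pointer */
}
```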

> > > The PNVI variant preferred by the object model group is referred
> > > to as "PNVI-ae-udi" which stands for "PNVI exposed-address user-
> > > disambiguation."  (The PNVI part stands for "Provenance Not Via
> > > Integers.")  This base PNVI model basically prohibits provenance
> > > tracking via integers, making it possible for programs to derive
> > > pointers to unrelated objects via casts between pointers and
> > > integers (and modifying the integer in between the casts).  This
> > > is considered a new restriction on implementations because
> > > the standard doesn't permit it (as you said upthread, all it
> > > specifies is that a pointer is equal to one obtained by casting
> > > the original to an intptr_t and back).
>
> It is not entirely clear what the standard means here (7.20.1.4).
>
> In my opinion, converting the same integer back should yield
> a valid pointer where "same" is defined in the usual sense
> (i.e. via mathematical identity and not via provenance).
>
> > > The -ae-udi variant limits this restriction on implementations
> > > to escaped pointers and provides a means for users/programs to
> > > disambiguate between pointers to adjacent objects (i.e., a past
> > > the end pointer and one to the beginning of the object stored
> > > there).  (The latest proposal is in N2362, with an overview in
> > > N2378.)  At the last WG14 meeting there was broad discomfort
> > > with adopting the proposal for C2X because of the absence of
> > > implementation experience and concerns raised by implementers.
> > > The guidance to the study group was to target a separate technical
> > > specification for the proposal and allow time for implementation
> > > experience.  If the feedback from implementers is positive
> > > (whatever that might mean) WG14 said it would consider adopting
> > > the model for a revision of C after C2X.
> > >
> > > Overall, the impact of the proposals as well as their goal is to
> > > constrain implementations to the (presumed) benefit of programs
> > > in terms of expressiveness.  There are numerous examples of code
> > > that's currently treated as undefined by one or more compilers
> > > (as a consequence of optimizations) that the model makes valid.
> > > I'm not aware of any optimization opportunities the proposal
> > > might open up in GCC.
> >
> > Well, PNVI limits optimization opportunities of GCC which currently
> > _does_

[PR c++/91826] bogus error with alias namespace

2020-01-27 Thread Nathan Sidwell


I've committed this to fix 91826

My changes to is_nested_namespace broke is_ancestor's use where a 
namespace alias might be passed in.  This changes is_ancestor to look 
through the alias.
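
A toy model of the idea, for illustration only: the sketch below uses a hypothetical struct ns and ns_is_ancestor rather than GCC's trees, and it folds is_nested_namespace into a plain walk up the parent chain.  The point it shows is just that the alias is resolved to its target before the walk starts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a NAMESPACE_DECL.  */
struct ns
{
  const struct ns *parent;    /* enclosing namespace, NULL for the global one */
  const struct ns *alias_of;  /* non-NULL if this entry is a namespace alias */
};

/* Return true if ROOT (never an alias) encloses or equals CHILD,
   where CHILD may be a namespace alias.  */
bool
ns_is_ancestor (const struct ns *root, const struct ns *child)
{
  /* Look through the alias first.  */
  if (child->alias_of)
    child = child->alias_of;

  for (; child; child = child->parent)
    if (child == root)
      return true;
  return false;
}
```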


nathan

--
Nathan Sidwell
 gcc/cp/ChangeLog  |  5 +
 gcc/cp/name-lookup.c  | 32 
 gcc/testsuite/g++.dg/lookup/pr91826.C | 16 
 3 files changed, 41 insertions(+), 12 deletions(-)

2020-01-27  Nathan Sidwell  

	PR c++/91826
	* name-lookup.c (is_ancestor): Allow CHILD to be a namespace alias.

diff --git c/gcc/cp/ChangeLog w/gcc/cp/ChangeLog
index c01becefe87..32cc230fb71 100644
diff --git c/gcc/cp/name-lookup.c w/gcc/cp/name-lookup.c
index cd7a5816e46..129cfad9ad6 100644
--- c/gcc/cp/name-lookup.c
+++ w/gcc/cp/name-lookup.c
@@ -4012,38 +4012,46 @@ is_nested_namespace (tree ancestor, tree descendant, bool inline_only)
   return ancestor == descendant;
 }
 
-/* Returns true if ROOT (a namespace, class, or function) encloses
-   CHILD.  CHILD may be either a class type or a namespace.  */
+/* Returns true if ROOT (a non-alias namespace, class, or function)
+   encloses CHILD.  CHILD may be either a class type or a namespace
+   (maybe alias).  */
 
 bool
 is_ancestor (tree root, tree child)
 {
-  gcc_assert ((TREE_CODE (root) == NAMESPACE_DECL
-	   || TREE_CODE (root) == FUNCTION_DECL
-	   || CLASS_TYPE_P (root)));
-  gcc_assert ((TREE_CODE (child) == NAMESPACE_DECL
-	   || CLASS_TYPE_P (child)));
+  gcc_checking_assert ((TREE_CODE (root) == NAMESPACE_DECL
+			&& !DECL_NAMESPACE_ALIAS (root))
+		   || TREE_CODE (root) == FUNCTION_DECL
+		   || CLASS_TYPE_P (root));
+  gcc_checking_assert (TREE_CODE (child) == NAMESPACE_DECL
+		   || CLASS_TYPE_P (child));
 
-  /* The global namespace encloses everything.  */
+  /* The global namespace encloses everything.  Early-out for the
+ common case.  */
   if (root == global_namespace)
 return true;
 
-  /* Search until we reach namespace scope.  */
+  /* Search CHILD until we reach namespace scope.  */
   while (TREE_CODE (child) != NAMESPACE_DECL)
 {
   /* If we've reached the ROOT, it encloses CHILD.  */
   if (root == child)
 	return true;
+
   /* Go out one level.  */
   if (TYPE_P (child))
 	child = TYPE_NAME (child);
   child = CP_DECL_CONTEXT (child);
 }
 
-  if (TREE_CODE (root) == NAMESPACE_DECL)
-return is_nested_namespace (root, child);
+  if (TREE_CODE (root) != NAMESPACE_DECL)
+/* Failed to meet the non-namespace we were looking for.  */
+return false;
+
+  if (tree alias = DECL_NAMESPACE_ALIAS (child))
+child = alias;
 
-  return false;
+  return is_nested_namespace (root, child);
 }
 
 /* Enter the class or namespace scope indicated by T suitable for name
diff --git c/gcc/testsuite/g++.dg/lookup/pr91826.C w/gcc/testsuite/g++.dg/lookup/pr91826.C
new file mode 100644
index 000..2b313ece8a7
--- /dev/null
+++ w/gcc/testsuite/g++.dg/lookup/pr91826.C
@@ -0,0 +1,16 @@
+// PR 91826 bogus error with aliased namespace
+
+namespace N1 { class C1; }
+namespace A1 = N1;
+class A1::C1 {}; //Ok
+
+namespace N2
+{
+  namespace N { class C2; }
+  namespace A2 = N;
+  class A2::C2 {}; // { dg_bogus "does not enclose" }
+}
+
+namespace N3 { namespace N { class C3; } }
+namespace A3 = N3::N;
+class A3::C3 {}; //Ok

Fortran 'acc_get_property' return type (was: [PATCH] Add OpenACC 2.6 `acc_get_property' support)

2020-01-27 Thread Thomas Schwinge

Hi!

On 2019-12-20T17:46:57+0100, "Harwath, Frederik" wrote:
>> > --- a/libgomp/libgomp-plugin.h
>> > +++ b/libgomp/libgomp-plugin.h
>> > @@ -54,6 +54,13 @@ enum offload_target_type
>> >OFFLOAD_TARGET_TYPE_GCN = 8
>> >  };
>> >  
>> > +/* Container type for passing device properties.  */
>> > +union gomp_device_property_value
>> > +{
>> > +  void *ptr;
>> > +  uintmax_t val;
>> > +};
>>
>> Why wouldn't that be 'size_t', 'const char *', as the actual data types
>> used?  (Maybe I'm missing something.)
>
> I do not see a reason for this either. Changed.

For reference: C/C++ has 'size_t' ('acc_get_property'), or 'const char*'
('acc_get_property_string') return types.

>> > --- a/libgomp/openacc.f90
>> > +++ b/libgomp/openacc.f90
>> > @@ -28,7 +28,7 @@
>> >  !  .
>> >  
>> >  module openacc_kinds
>> > -  use iso_fortran_env, only: int32
>> > +  use iso_fortran_env, only: int32, int64
>> >implicit none
>> >  
>> >private :: int32
>> > @@ -47,6 +47,21 @@ module openacc_kinds
>> >integer (acc_device_kind), parameter :: acc_device_not_host = 4
>> >integer (acc_device_kind), parameter :: acc_device_nvidia = 5
>> >integer (acc_device_kind), parameter :: acc_device_gcn = 8
>> > +  integer (acc_device_kind), parameter :: acc_device_current = -3
>> > +
>> > +  public :: acc_device_property
>> > +
>> > +  integer, parameter :: acc_device_property = int64
>>
>> Why 'int64'?  I changed this to 'int32', but please tell if there's a
>> reason for 'int64'.
>
> int32 is too narrow as - conforming to the OpenACC spec - acc_device_property
> is also used for the return type of acc_get_property (a bit strange, isn't it?).
> int64 also did not seem quite right. I have changed the type of acc_device_property
> to c_size_t to match the type that is used internally and as the return type of the
> corresponding C function.

I filed  "Fortran
'acc_get_property' return type":

| During review/implementation of `acc_get_property` in GCC, @frederik-h
| found that for Fortran `function acc_get_property`, the return type is
| specified as `integer(acc_device_property) :: acc_get_property`,
| whereas in C/C++, it is `size_t`.  For avoidance of doubt: it's correct
| to map the C/C++ `acc_device_property_t property` formal parameter to
| Fortran `integer(acc_device_property), value :: property`, but it's not
| clear why `integer(acc_device_property)` is also used as the function's
| return type -- the return type/values don't actually (conceptually)
| relate to the `integer(acc_device_property)` data type.
| 
| Should we use `c_size_t` for Fortran `acc_get_property` return type to
| explicitly match C/C++, or use plain `integer` (as used in all other
| interfaces taking `size_t` in C/C++ -- currently only as input formal
| parameters though)?


Grüße
 Thomas


> --- a/libgomp/libgomp.texi
> +++ b/libgomp/libgomp.texi

> +@node acc_get_property
> +@section @code{acc_get_property} -- Get device property.
> +@cindex acc_get_property
> +@cindex acc_get_property_string
> +@table @asis
> +@item @emph{Description}
> +These routines return the value of the specified @var{property} for the
> +device being queried according to @var{devicenum} and @var{devicetype}.
> +Integer-valued and string-valued properties are returned by
> +@code{acc_get_property} and @code{acc_get_property_string} respectively.
> +The Fortran @code{acc_get_property_string} subroutine returns the string
> +retrieved in its fourth argument while the remaining entry points are
> +functions, which pass the return value as their result.
> +
> +@item @emph{C/C++}:
> +@multitable @columnfractions .20 .80
> +@item @emph{Prototype}: @tab @code{size_t acc_get_property(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
> +@item @emph{Prototype}: @tab @code{const char *acc_get_property_string(int devicenum, acc_device_t devicetype, acc_device_property_t property);}
> +@end multitable
> +
> +@item @emph{Fortran}:
> +@multitable @columnfractions .20 .80
> +@item @emph{Interface}: @tab @code{function acc_get_property(devicenum, devicetype, property)}
> +@item @emph{Interface}: @tab @code{subroutine acc_get_property_string(devicenum, devicetype, property, string)}
> +@item   @tab @code{integer devicenum}
> +@item   @tab @code{integer(kind=acc_device_kind) devicetype}
> +@item   @tab @code{integer(kind=acc_device_property) property}
> +@item   @tab @code{integer(kind=acc_device_property) acc_get_property}
> +@item   @tab @code{character(*) string}
> +@end multitable
> +
> +@item @emph{Reference}:
> +@uref{https://www.openacc.org, OpenACC specification v2.6}, section
> +3.2.6.
> +@end table


> --- a/libgomp/openacc.f90
> +++ b/libgomp/openacc.f90
> @@ -31,16 +31,18 @@
>  
>  module openacc_kinds
>use iso_fortran_env, only: int32
> +

[PATCH] i386: Don't use ix86_tune_ctrl_string in parse_mtune_ctrl_str

2020-01-27 Thread H.J. Lu

There is

static void
parse_mtune_ctrl_str (bool dump)
{
  if (!ix86_tune_ctrl_string)
return;

parse_mtune_ctrl_str is only called from set_ix86_tune_features, which
is only called from ix86_function_specific_restore and
ix86_option_override_internal.  parse_mtune_ctrl_str shouldn't use
ix86_tune_ctrl_string which is defined with global_options.  Instead,
opts should be passed to parse_mtune_ctrl_str.
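
To make the shape of the change concrete, here is a minimal sketch of a parser that reads the control string from an options structure handed in by the caller instead of from a global.  struct options, tune_ctrl_string and count_tune_features are hypothetical names; the real code works on struct gcc_options and x_ix86_tune_ctrl_string and does considerably more than count entries.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct gcc_options.  */
struct options
{
  const char *tune_ctrl_string;  /* NULL when the option was not given */
};

/* Count the comma-separated entries in OPTS->tune_ctrl_string.  The
   string comes from the OPTS argument, so callers that restore
   per-function options see their own value, not the global one.  */
size_t
count_tune_features (const struct options *opts)
{
  if (!opts->tune_ctrl_string || !*opts->tune_ctrl_string)
    return 0;

  size_t n = 1;
  for (const char *p = opts->tune_ctrl_string; *p; ++p)
    if (*p == ',')
      ++n;
  return n;
}
```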

PR target/91399
* config/i386/i386-options.c (set_ix86_tune_features): Add an
argument of a pointer to struct gcc_options and pass it to
parse_mtune_ctrl_str.
(ix86_function_specific_restore): Pass opts to
set_ix86_tune_features.
(ix86_option_override_internal): Likewise.
(parse_mtune_ctrl_str): Add an argument of a pointer to struct
gcc_options and use it for x_ix86_tune_ctrl_string.
---
 gcc/config/i386/i386-options.c | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/gcc/config/i386/i386-options.c b/gcc/config/i386/i386-options.c
index 2acc9fb0cfe..e0be4932534 100644
--- a/gcc/config/i386/i386-options.c
+++ b/gcc/config/i386/i386-options.c
@@ -741,7 +741,8 @@ ix86_option_override_internal (bool main_args_p,
   struct gcc_options *opts,
   struct gcc_options *opts_set);
 static void
-set_ix86_tune_features (enum processor_type ix86_tune, bool dump);
+set_ix86_tune_features (struct gcc_options *opts,
+   enum processor_type ix86_tune, bool dump);
 
 /* Restore the current options */
 
@@ -810,7 +811,7 @@ ix86_function_specific_restore (struct gcc_options *opts,
 
   /* Recreate the tune optimization tests */
   if (old_tune != ix86_tune)
-set_ix86_tune_features (ix86_tune, false);
+set_ix86_tune_features (opts, ix86_tune, false);
 }
 
 /* Adjust target options after streaming them in.  This is mainly about
@@ -1538,13 +1539,13 @@ ix86_parse_stringop_strategy_string (char *strategy_str, bool is_memset)
print the features that are explicitly set.  */
 
 static void
-parse_mtune_ctrl_str (bool dump)
+parse_mtune_ctrl_str (struct gcc_options *opts, bool dump)
 {
-  if (!ix86_tune_ctrl_string)
+  if (!opts->x_ix86_tune_ctrl_string)
 return;
 
   char *next_feature_string = NULL;
-  char *curr_feature_string = xstrdup (ix86_tune_ctrl_string);
+  char *curr_feature_string = xstrdup (opts->x_ix86_tune_ctrl_string);
   char *orig = curr_feature_string;
   int i;
   do
@@ -1583,7 +1584,8 @@ parse_mtune_ctrl_str (bool dump)
processor type.  */
 
 static void
-set_ix86_tune_features (enum processor_type ix86_tune, bool dump)
+set_ix86_tune_features (struct gcc_options *opts,
+   enum processor_type ix86_tune, bool dump)
 {
   unsigned HOST_WIDE_INT ix86_tune_mask = HOST_WIDE_INT_1U << ix86_tune;
   int i;
@@ -1605,7 +1607,7 @@ set_ix86_tune_features (enum processor_type ix86_tune, bool dump)
  ix86_tune_features[i] ? "on" : "off");
 }
 
-  parse_mtune_ctrl_str (dump);
+  parse_mtune_ctrl_str (opts, dump);
 }
 
 
@@ -2364,7 +2366,7 @@ ix86_option_override_internal (bool main_args_p,
   XDELETEVEC (s);
 }
 
-  set_ix86_tune_features (ix86_tune, opts->x_ix86_dump_tunes);
+  set_ix86_tune_features (opts, ix86_tune, opts->x_ix86_dump_tunes);
 
   ix86_recompute_optlev_based_flags (opts, opts_set);
 
-- 
2.24.1

Re: Deprecating cc0 (and consequently cc0 targets)

2020-01-27 Thread Hans-Peter Nilsson

> From: Jeff Law 
> Date: Fri, 20 Sep 2019 17:38:38 +0200

Hi.  I'm not going to question

> The first step in that process is to drop support for cc0.

but could you please elaborate on...

> [cc0 support in gcc core]
> code is broken in various ways,

> particularly WRT exceptions.

...that last part?

If you mean asynchronous exceptions then perhaps in theory,
except there's no need to (and no state to) "unwind" to
in-between cc0 setter and user.  But I guess that goes for
"MODE_CC" targets too; exception information isn't that precise.

> This patch deprecates the affected targets.

(Not applied yet?  Before the gcc-10 branch?  Can you please
consider dropping cris* from that part when rebasing it, as per
contents on master and my pledge to merge axis/cris-decc0?)

> diff --git a/gcc/config.gcc b/gcc/config.gcc
> index 69d0a024d85..0c1637e8be1 100644
> --- a/gcc/config.gcc
> +++ b/gcc/config.gcc
> @@ -248,6 +248,12 @@ md_file=
>  # Obsolete configurations.
>  case ${target} in
>tile*-*-*  \
> +  avr*-*-*   \
> +  h8300*-*-* \
> +  cris*-*-*  \
> +  m68k*-*-*  \
> +  vax*-*-*   \
> +  cr16*-*-*  \
>   )
>  if test "x$enable_obsolete" != xyes; then
>echo "*** Configuration ${target} is obsolete." >&2
> @@ -273,7 +279,6 @@ case ${target} in
>   | arm*-*-uclinux*   \
>   | i[34567]86-go32-* \
>   | i[34567]86-*-go32*\
> - | m68k-*-uclinuxoldabi* \
>   | mips64orion*-*-rtems* \
>   | pdp11-*-bsd   \
>   | powerpc*-*-linux*paired*  \
> @@ -294,7 +299,6 @@ case ${target} in
>   | *-*-solaris2.[0-9].*  \
>   | *-*-solaris2.10*  \
>   | *-*-sysv* \
> - | vax-*-vms*\
>   )
> echo "*** Configuration ${target} not supported" 1>&2
> exit 1

Beware, the two last hunks shouldn't be applied, else the patch
will actually make m68k-*-uclinuxoldabi* and vax-*-vms* available
(by --enable-obsolete).

That part would go in when actually removing the targets.

I may have lost track of the conversation that followed; maybe
the patch was itself obsoleted.

brgds, H-P

[Patch][Fortran] Fix to strict associate check (PR93427) (was: Rejects valid with pointer result from recursive function)

2020-01-27 Thread Tobias Burnus

Semantically, there is an issue when the function name is used both for 
recursive calls and as the result variable. Hence, one should only use 
one's own function name – in the context of function calls – if one has a 
separate result variable.


This somehow got messed up with r10-5722-g4d12437 (3 Jan 2020, PR92994),
which also rejects the use of the function name as the result variable.


Fixed by removing the check. At least the most straight-forward invalid 
use is still rejected as shown by the augmented test case.


OK for the trunk?

Tobias


On 1/25/20 6:37 AM, Andrew Benson wrote:

I opened PR 93427 for the issue below:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93427

The following code fails to compile (using git commit
472dc648ce3e7661762931d584d239611ddca964):

module a

type :: t
end type t

contains

recursive function b()
   class(t), pointer :: b
   type(t) :: c
   allocate(t :: b)
   select type (b)
   type is (t)
  b=c
   end select
end function b

end module a



$ gfortran -v
Using built-in specs.
COLLECT_GCC=gfortran
COLLECT_LTO_WRAPPER=/home/abenson/Galacticus/Tools/libexec/gcc/x86_64-pc-linux-gnu/10.0.1/lto-wrapper
Target: x86_64-pc-linux-gnu
Configured with: ../gcc-git/configure
--prefix=/home/abenson/Galacticus/Tools
--enable-languages=c,c++,fortran
  --disable-multilib
Thread model: posix
Supported LTO compression algorithms: zlib
gcc version 10.0.1 20200124 (experimental) (GCC)


$ gfortran -c p.F90 -o p.o
p.F90:12:15:

12 |   select type (b)
   |   1
Error: Associating entity 'b' at (1) is a procedure name
p.F90:14:5:

14 |  b=c
   | 1
Error: 'b' at (1) associated to vector-indexed target cannot be used
in a variable definition context (assignment)


The code compiles successfully using ifort 18.0.1. Removing the
"recursive" attribute, or specifying a "result()" variable makes the
errors go away.


--

* Andrew Benson: http://users.obs.carnegiescience.edu/abenson/contact.html

* Galacticus: https://github.com/galacticusorg/galacticus
[Fortran] Fix to strict associate check (PR93427)

	PR fortran/93427
	* resolve.c (resolve_assoc_var): Remove too strict check.
	* gfortran.dg/associate_51.f90: Update test case.

	PR fortran/93427
	* gfortran.dg/associate_52.f90: New.

diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index e840aec62f2..8f5267fde05 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -8846,8 +8846,7 @@ resolve_assoc_var (gfc_symbol* sym, bool resolve_target)
 
   if (tsym->attr.subroutine
 	  || tsym->attr.external
-	  || (tsym->attr.function
-	  && (tsym->result != tsym || tsym->attr.recursive)))
+	  || (tsym->attr.function && tsym->result != tsym))
 	{
 	  gfc_error ("Associating entity %qs at %L is a procedure name",
 		 tsym->name, &target->where);
diff --git a/gcc/testsuite/gfortran.dg/associate_51.f90 b/gcc/testsuite/gfortran.dg/associate_51.f90
index 7b3edc44990..b6ab1414b02 100644
--- a/gcc/testsuite/gfortran.dg/associate_51.f90
+++ b/gcc/testsuite/gfortran.dg/associate_51.f90
@@ -14,7 +14,14 @@ end
 recursive function f2()
   associate (y1 => f2()) ! { dg-error "Invalid association target" }
   end associate  ! { dg-error "Expecting END FUNCTION statement" }
-  associate (y2 => f2)   ! { dg-error "is a procedure name" }
+end
+
+recursive function f3()
+  associate (y1 => f3)
+print *, y1()  ! { dg-error "Expected array subscript" }
+  end associate
+  associate (y2 => f3) ! { dg-error "Associate-name 'y2' at \\(1\\) is used as array" }
+print *, y2(1)
   end associate
 end
 
diff --git a/gcc/testsuite/gfortran.dg/associate_52.f90 b/gcc/testsuite/gfortran.dg/associate_52.f90
new file mode 100644
index 000..c24ec4b8f6a
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/associate_52.f90
@@ -0,0 +1,24 @@
+! { dg-do compile }
+!
+! PR fortran/93427
+!
+! Contributed by Andrew Benson
+!
+module a
+
+type :: t
+end type t
+
+contains
+
+recursive function b()
+  class(t), pointer :: b
+  type(t) :: c
+  allocate(t :: b)
+  select type (b)
+  type is (t)
+ b=c
+  end select
+end function b
+
+end module a

Re: [PATCH] Replace one error with inform.

2020-01-27 Thread David Malcolm

On Mon, 2020-01-27 at 10:57 +0100, Martin Liška wrote:
> Hello.
> 
> The patch is about splitting pair of errors into
> error and inform which seems logical to me in this
> situation:

[...]

> diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
> index 4520c995028..f9bed1ea4fb 100644
> --- a/gcc/cp/pt.c
> +++ b/gcc/cp/pt.c
> @@ -6149,7 +6149,7 @@ redeclare_class_template (tree type, tree parms, tree 
> cons)
> != TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm)
>   {

Please add an
  auto_diagnostic_group d;
here, so that -fdiagnostics-format=json can nest the note below the
error.

OK with that change.

Thanks
Dave

> error ("template parameter %q+#D", tmpl_parm);
> -   error ("redeclared here as %q#D", parm);
> +   inform (input_location, "redeclared here as %q#D", parm);
> return false;
>   }

Re: [PATCH 3/4] [ARC] Save mlo/mhi registers when ISR.

2020-01-27 Thread Claudiu Zissulescu Ianculescu

Yes, I know :(

Thank you for your help. All four patches pushed.
Claudiu

On Wed, Jan 22, 2020 at 10:31 PM Jeff Law wrote:
>
> On Wed, 2020-01-22 at 10:14 +0200, Claudiu Zissulescu wrote:
> > ARC600 when configured with mul64 instructions uses mlo and mhi
> > registers to store the 64-bit result of the multiplication. In the ARC600
> > ISA documentation we have the following register configuration when ARC600
> > is configured only with the mul64 extension:
> >
> > Register | Name | Use
> > ---------+------+-----------------------------------
> > r57      | mlo  | Multiply low 32 bits, read only
> > r58      | mmid | Multiply middle 32 bits, read only
> > r59      | mhi  | Multiply high 32 bits, read only
> > ---------+------+-----------------------------------
> >
> > When used for Co-existence configurations, mul64 uses the following
> > registers:
> >
> > Register | Name | Use
> > ---------+------+-----------------------------------
> > r58      | mlo  | Multiply low 32 bits, read only
> > r59      | mhi  | Multiply high 32 bits, read only
> > ---------+------+-----------------------------------
> >
> > Note that the mlo/mhi assignment doesn't swap when a big-endian CPU
> > configuration is used.
> >
> > The compiler will always use r58 for mlo, regardless of the
> > configuration chosen, to ensure correct mlo/mhi splitting. Fixing mlo
> > to the right register number is done at assembly time. The dwarf info
> > is also notified via the DBX_... macro. Both mlo/mhi registers need to
> > be saved when an ISR happens, using a custom sequence.
> >
> > gcc/
> > -xx-xx  Claudiu Zissulescu  
> >
> >   * config/arc/arc-protos.h (gen_mlo): Remove.
> >   (gen_mhi): Likewise.
> >   * config/arc/arc.c (AUX_MULHI): Define.
> >   (arc_must_save_register): Special handling for r58/59.
> >   (arc_compute_frame_size): Consider mlo/mhi registers.
> >   (arc_save_callee_saves): Emit fp/sp move only when emit_move
> >   parameter is true.
> >   (arc_conditional_register_usage): Remove TARGET_BIG_ENDIAN from
> >   mlo/mhi name selection.
> >   (arc_restore_callee_saves): Don't early restore blink when ISR.
> >   (arc_expand_prologue): Add mlo/mhi saving.
> >   (arc_expand_epilogue): Add mlo/mhi restoring.
> >   (gen_mlo): Remove.
> >   (gen_mhi): Remove.
> >   * config/arc/arc.h (DBX_REGISTER_NUMBER): Correct register
> >   numbering when MUL64 option is used.
> >   (DWARF2_FRAME_REG_OUT): Define.
> >   * config/arc/arc.md (arc600_stall): New pattern.
> >   (VUNSPEC_ARC_ARC600_STALL): Define.
> >   (mulsi64): Use correct mlo/mhi registers.
> >   (mulsi_600): Clean it up.
> >   * config/arc/predicates.md (mlo_operand): Remove any dependency on
> >   TARGET_BIG_ENDIAN.
> >   (mhi_operand): Likewise.
> >
> > testsuite/
> > -xx-xx  Claudiu Zissulescu  
> >   * gcc.target/arc/code-density-flag.c: Update test.
> >   * gcc.target/arc/interrupt-6.c: Likewise.
> Ugh.  But OK.
>
> jeff
> >
>

[PATCH] Add __gcov_indirect_call_profiler_v4_atomic.

2020-01-27 Thread Martin Liška


Hi.

The patch adds the missing atomic profiler function for indirect calls.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
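
For readers skimming the patch, the shape of the change can be sketched in
isolation: one inline body takes a use_atomic flag, two thin exported
wrappers pass 0 or 1, and the compiler side picks the symbol name by
appending a suffix. Everything named here (counter_t, counter_add_body,
profiler_name) is hypothetical; the "_atomic" suffix value is an assumption
based on the new function's name, and __atomic_fetch_add is the GCC/Clang
builtin, so this sketch assumes one of those compilers.

```cpp
#include <cassert>
#include <string>

typedef long long counter_t;  // hypothetical stand-in for gcov_type

// Shared body: the flag selects between a plain and an atomic update,
// so both exported entry points share one implementation.
static inline void counter_add_body(counter_t *counter, counter_t value,
                                    int use_atomic) {
  if (use_atomic)
    __atomic_fetch_add(counter, value, __ATOMIC_RELAXED);  // GCC/Clang builtin
  else
    *counter += value;
}

// Thin wrappers, mirroring the _v4 / _v4_atomic pair in the patch.
void counter_add(counter_t *c, counter_t v) { counter_add_body(c, v, 0); }
void counter_add_atomic(counter_t *c, counter_t v) { counter_add_body(c, v, 1); }

// Compiler side: choose which runtime symbol to reference by appending
// a suffix (assumed to be "_atomic" or empty) to the base name.
std::string profiler_name(const std::string &base, bool atomic) {
  return base + (atomic ? "_atomic" : "");
}
```

Because the flag is a compile-time constant in each wrapper, an optimizing
compiler can fold the branch away after inlining, so the non-atomic path
should not pay for the atomic one.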

gcc/ChangeLog:

2020-01-27  Martin Liska  

PR gcov-profile/93403
* tree-profile.c (gimple_init_gcov_profiler): Generate
both __gcov_indirect_call_profiler_v4 and
__gcov_indirect_call_profiler_v4_atomic.

libgcc/ChangeLog:

2020-01-27  Martin Liska  

PR gcov-profile/93403
* libgcov-profiler.c (__gcov_indirect_call_profiler_v4):
Call __gcov_indirect_call_profiler_body.
(__gcov_indirect_call_profiler_body): New.
(__gcov_indirect_call_profiler_v4_atomic): New.
* libgcov.h (__gcov_indirect_call_profiler_v4_atomic):
New declaration.
---
 gcc/tree-profile.c|  8 
 libgcc/libgcov-profiler.c | 23 ---
 libgcc/libgcov.h  |  1 +
 3 files changed, 25 insertions(+), 7 deletions(-)


diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
index 4c1d296d9ea..6c0838261a1 100644
--- a/gcc/tree-profile.c
+++ b/gcc/tree-profile.c
@@ -120,7 +120,6 @@ gimple_init_gcov_profiler (void)
   tree gcov_type_ptr;
   tree ic_profiler_fn_type;
   tree average_profiler_fn_type;
-  const char *profiler_fn_name;
   const char *fn_name;
 
   if (!gcov_type_node)
@@ -167,6 +166,7 @@ gimple_init_gcov_profiler (void)
   fn_name = concat ("__gcov_topn_values_profiler", fn_suffix, NULL);
   tree_topn_values_profiler_fn
 	= build_fn_decl (fn_name, topn_values_profiler_fn_type);
+  free (CONST_CAST (char *, fn_name));
 
   TREE_NOTHROW (tree_topn_values_profiler_fn) = 1;
   DECL_ATTRIBUTES (tree_topn_values_profiler_fn)
@@ -181,10 +181,10 @@ gimple_init_gcov_profiler (void)
 	  gcov_type_node,
 	  ptr_type_node,
 	  NULL_TREE);
-  profiler_fn_name = "__gcov_indirect_call_profiler_v4";
-
+  fn_name = concat ("__gcov_indirect_call_profiler_v4", fn_suffix, NULL);
   tree_indirect_call_profiler_fn
-	  = build_fn_decl (profiler_fn_name, ic_profiler_fn_type);
+	= build_fn_decl (fn_name, ic_profiler_fn_type);
+  free (CONST_CAST (char *, fn_name));
 
   TREE_NOTHROW (tree_indirect_call_profiler_fn) = 1;
   DECL_ATTRIBUTES (tree_indirect_call_profiler_fn)
diff --git a/libgcc/libgcov-profiler.c b/libgcc/libgcov-profiler.c
index 58784d18477..6043ac4c7a1 100644
--- a/libgcc/libgcov-profiler.c
+++ b/libgcc/libgcov-profiler.c
@@ -199,8 +199,9 @@ struct indirect_call_tuple __gcov_indirect_call;
as a pointer to a function.  */
 
 /* Tries to determine the most common value among its inputs. */
-void
-__gcov_indirect_call_profiler_v4 (gcov_type value, void* cur_func)
+static inline void
+__gcov_indirect_call_profiler_body (gcov_type value, void *cur_func,
+int use_atomic)
 {
   /* If the C++ virtual tables contain function descriptors then one
  function may have multiple descriptors and we need to dereference
@@ -208,10 +209,26 @@ __gcov_indirect_call_profiler_v4 (gcov_type value, void* cur_func)
   if (cur_func == __gcov_indirect_call.callee
   || (__LIBGCC_VTABLE_USES_DESCRIPTORS__
 	  && *(void **) cur_func == *(void **) __gcov_indirect_call.callee))
-__gcov_topn_values_profiler_body (__gcov_indirect_call.counters, value, 0);
+__gcov_topn_values_profiler_body (__gcov_indirect_call.counters, value,
+  use_atomic);
 
   __gcov_indirect_call.callee = NULL;
 }
+
+void
+__gcov_indirect_call_profiler_v4 (gcov_type value, void *cur_func)
+{
+  __gcov_indirect_call_profiler_body (value, cur_func, 0);
+}
+
+#if GCOV_SUPPORTS_ATOMIC
+void
+__gcov_indirect_call_profiler_v4_atomic (gcov_type value, void *cur_func)
+{
+  __gcov_indirect_call_profiler_body (value, cur_func, 1);
+}
+#endif
+
 #endif
 
 #ifdef L_gcov_time_profiler
diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
index bc7e308a4f9..023293e05ec 100644
--- a/libgcc/libgcov.h
+++ b/libgcc/libgcov.h
@@ -274,6 +274,7 @@ extern void __gcov_pow2_profiler_atomic (gcov_type *, gcov_type);
 extern void __gcov_topn_values_profiler (gcov_type *, gcov_type);
 extern void __gcov_topn_values_profiler_atomic (gcov_type *, gcov_type);
 extern void __gcov_indirect_call_profiler_v4 (gcov_type, void *);
+extern void __gcov_indirect_call_profiler_v4_atomic (gcov_type, void *);
 extern void __gcov_time_profiler (gcov_type *);
 extern void __gcov_time_profiler_atomic (gcov_type *);
 extern void __gcov_average_profiler (gcov_type *, gcov_type);

Re: [PATCH][AArch64] ACLE 8-bit integer matrix multiply-accumulate intrinsics

2020-01-27 Thread Richard Sandiford

Dennis Zhang  writes:
> [...]
> gcc/ChangeLog:
>
> 2020-01-23  Dennis Zhang  
>
>   * config/aarch64/aarch64-builtins.c (TYPES_TERNOP_SSUS): New macro.
>   * config/aarch64/aarch64-simd-builtins.def (simd_smmla): New.
>   (simd_ummla, simd_usmmla): New.
>   * config/aarch64/aarch64-simd.md (aarch64_simd_mmlav16qi): New.
>   * config/aarch64/arm_neon.h (vmmlaq_s32, vmmlaq_u32): New.
>   (vusmmlaq_s32): New.
>   * config/aarch64/iterators.md (unspec): Add UNSPEC_SMATMUL,
>   UNSPEC_UMATMUL, and UNSPEC_USMATMUL.
>   (sur): Likewise.
>   (MATMUL): New iterator.
>
> gcc/testsuite/ChangeLog:
>
> 2020-01-23  Dennis Zhang  
>
>   * gcc.target/aarch64/simd/vmmla.c: New test.

OK, thanks.

One note below...

> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
> b/gcc/config/aarch64/aarch64-builtins.c
> index f0e0461b7f0..033a6d4e92f 100644
> --- a/gcc/config/aarch64/aarch64-builtins.c
> +++ b/gcc/config/aarch64/aarch64-builtins.c
> @@ -176,6 +176,10 @@ 
> aarch64_types_ternopu_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>= { qualifier_unsigned, qualifier_unsigned,
>qualifier_unsigned, qualifier_immediate };
>  #define TYPES_TERNOPUI (aarch64_types_ternopu_imm_qualifiers)
> +static enum aarch64_type_qualifiers
> +aarch64_types_ternop_ssus_qualifiers[SIMD_MAX_BUILTIN_ARGS]
> +  = { qualifier_none, qualifier_none, qualifier_unsigned, qualifier_none };
> +#define TYPES_TERNOP_SSUS (aarch64_types_ternop_ssus_qualifiers)
>  
>  
>  static enum aarch64_type_qualifiers
> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
> b/gcc/config/aarch64/aarch64-simd-builtins.def
> index 57fc5933b43..885c2540514 100644
> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
> @@ -682,3 +682,8 @@
>BUILTIN_VSFDF (UNOP, frint32x, 0)
>BUILTIN_VSFDF (UNOP, frint64z, 0)
>BUILTIN_VSFDF (UNOP, frint64x, 0)
> +
> +  /* Implemented by aarch64_simd_mmlav16qi.  */
> +  VAR1 (TERNOP, simd_smmla, 0, v16qi)
> +  VAR1 (TERNOPU, simd_ummla, 0, v16qi)
> +  VAR1 (TERNOP_SSUS, simd_usmmla, 0, v16qi)
> diff --git a/gcc/config/aarch64/aarch64-simd.md 
> b/gcc/config/aarch64/aarch64-simd.md
> index 2989096b170..b7659068b7d 100644
> --- a/gcc/config/aarch64/aarch64-simd.md
> +++ b/gcc/config/aarch64/aarch64-simd.md
> @@ -7025,3 +7025,15 @@
>"xtn\t%0., %1."
>[(set_attr "type" "neon_shift_imm_narrow_q")]
>  )
> +
> +;; 8-bit integer matrix multiply-accumulate
> +(define_insn "aarch64_simd_mmlav16qi"
> +  [(set (match_operand:V4SI 0 "register_operand" "=w")
> + (plus:V4SI
> +  (unspec:V4SI [(match_operand:V16QI 2 "register_operand" "w")
> +(match_operand:V16QI 3 "register_operand" "w")] MATMUL)
> +  (match_operand:V4SI 1 "register_operand" "0")))]
> +  "TARGET_I8MM"
> +  "mmla\\t%0.4s, %2.16b, %3.16b"
> +  [(set_attr "type" "neon_mla_s_q")]
> +)
> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
> index eaba156e26c..918000d98dc 100644
> --- a/gcc/config/aarch64/arm_neon.h
> +++ b/gcc/config/aarch64/arm_neon.h
> @@ -34609,6 +34609,36 @@ vrnd64xq_f64 (float64x2_t __a)
>  
>  #pragma GCC pop_options
>  
> +/* AdvSIMD 8-bit Integer Matrix Multiply (I8MM) intrinsics.  */
> +
> +#pragma GCC push_options
> +#pragma GCC target ("arch=armv8.2-a+i8mm")
> +
> +/* Matrix Multiply-Accumulate.  */
> +
> +__extension__ extern __inline int32x4_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmmlaq_s32 (int32x4_t __r, int8x16_t __a, int8x16_t __b)
> +{
> +  return __builtin_aarch64_simd_smmlav16qi (__r, __a, __b);
> +}
> +
> +__extension__ extern __inline uint32x4_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vmmlaq_u32 (uint32x4_t __r, uint8x16_t __a, uint8x16_t __b)
> +{
> +  return __builtin_aarch64_simd_ummlav16qi_ (__r, __a, __b);
> +}
> +
> +__extension__ extern __inline int32x4_t
> +__attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
> +vusmmlaq_s32 (int32x4_t __r, uint8x16_t __a, int8x16_t __b)
> +{
> +  return __builtin_aarch64_simd_usmmlav16qi_ssus (__r, __a, __b);
> +}
> +
> +#pragma GCC pop_options
> +
>  #include "arm_bf16.h"
>  
>  #undef __aarch64_vget_lane_any
> diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
> index b9843b83c5f..57aca36f646 100644
> --- a/gcc/config/aarch64/iterators.md
> +++ b/gcc/config/aarch64/iterators.md
> @@ -581,6 +581,9 @@
>  UNSPEC_FMLSL ; Used in aarch64-simd.md.
>  UNSPEC_FMLAL2; Used in aarch64-simd.md.
>  UNSPEC_FMLSL2; Used in aarch64-simd.md.
> +UNSPEC_SMATMUL   ; Used in aarch64-simd.md.
> +UNSPEC_UMATMUL   ; Used in aarch64-simd.md.
> +UNSPEC_USMATMUL  ; Used in aarch64-simd.md.
>  UNSPEC_ADR   ; Used in aarch64-sve.md.
>  UNSPEC_SEL   ; Used in aarch64-sve.md.
>  UNSPEC_BRKA  ; Used in aarch64-sve.md.
> @@ -2531,6 +2534,8 @@
>  
>  (define_int_iterator
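
As background for the intrinsics approved above, here is a scalar reference
for what a signed 8-bit matrix multiply-accumulate computes, as I read the
Arm ISA description linked in the thread: the first operand is a 2x8 matrix
stored row-major, the second operand's two groups of eight bytes act as the
columns of an 8x2 matrix, and the 2x2 product is added to a row-major 2x2
accumulator. The name mmla_ref is hypothetical and the layout is my
understanding of the documentation, not something taken from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Scalar reference: acc[2*i + j] += sum over k of a[8*i + k] * b[8*j + k].
// 'a' holds two rows of eight signed bytes; 'b' holds two groups of eight
// signed bytes, each group acting as one column of the right-hand matrix.
std::array<std::int32_t, 4> mmla_ref(std::array<std::int32_t, 4> acc,
                                     const std::array<std::int8_t, 16> &a,
                                     const std::array<std::int8_t, 16> &b) {
  for (int i = 0; i < 2; ++i)
    for (int j = 0; j < 2; ++j) {
      std::int32_t sum = 0;
      for (int k = 0; k < 8; ++k)
        sum += std::int32_t(a[8 * i + k]) * std::int32_t(b[8 * j + k]);
      acc[2 * i + j] += sum;
    }
  return acc;
}
```

The unsigned and mixed-sign variants would differ only in the element types
of the two byte operands. The plus around the unspec in the define_insn
corresponds to the final accumulate step here.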

Re: [PATCH] coroutines: Ensure the ramp return object is checked (PR93443).

2020-01-27 Thread Nathan Sidwell


On 1/27/20 6:43 AM, Iain Sandoe wrote:

As the PR shows, there is a pathway through the code where the
no_warning value is not set, which corresponds to a missing check
of the ramp return when it was constructed from the 'get return
object'. Fixed by ensuring that the check of the return value is
carried out for both return cases.

bootstrapped and tested on x86_64-darwin16,
OK for trunk?
thanks
Iain

gcc/cp/ChangeLog:

2020-01-27  Iain Sandoe  

PR c++/93443
* coroutines.cc (morph_fn_to_coro): Check the ramp return
value when it is constructed from the 'get return object'.


ok, thanks


--
Nathan Sidwell

Re: [Patch] [libgomp, build] Skip plugin-{gcn,hsa} for (-m)x32 (PR bootstrap/93409)

2020-01-27 Thread Andrew Stubbs


On 24/01/2020 14:59, Tobias Burnus wrote:
As reported in PR93409, the build of libgomp/plugin/plugin-gcn.c fails 
with a bunch of error messages when building with 
--with-multilib-list=m32,m64,mx32


The reason is that the GCN plugin assumes 64-bit pointers. As with HSA,
the build is only enabled for x86-64 and "-m32" is excluded. However,
it seems to make sense to also exclude "-mx32".


This patch was tested with -m32/-m64 multilib as I do not have a -mx32 
setup.

OK for the trunk?


This is fine with me, but that's probably not enough for this file.

Andrew

Re: [PATCH][AArch64] ACLE 8-bit integer matrix multiply-accumulate intrinsics

2020-01-27 Thread Dennis Zhang

Hi Richard,

On 23/01/2020 15:28, Richard Sandiford wrote:
> Dennis Zhang  writes:
>> Hi all,
>> On 16/12/2019 13:53, Dennis Zhang wrote:
>>> Hi all,
>>>
>>> This patch is part of a series adding support for Armv8.6-A features.
>>> It depends on the Armv8.6-A effective target checking patch,
>>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html.
>>>
>>> This patch adds intrinsics for matrix multiply-accumulate operations
>>> including vmmlaq_s32, vmmlaq_u32, and vusmmlaq_s32.
>>>
>>> ACLE documents are at https://developer.arm.com/docs/101028/latest
>>> ISA documents are at https://developer.arm.com/docs/ddi0596/latest
>>>
>>> Regtested & bootstrapped for aarch64-none-linux-gnu.
>>>
>>> Is it OK for trunk please?
>>>
>>
>> This patch is rebased to the trunk top.
>> There is no dependence on any other patches now.
>> Regtested again.
>>
>> Is it OK for trunk please?
>>
>> Cheers
>> Dennis
>>
>> gcc/ChangeLog:
>>
>> 2020-01-23  Dennis Zhang  
>>
>>  * config/aarch64/aarch64-builtins.c (TYPES_TERNOP_SSUS): New macro.
>>  * config/aarch64/aarch64-simd-builtins.def (simd_smmla): New.
>>  (simd_ummla, simd_usmmla): New.
>>  * config/aarch64/aarch64-simd.md (aarch64_simd_mmlav16qi): New.
>>  * config/aarch64/arm_neon.h (vmmlaq_s32, vmmlaq_u32): New.
>>  (vusmmlaq_s32): New.
>>  * config/aarch64/iterators.md (unspec): Add UNSPEC_SMATMUL,
>>  UNSPEC_UMATMUL, and UNSPEC_USMATMUL.
>>  (sur): Likewise.
>>  (MATMUL): New iterator.
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2020-01-23  Dennis Zhang  
>>
>>  * gcc.target/aarch64/advsimd-intrinsics/vmmla.c: New test.
>>
>> diff --git a/gcc/config/aarch64/aarch64-builtins.c 
>> b/gcc/config/aarch64/aarch64-builtins.c
>> index f0e0461b7f0..033a6d4e92f 100644
>> --- a/gcc/config/aarch64/aarch64-builtins.c
>> +++ b/gcc/config/aarch64/aarch64-builtins.c
>> @@ -176,6 +176,10 @@ 
>> aarch64_types_ternopu_imm_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> = { qualifier_unsigned, qualifier_unsigned,
>> qualifier_unsigned, qualifier_immediate };
>>   #define TYPES_TERNOPUI (aarch64_types_ternopu_imm_qualifiers)
>> +static enum aarch64_type_qualifiers
>> +aarch64_types_ternop_ssus_qualifiers[SIMD_MAX_BUILTIN_ARGS]
>> +  = { qualifier_none, qualifier_none, qualifier_unsigned, qualifier_none };
>> +#define TYPES_TERNOP_SSUS (aarch64_types_ternop_ssus_qualifiers)
>>   
>>   
>>   static enum aarch64_type_qualifiers
>> diff --git a/gcc/config/aarch64/aarch64-simd-builtins.def 
>> b/gcc/config/aarch64/aarch64-simd-builtins.def
>> index 57fc5933b43..06025b110cc 100644
>> --- a/gcc/config/aarch64/aarch64-simd-builtins.def
>> +++ b/gcc/config/aarch64/aarch64-simd-builtins.def
>> @@ -682,3 +682,8 @@
>> BUILTIN_VSFDF (UNOP, frint32x, 0)
>> BUILTIN_VSFDF (UNOP, frint64z, 0)
>> BUILTIN_VSFDF (UNOP, frint64x, 0)
>> +
>> +  /* Implemented by aarch64_simd_mmlav16qi.  */
>> +  VAR1 (TERNOP, simd_smmla, 0, v16qi)
>> +  VAR1 (TERNOPU, simd_ummla, 0, v16qi)
>> +  VAR1 (TERNOP_SSUS, simd_usmmla, 0, v16qi)
>> \ No newline at end of file
>> diff --git a/gcc/config/aarch64/aarch64-simd.md 
>> b/gcc/config/aarch64/aarch64-simd.md
>> index 2989096b170..409ec28d293 100644
>> --- a/gcc/config/aarch64/aarch64-simd.md
>> +++ b/gcc/config/aarch64/aarch64-simd.md
>> @@ -7025,3 +7025,15 @@
>> "xtn\t%0., %1."
>> [(set_attr "type" "neon_shift_imm_narrow_q")]
>>   )
>> +
>> +;; 8-bit integer matrix multiply-accumulate
>> +(define_insn "aarch64_simd_mmlav16qi"
>> +  [(set (match_operand:V4SI 0 "register_operand" "=w")
>> +(plus:V4SI (match_operand:V4SI 1 "register_operand" "0")
>> +   (unspec:V4SI [(match_operand:V16QI 2 "register_operand" "w")
>> + (match_operand:V16QI 3 "register_operand" "w")]
>> +MATMUL)))]
>> +  "TARGET_I8MM"
>> +  "mmla\\t%0.4s, %2.16b, %3.16b"
>> +  [(set_attr "type" "neon_mla_s_q")]
>> +)
>> \ No newline at end of file
> 
> (Would be good to add the newline)
> 
> The canonical rtl order for commutative operations like plus is
> to put the most complicated expression first (roughly speaking --
> the rules are a bit more precise than that).  So this should be:
> 
>[(set (match_operand:V4SI 0 "register_operand" "=w")
>   (plus:V4SI (unspec:V4SI [(match_operand:V16QI 2 "register_operand" "w")
>(match_operand:V16QI 3 "register_operand" "w")]
>   MATMUL)
>  (match_operand:V4SI 1 "register_operand" "0")))]
> 
>> diff --git a/gcc/config/aarch64/arm_neon.h b/gcc/config/aarch64/arm_neon.h
>> index eaba156e26c..918000d98dc 100644
>> --- a/gcc/config/aarch64/arm_neon.h
>> +++ b/gcc/config/aarch64/arm_neon.h
>> @@ -34609,6 +34609,36 @@ vrnd64xq_f64 (float64x2_t __a)
>>   
>>   #pragma GCC pop_options
>>   
>> +/* AdvSIMD 8-bit Integer Matrix Multiply (I8MM) intrinsics.  */
>> +
>> +#pragma GCC push_options
>> +#pragma GCC target ("arch=armv8.2-a+i8mm")
>> +
>> +/* Matrix Multiply-Accumulate.  */
>> +
>>

Re: [RFC] [c-family] PR92867 - Add returns_arg attribute

2020-01-27 Thread Richard Biener

On Fri, Jan 24, 2020 at 11:53 PM Joseph Myers  wrote:
>
> On Fri, 24 Jan 2020, Prathamesh Kulkarni wrote:
>
> > The middle-end representation issue of ERF_RETURNS_ARG still remains,
> > which restricts the attribute till first four args. The patch simply
> > emits sorry(), for arguments beyond first four..
>
> I think this should be fixed (e.g. make the middle-end check for the
> attribute, or something like that).

Since it's a pure optimization you can also simply choose to ignore this without
notice.

Note ERF_RETURN_ARG_MASK is limited to the number of available
bits in an int and practically the only current setter was via "fn spec"
which uses a single digit [1-9] to denote the argument (so limiting to
three is indeed an odd choice but matches builtins using this at the moment).

Feel free to up ERF_RETURN_ARG_MASK (but then you need to adjust
the ERF_ flag defines).
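
To illustrate the constraint, here is a minimal sketch of packing an
argument index into the low bits of a flags word that also carries other
flags. The constant names and values are illustrative assumptions modeled
on the description above, not copied from GCC; the point is only that
widening the mask means moving every flag that sits above it.

```cpp
#include <cassert>

// Illustrative layout (assumed, not GCC's definitions):
constexpr unsigned kReturnArgMask = 3;        // low two bits hold the argument index
constexpr unsigned kReturnsArg    = 1u << 2;  // set when an argument is returned
constexpr unsigned kNoAlias       = 1u << 3;  // unrelated flag sharing the same word

// Encode "returns argument `index`". An index that does not fit in the
// mask yields 0, i.e. the information is silently dropped.
constexpr unsigned encode_returns_arg(unsigned index) {
  return index <= kReturnArgMask ? (kReturnsArg | index) : 0u;
}

// Decode: the recorded argument index, or -1 when none is recorded.
constexpr int decode_returns_arg(unsigned flags) {
  return (flags & kReturnsArg) ? static_cast<int>(flags & kReturnArgMask) : -1;
}
```

Returning 0 for an out-of-range index corresponds to the "ignore without
notice" option: the optimization hint is lost, but nothing is miscompiled.
Raising kReturnArgMask to, say, 7 would collide with kReturnsArg at bit 2,
which is why the other flag values would have to be adjusted as well.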

>  The language semantics of the
> attribute should not be driven by such internal implementation details;
> rather, implementation details should be determined by the language
> semantics to be implemented.
>
> The sorry () has coding style issues.  Diagnostics should not end with '.'
> or '\n', should use full words (attribute not attr, arguments not args)
> and programming language text in them should be surrounded by %<%> (so
> %).
>
> --
> Joseph S. Myers
> jos...@codesourcery.com

[PATCH] coroutines: Ensure the ramp return object is checked (PR93443).

2020-01-27 Thread Iain Sandoe

As the PR shows, there is a pathway through the code where the
no_warning value is not set, which corresponds to a missing check
of the ramp return when it was constructed from the 'get return
object'. Fixed by ensuring that the check of the return value is
carried out for both return cases.

bootstrapped and tested on x86_64-darwin16,
OK for trunk?
thanks
Iain

gcc/cp/ChangeLog:

2020-01-27  Iain Sandoe  

PR c++/93443
* coroutines.cc (morph_fn_to_coro): Check the ramp return
value when it is constructed from the 'get return object'.
---

 gcc/cp/coroutines.cc | 18 ++
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/gcc/cp/coroutines.cc b/gcc/cp/coroutines.cc
index b222c1f7a8e..e8a6a4033f6 100644
--- a/gcc/cp/coroutines.cc
+++ b/gcc/cp/coroutines.cc
@@ -3526,14 +3526,9 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
 
   /* Switch to using 'input_location' as the loc, since we're now more
  logically doing things related to the end of the function.  */
-  /* done, we just need the return value.  */
-  bool no_warning;
-  if (same_type_p (TREE_TYPE (gro), fn_return_type))
-{
-  /* Already got the result.  */
-  r = check_return_expr (DECL_RESULT (orig), &no_warning);
-}
-  else
+
+  /* The ramp is done, we just need the return value.  */
+  if (!same_type_p (TREE_TYPE (gro), fn_return_type))
 {
   /* construct the return value with a single GRO param.  */
   vec<tree, va_gc> *args = make_tree_vector_single (gro);
@@ -3545,6 +3540,13 @@ morph_fn_to_coro (tree orig, tree *resumer, tree 
*destroyer)
   add_stmt (r);
   release_tree_vector (args);
 }
+  /* Else the GRO is the return and we already built it in place.  */
+
+  bool no_warning;
+  r = check_return_expr (DECL_RESULT (orig), &no_warning);
+  if (error_operand_p (r) && warn_return_type)
+/* Suppress -Wreturn-type for the ramp.  */
+TREE_NO_WARNING (orig) = true;
 
   r = build_stmt (input_location, RETURN_EXPR, DECL_RESULT (orig));
   TREE_NO_WARNING (r) |= no_warning;
-- 
2.24.1

Re: [PATCH] sanopt: Avoid crash on anonymous parameter [PR93436]

2020-01-27 Thread Martin Liška


On 1/26/20 10:18 PM, Marek Polacek wrote:

Sure, that's better, thanks.


Thank you for the fix Marek.

Martin

[PATCH] libstdc++: Fix deduction guide for std::span (PR93426)

2020-01-27 Thread Jonathan Wakely

The deduction guide from an iterator and sentinel used the wrong alias
template and so didn't work.

PR libstdc++/93426
* include/std/span (span): Fix deduction guide.
* testsuite/23_containers/span/deduction.cc: New test.

This used to work correctly but regressed with r279000.

Tested powerpc64le-linux, committed to trunk.

commit 389cd88ce797e2a4345eab8db478a3b8eba798e8
Author: Jonathan Wakely 
Date:   Mon Jan 27 10:30:03 2020 +

libstdc++: Fix deduction guide for std::span (PR93426)

The deduction guide from an iterator and sentinel used the wrong alias
template and so didn't work.

PR libstdc++/93426
* include/std/span (span): Fix deduction guide.
* testsuite/23_containers/span/deduction.cc: New test.

diff --git a/libstdc++-v3/include/std/span b/libstdc++-v3/include/std/span
index 0dae18672af..0072010dea8 100644
--- a/libstdc++-v3/include/std/span
+++ b/libstdc++-v3/include/std/span
@@ -190,7 +190,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
: span(static_cast(__arr.data()), _ArrayExtent)
{ }
 
-public:
   template
requires (_Extent == dynamic_extent)
  && (!__detail::__is_std_span>::value)
@@ -404,6 +403,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 };
 
   // deduction guides
+
   template
 span(_Type(&)[_ArrayExtent]) -> span<_Type, _ArrayExtent>;
 
@@ -416,7 +416,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
   template
 span(_Iter, _Sentinel)
-  -> span>>;
+  -> span>>;
 
   template
 span(_Range &&)
diff --git a/libstdc++-v3/testsuite/23_containers/span/deduction.cc 
b/libstdc++-v3/testsuite/23_containers/span/deduction.cc
new file mode 100644
index 000..66e955e961b
--- /dev/null
+++ b/libstdc++-v3/testsuite/23_containers/span/deduction.cc
@@ -0,0 +1,84 @@
+// Copyright (C) 2020 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// You should have received a copy of the GNU General Public License along
+// with this library; see the file COPYING3.  If not see
+// .
+
+// { dg-options "-std=gnu++2a" }
+// { dg-do compile { target c++2a } }
+
+#include 
+
+template
+constexpr bool is_static_span(const U&)
+{
+  return std::is_same_v, U> && N != std::dynamic_extent;
+}
+
+template
+constexpr bool is_dynamic_span(const U&)
+{
+  return std::is_same_v, U>;
+}
+
+struct Range
+{
+  float* begin() const;
+  float* end() const;
+};
+
+void
+test01()
+{
+  const char c[] = "";
+  int i[2]{};
+  std::array a;
+  Range r;
+
+  std::span s1(c);
+  static_assert( is_static_span(s1) );
+
+  std::span s2(i);
+  static_assert( is_static_span(s2) );
+
+  std::span s3(a);
+  static_assert( is_static_span(s3) );
+
+  std::span s4(const_cast&>(a));
+  static_assert( is_static_span(s4) );
+
+  std::span s5(std::begin(i), std::end(i));
+  static_assert( is_dynamic_span(s5) );
+
+  std::span s6(std::cbegin(i), std::cend(i));
+  static_assert( is_dynamic_span(s6) );
+
+  std::span s7(r);
+  static_assert( is_dynamic_span(s7) );
+
+  std::span s8(s1);
+  static_assert( is_static_span(s8) );
+
+  std::span s9(s2);
+  static_assert( is_static_span(s9) );
+
+  std::span s10(const_cast&>(s2));
+  static_assert( is_static_span(s10) );
+
+  std::span s11(s5);
+  static_assert( is_dynamic_span(s11) );
+
+  std::span s12(const_cast&>(s5));
+  static_assert( is_dynamic_span(s12) );
+}

[PATCH] Replace one error with inform.

2020-01-27 Thread Martin Liška


Hello.

The patch splits a pair of errors into an error and an inform,
which seems logical to me in this situation:

/xg++ -B. /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C:3:24: error: 
template parameter ‘class ... Args’
3 | template // { dg-error "template 
parameter" }
  |^~~~
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C:7:8: note: 
redeclared here as ‘class Arg0’
7 | struct call // { dg-message "note: redeclared here" }
  |^~~~

instead of:

g++ /home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C -c
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C:3:24: error: 
template parameter ‘class ... Args’
3 | template // { dg-error "template 
parameter" }
  |^~~~
/home/marxin/Programming/gcc/gcc/testsuite/g++.dg/cpp0x/vt-34314.C:7:8: error: 
redeclared here as ‘class Arg0’
7 | struct call // { dg-message "note: redeclared here" }
  |^~~~

That helps -fmax-errors=1 to not split the error in the middle.
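
The interaction with an error limit can be shown with a toy model. This is
not GCC's diagnostic machinery; Kind, Diagnostic and emit are hypothetical
names, and the one assumption is that notes do not count toward the limit
and that the limit is checked before the next error is printed.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Kind { Error, Note };

struct Diagnostic {
  Kind kind;
  std::string text;
};

// Toy emitter: stops before printing an error once max_errors errors
// have already been printed. Notes never count toward the limit.
std::vector<std::string> emit(const std::vector<Diagnostic> &diags,
                              int max_errors) {
  std::vector<std::string> out;
  int errors = 0;
  for (const Diagnostic &d : diags) {
    if (d.kind == Kind::Error) {
      if (errors >= max_errors)
        break;  // limit reached: nothing further is printed
      ++errors;
      out.push_back("error: " + d.text);
    } else {
      out.push_back("note: " + d.text);
    }
  }
  return out;
}
```

With a limit of 1, the error/error pairing prints only "template parameter"
and loses the "redeclared here" half, while the error/note pairing prints
both lines.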

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/cp/ChangeLog:

2020-01-23  Martin Liska  

PR c++/92440
* pt.c (redeclare_class_template): Use inform
for the second location.

gcc/testsuite/ChangeLog:

2020-01-23  Martin Liska  

PR c++/92440
* g++.dg/template/pr92440.C: New test.
* g++.dg/cpp0x/vt-34314.C: Update error to note.
* g++.dg/template/pr59930-2.C: Likewise.
* g++.old-deja/g++.pt/redecl1.C: Likewise.
---
 gcc/cp/pt.c |  2 +-
 gcc/testsuite/g++.dg/cpp0x/vt-34314.C   |  6 +++---
 gcc/testsuite/g++.dg/template/pr59930-2.C   |  2 +-
 gcc/testsuite/g++.dg/template/pr92440.C | 10 ++
 gcc/testsuite/g++.dg/template/redecl2.C |  2 +-
 gcc/testsuite/g++.old-deja/g++.pt/redecl1.C |  4 ++--
 6 files changed, 18 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/template/pr92440.C


diff --git a/gcc/cp/pt.c b/gcc/cp/pt.c
index 4520c995028..f9bed1ea4fb 100644
--- a/gcc/cp/pt.c
+++ b/gcc/cp/pt.c
@@ -6149,7 +6149,7 @@ redeclare_class_template (tree type, tree parms, tree cons)
 		  != TEMPLATE_PARM_PARAMETER_PACK (DECL_INITIAL (parm)
 	{
 	  error ("template parameter %q+#D", tmpl_parm);
-	  error ("redeclared here as %q#D", parm);
+	  inform (input_location, "redeclared here as %q#D", parm);
 	  return false;
 	}
 
diff --git a/gcc/testsuite/g++.dg/cpp0x/vt-34314.C b/gcc/testsuite/g++.dg/cpp0x/vt-34314.C
index ee0ed01b8d3..b37cac53223 100644
--- a/gcc/testsuite/g++.dg/cpp0x/vt-34314.C
+++ b/gcc/testsuite/g++.dg/cpp0x/vt-34314.C
@@ -4,7 +4,7 @@ template // { dg-error "template parameter" }
 struct call;
 
 template
-struct call // { dg-error "redeclared here" }
+struct call // { dg-message "note: redeclared here" }
 {
 template
 struct result;
@@ -21,7 +21,7 @@ template // { dg-error "template parameter" }
 struct call2;
 
 template
-struct call2 // { dg-error "redeclared here" }
+struct call2 // { dg-message "note: redeclared here" }
 {
 template
 struct result;
@@ -37,7 +37,7 @@ template class... TT> // { dg-error "template p
 struct call3;
 
 template class TT>
-struct call3 // { dg-error "redeclared here" }
+struct call3 // { dg-message "note: redeclared here" }
 {
 template
 struct result;
diff --git a/gcc/testsuite/g++.dg/template/pr59930-2.C b/gcc/testsuite/g++.dg/template/pr59930-2.C
index a7e6ea4ea9a..65ec58e23f4 100644
--- a/gcc/testsuite/g++.dg/template/pr59930-2.C
+++ b/gcc/testsuite/g++.dg/template/pr59930-2.C
@@ -6,7 +6,7 @@ namespace N {
 // Injects N::N
 template < T > friend class N;
 // { dg-error "template parameter" "" { target *-*-* } .-1 }
-// { dg-error "redeclared"  "" { target *-*-* } .-2 }
+// { dg-message "note: redeclared"  "" { target *-*-* } .-2 }
   };
 }
 
diff --git a/gcc/testsuite/g++.dg/template/pr92440.C b/gcc/testsuite/g++.dg/template/pr92440.C
new file mode 100644
index 000..20db5f10586
--- /dev/null
+++ b/gcc/testsuite/g++.dg/template/pr92440.C
@@ -0,0 +1,10 @@
+// PR c++/92440
+// { dg-do compile }
+
+template  // { dg-error "template parameter" }
+struct S {
+template 
+friend struct S;  // { dg-message "note: redeclared here as" }
+};
+
+S<0> s;
diff --git a/gcc/testsuite/g++.dg/template/redecl2.C b/gcc/testsuite/g++.dg/template/redecl2.C
index 4dd432e6fea..31334f4f334 100644
--- a/gcc/testsuite/g++.dg/template/redecl2.C
+++ b/gcc/testsuite/g++.dg/template/redecl2.C
@@ -6,4 +6,4 @@
 // non-type template parameter.
 
 template  struct X;	// { dg-error "template parameter" }
-template  struct X;	// { dg-error "redeclared here" }
+template  struct X;	// { dg-message "note: redeclared here" }
diff --git a/gcc/testsuite/g++.old-deja/g++.pt/redecl1.C

Re: [PATCH] Make target_clones resolver fn static.

2020-01-27 Thread Martin Liška

On 1/26/20 6:35 PM, Jeff Law wrote:

On Tue, 2020-01-21 at 13:48 +0100, Martin Liška wrote:

 From a3faaced989869867671ceadd89b56fabde225ff Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 16 Jan 2020 10:38:41 +0100
Subject: [PATCH] Make target_clones resolver fn static.

gcc/ChangeLog:

2020-01-17  Martin Liska  

PR target/93274
* config/i386/i386-features.c (make_resolver_func):
Align the code with ppc64 target implementation.
We do not need to have gnu_indirect_function
as a global function.  Drop TREE_PUBLIC on
ifunc symbol.
* config/rs6000/rs6000.c (make_resolver_func): Drop
TREE_PUBLIC on ifunc symbol.

gcc/testsuite/ChangeLog:

2020-01-17  Martin Liska  

PR target/93274
* gcc.target/i386/pr81213.c: Adjust to not expect
a global unique name.
* gcc.target/i386/pr81213-2.c: New test.

Not strictly a regression, but given the codegen impact, I think this
should go in.  OK

Thanks for the approval, but I'm going to install only the first part and leave
the second part for GCC 11.

Martin

jeff

Re: [PATCH] Make target_clones resolver fn static.

2020-01-27 Thread Martin Liška


On 1/23/20 2:52 PM, Alexander Monakov wrote:



On Thu, 23 Jan 2020, Martin Liška wrote:

So this doesn't help review including two different target changes.  Making
ifunc dispatchers of public functions non-public looks like an unrelated
thing to the bug (sorry if I mis-suggested).  So I feel comfortable approving
the earlier patch which just dropped the extra mangling for non-public
dispatchers in the x86 backend.


Works for me.


If you will be revising the patch, can you please improve the new comment?

I mean this addition:

   /* Make the resolver function static.
   ... */

but it's not what the following code does.


Sure, here is the original version of the patch with a modified comment.
I'm going to install it.

Martin



Thanks.
Alexander



From 256ba0077dd91f70117c8c65f9ffd206ea98b2c4 Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Thu, 16 Jan 2020 10:38:41 +0100
Subject: [PATCH] Do not generate a unique fnname for resolver.

gcc/ChangeLog:

2020-01-17  Martin Liska  

	PR target/93274
	* config/i386/i386-features.c (make_resolver_func):
	Align the code with ppc64 target implementation.
	Do not generate a unique name for resolver function.

gcc/testsuite/ChangeLog:

2020-01-17  Martin Liska  

	PR target/93274
	* gcc.target/i386/pr81213.c: Adjust to not expect
	a globally unique name.
---
 gcc/config/i386/i386-features.c | 19 ---
 gcc/testsuite/gcc.target/i386/pr81213.c |  4 ++--
 2 files changed, 6 insertions(+), 17 deletions(-)

diff --git a/gcc/config/i386/i386-features.c b/gcc/config/i386/i386-features.c
index e580b26b995..b49e6f8d408 100644
--- a/gcc/config/i386/i386-features.c
+++ b/gcc/config/i386/i386-features.c
@@ -2738,26 +2738,16 @@ make_resolver_func (const tree default_decl,
 		const tree ifunc_alias_decl,
 		basic_block *empty_bb)
 {
-  char *resolver_name;
-  tree decl, type, decl_name, t;
+  tree decl, type, t;
 
-  /* IFUNC's have to be globally visible.  So, if the default_decl is
- not, then the name of the IFUNC should be made unique.  */
-  if (TREE_PUBLIC (default_decl) == 0)
-{
-  char *ifunc_name = make_unique_name (default_decl, "ifunc", true);
-  symtab->change_decl_assembler_name (ifunc_alias_decl,
-	  get_identifier (ifunc_name));
-  XDELETEVEC (ifunc_name);
-}
-
-  resolver_name = make_unique_name (default_decl, "resolver", false);
+  /* Create resolver function name based on default_decl.  */
+  tree decl_name = clone_function_name (default_decl, "resolver");
+  const char *resolver_name = IDENTIFIER_POINTER (decl_name);
 
   /* The resolver function should return a (void *). */
   type = build_function_type_list (ptr_type_node, NULL_TREE);
 
   decl = build_fn_decl (resolver_name, type);
-  decl_name = get_identifier (resolver_name);
   SET_DECL_ASSEMBLER_NAME (decl, decl_name);
 
   DECL_NAME (decl) = decl_name;
@@ -2809,7 +2799,6 @@ make_resolver_func (const tree default_decl,
 
   /* Create the alias for dispatch to resolver here.  */
   cgraph_node::create_same_body_alias (ifunc_alias_decl, decl);
-  XDELETEVEC (resolver_name);
   return decl;
 }
 
diff --git a/gcc/testsuite/gcc.target/i386/pr81213.c b/gcc/testsuite/gcc.target/i386/pr81213.c
index 13e15d5fef0..89c47529861 100644
--- a/gcc/testsuite/gcc.target/i386/pr81213.c
+++ b/gcc/testsuite/gcc.target/i386/pr81213.c
@@ -14,6 +14,6 @@ int main()
   return foo();
 }
 
-/* { dg-final { scan-assembler "\t.globl\tfoo\\..*\\.ifunc" } } */
+/* { dg-final { scan-assembler "\t.globl\tfoo" } } */
 /* { dg-final { scan-assembler "foo.resolver:" } } */
-/* { dg-final { scan-assembler "foo\\..*\\.ifunc, @gnu_indirect_function" } } */
+/* { dg-final { scan-assembler "foo\\, @gnu_indirect_function" } } */
-- 
2.25.0
