[PATCH] [i386] Remove storage only description for _Float16 w/o avx512fp16.

2021-09-24 Thread liuhongt via Gcc-patches
[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-September/580207.html

gcc/ChangeLog:

* doc/extend.texi (Half-Precision): Remove storage only
description for _Float16 w/o avx512fp16.
---
 gcc/doc/extend.texi | 11 +--
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 9501a60f20e..79fa1bd4bf8 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -1156,12 +1156,11 @@ It is recommended that portable code use the @code{_Float16} type defined
 by ISO/IEC TS 18661-3:2015.  @xref{Floating Types}.
 
 On x86 targets with SSE2 enabled, without @option{-mavx512fp16},
-@code{_Float16} type is storage only, all operations will be emulated by
-software emulation and the @code{float} instructions. The default behavior
-for @code{FLT_EVAL_METHOD} is to keep the intermediate result of the operation
-as 32-bit precision. This may lead to inconsistent behavior between software
-emulation and AVX512-FP16 instructions. Using @option{-fexcess-precision=16}
-will force round back after each operation.
+all operations will be emulated by software emulation and the @code{float}
+instructions. The default behavior for @code{FLT_EVAL_METHOD} is to keep the
+intermediate result of the operation as 32-bit precision. This may lead to
+inconsistent behavior between software emulation and AVX512-FP16 instructions.
+Using @option{-fexcess-precision=16} will force round back after each operation.
 
 Using @option{-mavx512fp16} will generate AVX512-FP16 instructions instead of
 software emulation. The default behavior of @code{FLT_EVAL_METHOD} is to round
-- 
2.27.0
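
To illustrate the excess-precision wording in the documentation hunk
above, here is a minimal sketch of the same effect one level up. It is an
analogy only, not _Float16 code: float stands in for _Float16 and double
stands in for the wider intermediate type. The helper names are made up
for this example.

```cpp
// Narrow a wide intermediate back to the storage type (float here,
// standing in for _Float16).
constexpr float narrow(double x) { return static_cast<float>(x); }

// Round back after every operation, analogous to what the text says
// -fexcess-precision=16 does for _Float16.
constexpr float sum_round_each_step(float a, float b, float c) {
  return narrow(static_cast<double>(narrow(static_cast<double>(a) + b)) + c);
}

// Keep the intermediate result in the wider type and narrow only once,
// analogous to the default behaviour described for software emulation.
constexpr float sum_keep_excess(float a, float b, float c) {
  return narrow(static_cast<double>(a) + b + c);
}
```

With a = 16777216.0f (2^24), b = 1.0f and c = -16777216.0f the two
functions disagree: rounding after each step loses the 1.0, while keeping
the wide intermediate preserves it. That is the kind of inconsistency the
documentation warns about.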



Re: [Patch] Fortran: Fix assumed-size to assumed-rank passing [PR94070]

2021-09-24 Thread Thomas Koenig via Gcc-patches

Hi Tobias,


OK for mainline?


As promised on IRC, here's the review.

Maybe you can add a test case which shows that the call to the size
intrinsic really does not happen.

OK with that.

Thanks for the patch!

Best regards

Thomas


Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]

2021-09-24 Thread Jason Merrill via Gcc-patches

On 8/28/21 07:54, nick huang via Gcc-patches wrote:

Reference with cv-qualifiers should be ignored instead of causing an error
because standard accepts cv-qualified references introduced by typedef which
is ignored.
Therefore, the fix prevents GCC from reporting error by not setting variable
"bad_quals" in case the reference is introduced by typedef. Still the
cv-qualifier is silently ignored.
Here I quote spec (https://timsong-cpp.github.io/cppwp/dcl.ref#1):
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."
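
A minimal sketch of the rule quoted above, outside of any template
parameter list so that it does not depend on the fix under review. The
names RefHolder and set_to_42 are hypothetical and exist only for this
illustration.

```cpp
#include <type_traits>

// Hypothetical helper: exposes a reference type through a typedef-name.
template <class T>
struct RefHolder {
  typedef T& Type;
};

// The const is introduced through a typedef-name, so per [dcl.ref]/1 it is
// ignored rather than diagnosed: the resulting type is plain int&.
static_assert(std::is_same<const RefHolder<int>::Type, int&>::value,
              "cv-qualifier on a typedef'd reference is ignored");

// Because the parameter type is really int&, writing through it is allowed.
inline void set_to_42(const RefHolder<int>::Type target) { target = 42; }
```

Calling set_to_42 on an int variable modifies that variable, which shows
that the const was dropped silently instead of making the parameter
read-only.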

PR c++/101783

gcc/cp/ChangeLog:

2021-08-27  qingzhe huang  

* tree.c (cp_build_qualified_type_real):


The git commit verifier rejects this commit message with

Checking 1fa0fbcdd15adf936ab4fae584f841beb35da1bb: FAILED
ERR: missing description of a change:

"  * tree.c (cp_build_qualified_type_real):"

(your initial patch had a description here, you just need to copy it over)

ERR: PR 101783 in subject but not in changelog:
"c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]"


(the PR number needs to have a Tab before it)

In Jonathan's earlier reply he asked how you tested the patch; this 
message still doesn't say anything about that.


https://gcc.gnu.org/contribute.html#testing

What is the legal status of your contributions?

https://gcc.gnu.org/contribute.html#legal

Existing code tries to handle this with the tf_ignore_bad_quals, but the 
unnecessary use of typename gets past the code that tries to set the 
flag.  But your approach is nice and straightforward, so let's go ahead 
with it.



gcc/testsuite/ChangeLog:

2021-08-27  qingzhe huang  

* g++.dg/parse/pr101783.C: New test.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 8840932dba2..7aa4318a574 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1356,12 +1356,22 @@ cp_build_qualified_type_real (tree type,
/* A reference or method type shall not be cv-qualified.
   [dcl.ref], [dcl.fct].  This used to be an error, but as of DR 295
   (in CD1) we always ignore extra cv-quals on functions.  */
+
+  /* PR 101783


Let's cite where this comes from in the standard ([dcl.ref]/1), and not 
the PR number.



+ Cv-qualified references are ill-formed except when the cv-qualifiers
+ are introduced through the use of a typedef-name ([dcl.typedef],
+ [temp.param]) or decltype-specifier ([dcl.type.decltype]),
+ in which case the cv-qualifiers are ignored.
+   */
if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE)
&& (TYPE_REF_P (type)
  || FUNC_OR_METHOD_TYPE_P (type)))
  {
-  if (TYPE_REF_P (type))
+  // do NOT set bad_quals when non-method reference is introduced by typedef.
+  if (TYPE_REF_P (type)
+ && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type)))
bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
+  // non-method reference introduced by typedef is also dropped silently


These two // comments seem redundant with the quote from the standard 
above, let's drop them.



type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
  }
  
diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C b/gcc/testsuite/g++.dg/parse/pr101783.C

new file mode 100644
index 000..4e0a435dd0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/pr101783.C
@@ -0,0 +1,5 @@
+template<class T> struct A{
+typedef T& Type;
+};
+template<class T> void f(const typename A<T>::Type){}
+template <> void f<int>(const typename A<int>::Type){}







Re: *PING* [PATCH] c++: fix cases of core1001/1322 by not dropping cv-qualifier of function parameter of type of typename or decltype[PR101402,PR102033,PR102034,PR102039,PR102044]

2021-09-24 Thread Jason Merrill via Gcc-patches

I already responded to this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579527.html



Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-09-24 Thread H.J. Lu via Gcc-patches
On Fri, Sep 24, 2021 at 11:14 AM Fāng-ruì Sòng  wrote:
>
> On Fri, Sep 24, 2021 at 10:41 AM H.J. Lu  wrote:
> >
> > On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng  wrote:
> > >
> > >  On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng  wrote:
> > > >
> > > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu  wrote:
> > > > >
> > > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak  wrote:
> > > > > >
> > > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches
> > > > > >  wrote:
> > > > > > >
> > > > > > > PING^5 
> > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > >
> > > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng 
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > PING^4 
> > > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > >
> > > > > > > > One major design goal of PIE was to avoid copy relocations.
> > > > > > > > The original patch for GCC 5 caused problems for many years.
> > > > > > > >
> > > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng 
> > > > > > > >  wrote:
> > > > > > > >>
> > > > > > > >> PING^3 
> > > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > >>
> > > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng 
> > > > > > > >>  wrote:
> > > > > > > >> >
> > > > > > > >> > PING^2 
> > > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > >> >
> > > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng 
> > > > > > > >> >  wrote:
> > > > > > > >> > >
> > > > > > > >> > > Ping 
> > > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > > >> > >
> > > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song 
> > > > > > > >> > >  wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > This was introduced in 2014-12 to use local binding for 
> > > > > > > >> > > > external symbols
> > > > > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for 
> > > > > > > >> > > > years which mostly
> > > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, 
> > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC
> > > > > > > >> > > > should retire now.
> > > > > > > >> > > >
> > > > > > > >> > > > One design goal of -fPIE was to avoid copy relocations.
> > > > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  With 
> > > > > > > >> > > > this change, the
> > > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and 
> > > > > > > >> > > > other targets.
> > > > > > > >> > > >
> > > > > > > >> > > > ---
> > > > > > > >> > > >
> > > > > > > >> > > > See 
> > > > > > > >> > > > https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html 
> > > > > > > >> > > > for a list
> > > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with 
> > > > > > > >> > > > protected
> > > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) 
> > > > > > > >> > > > issues.
> > > > > > > >> > > >
> > > > > > > >> > > > If you prefer a longer write-up, see
> > > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > > > > >> > > > ---
> > > > > > > >> > > >  gcc/config.in |  6 ---
> > > > > > > >> > > >  gcc/config/i386/i386.c| 11 +---
> > > > > > > >> > > >  gcc/configure | 52 
> > > > > > > >> > > > ---
> > > > > > > >> > > >  gcc/configure.ac  | 48 
> > > > > > > >> > > > -
> > > > > > > >> > > >  gcc/doc/sourcebuild.texi  |  3 --
> > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c| 14 -
> > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c| 14 -
> > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c| 14 -
> > > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c| 17 
> > > > > > > >> > > > --
> > > > > > > >> > > >  gcc/testsuite/lib/target-supports.exp | 47 
> > > > > > > >> > > > -
> > > > > > > >> > > >  10 files changed, 2 insertions(+), 224 deletions(-)
> > > > > > > >> > > >  delete mode 100644 
> > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> > > > > > > >> > > >  delete mode 100644 
> > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> > > > > > > >> > > >  delete mode 100644 
> > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> > > > > > > >> > > >  delete mode 100644 
> > > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> > > > > >
> > > > > > From x86 maintainer's PoV, the implementation is trivially correct,
> > > > > > but I have no idea about functionality. HJ, can you please review 
> > > > > > the
> > > > > > functionality and post your opinion on the patch to move it forward?
> > > > > >
> > > > > > Thanks,
> > > > > 

Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-09-24 Thread Fāng-ruì Sòng via Gcc-patches
On Fri, Sep 24, 2021 at 10:41 AM H.J. Lu  wrote:
>
> On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng  wrote:
> >
> >  On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng  wrote:
> > >
> > > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu  wrote:
> > > >
> > > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak  wrote:
> > > > >
> > > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches
> > > > >  wrote:
> > > > > >
> > > > > > PING^5 
> > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > >
> > > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng  
> > > > > > wrote:
> > > > > > >
> > > > > > > PING^4 
> > > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > >
> > > > > > > One major design goal of PIE was to avoid copy relocations.
> > > > > > > The original patch for GCC 5 caused problems for many years.
> > > > > > >
> > > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng 
> > > > > > >  wrote:
> > > > > > >>
> > > > > > >> PING^3 
> > > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > >>
> > > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng 
> > > > > > >>  wrote:
> > > > > > >> >
> > > > > > >> > PING^2 
> > > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > >> >
> > > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng 
> > > > > > >> >  wrote:
> > > > > > >> > >
> > > > > > >> > > Ping 
> > > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > > >> > >
> > > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song 
> > > > > > >> > >  wrote:
> > > > > > >> > > >
> > > > > > >> > > > This was introduced in 2014-12 to use local binding for 
> > > > > > >> > > > external symbols
> > > > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years 
> > > > > > >> > > > which mostly
> > > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, 
> > > > > > >> > > > HAVE_LD_PIE_COPYRELOC
> > > > > > >> > > > should retire now.
> > > > > > >> > > >
> > > > > > >> > > > One design goal of -fPIE was to avoid copy relocations.
> > > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  With 
> > > > > > >> > > > this change, the
> > > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and 
> > > > > > >> > > > other targets.
> > > > > > >> > > >
> > > > > > >> > > > ---
> > > > > > >> > > >
> > > > > > >> > > > See 
> > > > > > >> > > > https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html 
> > > > > > >> > > > for a list
> > > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with 
> > > > > > >> > > > protected
> > > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) 
> > > > > > >> > > > issues.
> > > > > > >> > > >
> > > > > > >> > > > If you prefer a longer write-up, see
> > > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > > > >> > > > ---
> > > > > > >> > > >  gcc/config.in |  6 ---
> > > > > > >> > > >  gcc/config/i386/i386.c| 11 +---
> > > > > > >> > > >  gcc/configure | 52 
> > > > > > >> > > > ---
> > > > > > >> > > >  gcc/configure.ac  | 48 
> > > > > > >> > > > -
> > > > > > >> > > >  gcc/doc/sourcebuild.texi  |  3 --
> > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c| 14 -
> > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c| 14 -
> > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c| 14 -
> > > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c| 17 --
> > > > > > >> > > >  gcc/testsuite/lib/target-supports.exp | 47 
> > > > > > >> > > > -
> > > > > > >> > > >  10 files changed, 2 insertions(+), 224 deletions(-)
> > > > > > >> > > >  delete mode 100644 
> > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> > > > > > >> > > >  delete mode 100644 
> > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> > > > > > >> > > >  delete mode 100644 
> > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> > > > > > >> > > >  delete mode 100644 
> > > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> > > > >
> > > > > From x86 maintainer's PoV, the implementation is trivially correct,
> > > > > but I have no idea about functionality. HJ, can you please review the
> > > > > functionality and post your opinion on the patch to move it forward?
> > > > >
> > > > > Thanks,
> > > > > Uros.
> > > >
> > > > I prefer to leave it alone and apply this:
> > > >
> > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html
> > > >
> > > > instead.  I am working to add a nodirect_extern_access attribute based
> > > > on feedback at LPC 2021.
> > >
> > > I think -fpie 

[PATCH] Add a simulate_record_decl lang hook

2021-09-24 Thread Richard Sandiford via Gcc-patches
This patch adds a lang hook for defining a struct/RECORD_TYPE
“as if” it had appeared directly in the source code.  It follows
the similar existing hook for enums.

It's the caller's responsibility to create the fields
(as FIELD_DECLs) but the hook's responsibility to create
and declare the associated RECORD_TYPE.

For now the hook is hard-coded to do the equivalent of:

  typedef struct NAME { FIELDS } NAME;

but this could be controlled by an extra parameter if some callers
want a different behaviour in future.
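
As a rough illustration of what that source-level equivalent means for
user code, here is a sketch with a hypothetical NAME (byte_pair) and a
hypothetical single field. It does not use the hook itself; it only shows
the declaration form the hook is described as simulating.

```cpp
#include <cstdint>

// Equivalent of "typedef struct NAME { FIELDS } NAME;" with a made-up
// NAME and FIELDS, loosely in the style of the arm_neon.h tuple structs.
typedef struct byte_pair {
  std::uint8_t val[2];
} byte_pair;

// Both the tag form and the typedef name refer to the same type.
inline int sum_via_both_names() {
  struct byte_pair a = {{1, 2}};  // via the struct tag
  byte_pair b = a;                // via the typedef name
  return b.val[0] + b.val[1];
}
```

The point of the typedef form is that code written against either
spelling keeps working once the compiler, rather than the header,
provides the definition.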

The motivating use case is to allow the long list of struct
definitions in arm_neon.h to be provided by the compiler,
which in turn unblocks various arm_neon.h optimisations.

Tested on aarch64-linux-gnu, individually and with a follow-on
patch from Jonathan that makes use of the hook.  OK to install?

Richard


gcc/
* langhooks.h (lang_hooks_for_types::simulate_record_decl): New hook.
* langhooks-def.h (lhd_simulate_record_decl): Declare.
(LANG_HOOKS_SIMULATE_RECORD_DECL): Define.
(LANG_HOOKS_FOR_TYPES_INITIALIZER): Include it.
* langhooks.c (lhd_simulate_record_decl): New function.

gcc/c/
* c-tree.h (c_simulate_record_decl): Declare.
* c-objc-common.h (LANG_HOOKS_SIMULATE_RECORD_DECL): Override.
* c-decl.c (c_simulate_record_decl): New function.

gcc/cp/
* decl.c: Include langhooks-def.h.
(cxx_simulate_record_decl): New function.
* cp-objcp-common.h (cxx_simulate_record_decl): Declare.
(LANG_HOOKS_SIMULATE_RECORD_DECL): Override.
---
 gcc/c/c-decl.c   | 31 +++
 gcc/c/c-objc-common.h|  2 ++
 gcc/c/c-tree.h   |  2 ++
 gcc/cp/cp-objcp-common.h |  4 
 gcc/cp/decl.c| 38 ++
 gcc/langhooks-def.h  |  4 
 gcc/langhooks.c  | 21 +
 gcc/langhooks.h  | 10 ++
 8 files changed, 112 insertions(+)

diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index 771efa3eadf..8d1324b118c 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -9436,6 +9436,37 @@ c_simulate_enum_decl (location_t loc, const char *name,
   input_location = saved_loc;
   return enumtype;
 }
+
+/* Implement LANG_HOOKS_SIMULATE_RECORD_DECL.  */
+
+tree
+c_simulate_record_decl (location_t loc, const char *name,
+   array_slice fields)
+{
+  location_t saved_loc = input_location;
+  input_location = loc;
+
+  class c_struct_parse_info *struct_info;
+  tree ident = get_identifier (name);
+  tree type = start_struct (loc, RECORD_TYPE, ident, &struct_info);
+
+  for (unsigned int i = 0; i < fields.size (); ++i)
+{
+  DECL_FIELD_CONTEXT (fields[i]) = type;
+  if (i > 0)
+   DECL_CHAIN (fields[i - 1]) = fields[i];
+}
+
+  finish_struct (loc, type, fields[0], NULL_TREE, struct_info);
+
+  tree decl = build_decl (loc, TYPE_DECL, ident, type);
+  TYPE_NAME (type) = decl;
+  TYPE_STUB_DECL (type) = decl;
+  lang_hooks.decls.pushdecl (decl);
+
+  input_location = saved_loc;
+  return type;
+}

 /* Create the FUNCTION_DECL for a function definition.
DECLSPECS, DECLARATOR and ATTRIBUTES are the parts of
diff --git a/gcc/c/c-objc-common.h b/gcc/c/c-objc-common.h
index 7d35a0621e4..f4e8271f06c 100644
--- a/gcc/c/c-objc-common.h
+++ b/gcc/c/c-objc-common.h
@@ -81,6 +81,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #undef LANG_HOOKS_SIMULATE_ENUM_DECL
 #define LANG_HOOKS_SIMULATE_ENUM_DECL c_simulate_enum_decl
+#undef LANG_HOOKS_SIMULATE_RECORD_DECL
+#define LANG_HOOKS_SIMULATE_RECORD_DECL c_simulate_record_decl
 #undef LANG_HOOKS_TYPE_FOR_MODE
 #define LANG_HOOKS_TYPE_FOR_MODE c_common_type_for_mode
 #undef LANG_HOOKS_TYPE_FOR_SIZE
diff --git a/gcc/c/c-tree.h b/gcc/c/c-tree.h
index d50d0cb7f2d..8578d2d1e77 100644
--- a/gcc/c/c-tree.h
+++ b/gcc/c/c-tree.h
@@ -598,6 +598,8 @@ extern tree finish_struct (location_t, tree, tree, tree,
   class c_struct_parse_info *);
 extern tree c_simulate_enum_decl (location_t, const char *,
  vec *);
+extern tree c_simulate_record_decl (location_t, const char *,
+   array_slice);
 extern struct c_arg_info *build_arg_info (void);
 extern struct c_arg_info *get_parm_info (bool, tree);
 extern tree grokfield (location_t, struct c_declarator *,
diff --git a/gcc/cp/cp-objcp-common.h b/gcc/cp/cp-objcp-common.h
index f1704aad557..d5859406e8f 100644
--- a/gcc/cp/cp-objcp-common.h
+++ b/gcc/cp/cp-objcp-common.h
@@ -39,6 +39,8 @@ extern bool cp_handle_option (size_t, const char *, HOST_WIDE_INT, int,
 extern tree cxx_make_type_hook (tree_code);
 extern tree cxx_simulate_enum_decl (location_t, const char *,
vec *);
+extern tree cxx_simulate_record_decl (location_t, const char *,
+ array_slice);
 
 /* Lang hooks that are shared between C++ and ObjC++ are defined here.  Hooks

Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-09-24 Thread H.J. Lu via Gcc-patches
On Fri, Sep 24, 2021 at 10:29 AM Fāng-ruì Sòng  wrote:
>
>  On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng  wrote:
> >
> > On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu  wrote:
> > >
> > > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak  wrote:
> > > >
> > > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches
> > > >  wrote:
> > > > >
> > > > > PING^5 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >
> > > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng  
> > > > > wrote:
> > > > > >
> > > > > > PING^4 
> > > > > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > >
> > > > > > One major design goal of PIE was to avoid copy relocations.
> > > > > > The original patch for GCC 5 caused problems for many years.
> > > > > >
> > > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng  
> > > > > > wrote:
> > > > > >>
> > > > > >> PING^3 
> > > > > >> https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > >>
> > > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng  
> > > > > >> wrote:
> > > > > >> >
> > > > > >> > PING^2 
> > > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > >> >
> > > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng 
> > > > > >> >  wrote:
> > > > > >> > >
> > > > > >> > > Ping 
> > > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > > >> > >
> > > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song 
> > > > > >> > >  wrote:
> > > > > >> > > >
> > > > > >> > > > This was introduced in 2014-12 to use local binding for 
> > > > > >> > > > external symbols
> > > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years 
> > > > > >> > > > which mostly
> > > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, 
> > > > > >> > > > HAVE_LD_PIE_COPYRELOC
> > > > > >> > > > should retire now.
> > > > > >> > > >
> > > > > >> > > > One design goal of -fPIE was to avoid copy relocations.
> > > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  With this 
> > > > > >> > > > change, the
> > > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and other 
> > > > > >> > > > targets.
> > > > > >> > > >
> > > > > >> > > > ---
> > > > > >> > > >
> > > > > >> > > > See https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html 
> > > > > >> > > > for a list
> > > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with 
> > > > > >> > > > protected
> > > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) 
> > > > > >> > > > issues.
> > > > > >> > > >
> > > > > >> > > > If you prefer a longer write-up, see
> > > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > > >> > > > ---
> > > > > >> > > >  gcc/config.in |  6 ---
> > > > > >> > > >  gcc/config/i386/i386.c| 11 +---
> > > > > >> > > >  gcc/configure | 52 
> > > > > >> > > > ---
> > > > > >> > > >  gcc/configure.ac  | 48 
> > > > > >> > > > -
> > > > > >> > > >  gcc/doc/sourcebuild.texi  |  3 --
> > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c| 14 -
> > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c| 14 -
> > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c| 14 -
> > > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c| 17 --
> > > > > >> > > >  gcc/testsuite/lib/target-supports.exp | 47 
> > > > > >> > > > -
> > > > > >> > > >  10 files changed, 2 insertions(+), 224 deletions(-)
> > > > > >> > > >  delete mode 100644 
> > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> > > > > >> > > >  delete mode 100644 
> > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> > > > > >> > > >  delete mode 100644 
> > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> > > > > >> > > >  delete mode 100644 
> > > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> > > >
> > > > From x86 maintainer's PoV, the implementation is trivially correct,
> > > > but I have no idea about functionality. HJ, can you please review the
> > > > functionality and post your opinion on the patch to move it forward?
> > > >
> > > > Thanks,
> > > > Uros.
> > >
> > > I prefer to leave it alone and apply this:
> > >
> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html
> > >
> > > instead.  I am working to add a nodirect_extern_access attribute based
> > > on feedback at LPC 2021.
> >
> > I think -fpie should be fixed as soon as possible.
> >
> > "Add -f[no-]direct-extern-access" says "-fdirect-extern-access is the 
> > default."
> > IMHO this is not a good choice for -fpie.
> > As the description of this patch says, one of the design goals of
> > -fpie is to avoid copy relocations.
> >
> > > In 

Re: [PATCH] x86-64: Remove HAVE_LD_PIE_COPYRELOC

2021-09-24 Thread Fāng-ruì Sòng via Gcc-patches
 On Tue, Sep 21, 2021 at 7:08 PM Fāng-ruì Sòng  wrote:
>
> On Tue, Sep 21, 2021 at 6:57 PM H.J. Lu  wrote:
> >
> > On Tue, Sep 21, 2021 at 9:16 AM Uros Bizjak  wrote:
> > >
> > > On Mon, Sep 20, 2021 at 8:20 PM Fāng-ruì Sòng via Gcc-patches
> > >  wrote:
> > > >
> > > > PING^5 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > >
> > > > On Sat, Sep 4, 2021 at 12:11 PM Fāng-ruì Sòng  
> > > > wrote:
> > > > >
> > > > > PING^4 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >
> > > > > One major design goal of PIE was to avoid copy relocations.
> > > > > The original patch for GCC 5 caused problems for many years.
> > > > >
> > > > > On Wed, Aug 18, 2021 at 11:54 PM Fāng-ruì Sòng  
> > > > > wrote:
> > > > >>
> > > > >> PING^3 https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >>
> > > > >> On Fri, Jun 4, 2021 at 3:04 PM Fāng-ruì Sòng  
> > > > >> wrote:
> > > > >> >
> > > > >> > PING^2 
> > > > >> > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >> >
> > > > >> > On Mon, May 24, 2021 at 9:43 AM Fāng-ruì Sòng  
> > > > >> > wrote:
> > > > >> > >
> > > > >> > > Ping 
> > > > >> > > https://gcc.gnu.org/pipermail/gcc-patches/2021-May/570139.html
> > > > >> > >
> > > > >> > > On Tue, May 11, 2021 at 8:29 PM Fangrui Song 
> > > > >> > >  wrote:
> > > > >> > > >
> > > > >> > > > This was introduced in 2014-12 to use local binding for 
> > > > >> > > > external symbols
> > > > >> > > > for -fPIE. Now that we have H.J. Lu's GOTPCRELX for years 
> > > > >> > > > which mostly
> > > > >> > > > nullify the benefit of HAVE_LD_PIE_COPYRELOC, 
> > > > >> > > > HAVE_LD_PIE_COPYRELOC
> > > > >> > > > should retire now.
> > > > >> > > >
> > > > >> > > > One design goal of -fPIE was to avoid copy relocations.
> > > > >> > > > HAVE_LD_PIE_COPYRELOC has deviated from the goal.  With this 
> > > > >> > > > change, the
> > > > >> > > > -fPIE behavior of x86-64 will be closer to x86-32 and other 
> > > > >> > > > targets.
> > > > >> > > >
> > > > >> > > > ---
> > > > >> > > >
> > > > >> > > > See https://gcc.gnu.org/legacy-ml/gcc/2019-05/msg00215.html 
> > > > >> > > > for a list
> > > > >> > > > of fixed and unfixed (e.g. gold incompatibility with protected
> > > > >> > > > https://sourceware.org/bugzilla/show_bug.cgi?id=19823) issues.
> > > > >> > > >
> > > > >> > > > If you prefer a longer write-up, see
> > > > >> > > > https://maskray.me/blog/2021-01-09-copy-relocations-canonical-plt-entries-and-protected
> > > > >> > > > ---
> > > > >> > > >  gcc/config.in |  6 ---
> > > > >> > > >  gcc/config/i386/i386.c| 11 +---
> > > > >> > > >  gcc/configure | 52 
> > > > >> > > > ---
> > > > >> > > >  gcc/configure.ac  | 48 
> > > > >> > > > -
> > > > >> > > >  gcc/doc/sourcebuild.texi  |  3 --
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-1.c| 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-2.c| 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-3.c| 14 -
> > > > >> > > >  .../gcc.target/i386/pie-copyrelocs-4.c| 17 --
> > > > >> > > >  gcc/testsuite/lib/target-supports.exp | 47 
> > > > >> > > > -
> > > > >> > > >  10 files changed, 2 insertions(+), 224 deletions(-)
> > > > >> > > >  delete mode 100644 
> > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-1.c
> > > > >> > > >  delete mode 100644 
> > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-2.c
> > > > >> > > >  delete mode 100644 
> > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-3.c
> > > > >> > > >  delete mode 100644 
> > > > >> > > > gcc/testsuite/gcc.target/i386/pie-copyrelocs-4.c
> > >
> > > From x86 maintainer's PoV, the implementation is trivially correct,
> > > but I have no idea about functionality. HJ, can you please review the
> > > functionality and post your opinion on the patch to move it forward?
> > >
> > > Thanks,
> > > Uros.
> >
> > I prefer to leave it alone and apply this:
> >
> > https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576736.html
> >
> > instead.  I am working to add a nodirect_extern_access attribute based
> > on feedback at LPC 2021.
>
> I think -fpie should be fixed as soon as possible.
>
> "Add -f[no-]direct-extern-access" says "-fdirect-extern-access is the 
> default."
> IMHO this is not a good choice for -fpie.
> As the description of this patch says, one of the design goals of
> -fpie is to avoid copy relocations.
>
> > In executable and shared library, bind symbols with the STV_PROTECTED 
> > visibility locally
>
> As I have repeated many times (also Clang's behavior), STV_PROTECTED
> visibility symbol should be bound locally regardless of
> -fno-direct-extern-access.
>
> I think it is fair to say all of Michael Matz, Alan Modra, and I think
> adding so many 

Re: [PATCH] Allow different vector types for stmt groups

2021-09-24 Thread Richard Sandiford via Gcc-patches
Richard Biener  writes:
> This allows vectorization (in practice non-loop vectorization) to
> have a stmt participate in different vector type vectorizations.
> It allows us to remove vect_update_shared_vectype and replace it
> by pushing/popping STMT_VINFO_VECTYPE from SLP_TREE_VECTYPE around
> vect_analyze_stmt and vect_transform_stmt.
>
> For data-ref the situation is a bit more complicated since we
> analyze alignment info with a specific vector type in mind which
> doesn't play well when that changes.
>
> So the bulk of the change is passing down the actual vector type
> used for a vectorized access to the various accessors of alignment
> info, first and foremost dr_misalignment but also aligned_access_p,
> known_alignment_for_access_p, vect_known_alignment_in_bytes and
> vect_supportable_dr_alignment.  I took the liberty to replace
> ALL_CAPS macro accessors with the lower-case function invocations.
>
> The actual changes to the behavior are in dr_misalignment which now
> is the place factoring in the negative step adjustment as well as
> handling alignment queries for a vector type with bigger alignment
> requirements than what we can (or have) analyze(d).
>
> vect_slp_analyze_node_alignment makes use of this and upon receiving
> a vector type with a bigger alignment desire re-analyzes the DR
> with respect to it but keeps an older more precise result if possible.
> In this context it might be possible to do the analysis just once
> but instead of analyzing with respect to a specific desired alignment
> look for the biggest alignment we can compute a not unknown alignment.
>
> The ChangeLog includes the functional changes but not the bulk due
> to the alignment accessor API changes - I hope that's something good.
>
> Bootstrapped and tested on x86_64-unknown-linux-gnu, testing on SPEC
> CPU 2017 in progress (for stats and correctness).
>
> Any comments?

Sorry for the super-slow response, some comments below.
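
For readers following the quoted hunk, the group-offset arithmetic that the old code did inline can be sketched on its own.  This is only an illustration, not the GCC implementation: member_misalignment, kUnknown and the plain integer parameters are hypothetical stand-ins for dr_misalignment, DR_MISALIGNMENT_UNKNOWN and the dr_vec_info fields.

```cpp
#include <cassert>

// Hypothetical stand-in for DR_MISALIGNMENT_UNKNOWN.
constexpr int kUnknown = -1;

// Misalignment of one member of an access group, derived from the
// misalignment recorded for the first element of the group.
//   first_misalign   : misalignment of the group's first element, or kUnknown
//   diff             : byte offset of this member from the first element
//   target_alignment : constant target alignment in bytes
int member_misalignment(int first_misalign, long diff,
                        unsigned long target_alignment)
{
  assert(diff >= 0 && target_alignment > 0);
  if (first_misalign == kUnknown)
    return kUnknown;  // nothing is known about any member either
  return static_cast<int>((first_misalign + diff) % target_alignment);
}
```

With a first element misaligned by 8 bytes against a 16-byte target alignment, a member 12 bytes further on is misaligned by (8 + 12) % 16 = 4 bytes.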

> […]
> diff --git a/gcc/tree-vect-data-refs.c b/gcc/tree-vect-data-refs.c
> index a57700f2c1b..c42fc2fb272 100644
> --- a/gcc/tree-vect-data-refs.c
> +++ b/gcc/tree-vect-data-refs.c
> @@ -887,37 +887,53 @@ vect_slp_analyze_instance_dependence (vec_info *vinfo, 
> slp_instance instance)
>return res;
>  }
>  
> -/* Return the misalignment of DR_INFO.  */
> +/* Return the misalignment of DR_INFO accessed in VECTYPE.  */
>  
>  int
> -dr_misalignment (dr_vec_info *dr_info)
> +dr_misalignment (dr_vec_info *dr_info, tree vectype)
>  {
> +  HOST_WIDE_INT diff = 0;
> +  /* Alignment is only analyzed for the first element of a DR group,
> + use that but adjust misalignment by the offset of the access.  */
>if (STMT_VINFO_GROUPED_ACCESS (dr_info->stmt))
>  {
>dr_vec_info *first_dr
>   = STMT_VINFO_DR_INFO (DR_GROUP_FIRST_ELEMENT (dr_info->stmt));
> -  int misalign = first_dr->misalignment;
> -  gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
> -  if (misalign == DR_MISALIGNMENT_UNKNOWN)
> - return misalign;
>/* vect_analyze_data_ref_accesses guarantees that DR_INIT are
>INTEGER_CSTs and the first element in the group has the lowest
>address.  Likewise vect_compute_data_ref_alignment will
>have ensured that target_alignment is constant and otherwise
>set misalign to DR_MISALIGNMENT_UNKNOWN.  */

Can you move the second sentence down so that it stays with the to_constant?

> -  HOST_WIDE_INT diff = (TREE_INT_CST_LOW (DR_INIT (dr_info->dr))
> - - TREE_INT_CST_LOW (DR_INIT (first_dr->dr)));
> +  diff = (TREE_INT_CST_LOW (DR_INIT (dr_info->dr))
> +   - TREE_INT_CST_LOW (DR_INIT (first_dr->dr)));
>gcc_assert (diff >= 0);
> -  unsigned HOST_WIDE_INT target_alignment_c
> - = first_dr->target_alignment.to_constant ();
> -  return (misalign + diff) % target_alignment_c;
> +  dr_info = first_dr;
>  }
> -  else
> +
> +  int misalign = dr_info->misalignment;
> +  gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
> +  if (misalign == DR_MISALIGNMENT_UNKNOWN)
> +return misalign;
> +
> +  /* If the access is only aligned for a vector type with smaller alignment
> + requirement the access has unknown misalignment.  */
> +  if (maybe_lt (dr_info->target_alignment * BITS_PER_UNIT,
> + targetm.vectorize.preferred_vector_alignment (vectype)))
> +return DR_MISALIGNMENT_UNKNOWN;
> +
> +  /* If this is a backward running DR then first access in the larger
> + vectype actually is N-1 elements before the address in the DR.
> + Adjust misalign accordingly.  */
> +  if (tree_int_cst_sgn (DR_STEP (dr_info->dr)) < 0)
>  {
> -  int misalign = dr_info->misalignment;
> -  gcc_assert (misalign != DR_MISALIGNMENT_UNINITIALIZED);
> -  return misalign;
> +  if (!TYPE_VECTOR_SUBPARTS (vectype).is_constant ())
> + return DR_MISALIGNMENT_UNKNOWN;
> +  diff += ((TYPE_VECTOR_SUBPARTS (vectype).to_constant () - 1)
> + 

Re: [PATCH] top-level: merge Makefile.def patches from binutils-gdb repository

2021-09-24 Thread Andrew Burgess
* Richard Biener  [2021-09-24 13:58:20 +0200]:

> On Fri, Sep 24, 2021 at 12:49 PM Andrew Burgess
>  wrote:
> >
> > This commit back-ports two patches to Makefile.def from the
> > binutils-gdb repository, these patches were committed over there
> > without first being merged in to the gcc repository.
> >
> > These commits all relate to dependencies for binutils-gdb modules, so
> > should have no impact on gcc, I tested a gcc build/install on x86-64
> > GNU/Linux, and everything looked OK.
> >
> > The two patches being backported are binutils-gdb commits:
> >
> >   commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues)
> >   Date:   Mon Oct 12 16:04:32 2020 +0100
> >
> >   gdb/gdbserver: add dependencies for distclean-gnulib
> >
> > And
> >
> >   commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c
> >   Date:   Thu Mar 18 12:37:52 2021 +
> >
> >   Add install dependencies for ld -> bfd and libctf -> bfd
> >
> > OK to merge?
> 
> OK.

Thanks, I pushed this patch.

Andrew


[PING][PATCH] libgcc, emutls: Allow building weak definitions of the emutls functions.

2021-09-24 Thread Iain Sandoe
Hi,

as noted below the non-Darwin parts of this are trivial (and a no-op).
I’d like to apply this to start work towards solving Darwin’s libgcc issues,

OTOH, the two raised questions remain…
thanks
Iain
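
For anyone skimming, the optional-attribute hook that the quoted patch adds has roughly the following shape.  This is a sketch with hypothetical names (DEMO_ATTR, demo_get_value); only the #ifndef fallback and the habit of repeating the attribute on the definition are taken from the patch.

```cpp
#include <cassert>

// DEMO_ATTR is a hypothetical analogue of EMUTLS_ATTR.  A build may pass
//   -DDEMO_ATTR='__attribute__((__weak__,__visibility__("default")))'
// on the command line; without a definition it expands to nothing.
#ifndef DEMO_ATTR
#  define DEMO_ATTR
#endif

// Attribute on the forward declaration...
DEMO_ATTR int demo_get_value(int);

// ...and repeated on the definition, mirroring what the patch does for
// __emutls_get_address.
DEMO_ATTR int
demo_get_value(int x)
{
  return x + 1;
}

// Behaviour is the same whether or not the attribute is supplied.
void demo_check()
{
  assert(demo_get_value(41) == 42);
}
```

Because the macro defaults to empty, targets that do not define it build exactly what they built before.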

> On 20 Sep 2021, at 09:25, Iain Sandoe  wrote:
> 
> Hi,
> 
> The non-Darwin part of this patch is trivial but raises a couple of questions
> 
> A/
> We define builtins to support emulated TLS.
> These are defined with void * pointers
> The implementation (in libgcc) uses the correct type (struct __emutls_object *)
> in both a forward declaration of the functions and in their eventual
> implementation.
> 
> This leads to a (long-standing, nothing new) complaint at build-time about
> the mismatch in the builtin/implementation decls.
> 
> AFAICT, there’s no way to fix that unless we introduce
> struct __emutls_object * as a built-in type?
> 
> B/ 
> It seems that a consequence of the mismatch in decls means that if I apply
> attributes to the decl (in the implementation file), they are ignored and I
> have to apply them to the definition in order for this to work.
> 
> This (B) is what the patch below does.
> 
> tested on powerpc,i686,x86_64-darwin, x86_64-linux
> OK for master?
> thanks,
> Iain
> 
> If the current situation is that A or B indicates “there’s a bug”, please
> could that be considered as distinct from the current patch (which doesn’t
> alter this in any way) so that we can make progress on fixing Darwin libgcc
> issues.
> 
> = commit log
> 
> In order to better support use of the emulated TLS between objects with
> DSO dependencies and static-linked libgcc, allow a target to make weak
> definitions.
> 
> Signed-off-by: Iain Sandoe 
> 
> libgcc/ChangeLog:
> 
>   * config.host: Add weak-defined emutls crt.
>   * config/t-darwin: Build weak-defined emutls objects.
>   * emutls.c (__emutls_get_address): Add optional attributes.
>   (__emutls_register_common): Likewise.
>   (EMUTLS_ATTR): New.
> ---
> libgcc/config.host |  2 +-
> libgcc/config/t-darwin | 13 +
> libgcc/emutls.c| 17 +++--
> 3 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/libgcc/config.host b/libgcc/config.host
> index 6c34b13d611..a447ac7ae30 100644
> --- a/libgcc/config.host
> +++ b/libgcc/config.host
> @@ -215,7 +215,7 @@ case ${host} in
> *-*-darwin*)
>   asm_hidden_op=.private_extern
>   tmake_file="$tmake_file t-darwin ${cpu_type}/t-darwin t-libgcc-pic 
> t-slibgcc-darwin"
> -  extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o"
> +  extra_parts="crt3.o libd10-uwfef.a crttms.o crttme.o libemutls_w.a"
>   ;;
> *-*-dragonfly*)
>   tmake_file="$tmake_file t-crtstuff-pic t-libgcc-pic t-eh-dw2-dip"
> diff --git a/libgcc/config/t-darwin b/libgcc/config/t-darwin
> index 14ae6b35a4e..d6f688d66d5 100644
> --- a/libgcc/config/t-darwin
> +++ b/libgcc/config/t-darwin
> @@ -15,6 +15,19 @@ crttme.o: $(srcdir)/config/darwin-crt-tm.c
> LIB2ADDEH = $(srcdir)/unwind-dw2.c $(srcdir)/config/unwind-dw2-fde-darwin.c \
>   $(srcdir)/unwind-sjlj.c $(srcdir)/unwind-c.c
> 
> +# Make emutls weak so that we can deal with -static-libgcc, override the
> +# hidden visibility when this is present in libgcc_eh.
> +emutls.o: HOST_LIBGCC2_CFLAGS += \
> +  -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))'
> +emutls_s.o: HOST_LIBGCC2_CFLAGS += \
> +  -DEMUTLS_ATTR='__attribute__((__weak__,__visibility__("default")))'
> +
> +# Make the emutls crt as a convenience lib so that it can be linked
> +# optionally, use the shared version so that we can link with DSO.
> +libemutls_w.a: emutls_s.o
> + $(AR_CREATE_FOR_TARGET) $@ $<
> + $(RANLIB_FOR_TARGET) $@
> +
> # Patch to __Unwind_Find_Enclosing_Function for Darwin10.
> d10-uwfef.o: $(srcdir)/config/darwin10-unwind-find-enc-func.c
>   $(crt_compile) -mmacosx-version-min=10.6 -c $<
> diff --git a/libgcc/emutls.c b/libgcc/emutls.c
> index ed2658170f5..d553a74728f 100644
> --- a/libgcc/emutls.c
> +++ b/libgcc/emutls.c
> @@ -50,7 +50,16 @@ struct __emutls_array
>   void **data[];
> };
> 
> +/* EMUTLS_ATTR is provided to allow targets to build the emulated tls
> +   routines as weak definitions, for example.
> +   If there is no definition, fall back to the default.  */
> +#ifndef EMUTLS_ATTR
> +#  define EMUTLS_ATTR
> +#endif
> +
> +EMUTLS_ATTR
> void *__emutls_get_address (struct __emutls_object *);
> +EMUTLS_ATTR
> void __emutls_register_common (struct __emutls_object *, word, word, void *);
> 
> #ifdef __GTHREADS
> @@ -123,7 +132,11 @@ emutls_alloc (struct __emutls_object *obj)
>   return ret;
> }
> 
> -void *
> +/* Despite applying the attribute to the declaration, in this case the mis-
> +   match between the builtin's declaration [void * (*)(void *)] and the
> +   implementation here, causes the decl. attributes to be discarded.  */
> +
> +EMUTLS_ATTR void *
> __emutls_get_address (struct __emutls_object *obj)
> {
>   if (! __gthread_active_p ())
> @@ -187,7 +200,7 @@ __emutls_get_address 

Re: [PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.

2021-09-24 Thread Uros Bizjak via Gcc-patches
On Fri, Sep 24, 2021 at 1:26 PM liuhongt  wrote:
>
> Hi:
>   Related discussion in [1] and PR.
>
>   Bootstrapped and regtest on x86_64-linux-gnu{-m32,}.
>   Ok for trunk?
>
> [1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html
>
> gcc/ChangeLog:
>
> PR target/102464
> * config/i386/i386.c (ix86_optab_supported_p):
> Return true for HFmode.
> * match.pd: Simplify (_Float16) ceil ((double) x) to
> __builtin_ceilf16 (a) when a is _Float16 type and
> direct_internal_fn_supported_p.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/i386/pr102464.c: New test.

OK for x86 part.

Thanks,
Uros.
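
As an aside on why the match.pd rewrite quoted below is value-preserving: the same reasoning can be checked with float and double, used here only as a portable analogue because _Float16 support varies between compilers and targets.  The function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Widen, round in the wide type, narrow back: the shape the pattern matches.
float ceil_via_double(float x)
{
  return static_cast<float>(std::ceil(static_cast<double>(x)));
}

// Round directly in the narrow type: the shape it is rewritten to.
float ceil_direct(float x)
{
  return std::ceil(x);
}

// Every float is exactly representable as a double, and the ceiling of a
// float value is itself representable as a float, so both forms agree.
bool forms_agree(float x)
{
  assert(!std::isnan(x));
  return ceil_via_double(x) == ceil_direct(x);
}
```

The patch applies the same idea with _Float16 as the narrow type and only when the target supports the internal function directly.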

> ---
>  gcc/config/i386/i386.c   | 20 +++-
>  gcc/match.pd | 28 +
>  gcc/testsuite/gcc.target/i386/pr102464.c | 39 
>  3 files changed, 79 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c
>
> diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
> index ba89e111d28..3767fe9806d 100644
> --- a/gcc/config/i386/i386.c
> +++ b/gcc/config/i386/i386.c
> @@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, 
> machine_mode,
>return opt_type == OPTIMIZE_FOR_SPEED;
>
>  case rint_optab:
> -  if (SSE_FLOAT_MODE_P (mode1)
> - && TARGET_SSE_MATH
> - && !flag_trapping_math
> - && !TARGET_SSE4_1)
> +  if (mode1 == HFmode)
> +   return true;
> +  else if (SSE_FLOAT_MODE_P (mode1)
> +  && TARGET_SSE_MATH
> +  && !flag_trapping_math
> +  && !TARGET_SSE4_1)
> return opt_type == OPTIMIZE_FOR_SPEED;
>return true;
>
>  case floor_optab:
>  case ceil_optab:
>  case btrunc_optab:
> -  if (SSE_FLOAT_MODE_P (mode1)
> - && TARGET_SSE_MATH
> - && !flag_trapping_math
> - && TARGET_SSE4_1)
> +  if (mode1 == HFmode)
> +   return true;
> +  else if (SSE_FLOAT_MODE_P (mode1)
> +  && TARGET_SSE_MATH
> +  && !flag_trapping_math
> +  && TARGET_SSE4_1)
> return true;
>return opt_type == OPTIMIZE_FOR_SPEED;
>
> diff --git a/gcc/match.pd b/gcc/match.pd
> index a9791ceb74a..9ccec8b6ce3 100644
> --- a/gcc/match.pd
> +++ b/gcc/match.pd
> @@ -6191,6 +6191,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> (froms (convert float_value_p@0))
> (convert (tos @0)
>
> +#if GIMPLE
> +(match float16_value_p
> + @0
> + (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node)))
> +(for froms (BUILT_IN_TRUNCL BUILT_IN_TRUNC BUILT_IN_TRUNCF
> +   BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF
> +   BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF
> +   BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF
> +   BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF
> +   BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF
> +   BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF)
> + tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC
> + IFN_FLOOR IFN_FLOOR IFN_FLOOR
> + IFN_CEIL IFN_CEIL IFN_CEIL
> + IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN
> + IFN_ROUND IFN_ROUND IFN_ROUND
> + IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT
> + IFN_RINT IFN_RINT IFN_RINT)
> + /* (_Float16) round ((double) x) -> __builtin_roundf16 (x), etc.,
> +if x is a _Float16.  */
> + (simplify
> +   (convert (froms (convert float16_value_p@0)))
> + (if (types_match (type, TREE_TYPE (@0))
> + && direct_internal_fn_supported_p (as_internal_fn (tos),
> +type, OPTIMIZE_FOR_BOTH))
> +   (tos @0
> +#endif
> +
>  (for froms (XFLOORL XCEILL XROUNDL XRINTL)
>   tos (XFLOOR XCEIL XROUND XRINT)
>   /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double.  */
> diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c 
> b/gcc/testsuite/gcc.target/i386/pr102464.c
> new file mode 100644
> index 000..e3e060ee80b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr102464.c
> @@ -0,0 +1,39 @@
> +/* PR target/102464.  */
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -mavx512fp16" } */
> +
> +#define FOO(FUNC,SUFFIX)   \
> +  _Float16 \
> +  foo_##FUNC##_##SUFFIX (_Float16 a)   \
> +  {\
> +return __builtin_##FUNC##SUFFIX (a);   \
> +  }
> +
> +FOO (roundeven, f16);
> +FOO (roundeven, f);
> +FOO (roundeven, );
> +FOO (roundeven, l);
> +FOO (trunc, f16);
> +FOO (trunc, f);
> +FOO (trunc, );
> +FOO (trunc, l);
> +FOO (ceil, f16);
> +FOO (ceil, f);
> +FOO (ceil, );
> +FOO (ceil, l);
> +FOO (floor, f16);
> +FOO (floor, f);
> +FOO (floor, );
> +FOO (floor, l);
> +FOO (nearbyint, f16);
> +FOO (nearbyint, f);
> +FOO (nearbyint, );
> +FOO (nearbyint, l);
> +FOO (rint, f16);
> +FOO (rint, f);
> +FOO (rint, );
> 

Re: [PATCH] Avoid invalid loop transformations in jump threading registry.

2021-09-24 Thread Jeff Law via Gcc-patches




On 9/24/2021 5:34 AM, Aldy Hernandez wrote:
> On 9/23/21 6:10 PM, Jeff Law wrote:
>> On 9/23/2021 5:15 AM, Aldy Hernandez wrote:
>>> My upcoming improvements to the forward jump threader make it thread
>>> more aggressively.  In investigating some "regressions", I noticed
>>> that it has always allowed threading through empty latches and across
>>> loop boundaries.  As we have discussed recently, this should be avoided
>>> until after loop optimizations have run their course.
>>>
>>> Note that this wasn't much of a problem before because DOM/VRP
>>> couldn't find these opportunities, but with a smarter solver, we trip
>>> over them more easily.
>> We used to be much more aggressive in this space -- but we removed
>> the equivalency tracking on backedges in the main part of DOM which
>> had the side effect of reducing the number of threads related to back
>> edges in loops.
>
> I thought we couldn't thread through back edges at all in the old
> threader, or are we talking about the same thing?  We have a hard fail
> on backedge thread attempts for anything but the backward threader and
> its custom copier.

We used to have it in the distant past IIRC.

Jeff



[PATCH] Replace VRP threader with a hybrid forward threader.

2021-09-24 Thread Aldy Hernandez via Gcc-patches
This patch implements the new hybrid forward threader and replaces the
embedded VRP threader with it.

With all the pieces that have gone in, the implementation of the hybrid
threader is straightforward: convert the current state into
SSA imports that the solver will understand, and let the path solver
precompute ranges and relations for the path.  After this setup is done,
we can use the range_query API to solve gimple statements in the threader.
The forward threader is now engine agnostic so there are no changes to
the threader per se.

I have put the hybrid bits in tree-ssa-threadedge.*, instead of VRP,
because they will also be used in the evrp removal of the DOM/threader,
which is my next task.

Most of the patch is actually test changes.  I have gone through every
single one and verified that we're correct.  Most were trivial dump
file name changes, but others required going through the IL and
certifying that the different IL was expected.

For example, in pr59597.c, we have one less thread because the
ASSERT_EXPR was getting in the way, and making it seem like things were
not crossing loops.  The hybrid threader sees the correct representation
of the IL, and avoids threading this one case.

The final numbers are a 12.16% improvement in jump threads immediately
after VRP, and a 0.82% improvement in overall jump threads.  The
performance drop is 0.6% (plus the 1.43% hit from moving the embedded
threader into its own pass).  As I've said, I'd prefer to keep the
threader in its own pass, but if this is an issue, we can address this
with a shared ranger when VRP is replaced with an evrp instance
(upcoming).

Note, that these numbers are slightly different than what I originally
posted.  A few correctness tweaks, plus restricting loop threads, made
the difference.  That being said, I was aiming for par.  A 12% gain is
just gravy ;-).  When we merge the threaders, we should see even better
numbers-- and we'll have the benefit of an entire release stress testing
the solver.

As I mentioned in my introductory note, paths ending in MEM_REF
conditional are missing.  In reality, this didn't make a difference, as
it was so rare.  However, as a follow-up, I will distill a test and add
a suitable PR to keep us honest.

There is a one-line change to libgomp/team.c silencing a new used
uninitialized warning.  As my previous work with the threaders has
shown, warnings flare up after each improvement to jump threading.  I
expect this to be no different.  I've promised Jakub to investigate
fully, so I will analyze and add the appropriate PR for the warning
experts.

Oh yeah, the new pass dump is called vrp-threader[12] to match each
VRP[12] pass.  However, there's no reason for it to either be named
vrp-threader, or for it to live in tree-vrp.c.

Tested on x86-64 Linux.

OK?

p.s. "Did I say 5 weeks?  My bad, I meant 5 months."

gcc/ChangeLog:

* passes.def (pass_vrp_threader): New.
* tree-pass.h (make_pass_vrp_threader): Add make_pass_vrp_threader.
* tree-ssa-threadedge.c (hybrid_jt_state::register_equivs_stmt): New.
(hybrid_jt_simplifier::hybrid_jt_simplifier): New.
(hybrid_jt_simplifier::simplify): New.
(hybrid_jt_simplifier::compute_ranges_from_state): New.
* tree-ssa-threadedge.h (class hybrid_jt_state): New.
(class hybrid_jt_simplifier): New.
* tree-vrp.c (execute_vrp): Remove ASSERT_EXPR based jump
threader.
(class hybrid_threader): New.
(hybrid_threader::hybrid_threader): New.
(hybrid_threader::~hybrid_threader): New.
(hybrid_threader::before_dom_children): New.
(hybrid_threader::after_dom_children): New.
(execute_vrp_threader): New.
(class pass_vrp_threader): New.
(make_pass_vrp_threader): New.

libgomp/ChangeLog:

* team.c: Initialize start_data.
* testsuite/libgomp.graphite/force-parallel-4.c: Adjust.
* testsuite/libgomp.graphite/force-parallel-8.c: Adjust.

gcc/testsuite/ChangeLog:

* gcc.dg/torture/pr55107.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-1.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-2.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-3.c: Adjust.
* gcc.dg/tree-ssa/phi_on_compare-4.c: Adjust.
* gcc.dg/tree-ssa/pr21559.c: Adjust.
* gcc.dg/tree-ssa/pr59597.c: Adjust.
* gcc.dg/tree-ssa/pr61839_1.c: Adjust.
* gcc.dg/tree-ssa/pr61839_3.c: Adjust.
* gcc.dg/tree-ssa/pr71437.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-11.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-16.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-2a.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-4.c: Adjust.
* gcc.dg/tree-ssa/ssa-thread-14.c: Adjust.
* gcc.dg/tree-ssa/ssa-vrp-thread-1.c: Adjust.
* gcc.dg/tree-ssa/vrp106.c: Adjust.
* gcc.dg/tree-ssa/vrp55.c: Adjust.
---
 gcc/passes.def

[pushed] IRA: Make profitability calculation of RA conflict presentations independent of host compiler type sizes [PR102147]

2021-09-24 Thread Vladimir Makarov via Gcc-patches

The following patch solves

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102147

The patch was successfully bootstrapped and tested on x86-64.
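
To illustrate the host dependence being removed, here is a small sketch of the old and new tests side by side.  It is not the GCC code: ptr_size and int_size are hypothetical parameters standing in for sizeof (ira_object_t) and sizeof (IRA_INT_TYPE), the sketch assumes IRA_INT_BITS is 8 * sizeof (IRA_INT_TYPE), and the early return that precedes the calculation in the real function is not reproduced.

```cpp
#include <cassert>
#include <cstddef>

// Old shape of the test: the answer depends on host-dependent sizes.
bool profitable_old(int min, int max, int num,
                    std::size_t ptr_size, std::size_t int_size)
{
  assert(max >= min && num >= 0);
  int int_bits = static_cast<int>(int_size * 8);
  int nw = (max - min + int_bits) / int_bits;   // words in the bit vector
  return 2 * ptr_size * (num + 1) < 3 * nw * int_size;
}

// New shape of the test: a fixed 8 replaces the pointer size and the bit
// vector is measured in bytes, so every host computes the same answer.
bool profitable_new(int min, int max, int num)
{
  assert(max >= min && num >= 0);
  int nbytes = (max - min) / 8 + 1;
  return 2 * 8 * (num + 1) < 3 * nbytes;
}
```

For min = 0, max = 255, num = 5 and an 8-byte int_size, the old test gives true with 4-byte pointers (48 < 96) and false with 8-byte pointers (96 < 96), while the new test gives false everywhere.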


commit ec4c30b64942e615b4bb4b9761cd3b2635158608 (HEAD -> master)
Author: Vladimir N. Makarov 
Date:   Fri Sep 24 10:06:45 2021 -0400

    Make profitability calculation of RA conflict presentations independent of host compiler type sizes. [PR102147]


    gcc/ChangeLog:

    2021-09-24  Vladimir Makarov  

    PR rtl-optimization/102147
    * ira-build.c (ira_conflict_vector_profitable_p): Make
    profitability calculation independent of host compiler pointer and
    IRA_INT_BITS sizes.

diff --git a/gcc/ira-build.c b/gcc/ira-build.c
index 42120656366..2a30efc4f2f 100644
--- a/gcc/ira-build.c
+++ b/gcc/ira-build.c
@@ -629,7 +629,7 @@ ior_hard_reg_conflicts (ira_allocno_t a, const_hard_reg_set set)

 bool
 ira_conflict_vector_profitable_p (ira_object_t obj, int num)
 {
-  int nw;
+  int nbytes;
   int max = OBJECT_MAX (obj);
   int min = OBJECT_MIN (obj);

@@ -638,9 +638,14 @@ ira_conflict_vector_profitable_p (ira_object_t obj, int num)

    in allocation.  */
 return false;

-  nw = (max - min + IRA_INT_BITS) / IRA_INT_BITS;
-  return (2 * sizeof (ira_object_t) * (num + 1)
- < 3 * nw * sizeof (IRA_INT_TYPE));
+  nbytes = (max - min) / 8 + 1;
+  STATIC_ASSERT (sizeof (ira_object_t) <= 8);
+  /* Don't use sizeof (ira_object_t), use constant 8.  Size of ira_object_t (a
+ pointer) is different on 32-bit and 64-bit targets.  Usage sizeof
+ (ira_object_t) can result in different code generation by GCC built as 32-
+ and 64-bit program.  In any case the profitability is just an estimation
+ and border cases are rare.  */
+  return (2 * 8 /* sizeof (ira_object_t) */ * (num + 1) < 3 * nbytes);
 }

 /* Allocates and initialize the conflict vector of OBJ for NUM



[PATCH] rs6000: Fix vec_cpsgn parameter order (PR101985)

2021-09-24 Thread Bill Schmidt via Gcc-patches
Hi!

This fixes a bug reported in 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101985.

The vec_cpsgn built-in function API differs in argument order from the
copysign3 convention.  Currently that pattern is incorrectly used to
implement vec_cpsgn.  Fix that while leaving the existing pattern in place
to implement copysignf for vector modes.
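
A scalar sketch of the two argument conventions may help here.  It is an illustration only: copysign_bits and cpsgn_scalar are hypothetical helpers operating on one float lane, not the vector built-ins, and the sketch assumes a 32-bit IEEE float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// copysign convention: magnitude from the first operand, sign from the second.
float copysign_bits(float magnitude, float sign)
{
  static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");
  std::uint32_t m, s;
  std::memcpy(&m, &magnitude, sizeof m);
  std::memcpy(&s, &sign, sizeof s);
  std::uint32_t r = (m & 0x7fffffffu) | (s & 0x80000000u);
  float out;
  std::memcpy(&out, &r, sizeof out);
  return out;
}

// vec_cpsgn convention, as in the adjusted macro:
//   vec_cpsgn (x, y) -> copysign (y, x)
// i.e. sign from the first operand, magnitude from the second.
float cpsgn_scalar(float x, float y)
{
  return copysign_bits(y, x);
}

// First lane of the new test: a = 1, b = -10, expected result 10.
void check_first_lane()
{
  assert(cpsgn_scalar(1.0f, -10.0f) == 10.0f);
}
```

The remaining lanes of the float test behave the same way: (2, 20) gives 20, (-3, -30) gives -30 and (-4, 40) gives -40.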

Part of the fix when using the new built-in support requires an adjustment
to a pending patch that replaces much of altivec.h with an automatically
generated file.  So that adjustment will be coming later...

Also fix a bug in the new built-in overload infrastructure where we were
using the VSX form of the VEC_COPYSIGN built-in when we should default to
the VMX form.

Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
Is this okay for trunk?

Thanks!
Bill


2021-09-24  Bill Schmidt  

gcc/
PR target/101985
* config/rs6000/altivec.h (vec_cpsgn): Adjust.
* config/rs6000/rs6000-overload.def (VEC_COPYSIGN): Use SKIP to
avoid generating an automatic #define of vec_cpsgn.  Use the
correct built-in for V4SFmode that doesn't depend on VSX.

gcc/testsuite/
PR target/101985
* gcc.target/powerpc/pr101985-1.c: New.
* gcc.target/powerpc/pr101985-2.c: New.
---
 gcc/config/rs6000/altivec.h   |  2 +-
 gcc/config/rs6000/rs6000-overload.def |  4 ++--
 gcc/testsuite/gcc.target/powerpc/pr101985-1.c | 18 ++
 gcc/testsuite/gcc.target/powerpc/pr101985-2.c | 18 ++
 4 files changed, 39 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr101985-1.c
 create mode 100644 gcc/testsuite/gcc.target/powerpc/pr101985-2.c

diff --git a/gcc/config/rs6000/altivec.h b/gcc/config/rs6000/altivec.h
index 5b631c7ebaf..ea72c9c1789 100644
--- a/gcc/config/rs6000/altivec.h
+++ b/gcc/config/rs6000/altivec.h
@@ -129,7 +129,7 @@
 #define vec_vcfux __builtin_vec_vcfux
 #define vec_cts __builtin_vec_cts
 #define vec_ctu __builtin_vec_ctu
-#define vec_cpsgn __builtin_vec_copysign
+#define vec_cpsgn(x,y) __builtin_vec_copysign(y,x)
 #define vec_double __builtin_vec_double
 #define vec_doublee __builtin_vec_doublee
 #define vec_doubleo __builtin_vec_doubleo
diff --git a/gcc/config/rs6000/rs6000-overload.def 
b/gcc/config/rs6000/rs6000-overload.def
index 141f831e2c0..4f583312f36 100644
--- a/gcc/config/rs6000/rs6000-overload.def
+++ b/gcc/config/rs6000/rs6000-overload.def
@@ -1154,9 +1154,9 @@
   vus __builtin_vec_convert_4f32_8f16 (vf, vf);
 CONVERT_4F32_8F16
 
-[VEC_COPYSIGN, vec_cpsgn, __builtin_vec_copysign]
+[VEC_COPYSIGN, SKIP, __builtin_vec_copysign]
   vf __builtin_vec_copysign (vf, vf);
-CPSGNSP
+COPYSIGN_V4SF
   vd __builtin_vec_copysign (vd, vd);
 CPSGNDP
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101985-1.c 
b/gcc/testsuite/gcc.target/powerpc/pr101985-1.c
new file mode 100644
index 000..a1ec2d68d53
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101985-1.c
@@ -0,0 +1,18 @@
+/* PR target/101985 */
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2" } */
+
+#include <altivec.h>
+
+int
+main (void)
+{
+  vector float a = {  1,  2, - 3, - 4};
+  vector float b = {-10, 20, -30,  40};
+  vector float c = { 10, 20, -30, -40};
+  a = vec_cpsgn (a, b);
+  if (! vec_all_eq (a, c))
+__builtin_abort ();
+  return 0;
+}
diff --git a/gcc/testsuite/gcc.target/powerpc/pr101985-2.c 
b/gcc/testsuite/gcc.target/powerpc/pr101985-2.c
new file mode 100644
index 000..71cc254c170
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/pr101985-2.c
@@ -0,0 +1,18 @@
+/* PR target/101985 */
+/* { dg-do run } */
+/* { dg-require-effective-target vsx_hw } */
+/* { dg-options "-O2" } */
+
+#include <altivec.h>
+
+int
+main (void)
+{
+  vector double a = {  1,  -4};
+  vector double b = { -10,  40};
+  vector double c = {  10, -40};
+  a = vec_cpsgn (a, b);
+  if (! vec_all_eq (a, c))
+__builtin_abort ();
+  return 0;
+}
-- 
2.27.0




[COMMITTED] path solver: Avoid further lookups when range is defined in block.

2021-09-24 Thread Aldy Hernandez via Gcc-patches
If an SSA is defined in the current block, there is no need to query
range_on_path_entry for additional information.

Tested on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-path.cc (path_range_query::path_range_query):
Move debugging header...
(path_range_query::precompute_ranges): ...here.
(path_range_query::internal_range_of_expr): Do not call
range_on_path_entry if NAME is defined in the current block.
---
 gcc/gimple-range-path.cc | 9 +++--
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/gcc/gimple-range-path.cc b/gcc/gimple-range-path.cc
index d9704c8f86b..0738a5ca159 100644
--- a/gcc/gimple-range-path.cc
+++ b/gcc/gimple-range-path.cc
@@ -39,9 +39,6 @@ along with GCC; see the file COPYING3.  If not see
 path_range_query::path_range_query (gimple_ranger &ranger, bool resolve)
   : m_ranger (ranger)
 {
-  if (DEBUG_SOLVER)
-fprintf (dump_file, "\n*** path_range_query **\n");
-
   m_cache = new ssa_global_cache;
   m_has_cache_entry = BITMAP_ALLOC (NULL);
   m_path = NULL;
@@ -173,9 +170,6 @@ path_range_query::internal_range_of_expr (irange &r, tree name, gimple *stmt)
   if (TREE_CODE (name) == SSA_NAME)
r.intersect (gimple_range_global (name));
 
-  if (m_resolve && r.varying_p ())
-   range_on_path_entry (r, name);
-
   set_cache (r, name);
   return true;
 }
@@ -467,6 +461,9 @@ void
 path_range_query::precompute_ranges (const vec<basic_block> &path,
 const bitmap_head *imports)
 {
+  if (DEBUG_SOLVER)
+fprintf (dump_file, "\n*** path_range_query **\n");
+
   set_path (path);
   bitmap_copy (m_imports, imports);
   m_undefined_path = false;
-- 
2.31.1



[committed] libstdc++: Remove redundant 'inline' specifiers

2021-09-24 Thread Jonathan Wakely via Gcc-patches
These functions are constexpr, which means they are implicitly inline.
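
A minimal sketch of the point, using hypothetical names (demo_cbegin, demo_twice, demo_first) rather than the libstdc++ ones: a constexpr function is implicitly inline, so spelling out 'inline' as well changes nothing.

```cpp
#include <array>
#include <cassert>
#include <iterator>

// Same shape as the patched cbegin, without the redundant 'inline'.
template<typename Container>
  constexpr auto
  demo_cbegin(const Container& cont) noexcept(noexcept(std::begin(cont)))
    -> decltype(std::begin(cont))
  { return std::begin(cont); }

// A non-template constexpr function is implicitly inline as well, so it
// can be defined in a header that several translation units include.
constexpr int demo_twice(int x) { return 2 * x; }

static_assert(demo_twice(21) == 42, "usable in constant expressions");

// Runtime use of the template above.
int demo_first(const std::array<int, 3>& a)
{
  assert(!a.empty());
  return *demo_cbegin(a);
}
```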

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/range_access.h (cbegin, cend): Remove redundant
'inline' specifier.

Tested x86_64-linux. Committed to trunk.

commit 9b11107ed72ca543af41dbb3226e16b61d31b098
Author: Jonathan Wakely 
Date:   Fri Sep 24 11:30:59 2021

libstdc++: Remove redundant 'inline' specifiers

These functions are constexpr, which means they are implicitly inline.

Signed-off-by: Jonathan Wakely 

libstdc++-v3/ChangeLog:

* include/bits/range_access.h (cbegin, cend): Remove redundant
'inline' specifier.

diff --git a/libstdc++-v3/include/bits/range_access.h 
b/libstdc++-v3/include/bits/range_access.h
index ab2d4f8652c..3dec687dd94 100644
--- a/libstdc++-v3/include/bits/range_access.h
+++ b/libstdc++-v3/include/bits/range_access.h
@@ -122,7 +122,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template<typename _Container>
 [[__nodiscard__]]
-inline constexpr auto
+constexpr auto
 cbegin(const _Container& __cont) noexcept(noexcept(std::begin(__cont)))
   -> decltype(std::begin(__cont))
 { return std::begin(__cont); }
@@ -134,7 +134,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
*/
   template<typename _Container>
 [[__nodiscard__]]
-inline constexpr auto
+constexpr auto
 cend(const _Container& __cont) noexcept(noexcept(std::end(__cont)))
   -> decltype(std::end(__cont))
 { return std::end(__cont); }


PING [PATCH] warn for more impossible null pointer tests [PR102103]

2021-09-24 Thread Martin Sebor via Gcc-patches

Ping: Jeff, with the C++ part approved, can you please confirm your
approval of the C parts of the patch?

https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579693.html

On 9/21/21 6:34 PM, Martin Sebor wrote:
> On 9/21/21 3:40 PM, Jason Merrill wrote:
>>
>> The C++ changes are OK.
>
> Jeff, should I take your previous "Generally OK" as an approval
> for the rest of the patch as well?  (It has not changed in v2.)
> I have just submitted a Glibc patch to suppress the new instances
> there.
>
> Martin




Re: [PATCH] Enable auto-vectorization at O2 with very-cheap cost model.

2021-09-24 Thread Martin Sebor via Gcc-patches

On 9/23/21 9:32 PM, Hongtao Liu wrote:

On Thu, Sep 23, 2021 at 11:18 PM Martin Sebor  wrote:


On 9/23/21 12:30 AM, Richard Biener wrote:

On Thu, 23 Sep 2021, Hongtao Liu wrote:


On Thu, Sep 23, 2021 at 9:48 AM Hongtao Liu  wrote:


On Wed, Sep 22, 2021 at 10:21 PM Martin Sebor  wrote:


On 9/21/21 7:38 PM, Hongtao Liu wrote:

On Mon, Sep 20, 2021 at 4:13 AM Martin Sebor  wrote:

...

diff --git a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c 
b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
index 1d79930cd58..9351f7e7a1a 100644
--- a/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
+++ b/gcc/testsuite/c-c++-common/Wstringop-overflow-2.c
@@ -1,7 +1,7 @@
 /* PR middle-end/91458 - inconsistent warning for writing past the end
of an array member
{ dg-do compile }
-   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf" } */
+   { dg-options "-O2 -Wall -Wno-array-bounds -fno-ipa-icf -fno-tree-vectorize" } */


The testcase is large - what part requires this change?  Given the
testcase was added for inconsistent warnings do they now become
inconsistent again as we enable vectorization at -O2?

That said, the testcase adjustments need some explaining - I suppose
you didn't just slap -fno-tree-vectorize to all of those changing
behavior?


void ga1_ (void)
{
  a1_.a[0] = 0;
  a1_.a[1] = 1; // { dg-warning "\\\[-Wstringop-overflow" }
  a1_.a[2] = 2; // { dg-warning "\\\[-Wstringop-overflow" }

  struct A1 a;
  a.a[0] = 0;
  a.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
  a.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }
  sink ();
}

There are supposed to be 2 warnings, for a.a[1] = 1 and a.a[2] = 2, since
there are 2 accesses, but after enabling vectorization there's only
one access, so one warning is missing, which causes the failure.


With the stores vectorized, is the warning on the correct line or
does it point to the first store, the one that's in bounds, as
it does with -O3?  The latter would be a regression at -O2.

For the case above, it points to the second store, which is out of
bounds; the warning for the third store is missing.




I would find it preferable to change the test code over disabling
optimizations that are on by default.  My concern is that the test
would no longer exercise the default behavior.  (The same goes for
the -fno-ipa-icf option.)

Hmm, it's a middle-end test; for some backends, it may not do
vectorization (it depends on TARGET_VECTOR_MODE_SUPPORTED_P and
the related cost model).


Yes, there are quite a few warning tests like that.  Their main
purpose is to verify that in common GCC invocations (i.e., without
any special options) warnings are a) issued when expected and b)
not issued when not expected.  Otherwise, middle end warnings are
known to have both false positives and false negatives in some
invocations, depending on what optimizations are in effect.
Indiscriminately disabling common optimizations for these large
tests and invoking them under artificial conditions would
compromise this goal and hide the problems.

If enabling vectorization at -O2 causes regressions in the quality
of diagnostics (as the test failure above indicates seems to be
happening) we should investigate these and open bugs for them so
they can be fixed.  We can then tweak the specific failing test
cases to avoid the failures until they are fixed.

There are indeed cases of false positives and false negatives,
e.g.:
// Verify warning for access to a definition with an initializer that
// initializes the one-element array member.
struct A1 a1i_1 = { 0, { 1 } };

void ga1i_1 (void)
{
a1i_1.a[0] = 0;
a1i_1.a[1] = 1;   // { dg-warning "\\\[-Wstringop-overflow" }
a1i_1.a[2] = 2;   // { dg-warning "\\\[-Wstringop-overflow" }

struct A1 a = { 0, { 1 } }; --- false positive here.
a.a[0] = 1;
a.a[1] = 2;   // { dg-warning "\\\[-Wstringop-overflow" } false negative here.
a.a[2] = 3;   // { dg-warning "\\\[-Wstringop-overflow" } false negative here.
sink ();
}

Similar for
* gcc.dg/Warray-bounds-51.c.
* gcc.dg/Warray-parameter-3.c
* gcc.dg/Wstringop-overflow-14.c
* gcc.dg/Wstringop-overflow-21.c

So there are 3 situations.
1. All accesses are out of bounds, and after vectorization some
warnings are missing.
2. Some accesses are in bounds and some are out of bounds, and after
vectorization the warning moves from the out-of-bounds line to an
in-bounds line.
3. All accesses are out of bounds, and after vectorization all warnings
are missing and one appears on a false-positive line.

My mistake, there's no case 3, just case 1 and case 2.
So I'm going to install the patch, ok?


Please don't add the -fno- option to the warning tests.  As I said,
I would prefer to either suppress the vectorization for the failing
cases by tweaking the test code or xfail them.  That way future
regressions won't be masked by 

Re: [PATCH] Avoid invalid loop transformations in jump threading registry.

2021-09-24 Thread Christophe LYON via Gcc-patches



On 23/09/2021 13:15, Aldy Hernandez via Gcc-patches wrote:

My upcoming improvements to the forward jump threader make it thread
more aggressively.  In investigating some "regressions", I noticed
that it has always allowed threading through empty latches and across
loop boundaries.  As we have discussed recently, this should be avoided
until after loop optimizations have run their course.

Note that this wasn't much of a problem before because DOM/VRP
couldn't find these opportunities, but with a smarter solver, we trip
over them more easily.
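
For readers less familiar with the transformation: a jump thread lets control
flow bypass a conditional whose outcome is already known on a given incoming
path. What follows is only a hand-written sketch, not compiler output. All
function names are hypothetical, and the second function merely mimics what a
threader might produce for the first.

```cpp
#include <cassert>

// Hypothetical source: the test of `flag` is redundant on each incoming
// path, because the first branch already determined its value.
int before_threading(int a)
{
  int flag = 0;
  int result = 0;
  if (a > 0)
    flag = 1;
  result += 10;      // work shared by both paths
  if (flag)          // outcome is known from the branch above
    result += 1;
  return result;
}

// Hand-written equivalent of the threaded control flow: the shared block is
// duplicated so that each path goes straight to its known successor.
int after_threading(int a)
{
  if (a > 0)
    return 10 + 1;   // path on which flag == 1
  return 10;         // path on which flag == 0
}

// Both versions must agree on every input.
void check_threading()
{
  for (int a = -3; a <= 3; ++a)
    assert(before_threading(a) == after_threading(a));
}
```

This sketch contains no loop. The patch is concerned with paths that run
through an empty latch or across a loop boundary, where the same duplication
can get in the way of later loop optimizations.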

Because the forward threader doesn't have an independent localized cost
model like the new threader (profitable_path_p), it is difficult to
catch these things at discovery.  However, we can catch them at
registration time, with the added benefit that all the threaders
(forward and backward) can share the handcuffs.

This patch is an adaptation of what we do in the backward threader, but
it is not meant to catch everything we do there, as some of the
restrictions there are due to limitations of the different block
copiers (for example, the generic copier does not re-use existing
threading paths).

We could ideally remove the now redundant bits in profitable_path_p, but
I would prefer not to for two reasons.  First, the backward threader uses
profitable_path_p as it discovers paths to avoid discovering paths in
unprofitable directions.  Second, I would like to merge all the forward
cost restrictions into the profitability class in the backward threader,
not the other way around.  Alas, that reshuffling will have to wait for
the next release.

As usual, there are quite a few tests that needed adjustments.  It seems
we were quite happily threading improper scenarios.  With most of them,
as can be seen in pr77445-2.c, we're merely shifting the threading to
after loop optimizations.

Tested on x86-64 Linux.

OK for trunk?

p.s. "Sure, sounds like fun... how hard can improving the threaders be?"

gcc/ChangeLog:

* tree-ssa-threadupdate.c (jt_path_registry::cancel_invalid_paths):
New.
(jt_path_registry::register_jump_thread): Call
cancel_invalid_paths.
* tree-ssa-threadupdate.h (class jt_path_registry): Add
cancel_invalid_paths.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/20030714-2.c: Adjust.
* gcc.dg/tree-ssa/pr66752-3.c: Adjust.
* gcc.dg/tree-ssa/pr77445-2.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-18.c: Adjust.
* gcc.dg/tree-ssa/ssa-dom-thread-7.c: Adjust.
* gcc.dg/vect/bb-slp-16.c: Adjust.



After your commit r12-3876, I've noticed that some of the updated tests 
fail on some arm targets:


FAIL: gcc:gcc.dg/tree-ssa/tree-ssa.exp=gcc.dg/tree-ssa/pr66752-3.c scan-tree-dump-not thread3 "if .flag"
FAIL: gcc:gcc.dg/tree-ssa/tree-ssa.exp=gcc.dg/tree-ssa/pr77445-2.c scan-tree-dump thread1 "Jumps threaded: 9"

when cpu is:

* cortex-a5 (fpu = vfpv3-d16-fp16)

* cortex-m0, m3, m4, m7 and m55 (with assorted -march/-mfloat-abi)

See 
https://people.linaro.org/~christophe.lyon/cross-validation/gcc/trunk/r12-3876-g4a960d548b7d7d942f316c5295f6d849b74214f5/report-build-info.html


for more details (you can ignore the regressions in libstdc++, they are 
related to random timeouts)


Thanks,


Christophe



---
  gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c|  7 +-
  gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c | 19 --
  gcc/testsuite/gcc.dg/tree-ssa/pr77445-2.c |  4 +-
  .../gcc.dg/tree-ssa/ssa-dom-thread-18.c   |  4 +-
  .../gcc.dg/tree-ssa/ssa-dom-thread-7.c|  4 +-
  gcc/testsuite/gcc.dg/vect/bb-slp-16.c |  7 --
  gcc/tree-ssa-threadupdate.c   | 67 +++
  gcc/tree-ssa-threadupdate.h   |  1 +
  8 files changed, 78 insertions(+), 35 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c b/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c
index eb663f2ff5b..9585ff11307 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/20030714-2.c
@@ -32,7 +32,8 @@ get_alias_set (t)
  }
  }
  
-/* There should be exactly three IF conditionals if we thread jumps
-   properly.  */
-/* { dg-final { scan-tree-dump-times "if " 3 "dom2"} } */
+/* There should be exactly 4 IF conditionals if we thread jumps
+   properly.  There used to be 3, but one thread was crossing
+   loops.  */
+/* { dg-final { scan-tree-dump-times "if " 4 "dom2"} } */
   
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c

index e1464e21170..922a331b217 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr66752-3.c
@@ -1,5 +1,5 @@
  /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-dce2" } */
+/* { dg-options "-O2 -fdump-tree-thread1-details -fdump-tree-thread3" } */
  
  extern int status, pt;

  extern int count;
@@ -32,10 +32,15 @@ foo (int N, int c, int b, int *a)
 pt--;
  

Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-24 Thread Segher Boessenkool
On Mon, Sep 13, 2021 at 04:24:13PM +0200, Richard Biener wrote:
> On Mon, Sep 13, 2021 at 4:10 PM Jeff Law via Gcc-patches
>  wrote:
> > I'm not convinced that we need the inner mode to match anything.  As
> > long as the vec_concat's mode is twice the size of the vec_select modes
> > and the vec_select mode is <= the mode of its operands ISTM this is
> > fine.   We  might want the modes of the vec_select to match, but I don't
> > think that's strictly necessary either, they just need to be the same
> > size.  ie, we could have somethig like
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
> >
> > I'm not sure if that level of generality is useful though.  If we want
> > the modes of the vec_selects to match I think we could still support
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
> >
> > Thoughts?
> 
> I think the component or scalar modes of the elements to concat need to match
> the component mode of the result.  I don't think you example involving
> a cat of DF and DI is too useful - but you could use a subreg around the DI
> value ;)

I agree.

If you want to concatenate components of different modes, you should
change mode first, using subregs for example.

("Inner mode" is something of subregs btw, "component mode" is what this
concept of modes is called, the name GET_MODE_INNER is a bit confusing
though :-) )

Btw, the documentation for "concat" says
  @findex concat
  @item (concat@var{m} @var{rtx} @var{rtx})
  This RTX represents the concatenation of two other RTXs.  This is used
  for complex values.  It should only appear in the RTL attached to
  declarations and during RTL generation.  It should not appear in the
  ordinary insn chain.
which needs some updating (in many ways).


Segher


FW: [PING] Re: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under -ffast-math on aarch64

2021-09-24 Thread Jirui Wu via Gcc-patches
Hi,

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html

The patch is attached as text for ease of use. Is there anything that needs to 
change?

Ok for master? If OK, can it be committed for me, I have no commit rights.

Jirui Wu

-Original Message-
From: Jirui Wu 
Sent: Friday, September 10, 2021 10:14 AM
To: Richard Biener 
Cc: Richard Biener ; Andrew Pinski 
; Richard Sandiford ; 
i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 

Subject: [PING] Re: [Patch][GCC][middle-end] - Generate FRINTZ for 
(double)(int) under -ffast-math on aarch64

Hi,

Ping: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/577846.html

Ok for master? If OK, can it be committed for me, I have no commit rights.

Jirui Wu
-Original Message-
From: Jirui Wu
Sent: Friday, September 3, 2021 12:39 PM
To: 'Richard Biener' 
Cc: Richard Biener ; Andrew Pinski 
; Richard Sandiford ; 
i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 

Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under 
-ffast-math on aarch64

Ping

-Original Message-
From: Jirui Wu
Sent: Friday, August 20, 2021 4:28 PM
To: Richard Biener 
Cc: Richard Biener ; Andrew Pinski 
; Richard Sandiford ; 
i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 

Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for (double)(int) under 
-ffast-math on aarch64

> -Original Message-
> From: Richard Biener 
> Sent: Friday, August 20, 2021 8:15 AM
> To: Jirui Wu 
> Cc: Richard Biener ; Andrew Pinski 
> ; Richard Sandiford ; 
> i...@airs.com; gcc-patches@gcc.gnu.org; Joseph S. Myers 
> 
> Subject: RE: [Patch][GCC][middle-end] - Generate FRINTZ for
> (double)(int) under -ffast-math on aarch64
> 
> On Thu, 19 Aug 2021, Jirui Wu wrote:
> 
> > Hi all,
> >
> > This patch generates FRINTZ instruction to optimize type casts.
> >
> > The changes in this patch covers:
> > * Generate FRINTZ for (double)(int) casts.
> > * Add new test cases.
> >
> > The intermediate type is not checked according to the C99 spec.
> > Overflow of the integral part when casting floats to integers causes
> undefined behavior.
> > As a result, optimization to trunc() is not invalid.
> > I've confirmed that Boolean type does not match the matching condition.
> >
> > Regtested on aarch64-none-linux-gnu and no issues.
> >
> > Ok for master? If OK can it be committed for me, I have no commit rights.
> 
> +/* Detected a fix_trunc cast inside a float type cast,
> +   use IFN_TRUNC to optimize.  */
> +#if GIMPLE
> +(simplify
> +  (float (fix_trunc @0))
> +  (if (direct_internal_fn_supported_p (IFN_TRUNC, type,
> +  OPTIMIZE_FOR_BOTH)
> +   && flag_unsafe_math_optimizations
> +   && type == TREE_TYPE (@0))
> 
> types_match (type, TREE_TYPE (@0))
> 
> please.  Please perform cheap tests first (the flag test).
> 
> + (IFN_TRUNC @0)))
> +#endif
> 
> why only for GIMPLE?  I'm not sure flag_unsafe_math_optimizations is a 
> good test here.  If you say we can use undefined behavior of any 
> overflow of the fix_trunc operation what do we guard here?
> If it's Inf/NaN input then flag_finite_math_only would be more 
> appropriate, if it's behavior for -0. (I suppose trunc (-0.0) == -0.0 
> and thus "wrong") then a && !HONOR_SIGNED_ZEROS (type) is missing 
> instead.  If it's setting of FENV state and possibly trapping on 
> overflow (but it's undefined?!) then flag_trapping_math covers the 
> latter but we don't have any flag for eliding FENV state affecting 
> transforms, so there the kitchen-sink flag_unsafe_math_optimizations might 
> apply.
> 
> So - which is it?
> 
This change is only for GIMPLE because we can't test for the optab support 
without being in GIMPLE. direct_internal_fn_supported_p is defined only for 
GIMPLE. 

IFN_TRUNC's documentation says nothing about zero or NaN/Inf inputs.
So I think the correct guard is just flag_fp_int_builtin_inexact.
!flag_trapping_math because the operation can only still raise inexacts.
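
The signed-zero question raised in the review can be made concrete. This is
only a minimal sketch, assuming IEEE 754 doubles and a default build without
-ffast-math; the helper names are hypothetical and are not part of the patch.

```cpp
#include <cassert>
#include <cmath>

// The cast pair that the proposed pattern matches: (double)(int) x.
double via_int_cast(double x)
{
  return static_cast<double>(static_cast<int>(x));
}

// What the pattern would replace it with.
double via_trunc(double x)
{
  return std::trunc(x);
}

void check_cast_vs_trunc()
{
  // Same value for ordinary in-range inputs.
  assert(via_int_cast(2.75) == via_trunc(2.75));
  assert(via_int_cast(-2.75) == via_trunc(-2.75));

  // Inputs in (-1, 0) compare equal but differ in the sign of zero:
  // the integer 0 converts to +0.0, while trunc keeps the sign.
  assert(via_int_cast(-0.5) == via_trunc(-0.5));
  assert(!std::signbit(via_int_cast(-0.5)));
  assert(std::signbit(via_trunc(-0.5)));
}
```

Inputs whose integral part does not fit in int are left out on purpose, since
the cast is undefined there.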

The new pattern is moved next to the place you mentioned.

Ok for master? If OK can it be committed for me, I have no commit rights.

Thanks,
Jirui
> Note there's also the pattern
> 
> /* Handle cases of two conversions in a row.  */ (for ocvt (convert 
> float
> fix_trunc)  (for icvt (convert float)
>   (simplify
>(ocvt (icvt@1 @0))
>(with
> {
> ...
> 
> which is related so please put the new pattern next to that (the set 
> of conversions handled there does not include (float (fix_trunc @0)))
> 
> Thanks,
> Richard.
> 
> > Thanks,
> > Jirui
> >
> > gcc/ChangeLog:
> >
> > * match.pd: Generate IFN_TRUNC.
> >
> > gcc/testsuite/ChangeLog:
> >
> > * gcc.target/aarch64/merge_trunc1.c: New test.
> >
> > > -Original Message-
> > > From: Richard Biener 
> > > Sent: Tuesday, August 17, 2021 9:13 AM
> > > To: Andrew Pinski 
> > > Cc: Jirui Wu ; Richard Sandiford 
> > > ; i...@airs.com; 
> > > gcc-patches@gcc.gnu.org; 

[PATCH] aarch64: Fix type qualifiers for qtbl1 and qtbx1 Neon builtins

2021-09-24 Thread Jonathan Wright via Gcc-patches
Hi,

This patch fixes type qualifiers for the qtbl1 and qtbx1 Neon builtins
and removes the casts from the Neon intrinsic function bodies that
use these builtins.

Regression tested and bootstrapped on aarch64-none-linux-gnu - no
issues.

Ok for master?

Thanks,
Jonathan

---

gcc/ChangeLog:

23-09-2021  Jonathan Wright  

* config/aarch64/aarch64-builtins.c (TYPES_BINOP_PPU): Define
new type qualifier enum.
(TYPES_TERNOP_SSSU): Likewise.
(TYPES_TERNOP_PPPU): Likewise.
* config/aarch64/aarch64-simd-builtins.def: Define PPU, SSU,
PPPU and SSSU builtin generator macros for qtbl1 and qtbx1
Neon builtins.
* config/aarch64/arm_neon.h (vqtbl1_p8): Use type-qualified
builtin and remove casts.
(vqtbl1_s8): Likewise.
(vqtbl1q_p8): Likewise.
(vqtbl1q_s8): Likewise.
(vqtbx1_s8): Likewise.
(vqtbx1_p8): Likewise.
(vqtbx1q_s8): Likewise.
(vqtbx1q_p8): Likewise.
(vtbl1_p8): Likewise.
(vtbl2_p8): Likewise.
(vtbx2_p8): Likewise.


rb14884.patch
Description: rb14884.patch


Re: [PATCH] top-level: merge Makefile.def patches from binutils-gdb repository

2021-09-24 Thread Richard Biener via Gcc-patches
On Fri, Sep 24, 2021 at 12:49 PM Andrew Burgess
 wrote:
>
> This commit back-ports two patches to Makefile.def from the
> binutils-gdb repository, these patches were committed over there
> without first being merged in to the gcc repository.
>
> These commits all relate to dependencies for binutils-gdb modules, so
> should have no impact on gcc, I tested a gcc build/install on x86-64
> GNU/Linux, and everything looked OK.
>
> The two patches being backported are binutils-gdb commits:
>
>   commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues)
>   Date:   Mon Oct 12 16:04:32 2020 +0100
>
>   gdb/gdbserver: add dependencies for distclean-gnulib
>
> And
>
>   commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c
>   Date:   Thu Mar 18 12:37:52 2021 +
>
>   Add install dependencies for ld -> bfd and libctf -> bfd
>
> OK to merge?

OK.

> 2021-09-07  Andrew Burgess  
>
> Merge from binutils-gdb:
> 2021-09-08  Nick Alcock  
>
> PR libctf/27482
> * Makefile.def: Add install-bfd dependencies for install-libctf and
> install-ld, and install-strip-bfd dependencies for
> install-strip-libctf and install-strip-ld; move the install-ld
> dependency on install-libctf to join it.
> * Makefile.in: Regenerated.
>
> And:
> 2020-10-14  Andrew Burgess  
>
> * Makefile.in: Rebuild.
> * Makefile.def: Make distclean-gnulib depend on distclean-gdb and
> distclean-gdbserver.
> ---
>  ChangeLog| 19 +++
>  Makefile.def | 14 ++
>  Makefile.in  |  8 
>  3 files changed, 41 insertions(+)
>
> diff --git a/Makefile.def b/Makefile.def
> index de3e0052106..143a6b469b2 100644
> --- a/Makefile.def
> +++ b/Makefile.def
> @@ -471,6 +471,14 @@ dependencies = { module=all-ld; on=all-libctf; };
>  dependencies = { module=install-binutils; on=install-opcodes; };
>  dependencies = { module=install-strip-binutils; on=install-strip-opcodes; };
>
> +// Likewise for ld, libctf, and bfd.
> +dependencies = { module=install-libctf; on=install-bfd; };
> +dependencies = { module=install-ld; on=install-bfd; };
> +dependencies = { module=install-ld; on=install-libctf; };
> +dependencies = { module=install-strip-libctf; on=install-strip-bfd; };
> +dependencies = { module=install-strip-ld; on=install-strip-bfd; };
> +dependencies = { module=install-strip-ld; on=install-strip-libctf; };
> +
>  // libopcodes depends on libbfd
>  dependencies = { module=install-opcodes; on=install-bfd; };
>  dependencies = { module=install-strip-opcodes; on=install-strip-bfd; };
> @@ -564,6 +572,12 @@ dependencies = { module=configure-libctf; on=all-zlib; };
>  dependencies = { module=configure-libctf; on=all-libiconv; };
>  dependencies = { module=check-libctf; on=all-ld; };
>
> +// The Makefiles in gdb and gdbserver pull in a file that configure
> +// generates in the gnulib directory, so distclean gnulib only after
> +// gdb and gdbserver.
> +dependencies = { module=distclean-gnulib; on=distclean-gdb; };
> +dependencies = { module=distclean-gnulib; on=distclean-gdbserver; };
> +
>  // Warning, these are not well tested.
>  dependencies = { module=all-bison; on=all-intl; };
>  dependencies = { module=all-bison; on=all-build-texinfo; };
> diff --git a/Makefile.in b/Makefile.in
> index 61af99dc75a..7613da5a378 100644
> --- a/Makefile.in
> +++ b/Makefile.in
> @@ -60763,6 +60763,12 @@ all-stageautoprofile-ld: 
> maybe-all-stageautoprofile-libctf
>  all-stageautofeedback-ld: maybe-all-stageautofeedback-libctf
>  install-binutils: maybe-install-opcodes
>  install-strip-binutils: maybe-install-strip-opcodes
> +install-libctf: maybe-install-bfd
> +install-ld: maybe-install-bfd
> +install-ld: maybe-install-libctf
> +install-strip-libctf: maybe-install-strip-bfd
> +install-strip-ld: maybe-install-strip-bfd
> +install-strip-ld: maybe-install-strip-libctf
>  install-opcodes: maybe-install-bfd
>  install-strip-opcodes: maybe-install-strip-bfd
>  configure-gas: maybe-configure-intl
> @@ -61131,6 +61137,8 @@ check-stagetrain-libctf: maybe-all-stagetrain-ld
>  check-stagefeedback-libctf: maybe-all-stagefeedback-ld
>  check-stageautoprofile-libctf: maybe-all-stageautoprofile-ld
>  check-stageautofeedback-libctf: maybe-all-stageautofeedback-ld
> +distclean-gnulib: maybe-distclean-gdb
> +distclean-gnulib: maybe-distclean-gdbserver
>  all-bison: maybe-all-build-texinfo
>  all-flex: maybe-all-build-bison
>  all-flex: maybe-all-m4
> --
> 2.25.4
>


Re: [PATCH] Avoid invalid loop transformations in jump threading registry.

2021-09-24 Thread Aldy Hernandez via Gcc-patches




On 9/23/21 6:10 PM, Jeff Law wrote:



On 9/23/2021 5:15 AM, Aldy Hernandez wrote:

My upcoming improvements to the forward jump threader make it thread
more aggressively.  In investigating some "regressions", I noticed
that it has always allowed threading through empty latches and across
loop boundaries.  As we have discussed recently, this should be avoided
until after loop optimizations have run their course.

Note that this wasn't much of a problem before because DOM/VRP
couldn't find these opportunities, but with a smarter solver, we trip
over them more easily.
We used to be much more aggressive in this space -- but we removed the 
equivalency tracking on backedges in the main part of DOM which had the 
side effect of reducing the number of threads related to back edges in 
loops.


I thought we couldn't thread through back edges at all in the old 
threader, or are we talking about the same thing?  We have a hard fail 
on backedge thread attempts for anything but the backward threader and 
its custom copier.




Of course that was generally a positive thing given the issues we've 
been discussing.


Yeah.  These tweaks have reduced the number of jump threads in my 
bootstrap .ii by 6%, so a considerable amount.  But they were 
problematic threading paths to begin with.


For example, it removed the regression introduced by the backward 
threader rewrite in gcc.dg/vect/bb-slp-16.c.  Interestingly, for all the 
checks we do in the backward threader, some threading through loop 
boundaries seeps through.  In particular the check for loop crossing 
excludes the taken edge, which IMO is a mistake.  If the entire path is 
in a loop, but the taken edge points to another loop, that by definition 
is a loop crossing.  Note, we have an exception for the first block in a 
path being in another loop, but that's something else ;-).


Anywhoo... we're catching it now.  We really should clean this up and 
merge the differing implementations.  But I'm way over my time budget 
for this ;-).







Because the forward threader doesn't have an independent localized cost
model like the new threader (profitable_path_p), it is difficult to
catch these things at discovery.  However, we can catch them at
registration time, with the added benefit that all the threaders
(forward and backward) can share the handcuffs.
In an ideal world profitability and correctness would be separated -- 
but they're still intertwined and I don't think this makes that 
situation particularly worse.  And I do like having a single choke 
point.


Huh, I hadn't thought about it that way, but you're right.  The 
profitable_path_p code is catching both correctness as well as 
profitability issues.  It seems all the profitability stuff is keyed by 
param*fsm* compile options, though.  It should be easy enough to separate.




Obviously you're cleaning this up, so I think a significant degree of 
freedom should be given here


Much appreciated.

Aldy



[PATCH] [GIMPLE] Simplify (_Float16) ceil ((double) x) to .CEIL (x) when available.

2021-09-24 Thread liuhongt via Gcc-patches
Hi:
  Related discussion in [1] and PR.

  Bootstrapped and regtest on x86_64-linux-gnu{-m32,}.
  Ok for trunk?

[1] https://gcc.gnu.org/pipermail/gcc-patches/2021-July/574330.html
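
The identity behind the transform can be checked with float standing in for
_Float16. That substitution is an assumption made for portability, since
_Float16 support varies between compilers. Widening to double is exact, and
the ceiling of a float value is itself representable as a float, so rounding
in the wide type and narrowing again should give the same result as rounding
directly. The helper names below are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Widen, round in double, narrow back: the shape the pattern matches.
float ceil_via_double(float x)
{
  return static_cast<float>(std::ceil(static_cast<double>(x)));
}

// Round directly in the narrow type: the shape the pattern produces.
float ceil_direct(float x)
{
  return std::ceil(x);   // float overload of std::ceil
}

void check_ceil_equivalence()
{
  const float samples[] = {-2.5f, -0.25f, 0.0f, 0.25f,
                           1.0f, 1.5f, 1000000.5f, 3.0e38f};
  for (float x : samples)
    assert(ceil_via_double(x) == ceil_direct(x));
}
```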

gcc/ChangeLog:

PR target/102464
* config/i386/i386.c (ix86_optab_supported_p):
Return true for HFmode.
* match.pd: Simplify (_Float16) ceil ((double) x) to
__builtin_ceilf16 (x) when x is _Float16 type and
direct_internal_fn_supported_p.

gcc/testsuite/ChangeLog:

* gcc.target/i386/pr102464.c: New test.
---
 gcc/config/i386/i386.c   | 20 +++-
 gcc/match.pd | 28 +
 gcc/testsuite/gcc.target/i386/pr102464.c | 39 
 3 files changed, 79 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr102464.c

diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index ba89e111d28..3767fe9806d 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -23582,20 +23582,24 @@ ix86_optab_supported_p (int op, machine_mode mode1, machine_mode,
   return opt_type == OPTIMIZE_FOR_SPEED;
 
 case rint_optab:
-  if (SSE_FLOAT_MODE_P (mode1)
- && TARGET_SSE_MATH
- && !flag_trapping_math
- && !TARGET_SSE4_1)
+  if (mode1 == HFmode)
+   return true;
+  else if (SSE_FLOAT_MODE_P (mode1)
+  && TARGET_SSE_MATH
+  && !flag_trapping_math
+  && !TARGET_SSE4_1)
return opt_type == OPTIMIZE_FOR_SPEED;
   return true;
 
 case floor_optab:
 case ceil_optab:
 case btrunc_optab:
-  if (SSE_FLOAT_MODE_P (mode1)
- && TARGET_SSE_MATH
- && !flag_trapping_math
- && TARGET_SSE4_1)
+  if (mode1 == HFmode)
+   return true;
+  else if (SSE_FLOAT_MODE_P (mode1)
+  && TARGET_SSE_MATH
+  && !flag_trapping_math
+  && TARGET_SSE4_1)
return true;
   return opt_type == OPTIMIZE_FOR_SPEED;
 
diff --git a/gcc/match.pd b/gcc/match.pd
index a9791ceb74a..9ccec8b6ce3 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -6191,6 +6191,34 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
(froms (convert float_value_p@0))
(convert (tos @0)
 
+#if GIMPLE
+(match float16_value_p
+ @0
+ (if (TYPE_MAIN_VARIANT (TREE_TYPE (@0)) == float16_type_node)))
+(for froms (BUILT_IN_TRUNCL BUILT_IN_TRUNC BUILT_IN_TRUNCF
+   BUILT_IN_FLOORL BUILT_IN_FLOOR BUILT_IN_FLOORF
+   BUILT_IN_CEILL BUILT_IN_CEIL BUILT_IN_CEILF
+   BUILT_IN_ROUNDEVENL BUILT_IN_ROUNDEVEN BUILT_IN_ROUNDEVENF
+   BUILT_IN_ROUNDL BUILT_IN_ROUND BUILT_IN_ROUNDF
+   BUILT_IN_NEARBYINTL BUILT_IN_NEARBYINT BUILT_IN_NEARBYINTF
+   BUILT_IN_RINTL BUILT_IN_RINT BUILT_IN_RINTF)
+ tos (IFN_TRUNC IFN_TRUNC IFN_TRUNC
+ IFN_FLOOR IFN_FLOOR IFN_FLOOR
+ IFN_CEIL IFN_CEIL IFN_CEIL
+ IFN_ROUNDEVEN IFN_ROUNDEVEN IFN_ROUNDEVEN
+ IFN_ROUND IFN_ROUND IFN_ROUND
+ IFN_NEARBYINT IFN_NEARBYINT IFN_NEARBYINT
+ IFN_RINT IFN_RINT IFN_RINT)
+ /* (_Float16) round ((double) x) -> __builtin_roundf16 (x), etc.,
+if x is a _Float16.  */
+ (simplify
+   (convert (froms (convert float16_value_p@0)))
+ (if (types_match (type, TREE_TYPE (@0))
+ && direct_internal_fn_supported_p (as_internal_fn (tos),
+type, OPTIMIZE_FOR_BOTH))
+   (tos @0
+#endif
+
 (for froms (XFLOORL XCEILL XROUNDL XRINTL)
  tos (XFLOOR XCEIL XROUND XRINT)
  /* llfloorl(extend(x)) -> llfloor(x), etc., if x is a double.  */
diff --git a/gcc/testsuite/gcc.target/i386/pr102464.c b/gcc/testsuite/gcc.target/i386/pr102464.c
new file mode 100644
index 000..e3e060ee80b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr102464.c
@@ -0,0 +1,39 @@
+/* PR target/102464.  */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mavx512fp16" } */
+
+#define FOO(FUNC,SUFFIX)   \
+  _Float16 \
+  foo_##FUNC##_##SUFFIX (_Float16 a)   \
+  {\
+return __builtin_##FUNC##SUFFIX (a);   \
+  }
+
+FOO (roundeven, f16);
+FOO (roundeven, f);
+FOO (roundeven, );
+FOO (roundeven, l);
+FOO (trunc, f16);
+FOO (trunc, f);
+FOO (trunc, );
+FOO (trunc, l);
+FOO (ceil, f16);
+FOO (ceil, f);
+FOO (ceil, );
+FOO (ceil, l);
+FOO (floor, f16);
+FOO (floor, f);
+FOO (floor, );
+FOO (floor, l);
+FOO (nearbyint, f16);
+FOO (nearbyint, f);
+FOO (nearbyint, );
+FOO (nearbyint, l);
+FOO (rint, f16);
+FOO (rint, f);
+FOO (rint, );
+FOO (rint, l);
+
+/* { dg-final { scan-assembler-not "vcvtsh2s\[sd\]" } } */
+/* { dg-final { scan-assembler-not "extendhfxf" } } */
+/* { dg-final { scan-assembler-times "vrndscalesh\[^\n\r\]*xmm\[0-9\]" 24 } } */
-- 
2.27.0



*PING* Re: [PATCH] c++: Suppress error when cv-qualified reference is introduced by typedef [PR101783]

2021-09-24 Thread nick huang via Gcc-patches
Cv-qualifiers on a reference should be ignored instead of causing an error, 
because the standard accepts cv-qualified references introduced by a typedef 
and ignores the qualifiers.
Therefore, the fix prevents GCC from reporting an error by not setting the variable
"bad_quals" when the reference is introduced by a typedef. The 
cv-qualifier is still silently ignored. 
Here I quote the standard (https://timsong-cpp.github.io/cppwp/dcl.ref#1):
"Cv-qualified references are ill-formed except when the cv-qualifiers
are introduced through the use of a typedef-name ([dcl.typedef],
[temp.param]) or decltype-specifier ([dcl.type.decltype]),
in which case the cv-qualifiers are ignored."
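
The quoted rule can be illustrated outside of templates. This is only a
minimal sketch with hypothetical names, not the test case from the patch: a
const applied through a typedef-name leaves the reference type unchanged.

```cpp
#include <cassert>
#include <type_traits>

typedef int& IntRef;

// const applied through the typedef-name is ignored: the type stays int&.
static_assert(std::is_same<const IntRef, int&>::value,
              "cv-qualifier on a typedef'd reference is dropped");

// The parameter type is therefore int&, so writing through it is allowed.
void write_through(const IntRef r)
{
  r = 42;
}

// Writing the qualifier directly on the reference is ill-formed:
//   int& const r = x;   // error
```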

PR c++/101783

gcc/cp/ChangeLog:

2021-08-27  qingzhe huang  

    * tree.c (cp_build_qualified_type_real):

gcc/testsuite/ChangeLog:

2021-08-27  qingzhe huang  

    * g++.dg/parse/pr101783.C: New test.

diff --git a/gcc/cp/tree.c b/gcc/cp/tree.c
index 8840932dba2..7aa4318a574 100644
--- a/gcc/cp/tree.c
+++ b/gcc/cp/tree.c
@@ -1356,12 +1356,22 @@ cp_build_qualified_type_real (tree type,
   /* A reference or method type shall not be cv-qualified.
  [dcl.ref], [dcl.fct].  This used to be an error, but as of DR 295
  (in CD1) we always ignore extra cv-quals on functions.  */
+
+  /* PR 101783
+ Cv-qualified references are ill-formed except when the cv-qualifiers
+ are introduced through the use of a typedef-name ([dcl.typedef],
+ [temp.param]) or decltype-specifier ([dcl.type.decltype]),
+ in which case the cv-qualifiers are ignored.
+   */
   if (type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE)
   && (TYPE_REF_P (type)
   || FUNC_OR_METHOD_TYPE_P (type)))
 {
-  if (TYPE_REF_P (type))
+  // do NOT set bad_quals when non-method reference is introduced by typedef.
+  if (TYPE_REF_P (type)
+ && (!typedef_variant_p (type) || FUNC_OR_METHOD_TYPE_P (type)))
 bad_quals |= type_quals & (TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
+  // non-method reference introduced by typedef is also dropped silently
   type_quals &= ~(TYPE_QUAL_CONST | TYPE_QUAL_VOLATILE);
 }
 
diff --git a/gcc/testsuite/g++.dg/parse/pr101783.C b/gcc/testsuite/g++.dg/parse/pr101783.C
new file mode 100644
index 000..4e0a435dd0b
--- /dev/null
+++ b/gcc/testsuite/g++.dg/parse/pr101783.C
@@ -0,0 +1,5 @@
+template struct A{
+    typedef T& Type;
+};
+template void f(const typename A::Type){}
+template <> void f(const typename A::Type){}


*PING* [PATCH] c++: fix cases of core1001/1322 by not dropping cv-qualifier of function parameter of type of typename or decltype[PR101402,PR102033,PR102034,PR102039,PR102044]

2021-09-24 Thread nick huang via Gcc-patches
These bugs are considered duplicates of PR51851, an issue known as
"core1001/1322" that has been suspended since 2012. Given this background,
it deserves a long explanation.

Many people believe the root cause of this family of bugs is related to
how and when an array type is converted to a pointer type while the
function signature is calculated. This is true, but we need to go into
detail to understand the exact reason.

These bugs (PR101402, PR102033, PR102034, PR102039) share a pattern. In the
template function declaration, the function parameter consists of a "const"
followed by a typename-type which is actually an array type. According to the
standard, the function signature is calculated by dropping the so-called
"top-level cv-qualifier". As a result, the template specialization complains
that no matching declaration can be found, because the specialization has the
const and the template function declaration does not: it was dropped as
described. Obviously the template function declaration should NOT drop the
const. But why? Let's first review the procedure in the standard.
(https://timsong-cpp.github.io/cppwp/dcl.fct#5.sentence-3)

"After determining the type of each parameter, any parameter of type “array of
T” or of function type T is adjusted to be “pointer to T”. After producing the
list of parameter types, any top-level cv-qualifiers modifying a parameter type
are deleted when forming the function type."

Please note the action of deleting top-level cv-qualifiers happens at last 
stage 
after array type is converted to pointer type. More importantly, there are two 
conditions:
a) Each type must be able to be determined.
b) The cv-qualifier must be top-level.
Let's analysis if these two conditions can be met one by one.
1) The keyword "typename" indicates that a dependent name is involved inside
the template (https://timsong-cpp.github.io/cppwp/n4659/temp.res#2), for which
name lookup can be postponed until template instantiation. Clearly the type of
a dependent name cannot be determined without name lookup, so we can NOT
proceed to the next step until the concrete template argument type is
determined during specialization.
2) After “array of T” is converted to “pointer to T”, the cv-qualifiers are no
longer top-level! Unfortunately the standard has no definition of "top-level".
Mr. Dan Saks's articles (https://www.dansaks.com/articles.shtml) are a
tremendous help! In particular, this wonderful paper
(https://www.dansaks.com/articles/2000-02%20Top-Level%20cv-Qualifiers%20in%20Function%20Parameters.pdf)
discusses this topic in detail. In one short sentence, the "const" before an
array type is NOT a top-level cv-qualifier and should NOT be dropped.
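
To make point 2) concrete, here is a minimal sketch that is not taken from the
patch or its tests. The names Arr, by_value and by_array are made up for
illustration, and it deliberately uses a plain typedef instead of a dependent
typename-type, so the result does not depend on the bug being discussed.

```cpp
#include <type_traits>

// Hypothetical array typedef, standing in for what a typename-type may name.
using Arr = int[3];

// The const here is top-level, so it is deleted when the function type is formed.
void by_value(const int);

// The const here qualifies the array elements. After the array-to-pointer
// adjustment the parameter is "pointer to const int", and the const stays.
void by_array(const Arr);

static_assert(std::is_same<decltype(by_value), void(int)>::value,
              "top-level const is dropped from the function type");
static_assert(std::is_same<decltype(by_array), void(const int*)>::value,
              "const on an array parameter survives as pointer-to-const");
```

Both assertions are checked at compile time. The template cases in the PRs are
the same situation with the array type hidden behind a dependent name.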

So understanding the root cause makes the fix very clear: let's NOT drop the
cv-qualifier for a typename-type inside a template, and leave this task to
template substitution later, when template specialization fixes the template
argument types.

Similarly, inside a template "decltype" may also involve a dependent name, and
the best strategy for the parser is to preserve the original declaration and
postpone the task until template substitution.

Here is an interesting observation to share. Originally my fix tried to use
the function "resolve_typename_type" to see whether the "typename-type" is
indeed an array type, so as to decide whether the const should be dropped. It
works for the cases of PR101402 and PR102033 (with a small fix to the
function), but cannot succeed on the cases of PR102034 and PR102039. PR102039
in particular is impossible because it depends on the template argument. This
helped me realize that the parser should not do any work unless it can be 100%
successful. Everything can wait.

Finally, I want to acknowledge other efforts to tackle core 1001/1322, such as
PR92010, which takes a different and irreplaceable approach from this fix by
rebuilding the template function signature during the template substitution
stage. After all, this fix can only deal with dependent types that start with
"typename" or "decltype", which is not the case for PR92010.
 

gcc/cp/ChangeLog:

2021-08-30  qingzhe huang  

    * decl.c (grokparms):

gcc/testsuite/ChangeLog:

2021-08-30  qingzhe huang  

    * g++.dg/parse/pr101402.C: New test.
    * g++.dg/parse/pr102033.C: New test.
    * g++.dg/parse/pr102034.C: New test.
    * g++.dg/parse/pr102039.C: New test.
    * g++.dg/parse/pr102044.C: New test.


diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c
index e0c603aaab6..940c43ce707 100644
--- a/gcc/cp/decl.c
+++ b/gcc/cp/decl.c
@@ -14384,7 +14384,16 @@ grokparms (tree parmlist, tree *parms)
 
   /* Top-level qualifiers on the parameters are
  ignored for function types.  */
- type = cp_build_qualified_type (type, 0);
+
+ int type_quals = 0;
+ /* Inside template declaration, typename and decltype indicating
+    dependent name and cv-qualifier are preserved until
+    template instantiation.
+    

Re: [PATCH] Relax condition of (vec_concat:M(vec_select op0 idx0)(vec_select op0 idx1)) to allow different modes between op0 and M, but have same inner mode.

2021-09-24 Thread Hongtao Liu via Gcc-patches
ping

On Mon, Sep 13, 2021 at 11:19 PM Hongtao Liu  wrote:
>
> On Mon, Sep 13, 2021 at 10:10 PM Jeff Law via Gcc-patches
>  wrote:
> >
> >
> >
> > On 9/9/2021 10:36 PM, liuhongt via Gcc-patches wrote:
> > > Currently for (vec_concat:M (vec_select op0 idx1) (vec_select op0 idx2)),
> > > the optimizer won't simplify if op0 has a different mode from M, but that's
> > > too restrictive and prevents the optimization below.  The condition can be
> > > relaxed so that op0 only needs to have the same inner mode as M.
> > >
> > > (set (reg:V2DF 87 [ xx ])
> > >  (vec_concat:V2DF (vec_select:DF (reg:V4DF 92)
> > >  (parallel [
> > >  (const_int 2 [0x2])
> > >  ]))
> > >  (vec_select:DF (reg:V4DF 92)
> > >  (parallel [
> > >  (const_int 3 [0x3])
> > >  ]
> > >
> > > Bootstrapped and regtested on x86_64-linux-gnu{-m32,}.
> > > Ok for trunk?
> > >
> > > gcc/ChangeLog:
> > >
> > >   * simplify-rtx.c
> > >   (simplify_context::simplify_binary_operation_1): Relax
> > >   condition of simplifying (vec_concat:M (vec_select op0
> > >   index0)(vec_select op1 index1)) to allow different modes
> > >   between op0 and M, but have same inner mode.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > >   * gcc.target/i386/vect-rebuild.c:
> > >   * gcc.target/i386/avx512f-vect-rebuild.c: New test.
> > Funny, I was looking at something rather similar recently, but never
> > pushed on it because we were going to need too many entries in the
> > parallel selector.
> >
> > I'm not convinced that we need the inner mode to match anything.  As
> > long as the vec_concat's mode is twice the size of the vec_select modes
> > and the vec_select mode is <= the mode of its operands ISTM this is
> > fine.   We  might want the modes of the vec_select to match, but I don't
> > think that's strictly necessary either, they just need to be the same
> > size.  ie, we could have something like
> If they're different sizes, i.e, something like below should also be legal?
> (vec_concat:V8SF (vec_select:V2SF (reg:V16SF)) (vec_select:V6SF (reg:V16SF)))
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DI (reg:V4DI)))
> >
> > I'm not sure if that level of generality is useful though.  If we want
> > the modes of the vec_selects to match I think we could still support
> >
> > (vec_concat:V2DF (vec_select:DF (reg:V4DF)) (vec_select:DF (reg:V8DF)))
> >
> > Thoughts?
> >
> > jeff
> >
> > Jeff
> >
> >
> > > ---
> > >   gcc/simplify-rtx.c|  3 ++-
> > >   .../gcc.target/i386/avx512f-vect-rebuild.c| 21 +++
> > >   gcc/testsuite/gcc.target/i386/vect-rebuild.c  |  2 +-
> > >   3 files changed, 24 insertions(+), 2 deletions(-)
> > >   create mode 100644 gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > >
> > > diff --git a/gcc/simplify-rtx.c b/gcc/simplify-rtx.c
> > > index ebad5cb5a79..16286befd79 100644
> > > --- a/gcc/simplify-rtx.c
> > > +++ b/gcc/simplify-rtx.c
> > > @@ -4587,7 +4587,8 @@ simplify_context::simplify_binary_operation_1 
> > > (rtx_code code,
> > >   if (GET_CODE (trueop0) == VEC_SELECT
> > >   && GET_CODE (trueop1) == VEC_SELECT
> > >   && rtx_equal_p (XEXP (trueop0, 0), XEXP (trueop1, 0))
> > > - && GET_MODE (XEXP (trueop0, 0)) == mode)
> > > + && GET_MODE_INNER (GET_MODE (XEXP (trueop0, 0)))
> > > +== GET_MODE_INNER(mode))
> > > {
> > >   rtx par0 = XEXP (trueop0, 1);
> > >   rtx par1 = XEXP (trueop1, 1);
> > > diff --git a/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c 
> > > b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > new file mode 100644
> > > index 000..aef6855aa46
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.target/i386/avx512f-vect-rebuild.c
> > > @@ -0,0 +1,21 @@
> > > +/* { dg-do compile } */
> > > +/* { dg-options "-O -mavx512vl -mavx512dq -fno-tree-forwprop" } */
> > > +
> > > +typedef double v2df __attribute__ ((__vector_size__ (16)));
> > > +typedef double v4df __attribute__ ((__vector_size__ (32)));
> > > +
> > > +v2df h (v4df x)
> > > +{
> > > +  v2df xx = { x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +v4df f2 (v4df x)
> > > +{
> > > +  v4df xx = { x[0], x[1], x[2], x[3] };
> > > +  return xx;
> > > +}
> > > +
> > > +/* { dg-final { scan-assembler-not "unpck" } } */
> > > +/* { dg-final { scan-assembler-not "valign" } } */
> > > +/* { dg-final { scan-assembler-times "\tv?extract(?:f128|f64x2)\[ \t\]" 
> > > 1 } } */
> > > diff --git a/gcc/testsuite/gcc.target/i386/vect-rebuild.c 
> > > b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > index 570967f6b5c..8e85b98bf1d 100644
> > > --- a/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > +++ b/gcc/testsuite/gcc.target/i386/vect-rebuild.c
> > > @@ -30,4 +30,4 @@ v2df h (v4df x)
> > >
> > >   /* { dg-final { scan-assembler-not "unpck" } } */
> > >   /* { 

[PATCH] top-level: merge Makefile.def patches from binutils-gdb repository

2021-09-24 Thread Andrew Burgess
This commit back-ports two patches to Makefile.def from the
binutils-gdb repository.  These patches were committed over there
without first being merged in to the gcc repository.

These commits all relate to dependencies for binutils-gdb modules, so
they should have no impact on gcc.  I tested a gcc build/install on
x86-64 GNU/Linux, and everything looked OK.

The two patches being backported are binutils-gdb commits:

  commit ba4d88ad892fe29c6ca7938c8861f8edef5f7a3f (gdb-gnulib-issues)
  Date:   Mon Oct 12 16:04:32 2020 +0100

  gdb/gdbserver: add dependencies for distclean-gnulib

And

  commit 755ba58ebef02e1be9fc6770d00243ba6ed0223c
  Date:   Thu Mar 18 12:37:52 2021 +

  Add install dependencies for ld -> bfd and libctf -> bfd

OK to merge?

2021-09-07  Andrew Burgess  

Merge from binutils-gdb:
2021-09-08  Nick Alcock  

PR libctf/27482
* Makefile.def: Add install-bfd dependencies for install-libctf and
install-ld, and install-strip-bfd dependencies for
install-strip-libctf and install-strip-ld; move the install-ld
dependency on install-libctf to join it.
* Makefile.in: Regenerated.

And:
2020-10-14  Andrew Burgess  

* Makefile.in: Rebuild.
* Makefile.def: Make distclean-gnulib depend on distclean-gdb and
distclean-gdbserver.
---
 ChangeLog| 19 +++
 Makefile.def | 14 ++
 Makefile.in  |  8 
 3 files changed, 41 insertions(+)

diff --git a/Makefile.def b/Makefile.def
index de3e0052106..143a6b469b2 100644
--- a/Makefile.def
+++ b/Makefile.def
@@ -471,6 +471,14 @@ dependencies = { module=all-ld; on=all-libctf; };
 dependencies = { module=install-binutils; on=install-opcodes; };
 dependencies = { module=install-strip-binutils; on=install-strip-opcodes; };
 
+// Likewise for ld, libctf, and bfd.
+dependencies = { module=install-libctf; on=install-bfd; };
+dependencies = { module=install-ld; on=install-bfd; };
+dependencies = { module=install-ld; on=install-libctf; };
+dependencies = { module=install-strip-libctf; on=install-strip-bfd; };
+dependencies = { module=install-strip-ld; on=install-strip-bfd; };
+dependencies = { module=install-strip-ld; on=install-strip-libctf; };
+
 // libopcodes depends on libbfd
 dependencies = { module=install-opcodes; on=install-bfd; };
 dependencies = { module=install-strip-opcodes; on=install-strip-bfd; };
@@ -564,6 +572,12 @@ dependencies = { module=configure-libctf; on=all-zlib; };
 dependencies = { module=configure-libctf; on=all-libiconv; };
 dependencies = { module=check-libctf; on=all-ld; };
 
+// The Makefiles in gdb and gdbserver pull in a file that configure
+// generates in the gnulib directory, so distclean gnulib only after
+// gdb and gdbserver.
+dependencies = { module=distclean-gnulib; on=distclean-gdb; };
+dependencies = { module=distclean-gnulib; on=distclean-gdbserver; };
+
 // Warning, these are not well tested.
 dependencies = { module=all-bison; on=all-intl; };
 dependencies = { module=all-bison; on=all-build-texinfo; };
diff --git a/Makefile.in b/Makefile.in
index 61af99dc75a..7613da5a378 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -60763,6 +60763,12 @@ all-stageautoprofile-ld: maybe-all-stageautoprofile-libctf
 all-stageautofeedback-ld: maybe-all-stageautofeedback-libctf
 install-binutils: maybe-install-opcodes
 install-strip-binutils: maybe-install-strip-opcodes
+install-libctf: maybe-install-bfd
+install-ld: maybe-install-bfd
+install-ld: maybe-install-libctf
+install-strip-libctf: maybe-install-strip-bfd
+install-strip-ld: maybe-install-strip-bfd
+install-strip-ld: maybe-install-strip-libctf
 install-opcodes: maybe-install-bfd
 install-strip-opcodes: maybe-install-strip-bfd
 configure-gas: maybe-configure-intl
@@ -61131,6 +61137,8 @@ check-stagetrain-libctf: maybe-all-stagetrain-ld
 check-stagefeedback-libctf: maybe-all-stagefeedback-ld
 check-stageautoprofile-libctf: maybe-all-stageautoprofile-ld
 check-stageautofeedback-libctf: maybe-all-stageautofeedback-ld
+distclean-gnulib: maybe-distclean-gdb
+distclean-gnulib: maybe-distclean-gdbserver
 all-bison: maybe-all-build-texinfo
 all-flex: maybe-all-build-bison
 all-flex: maybe-all-m4
-- 
2.25.4



Re: [PATCH v2 2/3] reassoc: Propagate PHI_LOOP_BIAS along single uses

2021-09-24 Thread Ilya Leoshkevich via Gcc-patches
On Thu, 2021-09-23 at 13:55 +0200, Richard Biener wrote:
> On Wed, 22 Sep 2021, Ilya Leoshkevich wrote:
> 
> > PR tree-optimization/49749 introduced code that shortens dependency
> > chains containing loop accumulators by placing them last on operand
> > lists of associative operations.
> > 
> > 456.hmmer benchmark on s390 could benefit from this, however, the code
> > that needs it modifies loop accumulator before using it, and since only
> > so-called loop-carried phis are treated as loop accumulators, the
> > code in the present form doesn't really help.   According to Bill
> > Schmidt - the original author - such a conservative approach was chosen
> > so as to avoid unnecessarily swapping operands, which might cause
> > unpredictable effects.  However, giving special treatment to forms of
> > loop accumulators is acceptable.
> > 
> > The definition of loop-carried phi is: it's a single-use phi, which is
> > used in the same innermost loop it's defined in, at least one argument
> > of which is defined in the same innermost loop as the phi itself.
> > Given this, it seems natural to treat single uses of such phis as phis
> > themselves.
> > 
> > gcc/ChangeLog:
> > 
> > * tree-ssa-reassoc.c (biased_names): New global.
> > (propagate_bias_p): New function.
> > (loop_carried_phi): Remove.
> > (propagate_rank): Propagate bias along single uses.
> > (get_rank): Update biased_names when needed.
> > ---
> >  gcc/tree-ssa-reassoc.c | 97 --
> > 
> >  1 file changed, 64 insertions(+), 33 deletions(-)
> > 
> > diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
> > index 420c14e8cf5..2f7a8882aac 100644
> > --- a/gcc/tree-ssa-reassoc.c
> > +++ b/gcc/tree-ssa-reassoc.c
> > @@ -211,6 +211,10 @@ static int64_t *bb_rank;
> >  /* Operand->rank hashtable.  */
> >  static hash_map *operand_rank;
> >  
> > +/* SSA_NAMEs that are forms of loop accumulators and whose ranks
> > need to be
> > +   biased.  */
> > +static auto_bitmap biased_names;
> > +
> >  /* Vector of SSA_NAMEs on which after reassociate_bb is done with
> >     all basic blocks the CFG should be adjusted - basic blocks
> >     split right after that SSA_NAME's definition statement and
> > before
> > @@ -256,6 +260,50 @@ reassoc_remove_stmt (gimple_stmt_iterator
> > *gsi)
> >     the rank difference between two blocks.  */
> >  #define PHI_LOOP_BIAS (1 << 15)
> >  
> > +/* Return TRUE iff PHI_LOOP_BIAS should be propagated from one of
> > the STMT's
> > +   operands to the STMT's left-hand side.  The goal is to preserve
> > bias in code
> > +   like this:
> > +
> > + x_1 = phi(x_0, x_2)
> > + a = x_1 | 1
> > + b = a ^ 2
> > + .MEM = b
> > + c = b + d
> > + x_2 = c + e
> > +
> > +   That is, we need to preserve bias along single-use chains
> > originating from
> > +   loop-carried phis.  Only GIMPLE_ASSIGNs to SSA_NAMEs are
> > considered to be
> > +   uses, because only they participate in rank propagation.  */
> > +static bool
> > +propagate_bias_p (gimple *stmt)
> > +{
> > +  use_operand_p use;
> > +  imm_use_iterator use_iter;
> > +  gimple *single_use_stmt = NULL;
> > +
> > +  FOR_EACH_IMM_USE_FAST (use, use_iter, gimple_assign_lhs (stmt))
> > +    {
> > +  gimple *current_use_stmt = USE_STMT (use);
> > +
> > +  if (is_gimple_assign (current_use_stmt)
> > + && TREE_CODE (gimple_assign_lhs (current_use_stmt)) ==
> > SSA_NAME)
> > +   {
> > + if (single_use_stmt != NULL)
> 
> what if single_use_stmt == current_use_stmt?  We might have two
> uses on a stmt after all - should that still be biased?  I guess not
> and thus the check is correct?

Come to think of it, it should be ok to bias it.  Things like
x = x + x are fine (this particular case can be transformed into
something else earlier, but I think the overall point still holds).
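
A toy model of the check being discussed, assuming it is relaxed so that
several uses inside one and the same statement (as in x = x + x) still count
as a single use. This is only a sketch: Use, stmt_id and has_single_use_stmt
are hypothetical stand-ins rather than GCC's immediate-use API, and the
loop_father comparison from the patch is left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for one use of an SSA name: the id of the statement
// that uses it, and whether that statement is an assignment to an SSA name.
struct Use {
  int stmt_id;
  bool assigns_ssa_name;
};

// Return true iff there is at least one qualifying use and all qualifying
// uses belong to one and the same statement.
bool has_single_use_stmt(const std::vector<Use>& uses) {
  int single = -1;  // -1 means no qualifying use has been seen yet
  for (const Use& u : uses) {
    if (!u.assigns_ssa_name)
      continue;  // such uses do not take part in rank propagation
    if (single != -1 && single != u.stmt_id)
      return false;  // a second, different statement uses the name
    single = u.stmt_id;
  }
  return single != -1;
}
```

With this variant a name used twice by statement 7 is accepted, while a name
used by statements 7 and 8 is rejected.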
> 
> > +   return false;
> > + single_use_stmt = current_use_stmt;
> > +   }
> > +    }
> > +
> > +  if (single_use_stmt == NULL)
> > +    return false;
> > +
> > +  if (gimple_bb (stmt)->loop_father
> > +  != gimple_bb (single_use_stmt)->loop_father)
> > +    return false;
> > +
> > +  return true;
> > +}
> > +
> >  /* Rank assigned to a phi statement.  If STMT is a loop-carried
> > phi of
> >     an innermost loop, and the phi has only a single use which is
> > inside
> >     the loop, then the rank is the block rank of the loop latch
> > plus an
> > @@ -313,46 +361,23 @@ phi_rank (gimple *stmt)
> >    return bb_rank[bb->index];
> >  }
> >  
> > -/* If EXP is an SSA_NAME defined by a PHI statement that
> > represents a
> > -   loop-carried dependence of an innermost loop, return TRUE; else
> > -   return FALSE.  */
> > -static bool
> > -loop_carried_phi (tree exp)
> > -{
> > -  gimple *phi_stmt;
> > -  int64_t block_rank;
> > -
> > -  if (TREE_CODE (exp) != SSA_NAME
> > -  || SSA_NAME_IS_DEFAULT_DEF (exp))
> > -    return 

[PATCHv2] top-level configure: setup target_configdirs based on repository

2021-09-24 Thread Andrew Burgess
* Thomas Schwinge  [2021-09-23 11:29:05 +0200]:

> Hi!
> 
> I only had a curious look here; hope that's still useful.
> 
> On 2021-09-22T16:30:42+0100, Andrew Burgess  
> wrote:
> > The top-level configure script is shared between the gcc repository
> > and the binutils-gdb repository.
> >
> > The target_configdirs variable in the configure.ac script, defines
> > sub-directories that contain components that should be built for the
> > target using the target tools.
> >
> > Some components, e.g. zlib, are built as both host and target
> > libraries.
> >
> > This causes problems for binutils-gdb.  If we run 'make all' in the
> > binutils-gdb repository we end up trying to build a target version of
> > the zlib library, which requires the target compiler be available.
> > Often the target compiler isn't immediately available, and so the
> > build fails.
> 
> I did wonder: shouldn't normally these target libraries be masked out via
> 'noconfigdirs' (see 'Handle --disable- generically' section),
> via 'enable_[...]' being set to 'no'?  But I think I now see the problem
> here: the 'enable_[...]' variables guard both the host and target library
> build!  (... if I'm quickly understanding that correctly...)
> 
> ... and you do need the host zlib, thus '$enable_zlib != no'.
> 
> > The problem with zlib impacted a previous attempt to synchronise the
> > top-level configure scripts from gcc to binutils-gdb, see this thread:
> >
> >   https://sourceware.org/pipermail/binutils/2019-May/107094.html
> >
> > And I'm in the process of importing libbacktrace in to binutils-gdb,
> > which is also a host and target library, and triggers the same issues.
> >
> > I believe that for binutils-gdb, at least at the moment, there are no
> > target libraries that we need to build.
> >
> > My proposal then is to make the value of target_libraries change based
> > on which repository we are building in.  Specifically, if the source
> > tree has a gcc/ directory then we should set the target_libraries
> > variable, otherwise this variable is left empty.
> >
> > I think that if someone tries to create a single unified tree (gcc +
> > binutils-gdb in a single source tree) and then build, this change will
> > not have a negative impact, the tree still has gcc/ so we'd expect the
> > target compiler to be built, which means building the target_libraries
> > should work just fine.
> >
> > However, if the source tree lacks gcc/ then we assume the target
> > compiler isn't built/available, and so target_libraries shouldn't be
> > built.
> >
> > There is already precedent within configure.ac for check on the
> > existence of gcc/ in the source tree, see the handling of
> > -enable-werror around line 3658.
> 
> (I understand that one to just guard the 'cat $srcdir/gcc/DEV-PHASE',
> though.)
> 
> > I've tested a build of gcc on x86-64, and the same set of target
> > libraries still seem to get built.  On binutils-gdb this change
> > resolves the issues with 'make all'.
> >
> > Any thoughts?
> 
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -180,9 +180,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo ${host_libs} ${host_tools}`
> > -target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +if test -d ${srcdir}/gcc; then
> > +  target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +else
> > +  target_configdirs=""
> > +fi
> >  build_configdirs=`echo ${build_libs} ${build_tools}`
> 
> What I see is that after this, there are still occasions where inside
> 'case "${target}"', 'target_configdirs' gets amended, so those won't be
> caught by your approach?

Good point, I'd failed to spot these.

> 
> Instead of erasing 'target_configdirs' as you've posted, and
> understanding that we can't just instead add all the "offending" ones to
> 'noconfigdirs' for '! test -d "$srcdir"/gcc/' (because that would also
> disable them for host usage),

Great idea, this is what I've done in the revised patch below.

> I wonder if it'd make sense to turn all
> existing 'target_libraries=[...]' and 'target_tools=[...]' assignments
> and later amendments into '[...]_gcc=[...]' variants, with potentially
> further variants existing -- but probably not, because won't you always
> need the target GCC to be able to build target libraries ;-) -- and then,
> where we finally evaluate '$target_libraries' and '$target_tools', only
> evaluate the '[...]_gcc' variants 

Re: [PATCH] top-level configure: setup target_configdirs based on repository

2021-09-24 Thread Andrew Burgess
* Richard Biener  [2021-09-23 10:53:16 +0200]:

> On Wed, Sep 22, 2021 at 5:47 PM Andrew Burgess
>  wrote:
> >
> > The top-level configure script is shared between the gcc repository
> > and the binutils-gdb repository.
> >
> > The target_configdirs variable in the configure.ac script, defines
> > sub-directories that contain components that should be built for the
> > target using the target tools.
> >
> > Some components, e.g. zlib, are built as both host and target
> > libraries.
> >
> > This causes problems for binutils-gdb.  If we run 'make all' in the
> > binutils-gdb repository we end up trying to build a target version of
> > the zlib library, which requires the target compiler be available.
> > Often the target compiler isn't immediately available, and so the
> > build fails.
> >
> > The problem with zlib impacted a previous attempt to synchronise the
> > top-level configure scripts from gcc to binutils-gdb, see this thread:
> >
> >   https://sourceware.org/pipermail/binutils/2019-May/107094.html
> >
> > And I'm in the process of importing libbacktrace in to binutils-gdb,
> > which is also a host and target library, and triggers the same issues.
> >
> > I believe that for binutils-gdb, at least at the moment, there are no
> > target libraries that we need to build.
> >
> > My proposal then is to make the value of target_libraries change based
> > on which repository we are building in.  Specifically, if the source
> > tree has a gcc/ directory then we should set the target_libraries
> > variable, otherwise this variable is left empty.
> >
> > I think that if someone tries to create a single unified tree (gcc +
> > binutils-gdb in a single source tree) and then build, this change will
> > not have a negative impact, the tree still has gcc/ so we'd expect the
> > target compiler to be built, which means building the target_libraries
> > should work just fine.
> >
> > However, if the source tree lacks gcc/ then we assume the target
> > compiler isn't built/available, and so target_libraries shouldn't be
> > built.
> >
> > There is already precedent within configure.ac for check on the
> > existence of gcc/ in the source tree, see the handling of
> > -enable-werror around line 3658.
> >
> > I've tested a build of gcc on x86-64, and the same set of target
> > libraries still seem to get built.  On binutils-gdb this change
> > resolves the issues with 'make all'.
> >
> > Any thoughts?
> 
> Hmm, why not use make all-binutils instead?

That absolutely would work, but sucks when I have to say 'make
all-binutils all-gas all-ld all-gdb' when 'make all' used to work.

>  Otherwise this does
> look like a reasonable thing to do.

Thanks.  I'm reworking things anyway based on Thomas's feedback.

Andrew


> 
> Richard.
> 
> > ChangeLog:
> >
> > * configure: Regenerate.
> > * configure.ac (target_configdirs): Only set this when building
> > within the gcc repository.
> > ---
> >  ChangeLog|  6 ++
> >  configure| 12 ++--
> >  configure.ac | 12 ++--
> >  3 files changed, 26 insertions(+), 4 deletions(-)
> >
> > diff --git a/configure b/configure
> > index 85ab9915402..3ef5c2b553f 100755
> > --- a/configure
> > +++ b/configure
> > @@ -2849,9 +2849,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo ${host_libs} ${host_tools}`
> > -target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +if test -d ${srcdir}/gcc; then
> > +  target_configdirs=`echo ${target_libraries} ${target_tools}`
> > +else
> > +  target_configdirs=""
> > +fi
> >  build_configdirs=`echo ${build_libs} ${build_tools}`
> >
> >
> > diff --git a/configure.ac b/configure.ac
> > index 1df038b04f3..d1217e3f886 100644
> > --- a/configure.ac
> > +++ b/configure.ac
> > @@ -180,9 +180,17 @@ target_tools="target-rda"
> >  ## We assign ${configdirs} this way to remove all embedded newlines.  This
> >  ## is important because configure will choke if they ever get through.
> >  ## ${configdirs} is directories we build using the host tools.
> > -## ${target_configdirs} is directories we build using the target tools.
> > +##
> > +## ${target_configdirs} is directories we build using the target
> > +## tools, these are only needed when working in the gcc tree.  This
> > +## file is also reused in the binutils-gdb tree, where building any
> > +## target stuff doesn't make sense.
> >  configdirs=`echo 

Re: [PATCH] combine: Check for paradoxical subreg

2021-09-24 Thread Robin Dapp via Gcc-patches

Hi,

pinging this patch:

https://gcc.gnu.org/pipermail/gcc-patches/2021-June/573509.html

It introduces a check for a paradoxical subreg in combine that would ICE 
otherwise.


Regards
 Robin


Re: [PATCH, Fortran] Add missing diagnostic for F2018 C711 (TS29113 C407c)

2021-09-24 Thread Tobias Burnus

On 24.09.21 01:19, Sandra Loosemore wrote:

> Here's another missing-diagnostic patch for the Fortran front end,
> this time for PR Fortran/101333.  OK to commit?


That's "C711 An assumed-type actual argument that corresponds to an
assumed-rank dummy argument shall be assumed-shape or assumed-rank."

LGTM.

Thanks for the patch!

Tobias


commit 53171e748e28901693ca4362ff658883dab97e13
Author: Sandra Loosemore
Date:   Thu Sep 23 15:00:43 2021 -0700

 Fortran: Add missing diagnostic for F2018 C711 (TS29113 C407c)

 2021-09-23  Sandra Loosemore

  PR Fortran/101333

 gcc/fortran/
  * interface.c (compare_parameter): Enforce F2018 C711.

 gcc/testsuite/
  * gfortran.dg/c-interop/c407c-1.f90: Remove xfails.

diff --git a/gcc/fortran/interface.c b/gcc/fortran/interface.c
index dae4b95..a2fea0e 100644
--- a/gcc/fortran/interface.c
+++ b/gcc/fortran/interface.c
@@ -2448,6 +2448,21 @@ compare_parameter (gfc_symbol *formal, gfc_expr *actual,
return false;
  }

+  /* TS29113 C407c; F2018 C711.  */
+  if (actual->ts.type == BT_ASSUMED
+  && symbol_rank (formal) == -1
+  && actual->rank != -1
+  && !(actual->symtree->n.sym->as
+&& actual->symtree->n.sym->as->type == AS_ASSUMED_SHAPE))
+{
+  if (where)
+ gfc_error ("Assumed-type actual argument at %L corresponding to "
+"assumed-rank dummy argument %qs must be "
+"assumed-shape or assumed-rank",
+&actual->where, formal->name);
+  return false;
+}
+
/* F2008, 12.5.2.5; IR F08/0073.  */
if (formal->ts.type == BT_CLASS && formal->attr.class_ok
&& actual->expr_type != EXPR_NULL
diff --git a/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90 
b/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
index e4da66a..c77e6ac 100644
--- a/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
+++ b/gcc/testsuite/gfortran.dg/c-interop/c407c-1.f90
@@ -44,7 +44,7 @@ subroutine s2 (x)
implicit none
type(*) :: x(*)

-  call g (x, 1)  ! { dg-error "Assumed.type" "pr101333" { xfail *-*-* } }
+  call g (x, 1)  ! { dg-error "Assumed.type" }
  end subroutine

  ! Check that a scalar gives an error.
@@ -53,7 +53,7 @@ subroutine s3 (x)
implicit none
type(*) :: x

-  call g (x, 1)  ! { dg-error "Assumed.type" "pr101333" { xfail *-*-* } }
+  call g (x, 1)  ! { dg-error "Assumed.type" }
  end subroutine

  ! Explicit-shape assumed-type actual arguments are forbidden implicitly

-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
Munich; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: Munich; Registration court: Munich,
HRB 106955


Re: Fortran: Improve file-reading error diagnostic [PR55534] (was: Re: [Patch] Fortran: Improve -Wmissing-include-dirs warnings [PR55534])

2021-09-24 Thread Tobias Burnus

On 23.09.21 23:01, Harald Anlauf via Fortran wrote:


> compiled with -cpp gives:
>
> pr55534-play.f90:4:2:
>
> 4 |   type t
>   |  1~~
> Fatal Error: no/such/file.inc: No such file or directory
> compilation terminated.
>
> If you have an easy solution for that one,


David has an easy but hackish solution, cf. https://gcc.gnu.org/PR100904

That's a GCC 7 regression, which also affects C/C++ but only when
compiling with -traditional-cpp, which gfortran does by default but
gcc/g++ don't.

Tobias



[PATCH] Verify unallocated edge/BB flags are clear

2021-09-24 Thread Richard Biener via Gcc-patches
This adds verification that auto_{edge,bb}_flag bits which are no longer
allocated do not remain set, but have been correctly cleared by their
consumers.  The intent is that those flags can be cheaply used on a
smaller IL region, so that clearing them afterwards can be restricted to
the same small region as well.
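
As a rough illustration of the invariant being verified, here is a small
self-contained sketch.  FlagPool and has_stale_flag are hypothetical names
invented for this example and only mimic the idea of a mask of allocated
flag bits; they are not the auto_{edge,bb}_flag implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical miniature flag allocator: ALLOCATED is the mask of bits that
// are currently handed out.
struct FlagPool {
  std::uint32_t allocated = 0;

  // Hand out the lowest free bit, or 0 if every bit is taken.
  std::uint32_t alloc() {
    for (std::uint32_t bit = 1; bit != 0; bit <<= 1)
      if (!(allocated & bit)) {
        allocated |= bit;
        return bit;
      }
    return 0;
  }

  // Give a bit back.  The consumer is expected to have cleared it on every
  // object it marked before doing so.
  void release(std::uint32_t bit) { allocated &= ~bit; }
};

// True if FLAGS has a bit set that is not in the ALLOCATED mask, which is
// the same shape of test as `flags & ~flags_allocated`.
constexpr bool has_stale_flag(std::uint32_t flags, std::uint32_t allocated) {
  return (flags & ~allocated) != 0;
}
```

A consumer that releases its bit without clearing it on the objects it marked
would then be caught by has_stale_flag on the next verification.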

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

2021-09-24  Richard Biener  

* cfghooks.c (verify_flow_info): Verify unallocated BB and
edge flags are not set.
---
 gcc/cfghooks.c | 13 +
 1 file changed, 13 insertions(+)

diff --git a/gcc/cfghooks.c b/gcc/cfghooks.c
index 50b9b177639..6446e16ca8c 100644
--- a/gcc/cfghooks.c
+++ b/gcc/cfghooks.c
@@ -161,6 +161,12 @@ verify_flow_info (void)
  err = 1;
}
 
+  if (bb->flags & ~cfun->cfg->bb_flags_allocated)
+   {
+ error ("verify_flow_info: unallocated flag set on BB %d", bb->index);
+ err = 1;
+   }
+
   FOR_EACH_EDGE (e, ei, bb->succs)
{
  if (last_visited [e->dest->index] == bb)
@@ -202,6 +208,13 @@ verify_flow_info (void)
  err = 1;
}
 
+ if (e->flags & ~cfun->cfg->edge_flags_allocated)
+   {
+ error ("verify_flow_info: unallocated edge flag set on %d -> %d",
+e->src->index, e->dest->index);
+ err = 1;
+   }
+
  edge_checksum[e->dest->index] += (size_t) e;
}
   if (n_fallthru > 1)
-- 
2.31.1


Re: [PATCH] real: fix encoding of negative IEEE double/quad values [PR98216]

2021-09-24 Thread Richard Biener via Gcc-patches
On Thu, Sep 23, 2021 at 10:44 PM Patrick Palka via Gcc-patches wrote:
>
> In encode_ieee_double/quad, the assignment
>
>   unsigned long VAL = r->sign << 31;
>
> is intended to set the 31st bit of VAL whenever the given REAL_CST is
> negative.  But on LP64 hosts it also unintentionally sets the upper 32
> bits of VAL due to the promotion of r->sign from unsigned:1 to int and
> the subsequent sign extension of the shifted value from int to long.
>
> In the C++ frontend, this bug causes incorrect mangling of negative
> double values due to the output of real_to_target during write_real_cst
> unexpectedly having the upper 32 bits of each word set.  (I'm not sure
> if/how this bug manifests itself outside of the frontend..)
>
> This patch fixes this by avoiding the unwanted sign extension.  Note
> that r0-53976 fixed the same bug in encode_ieee_single long ago.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
> trunk and perhaps the release branches?

OK for trunk and branches.

Thanks,
Richard.

> PR c++/98216
> PR c++/91292
>
> gcc/ChangeLog:
>
> * real.c (encode_ieee_double): Avoid incorrect sign extension.
> (encode_ieee_quad): Likewise.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp2a/nontype-float2.C: New test.
> ---
>  gcc/real.c  |  6 --
>  gcc/testsuite/g++.dg/cpp2a/nontype-float2.C | 13 +
>  2 files changed, 17 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
>
> diff --git a/gcc/real.c b/gcc/real.c
> index 555cf44c142..8c7a47a69e6 100644
> --- a/gcc/real.c
> +++ b/gcc/real.c
> @@ -3150,9 +3150,10 @@ encode_ieee_double (const struct real_format *fmt, long *buf,
> const REAL_VALUE_TYPE *r)
>  {
>unsigned long image_lo, image_hi, sig_lo, sig_hi, exp;
> +  unsigned long sign = r->sign;
>bool denormal = (r->sig[SIGSZ-1] & SIG_MSB) == 0;
>
> -  image_hi = r->sign << 31;
> +  image_hi = sign << 31;
>image_lo = 0;
>
>if (HOST_BITS_PER_LONG == 64)
> @@ -3938,10 +3939,11 @@ encode_ieee_quad (const struct real_format *fmt, long *buf,
>   const REAL_VALUE_TYPE *r)
>  {
>unsigned long image3, image2, image1, image0, exp;
> +  unsigned long sign = r->sign;
>bool denormal = (r->sig[SIGSZ-1] & SIG_MSB) == 0;
>REAL_VALUE_TYPE u;
>
> -  image3 = r->sign << 31;
> +  image3 = sign << 31;
>image2 = 0;
>image1 = 0;
>image0 = 0;
> diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C b/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
> new file mode 100644
> index 000..5db208a05d1
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp2a/nontype-float2.C
> @@ -0,0 +1,13 @@
> +// PR c++/98216
> +// { dg-do compile { target c++20 } }
> +
> +template<auto> void f() { }
> +
> +template void f<-1.0f>();
> +template void f<-2.0f>();
> +
> +template void f<-1.0>();
> +template void f<-2.0>();
> +
> +template void f<-1.0L>();
> +template void f<-2.0L>();
> --
> 2.33.0.514.g99c99ed825
>
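
The sign-extension problem described in the quoted patch can be
reproduced in a few lines outside GCC.  The sketch below is an
illustration, not the code in real.c: struct toy_real and both function
names are hypothetical, a 32-bit int is assumed, and the buggy variant
spells out the int intermediate explicitly because left-shifting a 1
into the sign bit of an int is not defined by ISO C.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical stand-in for REAL_VALUE_TYPE: only the 1-bit sign field.  */
struct toy_real { unsigned int sign : 1; };

/* Models the old code.  r->sign promotes to int, so "r->sign << 31" is
   evaluated in int; with a 32-bit int, GCC yields INT_MIN for a set
   sign.  Converting that negative int to unsigned long sign-extends it
   wherever long is wider than int.  */
unsigned long
toy_sign_buggy (const struct toy_real *r)
{
  int shifted = r->sign ? INT_MIN : 0;
  return (unsigned long) shifted;
}

/* Models the fix: widen to unsigned long first, then shift.  */
unsigned long
toy_sign_fixed (const struct toy_real *r)
{
  unsigned long sign = r->sign;
  return sign << 31;
}
```

On an LP64 host, toy_sign_buggy returns 0xffffffff80000000 for a
negative value, while toy_sign_fixed returns 0x80000000, which is the
value the encoder intended.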


Re: [RFC] Don't move cold code out of loop by checking bb count

2021-09-24 Thread Xionghu Luo via Gcc-patches
Updated the patch to v3.  I am not sure whether you prefer this pasted
style that continues to link the previous thread, as Segher dislikes it...


[PATCH v3] Don't move cold code out of loop by checking bb count


Changes:
1. Handle max_loop in determine_max_movement instead of
outermost_invariant_loop.
2. Remove unnecessary changes.
3. Add for_all_locs_in_loop (loop, ref, ref_in_loop_hot_body) in can_sm_ref_p.
4. Keep "gsi_next ();" in move_computations_worker: the iterator update
was actually missing, which caused an infinite loop when implementing
v1.

v1: https://gcc.gnu.org/pipermail/gcc-patches/2021-August/576488.html
v2: https://gcc.gnu.org/pipermail/gcc-patches/2021-September/579086.html

There was an earlier patch that tried to avoid moving cold blocks out of loops:

https://gcc.gnu.org/pipermail/gcc/2014-November/215551.html

Richard suggested to "never hoist anything from a bb with lower execution
frequency to a bb with higher one in LIM invariantness_dom_walker
before_dom_children".

In the GIMPLE LIM analysis, add find_coldest_out_loop to move invariants
to the expected target loop: if the profile count of the loop bb is colder
than that of the target loop's preheader, the statement won't be hoisted
out of the loop.  Likewise for store motion: if all locations of the REF
in the loop are cold, don't do store motion for it.
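
The selection rule described above can be sketched on plain arrays.
This is a toy model under stated assumptions, not the patch's code:
toy_coldest_out_loop and its parameters are hypothetical, profile counts
are reduced to plain long values, and choosing the minimum preheader
count is this sketch's reading of "coldest".

```c
#include <assert.h>
#include <stddef.h>

/* Toy model, not the patch's code: preheader_count[0] belongs to the
   outermost candidate loop and preheader_count[n - 1] to the innermost
   loop containing the statement.  Returns the index of the loop with
   the coldest preheader, or -1 when the statement's own block is colder
   than the innermost preheader and should therefore stay where it is.  */
int
toy_coldest_out_loop (const long *preheader_count, size_t n,
                      long stmt_bb_count)
{
  assert (n > 0);
  if (stmt_bb_count < preheader_count[n - 1])
    return -1;

  size_t coldest = 0;
  for (size_t i = 1; i < n; i++)
    if (preheader_count[i] < preheader_count[coldest])
      coldest = i;
  return (int) coldest;
}
```

With preheader counts {100, 10, 50} and a statement block count of 60,
the sketch picks index 1; with a block count of 40 it returns -1, so
nothing is hoisted.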

A SPEC2017 performance evaluation on P8LE shows a 1% improvement in the
intrate GEOMEAN and no obvious regression for the others.  Notable gains
are 500.perlbench_r +7.52% (perf shows that function S_regtry of
perlbench is largely improved), 548.exchange2_r +1.98% and
526.blender_r +1.00%.

gcc/ChangeLog:

* loop-invariant.c (find_invariants_bb): Check profile count
before motion.
(find_invariants_body): Add argument.
* tree-ssa-loop-im.c (find_coldest_out_loop): New function.
(determine_max_movement): Use find_coldest_out_loop.
(move_computations_worker): Adjust and fix iteration update.
(execute_sm_exit): Check pointer validness.
(class ref_in_loop_hot_body): New functor.
(ref_in_loop_hot_body::operator): New.
(can_sm_ref_p): Use for_all_locs_in_loop.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/recip-3.c: Adjust.
* gcc.dg/tree-ssa/ssa-lim-18.c: New test.
* gcc.dg/tree-ssa/ssa-lim-19.c: New test.
* gcc.dg/tree-ssa/ssa-lim-20.c: New test.
* gcc.dg/tree-ssa/ssa-lim-21.c: New test.
---
 gcc/loop-invariant.c   | 10 ++--
 gcc/tree-ssa-loop-im.c | 61 --
 gcc/testsuite/gcc.dg/tree-ssa/recip-3.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c | 20 +++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c | 27 ++
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c | 25 +
 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c | 28 ++
 7 files changed, 165 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-18.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-19.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-20.c
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/ssa-lim-21.c

diff --git a/gcc/loop-invariant.c b/gcc/loop-invariant.c
index fca0c2b24be..5c3be7bf0eb 100644
--- a/gcc/loop-invariant.c
+++ b/gcc/loop-invariant.c
@@ -1183,9 +1183,14 @@ find_invariants_insn (rtx_insn *insn, bool always_reached, bool always_executed)
call.  */
 
 static void
-find_invariants_bb (basic_block bb, bool always_reached, bool always_executed)
+find_invariants_bb (class loop *loop, basic_block bb, bool always_reached,
+   bool always_executed)
 {
   rtx_insn *insn;
+  basic_block preheader = loop_preheader_edge (loop)->src;
+
+  if (preheader->count > bb->count)
+return;
 
   FOR_BB_INSNS (bb, insn)
 {
@@ -1214,8 +1219,7 @@ find_invariants_body (class loop *loop, basic_block *body,
   unsigned i;
 
   for (i = 0; i < loop->num_nodes; i++)
-find_invariants_bb (body[i],
-   bitmap_bit_p (always_reached, i),
+find_invariants_bb (loop, body[i], bitmap_bit_p (always_reached, i),
bitmap_bit_p (always_executed, i));
 }
 
diff --git a/gcc/tree-ssa-loop-im.c b/gcc/tree-ssa-loop-im.c
index 4b187c2cdaf..655fab03442 100644
--- a/gcc/tree-ssa-loop-im.c
+++ b/gcc/tree-ssa-loop-im.c
@@ -417,6 +417,28 @@ movement_possibility (gimple *stmt)
   return ret;
 }
 
+/* Find coldest loop between outmost_loop and loop by comparing profile count.  */
+
+static class loop *
+find_coldest_out_loop (class loop *outmost_loop, class loop *loop,
+  basic_block curr_bb)
+{
+  class loop *cold_loop, *min_loop;
+  cold_loop = min_loop = outmost_loop;
+  profile_count min_count = loop_preheader_edge (min_loop)->src->count;
+
+  if (curr_bb && curr_bb->count < loop_preheader_edge (loop)->src->count)
+return NULL;
+
+  while (min_loop != loop)
+{
+  min_loop = superloop_at_depth (loop, loop_depth (min_loop) + 1);
+  if (loop_preheader_edge (min_loop)->src->count < min_count)
+   cold_loop =