RE: [PATCH] i386: Extend cvtps2pd to memory

2022-06-30 Thread Jiang, Haochen via Gcc-patches
> -----Original Message-----
> From: Uros Bizjak 
> Sent: Thursday, June 30, 2022 2:20 PM
> To: Jiang, Haochen 
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> 
> On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang 
> wrote:
> >
> > Hi all,
> >
> > This patch aims to fix the cvtps2pd insn, which should also work on
> > memory operand but currently does not. After this fix, when loop == 2,
> > it will eliminate movq instruction.
> >
> > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> >
> > BRs,
> > Haochen
> >
> > gcc/ChangeLog:
> >
> > PR target/43618
> > * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > (sse2_cvtps2pd_load): Rename extendvsdfv2df2.
> >
> > gcc/testsuite/ChangeLog:
> >
> > PR target/43618
> > * gcc.target/i386/pr43618-1.c: New test.
> 
> This patch could be as simple as:
> 
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 8cd0f617bf3..c331445cb2d 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -9195,7 +9195,7 @@
> (define_insn "extendv2sfv2df2"
>   [(set (match_operand:V2DF 0 "register_operand" "=v")
>(float_extend:V2DF
> - (match_operand:V2SF 1 "register_operand" "v")))]
> + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
>   "TARGET_MMX_WITH_SSE"
>   "%vcvtps2pd\t{%1, %0|%0, %1}"
>   [(set_attr "type" "ssecvt")

We also tested this version, and it works.

The patch looks the way it does because, in the previous insn
sse2_cvtps2pd, the constraint vm and the predicate vector_operand
do not match the actual instruction: the memory operand is V2SF,
not V4SF.

Therefore, we changed the constraint in that insn, which caused another issue:
for a memory operand, it seems that we cannot generate the mask instructions.
So I changed the pattern to follow how extendv2hfv2df2 works.
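
For reference, the test case itself is not shown in this thread. A loop of
roughly the following shape is the kind of code the description ("loop == 2")
refers to. This is only a sketch under that assumption, and the function name
extend2 is hypothetical:

```c
#include <assert.h>

/* Hypothetical stand-in for the loop the patch targets: widen two
   floats read from memory into two doubles.  */
void
extend2 (double *restrict dst, const float *restrict src)
{
  assert (dst && src);
  for (int i = 0; i < 2; i++)
    dst[i] = src[i];   /* float -> double extension */
}
```

With a memory form of cvtps2pd available, the two floats can in principle be
converted straight from memory without a separate movq load. The code actually
generated depends on the compiler version and flags.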

Haochen

> Uros.


Re: [PATCH] i386: Extend cvtps2pd to memory

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen  wrote:
>
> > -----Original Message-----
> > From: Uros Bizjak 
> > Sent: Thursday, June 30, 2022 2:20 PM
> > To: Jiang, Haochen 
> > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> >
> > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang 
> > wrote:
> > >
> > > Hi all,
> > >
> > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > memory operand but currently does not. After this fix, when loop == 2,
> > > it will eliminate movq instruction.
> > >
> > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > >
> > > BRs,
> > > Haochen
> > >
> > > gcc/ChangeLog:
> > >
> > > PR target/43618
> > > * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > (sse2_cvtps2pd_load): Rename extendvsdfv2df2.
> > >
> > > gcc/testsuite/ChangeLog:
> > >
> > > PR target/43618
> > > * gcc.target/i386/pr43618-1.c: New test.
> >
> > This patch could be as simple as:
> >
> > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > index 8cd0f617bf3..c331445cb2d 100644
> > --- a/gcc/config/i386/sse.md
> > +++ b/gcc/config/i386/sse.md
> > @@ -9195,7 +9195,7 @@
> > (define_insn "extendv2sfv2df2"
> >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> >(float_extend:V2DF
> > - (match_operand:V2SF 1 "register_operand" "v")))]
> > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> >   "TARGET_MMX_WITH_SSE"
> >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> >   [(set_attr "type" "ssecvt")
>
> We also tested on this version, it is ok.
>
> The reason why the patch looks like this is because in the previous insn
> sse2_cvtps2pd, the constraint vm and vector_operand
> actually does not match the actual instruction. Memory operand is V2SF,
> not V4SF.
>
> Therefore, we changed the constraint in that insn. Then it caused another 
> issue.
> For memory operand, it seems that we cannot generate those mask instructions.
> So I change the pattern to how extendv2hfv2df2 works.

If you want to change the memory access in sse2_cvtps2pd,
then please see how e.g. v2hiv2di is handled in sse.md. In
addition to two instructions, you will need one define_insn_and_split
with a pre-reload splitter.

Uros,


Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails

2022-06-30 Thread Alexandre Oliva via Gcc-patches
On Jun 27, 2022, Alexandre Oliva  wrote:

> I see two potential ways to avoid this:

Another possibility occurred to me: seeking back to the entry we're
about to remove, before removing it.  Then, POSIX-compliant
implementations will just skip the removed entry and find the next one,
while RTEMS will find the next entry at the spot where the removed entry
used to be.

It is syscall-heavier, and it may invoke O(n^2) behavior for each
directory in remove_all, since prev_pos is quite likely to always hold
the initial offset, requiring scanning past more and more removed
entries after each removal, so I don't submit this formally for
inclusion, but post it FTR.  I've only confirmed that it solves the
problem on RTEMS, passing libstdc++ filesystem test, but I haven't
tested it further.
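
As a standalone illustration of the idea (not the libstdc++ code itself), the
seek-back step can be sketched in plain POSIX C.  The helper names
remove_entries_seekback and demo_remove_all, and the use of /tmp, are
assumptions made for this sketch only:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <dirent.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Unlink every entry of DIRPATH, seeking back to the position of the
   entry just read before removing it.  Returns the number of entries
   removed, or -1 on error.  */
static int
remove_entries_seekback (const char *dirpath)
{
  assert (dirpath != NULL);
  DIR *dirp = opendir (dirpath);
  if (!dirp)
    return -1;

  int removed = 0;
  for (;;)
    {
      long prev_pos = telldir (dirp);  /* position of the entry read next */
      struct dirent *ent = readdir (dirp);
      if (!ent)
        break;
      if (!strcmp (ent->d_name, ".") || !strcmp (ent->d_name, ".."))
        continue;

      /* Copy the name before seeking: seekdir may invalidate ENT.  */
      char path[4096];
      snprintf (path, sizeof path, "%s/%s", dirpath, ent->d_name);
      seekdir (dirp, prev_pos);        /* step back, then remove */
      if (unlink (path) != 0)
        {
          closedir (dirp);
          return -1;
        }
      removed++;
    }
  closedir (dirp);
  return removed;
}

/* Hypothetical driver: create NFILES empty files in a fresh temporary
   directory, remove them with the loop above, then remove the directory.  */
static int
demo_remove_all (int nfiles)
{
  char tmpl[] = "/tmp/seekback-XXXXXX";
  if (!mkdtemp (tmpl))
    return -1;
  for (int i = 0; i < nfiles; i++)
    {
      char path[4096];
      snprintf (path, sizeof path, "%s/f%d", tmpl, i);
      int fd = open (path, O_CREAT | O_WRONLY, 0600);
      if (fd < 0)
        return -1;
      close (fd);
    }
  int removed = remove_entries_seekback (tmpl);
  if (rmdir (tmpl) != 0)
    return -1;
  return removed;
}
```

Every removal costs an extra telldir/seekdir pair here, which is the
syscall overhead mentioned above.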


diff --git a/libstdc++-v3/src/c++17/fs_dir.cc b/libstdc++-v3/src/c++17/fs_dir.cc
index 2258399da2587..43e2d9678eae5 100644
--- a/libstdc++-v3/src/c++17/fs_dir.cc
+++ b/libstdc++-v3/src/c++17/fs_dir.cc
@@ -65,6 +65,7 @@ struct fs::_Dir : _Dir_base
   // Reports errors by setting ec.
   bool advance(bool skip_permission_denied, error_code& ec) noexcept
   {
+prev_pos = posix::telldir(dirp);
 if (const auto entp = _Dir_base::advance(skip_permission_denied, ec))
   {
auto name = path;
@@ -146,6 +147,12 @@ struct fs::_Dir : _Dir_base
   bool
   do_unlink(bool is_directory, error_code& ec) const noexcept
   {
+// On some systems, removing the just-read entry causes the next
+// readdir to skip the entry that comes after it.  That's not
+// POSIX-compliant, but we can work around this problem by moving
+// back to the position of the last-read entry, as if it was to be
+// read again next, before removing it.
+posix::seekdir(dirp, prev_pos);
 #if _GLIBCXX_HAVE_UNLINKAT
 const auto atp = current();
 if (::unlinkat(atp.dir(), atp.path_at_dir(),
@@ -176,6 +183,7 @@ struct fs::_Dir : _Dir_base
 
   fs::path path; // Empty if only using unlinkat with file descr.
   directory_entry  entry;
+  long prev_pos;
 };
 
 namespace
diff --git a/libstdc++-v3/src/filesystem/dir-common.h b/libstdc++-v3/src/filesystem/dir-common.h
index 228fab55afbcf..6174a8ef3c228 100644
--- a/libstdc++-v3/src/filesystem/dir-common.h
+++ b/libstdc++-v3/src/filesystem/dir-common.h
@@ -55,6 +55,8 @@ using char_type = wchar_t;
 using DIR = ::_WDIR;
 using dirent = _wdirent;
 inline DIR* opendir(const wchar_t* path) { return ::_wopendir(path); }
+inline long telldir(DIR* dir) { return ::_wtelldir(dir); }
+inline void seekdir(DIR *dir, long loc) { ::_wseekdir(dir, loc); }
 inline dirent* readdir(DIR* dir) { return ::_wreaddir(dir); }
 inline int closedir(DIR* dir) { return ::_wclosedir(dir); }
 #elif defined _GLIBCXX_HAVE_DIRENT_H
@@ -64,6 +66,8 @@ typedef struct ::dirent dirent;
 using ::opendir;
 using ::readdir;
 using ::closedir;
+using ::telldir;
+using ::seekdir;
 #else
 using char_type = char;
 struct dirent { const char* d_name; };
@@ -71,6 +75,8 @@ struct DIR { };
 inline DIR* opendir(const char*) { return nullptr; }
 inline dirent* readdir(DIR*) { return nullptr; }
 inline int closedir(DIR*) { return -1; }
+inline long telldir(DIR *) { return -1; }
+inline void seekdir(DIR *, long) { }
 #undef _GLIBCXX_HAVE_DIRFD
 #undef _GLIBCXX_HAVE_UNLINKAT
 #endif


-- 
Alexandre Oliva, happy hacker                https://FSFLA.org/blogs/lxo/
   Free Software Activist   GNU Toolchain Engineer
Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about 


Re: [PATCH] libstdc++: retry removal of dir entries if dir removal fails

2022-06-30 Thread Sebastian Huber




On 30/06/2022 09:52, Alexandre Oliva via Gcc-patches wrote:
> On Jun 27, 2022, Alexandre Oliva wrote:
>
> > I see two potential ways to avoid this:
>
> Another possibility occurred to me: seeking back to the entry we're
> about to remove, before removing it.  Then, POSIX-compliant
> implementations will just skip the removed entry and find the next one,
> while RTEMS will find the next entry at the spot where the removed entry
> used to be.
>
> It is syscall-heavier, and it may invoke O(n^2) behavior for each
> directory in remove_all, since prev_pos is quite likely to always hold
> the initial offset, requiring scanning past more and more removed
> entries after each removal, so I don't submit this formally for
> inclusion, but post it FTR.  I've only confirmed that it solves the
> problem on RTEMS, passing libstdc++ filesystem test, but I haven't
> tested it further.


From my point of view, this behaviour is an RTEMS bug. Instead of
adding tweaks for RTEMS, it would be better to report the issues and fix
them in RTEMS. It could also be a Newlib issue.


--
embedded brains GmbH
Herr Sebastian HUBER
Dornierstr. 4
82178 Puchheim
Germany
email: sebastian.hu...@embedded-brains.de
phone: +49-89-18 94 741 - 16
fax:   +49-89-18 94 741 - 08

Registration court: Amtsgericht München
Registration number: HRB 157899
Managing directors authorized to represent the company: Peter Rasmussen, Thomas Dörfler
Our privacy policy can be found here:
https://embedded-brains.de/datenschutzerklaerung/


Re: [Patch] OpenMP: Prepare omp-* for ancestor:1 handling

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Wed, Jun 29, 2022 at 09:54:56PM +0200, Tobias Burnus wrote:
> Currently, this is a rather useless patch - even though it helps to reduce
> the number of local patches I have. Due to the printed sorry, adding a
> testcase with -fdump-tree-* is also not possible, yet.
> 
> For reverse offload, the plan is to call
>   GOMP_target_ext
> inside the on the device, passing
>   'device(omp_initial_device)'
> alias device(GOMP_DEVICE_HOST_FALLBACK) to the
> target device's libgomp.
> 
> The pointer to the generated target-region function is then
> passed as argument. However, that only works if that function
> is not nullified ...
> 
> The reason that nullifying was added is:
>   https://gcc.gnu.org/PR100573
>   https://gcc.gnu.org/r12-1066-g95d67762171f83277a5700b270c0d1e2756f83f4
>   https://gcc.gnu.org/pipermail/gcc-patches/2021-May/571285.html
> 
> 
> Note: Instead of just checking for GOMP_DEVICE_HOST_FALLBACK,
> more effort could be made, e.g. by setting some attribute on the
> generated function and then checking for it (example:
> 'omp target device_ancestor' + using lookup_attribute).
> 
> That's what's done in the second variant.
> 
> OK for mainline (which variant)? Or do you prefer to wait for
> a more complete patch?

So, what is the plan with reverse offload?
Which devices can do it?  I presume we won't bother with intelmic,
can gcn do it and how, can nvptx do it and how?

What we could do is implement it initially (with all the restriction
checking etc. needed) for host fallback only, say no devices support
reverse offload and take out the sorry.

But it would be good to at least have some idea of how it will be implemented
for some offloading devices, whether map will do anything there at all (or
whether it will require unified shared memory), and how we'll map the fn
argument passed to GOMP_target_ext back to the host function address.

Jakub



[PATCH][pushed] remove dead member variable in dom_jt_state

2022-06-30 Thread Martin Liška
Hi.

I'm going to push the following clean-up.

Martin

gcc/ChangeLog:

* tree-ssa-dom.cc (pass_dominator::execute): Remove m_ranger as
it is unused.
---
 gcc/tree-ssa-dom.cc | 8 +++-
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc
index dcaf4672b66..9b6520fd2dd 100644
--- a/gcc/tree-ssa-dom.cc
+++ b/gcc/tree-ssa-dom.cc
@@ -588,9 +588,8 @@ record_edge_info (basic_block bb)
 class dom_jt_state : public jt_state
 {
 public:
-  dom_jt_state (const_and_copies *copies, avail_exprs_stack *avails,
-   gimple_ranger *ranger)
-: m_copies (copies), m_avails (avails), m_ranger (ranger)
+  dom_jt_state (const_and_copies *copies, avail_exprs_stack *avails)
+: m_copies (copies), m_avails (avails)
   {
   }
   void push (edge e) override
@@ -613,7 +612,6 @@ public:
 private:
   const_and_copies *m_copies;
   avail_exprs_stack *m_avails;
-  gimple_ranger *m_ranger;
 };
 
 void
@@ -794,7 +792,7 @@ pass_dominator::execute (function *fun)
   gimple_ranger *ranger = enable_ranger (fun);
   path_range_query path_query (/*resolve=*/true, ranger);
   dom_jt_simplifier simplifier (avail_exprs_stack, ranger, &path_query);
-  dom_jt_state state (const_and_copies, avail_exprs_stack, ranger);
+  dom_jt_state state (const_and_copies, avail_exprs_stack);
   jump_threader threader (&simplifier, &state);
   dom_opt_dom_walker walker (CDI_DOMINATORS,
 &threader,
-- 
2.36.1



[PATCH] Avoid computing RPO for update_ssa

2022-06-30 Thread Richard Biener via Gcc-patches
At some point when domwalk got the ability to use RPO for ordering
dominator children we carefully avoided update_ssa eating the cost
of RPO compute.  Unfortunately some later consolidation of CTORs
lost this again so the following makes this explicit via a special
value to the bb_index_to_rpo argument of domwalk, speeding up
update_ssa again.
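
For readers not familiar with domwalk, the three cases distinguished by the
bb_index_to_rpo argument can be modelled in a few lines of plain C.  This is
only a simplified sketch: the struct, the NO_RPO macro and the function names
are made up for illustration, and the real check additionally tests the
dominance direction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Sentinel meaning "do not order dominator children by RPO at all".  */
#define NO_RPO ((int *) (uintptr_t) -1)

struct walker
{
  bool user_bb_to_rpo;  /* caller passed something other than NULL */
  int *bb_to_rpo;       /* mapping to use, or NULL if there is none */
};

static void
walker_init (struct walker *w, int *bb_index_to_rpo)
{
  assert (w != NULL);
  w->user_bb_to_rpo = bb_index_to_rpo != NULL;
  /* The sentinel is never dereferenced; it is stored as NULL.  */
  w->bb_to_rpo = bb_index_to_rpo == NO_RPO ? NULL : bb_index_to_rpo;
}

/* True if the walk itself has to compute the mapping lazily.  */
static bool
walker_needs_lazy_rpo (const struct walker *w)
{
  return !w->user_bb_to_rpo && !w->bb_to_rpo;
}
```

NULL asks for the lazy computation, NO_RPO suppresses it, and any other
pointer is used as the caller-supplied mapping.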

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* domwalk.h (dom_walker::dom_walker): Update comment to
reflect reality and new special argument value for
bb_index_to_rpo.
* domwalk.cc (dom_walker::dom_walker): Recognize -1
bb_index_to_rpo.
* tree-into-ssa.cc
(rewrite_update_dom_walker::rewrite_update_dom_walker): Tell
dom_walker to not use RPO.
---
 gcc/domwalk.cc   | 6 --
 gcc/domwalk.h| 5 +++--
 gcc/tree-into-ssa.cc | 2 +-
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/gcc/domwalk.cc b/gcc/domwalk.cc
index d633088dff3..e36e9eb8b55 100644
--- a/gcc/domwalk.cc
+++ b/gcc/domwalk.cc
@@ -191,7 +191,8 @@ dom_walker::dom_walker (cdi_direction direction,
 m_reachability (reachability),
 m_user_bb_to_rpo (bb_index_to_rpo != NULL),
 m_unreachable_dom (NULL),
-m_bb_to_rpo (bb_index_to_rpo)
+m_bb_to_rpo (bb_index_to_rpo == (int *)(uintptr_t)-1
+? NULL : bb_index_to_rpo)
 {
 }
 
@@ -272,7 +273,8 @@ void
 dom_walker::walk (basic_block bb)
 {
   /* Compute the basic-block index to RPO mapping lazily.  */
-  if (!m_bb_to_rpo
+  if (!m_user_bb_to_rpo
+  && !m_bb_to_rpo
   && m_dom_direction == CDI_DOMINATORS)
 {
   int *postorder = XNEWVEC (int, n_basic_blocks_for_fn (cfun));
diff --git a/gcc/domwalk.h b/gcc/domwalk.h
index 17ddc1b83df..b71d29400ad 100644
--- a/gcc/domwalk.h
+++ b/gcc/domwalk.h
@@ -62,8 +62,9 @@ public:
 
   /* You can provide a mapping of basic-block index to RPO if you
  have that readily available or you do multiple walks.  If you
- specify NULL as BB_INDEX_TO_RPO dominator children will not be
- walked in RPO order.  */
+ specify NULL as BB_INDEX_TO_RPO this mapping will be computed
+ lazily at walk time.  If you specify -1 dominator children will
+ not be walked in RPO order.  */
   dom_walker (cdi_direction direction, enum reachability = ALL_BLOCKS,
  int *bb_index_to_rpo = NULL);
 
diff --git a/gcc/tree-into-ssa.cc b/gcc/tree-into-ssa.cc
index f9655ce1a28..c4e40e8fb08 100644
--- a/gcc/tree-into-ssa.cc
+++ b/gcc/tree-into-ssa.cc
@@ -2146,7 +2146,7 @@ class rewrite_update_dom_walker : public dom_walker
 {
 public:
   rewrite_update_dom_walker (cdi_direction direction)
-: dom_walker (direction, ALL_BLOCKS, NULL) {}
+: dom_walker (direction, ALL_BLOCKS, (int *)(uintptr_t)-1) {}
 
   edge before_dom_children (basic_block) final override;
   void after_dom_children (basic_block) final override;
-- 
2.35.3


Re: [PATCH 3/3] lto-plugin: implement LDPT_GET_API_VERSION

2022-06-30 Thread Martin Liška
On 6/30/22 08:43, Rui Ueyama wrote:
> Thanks Martin for creating this patch.

You're welcome.

> 
> Here is a preliminary change for the mold side: 
> https://github.com/rui314/mold/commit/9ad49d1c556bc963d06cca8233535183490de605
>  
> 
> 
> Overall the API is looking fine,

Good then!

> though it is not clear what kind of value is expected as a linker version. A 
> linker version is not a single unsigned integer but something like "1.3.0". 
> Something like "1.3.0-rc2" can also be a linker version. So I don't think we 
> can represent a linker version as a single integer.

Well, you can use the same scheme we use for GCC_VERSION (plugin_version):

1000 * MAJOR + MINOR

Let me adjust the documentation of the API.
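
As a rough sketch of that encoding (the helper names below are hypothetical
and not part of plugin-api.h), a version such as "1.3.0" would be passed as
1003; a patch level or a suffix like "-rc2" has no place in this scheme:

```c
#include <assert.h>

/* Encode MAJOR.MINOR as 1000 * MAJOR + MINOR.  MINOR must stay below
   1000, otherwise it would spill into the major part.  */
static unsigned
encode_version (unsigned major, unsigned minor)
{
  assert (minor < 1000);
  return 1000 * major + minor;
}

/* Recover the two components from an encoded version.  */
static unsigned
version_major (unsigned version)
{
  return version / 1000;
}

static unsigned
version_minor (unsigned version)
{
  return version % 1000;
}
```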

Richi: May I install the patch?

Thanks,
Martin

> 
> On Mon, Jun 20, 2022 at 9:01 PM Martin Liška wrote:
> 
> On 6/20/22 11:35, Richard Biener wrote:
> > I think this is OK.  Can we get buy-in from mold people?
> 
> Sure, I've just pinged Rui:
> https://github.com/rui314/mold/issues/454#issuecomment-1160419030 
> 
> 
> Martin
> 
From f04b187240b786a868377c3854113598ba69ce1b Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Mon, 16 May 2022 14:01:52 +0200
Subject: [PATCH] lto-plugin: implement LDPT_GET_API_VERSION

include/ChangeLog:

	* plugin-api.h (enum linker_api_version): New enum.
	(ld_plugin_get_api_version): New.
	(enum ld_plugin_tag): Add LDPT_GET_API_VERSION.
	(struct ld_plugin_tv): Add tv_get_api_version.

lto-plugin/ChangeLog:

	* lto-plugin.c (negotiate_api_version): New.
	(onload): Negotiate API version.
---
 include/plugin-api.h| 33 +
 lto-plugin/lto-plugin.c | 38 ++
 2 files changed, 71 insertions(+)

diff --git a/include/plugin-api.h b/include/plugin-api.h
index 8aebe2ff267..f6402151473 100644
--- a/include/plugin-api.h
+++ b/include/plugin-api.h
@@ -483,6 +483,37 @@ enum ld_plugin_level
   LDPL_FATAL
 };
 
+/* Contract between a plug-in and a linker.  */
+
+enum linker_api_version
+{
+   /* The linker/plugin do not implement any of the API levels below, the API
+   is determined solely via the transfer vector.  */
+   LAPI_UNSPECIFIED = 0,
+
+   /* API level v1.  The linker provides get_symbols_v3, add_symbols_v2,
+  the plugin will use that and not any lower versions.
+  claim_file is thread-safe on the plugin side and
+  add_symbols on the linker side.  */
+   LAPI_V1 = 1
+};
+
+/* The linker's interface for API version negotiation.  A plug-in calls
+  the function with its IDENTIFIER and VERSION, plus the minimal and maximal
+  linker_api_version it supports.  The linker then returns the selected
+  API version and provides its own IDENTIFIER and VERSION.
+  Both PLUGIN_VERSION and LINKER_VERSION are in the following format:
+  1000 * MAJOR + MINOR.
+  Identifier pointers remain valid as long as the plugin is loaded.  */
+
+typedef
+enum linker_api_version
+(*ld_plugin_get_api_version) (const char *plugin_identifier, unsigned plugin_version,
+			  enum linker_api_version minimal_api_supported,
+			  enum linker_api_version maximal_api_supported,
+			  const char **linker_identifier,
+			  unsigned *linker_version);
+
 /* Values for the tv_tag field of the transfer vector.  */
 
 enum ld_plugin_tag
@@ -521,6 +552,7 @@ enum ld_plugin_tag
   LDPT_REGISTER_NEW_INPUT_HOOK,
   LDPT_GET_WRAP_SYMBOLS,
   LDPT_ADD_SYMBOLS_V2,
+  LDPT_GET_API_VERSION,
 };
 
 /* The plugin transfer vector.  */
@@ -556,6 +588,7 @@ struct ld_plugin_tv
 ld_plugin_get_input_section_size tv_get_input_section_size;
 ld_plugin_register_new_input tv_register_new_input;
 ld_plugin_get_wrap_symbols tv_get_wrap_symbols;
+ld_plugin_get_api_version tv_get_api_version;
   } tv_u;
 };
 
diff --git a/lto-plugin/lto-plugin.c b/lto-plugin/lto-plugin.c
index 635e126946b..06e3219de75 100644
--- a/lto-plugin/lto-plugin.c
+++ b/lto-plugin/lto-plugin.c
@@ -74,6 +74,7 @@ along with this program; see the file COPYING3.  If not see
 #include "../gcc/lto/common.h"
 #include "simple-object.h"
 #include "plugin-api.h"
+#include "ansidecl.h"
 
 /* We need to use I64 instead of ll width-specifier on native Windows.
The reason for this is that older MS-runtimes don't support the ll.  */
@@ -174,6 +175,10 @@ static ld_plugin_add_input_file add_input_file;
 static ld_plugin_add_input_library add_input_library;
 static ld_plugin_message message;
 static ld_plugin_add_symbols add_symbols, add_symbols_v2;
+static ld_plugin_get_api_version get_api_version;
+
+/* By default, use version LAPI_UNSPECIFIED if there is no negotiation.  */
+static enum linker_api_version api_version = LAPI_UNSPECIFIED;
 
 static struct plugin_file_info *claimed_files = NULL;
 static unsigned int num_claimed_files = 0;
@@ -1421,6 +1426,33 @@ process_option (c

Re: [PATCH][pushed] remove dead member variable in dom_jt_state

2022-06-30 Thread Aldy Hernandez via Gcc-patches
Anything dealing with the hybrid threader could probably use a little
clean up.  I've neglected to do so, as I'm hoping to nuke the forward
threader altogether and replace it with the backwards threader.
However, in order to do this, we need to implement prange (pointers)
and frange (floats) to handle the slack currently being picked up by
DOM.

Anywhoo...thanks for doing this.
Aldy

On Thu, Jun 30, 2022 at 10:29 AM Martin Liška  wrote:
>
> Hi.
>
> I'm going to push the following clean-up.
>
> Martin
>
> gcc/ChangeLog:
>
> * tree-ssa-dom.cc (pass_dominator::execute): Remove m_ranger as
> it is unused.
> ---
>  gcc/tree-ssa-dom.cc | 8 +++-
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/gcc/tree-ssa-dom.cc b/gcc/tree-ssa-dom.cc
> index dcaf4672b66..9b6520fd2dd 100644
> --- a/gcc/tree-ssa-dom.cc
> +++ b/gcc/tree-ssa-dom.cc
> @@ -588,9 +588,8 @@ record_edge_info (basic_block bb)
>  class dom_jt_state : public jt_state
>  {
>  public:
> -  dom_jt_state (const_and_copies *copies, avail_exprs_stack *avails,
> -   gimple_ranger *ranger)
> -: m_copies (copies), m_avails (avails), m_ranger (ranger)
> +  dom_jt_state (const_and_copies *copies, avail_exprs_stack *avails)
> +: m_copies (copies), m_avails (avails)
>{
>}
>void push (edge e) override
> @@ -613,7 +612,6 @@ public:
>  private:
>const_and_copies *m_copies;
>avail_exprs_stack *m_avails;
> -  gimple_ranger *m_ranger;
>  };
>
>  void
> @@ -794,7 +792,7 @@ pass_dominator::execute (function *fun)
>gimple_ranger *ranger = enable_ranger (fun);
>path_range_query path_query (/*resolve=*/true, ranger);
>dom_jt_simplifier simplifier (avail_exprs_stack, ranger, &path_query);
> -  dom_jt_state state (const_and_copies, avail_exprs_stack, ranger);
> +  dom_jt_state state (const_and_copies, avail_exprs_stack);
>jump_threader threader (&simplifier, &state);
>dom_opt_dom_walker walker (CDI_DOMINATORS,
>  &threader,
> --
> 2.36.1
>



Re: [PATCH] i386: Extend cvtps2pd to memory

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak  wrote:
>
> On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen  
> wrote:
> >
> > > -----Original Message-----
> > > From: Uros Bizjak 
> > > Sent: Thursday, June 30, 2022 2:20 PM
> > > To: Jiang, Haochen 
> > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > >
> > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang 
> > > wrote:
> > > >
> > > > Hi all,
> > > >
> > > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > > memory operand but currently does not. After this fix, when loop == 2,
> > > > it will eliminate movq instruction.
> > > >
> > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > >
> > > > BRs,
> > > > Haochen
> > > >
> > > > gcc/ChangeLog:
> > > >
> > > > PR target/43618
> > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > (sse2_cvtps2pd_load): Rename extendvsdfv2df2.

Rename FROM ...

Please also mention change to sse2_cvtps2pd.

> > > >
> > > > gcc/testsuite/ChangeLog:
> > > >
> > > > PR target/43618
> > > > * gcc.target/i386/pr43618-1.c: New test.
> > >
> > > This patch could be as simple as:
> > >
> > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > index 8cd0f617bf3..c331445cb2d 100644
> > > --- a/gcc/config/i386/sse.md
> > > +++ b/gcc/config/i386/sse.md
> > > @@ -9195,7 +9195,7 @@
> > > (define_insn "extendv2sfv2df2"
> > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > >(float_extend:V2DF
> > > - (match_operand:V2SF 1 "register_operand" "v")))]
> > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > >   "TARGET_MMX_WITH_SSE"
> > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > >   [(set_attr "type" "ssecvt")
> >
> > We also tested on this version, it is ok.
> >
> > The reason why the patch looks like this is because in the previous insn
> > sse2_cvtps2pd, the constraint vm and vector_operand
> > actually does not match the actual instruction. Memory operand is V2SF,
> > not V4SF.
> >
> > Therefore, we changed the constraint in that insn. Then it caused another 
> > issue.
> > For memory operand, it seems that we cannot generate those mask 
> > instructions.
> > So I change the pattern to how extendv2hfv2df2 works.
>
> If you want to change the memory access in sse2_cvtps2pd,
> then please see how e.g. v2hiv2di is handled in sse.md. In
> addition to two instructions, you will need one define_insn_and_split
> with a pre-reload splitter.

Oh, nowadays combine does vec_select from a paradoxical subreg on its own.

+(define_expand "extendv2sfv2df2"
+  [(set (match_operand:V2DF 0 "register_operand")
+(float_extend:V2DF
+  (match_operand:V2SF 1 "nonimmediate_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!MEM_P (operands[1]))
+{

You will need force_reg here:

rtx op1 = force_reg (V2SFmode, operands[1]);
+  operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
+  emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
+  DONE;
+}
+})


-(define_insn "extendv2sfv2df2"
+(define_insn "sse2_cvtps2pd_load"

Please name this insn "*sse2_cvtps2pd_1". Please note the
star at the beginning; you don't have to make the name public.

OK with the above changes.

Thanks,
Uros,


Re: [PATCH] i386: Extend cvtps2pd to memory

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak  wrote:
>
> On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak  wrote:
> >
> > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen  
> > wrote:
> > >
> > > > -----Original Message-----
> > > > From: Uros Bizjak 
> > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > To: Jiang, Haochen 
> > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > >
> > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang 
> > > > wrote:
> > > > >
> > > > > Hi all,
> > > > >
> > > > > This patch aims to fix the cvtps2pd insn, which should also work on
> > > > > memory operand but currently does not. After this fix, when loop == 2,
> > > > > it will eliminate movq instruction.
> > > > >
> > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > >
> > > > > BRs,
> > > > > Haochen
> > > > >
> > > > > gcc/ChangeLog:
> > > > >
> > > > > PR target/43618
> > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > > (sse2_cvtps2pd_load): Rename extendvsdfv2df2.
>
> Rename FROM ...
>
> Please also mention change to sse2_cvtps2pd.
>
> > > > >
> > > > > gcc/testsuite/ChangeLog:
> > > > >
> > > > > PR target/43618
> > > > > * gcc.target/i386/pr43618-1.c: New test.
> > > >
> > > > This patch could be as simple as:
> > > >
> > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > index 8cd0f617bf3..c331445cb2d 100644
> > > > --- a/gcc/config/i386/sse.md
> > > > +++ b/gcc/config/i386/sse.md
> > > > @@ -9195,7 +9195,7 @@
> > > > (define_insn "extendv2sfv2df2"
> > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > >(float_extend:V2DF
> > > > - (match_operand:V2SF 1 "register_operand" "v")))]
> > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > >   "TARGET_MMX_WITH_SSE"
> > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > >   [(set_attr "type" "ssecvt")
> > >
> > > We also tested on this version, it is ok.
> > >
> > > The reason why the patch looks like this is because in the previous insn
> > > sse2_cvtps2pd, the constraint vm and vector_operand
> > > actually does not match the actual instruction. Memory operand is V2SF,
> > > not V4SF.
> > >
> > > Therefore, we changed the constraint in that insn. Then it caused another 
> > > issue.
> > > For memory operand, it seems that we cannot generate those mask 
> > > instructions.
> > > So I change the pattern to how extendv2hfv2df2 works.
> >
> > If you want to change the memory access in sse2_cvtps2pd,
> > then please see how e.g. v2hiv2di is handled in sse.md. In
> > addition to two instructions, you will need one define_insn_and_split
> > with a pre-reload splitter.
>
> Oh, nowadays combine does vec_select from a paradoxical subreg on its own.
>
> +(define_expand "extendv2sfv2df2"
> +  [(set (match_operand:V2DF 0 "register_operand")
> +(float_extend:V2DF
> +  (match_operand:V2SF 1 "nonimmediate_operand")))]
> +  "TARGET_MMX_WITH_SSE"
> +{
> +  if (!MEM_P (operands[1]))
> +{
>
> You will need force reg here:
>
> rtx op1 = force_reg (V2SFmode, operands[1]);
> +  operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> +  emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> +  DONE;
> +}
> +})
>
>
> -(define_insn "extendv2sfv2df2"
> +(define_insn "sse2_cvtps2pd_load"
>
> Please name this insn "*sse2_cvtps2pd_1". Please note the
> star at the beginning, You don't have to make the name public.
>
> OK with the above changes.

Forgot to mention:


- (match_operand:V2SF 1 "register_operand" "v")))]
-  "TARGET_MMX_WITH_SSE"
-  "%vcvtps2pd\t{%1, %0|%0, %1}"
+ (match_operand:V2SF 1 "memory_operand" "m")))]
+  "TARGET_MMX_WITH_SSE && "
+  "%vcvtps2pd\t{%1, %0|%0, %q1}"
   [(set_attr "type" "ssecvt")

The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we
can use TARGET_SSE2 here.

This raises the question of whether the expander could also be TARGET_SSE2
only. There are no MMX registers involved in any of the patterns
anymore.

Uros.
>
> Thanks,
> Uros,


RE: [PATCH] i386: Extend cvtps2pd to memory

2022-06-30 Thread Liu, Hongtao via Gcc-patches


> -----Original Message-----
> From: Uros Bizjak 
> Sent: Thursday, June 30, 2022 4:53 PM
> To: Jiang, Haochen 
> Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao 
> Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> 
> On Thu, Jun 30, 2022 at 10:45 AM Uros Bizjak  wrote:
> >
> > On Thu, Jun 30, 2022 at 9:41 AM Uros Bizjak  wrote:
> > >
> > > > On Thu, Jun 30, 2022 at 9:24 AM Jiang, Haochen wrote:
> > > >
> > > > > -----Original Message-----
> > > > > From: Uros Bizjak 
> > > > > Sent: Thursday, June 30, 2022 2:20 PM
> > > > > To: Jiang, Haochen 
> > > > > > Cc: gcc-patches@gcc.gnu.org; Liu, Hongtao
> > > > > Subject: Re: [PATCH] i386: Extend cvtps2pd to memory
> > > > >
> > > > > > On Thu, Jun 30, 2022 at 7:59 AM Haochen Jiang wrote:
> > > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > This patch aims to fix the cvtps2pd insn, which should also
> > > > > > work on memory operand but currently does not. After this fix,
> > > > > > when loop == 2, it will eliminate movq instruction.
> > > > > >
> > > > > > Regtested on x86_64-pc-linux-gnu. Ok for trunk?
> > > > > >
> > > > > > BRs,
> > > > > > Haochen
> > > > > >
> > > > > > gcc/ChangeLog:
> > > > > >
> > > > > > PR target/43618
> > > > > > * config/i386/sse.md (extendv2sfv2df2): New define_expand.
> > > > > > (sse2_cvtps2pd_load): Rename extendvsdfv2df2.
> >
> > Rename FROM ...
> >
> > Please also mention change to sse2_cvtps2pd.
> >
> > > > > >
> > > > > > gcc/testsuite/ChangeLog:
> > > > > >
> > > > > > PR target/43618
> > > > > > * gcc.target/i386/pr43618-1.c: New test.
> > > > >
> > > > > This patch could be as simple as:
> > > > >
> > > > > diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> > > > > index 8cd0f617bf3..c331445cb2d 100644
> > > > > --- a/gcc/config/i386/sse.md
> > > > > +++ b/gcc/config/i386/sse.md
> > > > > @@ -9195,7 +9195,7 @@
> > > > > (define_insn "extendv2sfv2df2"
> > > > >   [(set (match_operand:V2DF 0 "register_operand" "=v")
> > > > >(float_extend:V2DF
> > > > > - (match_operand:V2SF 1 "register_operand" "v")))]
> > > > > + (match_operand:V2SF 1 "nonimmediate_operand" "vm")))]
> > > > >   "TARGET_MMX_WITH_SSE"
> > > > >   "%vcvtps2pd\t{%1, %0|%0, %1}"
> > > > >   [(set_attr "type" "ssecvt")
> > > >
> > > > We also tested on this version, it is ok.
> > > >
> > > > The reason why the patch looks like this is because in the
> > > > previous insn sse2_cvtps2pd, the constraint vm and
> > > > vector_operand actually does not match the actual instruction.
> > > > Memory operand is V2SF, not V4SF.
> > > >
> > > > > Therefore, we changed the constraint in that insn. Then it caused
> > > > > another issue.
> > > > > For memory operand, it seems that we cannot generate those mask
> > > > > instructions.
> > > > > So I change the pattern to how extendv2hfv2df2 works.
> > >
> > > If you want to change the memory access in sse2_cvtps2pd,
> > > then please see how e.g. v2hiv2di is handled in sse.md. In
> > > addition to two instructions, you will need one
> > > define_insn_and_split with a pre-reload splitter.
> >
> > Oh, nowadays combine does vec_select from a paradoxical subreg on its own.
> >
> > +(define_expand "extendv2sfv2df2"
> > +  [(set (match_operand:V2DF 0 "register_operand")
> > +(float_extend:V2DF
> > +  (match_operand:V2SF 1 "nonimmediate_operand")))]
> > +  "TARGET_MMX_WITH_SSE"
> > +{
> > +  if (!MEM_P (operands[1]))
> > +{
> >
> > You will need force reg here:
> >
> > rtx op1 = force_reg (V2SFmode, operands[1]);
> > +  operands[1] = lowpart_subreg (V4SFmode, op1, V2SFmode);
> > +  emit_insn (gen_sse2_cvtps2pd (operands[0], operands[1]));
> > +  DONE;
> > +}
> > +})
> >
> >
> > -(define_insn "extendv2sfv2df2"
> > +(define_insn "sse2_cvtps2pd_load"
> >
> > > Please name this insn "*sse2_cvtps2pd_1". Please note the
> > > star at the beginning; you don't have to make the name public.
> >
> > OK with the above changes.
> 
> Forgot to mention:
> 
> 
> - (match_operand:V2SF 1 "register_operand" "v")))]
> -  "TARGET_MMX_WITH_SSE"
> -  "%vcvtps2pd\t{%1, %0|%0, %1}"
> + (match_operand:V2SF 1 "memory_operand" "m")))]
> + "TARGET_MMX_WITH_SSE && "
> +  "%vcvtps2pd\t{%1, %0|%0 and2>, %q1}"
>[(set_attr "type" "ssecvt")
> 
> The new insn does not need to be limited to TARGET_MMX_WITH_SSE, so we
> can use TARGET_SSE2 here.
> 
> Which opens the question if the expander could also be TARGET_SSE2 only.
> There are no MMX registers involved in any of the patterns anymore.
Yes.
> 
> Uros.
> >
> > Thanks,
> > Uros,


Re: [Patch] OpenMP: Prepare omp-* for ancestor:1 handling

2022-06-30 Thread Tobias Burnus

Hi Jakub,

On 30.06.22 10:21, Jakub Jelinek wrote:

So, what is the plan with reverse offload?


My idea was to just call omp_target_ext with
'device(omp_initial_device)'. This then automatically
works when called from a target region that runs on
omp_get_initial_device().

For the actual device part, this can be implemented
incrementally by supporting the reverse_offload for
a given device type.

For getting it to work when the code enclosing the ancestor:1
target region runs on an offloading device,
my idea is the following. Comments are welcome!


My idea was to do the same as done for I/O
(which is supported for both nvptx and gcn). For GCN:

libgomp/plugin/plugin-gcn.c has:

struct kernargs {
  /* A pointer to struct output, below, for console output data.  */
  int64_t out_ptr;

  /* A pointer to struct heap, below.  */
  int64_t heap_ptr;

  /* A pointer to an ephemeral memory arena.
Only needed for OpenMP.  */
  int64_t arena_ptr;

/* to be added: */
  /* A pointer to reverse-offload. */
  int64_t rev_ptr;

/* Now come the actual structs.*/
  /* Output data.  */
  struct output {
int return_value;
unsigned int next_output;
struct printf_data {
...
};


This gets initialized on the host and then:

  while (hsa_fns.hsa_signal_wait_acquire_fn (s, HSA_SIGNAL_CONDITION_LT, 1,
 1000 * 1000,
 HSA_WAIT_STATE_BLOCKED) != 0)
console_output (kernel, shadow->kernarg_address, false);

with:

  unsigned int from = __atomic_load_n (&kernargs->output_data.consumed,
   __ATOMIC_ACQUIRE);

The I/O itself is implemented in newlib,
https://sourceware.org/git/?p=newlib-cygwin.git;a=blob;f=newlib/libc/sys/amdgcn/write.c

  register void **kernargs asm("s8");
  struct output *data = (struct output *)kernargs[2];

and then the data is filled.


For reverse offload, the idea is to fill it on the device side via
/libgomp/config/gcn/target.c's GOMP_target_ext for
device == GOMP_DEVICE_HOST_FALLBACK && fn != NULL as:

Try to obtain a lock (busy wait)
Put addr/kinds/sizes into the struct
Put the device's fn pointer in the struct
busy wait for completion ('while (fn != NULL) { }')
unlock


And on the host side:
If fn == NULL (= data there) - return output/offload checking loop
Otherwise:
call a new function in target.c and pass args to it.
Once it has completed, set fn = NULL to indicate it has been processed.

And in target.c's new reverse-offload-handling function:
- find generated-target function on the host,
  based on device stub function's pointer address
- Handle the mapping
- Call host function
- Handle the mapping
- return
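
The handshake described above can be sketched in plain C11 as follows. This is only an illustration under stated assumptions: the names rev_slot, rev_map_entry, rev_device_post, rev_device_try_finish, rev_host_poll and rev_example_host_fn are all hypothetical and not taken from libgomp, the mapping steps are left out, and a linear table stands in for the "device -> host" lookup. The busy-wait loops are split into single polling steps so the sequence can be followed without real device threads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

typedef void (*host_fn_t) (void **addrs);

/* Hypothetical "device -> host" lookup entry: device stub address to the
   generated target function on the host.  */
struct rev_map_entry
{
  uintptr_t dev_fn;
  host_fn_t host_fn;
};

/* Hypothetical mailbox in memory visible to both sides.  */
struct rev_slot
{
  atomic_flag lock;          /* held by the device thread posting a request */
  _Atomic uintptr_t dev_fn;  /* 0 means "no request pending" */
  void **addrs;              /* argument addresses (kinds/sizes omitted) */
};

/* Device side: obtain the lock (busy wait), fill the struct, publish fn.  */
void
rev_device_post (struct rev_slot *s, uintptr_t dev_fn, void **addrs)
{
  while (atomic_flag_test_and_set_explicit (&s->lock, memory_order_acquire))
    ;
  s->addrs = addrs;
  /* Release store: once the host sees dev_fn it also sees addrs.  */
  atomic_store_explicit (&s->dev_fn, dev_fn, memory_order_release);
}

/* Device side: one step of "busy wait for completion".  Returns 1 and
   unlocks once the host has reset dev_fn to 0, otherwise returns 0.  */
int
rev_device_try_finish (struct rev_slot *s)
{
  if (atomic_load_explicit (&s->dev_fn, memory_order_acquire) != 0)
    return 0;
  atomic_flag_clear_explicit (&s->lock, memory_order_release);
  return 1;
}

/* Host side: one step of the checking loop.  Returns 0 if nothing is
   pending; otherwise looks up and calls the host function, then marks the
   request as processed.  */
int
rev_host_poll (struct rev_slot *s, const struct rev_map_entry *map, size_t n)
{
  uintptr_t dev_fn = atomic_load_explicit (&s->dev_fn, memory_order_acquire);
  if (dev_fn == 0)
    return 0;
  size_t i = 0;
  while (i < n && map[i].dev_fn != dev_fn)
    i++;
  assert (i < n);  /* the sketch assumes every device stub is registered */
  map[i].host_fn (s->addrs);
  atomic_store_explicit (&s->dev_fn, 0, memory_order_release);
  return 1;
}

/* Example host function: increments the int that addrs[0] points to.  */
void
rev_example_host_fn (void **addrs)
{
  *(int *) addrs[0] += 1;
}
```

In a real implementation the device would spin on rev_device_try_finish and the host would call rev_host_poll from its existing wait loop; here the two sides are simply invoked in turn.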

Additionally:

If 'requires reverse_offload' is set, fill not only
the normal splay_tree for "host -> device" lookup but
also another one for the "device -> host" lookups.

Does this make sense?

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


Re: [PATCH] Fortran: error recovery on invalid CLASS(), PARAMETER declarations [PR105243]

2022-06-30 Thread Tobias Burnus

Dear Harald, dear all,

On 29.06.22 21:54, Harald Anlauf via Fortran wrote:

a CLASS entity cannot have the PARAMETER attribute.
This is detected in some situations, but in others
we ICE because we never reach the existing check.
Adding a similar check when handling the declaration
improves error recovery.

The initial patch is by Steve.  I adjusted and moved
it slightly so that it also handles CLASS(*)
(unlimited polymorphic) at the same time.

Shouldn't you then also acknowledge him, e.g. via Co-authored-by?

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

OK. Thanks for the patch!

This patch actually addresses multiple PRs, some of
which are marked as regressions.  As I consider the
patch safe, I would like to backport to open branches
as far as it seems appropriate.


Fine with me.

Tobias



doc: extend --{enable,disable}-libsanitizer description [PR 105614]

2022-06-30 Thread Xi Ruoyao via Gcc-patches
A documentation improvement with no code change.  OK for trunk?

-- >8 --

We are receiving several reports that people are (mis)using the
--enable-libsanitizer option, which was not documented in the GCC
installation doc.  It forces libsanitizer to be built even for unsupported
targets, causing build failure.  Extend the --disable-libsanitizer
description to also include --enable-libsanitizer, and warn about the
possible consequences if it's enabled explicitly.

gcc/ChangeLog:

PR sanitizer/105614
* doc/install.texi: Document --enable-libsanitizer and possible
consequences.
---
 gcc/doc/install.texi | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 460da3a0fd5..136ee24e450 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1866,9 +1866,14 @@ be built.  This can be useful for debugging, or for compatibility with
 previous Ada build procedures, when it was required to explicitly
 do a @samp{make -C gcc gnatlib_and_tools}.
 
-@item --disable-libsanitizer
+@item --enable-libsanitizer
+@itemx --disable-libsanitizer
 Specify that the run-time libraries for the various sanitizers should
-not be built.
+(should not) be built.  The default is @option{--enable-libsanitizer}
+for targets where those libraries are tested and supported,
+@option{--disable-libsanitizer} for other targets.  Explicitly
+requesting @option{--enable-libsanitizer} for unsupported targets may
+cause build failure or runtime issues.
 
 @item --disable-libssp
 Specify that the run-time libraries for stack smashing protection
-- 
2.37.0




[x86 PATCH take #2] Double word logical operation clean-ups in i386.md.

2022-06-30 Thread Roger Sayle

Hi Uros,
Many thanks for your review of the "double word logical operation clean-up"
patch.  The revision below incorporates the majority of your feedback, but
with one or two exceptions (required to allow the patch to bootstrap) that
I thought I'd double check with you before pushing.

Firstly, great catch that we no longer need to test rtx_equal (operands[0],
operands[1]) when moving a splitter from before reload to after reload, as
this is guaranteed by the "0" constraints.  I've cleaned this up in all the
doubleword splitters (including the
 case that's now moved).  Also, as you've suggested, this patch uses
a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split post-reload)
and the other for !TARGET_BMI (that's lowered rather than split,
pre-reload/post-STV).

Unfortunately, the "force_reg of tricky immediate constants" checks really are
required for these expanders.  I agree normally the predicate is
checked/guaranteed for a define_insn, but in this case the gen_iordi3 function
and related expanders are frequently called directly by the middle-end or from
i386-expand, which bypasses the checks made by the later RTL passes.  When
given arbitrary immediate constants, this results in ICEs from insns not
matching their predicates soon after expand (breaking bootstrap with an ICE).
It's only "standard name" expanders that require this treatment;
define_insn{_and_split} templates do enforce their predicates.

And finally, we can't/shouldn't use  in the actual
doubleword splitters, as the mode being iterated over is DWIH (not DWI),
where we require the predicate for the corresponding  mode.  It turns
out that it's always appropriate to use x86_64_hilo_general_operand wherever
we use the "r" constraint, and that's used consistently in this patch.

I hope these exceptions are acceptable.  The attached revised patch has
been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
both with and without --target_board=unix{-m32} with no new failures.
Are these revisions OK for mainline?
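
As background on what a doubleword splitter achieves, the effect can be modelled in portable C: a logical operation on a value twice the word size decomposes into two independent word-sized operations on the low and high halves. The struct and function names below are hypothetical and only model the arithmetic; they say nothing about how i386.md expresses it in RTL.

```c
#include <stdint.h>

/* Hypothetical model of a double-word value as two word-sized halves
   (TImode on a 64-bit target would correspond to two 64-bit words).  */
struct dword
{
  uint64_t lo;
  uint64_t hi;
};

/* AND: no carry crosses the halves, so each half is handled separately.  */
struct dword
dword_and (struct dword a, struct dword b)
{
  struct dword r = { a.lo & b.lo, a.hi & b.hi };
  return r;
}

/* ANDN in the x86 sense: (~a) & b, again half by half.  */
struct dword
dword_andn (struct dword a, struct dword b)
{
  struct dword r = { ~a.lo & b.lo, ~a.hi & b.hi };
  return r;
}

/* NOT (one's complement) of both halves.  */
struct dword
dword_not (struct dword a)
{
  struct dword r = { ~a.lo, ~a.hi };
  return r;
}
```

Because the halves never interact, these operations are natural candidates for splitting into two single-word instructions, which is why the same treatment applies uniformly to AND, ANDN, IOR, XOR and NOT.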

2022-06-30  Roger Sayle  
Uroš Bizjak  

gcc/ChangeLog
* config/i386/i386.md (general_szext_operand): Add TImode
support using x86_64_hilo_general_operand predicate.
(*cmp_doubleword): Use x86_64_hilo_general_operand predicate.
(*add3_doubleword): Improved optimization of zero addition.
(and3): Use SDWIM mode iterator to add support for double
word bit-wise AND in TImode.  Use force_reg when double word
immediate operand isn't x86_64_hilo_general_operand.
(and3_doubleword): Generalized from anddi3_doubleword and
converted into a post-reload splitter.
(*andndi3_doubleword): Old define_insn deleted.
(*andn3_doubleword_bmi): New define_insn_and_split for
TARGET_BMI that splits post-reload.
(*andn3_doubleword): New define_insn_and_split for
!TARGET_BMI, that lowers/splits before reload.
(3): Use SDWIM mode iterator to add support for
double word bit-wise XOR and bit-wise IOR in TImode.  Use
force_reg when double word immediate operand isn't
x86_64_hilo_general_operand.
(*di3_doubleword): Generalized from di3_doubleword.
(one_cmpl2): Use SDWIM mode iterator to add support for
double word bit-wise NOT in TImode.
(one_cmpl2_doubleword): Generalize from one_cmpldi2_doubleword
and converted into a post-reload splitter.


Thanks again,
Roger
--

> -Original Message-
> From: Uros Bizjak 
> Sent: 28 June 2022 16:38
> To: Roger Sayle 
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> 
> On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle 
> wrote:
> >
> >
> > Hi Uros,
> > As you've requested/suggested, here's a patch that tidies up and
> > unifies doubleword handling in i386.md; converting all doubleword
> > splitters for logic operations to post-reload form, generalizing their
> > define_insn_and_split templates to  form (supporting TARGET_64BIT
> > ? TImode : DImode), and where required tweaking the corresponding
> > expanders to use SDWIM to support TImode doubleword operations.  These
> > changes incorporate your feedback from
> > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > where I included many/several of these clean-ups, in a patch to add a
> > new optimization.  I agree, it's better to split these out (this
> > patch), and I'll resubmit the (smaller) optimization patch as a
> > follow-up.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-m32},
> > with no new failures.  Ok for mainline?
> >
> >
> > 2022-06-28  Roger Sayle  
> >
> > gcc/ChangeLog
> > * config/i386/i386.md (general_szext_operand): Add TImode
> > support using x86_64_hilo_general_operand predicate.
> > (*cmp_doubleword): Use x86_64_hilo_general_op

[PATCH] Amend fix for PR middle-end/105874

2022-06-30 Thread Eric Botcazou via Gcc-patches
As pointed out by Richard, it's very likely too big of a hammer.

Bootstrapped/regtested on x86-64/Linux, OK for the mainline?


2022-06-30  Eric Botcazou  

PR middle-end/105874
* expr.cc (expand_expr_real_1) : Force
EXPAND_MEMORY for the expansion of the inner reference only
in the usual cases where a memory reference is required.

-- 
Eric Botcazou

diff --git a/gcc/expr.cc b/gcc/expr.cc
index c90cde35006..8fb2dc917f7 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -11186,37 +11186,58 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	machine_mode mode1, mode2;
 	poly_int64 bitsize, bitpos, bytepos;
 	tree offset;
-	int reversep, volatilep = 0, must_force_mem;
+	int reversep, volatilep = 0;
 	tree tem
 	  = get_inner_reference (exp, &bitsize, &bitpos, &offset, &mode1,
  &unsignedp, &reversep, &volatilep);
 	rtx orig_op0, memloc;
 	bool clear_mem_expr = false;
+	bool must_force_mem;
 
 	/* If we got back the original object, something is wrong.  Perhaps
 	   we are evaluating an expression too early.  In any event, don't
 	   infinitely recurse.  */
 	gcc_assert (tem != exp);
 
-	/* If tem is a VAR_DECL, we need a memory reference.  */
-	enum expand_modifier tem_modifier = modifier;
-	if (tem_modifier == EXPAND_SUM)
-	  tem_modifier = EXPAND_NORMAL;
-	if (TREE_CODE (tem) == VAR_DECL)
-	  tem_modifier = EXPAND_MEMORY;
+	/* Make sure bitpos is not negative, this can wreak havoc later.  */
+	if (maybe_lt (bitpos, 0))
+	  {
+	gcc_checking_assert (offset == NULL_TREE);
+	offset = size_int (bits_to_bytes_round_down (bitpos));
+	bitpos = num_trailing_bits (bitpos);
+	  }
+
+	/* If we have either an offset, a BLKmode result, or a reference
+	   outside the underlying object, we must force it to memory.
+	   Such a case can occur in Ada if we have unchecked conversion
+	   of an expression from a scalar type to an aggregate type or
+	   for an ARRAY_RANGE_REF whose type is BLKmode, or if we were
+	   passed a partially uninitialized object or a view-conversion
+	   to a larger size.  */
+	must_force_mem = offset != NULL_TREE
+			 || mode1 == BLKmode
+			 || (mode == BLKmode
+			 && !int_mode_for_size (bitsize, 1).exists ());
+
+	const enum expand_modifier tem_modifier
+	  = must_force_mem
+	? EXPAND_MEMORY
+	: modifier == EXPAND_SUM ? EXPAND_NORMAL : modifier;
 
 	/* If TEM's type is a union of variable size, pass TARGET to the inner
 	   computation, since it will need a temporary and TARGET is known
 	   to have to do.  This occurs in unchecked conversion in Ada.  */
+	const rtx tem_target
+	  = TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
+	&& COMPLETE_TYPE_P (TREE_TYPE (tem))
+	&& TREE_CODE (TYPE_SIZE (TREE_TYPE (tem))) != INTEGER_CST
+	&& modifier != EXPAND_STACK_PARM
+	? target
+	: NULL_RTX;
+
 	orig_op0 = op0
-	  = expand_expr_real (tem,
-			  (TREE_CODE (TREE_TYPE (tem)) == UNION_TYPE
-			   && COMPLETE_TYPE_P (TREE_TYPE (tem))
-			   && (TREE_CODE (TYPE_SIZE (TREE_TYPE (tem)))
-   != INTEGER_CST)
-			   && modifier != EXPAND_STACK_PARM
-			   ? target : NULL_RTX),
-			  VOIDmode, tem_modifier, NULL, true);
+	  = expand_expr_real (tem, tem_target, VOIDmode, tem_modifier, NULL,
+			  true);
 
 	/* If the field has a mode, we want to access it in the
 	   field's mode, not the computed mode.
@@ -11233,27 +11254,9 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	mode2
 	  = CONSTANT_P (op0) ? TYPE_MODE (TREE_TYPE (tem)) : GET_MODE (op0);
 
-	/* Make sure bitpos is not negative, it can wreak havoc later.  */
-	if (maybe_lt (bitpos, 0))
-	  {
-	gcc_checking_assert (offset == NULL_TREE);
-	offset = size_int (bits_to_bytes_round_down (bitpos));
-	bitpos = num_trailing_bits (bitpos);
-	  }
-
-	/* If we have either an offset, a BLKmode result, or a reference
-	   outside the underlying object, we must force it to memory.
-	   Such a case can occur in Ada if we have unchecked conversion
-	   of an expression from a scalar type to an aggregate type or
-	   for an ARRAY_RANGE_REF whose type is BLKmode, or if we were
-	   passed a partially uninitialized object or a view-conversion
-	   to a larger size.  */
-	must_force_mem = (offset
-			  || mode1 == BLKmode
-			  || (mode == BLKmode
-			  && !int_mode_for_size (bitsize, 1).exists ())
-			  || maybe_gt (bitpos + bitsize,
-   GET_MODE_BITSIZE (mode2)));
+	/* See above for the rationale.  */
+	if (maybe_gt (bitpos + bitsize, GET_MODE_BITSIZE (mode2)))
+	  must_force_mem = true;
 
 	/* Handle CONCAT first.  */
 	if (GET_CODE (op0) == CONCAT && !must_force_mem)
@@ -11311,7 +11314,7 @@ expand_expr_real_1 (tree exp, rtx target, machine_mode tmode,
 	  }
 	else
 	  /* Otherwise force into memory.  */
-	  must_force_mem = 1;
+	  must_force_mem = true;
 	  }
 
 	/* If this is a constant, put it in a register if it is a legitimate
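
The "make sure bitpos is not negative" step in the patch above splits a negative bit position into a byte offset and a non-negative bit remainder. A minimal sketch of that arithmetic in plain C follows; the helper names are hypothetical, BITS_PER_UNIT is assumed to be 8, and the code assumes (rather than quotes) that bits_to_bytes_round_down rounds toward negative infinity and num_trailing_bits returns the matching non-negative remainder.

```c
#define BITS_PER_UNIT 8  /* assumed value for this sketch */

/* Byte part of BITPOS, rounded toward negative infinity.  C's '/' rounds
   toward zero, so adjust when the remainder is negative.  */
long
bytes_round_down (long bitpos)
{
  long q = bitpos / BITS_PER_UNIT;
  if (bitpos % BITS_PER_UNIT < 0)
    q--;
  return q;
}

/* Bit part of BITPOS, always in the range [0, BITS_PER_UNIT).  */
long
trailing_bits (long bitpos)
{
  long r = bitpos % BITS_PER_UNIT;
  return r < 0 ? r + BITS_PER_UNIT : r;
}
```

For any input, bytes_round_down (p) * BITS_PER_UNIT + trailing_bits (p) equals p, so a position of -3 bits becomes byte offset -1 plus 5 bits.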


Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Fri, Jun 10, 2022 at 03:59:37PM +0200, Marcel Vollweiler wrote:
> > I'm not sure we can rely on execv on all targets that do support libgomp.
> > Any reason why you actually need this, rather than using
> > dg-set-target-env-var directive(s) and perhaps return 0; if getenv doesn't
> > return the expected values?
> 
> Interesting topic. After some (internal) discussions I think the best way is
> to set the environment variables explicitly instead of using
> dg-set-target-env-var.  The reason is that dg-set-target-env-var does not
> work for remote testing (which seems to be a common test environment). For
> remote testing dejagnu immediately aborts the test case with UNSUPPORTED
> which is specified in the corresponding extension and makes sense from my
> point of view as the test assumption cannot be fulfilled (since the
> environment variables are not set on remote targets).
> It also means that whenever dg-set-target-env-var is set in the test file, the
> execution of the test case is not tested on remote targets.

The only reason why dg-set-target-env-var is supported on native only right
now is that I'm never doing remote testing myself and so couldn't test that.
There is no inherent reason why the env vars couldn't be propagated over to
the remote and set in the environment there.
So trying to work around that rather than at least trying to change
dg-set-target-env-var so that it works with the remote testing you do looks
wrong.
If dg-set-target-env-var can be made to work remotely, it will magically
improve those 130+ tests that use it already together with the newly added
tests.

So, I'd suggest to just use dg-set-target-env-var and incrementally work on
making it work for remote testing if that is important to whomever does
that kind of testing.  Could be e.g. a matter of invoking remotely
env VAR1=val1 VAR2=val2 program args
instead of program args.  If env is missing on the remote side, it could
be UNSUPPORTED then.
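
A minimal sketch of the getenv guard mentioned earlier in this thread, assuming a test that sets OMP_NUM_TEAMS_ALL through the directive. The helper name env_is and the value "4" are hypothetical; only the directive and the variable name come from the discussion.

```c
/* { dg-set-target-env-var OMP_NUM_TEAMS_ALL "4" } */

#include <stdlib.h>
#include <string.h>

/* Return 1 if environment variable NAME is set to exactly EXPECTED,
   0 if it is unset or has a different value.  */
int
env_is (const char *name, const char *expected)
{
  const char *value = getenv (name);
  return value != NULL && strcmp (value, expected) == 0;
}
```

A test's main could then start with "if (!env_is ("OMP_NUM_TEAMS_ALL", "4")) return 0;" so that a run where the variable was not propagated passes trivially instead of failing.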

> +/* The initial ICV values for the host, which are configured with environment
> +   variables without a suffix, e.g. OMP_NUM_TEAMS.  */
> +struct gomp_initial_icvs gomp_initial_icvs_none;
> +
> +/* Initial ICV values that were configured for the host and for all devices by
> +   using environment variables like OMP_NUM_TEAMS_ALL.  */
> +struct gomp_initial_icvs gomp_initial_icvs_all;
> +
> +/* Initial ICV values that were configured only for devices (not for the host)
> +   by using environment variables like OMP_NUM_TEAMS_DEV.  */
> +struct gomp_initial_icvs gomp_initial_icvs_dev;

As I said last time, I don't like allocating these
all the time in the data section of libgomp when at least for a few upcoming
years, most users will never use those suffixes.
Can't *_DEV and *_ALL go into the gomp_initial_icv_dev_list
chain too, perhaps 

> +static const struct envvar
> +{
> +  const char *name;
> +  int name_len;
> +  unsigned char flag_vars[3];
> +  unsigned char flag;
> +  void *params[3];
> +  bool (*parse_func) (const char *, const char *, void * const[]);
> +} envvars[] = {
> +  {ENTRY ("OMP_SCHEDULE_DEV"), {OMP_SCHEDULE_DEV_, 
> OMP_SCHEDULE_CHUNK_SIZE_DEV_}, GOMP_ENV_VAR_SUFFIX_DEV, 
> {&gomp_initial_icvs_dev.run_sched_var, 
> &gomp_initial_icvs_dev.run_sched_chunk_size}, &parse_schedule},
> +  {ENTRY ("OMP_SCHEDULE_ALL"), {OMP_SCHEDULE_DEV_, 
> OMP_SCHEDULE_CHUNK_SIZE_DEV_}, GOMP_ENV_VAR_SUFFIX_ALL, 
> {&gomp_initial_icvs_all.run_sched_var, 
> &gomp_initial_icvs_all.run_sched_chunk_size}, &parse_schedule},
> +  {ENTRY ("OMP_SCHEDULE"), {OMP_SCHEDULE_DEV_, 
> OMP_SCHEDULE_CHUNK_SIZE_DEV_}, GOMP_ENV_VAR_SUFFIX_NONE, 
> {&gomp_initial_icvs_none.run_sched_var, 
> &gomp_initial_icvs_none.run_sched_chunk_size}, &parse_schedule},
> +
> +  {ENTRY ("OMP_NUM_TEAMS_DEV"), {OMP_NUM_TEAMS_DEV_}, 
> GOMP_ENV_VAR_SUFFIX_DEV , {&gomp_initial_icvs_dev.nteams_var, false}, 
> &parse_int},
> +  {ENTRY ("OMP_NUM_TEAMS_ALL"), {OMP_NUM_TEAMS_DEV_}, 
> GOMP_ENV_VAR_SUFFIX_ALL, {&gomp_initial_icvs_all.nteams_var, false}, 
> &parse_int},
> +  {ENTRY ("OMP_NUM_TEAMS"), {OMP_NUM_TEAMS_DEV_}, GOMP_ENV_VAR_SUFFIX_NONE, 
> {&gomp_initial_icvs_none.nteams_var, false}, &parse_int},
> +
> +  {ENTRY ("OMP_DYNAMIC_DEV"), {OMP_DYNAMIC_DEV_}, GOMP_ENV_VAR_SUFFIX_DEV, 
> {&gomp_initial_icvs_dev.dyn_var}, &parse_boolean},
> +  {ENTRY ("OMP_DYNAMIC_ALL"), {OMP_DYNAMIC_DEV_}, GOMP_ENV_VAR_SUFFIX_ALL, 
> {&gomp_initial_icvs_all.dyn_var}, &parse_boolean},
> +  {ENTRY ("OMP_DYNAMIC"), {OMP_DYNAMIC_DEV_}, GOMP_ENV_VAR_SUFFIX_NONE, 
> {&gomp_initial_icvs_none.dyn_var}, &parse_boolean},
> +
> +  {ENTRY ("OMP_TEAMS_THREAD_LIMIT_DEV"), {OMP_TEAMS_THREAD_LIMIT_DEV_}, 
> GOMP_ENV_VAR_SUFFIX_DEV, {&gomp_initial_icvs_dev.teams_thread_limit_var, 
> false}, &parse_int},
> +  {ENTRY ("OMP_TEAMS_THREAD_LIMIT_ALL"), {OMP_TEAMS_THREAD_LIMIT_DEV_}, 
> GOMP_ENV_VAR_SUFFIX_ALL, {&gomp_initial_icvs_all.teams_thread_limit_var, 
> false}, &parse_int},
> +  {ENTRY ("OMP_TEAMS_THREAD_LIMIT"), {OMP_TEAMS_THREAD_LIMIT_DEV_}, 
> GO

Re: doc: extend --{enable,disable}-libsanitizer description [PR 105614]

2022-06-30 Thread Rainer Orth
Hi Xi,

> A documentation improvement with no code change.  OK for trunk?
>
> -- >8 --
>
> We are receiving several reports that people are (mis)using
> --enable-libsanitizer option, which was not documented by GCC
> installation doc.  It forces to build libsanitizer even for unsupported
> targets, causing build failure.  Extend the --disable-libsanitizer
> description to also include --enable-libsanitizer, and warn about the
> possible consequences if it's enabled explicitly.

this behaviour isn't specific to libsanitizer at all: it probably
pertains to every target library using configure.tgt (libatomic,
libgomp, libitm, liboffloadmic, libphobos, libsanitizer, libvtv).  I
think it should be documented in a more generic way.

> gcc/ChangeLog:
>
>   PR sanitizer/105614
>   * doc/install.texi: Document --enable-libsanitizer and possible
>   consequences.

It's better to specify which part of install.texi is affected.  In this
case, this would be

* doc/install.texi (Configuration, --disable-libsanitizer): ...

Thanks.
Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


c++: Note macro locations

2022-06-30 Thread Nathan Sidwell via Gcc-patches


In order to prune ordinary locations, we need to note the locations of 
macros we'll be writing out.  This rearranges the macro processing to 
achieve that.  Also drop an unneeded parameter from macro reading & writing.


Fix some it's/its errors.

nathan

--
Nathan Sidwell

From 47e36785cd2ba35a577b0678a2ac185288eb9e52 Mon Sep 17 00:00:00 2001
From: Nathan Sidwell 
Date: Mon, 27 Jun 2022 07:51:12 -0700
Subject: [PATCH] c++: Note macro locations

In order to prune ordinary locations, we need to note the locations of
macros we'll be writing out.  This rearranges the macro processing to achieve
that.  Also drop an unneeded parameter from macro reading & writing.

Fix some it's/its errors.

	gcc/cp/
	* module.cc (module_state::write_define): Drop located param.
	(module_state::read_define): Likewise.
	(module_state::prepare_macros): New, broken out of ...
	(module_state::write_macros): ... here.  Adjust.
	(module_state::write_begin): Adjust.
	gcc/testsuite/
	* g++.dg/modules/inext-1.H: Check include-next happened.
---
 gcc/cp/module.cc   | 98 +-
 gcc/testsuite/g++.dg/modules/inext-1.H |  1 +
 2 files changed, 67 insertions(+), 32 deletions(-)

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index 68a7ce53ee4..238a5eb74d2 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -2281,7 +2281,7 @@ public:
 EK_EXPLICIT_HWM,  
 EK_BINDING = EK_EXPLICIT_HWM, /* Implicitly encoded.  */
 EK_FOR_BINDING,	/* A decl being inserted for a binding.  */
-EK_INNER_DECL,	/* A decl defined outside of it's imported
+EK_INNER_DECL,	/* A decl defined outside of its imported
 			   context.  */
 EK_DIRECT_HWM = EK_PARTIAL + 1,
 
@@ -3663,9 +3663,10 @@ class GTY((chain_next ("%h.parent"), for_user)) module_state {
   bool read_macro_maps (unsigned);
 
  private:
-  void write_define (bytes_out &, const cpp_macro *, bool located = true);
-  cpp_macro *read_define (bytes_in &, cpp_reader *, bool located = true) const;
-  unsigned write_macros (elf_out *to, cpp_reader *, unsigned *crc_ptr);
+  void write_define (bytes_out &, const cpp_macro *);
+  cpp_macro *read_define (bytes_in &, cpp_reader *) const;
+  vec *prepare_macros (cpp_reader *);
+  unsigned write_macros (elf_out *to, vec *, unsigned *crc_ptr);
   bool read_macros ();
   void install_macros ();
 
@@ -7136,7 +7137,7 @@ trees_in::tree_node_vals (tree t)
 }
 
 
-/* If T is a back reference, fixed reference or NULL, write out it's
+/* If T is a back reference, fixed reference or NULL, write out its
code and return WK_none.  Otherwise return WK_value if we must write
by value, or WK_normal otherwise.  */
 
@@ -10605,7 +10606,7 @@ trees_out::key_mergeable (int tag, merge_kind mk, tree decl, tree inner,
 
 /* DECL is a new declaration that may be duplicated in OVL.  Use RET &
ARGS to find its clone, or NULL.  If DECL's DECL_NAME is NULL, this
-   has been found by a proxy.  It will be an enum type located by it's
+   has been found by a proxy.  It will be an enum type located by its
first member.
 
We're conservative with matches, so ambiguous decls will be
@@ -11615,7 +11616,7 @@ trees_in::read_var_def (tree decl, tree maybe_template)
 }
 
 /* If MEMBER doesn't have an independent life outside the class,
-   return it (or it's TEMPLATE_DECL).  Otherwise NULL.  */
+   return it (or its TEMPLATE_DECL).  Otherwise NULL.  */
 
 static tree
 member_owned_by_class (tree member)
@@ -15405,7 +15406,7 @@ module_state::read_entities (unsigned count, unsigned lwm, unsigned hwm)
sure the specified entities are loaded.
 
An optimization might be to have a flag in each key-entity saying
-   that it's top key might be in the entity table.  It's not clear to
+   that its top key might be in the entity table.  It's not clear to
me how to set that flag cheaply -- cheaper than just looking.
 
FIXME: It'd be nice to have a bit in decls to tell us whether to
@@ -16444,7 +16445,7 @@ module_state::read_macro_maps (unsigned num_macro_locs)
 /* Serialize the definition of MACRO.  */
 
 void
-module_state::write_define (bytes_out &sec, const cpp_macro *macro, bool located)
+module_state::write_define (bytes_out &sec, const cpp_macro *macro)
 {
   sec.u (macro->count);
 
@@ -16453,8 +16454,7 @@ module_state::write_define (bytes_out &sec, const cpp_macro *macro, bool located
   sec.b (macro->syshdr);
   sec.bflush ();
 
-  if (located)
-write_location (sec, macro->line);
+  write_location (sec, macro->line);
   if (macro->fun_like)
 {
   sec.u (macro->paramc);
@@ -16467,8 +16467,7 @@ module_state::write_define (bytes_out &sec, const cpp_macro *macro, bool located
   for (unsigned ix = 0; ix != macro->count; ix++)
 {
     const cpp_token *token = &macro->exp.tokens[ix];
-  if (located)
-	write_location (sec, token->src_loc);
+  write_location (sec, token->src_loc);
   sec.u (token->type);
   sec.u (token->flags);
   switch (cpp_token_val_index (token))
@@ -16533,11 +16532

Re: [x86 PATCH take #2] Double word logical operation clean-ups in i386.md.

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Thu, Jun 30, 2022 at 12:56 PM Roger Sayle  wrote:
>
>
> Hi Uros,
> Many thanks for your review of the "double word logical operation clean-up"
> patch.  The revision below incorporates the majority of your feedback, but
> with one or two exceptions (required to allow the patch to bootstrap) that
> I thought I'd double check with you before pushing.
>
> Firstly, great catch that we no longer need to test rtx_equal (operands[0],
> operands[1]) when moving a splitter from before reload to after reload, as
> this is guaranteed by the "0" constraints.  I've cleaned this up in all the
> doubleword splitters (including the
>  case that's now moved).  Also, as you've suggested, this patch uses
> a pair of define_insn_and_split for ANDN, one for TARGET_BMI (split
> post-reload) and the other for !TARGET_BMI (that's lowered rather than split,
> pre-reload/post-STV).
>
> Unfortunately, the "force_reg of tricky immediate constants" checks really are
> required for these expanders.  I agree normally the predicate is
> checked/guaranteed for a define_insn, but in this case the gen_iordi3 function
> and related expanders are frequently called directly by the middle-end or from
> i386-expand, which bypasses the checks made by the later RTL passes.  When
> given arbitrary immediate constants, this results in ICEs from insns not
> matching their predicates soon after expand (breaking bootstrap with an ICE).
> It's only "standard name" expanders that require this treatment;
> define_insn{_and_split} templates do enforce their predicates.
>
> And finally, we can't/shouldn't use  in the actual
> doubleword splitters, as the mode being iterated over is DWIH (not DWI),
> where we require the predicate for the corresponding  mode.  It turns
> out that it's always appropriate to use x86_64_hilo_general_operand wherever
> we use the "r" constraint, and that's used consistently in this patch.
>
> I hope these exceptions are acceptable.  The attached revised patch has
> been tested on x86_64-pc-linux-gnu with make bootstrap and make -k check
> both with and without --target_board=unix{-m32} with no new failures.
> Are these revisions OK for mainline?

Thanks for your explanation of the particularities of the patch!

Yes, the patch is OK.

Thanks,
Uros.

>
> 2022-06-30  Roger Sayle  
> Uroš Bizjak  
>
> gcc/ChangeLog
> * config/i386/i386.md (general_szext_operand): Add TImode
> support using x86_64_hilo_general_operand predicate.
> (*cmp_doubleword): Use x86_64_hilo_general_operand predicate.
> (*add3_doubleword): Improved optimization of zero addition.
> (and3): Use SDWIM mode iterator to add support for double
> word bit-wise AND in TImode.  Use force_reg when double word
> immediate operand isn't x86_64_hilo_general_operand.
> (and3_doubleword): Generalized from anddi3_doubleword and
> converted into a post-reload splitter.
> (*andndi3_doubleword): Old define_insn deleted.
> (*andn3_doubleword_bmi): New define_insn_and_split for
> TARGET_BMI that splits post-reload.
> (*andn3_doubleword): New define_insn_and_split for
> !TARGET_BMI, that lowers/splits before reload.
> (3): Use SDWIM mode iterator to add support for
> double word bit-wise XOR and bit-wise IOR in TImode.  Use
> force_reg when double word immediate operand isn't
> x86_64_hilo_general_operand.
> (*di3_doubleword): Generalized from di3_doubleword.
> (one_cmpl2): Use SDWIM mode iterator to add support for
> double word bit-wise NOT in TImode.
> (one_cmpl2_doubleword): Generalize from one_cmpldi2_doubleword
> and converted into a post-reload splitter.
>
>
> Thanks again,
> Roger
> --
>
> > -Original Message-
> > From: Uros Bizjak 
> > Sent: 28 June 2022 16:38
> > To: Roger Sayle 
> > Cc: gcc-patches@gcc.gnu.org
> > Subject: Re: [x86 PATCH] Double word logical operation clean-ups in i386.md.
> >
> > On Tue, Jun 28, 2022 at 1:34 PM Roger Sayle 
> > wrote:
> > >
> > >
> > > Hi Uros,
> > > As you've requested/suggested, here's a patch that tidies up and
> > > unifies doubleword handling in i386.md; converting all doubleword
> > > splitters for logic operations to post-reload form, generalizing their
> > > define_insn_and_split templates to  form (supporting TARGET_64BIT
> > > ? TImode : DImode), and where required tweaking the corresponding
> > > expanders to use SDWIM to support TImode doubleword operations.  These
> > > changes incorporate your feedback from
> > > https://gcc.gnu.org/pipermail/gcc-patches/2022-June/596205.html
> > > where I included many/several of these clean-ups, in a patch to add a
> > > new optimization.  I agree, it's better to split these out (this
> > > patch), and I'll resubmit the (smaller) optimization patch as a
> > > follow-up.
> > >
> > > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> 

[Patch] Fortran: Cleanup OpenMP match{o,s,do,ds} macros

2022-06-30 Thread Tobias Burnus

I initially thought that I need another set of macros - and started with
this cleanup. I then realized that I don't.

However, I still wonder whether this cleanup makes sense even if only
4 macros are affected.

OK for mainline - or should I put that patch into the bin?

Tobias
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; Limited liability company; Managing directors: Thomas
Heurung, Frank Thürauf; Registered office: München; Commercial register:
München, HRB 106955
Fortran: Cleanup OpenMP match{o,s,do,ds} macros

Create match_internal as universal macro and use it to
define match{o,s,do,ds}

gcc/fortran/ChangeLog:

* parse.cc (match_internal): New macro.
(matcho, matchs, matchds, matchdo): Use it.

diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc
index 7356d1b5a3a..ebe27a7569f 100644
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -745,81 +745,58 @@ decode_oacc_directive (void)
   return ST_GET_FCN_CHARACTERISTICS;
 }
 
-/* Like match, but set a flag simd_matched if keyword matched
-   and if spec_only, goto do_spec_only without actually matching.  */
-#define matchs(keyword, subr, st)\
-do {			\
-  match m2;			\
-  if (spec_only && gfc_match (keyword) == MATCH_YES)	\
-	goto do_spec_only;	\
-  if ((m2 = match_word_omp_simd (keyword, subr, &old_locus,	\
-			   &simd_matched)) == MATCH_YES)	\
-	{			\
-	  ret = st;		\
-	  goto finish;		\
-	}			\
-  else if (m2 == MATCH_ERROR)\
-	goto error_handling;	\
-  else			\
-	undo_new_statement ();  	\
+/* Like match, but with some special handling:
+   - dosimd - if false, don't do anything if not -fopenmp,
+ otherwise do match_word_omp_simd matching
+   - if dospec_only: if spec_only, goto do_spec_only after matching.
+
+   If the directive matched but the clauses failed, do not start
+   matching the next directive in the same switch statement.  */
+
+#define match_internal(match_simd, match_spec_only, keyword, subr, st)	\
+do {\
+  match m2;\
+  if (!match_simd && !flag_openmp)	\
+	;\
+  else if (match_spec_only		\
+	   && spec_only		\
+	   && gfc_match (keyword) == MATCH_YES)			\
+	goto do_spec_only;		\
+  else if (!match_simd			\
+	   && ((m2 = match_word (keyword, subr, &old_locus))	\
+		   == MATCH_YES))	\
+	{\
+	  ret = st;			\
+	  goto finish;			\
+	}\
+  else if (match_simd			\
+	   && (m2 = match_word_omp_simd (keyword, subr, &old_locus,	\
+	 &simd_matched)) == MATCH_YES)  \
+	{\
+	  ret = st;			\
+	  goto finish;			\
+	}\
+  else if (m2 == MATCH_ERROR)	\
+	goto error_handling;		\
+  else\
+	undo_new_statement ();		\
 } while (0)
 
-/* Like match, but don't match anything if not -fopenmp
-   and if spec_only, goto do_spec_only without actually matching.  */
-/* If the directive matched but the clauses failed, do not start
-   matching the next directive in the same switch statement. */
-#define matcho(keyword, subr, st)\
-do {			\
-  match m2;			\
-  if (!flag_openmp)		\
-	;			\
-  else if (spec_only && gfc_match (keyword) == MATCH_YES)	\
-	goto do_spec_only;	\
-  else if ((m2 = match_word (keyword, subr, &old_locus))	\
-	   == MATCH_YES)	\
-	{			\
-	  ret = st;		\
-	  goto finish;		\
-	}			\
-  else if (m2 == MATCH_ERROR)\
-	goto error_handling;	\
-  else			\
-	undo_new_statement ();  	\
-} while (0)
+/* Like match. Does simd matching; sets flag simd_matched if keyword matched. */
+#define matchds(keyword, subr, st)			\
+  match_internal(true, false, keyword, subr, st)
 
-/* Like match, but set a flag simd_matched if keyword matched.  */
-#define matchds(keyword, subr, st)\
-do {			\
-  match m2;			\
-  if ((m2 = match_word_omp_simd (keyword, subr, &old_locus,	\
-			   &simd_matched)) == MATCH_YES)	\
-	{			\
-	  ret = st;		\
-	  goto finish;		\
-	}			\
-  else if (m2 == MATCH_ERROR)\
-	goto error_handling;	\
-  else			\
-	undo_new_statement ();  	\
-} while (0)
+/* Like matchds, but also honors spec_only.  */
+#define matchs(keyword, subr, st)			\
+  match_internal(true, true, keyword, subr, st)
 
 /* Like match, but don't match anything if not -fopenmp.  */
-#define matchdo(keyword, subr, st)\
-do {			\
-  match m2;			\
-  if (!flag_openmp)		\
-	;			\
-  else if ((m2 = match_word (keyword, subr, &old_locus))	\
-	   == MATCH_YES)	\
-	{			\
-	  ret = st;		\
-	  goto finish;		\
-	}			\
-  else if (m2 == MATCH_ERROR)\
-	goto error_handling;	\
-  else			\
-

[COMMITTED] Implement ggc_vrange_allocator.

2022-06-30 Thread Aldy Hernandez via Gcc-patches
This patch makes the vrange_allocator an abstract class, and uses it
to implement the obstack allocator as well as a new GC allocator.

The GC bits will be used to implement the vrange storage class for
global ranges, which will be contributed in the next week or so.

Tested and benchmarked on x86-64 Linux.

gcc/ChangeLog:

* gimple-range-cache.cc (block_range_cache::block_range_cache):
Rename vrange_allocator to obstack_vrange_allocator.
(ssa_global_cache::ssa_global_cache): Same.
* gimple-range-edge.h (class gimple_outgoing_range): Same.
* gimple-range-infer.h (class infer_range_manager): Same.
* value-range.h (class vrange_allocator): Make abstract.
(class obstack_vrange_allocator): Inherit from vrange_allocator.
(class ggc_vrange_allocator): New.
---
 gcc/gimple-range-cache.cc |  4 +--
 gcc/gimple-range-edge.h   |  2 +-
 gcc/gimple-range-infer.h  |  2 +-
 gcc/value-range.h | 57 ---
 4 files changed, 40 insertions(+), 25 deletions(-)

diff --git a/gcc/gimple-range-cache.cc b/gcc/gimple-range-cache.cc
index 5df744184c4..5bb52d5f70f 100644
--- a/gcc/gimple-range-cache.cc
+++ b/gcc/gimple-range-cache.cc
@@ -320,7 +320,7 @@ block_range_cache::block_range_cache ()
   bitmap_obstack_initialize (&m_bitmaps);
   m_ssa_ranges.create (0);
   m_ssa_ranges.safe_grow_cleared (num_ssa_names);
-  m_range_allocator = new vrange_allocator;
+  m_range_allocator = new obstack_vrange_allocator;
 }
 
 // Remove any m_block_caches which have been created.
@@ -478,7 +478,7 @@ block_range_cache::dump (FILE *f, basic_block bb, bool 
print_varying)
 ssa_global_cache::ssa_global_cache ()
 {
   m_tab.create (0);
-  m_range_allocator = new vrange_allocator;
+  m_range_allocator = new obstack_vrange_allocator;
 }
 
 // Deconstruct a global cache.
diff --git a/gcc/gimple-range-edge.h b/gcc/gimple-range-edge.h
index a9c4af8715b..c81b943dae6 100644
--- a/gcc/gimple-range-edge.h
+++ b/gcc/gimple-range-edge.h
@@ -47,7 +47,7 @@ private:
 
   int m_max_edges;
   hash_map *m_edge_table;
-  vrange_allocator m_range_allocator;
+  obstack_vrange_allocator m_range_allocator;
 };
 
 // If there is a range control statement at the end of block BB, return it.
diff --git a/gcc/gimple-range-infer.h b/gcc/gimple-range-infer.h
index aafa8bb74f0..bf27d0d3423 100644
--- a/gcc/gimple-range-infer.h
+++ b/gcc/gimple-range-infer.h
@@ -78,7 +78,7 @@ private:
   bitmap m_seen;
   bitmap_obstack m_bitmaps;
   struct obstack m_list_obstack;
-  vrange_allocator m_range_allocator;
+  obstack_vrange_allocator m_range_allocator;
 };
 
 #endif // GCC_GIMPLE_RANGE_SIDE_H
diff --git a/gcc/value-range.h b/gcc/value-range.h
index dc6f6b0f935..627d221fe0f 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -903,39 +903,54 @@ vrp_val_min (const_tree type)
 class vrange_allocator
 {
 public:
-  vrange_allocator ();
-  ~vrange_allocator ();
+  vrange_allocator () { }
+  virtual ~vrange_allocator () { }
   // Allocate a range of TYPE.
   vrange *alloc_vrange (tree type);
   // Allocate a memory block of BYTES.
-  void *alloc (unsigned bytes);
+  virtual void *alloc (unsigned bytes) = 0;
+  virtual void free (void *p) = 0;
   // Return a clone of SRC.
   template  T *clone (const T &src);
 private:
   irange *alloc_irange (unsigned pairs);
-  DISABLE_COPY_AND_ASSIGN (vrange_allocator);
-  struct obstack m_obstack;
+  void operator= (const vrange_allocator &) = delete;
 };
 
-inline
-vrange_allocator::vrange_allocator ()
+class obstack_vrange_allocator : public vrange_allocator
 {
-  obstack_init (&m_obstack);
-}
-
-inline
-vrange_allocator::~vrange_allocator ()
-{
-  obstack_free (&m_obstack, NULL);
-}
-
-// Provide a hunk of memory from the obstack.
+public:
+  obstack_vrange_allocator ()
+  {
+obstack_init (&m_obstack);
+  }
+  virtual ~obstack_vrange_allocator () final override
+  {
+obstack_free (&m_obstack, NULL);
+  }
+  virtual void *alloc (unsigned bytes) final override
+  {
+return obstack_alloc (&m_obstack, bytes);
+  }
+  virtual void free (void *) final override { }
+private:
+  obstack m_obstack;
+};
 
-inline void *
-vrange_allocator::alloc (unsigned bytes)
+class ggc_vrange_allocator : public vrange_allocator
 {
-  return obstack_alloc (&m_obstack, bytes);
-}
+public:
+  ggc_vrange_allocator () { }
+  virtual ~ggc_vrange_allocator () final override { }
+  virtual void *alloc (unsigned bytes) final override
+  {
+return ggc_internal_alloc (bytes);
+  }
+  virtual void free (void *p) final override
+  {
+return ggc_free (p);
+  }
+};
 
 // Return a new range to hold ranges of TYPE.  The newly allocated
 // range is initialized to VR_UNDEFINED.
-- 
2.36.1
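
For readers following along outside the GCC tree, the allocator split in this patch can be sketched roughly as below. This is only an illustration under stated assumptions, not the committed code: range_allocator, arena_allocator and heap_allocator are hypothetical names, std::malloc stands in for both the obstack and the GC heap, and only the alloc/free part of the interface is modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical stand-in for the abstract vrange_allocator interface.
class range_allocator
{
public:
  range_allocator () = default;
  virtual ~range_allocator () = default;
  range_allocator (const range_allocator &) = delete;
  range_allocator &operator= (const range_allocator &) = delete;

  // Allocate a memory block of BYTES.
  virtual void *alloc (std::size_t bytes) = 0;
  // Release a block; a no-op for allocators that free everything at once.
  virtual void free (void *p) = 0;
};

// Arena-style allocator (plays the role of the obstack variant):
// individual frees do nothing, all blocks are released in the destructor.
class arena_allocator final : public range_allocator
{
public:
  ~arena_allocator () override
  {
    for (void *p : m_blocks)
      std::free (p);
  }
  void *alloc (std::size_t bytes) override
  {
    void *p = std::malloc (bytes);
    if (p)
      m_blocks.push_back (p);
    return p;
  }
  void free (void *) override { }
  std::size_t live_blocks () const { return m_blocks.size (); }
private:
  std::vector<void *> m_blocks;
};

// Allocator with per-block release (plays the role of the GC variant;
// here std::malloc/std::free are an assumption replacing the GC calls).
class heap_allocator final : public range_allocator
{
public:
  void *alloc (std::size_t bytes) override
  {
    void *p = std::malloc (bytes);
    if (p)
      ++m_live;
    return p;
  }
  void free (void *p) override
  {
    if (p)
      {
	--m_live;
	std::free (p);
      }
  }
  int live () const { return m_live; }
private:
  int m_live = 0;
};
```

Callers only ever see a range_allocator pointer or reference, so a cache can switch between the two storage strategies without changing its own code.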



Re: [Patch] Fortran: Cleanup OpenMP match{o,s,do,ds} macros

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 30, 2022 at 02:30:24PM +0200, Tobias Burnus wrote:
> OK for mainline - or should I put that patch into the bin?

Not sure; the other match* macros aren't wrappers around another macro
either, and with the internal macro we'll need to parse more code, many
times over (even if we then fold the conditions quite early).

> +/* Like match, but with some special handling:
> +   - dosimd - if false, don't do anything if not -fopenmp,
> + otherwise do match_word_omp_simd matching
> +   - if dospec_only: if spec_only, goto do_spec_only after matching.
> +
> +   If the directive matched but the clauses failed, do not start
> +   matching the next directive in the same switch statement.  */
> +
> +#define match_internal(match_simd, match_spec_only, keyword, subr, st)   
> \

It at least should be matcho_internal or have the OpenMP stuff in the name
somehow, because it is quite OpenMP specific and isn't used by match or
matcha etc.
> +/* Like match. Does simd matching; sets flag simd_matched if keyword 
> matched. */

Twice missing 2 spaces after .

Jakub



Re: [Patch] OpenMP, libgomp, gimple: omp_get_max_teams, omp_set_num_teams, and omp_{gs}et_teams_thread_limit on offload devices

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Thu, Apr 14, 2022 at 06:06:24PM +0200, Marcel Vollweiler wrote:
> --- a/gcc/gimplify.cc
> +++ b/gcc/gimplify.cc
> @@ -13994,7 +13994,7 @@ optimize_target_teams (tree target, gimple_seq *pre_p)
>struct gimplify_omp_ctx *target_ctx = gimplify_omp_ctxp;
>  
>if (teams == NULL_TREE)
> -num_teams_upper = integer_one_node;
> +num_teams_upper = integer_minus_two_node;

No, please don't introduce this; it is quite costly to have GC trees
like integer_one_node, so they should stay reserved for the most commonly
used numbers, and -2 isn't one of them.  Just use
build_int_cst (integer_type_node, -2).

> --- a/gcc/tree-core.h
> +++ b/gcc/tree-core.h
> @@ -642,6 +642,7 @@ enum tree_index {
>TI_INTEGER_ONE,
>TI_INTEGER_THREE,
>TI_INTEGER_MINUS_ONE,
> +  TI_INTEGER_MINUS_TWO,
>TI_NULL_POINTER,
>  
>TI_SIZE_ZERO,
> diff --git a/gcc/tree.cc b/gcc/tree.cc
> index 8f83ea1..8cb474d 100644
> --- a/gcc/tree.cc
> +++ b/gcc/tree.cc
> @@ -9345,6 +9345,7 @@ build_common_tree_nodes (bool signed_char)
>integer_one_node = build_int_cst (integer_type_node, 1);
>integer_three_node = build_int_cst (integer_type_node, 3);
>integer_minus_one_node = build_int_cst (integer_type_node, -1);
> +  integer_minus_two_node = build_int_cst (integer_type_node, -2);
>  
>size_zero_node = size_int (0);
>size_one_node = size_int (1);
> diff --git a/gcc/tree.h b/gcc/tree.h
> index cea49a5..1aeb009 100644
> --- a/gcc/tree.h
> +++ b/gcc/tree.h
> @@ -4206,6 +4206,7 @@ tree_strip_any_location_wrapper (tree exp)
>  #define integer_one_node global_trees[TI_INTEGER_ONE]
>  #define integer_three_node  global_trees[TI_INTEGER_THREE]
>  #define integer_minus_one_node   
> global_trees[TI_INTEGER_MINUS_ONE]
> +#define integer_minus_two_node   
> global_trees[TI_INTEGER_MINUS_TWO]
>  #define size_zero_node   global_trees[TI_SIZE_ZERO]
>  #define size_one_nodeglobal_trees[TI_SIZE_ONE]
>  #define bitsize_zero_nodeglobal_trees[TI_BITSIZE_ZERO]

And drop the above 3 hunks.

> --- a/libgomp/config/gcn/icv-device.c
> +++ b/libgomp/config/gcn/icv-device.c
> @@ -37,6 +37,7 @@ volatile int GOMP_DEFAULT_DEVICE_VAR;
>  volatile int GOMP_MAX_ACTIVE_LEVELS_VAR;
>  volatile omp_proc_bind_t GOMP_BIND_VAR;
>  volatile int GOMP_NTEAMS_VAR;
> +volatile int GOMP_TEAMS_THREAD_LIMIT_VAR;

I really don't like this copying of individual ICVs one by one to the
device; instead, copy a struct containing them and access fields in that struct.

> --- a/libgomp/libgomp-plugin.h
> +++ b/libgomp/libgomp-plugin.h
> @@ -116,6 +116,7 @@ struct addr_pair
>  #define GOMP_MAX_ACTIVE_LEVELS_VAR __gomp_max_active_levels
>  #define GOMP_BIND_VAR __gomp_bind
>  #define GOMP_NTEAMS_VAR __gomp_nteams
> +#define GOMP_TEAMS_THREAD_LIMIT_VAR __gomp_teams_thread_limit_var

Likewise here.

> @@ -527,13 +538,19 @@ struct gomp_icv_list {
>  
>  extern void *gomp_get_icv_value_ptr (struct gomp_icv_list **list,
>int device_num);
> -extern struct gomp_icv_list *gomp_run_sched_var_dev_list;
> -extern struct gomp_icv_list *gomp_run_sched_chunk_size_dev_list;
> +extern struct gomp_icv_list* gomp_add_device_specific_icv (int dev_num,
> +size_t size,
> + struct 
> gomp_icv_list **list);
> +extern struct gomp_icv_list *gomp_initial_run_sched_var_dev_list;
> +extern struct gomp_icv_list *gomp_initial_run_sched_chunk_size_dev_list;
> +extern struct gomp_icv_list *gomp_initial_max_active_levels_var_dev_list;
> +extern struct gomp_icv_list *gomp_initial_proc_bind_var_dev_list;
> +extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_dev_list;
> +extern struct gomp_icv_list *gomp_initial_proc_bind_var_list_len_dev_list;
> +extern struct gomp_icv_list *gomp_initial_nteams_var_dev_list;
> +
>  extern struct gomp_icv_list *gomp_nteams_var_dev_list;
> -extern struct gomp_icv_list *gomp_max_active_levels_var_dev_list;
> -extern struct gomp_icv_list *gomp_proc_bind_var_dev_list;
> -extern struct gomp_icv_list *gomp_proc_bind_var_list_dev_list;
> -extern struct gomp_icv_list *gomp_proc_bind_var_list_len_dev_list;
> +extern struct gomp_icv_list *gomp_teams_thread_limit_var_dev_list;

Nor these per-var lists.  For a specific device, walk the list with
all the vars in it, start with the most specific (matching dev number),
then just dev and then all and fill in from it what is going to be copied.
> --- a/libgomp/plugin/plugin-gcn.c
> +++ b/libgomp/plugin/plugin-gcn.c
> @@ -572,7 +572,8 @@ static char *GOMP_ICV_STRINGS[] =
>XSTRING (GOMP_DYN_VAR),
>XSTRING (GOMP_MAX_ACTIVE_LEVELS_VAR),
>XSTRING (GOMP_BIND_VAR),
> -  XSTRING (GOMP_NTEAMS_VAR)
> +  XSTRING (GOMP_NTEAMS_VAR),
> +  XSTRING (GOMP_TEAMS_THREAD_LIMIT_VAR)

Then you don't need to e.g. track the names of the individual vars, just
one for the whole ICV block.

Jakub



[committed] libstdc++: Fix experimental::filesystem::status on Windows [PR88881]

2022-06-30 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and x86_64-mingw, pushed to trunk.

-- >8 --

Although the Filesystem TS isn't properly supported on Windows (unlike
the C++17 Filesystem lib), most tests do pass. Two of the failures are
due to PR 88881 which was only fixed for std::filesystem not the TS.
This applies the fix to the TS implementation too.

libstdc++-v3/ChangeLog:

PR libstdc++/88881
* src/filesystem/ops.cc (has_trailing_slash): New helper
function.
(fs::status): Strip trailing slashes.
(fs::symlink_status): Likewise.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Clean the environment before each test and use TMP instead of
TMPDIR so the test passes on Windows.
---
 libstdc++-v3/src/filesystem/ops.cc| 56 ++-
 .../operations/temp_directory_path.cc |  6 +-
 2 files changed, 59 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index 98ddff5a66e..896a4918ace 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1176,13 +1176,43 @@ fs::space(const path& p, error_code& ec) noexcept
   return info;
 }
 
+#if _GLIBCXX_FILESYSTEM_IS_WINDOWS
+static bool has_trailing_slash(const fs::path& p)
+{
+  wchar_t c = p.native().back();
+  return c == '/' || c == L'\\';
+}
+#endif
+
 #ifdef _GLIBCXX_HAVE_SYS_STAT_H
 fs::file_status
 fs::status(const fs::path& p, error_code& ec) noexcept
 {
   file_status status;
+
+  auto str = p.c_str();
+
+#if _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  // stat() fails if there's a trailing slash (PR 88881)
+  path p2;
+  if (p.has_relative_path() && has_trailing_slash(p))
+{
+  __try
+   {
+ p2 = p.parent_path();
+ str = p2.c_str();
+   }
+  __catch(const bad_alloc&)
+   {
+ ec = std::make_error_code(std::errc::not_enough_memory);
+ return status;
+   }
+  str = p2.c_str();
+}
+#endif
+
   stat_type st;
-  if (posix::stat(p.c_str(), &st))
+  if (posix::stat(str, &st))
 {
   int err = errno;
   ec.assign(err, std::generic_category());
@@ -1205,8 +1235,30 @@ fs::file_status
 fs::symlink_status(const fs::path& p, std::error_code& ec) noexcept
 {
   file_status status;
+
+  auto str = p.c_str();
+
+#if _GLIBCXX_FILESYSTEM_IS_WINDOWS
+  // stat() fails if there's a trailing slash (PR 88881)
+  path p2;
+  if (p.has_relative_path() && has_trailing_slash(p))
+{
+  __try
+   {
+ p2 = p.parent_path();
+ str = p2.c_str();
+   }
+  __catch(const bad_alloc&)
+   {
+ ec = std::make_error_code(std::errc::not_enough_memory);
+ return status;
+   }
+  str = p2.c_str();
+}
+#endif
+
   stat_type st;
-  if (posix::lstat(p.c_str(), &st))
+  if (posix::lstat(str, &st))
 {
   int err = errno;
   ec.assign(err, std::generic_category());
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/temp_directory_path.cc
 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/temp_directory_path.cc
index 9e9cd44d460..c2945c90866 100644
--- 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/temp_directory_path.cc
+++ 
b/libstdc++-v3/testsuite/experimental/filesystem/operations/temp_directory_path.cc
@@ -105,6 +105,8 @@ test03()
   if (!__gnu_test::permissions_are_testable())
 return;
 
+  clean_env();
+
   auto p = __gnu_test::nonexistent_path();
   create_directories(p/"tmp");
   permissions(p, fs::perms::none);
@@ -129,8 +131,10 @@ test03()
 void
 test04()
 {
+  clean_env();
+
   __gnu_test::scoped_file f;
-  set_env("TMPDIR", f.path.string());
+  set_env("TMP", f.path.string());
   std::error_code ec;
   auto r = fs::temp_directory_path(ec);
   VERIFY( ec == std::make_error_code(std::errc::not_a_directory) );
-- 
2.36.1



[committed] libstdc++: Improve exceptions thrown from fs::temp_directory_path

2022-06-30 Thread Jonathan Wakely via Gcc-patches
Tested x86_64-linux and x86_64-mingw, pushed to trunk.

-- >8 --

Currently the throwing overload of fs::temp_directory_path() will
discard the path that was obtained from the environment. When it fails
because the path doesn't resolve to a directory you get an unhelpful
error like:

  filesystem error: temp_directory_path: Not a directory

It would be better to also print the path in that case, e.g.

  filesystem error: temp_directory_path: Not a directory [/home/bob/tmp]
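
The same idea can be tried with the public std::filesystem API. The sketch below is only an illustration: require_directory is a hypothetical helper (not a libstdc++ function) that mirrors the "check the candidate, then throw with the path attached" structure, so that the offending path is available from the exception's path1().

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper: return CANDIDATE if it names a directory,
// otherwise throw a filesystem_error that carries the path.
fs::path
require_directory (const fs::path &candidate)
{
  std::error_code ec;
  fs::file_status st = fs::status (candidate, ec);
  // status() succeeded but the path is something other than a directory.
  if (!ec && !fs::is_directory (st))
    ec = std::make_error_code (std::errc::not_a_directory);
  if (ec)
    // The three-argument constructor stores CANDIDATE as path1().
    throw fs::filesystem_error ("require_directory", candidate, ec);
  return candidate;
}
```

A caller that catches fs::filesystem_error can then report e.path1() alongside e.code(), instead of only a bare "Not a directory".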

libstdc++-v3/ChangeLog:

* src/c++17/fs_ops.cc (fs::temp_directory_path()): Include path
in exception.
(fs::temp_directory_path(error_code&)): Rearrange to more
closely match the structure of the first overload.
* src/filesystem/ops.cc (fs::temp_directory_path): Likewise.
* testsuite/27_io/filesystem/operations/temp_directory_path.cc:
Check that exception contains the path.
* testsuite/experimental/filesystem/operations/temp_directory_path.cc:
Likewise.
---
 libstdc++-v3/src/c++17/fs_ops.cc  | 34 +--
 libstdc++-v3/src/filesystem/ops.cc| 31 ++---
 .../operations/temp_directory_path.cc |  5 +++
 .../operations/temp_directory_path.cc |  5 +++
 4 files changed, 52 insertions(+), 23 deletions(-)

diff --git a/libstdc++-v3/src/c++17/fs_ops.cc b/libstdc++-v3/src/c++17/fs_ops.cc
index 435368fa5c5..ed5e9f7d5cf 100644
--- a/libstdc++-v3/src/c++17/fs_ops.cc
+++ b/libstdc++-v3/src/c++17/fs_ops.cc
@@ -1571,25 +1571,37 @@ fs::path
 fs::temp_directory_path()
 {
   error_code ec;
-  path tmp = temp_directory_path(ec);
+  path p = fs::get_temp_directory_from_env(ec);
+  if (!ec)
+{
+  auto st = status(p, ec);
+  if (!ec && !is_directory(st))
+   ec = std::make_error_code(std::errc::not_a_directory);
+}
   if (ec)
-_GLIBCXX_THROW_OR_ABORT(filesystem_error("temp_directory_path", ec));
-  return tmp;
+{
+  if (p.empty())
+   _GLIBCXX_THROW_OR_ABORT(filesystem_error("temp_directory_path", ec));
+  else
+   _GLIBCXX_THROW_OR_ABORT(filesystem_error("temp_directory_path", p, ec));
+}
+  return p;
 }
 
 fs::path
 fs::temp_directory_path(error_code& ec)
 {
   path p = fs::get_temp_directory_from_env(ec);
-  if (ec)
-return p;
-  auto st = status(p, ec);
-  if (ec)
-p.clear();
-  else if (!is_directory(st))
+  if (!ec)
 {
-  p.clear();
-  ec = std::make_error_code(std::errc::not_a_directory);
+  auto st = status(p, ec);
+  if (ec)
+   p.clear();
+  else if (!is_directory(st))
+   {
+ p.clear();
+ ec = std::make_error_code(std::errc::not_a_directory);
+   }
 }
   return p;
 }
diff --git a/libstdc++-v3/src/filesystem/ops.cc 
b/libstdc++-v3/src/filesystem/ops.cc
index 896a4918ace..ab84eb84594 100644
--- a/libstdc++-v3/src/filesystem/ops.cc
+++ b/libstdc++-v3/src/filesystem/ops.cc
@@ -1326,25 +1326,32 @@ fs::path
 fs::temp_directory_path()
 {
   error_code ec;
-  path tmp = temp_directory_path(ec);
-  if (ec.value())
-_GLIBCXX_THROW_OR_ABORT(filesystem_error("temp_directory_path", ec));
-  return tmp;
+  path p = fs::get_temp_directory_from_env(ec);
+  if (!ec)
+{
+  auto st = status(p, ec);
+  if (!ec && !is_directory(st))
+   ec = std::make_error_code(std::errc::not_a_directory);
+}
+  if (ec)
+_GLIBCXX_THROW_OR_ABORT(filesystem_error("temp_directory_path", p, ec));
+  return p;
 }
 
 fs::path
 fs::temp_directory_path(error_code& ec)
 {
   path p = fs::get_temp_directory_from_env(ec);
-  if (ec)
-return p;
-  auto st = status(p, ec);
-  if (ec)
-p.clear();
-  else if (!is_directory(st))
+  if (!ec)
 {
-  p.clear();
-  ec = std::make_error_code(std::errc::not_a_directory);
+  auto st = status(p, ec);
+  if (ec)
+   p.clear();
+  else if (!is_directory(st))
+   {
+ p.clear();
+ ec = std::make_error_code(std::errc::not_a_directory);
+   }
 }
   return p;
 }
diff --git 
a/libstdc++-v3/testsuite/27_io/filesystem/operations/temp_directory_path.cc 
b/libstdc++-v3/testsuite/27_io/filesystem/operations/temp_directory_path.cc
index b4ef77f05e4..56bd7408c2d 100644
--- a/libstdc++-v3/testsuite/27_io/filesystem/operations/temp_directory_path.cc
+++ b/libstdc++-v3/testsuite/27_io/filesystem/operations/temp_directory_path.cc
@@ -140,12 +140,17 @@ test04()
   VERIFY( r == fs::path() );
 
   std::error_code ec2;
+  std::string failed_path;
   try {
 fs::temp_directory_path();
   } catch (const fs::filesystem_error& e) {
 ec2 = e.code();
+// On Windows the returned path will be in preferred form, i.e. using L'\\'
+// and will have a trailing slash, so compare generic forms.
+failed_path = e.path1().generic_string();
   }
   VERIFY( ec2 == ec );
+  VERIFY( failed_path.find(f.path.generic_string()) != std::string::npos );
 }
 
 int
diff --git 
a/libstdc++-v3/testsuite/experimental/filesystem/operations/temp_directory_path.cc
 

RE: [ping][vect-patterns] Refactor widen_plus/widen_minus as internal_fns

2022-06-30 Thread Joel Hutton via Gcc-patches
> We can go with a private vect_gimple_build function until we sort out the API
> issue to unblock Tamar (I'll reply to Richard's reply with further thoughts on
> this)
> 

Done.

> > Similarly are you ok with the use of gimple_extract_op? I would lean
> towards using it as it is cleaner, but I don't have strong feelings.
> 
> I don't like using gimple_extract_op here, I think I outlined a variant that 
> is
> even shorter.
> 

Done.

Updated patches attached, bootstrapped and regression tested on aarch64.

Tomorrow is my last working day at Arm, so it will likely be Andre that commits 
this/addresses any further comments.



0001-Refactor-to-allow-internal_fn-s.patch
0002-Refactor-widen_plus-as-internal_fn.patch
0003-Remove-widen_plus-minus_expr-tree-codes.patch


Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 30, 2022 at 01:40:24PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > +/* The initial ICV values for the host, which are configured with 
> > environment
> > +   variables without a suffix, e.g. OMP_NUM_TEAMS.  */
> > +struct gomp_initial_icvs gomp_initial_icvs_none;
> > +
> > +/* Initial ICV values that were configured for the host and for all 
> > devices by
> > +   using environment variables like OMP_NUM_TEAMS_ALL.  */
> > +struct gomp_initial_icvs gomp_initial_icvs_all;
> > +
> > +/* Initial ICV values that were configured only for devices (not for the 
> > host)
> > +   by using environment variables like OMP_NUM_TEAMS_DEV.  */
> > +struct gomp_initial_icvs gomp_initial_icvs_dev;
> 
> As I said last time, I don't like allocating these
> all the time in the data section of libgomp when at least for a few upcoming
> years, most users will never use those suffixes.
> Can't *_DEV and *_ALL go into the gomp_initial_icv_dev_list
> chain too, perhaps 

Sorry, I forgot to finish the sentence.  I meant perhaps with dev_num set to
some magic negative constants, and ensure that the all entry goes e.g. first
in the list, then dev and then the rest.  That way, when filling in which
values to copy to some device, it would start with the defaults, then if all
is present overwrite from the selected all vars, then if non-host and dev is
present, overwrite from the selected dev vars, and finally overwrite from the
selected specific device vars.
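
The layering described here could look roughly like the following sketch. Everything in it is an assumption made for illustration: the constants ICV_DEV_ALL and ICV_DEV_ONLY, the types icv_entry and icv_block, and the function resolve_icvs are hypothetical names, and only two ICVs are modeled. It relies on the list being ordered as suggested (all entry first, then dev, then specific devices), so a single walk applies the overrides from least to most specific.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical magic device numbers; real code may pick other values.
constexpr int ICV_DEV_ALL = -1;   // from *_ALL variables: host and all devices
constexpr int ICV_DEV_ONLY = -2;  // from *_DEV variables: devices only

// One list node: the ICVs that were set for DEV_NUM.  Unset ICVs are empty.
struct icv_entry
{
  int dev_num;
  std::optional<int> nteams;
  std::optional<int> teams_thread_limit;
};

// The block of values that would be copied to a device in one go.
// The zero initializers stand in for the built-in defaults.
struct icv_block
{
  int nteams = 0;
  int teams_thread_limit = 0;
};

// Walk LIST once.  Assumes the order: all entry, dev entry, specific devices,
// so later (more specific) entries overwrite earlier ones.
icv_block
resolve_icvs (const std::vector<icv_entry> &list, int dev_num, bool is_host)
{
  icv_block out;
  for (const icv_entry &e : list)
    {
      bool applies = e.dev_num == ICV_DEV_ALL
		     || (!is_host && e.dev_num == ICV_DEV_ONLY)
		     || (!is_host && e.dev_num == dev_num);
      if (!applies)
	continue;
      if (e.nteams)
	out.nteams = *e.nteams;
      if (e.teams_thread_limit)
	out.teams_thread_limit = *e.teams_thread_limit;
    }
  return out;
}
```

With one all entry (nteams 4, limit 8), one dev entry (nteams 6) and one entry for device 1 (limit 2), device 1 ends up with nteams 6 and limit 2, device 0 with nteams 6 and limit 8, and the host with nteams 4 and limit 8.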

Jakub



Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Qing Zhao via Gcc-patches


> On Jun 29, 2022, at 5:14 PM, Martin Sebor  wrote:
> 
> On 6/28/22 13:01, Qing Zhao wrote:
>>> On Jun 28, 2022, at 2:49 PM, Jakub Jelinek  wrote:
>>> 
>>> On Tue, Jun 28, 2022 at 06:29:01PM +, Qing Zhao wrote:
>>>>
>>>>
>>>>> On Jun 28, 2022, at 2:22 PM, Jakub Jelinek  wrote:
>>>>>
>>>>> On Tue, Jun 28, 2022 at 06:15:58PM +, Qing Zhao wrote:
>>>>>>> Because the flag just tells whether some array shouldn't be treated as
>>>>>>> (poor man's)
>>>>>>> flexible array member.  We still need to find out if some FIELD_DECL is
>>>>>>> to
>>>>>>> be treated like a flexible array member, which is a minority of
>>>>>>> COMPONENT_REFs.
>>>>>>> struct S { int a; char b[0]; int c; } s;
>>>>>>> struct T { int d; char e[]; };
>>>>>>> struct U { int f; struct T g; int h; } u;
>>>>>>> Neither s.b nor u.g.e is to be treated like flexible array member,
>>>>>>> no matter what -fstrict-flex-array= option is used.
>>>>>>
>>>>>> Then, to resolve this issue, we might need a opposite  flag
>>>>>> DECL_IS_FLEXARRAY in FIELD_DECL?
>>>>>>
>>>>>> The default is FALSE for all FIELD_DECL.
>>>>>
>>>>> Doesn't matter whether it is positive or negative, you still need to
>>>>> analyze
>>>>> it.  See the above example.  If you have struct T t; and test t.e, then it
>>>>> is flexarray.  But u.g.e is not, even when the COMPONENT_REF refers to the
>>>>> same FIELD_DECL.  In the t.e case e is the very last field, in the latter
>>>>> case u.g.e is the last field in struct T, but struct U has the h field
>>>>> after
>>>>
>>>> So, do you mean that the current FE analysis will not be able to decide
>>>> whether a specific array field is at the end of the enclosing structure?
>>>> Only the middle end can decide this ?
>>> 
>>> Well, anything that analyzes it, can be in the FE or middle-end, but there
>>> is no place to store it for later.
>> Then I am a little confused:
>> If the FE can decide whether an array field is at the end of the enclosing 
>> structure,  then combined with whether it’s a [0], [1] or [], and which 
>> level of -fstrict-flex-array,
>> The FE should be able to decide whether this array field is a flexible array 
>> member or not, then set the flag DECL_IS_FLEXARRAY (or DECL_NOT_FLEXARRAY).
>> The new flag is the place to store such info, right?
>> Do I miss anything here?
> 
> I think the problem is that there is just one FIELD_DECL for member
> M of a given type T but there can be more than one instance of that
> member, one in each struct that has a subobject of T as its own
> member.  Whether M is or isn't a (valid) flexible array member
> varies between the two instances.

Okay, I see. 
A FIELD_DECL might be shared by multiple structures or unions, and whether
it’s a flexible array member varies between the different enclosing
structures or unions.
Therefore a FIELD_DECL cannot carry the flexible array member information
accurately.
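
To make the point concrete, here is a small stand-alone model of it. It is purely illustrative and does not use any GCC data structures: field_decl, record_type and access_is_flex_array are hypothetical names, only the `[]` form is modeled (not `[0]`, `[1]` or the -fstrict-flex-array levels), and an access is described as a list of field indexes starting from the outermost struct. The same field object gives different answers depending on the path used to reach it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

struct record_type;

// Hypothetical model of a field: one object per member declaration,
// shared by every struct that embeds the enclosing type.
struct field_decl
{
  std::string name;
  bool unbounded_array;       // declared as `T name[]`
  const record_type *nested;  // non-null when the field is itself a struct
};

struct record_type
{
  std::vector<field_decl> fields;
};

// PATH lists field indexes from the outermost record inwards, e.g. u.g.e.
// The access counts as a flexible array member only if every step selects
// the last field of its record and the final field is an unbounded array.
bool
access_is_flex_array (const record_type &outer,
		      const std::vector<std::size_t> &path)
{
  const record_type *rec = &outer;
  for (std::size_t i = 0; i < path.size (); ++i)
    {
      if (rec == nullptr || path[i] >= rec->fields.size ())
	return false;
      // A later sibling field means the array cannot extend past the end.
      if (path[i] + 1 != rec->fields.size ())
	return false;
      const field_decl &f = rec->fields[path[i]];
      if (i + 1 == path.size ())
	return f.unbounded_array;
      rec = f.nested;
    }
  return false;
}
```

Modeling struct T { int d; char e[]; } and struct U { int f; struct T g; int h; }, the path for t.e is accepted while the path for u.g.e is rejected, even though both end at the same field e.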

Then, how about encoding the flexible array member information into the 
enclosing structure or union? 


Another thing: all this complexity is caused by the GNU extension that
permits a flexible array member that is not at the end of the struct.  (I
mentioned this in a previous email; I am listing the examples here again.)

For example the following two examples:

1. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t1.c
struct AX
{
  int n;
  short ax[];
  int m;
};

void warn_ax_local (struct AX *p)
{
  p->ax[2] = 0;   
}

2. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t2.c
struct AX
{
  int n;
  short ax[];
};

struct UX
{
  struct AX b;
  int m;
};

void warn_ax_local (struct AX *p, struct UX *q)
{
  p->ax[2] = 0;   
  q->b.ax[2] = 0;
}

[opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t1.c -S
t4.c:4:9: error: flexible array member not at end of struct
4 |   short ax[];

[opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t2.c -S

In t1.c above, GCC reports an error when the flexible array member is not at 
the end of the structure (AX) that immediately encloses the field.
However, in t2.c, where the flexible array member is not at the end of a 
structure that does not immediately enclose it (UX), the code is accepted.

I am very confused about t2.c: is struct UX a correct declaration? 

Thanks.

Qing

> 
> Martin



Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Richard Biener via Gcc-patches



> Am 30.06.2022 um 16:08 schrieb Qing Zhao via Gcc-patches 
> :
> 
> 
> 
>> On Jun 29, 2022, at 5:14 PM, Martin Sebor  wrote:
>> 
>> On 6/28/22 13:01, Qing Zhao wrote:
 On Jun 28, 2022, at 2:49 PM, Jakub Jelinek  wrote:
 
 On Tue, Jun 28, 2022 at 06:29:01PM +, Qing Zhao wrote:
> 
> 
>> On Jun 28, 2022, at 2:22 PM, Jakub Jelinek  wrote:
>> 
>>> On Tue, Jun 28, 2022 at 06:15:58PM +, Qing Zhao wrote:
> Because the flag just tells whether some array shouldn't be treated 
> as (poor man's)
> flexible array member.  We still need to find out if some FIELD_DECL 
> is to
> be treated like a flexible array member, which is a minority of
> COMPONENT_REFs.
> struct S { int a; char b[0]; int c; } s;
> struct T { int d; char e[]; };
> struct U { int f; struct T g; int h; } u;
> Neither s.b nor u.g.e is to be treated like flexible array member,
> no matter what -fstrict-flex-array= option is used.
 
 Then, to resolve this issue, we might need a opposite  flag 
 DECL_IS_FLEXARRAY in FIELD_DECL?
 
 The default is FALSE for all FIELD_DECL.
>>> 
>>> Doesn't matter whether it is positive or negative, you still need to 
>>> analyze
>>> it.  See the above example.  If you have struct T t; and test t.e, then 
>>> it
>>> is flexarray.  But u.g.e is not, even when the COMPONENT_REF refers to 
>>> the
>>> same FIELD_DECL.  In the t.e case e is the very last field, in the 
>>> latter
>>> case u.g.e is the last field in struct T, but struct U has the h field 
>>> after
>> 
>> So, do you mean that the current FE analysis will not be able to decide 
>> whether a specific array field is at the end of the enclosing structure?
>> Only the middle end can decide this ?
> 
> Well, anything that analyzes it, can be in the FE or middle-end, but there
> is no place to store it for later.
>>> Then I am a little confused:
>>> If the FE can decide whether an array field is at the end of the enclosing 
>>> structure,  then combined with whether it’s a [0], [1] or [], and which 
>>> level of -fstrict-flex-array,
>>> The FE should be able to decide whether this array field is a flexible 
>>> array member or not, then set the flag DECL_IS_FLEXARRAY (or 
>>> DECL_NOT_FLEXARRAY).
>>> The new flag is the place to store such info, right?
>>> Do I miss anything here?
>> 
>> I think the problem is that there is just one FIELD_DECL for member
>> M of a given type T but there can be more than one instance of that
>> member, one in each struct that has a subobject of T as its own
>> member.  Whether M is or isn't a (valid) flexible array member
>> varies between the two instances.
> 
> Okay, I see. 
> A FIELD_DECL might be shared by multiple structure or unions, and whether 
> it’s a flexible array member varies between different enclosing structures or 
> unions.
> Therefore FIELD_DECL cannot carry the flexible array member information 
> accurately. 

No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
structure.


> Then, how about encoding the flexible array member information into the 
> enclosing structure or union? 
> 
> 
> Another thing is:  All this complexity is caused by GNU extension which 
> permits the flexible array 
> member not at the end of the struct. (As I mentioned in a previous email, I 
> listed here again)
> 
> For example the following two examples:
> 
> 1. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t1.c
> struct AX
> {
>  int n;
>  short ax[];
>  int m;
> };
> 
> void warn_ax_local (struct AX *p)
> {
>  p->ax[2] = 0;   
> }
> 
> 2. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t2.c
> struct AX
> {
>  int n;
>  short ax[];
> };
> 
> struct UX
> {
>  struct AX b;
>  int m;
> };
> 
> void warn_ax_local (struct AX *p, struct UX *q)
> {
>  p->ax[2] = 0;   
>  q->b.ax[2] = 0;
> }
> 
> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t1.c -S
> t4.c:4:9: error: flexible array member not at end of struct
>4 |   short ax[];
> 
> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t2.c -S
> 
> It’s clear to see that in the above t1.c,  GCC  reports error when the 
> flexible array member is Not at the end of the structure  (AX) that 
> immediately enclosing the field.
> However, for t2.c, when the flexible array member is Not at the end of the 
> structure that does not immediately enclosing it (UX), then it’s accepted.   
> 
> I am very confused about t2.c, is the struct UX a correct declaration? 
> 
> Thanks.
> 
> Qing
> 
>> 
>> Martin
> 


[PATCH] if-to-switch: properly allow side effects only for first condition

2022-06-30 Thread Martin Liška
Properly allow side effects only for the first BB in a condition chain.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR tree-optimization/106126

gcc/ChangeLog:

* gimple-if-to-switch.cc (struct condition_info): Save
has_side_effect.
(find_conditions): Parse all BBs.
(pass_if_to_switch::execute): Allow only side effects for first
BB.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr106126.c: New test.
---
 gcc/gimple-if-to-switch.cc   | 20 +++-
 gcc/testsuite/gcc.dg/tree-ssa/pr106126.c | 12 
 2 files changed, 23 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr106126.c

diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
index 70daae2003c..4441206c481 100644
--- a/gcc/gimple-if-to-switch.cc
+++ b/gcc/gimple-if-to-switch.cc
@@ -61,9 +61,11 @@ struct condition_info
 {
   typedef auto_vec<std::pair<gphi *, tree>> mapping_vec;
 
-  condition_info (gcond *cond): m_cond (cond), m_bb (gimple_bb (cond)),
-m_forwarder_bb (NULL), m_ranges (), m_true_edge (NULL), m_false_edge (NULL),
-m_true_edge_phi_mapping (), m_false_edge_phi_mapping ()
+  condition_info (gcond *cond, bool has_side_effect): m_cond (cond),
+m_bb (gimple_bb (cond)), m_forwarder_bb (NULL), m_ranges (),
+m_true_edge (NULL), m_false_edge (NULL),
+m_true_edge_phi_mapping (), m_false_edge_phi_mapping (),
+m_has_side_effect (has_side_effect)
   {
 m_ranges.create (0);
   }
@@ -80,6 +82,7 @@ struct condition_info
   edge m_false_edge;
   mapping_vec m_true_edge_phi_mapping;
   mapping_vec m_false_edge_phi_mapping;
+  bool m_has_side_effect;
 };
 
 /* Recond PHI mapping for an original edge E and save these into vector VEC.  */
@@ -389,16 +392,11 @@ find_conditions (basic_block bb,
   if (cond == NULL)
 return;
 
-  /* An empty conditions_in_bbs indicates we are processing the first
- basic-block then no need check side effect.  */
-  if (!conditions_in_bbs->is_empty () && !no_side_effect_bb (bb))
-return;
-
   tree lhs = gimple_cond_lhs (cond);
   tree rhs = gimple_cond_rhs (cond);
   tree_code code = gimple_cond_code (cond);
 
-  condition_info *info = new condition_info (cond);
+  condition_info *info = new condition_info (cond, !no_side_effect_bb (bb));
 
   gassign *def;
   if (code == NE_EXPR
@@ -536,6 +534,10 @@ pass_if_to_switch::execute (function *fun)
  if ((*info2)->m_false_edge != e)
break;
 
+ /* Only the first BB in a chain can have a side effect.  */
+ if (info->m_has_side_effect)
+   break;
+
  chain->m_entries.safe_push (*info2);
  bitmap_set_bit (seen_bbs, e->src->index);
  info = *info2;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c b/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c
new file mode 100644
index 000..2f0fd44164b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c
@@ -0,0 +1,12 @@
+/* PR tree-optimization/106126 */
+/* { dg-do compile } */
+/* { dg-options "-O2" } */
+
+char *var_1;
+void pool_conda_matchspec() {
+  for (; var_1 && *var_1 &&
+ *var_1 != '<' && *var_1 != '>' &&
+ *var_1 != '!' && *var_1 != '~';)
+if (*var_1 >= 'A' && *var_1 <= 'Z')
+  *var_1 += 'A';
+}
-- 
2.36.1



[PATCH] x86: Support 2/4/8 byte constant vector stores

2022-06-30 Thread H.J. Lu via Gcc-patches
1. Add a predicate for constant vectors which can be converted to integer
constants suitable for constant integer stores.  For an 8-byte constant
vector, the converted 64-bit integer must be valid for store with 64-bit
immediate, which is a 64-bit integer sign-extended from a 32-bit integer.
2. Add a new pattern to allow 2-byte, 4-byte and 8-byte constant vector
stores, like

(set (mem:V2HI (reg:DI 84))
 (const_vector:V2HI [(const_int 0 [0]) (const_int 1 [0x1])]))

3. After reload, convert constant vector stores to constant integer
stores, like

(set (mem:SI (reg:DI 5 di [84]))
 (const_int 65536 [0x10000]))

For

void
foo (short * c)
{
  c[0] = 0;
  c[1] = 1;
}

it generates

        movl    $65536, (%rdi)

instead of

        movl    .LC0(%rip), %eax
        movl    %eax, (%rdi)
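
The conversion described in steps 1 and 3 can be sketched in plain C, outside of GCC. This is only an illustration, not the patch's implementation: pack_const_vector and fits_sign_extended_imm32 are hypothetical helper names, elements are assumed to be packed with element 0 in the low bits (as on little-endian x86), and the whole vector is assumed to fit in 64 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Pack N elements of ELT_BITS bits each into one 64-bit value,
   with element 0 in the lowest bits (hypothetical helper).  */
uint64_t pack_const_vector(const uint64_t *elts, size_t n, unsigned elt_bits)
{
  /* Assumption: the whole vector fits in 64 bits.  */
  assert(elt_bits >= 1 && elt_bits <= 64 && n * elt_bits <= 64);

  uint64_t mask = elt_bits == 64 ? UINT64_MAX
                                 : (UINT64_C(1) << elt_bits) - 1;
  uint64_t val = 0;
  for (size_t i = 0; i < n; i++)
    val |= (elts[i] & mask) << (elt_bits * i);
  return val;
}

/* True if VAL, viewed as a signed 64-bit integer, is representable as a
   sign-extended 32-bit immediate (hypothetical helper).  */
bool fits_sign_extended_imm32(uint64_t val)
{
  int64_t s = (int64_t) val;
  return s >= INT32_MIN && s <= INT32_MAX;
}
```

With the two 16-bit elements {0, 1} from the example above, pack_const_vector yields 65536, the immediate used in the single movl.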

gcc/

PR target/106022
* config/i386/i386-protos.h (ix86_convert_const_vector_to_integer):
New.
* config/i386/i386.cc (ix86_convert_const_vector_to_integer):
New.
* config/i386/mmx.md (V_16_32_64): New.
(*mov<mode>_imm): New patterns for stores with 16-bit, 32-bit
and 64-bit constant vector.
* config/i386/predicates.md (x86_64_const_vector_operand): New.

gcc/testsuite/

PR target/106022
* gcc.target/i386/pr106022-1.c: New test.
* gcc.target/i386/pr106022-2.c: Likewise.
* gcc.target/i386/pr106022-3.c: Likewise.
* gcc.target/i386/pr106022-4.c: Likewise.
---
 gcc/config/i386/i386-protos.h  |  2 +
 gcc/config/i386/i386.cc| 47 ++
 gcc/config/i386/mmx.md | 37 +
 gcc/config/i386/predicates.md  | 11 +
 gcc/testsuite/gcc.target/i386/pr106022-1.c | 13 ++
 gcc/testsuite/gcc.target/i386/pr106022-2.c | 14 +++
 gcc/testsuite/gcc.target/i386/pr106022-3.c | 14 +++
 gcc/testsuite/gcc.target/i386/pr106022-4.c | 14 +++
 8 files changed, 152 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr106022-4.c

diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 3596ce81ecf..cf847751ac5 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -122,6 +122,8 @@ extern void ix86_expand_unary_operator (enum rtx_code, machine_mode,
rtx[]);
 extern rtx ix86_build_const_vector (machine_mode, bool, rtx);
 extern rtx ix86_build_signbit_mask (machine_mode, bool, bool);
+extern HOST_WIDE_INT ix86_convert_const_vector_to_integer (rtx,
+  machine_mode);
 extern void ix86_split_convert_uns_si_sse (rtx[]);
 extern void ix86_expand_convert_uns_didf_sse (rtx, rtx);
 extern void ix86_expand_convert_uns_sixf_sse (rtx, rtx);
diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index b15b4893bb9..0cfe9962f75 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -15723,6 +15723,53 @@ ix86_build_signbit_mask (machine_mode mode, bool vect, bool invert)
   return force_reg (vec_mode, v);
 }
 
+/* Return HOST_WIDE_INT for const vector OP in MODE.  */
+
+HOST_WIDE_INT
+ix86_convert_const_vector_to_integer (rtx op, machine_mode mode)
+{
+  if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
+gcc_unreachable ();
+
+  int nunits = GET_MODE_NUNITS (mode);
+  wide_int val = wi::zero (GET_MODE_BITSIZE (mode));
+  machine_mode innermode = GET_MODE_INNER (mode);
+  unsigned int innermode_bits = GET_MODE_BITSIZE (innermode);
+
+  switch (mode)
+{
+case E_V2QImode:
+case E_V4QImode:
+case E_V2HImode:
+case E_V8QImode:
+case E_V4HImode:
+case E_V2SImode:
+  for (int i = 0; i < nunits; ++i)
+   {
+ int v = INTVAL (XVECEXP (op, 0, i));
+ wide_int wv = wi::shwi (v, innermode_bits);
+ val = wi::insert (val, wv, innermode_bits * i, innermode_bits);
+   }
+  break;
+case E_V2HFmode:
+case E_V4HFmode:
+case E_V2SFmode:
+  for (int i = 0; i < nunits; ++i)
+   {
+ rtx x = XVECEXP (op, 0, i);
+ int v = real_to_target (NULL, CONST_DOUBLE_REAL_VALUE (x),
+ REAL_MODE_FORMAT (innermode));
+ wide_int wv = wi::shwi (v, innermode_bits);
+ val = wi::insert (val, wv, innermode_bits * i, innermode_bits);
+   }
+  break;
+default:
+  gcc_unreachable ();
+}
+
+  return val.to_shwi ();
+}
+
 /* Return TRUE or FALSE depending on whether the first SET in INSN
has source and destination with matching CC modes, and that the
CC mode is at least as constrained as REQ_MODE.  */
diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index ba53007a35e..3294c1e6274 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -69,6 +69,12 

[PATCH] c-family: Add names to diagnostics for known headers

2022-06-30 Thread Jonathan Wakely via Gcc-patches
I recently changed  to no longer include an unnecessary header,
which meant it no longer includes , which means it no longer
includes . This resulted in some build failures:
https://issues.apache.org/jira/browse/LUCENE-10630
https://github.com/openSUSE/libzypp/pull/405

And that revealed that we don't suggest the right header for those
functions. Fixed like so.
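
The lookup this patch extends can be pictured with a small standalone C sketch. It is an illustration only, not the code in known-headers.cc: the type and function names (stdlib_kind, header_hint, header_for_name) are hypothetical, and the table holds just three of the time-related names.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Which standard library the suggestion is for (hypothetical enum).  */
enum stdlib_kind { STDLIB_C, STDLIB_CPLUSPLUS, NUM_STDLIBS };

/* One table entry: a name and the header declaring it in C and in C++.  */
struct header_hint {
  const char *name;
  const char *header[NUM_STDLIBS];
};

static const struct header_hint hints[] = {
  {"clock_t",  {"<time.h>", "<ctime>"}},
  {"difftime", {"<time.h>", "<ctime>"}},
  {"time_t",   {"<time.h>", "<ctime>"}},
};

/* Return the header to suggest for NAME, or NULL if NAME is unknown.  */
const char *header_for_name(const char *name, enum stdlib_kind lib)
{
  assert(lib < NUM_STDLIBS);
  for (size_t i = 0; i < sizeof hints / sizeof hints[0]; i++)
    if (strcmp(name, hints[i].name) == 0)
      return hints[i].header[lib];
  return NULL;
}
```

A name that is missing from the table gets no hint at all, which is the situation the added entries address.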

Tested x86_64-linux. OK for trunk?

-- >8 --

gcc/c-family/ChangeLog:

* known-headers.cc (get_stdlib_header_for_name): Add <time.h>
names.

gcc/testsuite/ChangeLog:

* g++.dg/spellcheck-stdlib.C: Check <ctime> types and functions.
---
 gcc/c-family/known-headers.cc| 14 
 gcc/testsuite/g++.dg/spellcheck-stdlib.C | 29 
 2 files changed, 43 insertions(+)

diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
index 01c86b27dc8..9c256173b82 100644
--- a/gcc/c-family/known-headers.cc
+++ b/gcc/c-family/known-headers.cc
@@ -199,6 +199,20 @@ get_stdlib_header_for_name (const char *name, enum stdlib lib)
 {"WINT_MAX", {"", ""} },
 {"WINT_MIN", {"", ""} },
 
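+/* <time.h>.  */
+{"asctime", {"<time.h>", "<ctime>"} },
+{"clock", {"<time.h>", "<ctime>"} },
+{"clock_t", {"<time.h>", "<ctime>"} },
+{"ctime", {"<time.h>", "<ctime>"} },
+{"difftime", {"<time.h>", "<ctime>"} },
+{"gmtime", {"<time.h>", "<ctime>"} },
+{"localtime", {"<time.h>", "<ctime>"} },
+{"mktime", {"<time.h>", "<ctime>"} },
+{"strftime", {"<time.h>", "<ctime>"} },
+{"time", {"<time.h>", "<ctime>"} },
+{"time_t", {"<time.h>", "<ctime>"} },
+{"tm", {"<time.h>", "<ctime>"} },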
+
 /* .  */
 {"WCHAR_MAX", {"", ""} },
 {"WCHAR_MIN", {"", ""} }
diff --git a/gcc/testsuite/g++.dg/spellcheck-stdlib.C b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
index 87736b25e54..7a70641e3ae 100644
--- a/gcc/testsuite/g++.dg/spellcheck-stdlib.C
+++ b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
@@ -158,6 +158,35 @@ void test_cstdlib (void *q)
   // { dg-message "'#include '" "" { target *-*-* } .-1 }
 }
 
+/* Missing <ctime>.  */
+
+void test_ctime (void *q, long s, double d)
+{
+  clock_t c; // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  time_t t; // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  tm t2; // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  d = difftime (0, 0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  s = mktime (q); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  s = time (0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  q = asctime (0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  q = ctime (0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  q = gmtime (0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  q = localtime (0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+  char c[2];
+  strftime (c, 2, "", 0); // { dg-error "was not declared" }
+  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
+}
+
 /* Verify that we don't offer suggestions to stdlib globals names when
there's an explicit namespace.  */
 
-- 
2.36.1



Re: [PATCH] c-family: Add names to diagnostics for known headers

2022-06-30 Thread Marek Polacek via Gcc-patches
On Thu, Jun 30, 2022 at 04:11:42PM +0100, Jonathan Wakely via Gcc-patches wrote:
> I recently changed  to no longer include an unnecessary header,
> which meant it no longer includes , which means it no longer
> includes . This resulted in some build failures:
> https://issues.apache.org/jira/browse/LUCENE-10630
> https://github.com/openSUSE/libzypp/pull/405
> 
> And that revealed that we don't suggest the right header for those
> functions. Fixed like so.
> 
> Tested x86_64-linux. OK for trunk?

Ok, thanks.
 
> -- >8 --
> 
> gcc/c-family/ChangeLog:
> 
>   * known-headers.cc (get_stdlib_header_for_name): Add <time.h>
>   names.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/spellcheck-stdlib.C: Check <ctime> types and functions.
> ---
>  gcc/c-family/known-headers.cc| 14 
>  gcc/testsuite/g++.dg/spellcheck-stdlib.C | 29 
>  2 files changed, 43 insertions(+)
> 
> diff --git a/gcc/c-family/known-headers.cc b/gcc/c-family/known-headers.cc
> index 01c86b27dc8..9c256173b82 100644
> --- a/gcc/c-family/known-headers.cc
> +++ b/gcc/c-family/known-headers.cc
> @@ -199,6 +199,20 @@ get_stdlib_header_for_name (const char *name, enum stdlib lib)
>  {"WINT_MAX", {"", ""} },
>  {"WINT_MIN", {"", ""} },
>  
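> +/* <time.h>.  */
> +{"asctime", {"<time.h>", "<ctime>"} },
> +{"clock", {"<time.h>", "<ctime>"} },
> +{"clock_t", {"<time.h>", "<ctime>"} },
> +{"ctime", {"<time.h>", "<ctime>"} },
> +{"difftime", {"<time.h>", "<ctime>"} },
> +{"gmtime", {"<time.h>", "<ctime>"} },
> +{"localtime", {"<time.h>", "<ctime>"} },
> +{"mktime", {"<time.h>", "<ctime>"} },
> +{"strftime", {"<time.h>", "<ctime>"} },
> +{"time", {"<time.h>", "<ctime>"} },
> +{"time_t", {"<time.h>", "<ctime>"} },
> +{"tm", {"<time.h>", "<ctime>"} },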
> +
>  /* .  */
>  {"WCHAR_MAX", {"", ""} },
>  {"WCHAR_MIN", {"", ""} }
> diff --git a/gcc/testsuite/g++.dg/spellcheck-stdlib.C b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
> index 87736b25e54..7a70641e3ae 100644
> --- a/gcc/testsuite/g++.dg/spellcheck-stdlib.C
> +++ b/gcc/testsuite/g++.dg/spellcheck-stdlib.C
> @@ -158,6 +158,35 @@ void test_cstdlib (void *q)
>// { dg-message "'#include '" "" { target *-*-* } .-1 }
>  }
>  
> +/* Missing <ctime>.  */
> +
> +void test_ctime (void *q, long s, double d)
> +{
> +  clock_t c; // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  time_t t; // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  tm t2; // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  d = difftime (0, 0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  s = mktime (q); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  s = time (0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  q = asctime (0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  q = ctime (0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  q = gmtime (0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  q = localtime (0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +  char c[2];
> +  strftime (c, 2, "", 0); // { dg-error "was not declared" }
> +  // { dg-message "'#include <ctime>'" "" { target *-*-* } .-1 }
> +}
> +
>  /* Verify that we don't offer suggestions to stdlib globals names when
> there's an explicit namespace.  */
>  
> -- 
> 2.36.1
> 

Marek



[committed] libstdc++: Fix comment typos

2022-06-30 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, pushed to trunk.

-- >8 --

libstdc++-v3/ChangeLog:

* include/bits/utility.h: Fix comment typos.
---
 libstdc++-v3/include/bits/utility.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/libstdc++-v3/include/bits/utility.h b/libstdc++-v3/include/bits/utility.h
index 4a457afcc1e..e0e40309a6d 100644
--- a/libstdc++-v3/include/bits/utility.h
+++ b/libstdc++-v3/include/bits/utility.h
@@ -125,7 +125,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 }
 
 // The standard says this macro and alias template should be in  but we
-// we define them here, to be available in ,  and  too.
+// define them here, to be available in ,  and  too.
 // _GLIBCXX_RESOLVE_LIB_DEFECTS
 // 3378. tuple_size_v/tuple_element_t should be available when
 //   tuple_size/tuple_element are
@@ -190,7 +190,6 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus >= 201703L
 
-  //
   struct in_place_t {
 explicit in_place_t() = default;
   };
-- 
2.36.1



Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Qing Zhao via Gcc-patches


> On Jun 30, 2022, at 10:24 AM, Richard Biener  
> wrote:
> 
> 
> 
>> Am 30.06.2022 um 16:08 schrieb Qing Zhao via Gcc-patches 
>> :
>> 
>> 
>> 
>>> On Jun 29, 2022, at 5:14 PM, Martin Sebor  wrote:
>>> 
>>> On 6/28/22 13:01, Qing Zhao wrote:
> On Jun 28, 2022, at 2:49 PM, Jakub Jelinek  wrote:
> 
> On Tue, Jun 28, 2022 at 06:29:01PM +, Qing Zhao wrote:
>> 
>> 
>>> On Jun 28, 2022, at 2:22 PM, Jakub Jelinek  wrote:
>>> 
 On Tue, Jun 28, 2022 at 06:15:58PM +, Qing Zhao wrote:
>> Because the flag just tells whether some array shouldn't be treated 
>> as (poor man's)
>> flexible array member.  We still need to find out if some FIELD_DECL 
>> is to
>> be treated like a flexible array member, which is a minority of
>> COMPONENT_REFs.
>> struct S { int a; char b[0]; int c; } s;
>> struct T { int d; char e[]; };
>> struct U { int f; struct T g; int h; } u;
>> Neither s.b nor u.g.e is to be treated like flexible array member,
>> no matter what -fstrict-flex-array= option is used.
> 
> Then, to resolve this issue, we might need a opposite  flag 
> DECL_IS_FLEXARRAY in FIELD_DECL?
> 
> The default is FALSE for all FIELD_DECL.
 
 Doesn't matter whether it is positive or negative, you still need to 
 analyze
 it.  See the above example.  If you have struct T t; and test t.e, 
 then it
 is flexarray.  But u.g.e is not, even when the COMPONENT_REF refers to 
 the
 same FIELD_DECL.  In the t.e case e is the very last field, in the 
 latter
 case u.g.e is the last field in struct T, but struct U has the h field 
 after
>>> 
>>> So, do you mean that the current FE analysis will not be able to decide 
>>> whether a specific array field is at the end of the enclosing structure?
>>> Only the middle end can decide this ?
>> 
>> Well, anything that analyzes it, can be in the FE or middle-end, but 
>> there
>> is no place to store it for later.
 Then I am a little confused:
 If the FE can decide whether an array field is at the end of the enclosing 
 structure,  then combined with whether it’s a [0], [1] or [], and which 
 level of -fstrict-flex-array,
 The FE should be able to decide whether this array field is a flexible 
 array member or not, then set the flag DECL_IS_FLEXARRAY (or 
 DECL_NOT_FLEXARRAY).
 The new flag is the place to store such info, right?
 Do I miss anything here?
>>> 
>>> I think the problem is that there is just one FIELD_DECL for member
>>> M of a given type T but there can be more than one instance of that
>>> member, one in each struct that has a subobject of T as its own
>>> member.  Whether M is or isn't a (valid) flexible array member
>>> varies between the two instances.
>> 
>> Okay, I see. 
>> A FIELD_DECL might be shared by multiple structure or unions, and whether 
>> it’s a flexible array member varies between different enclosing structures 
>> or unions.
>> Therefore FIELD_DECL cannot carry the flexible array member information 
>> accurately. 
> 
> No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
> structure.

Sorry for my dumb questions: 

1. What do you mean by “cv variants” of a structure?
2. For the following example:

struct AX { int n; short ax[];};
struct UX {struct AX b; int m;};

Are there two different FIELD_DECLs in the IR, one for AX.ax and the other for 
UX.b.ax?

Qing

> 
> 
>> Then, how about encoding the flexible array member information into the 
>> enclosing structure or union? 
>> 
>> 
>> Another thing is:  All this complexity is caused by GNU extension which 
>> permits the flexible array 
>> member not at the end of the struct. (As I mentioned in a previous email, I 
>> listed here again)
>> 
>> For example the following two examples:
>> 
>> 1. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t1.c
>> struct AX
>> {
>> int n;
>> short ax[];
>> int m;
>> };
>> 
>> void warn_ax_local (struct AX *p)
>> {
>> p->ax[2] = 0;   
>> }
>> 
>> 2. [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t2.c
>> struct AX
>> {
>> int n;
>> short ax[];
>> };
>> 
>> struct UX
>> {
>> struct AX b;
>> int m;
>> };
>> 
>> void warn_ax_local (struct AX *p, struct UX *q)
>> {
>> p->ax[2] = 0;   
>> q->b.ax[2] = 0;
>> }
>> 
>> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t1.c -S
>> t4.c:4:9: error: flexible array member not at end of struct
>>   4 |   short ax[];
>> 
>> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t2.c -S
>> 
>> It’s clear to see that in the above t1.c,  GCC  reports error when the 
>> flexible array member is Not at the end of the structure  (AX) that 
>> immediately enclosing the field.
>> However, for t2.c, when the flexible array member is Not at the end of the 
>> structure that does not immediately enclosing it (UX), the

[PATCH] aarch64: Fix pure/const function attributes for intrinsics

2022-06-30 Thread Andrew Carlotti via Gcc-patches
No testcase for this, since I haven't found a way to turn the incorrect
attribute into incorrect codegen.

Bootstrapped and tested on aarch64-none-linux-gnu.

gcc/

* config/aarch64/aarch64-builtins.cc
(aarch64_get_attributes): Fix choice of pure/const attributes.

---

diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc
index e0a741ac663188713e21f457affa57217d074783..877f54aab787862794413259cd36ca0fb7bd49c5 100644
--- a/gcc/config/aarch64/aarch64-builtins.cc
+++ b/gcc/config/aarch64/aarch64-builtins.cc
@@ -1085,9 +1085,9 @@ aarch64_get_attributes (unsigned int f, machine_mode mode)
   if (!aarch64_modifies_global_state_p (f, mode))
 {
   if (aarch64_reads_global_state_p (f, mode))
-   attrs = aarch64_add_attribute ("const", attrs);
-  else
 attrs = aarch64_add_attribute ("pure", attrs);
+  else
+   attrs = aarch64_add_attribute ("const", attrs);
 }

   if (!flag_non_call_exceptions || !aarch64_could_trap_p (f, mode))

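For reference, the distinction between the two attributes can be shown with a small C sketch. It is a generic illustration of GCC's documented attribute semantics, not aarch64 intrinsic code, and every name in it (scale, scaled, twice, set_scale, demo) is hypothetical. A function that reads but does not modify global state may be "pure"; only a function whose result depends on its arguments alone may be "const".

```c
#include <assert.h>

/* Global state (hypothetical).  */
static int scale = 3;

/* Reads global memory but never writes it: "pure" is valid, "const" is not.  */
__attribute__((pure)) int scaled(int x)
{
  return x * scale;
}

/* Depends only on its argument: "const" is valid.  */
__attribute__((const)) int twice(int x)
{
  return x * 2;
}

void set_scale(int s)
{
  scale = s;
}

/* The second call to scaled must observe the store made by set_scale,
   so the two calls cannot be merged.  */
int demo(void)
{
  int a = scaled(2);   /* uses scale == 3 */
  set_scale(5);
  int b = scaled(2);   /* uses scale == 5 */
  assert(twice(4) == 8);
  return a + b;
}
```

If scaled were wrongly marked "const", the compiler would be entitled to reuse the first result for the second call, which is the kind of miscompilation a wrong attribute permits.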

[PATCH] c++: Refer to internal linkage for -Wsubobject-linkage [PR86491]

2022-06-30 Thread Jonathan Wakely via Gcc-patches
Tested powerpc64le-linux, OK for trunk?

-- >8 --

Since C++11 relaxed the requirement for template arguments to have
external linkage, it's possible to get -Wsubobject-linkage warnings
without using any anonymous namespaces. This confuses users when they
get diagnostics that refer to an anonymous namespace that doesn't exist
in their code.

This changes the diagnostic to say "has internal linkage" for C++11 and
later, which is accurate whether internal linkage is due to the 'static'
specifier, or due to the use of anonymous namespaces.

For C++98 template arguments declared with 'static' are ill-formed
anyway, so the only way this warning can arise is via anonymous
namespaces. That means the existing wording is accurate for C++98 and so
we can keep it.

PR c++/86491

gcc/cp/ChangeLog:

* decl2.cc (constrain_class_visibility): Adjust wording of
-Wsubobject-linkage to account for cases where anonymous
namespaces aren't used.

gcc/testsuite/ChangeLog:

* g++.dg/warn/Wsubobject-linkage-3.C: Adjust for new warning.
* g++.dg/warn/anonymous-namespace-1.C: Use separate dg-warning
directives for C++98 and everything else.
* g++.dg/warn/anonymous-namespace-2.C: Likewise.
* g++.dg/warn/anonymous-namespace-3.C: Likewise.
---
 gcc/cp/decl2.cc   | 12 ++--
 gcc/testsuite/g++.dg/warn/Wsubobject-linkage-3.C  |  4 ++--
 gcc/testsuite/g++.dg/warn/anonymous-namespace-1.C |  8 ++--
 gcc/testsuite/g++.dg/warn/anonymous-namespace-2.C |  9 ++---
 gcc/testsuite/g++.dg/warn/anonymous-namespace-3.C |  3 ++-
 5 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/gcc/cp/decl2.cc b/gcc/cp/decl2.cc
index 3737e5f010c..de53678715e 100644
--- a/gcc/cp/decl2.cc
+++ b/gcc/cp/decl2.cc
@@ -3027,7 +3027,11 @@ constrain_class_visibility (tree type)
 %qT has a field %qD whose type depends on the type %qT which has no linkage",
   type, t, nlt);
  }
-   else
+   else if (cxx_dialect > cxx98)
+ warning (OPT_Wsubobject_linkage, "\
+%qT has a field %qD whose type has internal linkage",
+  type, t);
+   else // In C++98 this can only happen with unnamed namespaces.
  warning (OPT_Wsubobject_linkage, "\
 %qT has a field %qD whose type uses the anonymous namespace",
   type, t);
@@ -3062,7 +3066,11 @@ constrain_class_visibility (tree type)
 %qT has a base %qT whose type depends on the type %qT which has no linkage",
 type, TREE_TYPE (t), nlt);
}
- else
+ else if (cxx_dialect > cxx98)
+   warning (OPT_Wsubobject_linkage, "\
+%qT has a base %qT whose type has internal linkage",
+type, TREE_TYPE (t));
+ else // In C++98 this can only happen with unnamed namespaces.
warning (OPT_Wsubobject_linkage, "\
 %qT has a base %qT whose type uses the anonymous namespace",
 type, TREE_TYPE (t));
diff --git a/gcc/testsuite/g++.dg/warn/Wsubobject-linkage-3.C b/gcc/testsuite/g++.dg/warn/Wsubobject-linkage-3.C
index 95a04501441..b116fbbb186 100644
--- a/gcc/testsuite/g++.dg/warn/Wsubobject-linkage-3.C
+++ b/gcc/testsuite/g++.dg/warn/Wsubobject-linkage-3.C
@@ -3,7 +3,7 @@
 namespace { struct Foo { }; }
 
 #line 6 "foo.C"
-struct Bar { Foo foo; };   // { dg-warning "anonymous namespace" }
+struct Bar { Foo foo; };   // { dg-warning "anonymous namespace|internal linkage" }
 // { dg-bogus "no linkage" "" { target *-*-* } .-1 }
-struct Bar2 : Foo { }; // { dg-warning "anonymous namespace" }
+struct Bar2 : Foo { }; // { dg-warning "anonymous namespace|internal linkage" }
 // { dg-bogus "no linkage" "" { target *-*-* } .-1 }
diff --git a/gcc/testsuite/g++.dg/warn/anonymous-namespace-1.C b/gcc/testsuite/g++.dg/warn/anonymous-namespace-1.C
index cf193e0cba5..eed3818c5cf 100644
--- a/gcc/testsuite/g++.dg/warn/anonymous-namespace-1.C
+++ b/gcc/testsuite/g++.dg/warn/anonymous-namespace-1.C
@@ -14,5 +14,9 @@ class foobar1
 };
 
 #line 17 "foo.C"
-class foobar : public bad { }; // { dg-warning "uses the anonymous namespace" }
-class foobar2 { bad b; }; // { dg-warning "uses the anonymous namespace" }
+class foobar : public bad { };
+// { dg-warning "has internal linkage" "" { target c++11 } .-1 }
+// { dg-warning "uses the anonymous namespace" "" { target c++98_only } .-2 }
+class foobar2 { bad b; };
+// { dg-warning "has internal linkage" "" { target c++11 } .-1 }
+// { dg-warning "uses the anonymous namespace" "" { target c++98_only } .-2 }
diff --git a/gcc/testsuite/g++.dg/warn/anonymous-namespace-2.C b/gcc/testsuite/g++.dg/warn/anonymous-namespace-2.C
index 4048f3959df..f2ca5915278 100644
--- a/gcc/testsuite/g++.dg/warn/anonymous-namespace-2.C
+++ b/gcc/testsuite/g++.dg/warn/anonymous-namespace-2.C
@@ -18,12 +18,15 @@ struct g3 {

Re: [PATCH] mksysinfo: add support for musl libc

2022-06-30 Thread Sören Tempel via Gcc-patches
Ian Lance Taylor  wrote:
> Thanks for the info.  Does this patch work?  It tweaks the handling of
> SYS_SECCOMP to be specific to that constant.

Yes, your patch works for me too on Alpine Linux Edge.

Thanks!

Greetings,
Sören


Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
> > No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
> > structure.
> 
> Sorry for my dumb questions: 
> 
> 1. What do you mean by “cv variants” of a structure?

const/volatile qualified variants.  So

> 2. For the following example:
> 
> struct AX { int n; short ax[];};

struct AX, const struct AX, volatile const struct AX etc. types will share
the FIELD_DECLs.

> struct UX {struct AX b; int m;};
> 
> Are there two different FIELD_DECLs in the IR, one for AX.ax, the other one 
> is for UX.b.ax?

No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
b and m FIELD_DECLs with DECL_CONTEXT of struct UX.

But, what is important is that when some FIELD_DECL is last in some
structure and has array type, it doesn't mean it should have an
unconstrained length.
In the above case, when struct AX is followed by some other member, it
acts as a strict short ax[0]; field (even when that is an exception), one
can take the address of &UX.b.ax[0], but can't dereference that, or &UX.b.ax[1].

I believe pedantically flexible array members in such cases don't
necessarily mean zero length array, could be longer, e.g. for the usual
x86_64 alignments
struct BX { long long n; short o; short ax[]; };
struct VX { struct BX b; int m; };
I think it acts as short ax[3]; because the padding at the end of struct BX
is so long that 3 short elements fit in there.
While if one uses
struct BX bx = { 1LL, 2, { 3, 4, 5, 6, 7, 8, 9, 10 } };
(a GNU extension), then it acts as short ax[11]; - the initializer is 8
elements and after short ax[8]; is padding for another 3 full elements.
And of course:
struct BX *p = malloc (offsetof (struct BX, ax) + n * sizeof (short));
means short ax[n].
Whether struct WX { struct BX b; };
struct WX *p = malloc (offsetof (struct WX, b.ax) + n * sizeof (short));
is pedantically acting as short ax[n]; is unclear to me, but we are
generally allowing that and people expect it.
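
To make the padding arithmetic above concrete, here is a minimal sketch.  It
assumes the usual x86_64 layout (8-byte long long, 2-byte short); the helper
names bx_tail_elems and bx_alloc_size are made up for illustration and are not
anything in GCC.

```c
#include <stddef.h>

/* Same layout as the struct BX discussed above.  */
struct BX { long long n; short o; short ax[]; };

/* Number of whole short elements that fit between the start of ax and
   the end of struct BX, i.e. in the tail padding.
   Hypothetical helper, for illustration only.  */
static size_t
bx_tail_elems (void)
{
  return (sizeof (struct BX) - offsetof (struct BX, ax)) / sizeof (short);
}

/* Bytes to allocate so that ax behaves as short ax[n].
   Hypothetical helper, mirroring the malloc expression above.  */
static size_t
bx_alloc_size (size_t n)
{
  return offsetof (struct BX, ax) + n * sizeof (short);
}
```

On x86_64 I would expect sizeof (struct BX) to be 16 and
offsetof (struct BX, ax) to be 10, so bx_tail_elems () gives 3, which is the
short ax[3] case when struct BX is followed by another member.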

Though, on the GCC side, I think we are only treating like flexible arrays
what is really at the end of structs, not followed by other members.

Jakub



Re: [PATCH] OpenMP, libgomp: Environment variable syntax extension.

2022-06-30 Thread Jakub Jelinek via Gcc-patches
On Thu, Jun 30, 2022 at 03:21:15PM +0200, Jakub Jelinek via Gcc-patches wrote:
> On Thu, Jun 30, 2022 at 01:40:24PM +0200, Jakub Jelinek via Gcc-patches wrote:
> > > +/* The initial ICV values for the host, which are configured with 
> > > environment
> > > +   variables without a suffix, e.g. OMP_NUM_TEAMS.  */
> > > +struct gomp_initial_icvs gomp_initial_icvs_none;
> > > +
> > > +/* Initial ICV values that were configured for the host and for all 
> > > devices by
> > > +   using environment variables like OMP_NUM_TEAMS_ALL.  */
> > > +struct gomp_initial_icvs gomp_initial_icvs_all;
> > > +
> > > +/* Initial ICV values that were configured only for devices (not for the 
> > > host)
> > > +   by using environment variables like OMP_NUM_TEAMS_DEV.  */
> > > +struct gomp_initial_icvs gomp_initial_icvs_dev;
> > 
> > As I said last time, I don't like allocating these
> > all the time in the data section of libgomp when at least for a few upcoming
> > years, most users will never use those suffixes.
> > Can't *_DEV and *_ALL go into the gomp_initial_icv_dev_list
> > chain too, perhaps 
> 
> Sorry, forgot to finish sentence, I meant perhaps with dev_num of some magic
> negative constants, and ensure that the all entry goes e.g. first in the
> list, then dev and then the rest, so when filling up say what values to copy
> to some device, it would start with the defaults, then if all is present
> overwrite from selected all vars, then if non-host and dev is present,
> overwrite from selected dev vars and finally overwrite from selected
> specific device vars.

One more thing.  If we go just with gomp_initial_icvs_none and the all/dev
are in the list together with the rest, it would be enough not to use
multi-bit flags, but just single bit, has this env var been specified here
or not.  So, as long as there are <= 32 vars (excluding the ones that don't
have suffixed variants), we can use unsigned int bitmask indexed by the
enum, or if we need <= 64 vars we could use unsigned long long int bitmask.
And have one such bitmask next to gomp_initial_icvs_none and another one in
each list node.
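
A rough sketch of what I mean, covering both the single-bit mask and the
overwrite order from the previous mail.  All names here (icv_var, icv_node,
DEV_NUM_ALL, DEV_NUM_DEV, icv_resolve) are hypothetical and only for
illustration; they are not the names used in the patch or in libgomp.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical enum of the variables that have suffixed variants.  */
enum icv_var { ICV_NTEAMS, ICV_DYNAMIC, ICV_TEAMS_THREAD_LIMIT, ICV_COUNT };

struct icv_values { int val[ICV_COUNT]; };

/* Hypothetical magic negative device numbers for the _ALL and _DEV entries.  */
#define DEV_NUM_ALL (-1)
#define DEV_NUM_DEV (-2)

/* One list node; assumed order in the list: all, dev, then specific devices.  */
struct icv_node
{
  int dev_num;
  unsigned int flags;	/* Bit I set: variable I was specified here.  */
  struct icv_values icvs;
  struct icv_node *next;
};

static void
icv_mark_set (unsigned int *flags, enum icv_var v)
{
  *flags |= 1u << v;
}

static bool
icv_is_set (unsigned int flags, enum icv_var v)
{
  return (flags >> v) & 1u;
}

/* Start from the defaults, then overwrite from every applicable node,
   copying only the variables whose bit is set in that node.  */
static void
icv_resolve (struct icv_values *out, const struct icv_values *defaults,
	     const struct icv_node *list, int dev_num, bool is_host)
{
  *out = *defaults;
  for (const struct icv_node *n = list; n != NULL; n = n->next)
    {
      bool applies = n->dev_num == DEV_NUM_ALL
		     || (!is_host && n->dev_num == DEV_NUM_DEV)
		     || (!is_host && n->dev_num == dev_num);
      if (!applies)
	continue;
      for (int i = 0; i < ICV_COUNT; i++)
	if (icv_is_set (n->flags, (enum icv_var) i))
	  out->val[i] = n->icvs.val[i];
    }
}
```

With the list kept in that order a single forward walk gives the intended
precedence, and one unsigned int per node is all the bookkeeping needed.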

Jakub



[r13-1357 Regression] FAIL: g++.dg/warn/Warray-bounds-16.C -std=gnu++98 pr102690 (test for bogus messages, line 22) on Linux/x86_64

2022-06-30 Thread skpandey--- via Gcc-patches
On Linux/x86_64,

0f6eef398045deb2a62d18b526831719c7c20c8a is the first bad commit
commit 0f6eef398045deb2a62d18b526831719c7c20c8a
Author: Kito Cheng 
Date:   Tue Jun 28 18:43:42 2022 +0800

testsuite/102690: Only check warning for lp64 in Warray-bounds-16.C

caused

FAIL: g++.dg/warn/Warray-bounds-16.C  -std=gnu++14 pr102690 (test for bogus messages, line 22)
FAIL: g++.dg/warn/Warray-bounds-16.C  -std=gnu++17 pr102690 (test for bogus messages, line 22)
FAIL: g++.dg/warn/Warray-bounds-16.C  -std=gnu++20 pr102690 (test for bogus messages, line 22)
FAIL: g++.dg/warn/Warray-bounds-16.C  -std=gnu++98 pr102690 (test for bogus messages, line 22)

with GCC configured with

../../gcc/configure 
--prefix=/local/skpandey/gccwork/toolwork/gcc-bisect-master/master/r13-1357/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-16.C --target_board='unix{-m32}'"
$ cd {build_dir}/gcc && make check RUNTESTFLAGS="dg.exp=g++.dg/warn/Warray-bounds-16.C --target_board='unix{-m32\ -march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me 
at skpgkp2 at gmail dot com)


[x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-06-30 Thread Roger Sayle

This patch is a follow-up to Hongtao's fix for PR target/105854.  That
fix is perfectly correct, but the thing that caught my eye was why is
the compiler generating a shift by zero at all.  Digging deeper it
turns out that we can easily optimize __builtin_ia32_palignr for
alignments of 0 and 64 respectively, which may be simplified to moves
from the highpart or lowpart.
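
As a sanity check of the two special cases, here is a small portable model of
the 64-bit operation.  It is only a sketch: the function name is hypothetical,
the count is taken in bits (as the 0 and 64 cases above suggest), and it is not
the RTL or the code in the patch.

```c
#include <stdint.h>

/* Model of 64-bit palignr: concatenate HI:LO into a 128-bit value,
   shift it right by BITS and return the low 64 bits.
   Hypothetical helper, for illustration only.  */
static uint64_t
palignr64_model (uint64_t hi, uint64_t lo, unsigned int bits)
{
  if (bits == 0)
    return lo;			/* Plain move from the lowpart.  */
  if (bits < 64)
    return (lo >> bits) | (hi << (64 - bits));
  if (bits == 64)
    return hi;			/* Plain move from the highpart.  */
  if (bits < 128)
    return hi >> (bits - 64);
  return 0;			/* Everything shifted out.  */
}
```

The bits == 0 and bits == 64 branches need no shifting at all, which is why
those two cases can become a single move.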

After adding optimizations to simplify the 64-bit DImode palignr,
I started to add the corresponding optimizations for vpalignr (i.e.
128-bit).  The first oddity is that sse.md uses TImode and a special
SSESCALARMODE iterator, rather than V1TImode, and indeed the comment
above SSESCALARMODE hints that this should be "dropped in favor of
VIMAX_AVX2_AVX512BW".  Hence this patch includes the migration of
<ssse3_avx2>_palignr<mode> to use VIMAX_AVX2_AVX512BW, basically
using V1TImode instead of TImode for 128-bit palignr.

But it was only after I'd implemented this clean-up that I stumbled
across the strange semantics of 128-bit [v]palignr.  According to
https://www.felixcloutier.com/x86/palignr, the semantics are subtly
different based upon how the instruction is encoded.  PALIGNR leaves
the highpart unmodified, whilst VEX.128 encoded VPALIGNR clears the
highpart, and (unless I'm mistaken) it looks like GCC currently uses
the exact same RTL/templates for both, treating one as an alternative
for the other.

Hence I thought I'd post what I have so far (part optimization and
part clean-up), to then ask the x86 experts for their opinions.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?


2022-06-30  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-builtin.def (__builtin_ia32_palignr128): Change
CODE_FOR_ssse3_palignrti to CODE_FOR_ssse3_palignrv1ti.
* config/i386/i386-expand.cc (expand_vec_perm_palignr): Use V1TImode
and gen_ssse3_palignrv1ti instead of TImode.
* config/i386/sse.md (SSESCALARMODE): Delete.
(define_mode_attr ssse3_avx2): Handle V1TImode instead of TImode.
(<ssse3_avx2>_palignr<mode>): Use VIMAX_AVX2_AVX512BW as a mode
iterator instead of SSESCALARMODE.

(ssse3_palignrdi): Optimize cases when operands[3] is 0 or 64,
using a single move instruction (if required).
(define_split): Likewise split UNSPEC_PALIGNR $0 into a move.
(define_split): Likewise split UNSPEC_PALIGNR $64 into a move.

gcc/testsuite/ChangeLog
* gcc.target/i386/ssse3-palignr-2.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-builtin.def b/gcc/config/i386/i386-builtin.def
index e6daad4..fd16093 100644
--- a/gcc/config/i386/i386-builtin.def
+++ b/gcc/config/i386/i386-builtin.def
@@ -900,7 +900,7 @@ BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_ssse3_psignv4si3, 
"__builtin_ia32_psig
 BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_psignv2si3, "__builtin_ia32_psignd", IX86_BUILTIN_PSIGND, 
UNKNOWN, (int) V2SI_FTYPE_V2SI_V2SI)
 
 /* SSSE3.  */
-BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_ssse3_palignrti, 
"__builtin_ia32_palignr128", IX86_BUILTIN_PALIGNR128, UNKNOWN, (int) 
V2DI_FTYPE_V2DI_V2DI_INT_CONVERT)
+BDESC (OPTION_MASK_ISA_SSSE3, 0, CODE_FOR_ssse3_palignrv1ti, 
"__builtin_ia32_palignr128", IX86_BUILTIN_PALIGNR128, UNKNOWN, (int) 
V2DI_FTYPE_V2DI_V2DI_INT_CONVERT)
 BDESC (OPTION_MASK_ISA_SSSE3 | OPTION_MASK_ISA_MMX, 0, 
CODE_FOR_ssse3_palignrdi, "__builtin_ia32_palignr", IX86_BUILTIN_PALIGNR, 
UNKNOWN, (int) V1DI_FTYPE_V1DI_V1DI_INT_CONVERT)
 
 /* SSE4.1 */
diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index 8bc5430..6a3fcde 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -19548,9 +19548,11 @@ expand_vec_perm_palignr (struct expand_vec_perm_d *d, 
bool single_insn_only_p)
   shift = GEN_INT (min * GET_MODE_UNIT_BITSIZE (d->vmode));
   if (GET_MODE_SIZE (d->vmode) == 16)
 {
-  target = gen_reg_rtx (TImode);
-  emit_insn (gen_ssse3_palignrti (target, gen_lowpart (TImode, dcopy.op1),
- gen_lowpart (TImode, dcopy.op0), shift));
+  target = gen_reg_rtx (V1TImode);
+  emit_insn (gen_ssse3_palignrv1ti (target,
+   gen_lowpart (V1TImode, dcopy.op1),
+   gen_lowpart (V1TImode, dcopy.op0),
+   shift));
 }
   else
 {
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 8cd0f61..974deca 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -575,10 +575,6 @@
 (define_mode_iterator VIMAX_AVX2
   [(V2TI "TARGET_AVX2") V1TI])
 
-;; ??? This should probably be dropped in favor of VIMAX_AVX2_AVX512BW.
-(define_mode_iterator SSESCALARMODE
-  [(V4TI "TARGET_AVX512BW") (V2TI "TARGET_AVX2") TI])
-
 (define_mode_iterator VI12_AVX2
   [(V32QI "TARGET_AVX2") V16QI
(V16HI "TARGE

Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Qing Zhao via Gcc-patches


> On Jun 30, 2022, at 1:03 PM, Jakub Jelinek  wrote:
> 
> On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
>>> No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
>>> structure.
>> 
>> Sorry for my dumb questions: 
>> 
>> 1. What do you mean by “cv variants” of a structure?
> 
> const/volatile qualified variants.  So
Okay. I see. thanks.
> 
>> 2. For the following example:
>> 
>> struct AX { int n; short ax[];};
> 
> struct AX, const struct AX, volatile const struct AX etc. types will share
> the FIELD_DECLs.

Okay. 
> 
>> struct UX {struct AX b; int m;};
>> 
>> Are there two different FIELD_DECLs in the IR, one for AX.ax, the other one 
>> is for UX.b.ax?
> 
> No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
> b and m FIELD_DECLs with DECL_CONTEXT of struct UX.

Ah, right. 


> 
> But, what is important is that when some FIELD_DECL is last in some
> structure and has array type, it doesn't mean it should have an
> unconstrained length.
> In the above case, when struct AX is followed by some other member, it
> acts as a strict short ax[0]; field (even when that is an exception), one
> can take the address of &UX.b.ax[0], but can't dereference that, or &UX.b.ax[1].

So, is this a GNU extension?  I see that Clang gives a warning by default and 
GCC gives a warning when -pedantic is specified:
[opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
struct AX
{
  int n;
  short ax[];
};

struct UX
{
  struct AX b;
  int m;
};

void warn_ax_local (struct AX *p, struct UX *q)
{
  p->ax[2] = 0;   
  q->b.ax[2] = 0;
}
[opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at the 
end of a struct or class is a GNU extension 
[-Wgnu-variable-sized-type-not-at-end]
  struct AX b;
^
[opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
t3.c:9:13: warning: invalid use of structure with flexible array member 
[-Wpedantic]
9 |   struct AX b;
  | ^

But yes, I agree: even though this is only a GNU extension, we still need to 
handle it and accept it as legal code. 

Then, yes, I also agree that encoding the info of is_flexible_array into 
FIELD_DECL is not good. 

How about encoding the info of “has_flexible_array” into the enclosing 
RECORD_TYPE or UNION_TYPE node?

For example, in the above example,  the RECORD_TYPE for “struct AX” will be 
marked as “has_flexible_array”, but that for “struct UX” will not.

> 
> I believe pedantically flexible array members in such cases don't
> necessarily mean zero length array, could be longer, e.g. for the usual
> x86_64 alignments
> struct BX { long long n; short o; short ax[]; };
> struct VX { struct BX b; int m; };
> I think it acts as short ax[3]; because the padding at the end of struct BX
> is so long that 3 short elements fit in there.
> While if one uses
> struct BX bx = { 1LL, 2, { 3, 4, 5, 6, 7, 8, 9, 10 } };
> (a GNU extension), then it acts as short ax[11]; - the initializer is 8
> elements and after short ax[8]; is padding for another 3 full elements.
> And of course:
> struct BX *p = malloc (offsetof (struct BX, ax) + n * sizeof (short));
> means short ax[n].
> Whether struct WX { struct BX b; };
> struct WX *p = malloc (offsetof (struct WX, b.ax) + n * sizeof (short));
> is pedantically acting as short ax[n]; is unclear to me, but we are
> generally allowing that and people expect it.

Okay, I see now.
> 
> Though, on the GCC side, I think we are only treating like flexible arrays
> what is really at the end of structs, not followed by other members.

My understanding is that permitting a flexible array to be followed by other members 
is a GNU extension.  (Actually, it’s not allowed by the standard?)

Thanks a lot for your patience and help.

Qing
> 
>   Jakub
> 



Re: [PATCH] mksysinfo: add support for musl libc

2022-06-30 Thread Ian Lance Taylor via Gcc-patches
On Thu, Jun 30, 2022 at 9:59 AM Sören Tempel  wrote:
>
> Ian Lance Taylor  wrote:
> > Thanks for the info.  Does this patch work?  It tweaks the handling of
> > SYS_SECCOMP to be specific to that constant.
>
> Yes, your patch works for me too on Alpine Linux Edge.

Thanks.  Committed to mainline.

Ian


Re: [PATCH] Fortran: error recovery on invalid CLASS(), PARAMETER declarations [PR105243]

2022-06-30 Thread Harald Anlauf via Gcc-patches

Hi Tobias,

Am 30.06.22 um 11:58 schrieb Tobias Burnus:

The initial patch is by Steve.  I adjusted and moved
it slightly so that it also handles CLASS(*)
(unlimited polymorphic) at the same time.

Shouldn't you then also acknowledge him, e.g. via Co-authored-by?


yeah, I noticed that right after submitting the mail
and immediately amended the commit message.  Pushed as

https://gcc.gnu.org/g:4c233cabbe388a6b8957c1507e129090e9267ceb

Thanks,
Harald


Re: [pushed] c++: auto function as function argument [PR105779]

2022-06-30 Thread Patrick Palka via Gcc-patches
On Wed, Jun 1, 2022 at 3:21 PM Jason Merrill via Gcc-patches
 wrote:
>
> This testcase demonstrates that the issue in PR105623 is not limited to
> templates, so we should do the marking in a less template-specific place.
>
> Tested x86_64-pc-linux-gnu, applying to trunk.
>
> PR c++/105779
>
> gcc/cp/ChangeLog:
>
> * call.cc (resolve_args): Call mark_single_function here.
> * pt.cc (unify_one_argument): Not here.
>
> gcc/testsuite/ChangeLog:
>
> * g++.dg/cpp1y/auto-fn63.C: New test.
> ---
>  gcc/cp/call.cc |  5 +
>  gcc/cp/pt.cc   |  4 
>  gcc/testsuite/g++.dg/cpp1y/auto-fn63.C | 12 
>  3 files changed, 17 insertions(+), 4 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/cpp1y/auto-fn63.C
>
> diff --git a/gcc/cp/call.cc b/gcc/cp/call.cc
> index 85fe9b5ab85..4710c3777c5 100644
> --- a/gcc/cp/call.cc
> +++ b/gcc/cp/call.cc
> @@ -4672,6 +4672,11 @@ resolve_args (vec<tree, va_gc> *args, tsubst_flags_t 
> complain)
> }
>else if (invalid_nonstatic_memfn_p (EXPR_LOCATION (arg), arg, 
> complain))
> return NULL;
> +
> +  /* Force auto deduction now.  Omit tf_warning to avoid redundant
> +deprecated warning on deprecated-14.C.  */
> +  if (!mark_single_function (arg, tf_error))

I wonder why pass tf_error here instead of an appropriately masked 'complain'?
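
To spell out what I mean by "masked", here is a tiny sketch.  The enum below is
a hypothetical stand-in for a few tsubst_flags_t bits (the names and values are
assumed for illustration, not copied from cp-tree.h), and mask_out_warnings is
a made-up helper.

```c
/* Hypothetical stand-ins for a few tsubst_flags_t bits; values assumed.  */
enum model_tsubst_flags
{
  MODEL_TF_NONE = 0,
  MODEL_TF_ERROR = 1 << 0,
  MODEL_TF_WARNING = 1 << 1
};

/* Keep whatever the caller asked for, minus the warning bit.
   Hypothetical helper, for illustration only.  */
static int
mask_out_warnings (int complain)
{
  return complain & ~MODEL_TF_WARNING;
}
```

With a mask like this a caller that passed no flags (e.g. in a SFINAE context)
still gets no diagnostics, whereas a hard-coded tf_error would ask for errors
regardless of what the caller wanted.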


> +   return NULL;
>  }
>return args;
>  }
> diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
> index 4f0ace2644b..6de8e496859 100644
> --- a/gcc/cp/pt.cc
> +++ b/gcc/cp/pt.cc
> @@ -22624,10 +22624,6 @@ unify_one_argument (tree tparms, tree targs, tree 
> parm, tree arg,
>   return unify_success (explain_p);
> }
>
> - /* Force auto deduction now.  Use tf_none to avoid redundant
> -deprecated warning on deprecated-14.C.  */
> - mark_single_function (arg, tf_none);
> -
>   arg_expr = arg;
>   arg = unlowered_expr_type (arg);
>   if (arg == error_mark_node)
> diff --git a/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C 
> b/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C
> new file mode 100644
> index 000..ca3bc854065
> --- /dev/null
> +++ b/gcc/testsuite/g++.dg/cpp1y/auto-fn63.C
> @@ -0,0 +1,12 @@
> +// PR c++/105779
> +// { dg-do compile { target c++14 } }
> +
> +template<int>
> +struct struct1
> +{
> +  static auto apply() { return 1; }
> +};
> +
> +int method(int(*f)());
> +
> +int t = method(struct1<1>::apply);
>
> base-commit: ae54c1b09963779c5c3914782324ff48af32e2f1
> --
> 2.27.0
>



[x86 PATCH] PR target/106122: Don't update %esp via the stack with -Oz.

2022-06-30 Thread Roger Sayle

When optimizing for size with -Oz, setting a register can be minimized by
pushing an immediate value to the stack and popping it to the destination.
Alas the one general register that shouldn't be updated via the stack is
the stack pointer itself, where "pop %esp" can't be represented in GCC's
RTL ("use of a register mentioned in pre_inc, pre_dec, post_inc or
post_dec is not permitted within the same instruction").  This patch
fixes PR target/106122 by explicitly checking for SP_REG in the
problematic peephole2.
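
For readers less familiar with the peephole, the part of the condition that
matters here can be sketched as a plain C predicate.  This is only a model: the
function name and MODEL_SP_REG are hypothetical, and the red-zone and
optimize_size checks of the real condition are deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical register number standing in for the stack pointer;
   GCC's actual SP_REG value is not relied upon here.  */
enum { MODEL_SP_REG = 7 };

/* Return true if loading VALUE into register REGNO is a candidate for
   the push-immediate / pop-register sequence.  Zero is excluded, as in
   the existing condition, and the immediate must fit in a signed byte.
   The last test is the one this patch adds.  */
static bool
push_pop_candidate_p (long value, int regno)
{
  return value != 0
	 && value >= -128 && value <= 127
	 && regno != MODEL_SP_REG;
}
```

The size win comes from the encodings: push with an 8-bit immediate is two
bytes and pop of a legacy register is one byte, against five bytes for a
32-bit move immediate.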

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2022-06-30  Roger Sayle  

gcc/ChangeLog
PR target/106122
* config/i386/i386.md (peephole2): Avoid generating pop %esp
when optimizing for size.

gcc/testsuite/ChangeLog
PR target/106122
* gcc.target/i386/pr106122.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 125a3b4..3b6f362 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -2588,7 +2588,8 @@
   "optimize_insn_for_size_p () && optimize_size > 1
&& operands[1] != const0_rtx
&& IN_RANGE (INTVAL (operands[1]), -128, 127)
-   && !ix86_red_zone_used"
+   && !ix86_red_zone_used
+   && REGNO (operands[0]) != SP_REG"
   [(set (match_dup 2) (match_dup 1))
(set (match_dup 0) (match_dup 3))]
 {
diff --git a/gcc/testsuite/gcc.target/i386/pr106122.c 
b/gcc/testsuite/gcc.target/i386/pr106122.c
new file mode 100644
index 000..7d24ed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr106122.c
@@ -0,0 +1,15 @@
+/* PR middle-end/106122 */
+/* { dg-do compile } */
+/* { dg-options "-Oz" } */
+
+register volatile int a __asm__("%esp");
+void foo (void *);
+void bar (void *);
+
+void
+baz (void)
+{
+  foo (__builtin_return_address (0));
+  a = 0;
+  bar (__builtin_return_address (0));
+}


[og12] [committed] Fix bootstrap build of OG12

2022-06-30 Thread Kwok Cheung Yeung
The following patches have been committed to devel/omp/gcc-12 to fix a 
bootstrap build of the branch:


29ba2e4eeff Fix mis-merge of 'dwarf: Multi-register CFI address support'
82a3f9f22f7 Build fixes for OG12 on more recent GCC versions
e9ee746093b Fix string formatting issues
b8ecb83d528 Build fix for 'openmp: allow requires unified_shared_memory'

Kwok

From b8ecb83d52884153c2b9b9c44840f933dfaa4dc7 Mon Sep 17 00:00:00 2001
From: Tobias Burnus 
Date: Thu, 30 Jun 2022 08:30:48 +0200
Subject: [PATCH 1/5] Build fix for 'openmp: allow requires
 unified_shared_memory'

OG12 commit fa65fc45972d27f2fd79a44eaba1978348177ee9 added an
error diagnostic (moved around in later commits); this diagnostic
caused bootstrap fails as %<...%> were missing. This commit adds
them.

gcc/c/
* c-parser.cc (c_parser_omp_requires): Add missing %<...%> in error.

gcc/cp/
* parser.cc (cp_parser_omp_requires): Add missing %<...%> in error.
---
 gcc/c/c-parser.cc | 8 
 gcc/cp/parser.cc  | 8 
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/c/c-parser.cc b/gcc/c/c-parser.cc
index 363b80ebfeb..5cabcb684e9 100644
--- a/gcc/c/c-parser.cc
+++ b/gcc/c/c-parser.cc
@@ -22872,8 +22872,8 @@ c_parser_omp_requires (c_parser *parser)
  if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
  && flag_offload_memory != OFFLOAD_MEMORY_NONE)
error_at (cloc,
- "unified_address is incompatible with the "
- "selected -foffload-memory option");
+ "%<unified_address%> is incompatible with the "
+ "selected %<-foffload-memory%> option");
  flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
}
  else if (!strcmp (p, "unified_shared_memory"))
@@ -22883,8 +22883,8 @@ c_parser_omp_requires (c_parser *parser)
  if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
  && flag_offload_memory != OFFLOAD_MEMORY_NONE)
error_at (cloc,
- "unified_shared_memory is incompatible with the "
- "selected -foffload-memory option");
+ "%<unified_shared_memory%> is incompatible with the "
+ "selected %<-foffload-memory%> option");
  flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
}
  else if (!strcmp (p, "dynamic_allocators"))
diff --git a/gcc/cp/parser.cc b/gcc/cp/parser.cc
index 563bf4546eb..f8455e30ed8 100644
--- a/gcc/cp/parser.cc
+++ b/gcc/cp/parser.cc
@@ -47177,8 +47177,8 @@ cp_parser_omp_requires (cp_parser *parser, cp_token 
*pragma_tok)
  if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
  && flag_offload_memory != OFFLOAD_MEMORY_NONE)
error_at (cloc,
- "unified_address is incompatible with the "
- "selected -foffload-memory option");
+ "%<unified_address%> is incompatible with the "
+ "selected %<-foffload-memory%> option");
  flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
}
  else if (!strcmp (p, "unified_shared_memory"))
@@ -47188,8 +47188,8 @@ cp_parser_omp_requires (cp_parser *parser, cp_token 
*pragma_tok)
  if (flag_offload_memory != OFFLOAD_MEMORY_UNIFIED
  && flag_offload_memory != OFFLOAD_MEMORY_NONE)
error_at (cloc,
- "unified_shared_memory is incompatible with the "
- "selected -foffload-memory option");
+ "% is incompatible with the "
+ "selected %<-foffload-memory%> option");
  flag_offload_memory = OFFLOAD_MEMORY_UNIFIED;
}
  else if (!strcmp (p, "dynamic_allocators"))
-- 
2.25.1

From e9ee746093bd989c33685e3197c75b901aef2cc1 Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Thu, 30 Jun 2022 15:31:41 +0100
Subject: [PATCH 3/5] Fix string formatting issues

Stricter format-string checking in more recent versions of GCC can cause
build failures.

2022-06-30  Kwok Cheung Yeung  

gcc/
* omp-data-optimize.cc (omp_data_optimize_add_candidate): Suppress
format checking.
(omp_data_optimize_can_be_private): Likewise.
(omp_data_optimize_can_be_private): Likewise.

(This should be a fixup to ab53d5a6a27dce2a92f28a62ceb6e184c8356f25: 'openacc:
Add data optimization pass')

2022-06-30  Kwok Cheung Yeung  

gcc/
* gimplify.cc (gimplify_scan_omp_clauses): Remove extra
'%<..%>' pair in format string.

(This should be a fixup to dbc770c4351c8824e8083f8aff6117a6b4ba3c0d: 'openmp:
Implement uses_allocators clause')
---
 gcc/ChangeLog.omp| 12 
 gcc/gimplify.cc  |  2 +-
 gcc/omp-data-optimize.cc | 28 
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/gcc/ChangeLog.omp b/gcc/ChangeLog.omp
in

Re: [PATCH] aarch64: testsuite: symbol-range compile only

2022-06-30 Thread Hans-Peter Nilsson
On Thu, 23 Jun 2022, Alexandre Oliva via Gcc-patches wrote:
> +proc check_effective_target_two_plus_gigs { } {
> +return [check_no_compiler_messages two_plus_gigs executable {
> + int dummy[0x8000];

Don't you mean "char" as in "char dummy[0x8000]"?

Or else the effective predicate is effectively eight_plus_gigs
(for targets where sizeof(int) == 4, i.e. most).
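
The arithmetic behind that remark, as a small sketch.  It assumes the array
bound in the original patch was 0x80000000 (the bound as shown above looks
truncated in this copy); array_gib is a hypothetical helper.

```c
#include <stddef.h>

/* Size in GiB of an array of NELEMS elements of ELEM_SIZE bytes each.
   Hypothetical helper, for illustration only.  */
static unsigned long long
array_gib (unsigned long long nelems, size_t elem_size)
{
  return (nelems * elem_size) >> 30;
}
```

Under that assumption, array_gib (0x80000000ULL, 1) is 2 for a char array,
while array_gib (0x80000000ULL, 4) is 8 for a 4-byte int array.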

brgds, H-P


[patch] libgompd: Add thread handles

2022-06-30 Thread Ahmed Sayed Mousse via Gcc-patches
This patch is the initial implementation of OpenMP-API specs book section
20.5.5 with title "Thread Handles".

libgomp/ChangeLog

2022-07-01  Ahmed Sayed

* Makefile.am (libgompd_la_SOURCES): Add ompd-threads.c.
* Makefile.in: Regenerate.
* team.c (gomp_free_thread): Call ompd_bp_thread_end ().
* ompd-support.c (gompd_thread_initial_tls_bias): New variable.
(gompd_load): Initialize gompd_thread_initial_tls_bias.

diff --git a/libgomp/Makefile.am b/libgomp/Makefile.am
index 6d913a93e7f..23f5bede1bf 100644
--- a/libgomp/Makefile.am
+++ b/libgomp/Makefile.am
@@ -94,7 +94,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c error.c \
priority_queue.c affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c ompd-support.c
 
-libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
+libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c ompd-threads.c
 
 include $(top_srcdir)/plugin/Makefrag.am
 
diff --git a/libgomp/Makefile.in b/libgomp/Makefile.in
index 40f896b5f03..8bbc46cca25 100644
--- a/libgomp/Makefile.in
+++ b/libgomp/Makefile.in
@@ -133,21 +133,8 @@ target_triplet = @target@
 @USE_FORTRAN_TRUE@am__append_7 = openacc.f90
 subdir = .
 ACLOCAL_M4 = $(top_srcdir)/aclocal.m4
-am__aclocal_m4_deps = $(top_srcdir)/../config/acx.m4 \
-   $(top_srcdir)/../config/ax_count_cpus.m4 \
-   $(top_srcdir)/../config/depstand.m4 \
-   $(top_srcdir)/../config/enable.m4 \
-   $(top_srcdir)/../config/futex.m4 \
-   $(top_srcdir)/../config/lead-dot.m4 \
-   $(top_srcdir)/../config/lthostflags.m4 \
-   $(top_srcdir)/../config/multi.m4 \
-   $(top_srcdir)/../config/override.m4 \
-   $(top_srcdir)/../config/tls.m4 \
-   $(top_srcdir)/../config/toolexeclibdir.m4 \
-   $(top_srcdir)/../ltoptions.m4 $(top_srcdir)/../ltsugar.m4 \
-   $(top_srcdir)/../ltversion.m4 $(top_srcdir)/../lt~obsolete.m4 \
-   $(top_srcdir)/acinclude.m4 $(top_srcdir)/../libtool.m4 \
-   $(top_srcdir)/../config/cet.m4 \
+am__aclocal_m4_deps = $(top_srcdir)/acinclude.m4 \
+   $(top_srcdir)/../libtool.m4 $(top_srcdir)/../config/cet.m4 \
$(top_srcdir)/plugin/configfrag.ac $(top_srcdir)/configure.ac
 am__configure_deps = $(am__aclocal_m4_deps) $(CONFIGURE_DEPENDENCIES) \
$(ACLOCAL_M4)
@@ -233,7 +220,8 @@ am_libgomp_la_OBJECTS = alloc.lo atomic.lo barrier.lo 
critical.lo \
affinity-fmt.lo teams.lo allocator.lo oacc-profiling.lo \
oacc-target.lo ompd-support.lo $(am__objects_1)
 libgomp_la_OBJECTS = $(am_libgomp_la_OBJECTS)
-am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo ompd-icv.lo
+am_libgompd_la_OBJECTS = ompd-init.lo ompd-helper.lo ompd-icv.lo \
+   ompd-threads.lo
 libgompd_la_OBJECTS = $(am_libgompd_la_OBJECTS)
 AM_V_P = $(am__v_P_@AM_V@)
 am__v_P_ = $(am__v_P_@AM_DEFAULT_V@)
@@ -485,7 +473,6 @@ dvidir = @dvidir@
 enable_shared = @enable_shared@
 enable_static = @enable_static@
 exec_prefix = @exec_prefix@
-get_gcc_base_ver = @get_gcc_base_ver@
 host = @host@
 host_alias = @host_alias@
 host_cpu = @host_cpu@
@@ -501,10 +488,8 @@ libtool_VERSION = @libtool_VERSION@
 link_gomp = @link_gomp@
 localedir = @localedir@
 localstatedir = @localstatedir@
-lt_host_flags = @lt_host_flags@
 mandir = @mandir@
 mkdir_p = @mkdir_p@
-multi_basedir = @multi_basedir@
 offload_additional_lib_paths = @offload_additional_lib_paths@
 offload_additional_options = @offload_additional_options@
 offload_plugins = @offload_plugins@
@@ -514,6 +499,7 @@ pdfdir = @pdfdir@
 prefix = @prefix@
 program_transform_name = @program_transform_name@
 psdir = @psdir@
+runstatedir = @runstatedir@
 sbindir = @sbindir@
 sharedstatedir = @sharedstatedir@
 srcdir = @srcdir@
@@ -583,7 +569,7 @@ libgomp_la_SOURCES = alloc.c atomic.c barrier.c critical.c 
env.c \
oacc-async.c oacc-plugin.c oacc-cuda.c priority_queue.c \
affinity-fmt.c teams.c allocator.c oacc-profiling.c \
oacc-target.c ompd-support.c $(am__append_7)
-libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c
+libgompd_la_SOURCES = ompd-init.c ompd-helper.c ompd-icv.c ompd-threads.c
 
 # Nvidia PTX OpenACC plugin.
 @PLUGIN_NVPTX_TRUE@libgomp_plugin_nvptx_version_info = -version-info 
$(libtool_VERSION)
@@ -801,6 +787,7 @@ distclean-compile:
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-icv.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-init.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-support.Plo@am__quote@
+@AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ompd-threads.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/ordered.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/parallel.Plo@am__quote@
 @AMDEP_TRUE@@am__include@ @am__quote@./$(DEPDIR)/priority_queue.Plo@am__quote@
diff --git a/libgomp/aclocal.m4 b/libgomp/aclocal.m4
index 55d9d71895a..41915216beb 100644
--- a/libgomp/aclocal.m4
+++ b/libgomp/aclocal.m4
@@ -626,6 +626,25 @@ if test x"${install_sh+set

[PATCH] Add myself for write after approval

2022-06-30 Thread Haochen Jiang via Gcc-patches
Hi all,

I want to add myself in MAINTAINERS for write after approval.

Ok for trunk?

BRs,
Haochen

ChangeLog:

* MAINTAINERS (Write After Approval): Add myself.
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 151770f59f4..3c448ba9eb6 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -464,6 +464,7 @@ Harsha Jagasia  

 Fariborz Jahanian  
 Surya Kumari Jangala   
 Qian Jianhua   
+Haochen Jiang  
 Janis Johnson  
 Teresa Johnson 
 Kean Johnston  
-- 
2.18.1



Re: [x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-06-30 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle  wrote:
>
>
> This patch is a follow-up to Hongtao's fix for PR target/105854.  That
> fix is perfectly correct, but the thing that caught my eye was why is
> the compiler generating a shift by zero at all.  Digging deeper it
> turns out that we can easily optimize __builtin_ia32_palignr for
> alignments of 0 and 64 respectively, which may be simplified to moves
> from the highpart or lowpart.
>
> After adding optimizations to simplify the 64-bit DImode palignr,
> I started to add the corresponding optimizations for vpalignr (i.e.
> 128-bit).  The first oddity is that sse.md uses TImode and a special
> SSESCALARMODE iterator, rather than V1TImode, and indeed the comment
> above SSESCALARMODE hints that this should be "dropped in favor of
> VIMAX_AVX2_AVX512BW".  Hence this patch includes the migration of
> <ssse3_avx2>_palignr<mode> to use VIMAX_AVX2_AVX512BW, basically
> using V1TImode instead of TImode for 128-bit palignr.
>
> But it was only after I'd implemented this clean-up that I stumbled
> across the strange semantics of 128-bit [v]palignr.  According to
> https://www.felixcloutier.com/x86/palignr, the semantics are subtly
> different based upon how the instruction is encoded.  PALIGNR leaves
> the highpart unmodified, whilst VEX.128 encoded VPALIGNR clears the
> highpart, and (unless I'm mistaken) it looks like GCC currently uses
> the exact same RTL/templates for both, treating one as an alternative
> for the other.
I think as long as patterns or intrinsics only care about the low
part, they should be ok.
But if we want to rely on the default behavior for the upper bits, we need
to restrict them to a specific ISA (e.g. vmovq in vec_set_0).
Generally, 128-bit legacy SSE instructions treat the upper bits
differently from their AVX counterparts, and that's why vzeroupper was
introduced for SSE <-> AVX instruction transitions.
>
> Hence I thought I'd post what I have so far (part optimization and
> part clean-up), to then ask the x86 experts for their opinions.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-,32},
> with no new failures.  Ok for mainline?
>
>
> 2022-06-30  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-builtin.def (__builtin_ia32_palignr128): Change
> CODE_FOR_ssse3_palignrti to CODE_FOR_ssse3_palignrv1ti.
> * config/i386/i386-expand.cc (expand_vec_perm_palignr): Use V1TImode
> and gen_ssse3_palignv1ti instead of TImode.
> * config/i386/sse.md (SSESCALARMODE): Delete.
> (define_mode_attr ssse3_avx2): Handle V1TImode instead of TImode.
> (_palignr): Use VIMAX_AVX2_AVX512BW as a mode
> iterator instead of SSESCALARMODE.
>
> (ssse3_palignrdi): Optimize cases when operands[3] is 0 or 64,
> using a single move instruction (if required).
> (define_split): Likewise split UNSPEC_PALIGNR $0 into a move.
> (define_split): Likewise split UNSPEC_PALIGNR $64 into a move.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/ssse3-palignr-2.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>

+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+ (unspec:DI [(match_operand:DI 1 "register_operand")
+(match_operand:DI 2 "register_mmxmem_operand")
+(const_int 0)]
+   UNSPEC_PALIGNR))]
+  ""
+  [(set (match_dup 0) (match_dup 2))])
+
+(define_split
+  [(set (match_operand:DI 0 "register_operand")
+ (unspec:DI [(match_operand:DI 1 "register_operand")
+(match_operand:DI 2 "register_mmxmem_operand")
+(const_int 64)]
+   UNSPEC_PALIGNR))]
+  ""
+  [(set (match_dup 0) (match_dup 1))])
+
define_split is assumed to split into 2 (or more) insns, hence
pass_combine will only try define_split if the number of merged insns
is greater than 2.
For palignr, I think most of the time there would be only 2 merged
insns (constant propagation), so it is better to change them to a
pre-reload splitter
(e.g. define_insn_and_split "*avx512bw_permvar_truncv16siv16hi_1").
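
For reference, the identity that the two splitters above rely on can be modelled in plain C. This is only a sketch of the 64-bit case under the assumption that the immediate is a bit count (as in the const_int 0 and const_int 64 patterns); the helper name palignr64 is made up for illustration and is not a GCC function.

```c
#include <assert.h>
#include <stdint.h>

/* Model of 64-bit PALIGNR: treat hi:lo as one 128-bit value, shift it
   right by BITS, and keep the low 64 bits.  HI plays the role of
   operand 1 and LO the role of operand 2 in the patterns above.  */
static uint64_t
palignr64 (uint64_t hi, uint64_t lo, unsigned bits)
{
  if (bits == 0)
    return lo;                  /* shift by 0: plain move from operand 2 */
  if (bits < 64)
    return (hi << (64 - bits)) | (lo >> bits);
  if (bits == 64)
    return hi;                  /* shift by 64: plain move from operand 1 */
  if (bits < 128)
    return hi >> (bits - 64);
  return 0;                     /* everything shifted out */
}
```

The explicit checks for 0 and 64 also keep the C shifts well defined, since shifting a 64-bit value by 64 is undefined behaviour.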


--
BR,
Hongtao


Re: [PATCH] Add myself for write after approval

2022-06-30 Thread Hongtao Liu via Gcc-patches
I think this can be taken as an obvious fix without prior approval.
"Obvious fixes can be committed without prior approval. Just check in
the fix and copy it to gcc-patches."
Quoted from https://gcc.gnu.org/gitwrite.html

On Fri, Jul 1, 2022 at 10:02 AM Haochen Jiang via Gcc-patches
 wrote:
>
> Hi all,
>
> I want to add myself in MAINTAINERS for write after approval.
>
> Ok for trunk?
>
> BRs,
> Haochen
>
> ChangeLog:
>
> * MAINTAINERS (Write After Approval): Add myself.
> ---
>  MAINTAINERS | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 151770f59f4..3c448ba9eb6 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -464,6 +464,7 @@ Harsha Jagasia  
> 
>  Fariborz Jahanian  
>  Surya Kumari Jangala   
>  Qian Jianhua   
> +Haochen Jiang  
>  Janis Johnson  
> 
>  Teresa Johnson 
>  Kean Johnston  
> --
> 2.18.1
>


-- 
BR,
Hongtao


Re: [x86 PATCH] UNSPEC_PALIGNR optimizations and clean-ups.

2022-06-30 Thread Hongtao Liu via Gcc-patches
On Fri, Jul 1, 2022 at 10:12 AM Hongtao Liu  wrote:
>
> On Fri, Jul 1, 2022 at 2:42 AM Roger Sayle  wrote:
> >
> >
> > This patch is a follow-up to Hongtao's fix for PR target/105854.  That
> > fix is perfectly correct, but the thing that caught my eye was why is
> > the compiler generating a shift by zero at all.  Digging deeper it
> > turns out that we can easily optimize __builtin_ia32_palignr for
> > alignments of 0 and 64 respectively, which may be simplified to moves
> > from the highpart or lowpart.
> >
> > After adding optimizations to simplify the 64-bit DImode palignr,
> > I started to add the corresponding optimizations for vpalignr (i.e.
> > 128-bit).  The first oddity is that sse.md uses TImode and a special
> > SSESCALARMODE iterator, rather than V1TImode, and indeed the comment
> > above SSESCALARMODE hints that this should be "dropped in favor of
> > VIMAX_AVX2_AVX512BW".  Hence this patch includes the migration of
> > _palignr to use VIMAX_AVX2_AVX512BW, basically
> > using V1TImode instead of TImode for 128-bit palignr.
> >
> > But it was only after I'd implemented this clean-up that I stumbled
> > across the strange semantics of 128-bit [v]palignr.  According to
> > https://www.felixcloutier.com/x86/palignr, the semantics are subtly
> > different based upon how the instruction is encoded.  PALIGNR leaves
> > the highpart unmodified, whilst VEX.128 encoded VPALIGNR clears the
> > highpart, and (unless I'm mistaken) it looks like GCC currently uses
> > the exact same RTL/templates for both, treating one as an alternative
> > for the other.
> I think as long as patterns or intrinsics only care about the low
> part, they should be ok.
> But if we want to use default behavior for upper bits, we need to
> restrict them under specific isa(.i.e. vmovq in vec_set_0).
> Generally, 128-bit sse legacy instructions have different behaviors
> for upper bits from AVX ones, and that's why vzeroupper is introduced
> for sse <-> avx instructions transition.
> >
> > Hence I thought I'd post what I have so far (part optimization and
> > part clean-up), to then ask the x86 experts for their opinions.
> >
> > This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> > and make -k check, both with and without --target_board=unix{-,32},
> > with no new failures.  Ok for mainline?
> >
> >
> > 2022-06-30  Roger Sayle  
> >
> > gcc/ChangeLog
> > * config/i386/i386-builtin.def (__builtin_ia32_palignr128): Change
> > CODE_FOR_ssse3_palignrti to CODE_FOR_ssse3_palignrv1ti.
> > * config/i386/i386-expand.cc (expand_vec_perm_palignr): Use V1TImode
> > and gen_ssse3_palignv1ti instead of TImode.
> > * config/i386/sse.md (SSESCALARMODE): Delete.
> > (define_mode_attr ssse3_avx2): Handle V1TImode instead of TImode.
> > (_palignr): Use VIMAX_AVX2_AVX512BW as a mode
> > iterator instead of SSESCALARMODE.
> >
> > (ssse3_palignrdi): Optimize cases when operands[3] is 0 or 64,
> > using a single move instruction (if required).
> > (define_split): Likewise split UNSPEC_PALIGNR $0 into a move.
> > (define_split): Likewise split UNSPEC_PALIGNR $64 into a move.
> >
> > gcc/testsuite/ChangeLog
> > * gcc.target/i386/ssse3-palignr-2.c: New test case.
> >
> >
> > Thanks in advance,
> > Roger
> > --
> >
>
> +(define_split
> +  [(set (match_operand:DI 0 "register_operand")
> + (unspec:DI [(match_operand:DI 1 "register_operand")
> +(match_operand:DI 2 "register_mmxmem_operand")
> +(const_int 0)]
> +   UNSPEC_PALIGNR))]
> +  ""
> +  [(set (match_dup 0) (match_dup 2))])
> +
> +(define_split
> +  [(set (match_operand:DI 0 "register_operand")
> + (unspec:DI [(match_operand:DI 1 "register_operand")
> +(match_operand:DI 2 "register_mmxmem_operand")
> +(const_int 64)]
> +   UNSPEC_PALIGNR))]
> +  ""
> +  [(set (match_dup 0) (match_dup 1))])
> +
> define_split is assumed to be splitted to 2(or more) insns, hence
> pass_combine will only try define_split if the number of merged insns
> is greater than 2.
> For palignr, i think most time there would be only 2 merged
> insns(constant propagation), so better to change them as pre_reload
> splitter.
> (.i.e. (define_insn_and_split "*avx512bw_permvar_truncv16siv16hi_1").
I think you can just merge the 2 define_splits into the
define_insn_and_split "ssse3_palignrdi" by relaxing the split condition to

-  "TARGET_SSSE3 && reload_completed
-   && SSE_REGNO_P (REGNO (operands[0]))"
+  "(TARGET_SSSE3 && reload_completed
+   && SSE_REGNO_P (REGNO (operands[0])))
+   || INTVAL (operands[3]) == 0
+   || INTVAL (operands[3]) == 64"

and you have already handled them by

+  if (operands[3] == const0_rtx)
+{
+  if (!rtx_equal_p (operands[0], operands[2]))
+ emit_move_insn (operands[0], operands[2]);
+  else
+ emit_note (NOTE_INSN_DELETED);
+  DONE;
+}
+  else if (INTVAL (operands[3]) == 64)
+{
+  if (!rtx_equal_p (operands[0], operands[1]))
+ emit_move_insn (operands[0],

Re: [PATCH] Add myself for write after approval

2022-06-30 Thread Jeff Law via Gcc-patches




On 6/30/2022 8:22 PM, Hongtao Liu via Gcc-patches wrote:

I think this can be taken as an obvious fix without prior approval.
"Obvious fixes can be committed without prior approval. Just check in
the fix and copy it to gcc-patches."
Quoted from https://gcc.gnu.org/gitwrite.html
If we've given someone write access, they need to be listed in the 
MAINTAINERS file, so yes, this is something I'd consider obvious.


In fact, I thought the template message we sent to folks when they're 
given write access includes a request to add a new entry to the 
MAINTAINERS file.


Jeff


Re: [x86 PATCH] PR target/106122: Don't update %esp via the stack with -Oz.

2022-06-30 Thread Uros Bizjak via Gcc-patches
On Fri, Jul 1, 2022 at 1:00 AM Roger Sayle  wrote:
>
>
> When optimizing for size with -Oz, setting a register can be minimized by
> pushing an immediate value to the stack and popping it to the destination.
> Alas the one general register that shouldn't be updated via the stack is
> the stack pointer itself, where "pop %esp" can't be represented in GCC's
> RTL ("use of a register mentioned in pre_inc, pre_dec, post_inc or
> post_dec is not permitted within the same instruction").  This patch
> fixes PR target/106122 by explicitly checking for SP_REG in the
> problematic peephole2.
>
> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32}
> with no new failures.  Ok for mainline?
>
>
> 2022-06-30  Roger Sayle  
>
> gcc/ChangeLog
> PR target/106122
> * config/i386/i386.md (peephole2): Avoid generating pop %esp
> when optimizing for size.
>
> gcc/testsuite/ChangeLog
> PR target/106122
> * gcc.target/i386/pr106122.c: New test case.

OK for mainline and backport.

Thanks,
Uros.

>
>
> Thanks in advance,
> Roger
> --
>


Re: [PATCH 3/3] lto-plugin: implement LDPT_GET_API_VERSION

2022-06-30 Thread Richard Biener via Gcc-patches
On Thu, Jun 30, 2022 at 10:42 AM Martin Liška  wrote:
>
> On 6/30/22 08:43, Rui Ueyama wrote:
> > Thanks Martin for creating this patch.
>
> You're welcome.
>
> >
> > Here is a preliminary change for the mold side: 
> > https://github.com/rui314/mold/commit/9ad49d1c556bc963d06cca8233535183490de605
> >  
> > 
> >
> > Overall the API is looking fine,
>
> Good then!
>
> > though it is not clear what kind of value is expected as a linker version. 
> > A linker version is not a single unsigned integer but something like 
> > "1.3.0". Something like "1.3.0-rc2" can also be a linker version. So I 
> > don't think we can represent a linker version as a single integer.
>
> Well, you can use the same what we use GCC_VERSION (plugin_version):
>
> 1000 * MAJOR + MINOR
>
> Let me adjust the documentation of the API.

Hmm, but then why not go back to the original suggestion merging
linker_identifier and linker_version into
a single string.  That of course puts the burden of parsing to the
consumer - still that's probably better
than imposing the constraint of encoding the version in an unsigned
integer.  Alternatively easing
parsing by separating out the version in a string would be possible as
well (but then you'd have
to care for 1.3.0-rc2+gitab4316174 or so, not sure what the advantage
over putting everything in
the identifier would be).

You usually cannot rely on a version anyway since distributors usually
apply patches.
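
To make the trade-off concrete, here is a small C sketch of the "1000 * MAJOR + MINOR" encoding quoted above. The function names are hypothetical and are not part of the plugin API; the parsing rule (take the first two dot-separated numbers) is an assumption made only for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* The integer encoding suggested in the thread.  */
static unsigned
encode_version (unsigned major, unsigned minor)
{
  return 1000 * major + minor;
}

/* Encode the leading "MAJOR.MINOR" of a version string.
   Returns -1 if the string does not start with two numbers.  */
static long
encode_version_string (const char *s)
{
  unsigned major, minor;
  if (sscanf (s, "%u.%u", &major, &minor) != 2)
    return -1;
  return (long) encode_version (major, minor);
}
```

With this encoding "1.3.0", "1.3.0-rc2" and "1.3.0-rc2+gitab4316174" all collapse to 1003, which is the loss of information that a string-based interface would avoid.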

> Richi: May I install the patch?

Let's sort out the version thing and consider simplifying the API.

Richard.

> Thanks,
> Martin
>
> >
> > On Mon, Jun 20, 2022 at 9:01 PM Martin Liška  > > wrote:
> >
> > On 6/20/22 11:35, Richard Biener wrote:
> > > I think this is OK.  Can we get buy-in from mold people?
> >
> > Sure, I've just pinged Rui:
> > https://github.com/rui314/mold/issues/454#issuecomment-1160419030 
> > 
> >
> > Martin
> >


Re: [PATCH] Amend fix for PR middle-end/105874

2022-06-30 Thread Richard Biener via Gcc-patches
On Thu, Jun 30, 2022 at 1:34 PM Eric Botcazou via Gcc-patches
 wrote:
>
> As pointed out by Richard, it's very likely too big of a hammer.
>
> Bootstrapped/regtested on x86-64/Linux, OK for the mainline?

LGTM

>
> 2022-06-30  Eric Botcazou  
>
> PR middle-end/105874
> * expr.cc (expand_expr_real_1) : Force
> EXPAND_MEMORY for the expansion of the inner reference only
> in the usual cases where a memory reference is required.
>
> --
> Eric Botcazou


Re: [PATCH] if-to-switch: properly allow side effects only for first condition

2022-06-30 Thread Richard Biener via Gcc-patches
On Thu, Jun 30, 2022 at 4:29 PM Martin Liška  wrote:
>
> Properly allow side effects only for a first BB in a condition chain.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>
> Ready to be installed?

OK

> Thanks,
> Martin
>
> PR tree-optimization/106126
>
> gcc/ChangeLog:
>
> * gimple-if-to-switch.cc (struct condition_info): Save
> has_side_effect.
> (find_conditions): Parse all BBs.
> (pass_if_to_switch::execute): Allow only side effects for first
> BB.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.dg/tree-ssa/pr106126.c: New test.
> ---
>  gcc/gimple-if-to-switch.cc   | 20 +++-
>  gcc/testsuite/gcc.dg/tree-ssa/pr106126.c | 12 
>  2 files changed, 23 insertions(+), 9 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr106126.c
>
> diff --git a/gcc/gimple-if-to-switch.cc b/gcc/gimple-if-to-switch.cc
> index 70daae2003c..4441206c481 100644
> --- a/gcc/gimple-if-to-switch.cc
> +++ b/gcc/gimple-if-to-switch.cc
> @@ -61,9 +61,11 @@ struct condition_info
>  {
>typedef auto_vec> mapping_vec;
>
> -  condition_info (gcond *cond): m_cond (cond), m_bb (gimple_bb (cond)),
> -m_forwarder_bb (NULL), m_ranges (), m_true_edge (NULL), m_false_edge 
> (NULL),
> -m_true_edge_phi_mapping (), m_false_edge_phi_mapping ()
> +  condition_info (gcond *cond, bool has_side_effect): m_cond (cond),
> +m_bb (gimple_bb (cond)), m_forwarder_bb (NULL), m_ranges (),
> +m_true_edge (NULL), m_false_edge (NULL),
> +m_true_edge_phi_mapping (), m_false_edge_phi_mapping (),
> +m_has_side_effect (has_side_effect)
>{
>  m_ranges.create (0);
>}
> @@ -80,6 +82,7 @@ struct condition_info
>edge m_false_edge;
>mapping_vec m_true_edge_phi_mapping;
>mapping_vec m_false_edge_phi_mapping;
> +  bool m_has_side_effect;
>  };
>
>  /* Recond PHI mapping for an original edge E and save these into vector VEC. 
>  */
> @@ -389,16 +392,11 @@ find_conditions (basic_block bb,
>if (cond == NULL)
>  return;
>
> -  /* An empty conditions_in_bbs indicates we are processing the first
> - basic-block then no need check side effect.  */
> -  if (!conditions_in_bbs->is_empty () && !no_side_effect_bb (bb))
> -return;
> -
>tree lhs = gimple_cond_lhs (cond);
>tree rhs = gimple_cond_rhs (cond);
>tree_code code = gimple_cond_code (cond);
>
> -  condition_info *info = new condition_info (cond);
> +  condition_info *info = new condition_info (cond, !no_side_effect_bb (bb));
>
>gassign *def;
>if (code == NE_EXPR
> @@ -536,6 +534,10 @@ pass_if_to_switch::execute (function *fun)
>   if ((*info2)->m_false_edge != e)
> break;
>
> + /* Only the first BB in a chain can have a side effect.  */
> + if (info->m_has_side_effect)
> +   break;
> +
>   chain->m_entries.safe_push (*info2);
>   bitmap_set_bit (seen_bbs, e->src->index);
>   info = *info2;
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c
> new file mode 100644
> index 000..2f0fd44164b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr106126.c
> @@ -0,0 +1,12 @@
> +/* PR tree-optimization/106126 */
> +/* { dg-do compile } */
> +/* { dg-options "-O2" } */
> +
> +char *var_1;
> +void pool_conda_matchspec() {
> +  for (; var_1 && *var_1 &&
> + *var_1 != '<' && *var_1 != '>' &&
> + *var_1 != '!' && *var_1 != '~';)
> +if (*var_1 >= 'A' && *var_1 <= 'Z')
> +  *var_1 += 'A';
> +}
> --
> 2.36.1
>
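
For readers unfamiliar with the pass, the following C sketch shows roughly the kind of source the if-to-switch transformation targets: a chain of comparisons of one variable against constants, and the switch it is equivalent to. The function names are invented for this example, and whether GCC actually converts a given chain depends on the pass heuristics, which are not modelled here.

```c
#include <assert.h>

/* A condition chain on a single variable; none of the conditions
   after the first has side effects.  */
static int
classify_if (int c)
{
  if (c == '<')
    return 1;
  else if (c == '>')
    return 2;
  else if (c == '!')
    return 3;
  else if (c == '~')
    return 4;
  return 0;
}

/* The equivalent switch form.  */
static int
classify_switch (int c)
{
  switch (c)
    {
    case '<': return 1;
    case '>': return 2;
    case '!': return 3;
    case '~': return 4;
    default:  return 0;
    }
}
```

If a later condition in the chain had a side effect (for example a store before the comparison), folding it into a single switch would skip or reorder that effect, which is why the patch only tolerates side effects in the first basic block.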


Re: [PATCH] aarch64: Fix pure/const function attributes for intrinsics

2022-06-30 Thread Richard Biener via Gcc-patches
On Thu, Jun 30, 2022 at 6:04 PM Andrew Carlotti via Gcc-patches
 wrote:
>
> No testcase for this, since I haven't found a way to turn the incorrect
> attribute into incorrect codegen.
>
> Bootstrapped and tested on aarch64-none-linux-gnu.
>
> gcc/
>
> * config/aarch64/aarch64-builtins.c
> (aarch64_get_attributes): Fix choice of pure/const attributes.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64-builtins.cc 
> b/gcc/config/aarch64/aarch64-builtins.cc
> index 
> e0a741ac663188713e21f457affa57217d074783..877f54aab787862794413259cd36ca0fb7bd49c5
>  100644
> --- a/gcc/config/aarch64/aarch64-builtins.cc
> +++ b/gcc/config/aarch64/aarch64-builtins.cc
> @@ -1085,9 +1085,9 @@ aarch64_get_attributes (unsigned int f, machine_mode 
> mode)
>if (!aarch64_modifies_global_state_p (f, mode))
>  {
>if (aarch64_reads_global_state_p (f, mode))
> -   attrs = aarch64_add_attribute ("pure", attrs);
> -  else
> attrs = aarch64_add_attribute ("const", attrs);
> +  else
> +   attrs = aarch64_add_attribute ("pure", attrs);

that looks backwards.  'pure' allows read of global memory while
'const' does not.  Is
aarch64_reads_global_state_p really backwards?
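
As a reminder of the distinction being discussed, here is a minimal C sketch using the GCC function attributes directly. The variable and function names are made up for illustration and have nothing to do with the aarch64 builtins themselves.

```c
#include <assert.h>

static int table_scale = 3;

/* 'pure': no side effects, but the result may depend on global
   memory, so calls must be re-evaluated after table_scale changes.  */
__attribute__ ((pure)) static int
scaled (int x)
{
  return x * table_scale;
}

/* 'const': the result depends only on the arguments; the function
   must not read global memory at all.  */
__attribute__ ((const)) static int
squared (int x)
{
  return x * x;
}
```

So a function that reads global state may be marked 'pure' at most, while 'const' is only valid for one that neither reads nor modifies it.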

>  }
>
>if (!flag_non_call_exceptions || !aarch64_could_trap_p (f, mode))


Re: [GCC 13][PATCH] PR101836: Add a new option -fstrict-flex-array[=n] and use it in __builtin_object_size

2022-06-30 Thread Richard Biener via Gcc-patches
On Thu, Jun 30, 2022 at 9:30 PM Qing Zhao  wrote:
>
>
>
> > On Jun 30, 2022, at 1:03 PM, Jakub Jelinek  wrote:
> >
> > On Thu, Jun 30, 2022 at 03:31:00PM +, Qing Zhao wrote:
> >>> No, that’s not true.  A FIELD_DECL is only shared for cv variants of a 
> >>> structure.
> >>
> >> Sorry for my dumb questions:
> >>
> >> 1. What do you mean by “cv variants” of a structure?
> >
> > const/volatile qualified variants.  So
> Okay. I see. thanks.
> >
> >> 2. For the following example:
> >>
> >> struct AX { int n; short ax[];};
> >
> > struct AX, const struct AX, volatile const struct AX etc. types will share
> > the FIELD_DECLs.
>
> Okay.
> >
> >> struct UX {struct AX b; int m;};
> >>
> >> Are there two different FIELD_DECLs in the IR, one for AX.ax, the other 
> >> one is for UX.b.ax?
> >
> > No, there are just n and ax FIELD_DECLs with DECL_CONTEXT of struct AX and
> > b and m FIELD_DECLs with DECL_CONTEXT of struct UX.
>
> Ah, right.
>
>
> >
> > But, what is important is that when some FIELD_DECL is last in some
> > structure and has array type, it doesn't mean it should have an
> > unconstrained length.
> > In the above case, when struct AX is followed by some other member, it
> > acts as a strict short ax[0]; field (even when that is an exception), one
> > can take the address of &UX.b.ax[0], but can't dereference that, or &UX.b.ax[1].
>
> So, is this a GNU extension? I see that Clang gives a warning by default and 
> GCC gives a warning when -pedantic is specified:
> [opc@qinzhao-ol8u3-x86 trailing_array]$ cat t3.c
> struct AX
> {
>   int n;
>   short ax[];
> };
>
> struct UX
> {
>   struct AX b;
>   int m;
> };
>
> void warn_ax_local (struct AX *p, struct UX *q)
> {
>   p->ax[2] = 0;
>   q->b.ax[2] = 0;
> }
> [opc@qinzhao-ol8u3-x86 trailing_array]$ clang -O2 -Wall t3.c -S
> t3.c:9:13: warning: field 'b' with variable sized type 'struct AX' not at the 
> end of a struct or class is a GNU extension 
> [-Wgnu-variable-sized-type-not-at-end]
>   struct AX b;
> ^
> [opc@qinzhao-ol8u3-x86 trailing_array]$ gcc -O2 -Wall t3.c -pedantic -S
> t3.c:9:13: warning: invalid use of structure with flexible array member 
> [-Wpedantic]
> 9 |   struct AX b;
>   | ^
>
> But, Yes, I agree, even though this is only a GNU extension, We still need to 
> handle it and accept it as legal code.
>
> Then, yes, I also agree that encoding the info of is_flexible_array into 
> FIELD_DECL is not good.

Which is why I suggested to encode 'not_flexible_array'.  This way the
FE can mark all a[1] this way in some mode
but leave a[] as possibly flexarray (depending on context).

> How about encoding the info of “has_flexible_array” into the enclosing 
> RECORD_TYPE or UNION_TYPE node?

But that has the same issue.  Consider

struct A { int n; int a[1]; };

where a is considered possibly a flexarray vs.

struct B { struct A a; int b; };

where B.a would be not considered to have a flexarray (again note
'possibly' vs. 'actually does').

Also

struct A a;

has 'a' as _not_ having a flexarray (because its size is statically
allocated) but

struct A *a;
struct B *b;

a->a[n];

as possibly accessing the flexarray portion of *a while

b->a.a[n]

is not accessing a flexarray because there's a member after a in b.

For your original proposal it's really the field declaration itself
which changes so annotating the FIELD_DECL
seems correct to me.
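
As a concrete reference point for the "possibly accessing the flexarray portion" case, here is a minimal C sketch of the usual allocation idiom for a C99 flexible array member, using the struct AX from earlier in the thread. The helper names ax_alloc and ax_sum are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct AX { int n; short ax[]; };

/* Allocate a struct AX with room for N trailing elements and fill
   them with 0..N-1.  Returns NULL on allocation failure.  */
static struct AX *
ax_alloc (int n)
{
  struct AX *p = malloc (offsetof (struct AX, ax)
                         + (size_t) n * sizeof (short));
  if (p)
    {
      p->n = n;
      for (int i = 0; i < n; i++)
        p->ax[i] = (short) i;
    }
  return p;
}

/* Sum the trailing elements; the valid length is only known at run
   time through p->n, not from the declared type.  */
static int
ax_sum (const struct AX *p)
{
  int sum = 0;
  for (int i = 0; i < p->n; i++)
    sum += p->ax[i];
  return sum;
}
```

Through a pointer the compiler cannot see how much storage follows the struct, which is why such an access has to be treated as possibly reaching into the flexible part, whereas a statically allocated object or a member followed by another field gives it a fixed bound.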

> For example, in the above example,  the RECORD_TYPE for “struct AX” will be 
> marked as “has_flexible_array”, but that for “struct UX” will not.
>
> >
> > I believe pedantically flexible array members in such cases don't
> > necessarily mean zero length array, could be longer, e.g. for the usual
> > x86_64 alignments
> > struct BX { long long n; short o; short ax[]; };
> > struct VX { struct BX b; int m; };
> > I think it acts as short ax[3]; because the padding at the end of struct BX
> > is so long that 3 short elements fit in there.
> > While if one uses
> > struct BX bx = { 1LL, 2, { 3, 4, 5, 6, 7, 8, 9, 10 } };
> > (a GNU extension), then it acts as short ax[11]; - the initializer is 8
> > elements and after short ax[8]; is padding for another 3 full elements.
> > And of course:
> > struct BX *p = malloc (offsetof (struct BX, ax) + n * sizeof (short));
> > means short ax[n].
> > Whether struct WX { struct BX b; };
> > struct WX *p = malloc (offsetof (struct WX, b.ax) + n * sizeof (short));
> > is pedantically acting as short ax[n]; is unclear to me, but we are
> > generally allowing that and people expect it.
>
> Okay, I see now.
> >
> > Though, on the GCC side, I think we are only treating like flexible arrays
> > what is really at the end of structs, not followed by other members.
>
> My understanding is, Permitting flexible array to be followed by other 
> members is a GNU extension.  (Actually, it’s not allowed by standard?).
>
> Thanks a lot for your patience and help.
>
> Qing
> >
> >   Jakub
> >
>


[PATCH] tree-optimization/106131 - wrong code with FRE rewriting

2022-06-30 Thread Richard Biener via Gcc-patches


The following makes sure to not use the original TBAA type for
looking up a value across an aggregate copy when we had to offset
the read.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed to trunk.

2022-06-30  Richard Biener  

PR tree-optimization/106131
* tree-ssa-sccvn.cc (vn_reference_lookup_3): Force alias-set
zero when offsetting the read looking through an aggregate
copy.

* g++.dg/torture/pr106131.C: New testcase.
---
 gcc/testsuite/g++.dg/torture/pr106131.C | 34 +
 gcc/tree-ssa-sccvn.cc   | 16 +---
 2 files changed, 46 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/torture/pr106131.C

diff --git a/gcc/testsuite/g++.dg/torture/pr106131.C 
b/gcc/testsuite/g++.dg/torture/pr106131.C
new file mode 100644
index 000..e110f4a8fe6
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/pr106131.C
@@ -0,0 +1,34 @@
+// { dg-do run { target c++11 } }
+
+struct Pair {
+int a, b;
+Pair(const Pair &) = default;
+Pair(int _a, int _b) : a(_a), b(_b) {}
+Pair &operator=(const Pair &z) {
+   a = z.a;
+   b = z.b;
+   return *this;
+}
+};
+
+const int &max(const int &a, const int &b)
+{
+  return a < b ? b : a;
+}
+
+int foo(Pair x, Pair y)
+{
+  return max(x.b, y.b);
+}
+
+int main()
+{
+  auto f = new Pair[3] {{0, -11}, {0, -8}, {0, 2}};
+  for (int i = 0; i < 1; i++) {
+  f[i] = f[0];
+  if(i == 0)
+   f[i] = f[2];
+  if (foo(f[i], f[1]) != 2)
+   __builtin_abort();
+  }
+}
diff --git a/gcc/tree-ssa-sccvn.cc b/gcc/tree-ssa-sccvn.cc
index 9deedeac378..76d92895a3a 100644
--- a/gcc/tree-ssa-sccvn.cc
+++ b/gcc/tree-ssa-sccvn.cc
@@ -3243,12 +3243,12 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
   poly_int64 extra_off = 0;
   if (j == 0 && i >= 0
  && lhs_ops[0].opcode == MEM_REF
- && maybe_ne (lhs_ops[0].off, -1))
+ && known_ne (lhs_ops[0].off, -1))
{
  if (known_eq (lhs_ops[0].off, vr->operands[i].off))
i--, j--;
  else if (vr->operands[i].opcode == MEM_REF
-  && maybe_ne (vr->operands[i].off, -1))
+  && known_ne (vr->operands[i].off, -1))
{
  extra_off = vr->operands[i].off - lhs_ops[0].off;
  i--, j--;
@@ -3275,6 +3275,7 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
   copy_reference_ops_from_ref (rhs1, &rhs);
 
   /* Apply an extra offset to the inner MEM_REF of the RHS.  */
+  bool force_no_tbaa = false;
   if (maybe_ne (extra_off, 0))
{
  if (rhs.length () < 2)
@@ -3287,6 +3288,10 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
  rhs[ix].op0 = int_const_binop (PLUS_EXPR, rhs[ix].op0,
 build_int_cst (TREE_TYPE (rhs[ix].op0),
extra_off));
+ /* When we have offsetted the RHS, reading only parts of it,
+we can no longer use the original TBAA type, force alias-set
+zero.  */
+ force_no_tbaa = true;
}
 
   /* Save the operands since we need to use the original ones for
@@ -3339,8 +3344,11 @@ vn_reference_lookup_3 (ao_ref *ref, tree vuse, void 
*data_,
   /* Adjust *ref from the new operands.  */
   ao_ref rhs1_ref;
   ao_ref_init (&rhs1_ref, rhs1);
-  if (!ao_ref_init_from_vn_reference (&r, ao_ref_alias_set (&rhs1_ref),
- ao_ref_base_alias_set (&rhs1_ref),
+  if (!ao_ref_init_from_vn_reference (&r,
+ force_no_tbaa ? 0
+ : ao_ref_alias_set (&rhs1_ref),
+ force_no_tbaa ? 0
+ : ao_ref_base_alias_set (&rhs1_ref),
  vr->type, vr->operands))
return (void *)-1;
   /* This can happen with bitfields.  */
-- 
2.36.1



[PATCH] Remove legacy -gz=zlib-gnu

2022-06-30 Thread Fangrui Song via Gcc-patches
From: Fangrui Song 

SHF_COMPRESSED style zlib has been supported since binutils 2.26
and the legacy zlib-gnu option hasn't gained adoption.
According to Debian Code Search (`gz=zlib-gnu`), no project uses
-gz=zlib-gnu (valgrind has a configure to use -gz=zlib).
Remove support for the legacy zlib-gnu and simplify configure.ac by
removing zlib-gnu ld/as check.
---
 gcc/common.opt  |  3 ---
 gcc/configure   | 33 ++---
 gcc/configure.ac| 29 -
 gcc/doc/invoke.texi | 11 +--
 gcc/gcc.cc  | 22 ++
 5 files changed, 17 insertions(+), 81 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index e7a51e882ba..8754d93d545 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -3424,9 +3424,6 @@ Enum(compressed_debug_sections) String(none) Value(0)
 EnumValue
 Enum(compressed_debug_sections) String(zlib) Value(1)
 
-EnumValue
-Enum(compressed_debug_sections) String(zlib-gnu) Value(2)
-
 gz
 Common Driver
 Generate compressed debug sections.
diff --git a/gcc/configure b/gcc/configure
index 62872d132ea..ca87e875e9d 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -19674,7 +19674,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19679 "configure"
+#line 19677 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19780,7 +19780,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19785 "configure"
+#line 19783 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -29711,20 +29711,13 @@ else
if $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s 2>&1 | 
grep -i warning > /dev/null
then
  gcc_cv_as_compress_debug=0
-   # Since binutils 2.26, gas supports --compress-debug-sections=type,
+   # Since binutils 2.26, gas supports --compress-debug-sections=zlib,
# defaulting to the ELF gABI format.
-   elif $gcc_cv_as --compress-debug-sections=zlib-gnu -o conftest.o conftest.s 
> /dev/null 2>&1
+   elif $gcc_cv_as --compress-debug-sections=zlib -o conftest.o conftest.s > 
/dev/null 2>&1
then
  gcc_cv_as_compress_debug=2
  gcc_cv_as_compress_debug_option="--compress-debug-sections"
  gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
-   # Before binutils 2.26, gas only supported --compress-debug-options and
-   # emitted the traditional GNU format.
-   elif $gcc_cv_as --compress-debug-sections -o conftest.o conftest.s > 
/dev/null 2>&1
-   then
- gcc_cv_as_compress_debug=1
- gcc_cv_as_compress_debug_option="--compress-debug-sections"
- gcc_cv_as_no_compress_debug_option="--nocompress-debug-sections"
else
  gcc_cv_as_compress_debug=0
fi
@@ -30238,42 +30231,28 @@ $as_echo "$gcc_cv_ld_eh_gc_sections_bug" >&6; }
 
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking linker for compressed debug 
sections" >&5
 $as_echo_n "checking linker for compressed debug sections... " >&6; }
-# gold/gld support compressed debug sections since binutils 2.19/2.21
-# In binutils 2.26, gld gained support for the ELF gABI format.
+# GNU ld/gold support --compressed-debug-sections=zlib since binutils 2.26.
 if test $in_tree_ld = yes ; then
   gcc_cv_ld_compress_debug=0
   if test $ld_is_mold = yes; then
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
-  elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
-ge 19 -o "$gcc_cv_gld_major_version" -gt 2 \
- && test $in_tree_ld_is_elf = yes && test $ld_is_gold = yes; then
-gcc_cv_ld_compress_debug=2
-gcc_cv_ld_compress_debug_option="--compress-debug-sections"
   elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
-ge 26 -o "$gcc_cv_gld_major_version" -gt 2 \
  && test $in_tree_ld_is_elf = yes && test $ld_is_gold = no; then
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
-  elif test "$gcc_cv_gld_major_version" -eq 2 -a "$gcc_cv_gld_minor_version" 
-ge 21 -o "$gcc_cv_gld_major_version" -gt 2 \
- && test $in_tree_ld_is_elf = yes; then
-gcc_cv_ld_compress_debug=1
   fi
 elif echo "$ld_ver" | grep GNU > /dev/null; then
   if test $ld_is_mold = yes; then
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
   elif test "$ld_vers_major" -lt 2 \
- || test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 21; then
+ || test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 26; then
 gcc_cv_ld_compress_debug=0
-  elif test "$ld_vers_major" -eq 2 -a "$ld_vers_minor" -lt 26; then
-gcc_cv_ld_compress_debug=1
   else
 gcc_cv_ld_compress_debug=3
 gcc_cv_ld_compress_debug_option="--compress-debug-sections"
   fi
-  if test $ld_is_gold = yes; then
-gcc_cv_ld_compress_debug=2
-gcc_cv_ld_compress_debug_option="--compress-debug-sections"
-