date:20221223

Re: Adding a new thread model to GCC

2022-12-23 Thread i.nixman--- via Gcc-patches


On 2022-12-24 05:58, NightStrike wrote:


I think this might have broken fortran.  I'm assuming because the
backtrace includes gthr.h, and I just did a git pull:

In file included from /tmp/rtmingw/mingw/include/windows.h:71,
 from ../libgcc/gthr-default.h:606,
 from ../../../libgfortran/../libgcc/gthr.h:148,
 from ../../../libgfortran/io/io.h:33,
 from ../../../libgfortran/runtime/error.c:27:
../../../libgfortran/io/io.h:298:24: error: expected identifier before
numeric constant
  298 | { CC_LIST, CC_FORTRAN, CC_NONE,
  |^~~



hmm...

I don't remember if I specified `fortran` in `--enable-languages` in my 
test builds...

will try to build again now...

Re: Adding a new thread model to GCC

2022-12-23 Thread i.nixman--- via Gcc-patches


On 2022-12-23 23:59, Jonathan Yong wrote:


Done, pushed to master branch. Thanks Eric.



thank you Jonathan!

Re: Adding a new thread model to GCC

2022-12-23 Thread NightStrike via Gcc-patches

On Fri, Dec 23, 2022 at 7:00 PM Jonathan Yong via Gcc-patches
 wrote:
>
> On 12/22/22 12:28, i.nix...@autistici.org wrote:
> > On 2022-12-22 12:21, Jonathan Yong wrote:
> >
> > hello,
> >
> >> On 12/16/22 19:20, Eric Botcazou wrote:
> >>>> The libgcc parts look reasonable to me, but I can't approve them.
> >>>> Maybe Jonathan Yong can approve those parts as mingw-w64 target
> >>>> maintainer, or maybe a libgcc approver can do so.
> >>>
> >>> OK.
> >>>
> >>>> The libstdc++ parts are OK for trunk. IIUC they could go in
> >>>> separately, they just wouldn't be very much use without the libgcc
> >>>> changes.
> >>>
> >>> Sure thing.
> >>>
> >>
> >> Ping, need help to commit it?
> >
> > yes, it would be great if we can merge the patch into gcc-13!
> >
> > I've tested it on gcc-12-branch and gcc-master for i686/x86_64 windows,
> > with msvcrt and ucrt runtime - works as it should!
> >
> > Eric ^^^
> >
> >
> >
> > best!
>
> Done, pushed to master branch. Thanks Eric.


I think this might have broken fortran.  I'm assuming because the
backtrace includes gthr.h, and I just did a git pull:

In file included from /tmp/rtmingw/mingw/include/windows.h:71,
 from ../libgcc/gthr-default.h:606,
 from ../../../libgfortran/../libgcc/gthr.h:148,
 from ../../../libgfortran/io/io.h:33,
 from ../../../libgfortran/runtime/error.c:27:
../../../libgfortran/io/io.h:298:24: error: expected identifier before
numeric constant
  298 | { CC_LIST, CC_FORTRAN, CC_NONE,
  |^~~

Re: testsuite under wine

2022-12-23 Thread Jacob Bachmeyer via Gcc


NightStrike wrote:

On Wed, Dec 21, 2022 at 11:37 PM Jacob Bachmeyer  wrote:
  

NightStrike wrote:


[...]
Second, the problems with extra \r's still remain, but I think we've
generally come to think that that part isn't Wine and is instead
either the testsuite or deja.  So I'll keep those replies to Jacob's
previous message.

  

Most likely, it is a combination of the MinGW libc (which emits "\r\n"
for end-of-line in accordance with Windows convention) and the kernel
terminal driver (which passes "\r" and translates "\n" to "\r\n" in
accordance with POSIX convention).  Wine, short of trying to translate
"\r\n" back to "\n" in accordance with POSIX conventions (and likely
making an even bigger mess---does Wine know if a handle is supposed to
be text or binary?) cannot really fix this, so the testsuite needs to
handle non-POSIX-standard line endings.  (The Rust tests probably have
an outright bug if the newlines are being duplicated.)



You may be onto something here.  I ran wine under script as `script -c
"wine64 ./a.exe" out` (thanks, Arsen!), and it had the same extra \r
prepended to the \r\n.  I was making the mistake previously of running
wine manually and capturing it to a file as `wine64 ./a.exe > out`,
which as several have pointed out in this thread, that would disable
the quirk, so of course it didn't reveal any problems.  I'm behind,
but I'll catch up to you guys eventually :)
  


So close, and yet so far...  script(1) /also/ uses a pty, so it is 
getting the same translations as Expect and therefore DejaGnu.



So at least we know for sure that this particular instance of extra
characters is coming from Wine.  Maybe Wine can be smart enough to
only translate \n into \r\n instead of translating \r\n into \r\r\n.
Jacek / Eric, comments here?  I'm happy to try another patch, the
first one was great.
  


I doubt that Wine is doing that translation.  MinGW libc produces output 
conformant to Windows conventions, so printf("\n") on a text handle 
emits "\r\n", which Wine passes along.  POSIX convention is that "\n" is 
translated to "\r\n" in the kernel terminal driver upon output, so the 
kernel translates the "\n" in the "\r\n" into /another/ "\r\n", yielding 
"\r\r\n" at the pty master end.  This is why DejaGnu testsuites must be 
prepared to discard excess carriage returns.  The first CR came from 
MinGW libc; the second CR came from the kernel terminal driver; the LF 
was ultimately passed through.
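
The two-stage translation described above can be modelled in a few lines of Python. This is only a sketch: `libc_text_mode` and `tty_onlcr` are hypothetical names standing in for the MinGW libc text-mode conversion and the kernel terminal driver's output processing, and `strip_excess_cr` is just one way a testsuite might normalize the result, not DejaGnu's actual code.

```python
import re

def libc_text_mode(data: str) -> str:
    """Model of Windows text-mode output: each LF becomes CR LF."""
    return data.replace("\n", "\r\n")

def tty_onlcr(data: str) -> str:
    """Model of the POSIX terminal driver on output: each LF becomes
    CR LF, and an existing CR is passed through unchanged."""
    return data.replace("\n", "\r\n")

def strip_excess_cr(data: str) -> str:
    """Collapse one or more CRs directly before an LF into a bare LF."""
    return re.sub(r"\r+\n", "\n", data)

# What the pty master sees for printf("hello\n") in this model.
at_pty_master = tty_onlcr(libc_text_mode("hello\n"))
print(repr(at_pty_master))                   # 'hello\r\r\n'
print(repr(strip_excess_cr(at_pty_master)))  # 'hello\n'
```

In this model neither stage is wrong on its own; the doubled CR only appears when both are chained, which is why the cleanup is placed on the consumer side.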



Rust is getting \r\r\n\n (as opposed to \r\n\r\n), so...  yeah.  Could
be the rust test, could be the rust frontend, could be another weird
Wine interaction.
  


That is probably either a Rust bug or the intended behavior.  Does the 
test produce "\n\n" or "\r\n\n" when run natively?  (Note that the 
terminal driver could reasonably optimize:  once one CR has been 
produced, any number of LF may follow:  the cursor is assumed to remain 
at the left edge.  It is possible that the kernel terminal driver could 
even strip the second CR in "\r\n\r\n" since its only effect on an 
actual serial terminal would be wasting transmission time.)



-- Jacob

[PATCH] RISC-V: Fix ICE of visiting non-existing block in CFG.

2022-12-23 Thread juzhe . zhong

From: Ju-Zhe Zhong 

This patch fixes an ICE caused by visiting a non-existent block in the CFG.
Since block indices in a GCC CFG are not always contiguous, iterating over
the whole index range can visit a gap index for which no block exists in
the current CFG.

This patch avoids visiting such non-existent blocks.
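
The difference between the two iteration styles can be illustrated with a small Python model. Everything here is hypothetical (the `blocks` dictionary, the `infos` side table and both helper functions are illustration only, not GCC data structures); in GCC a gap slot would hold unusable data rather than `None`, but the failure mode is analogous.

```python
# Hypothetical CFG whose block indices have gaps (no blocks 3 or 4).
blocks = {0: "entry", 1: "exit", 2: "body", 5: "loop"}

# Side table sized by the highest index, as a per-block info vector would be.
infos = [None] * (max(blocks) + 1)
for index in blocks:
    infos[index] = {"dirty": index % 2 == 1}

def dirty_by_index_range():
    """Old style: walk every slot of the table, including the gaps."""
    return [i for i in range(len(infos)) if infos[i]["dirty"]]

def dirty_by_existing_blocks():
    """New style: walk only the blocks that actually exist in the CFG."""
    return [i for i in sorted(blocks) if infos[i]["dirty"]]

print(dirty_by_existing_blocks())  # [1, 5]
try:
    dirty_by_index_range()
except TypeError as err:
    # Slot 3 has no block behind it, so touching its info fails.
    print("gap slot visited:", err)
```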

I noticed this issue in my internal regression run of the current testsuite
after I changed the x86 server machine. This patch fixes it:
17:27:15  job(build_and_test_rv32): Increased FAIL List:
17:27:15  job(build_and_test_rv32): FAIL: 
gcc.target/riscv/rvv/vsetvl/vlmax_back_prop-46.c
-O2 -flto -fno-use-linker-plugin -flto-partition=none  (internal compiler 
error: Segmentation fault)

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc 
(pass_vsetvl::compute_global_backward_infos): Change to visit CFG.
(pass_vsetvl::prune_expressions): Ditto.

---
 gcc/config/riscv/riscv-vsetvl.cc | 21 ++---
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index a55b5a1c394..0d66765e09c 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -1962,12 +1962,10 @@ pass_vsetvl::compute_global_backward_infos (void)
   if (dump_file)
 {
   fprintf (dump_file, "\n\nDirty blocks list: ");
-  for (size_t i = 0; i < m_vector_manager->vector_block_infos.length ();
-  i++)
-   {
- if (m_vector_manager->vector_block_infos[i].reaching_out.dirty_p ())
-   fprintf (dump_file, "%ld ", i);
-   }
+  for (const bb_info *bb : crtl->ssa->bbs ())
+   if (m_vector_manager->vector_block_infos[bb->index ()]
+ .reaching_out.dirty_p ())
+ fprintf (dump_file, "%d ", bb->index ());
   fprintf (dump_file, "\n\n");
 }
 }
@@ -1976,15 +1974,16 @@ pass_vsetvl::compute_global_backward_infos (void)
 void
 pass_vsetvl::prune_expressions (void)
 {
-  for (size_t i = 0; i < m_vector_manager->vector_block_infos.length (); i++)
+  for (const bb_info *bb : crtl->ssa->bbs ())
 {
-  if (m_vector_manager->vector_block_infos[i].local_dem.valid_or_dirty_p ())
+  if (m_vector_manager->vector_block_infos[bb->index ()]
+   .local_dem.valid_or_dirty_p ())
m_vector_manager->create_expr (
- m_vector_manager->vector_block_infos[i].local_dem);
-  if (m_vector_manager->vector_block_infos[i]
+ m_vector_manager->vector_block_infos[bb->index ()].local_dem);
+  if (m_vector_manager->vector_block_infos[bb->index ()]
.reaching_out.valid_or_dirty_p ())
m_vector_manager->create_expr (
- m_vector_manager->vector_block_infos[i].reaching_out);
+ m_vector_manager->vector_block_infos[bb->index ()].reaching_out);
 }
 
   if (dump_file)
-- 
2.36.3

[Bug pch/105858] MinGW-w64 64-bit build with --libstdcxx-pch: fatal error: cannot write PCH file: required memory segment unavailable

2022-12-23 Thread egor.pugin at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105858

--- Comment #6 from Egor Pugin  ---
Same issue.

[r13-4873 Regression] FAIL: gcc.target/i386/pr107548-1.c scan-assembler-times \tmovd\t 3 on Linux/x86_64

2022-12-23 Thread haochen.jiang via Gcc-patches

On Linux/x86_64,

0b2c1369d035e92847cca81fd9f7b4e9ab9da710 is the first bad commit
commit 0b2c1369d035e92847cca81fd9f7b4e9ab9da710
Author: Roger Sayle 
Date:   Fri Dec 23 09:56:30 2022 +0000

PR target/107548: Handle vec_select in STV on x86.

caused

FAIL: gcc.target/i386/pr107548-1.c scan-assembler-times \tmovd\t 3

with GCC configured with

../../gcc/configure 
--prefix=/export/users/haochenj/src/gcc-bisect/master/master/r13-4873/usr 
--enable-clocale=gnu --with-system-zlib --with-demangler-in-ld 
--with-fpmath=sse --enable-languages=c,c++,fortran --enable-cet --without-isl 
--enable-libmpx x86_64-linux --disable-bootstrap

To reproduce:

$ cd {build_dir}/gcc && make check 
RUNTESTFLAGS="i386.exp=gcc.target/i386/pr107548-1.c --target_board='unix{-m64\ 
-march=cascadelake}'"

(Please do not reply to this email; for questions about this report, contact me
at haochen dot jiang at intel.com)

Re: Ping^2: [PATCH] d: Update __FreeBSD_version values [PR107469]

2022-12-23 Thread Gerald Pfeifer

Hi Ian (and Andreas),

On Wed, 14 Dec 2022, Lorenzo Salvadore wrote:
> Ping https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605685.html
> 
> I would like to remind that Gerald Pfeifer already volunteered to commit 
> this patch when it is approved. However the patch has not been approved 
> yet.

I am tempted to commit this under our obvious rule (and this has been part 
of the FreeBSD ports for weeks now). 

It still would be preferable to get your review (and approval ideally ;-), 
though. Would you mind having a look?

(Andreas, any take as GCC's FreeBSD maintainer?)

Thanks,
Gerald

>> --- Original Message ---
>> On Friday, November 11th, 2022 at 12:07 AM, Lorenzo Salvadore 
>> develo...@lorenzosalvadore.it wrote:
>> 
>>> Update __FreeBSD_version values for the latest FreeBSD supported
>>> versions. In particular, add __FreeBSD_version for FreeBSD 14, which is
>>> necessary to compile libphobos successfully on FreeBSD 14.
>>> 
>>> The patch has already been applied successfully in the official FreeBSD
>>> ports tree for the ports lang/gcc11 and lang/gcc11-devel. Please see the
>>> following commits:
>>> 
>>> https://cgit.freebsd.org/ports/commit/?id=f61fb49b2e76fd4f7a5b7a11510b5109206c19f2
>>> https://cgit.freebsd.org/ports/commit/?id=57936dba89ea208e5dbc1bd2d7fda3d29a1838b3
>>> 
>>> libphobos/ChangeLog:
>>> 
>>> 2022-11-10 Lorenzo Salvadore develo...@lorenzosalvadore.it
>>> 
>>> PR d/107469.
>>> * libdruntime/core/sys/freebsd/config.d: Update __FreeBSD_version.
>>> 
>>> ---
>>> libphobos/libdruntime/core/sys/freebsd/config.d | 5 +++--
>>> 1 file changed, 3 insertions(+), 2 deletions(-)
>>> 
>>> diff --git a/libphobos/libdruntime/core/sys/freebsd/config.d 
>>> b/libphobos/libdruntime/core/sys/freebsd/config.d
>>> index 5e3129e2422..9d502e52e32 100644
>>> --- a/libphobos/libdruntime/core/sys/freebsd/config.d
>>> +++ b/libphobos/libdruntime/core/sys/freebsd/config.d
>>> @@ -14,8 +14,9 @@ public import core.sys.posix.config;
>>> // NOTE: When adding newer versions of FreeBSD, verify all current versioned
>>> // bindings are still compatible with the release.
>>> 
>>> - version (FreeBSD_13) enum __FreeBSD_version = 1300000;
>>> -else version (FreeBSD_12) enum __FreeBSD_version = 1202000;
>>> + version (FreeBSD_14) enum __FreeBSD_version = 1400000;
>>> +else version (FreeBSD_13) enum __FreeBSD_version = 1301000;
>>> +else version (FreeBSD_12) enum __FreeBSD_version = 1203000;
>>> else version (FreeBSD_11) enum __FreeBSD_version = 1104000;
>>> else version (FreeBSD_10) enum __FreeBSD_version = 1004000;
>>> else version (FreeBSD_9) enum __FreeBSD_version = 903000;
>>> --
>>> 2.38.0
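
For readers unfamiliar with the numbers in the diff above, here is a minimal Python sketch of how such values can be decoded. It assumes the usual FreeBSD convention that `__FreeBSD_version` is the major version times 100000 plus the minor version times 1000 plus a small counter; the helper name `decode_freebsd_version` is hypothetical.

```python
def decode_freebsd_version(value: int) -> tuple:
    """Split a __FreeBSD_version value into (major, minor),
    assuming the major*100000 + minor*1000 + counter layout."""
    major = value // 100000
    minor = (value % 100000) // 1000
    return major, minor

# Values taken from the patch context above.
for value in (1301000, 1203000, 1104000, 1004000, 903000):
    print(value, decode_freebsd_version(value))
```

Under that assumption the updated entries correspond to FreeBSD 13.1 and 12.3.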

Re: Adding a new thread model to GCC

2022-12-23 Thread Jonathan Yong via Gcc-patches


On 12/22/22 12:28, i.nix...@autistici.org wrote:

On 2022-12-22 12:21, Jonathan Yong wrote:

hello,


On 12/16/22 19:20, Eric Botcazou wrote:

The libgcc parts look reasonable to me, but I can't approve them.
Maybe Jonathan Yong can approve those parts as mingw-w64 target
maintainer, or maybe a libgcc approver can do so.


OK.


The libstdc++ parts are OK for trunk. IIUC they could go in
separately, they just wouldn't be very much use without the libgcc
changes.


Sure thing.



Ping, need help to commit it?


yes, it would be great if we can merge the patch into gcc-13!

I've tested it on gcc-12-branch and gcc-master for i686/x86_64 windows, 
with msvcrt and ucrt runtime - works as it should!


Eric ^^^



best!


Done, pushed to master branch. Thanks Eric.




[Bug libstdc++/108212] [13 Regression] pretty printers don't work with Python 2 due to imports for chrono

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108212

Jonathan Wakely  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|pretty printers             |[13 Regression] pretty
                   |don't work with Python 2    |printers don't work with
                   |                            |Python 2 due to imports for
                   |                            |chrono
           Priority|P3                          |P1

[Bug libstdc++/108211] std::chrono::current_zone() fails if zone only has one component

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108211

--- Comment #1 from Jonathan Wakely  ---
The obvious solution is to try locate_zone(dir/filename) and if that fails try
locate_zone(filename).
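
A rough Python sketch of that fallback follows. The names are hypothetical: `locate_zone` here is a stand-in that looks a name up in a plain set, and the sketch assumes the zone name is derived from the trailing components of a path, which the comment above does not spell out.

```python
def locate_zone(name, known_zones):
    """Hypothetical stand-in for a tzdb lookup; raises if unknown."""
    if name not in known_zones:
        raise LookupError(name)
    return name

def zone_from_path(path, known_zones):
    """Try 'dir/filename' first, then fall back to 'filename' alone."""
    parts = path.rstrip("/").split("/")
    filename = parts[-1]
    if len(parts) >= 2:
        try:
            return locate_zone(parts[-2] + "/" + filename, known_zones)
        except LookupError:
            pass  # e.g. 'zoneinfo/UTC' is not a zone name; try the short form
    return locate_zone(filename, known_zones)

zones = {"Europe/London", "UTC"}
print(zone_from_path("/usr/share/zoneinfo/Europe/London", zones))  # Europe/London
print(zone_from_path("/usr/share/zoneinfo/UTC", zones))            # UTC
```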

[Bug libstdc++/108214] [13 Regression] writing bitset to stringstream fails

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108214

Jonathan Wakely  changed:

           What    |Removed                       |Added
----------------------------------------------------------------------------
           Assignee|unassigned at gcc dot gnu.org |redi at gcc dot gnu.org
             Status|NEW                           |ASSIGNED

Re: [PATCH] libstdc++, configure: Fix GLIBCXX_ZONEINFO_DIR configuration macro.

2022-12-23 Thread Jonathan Wakely via Gcc-patches

On Fri, 23 Dec 2022, 17:06 Iain Sandoe via Libstdc++, 
wrote:

>  This is a patch for comment on the approach - tested on x86_64-darwin21
>  thoughts?
>  Iain
>
>  --- 8< ---
>
> Testing on Darwin revealed that the GLIBCXX_ZONEINFO_DIR was not doing
> quite
> the right thing (we ended up with ${withval} in the config.h file).
>
> This patch proposes revising the behaviour of the configure flag thus:
>
> --with-libstdcxx-zoneinfo-dir=
>  unspecified : Set _GLIBCXX_ZONEINFO_DIR to a default suitable for $host
>  yes : Set _GLIBCXX_ZONEINFO_DIR to a default suitable for $host
>  no  : Do not set _GLIBCXX_ZONEINFO_DIR
>

What's the use case for "no"? Enforcing a UTC-only tzdb that doesn't even
try to load the tzdata? If that's desirable, we could #ifdef huge parts of
src/c++20/tzdb.cc to make the library smaller. That might make sense for a
toolchain for embedded targets where it's known there's no need for time
zone conversions.



>  /some/path  : set _GLIBCXX_ZONEINFO_DIR = "/some/path"
>
> Signed-off-by: Iain Sandoe 
>
> libstdc++-v3/ChangeLog:
>
> * acinclude.m4 (GLIBCXX_ZONEINFO_DIR): Revise configure flag
> handling.
> * configure: Regenerate.
> * src/c++20/tzdb.cc: Add a comment that an unset
> _GLIBCXX_ZONEINFO_DIR
> implies that the configuration specified that no directory should
> be
> used.
> ---
>  libstdc++-v3/acinclude.m4  | 21 ++---
>  libstdc++-v3/configure | 28 +++-
>  libstdc++-v3/src/c++20/tzdb.cc |  1 +
>  3 files changed, 34 insertions(+), 16 deletions(-)
>
> diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
> index f73946a4918..3653822aed4 100644
> --- a/libstdc++-v3/acinclude.m4
> +++ b/libstdc++-v3/acinclude.m4
> @@ -5153,18 +5153,25 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
>AC_ARG_WITH([libstdcxx-zoneinfo-dir],
>  AC_HELP_STRING([--with-libstdcxx-zoneinfo-dir],
>[the directory to search for tzdata files]),
> -[zoneinfo_dir="${withval}"
> - AC_DEFINE(_GLIBCXX_ZONEINFO_DIR, "${withval}",
> -   [Define if a non-default location should be used for tzdata
> files.])
> -],
> -[
> +[],[with_libstdcxx_zoneinfo_dir=yes])
> +
> +  # Pick a default when no specific path is set.
> +  if test x${with_libstdcxx_zoneinfo_dir} = xyes; then
>  case "$host" in
># *-*-aix*) zoneinfo_dir="/usr/share/lib/zoneinfo" ;;
> +  *-*-darwin2*) zoneinfo_dir="/usr/share/lib/zoneinfo.default" ;;
>*) zoneinfo_dir="/usr/share/zoneinfo" ;;
>  esac
> -])
> -
> +  elif test x${with_libstdcxx_zoneinfo_dir} = xno; then
> +zoneinfo_dir=none
> +  else
> +zoneinfo_dir=${with_libstdcxx_zoneinfo_dir}
> +  fi
>AC_MSG_NOTICE([zoneinfo data directory: ${zoneinfo_dir}])
> +  if test x${zoneinfo_dir} != xnone; then
> +AC_DEFINE_UNQUOTED(_GLIBCXX_ZONEINFO_DIR, "${zoneinfo_dir}",
> +   [Define if a non-default location should be used for tzdata
> files.])
> +  fi
>  ])
>
>  # Macros from the top-level gcc directory.
>
> diff --git a/libstdc++-v3/src/c++20/tzdb.cc
> b/libstdc++-v3/src/c++20/tzdb.cc
> index 5f5c4199f65..c4311d0902a 100644
> --- a/libstdc++-v3/src/c++20/tzdb.cc
> +++ b/libstdc++-v3/src/c++20/tzdb.cc
> @@ -52,6 +52,7 @@
>  # endif
>  #endif
>
> +// This is a bit odd; the configure-time setting was 'no zoneinfo
> directory'
>  #ifndef _GLIBCXX_ZONEINFO_DIR
>  # define _GLIBCXX_ZONEINFO_DIR "/usr/share/zoneinfo"
>  #endif
> --
> 2.37.1 (Apple Git-137.1)
>
>
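
The flag handling proposed in the patch above (unspecified, yes, no, or an explicit path) can be summarized as a small decision function. This is a Python sketch of the logic only, not the m4 code; `resolve_zoneinfo_dir` is a hypothetical name, `None` models "flag not given" on input and "do not define the macro" on output, and the host triplets in the usage lines are examples.

```python
from fnmatch import fnmatchcase

def resolve_zoneinfo_dir(with_value, host):
    """Return the string to define as _GLIBCXX_ZONEINFO_DIR, or None
    when the macro should not be defined at all."""
    if with_value is None:            # flag not given: behaves like "yes"
        with_value = "yes"
    if with_value == "yes":           # pick a default suitable for the host
        if fnmatchcase(host, "*-*-darwin2*"):
            return "/usr/share/lib/zoneinfo.default"
        return "/usr/share/zoneinfo"
    if with_value == "no":            # explicitly disabled
        return None
    return with_value                 # explicit path given by the user

print(resolve_zoneinfo_dir(None, "x86_64-pc-linux-gnu"))
print(resolve_zoneinfo_dir("yes", "x86_64-apple-darwin21"))
print(resolve_zoneinfo_dir("no", "x86_64-pc-linux-gnu"))
print(resolve_zoneinfo_dir("/some/path", "x86_64-pc-linux-gnu"))
```

Note that, as the tzdb.cc hunk shows, the source still falls back to "/usr/share/zoneinfo" when the macro is undefined, which is the oddity the added comment points at.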

gcc-11-20221223 is now available

2022-12-23 Thread GCC Administrator via Gcc

Snapshot gcc-11-20221223 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/11-20221223/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 11 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-11 revision b79c2de39cb16a80d10416d3b4b10cc487844aef

You'll find:

 gcc-11-20221223.tar.xz   Complete GCC

  SHA256=69ccf08b06e1bb32f611996fddf4890dd227dd56a397f7cd02392404ba85bb6a
  SHA1=5ce254a9b86b1dbc0b4ad39c9bbfc2d47236c52d

Diffs from 11-20221216 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-11
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.

[Bug fortran/108131] [10/11/12/13 Regression] Incorrect bound calculation when bound intrinsic used in size expression

2022-12-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108131

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Harald Anlauf :

https://gcc.gnu.org/g:6a95f0e0a06d78d94138d4c3dd64d41591197281

commit r13-4880-g6a95f0e0a06d78d94138d4c3dd64d41591197281
Author: Harald Anlauf 
Date:   Sat Dec 17 22:04:32 2022 +0100

Fortran: incorrect array bounds when bound intrinsic used in decl
[PR108131]

gcc/fortran/ChangeLog:

PR fortran/108131
* array.cc (match_array_element_spec): Avoid too early
simplification
of matched array element specs that can lead to a misinterpretation
when used as array bounds in array declarations.

gcc/testsuite/ChangeLog:

PR fortran/108131
* gfortran.dg/pr103505.f90: Adjust expected patterns.
* gfortran.dg/pr108131.f90: New test.

Re: nvptx: '-mframe-malloc-threshold', '-Wframe-malloc-threshold' (was: Handling of large stack objects in GPU code generation -- maybe transform into heap allocation?)

2022-12-23 Thread Jerry D via Gcc-patches


On 12/23/22 6:08 AM, Thomas Schwinge wrote:

Hi!

On 2022-11-11T15:35:44+0100, Richard Biener via Fortran  
wrote:

On Fri, Nov 11, 2022 at 3:13 PM Thomas Schwinge  wrote:

For example, for Fortran code like:

 write (*,*) "Hello world"

..., 'gfortran' creates:

 struct __st_parameter_dt dt_parm.0;

 try
   {
 dt_parm.0.common.filename = &"source-gcc/libgomp/testsuite/libgomp.oacc-fortran/print-1_.f90"[1]{lb: 1 sz: 1};
 dt_parm.0.common.line = 29;
 dt_parm.0.common.flags = 128;
 dt_parm.0.common.unit = 6;
 _gfortran_st_write (&dt_parm.0);
 _gfortran_transfer_character_write (&dt_parm.0, &"Hello world"[1]{lb: 1 sz: 1}, 11);
 _gfortran_st_write_done (&dt_parm.0);
   }
 finally
   {
 dt_parm.0 = {CLOBBER(eol)};
   }

The issue: the stack object 'dt_parm.0' is a half-KiB in size (yes,
really! -- there's a lot of state in Fortran I/O apparently).  That's a
problem for GPU execution -- here: OpenACC/nvptx -- where typically you
have small stacks.  (For example, GCC/OpenACC/nvptx: 1 KiB per thread;
GCC/OpenMP/nvptx is an exception, because of its use of '-msoft-stack'
"Use custom stacks instead of local memory for automatic storage".)

Now, the Nvidia Driver tries to accommodate such largish stack usage,
and dynamically increases the per-thread stack as necessary (thereby
potentially reducing parallelism) -- if it manages to understand the call
graph.  In case of libgfortran I/O, it evidently doesn't.  Not being able
to disprove existence of recursion is the common problem, as I've read.
At run time, via 'CU_JIT_INFO_LOG_BUFFER' you then get, for example:

 warning : Stack size for entry function 'MAIN__$_omp_fn$0' cannot be 
statically determined

That's still not an actual problem: if the GPU kernel's stack usage still
fits into 1 KiB.  Very often it does, but if, as happens in libgfortran
I/O handling, there is another such 'dt_parm' put onto the stack, the
stack then overflows; device-side SIGSEGV.

(There is, by the way, some similar analysis by Tom de Vries in
 "[nvptx, openacc, openmp, testsuite]
Recursive tests may fail due to thread stack limit".)

Of course, you shouldn't really be doing I/O in GPU kernels, but people
do like their occasional "'printf' debugging", so we ought to make that
work (... without pessimizing any "normal" code).

I assume that generally reducing the size of 'dt_parm' etc. is out of
scope.


There are so many wiggles and turns and corner cases and the like of 
nightmares in I/O I would advise not trying to reduce the dt_parm.  It 
could probably be done.


For debugging GPU code, would it not be better to have a way to signal back 
to a main thread to do a print from there, like some sort of callback 
in the user's code under test?


Put another way, I recommend that users debug with a method other than 
embedded print statements, rather than doing a ton of work to enable 
something that is not really a legitimate use case.


FWIW,

Jerry

[Bug target/107998] [13 Regression] gcc-13-20221204 failure to build on Cygwin No dirname for option: m32

2022-12-23 Thread mckelvey at maskull dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107998

--- Comment #11 from James McKelvey  ---
(In reply to Christophe Lyon from comment #10)
> Can you try to revert my patches:
> f0d3b6e384a68f8b58bc750f240a15cad92600cd
> ccb9c7b129206209cfc315ab1a0432b5f517bdd9
> and remove your patch at comment #5 ?
> You should still see the problem you reported in bug #108011
> 
> 
> However, I don't understand why you had to do what you describe in comment
> #8. When multilibs are disabled, the build shouldn't try to use
> MULTILIB_OPTIONS etc...

Sorry, I don't use git. I just build from the weekly snapshots.
I double-checked by removing the fix, make distclean, and
./configure --enable-languages=c,c++ --enable-threads=posix --disable-multilib
and got the same error.

Re: [PATCH] Fortran: incorrect array bounds when bound intrinsic used in decl [PR108131]

2022-12-23 Thread Jerry D via Gcc-patches


On 12/17/22 1:21 PM, Harald Anlauf via Fortran wrote:

Dear all,

the previous fix for pr103505 introduced a regression that could lead
to wrong array bounds when LBOUND/UBOUND were used in the array spec
of a declaration.  The reason was that we tried to simplify too early
the array element spec, which appears to have interfered with the
subtle semantics of the bound intrinsics.

The solution is to undo the fix for pr103505.  It turns out that
there are other code changes in place that were put in place to
fix related ICEs, and which handle that one, too, and only lead
to a change of the emitted error diagnostics.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?



Yes, OK for mainline.

My thought is that this is the kind of bug that can go unseen with 
incorrect array bounds so is a good candidate to backport.  At least 12, 
10 and 11 if you have time and it is applicable.


As this is a 10/11/12/13 regression, I would like to backport
as seems fit.

Thanks,
Harald

[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102

--- Comment #5 from Andrew Pinski  ---
(In reply to Stefan Schulze Frielinghaus from comment #4) 
> and the current working directory was most likely /devel/gcc/build/gcc.
> Creating a symlink from $build/stage1-gcc to $build/prev-gcc and then
> running the command from above doesn't do the trick. There is probably an
> easier way which I miss. Any hints?

See stage2-start/stage3-start I think.

See
https://gcc.gnu.org/onlinedocs/gccint/Makefile.html#Makefile

[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu

2022-12-23 Thread stefansf at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102

--- Comment #4 from Stefan Schulze Frielinghaus  
---
I was playing around with this and was wondering how can I actually execute the
stageN compiler? From the output of make I see two compilations for object
rust-hir-trait-resolve.o. Thus the first one must be for stage2 and the second
one for stage3. For the former the command line is

/devel/gcc/build/./prev-gcc/xg++ -B/devel/gcc/build/./prev-gcc/
-B/devel/gcc/dst/s390x-ibm-linux-gnu/bin/ -nostdinc++
-B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs
-B/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs 
-I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include/s390x-ibm-linux-gnu
 -I/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/include 
-I/devel/gcc/src/libstdc++-v3/libsupc++
-L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/src/.libs
-L/devel/gcc/build/prev-s390x-ibm-linux-gnu/libstdc++-v3/libsupc++/.libs 
-fno-PIE -c  -DIN_GCC_FRONTEND -g -O2 -fno-checking -gtoggle -DIN_GCC
-fno-exceptions -fno-rtti -fasynchronous-unwind-tables -W -Wall -Wno-narrowing
-Wwrite-strings -Wcast-qual -Wmissing-format-attribute
-Wconditionally-supported -Woverloaded-virtual -pedantic -Wno-long-long
-Wno-variadic-macros -Wno-overlength-strings -Werror -Wno-unused-parameter
-fno-common  -DHAVE_CONFIG_H -I. -Irust -I/devel/gcc/src/gcc
-I/devel/gcc/src/gcc/rust -I/devel/gcc/src/gcc/../include
-I/devel/gcc/src/gcc/../libcpp/include -I/devel/gcc/src/gcc/../libcody 
-I/devel/gcc/src/gcc/../libdecnumber -I/devel/gcc/src/gcc/../libdecnumber/dpd
-I../libdecnumber -I/devel/gcc/src/gcc/../libbacktrace   -o
rust/rust-hir-trait-resolve.o -MT rust/rust-hir-trait-resolve.o -MMD -MP -MF
rust/.deps/rust-hir-trait-resolve.TPo -g -O2 -fno-checking -gtoggle -I
/devel/gcc/src/gcc/rust -I /devel/gcc/src/gcc/rust/lex -I
/devel/gcc/src/gcc/rust/parse -I /devel/gcc/src/gcc/rust/ast -I
/devel/gcc/src/gcc/rust/analysis -I /devel/gcc/src/gcc/rust/backend -I
/devel/gcc/src/gcc/rust/expand -I /devel/gcc/src/gcc/rust/hir/tree -I
/devel/gcc/src/gcc/rust/hir -I /devel/gcc/src/gcc/rust/resolve -I
/devel/gcc/src/gcc/rust/util -I /devel/gcc/src/gcc/rust/typecheck -I
/devel/gcc/src/gcc/rust/checks/lints -I /devel/gcc/src/gcc/rust/checks/errors
-I /devel/gcc/src/gcc/rust/checks/errors/privacy -I
/devel/gcc/src/gcc/rust/util -I /devel/gcc/src/gcc/rust/metadata
/devel/gcc/src/gcc/rust/typecheck/rust-hir-trait-resolve.cc

and the current working directory was most likely /devel/gcc/build/gcc.
Creating a symlink from $build/stage1-gcc to $build/prev-gcc and then running
the command from above doesn't do the trick. There is probably an easier way
which I miss. Any hints?

[Bug fortran/106731] ICE on automatic array of derived type with DTIO

2022-12-23 Thread federico.perini at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106731

--- Comment #11 from federico  ---
Thank you. 

I can confirm the patch works.

I thought that, while fixing the issue, removing the assert was not the best
solution as automatic arrays are not supposed to be static. My bad.

Happy holidays, 

Federico

Re: [PATCH v2] rs6000: Rework option -mpowerpc64 handling [PR106680]

2022-12-23 Thread Segher Boessenkool

Hi!

On Wed, Oct 12, 2022 at 04:12:21PM +0800, Kewen.Lin wrote:
> PR106680 shows that -m32 -mpowerpc64 is different from
> -mpowerpc64 -m32; this is determined by the way we
> handle option powerpc64 in rs6000_handle_option.
> 
> Segher pointed out this difference should be taken as
> a bug and we should ensure that option powerpc64 is
> independent of -m32/-m64.  So this patch removes the
> handling in rs6000_handle_option and adds the necessary
> support in rs6000_option_override_internal instead.

Sorry for the late review.

> +  /* Don't expect powerpc64 enabled on those OSes with OS_MISSING_POWERPC64,
> + since they don't support saving the high part of 64-bit registers on
> + context switch.  If the user explicitly specifies it, we won't interfere
> + with the user's specification.  */

It depends on the OS, and what you call "context switch".  For example
on Linux the context switches done by the kernel are fine, only things
done by setjmp/longjmp and getcontext/setcontext are not.  So just be a
bit more vague here?  "Since they do not save and restore the high half
of the GPRs correctly in all cases", something like that?

Okay for trunk like that.  Thanks!

Segher

[Bug c++/108216] Wrong offset for (already-constructed) virtual base during construction of full object

2022-12-23 Thread thiago at kde dot org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108216

--- Comment #3 from Thiago Macieira  ---
In bug 70644, the pointer to Base was passed to Base's constructor, so the
conversion from the derived type to the virtual base Base happened clearly
before said base was constructed.

In this example here, the conversion happens inside C's constructor body, where
C's direct (but virtual) base A must be fully initialised, notwithstanding the
fact that it was initialised by D's in-charge constructor.

I'm not making a conclusion that this is or isn't UB. I'm saying that it can't
be UB for the explanation offered in that bug.

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Qing Zhao via Gcc-patches

> On Dec 23, 2022, at 2:36 PM, Alexander Monakov  wrote:
> 
> 
> 
> On Fri, 23 Dec 2022, Qing Zhao wrote:
> 
>> Then, sched2 still can move insn across calls? 
>> So does sched2 have the same issue of incorrectly moving  the insn across a 
>> call which has unknown control flow?
> 
> I think problems are unlikely because register allocator assigns pseudos that
> cross setjmp to memory.
> 
> I think you hit the problem with sched1 because most testing is done on x86
> and sched1 is not enabled there, otherwise the problem would have been
> noticed much earlier.

Yes, the problem with this bug is in sched1 on aarch64.  On x86 the same issue
will be exposed when sched1 is explicitly enabled with -fschedule-insns.

BTW, why is sched1 not enabled on x86 by default?

Another question: as discussed in the original bug, PR57067:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57067
The root cause of this issue is that the abnormal control flow edges (from
setjmp/longjmp) cannot be represented correctly at the RTL stage.  Shall we fix
this root cause instead?

Qing

> Alexander

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Segher Boessenkool

On Fri, Dec 23, 2022 at 08:13:48PM +0100, Richard Biener wrote:
> > On 23.12.2022 at 17:55, Segher Boessenkool wrote:
> > There are at least six very different kinds of subreg:
> > 
> > 0) Lvalue subregs.  Most archs have no use for it, and it can be
> >   expressed much more clearly and cleanly always.
> > 1) Subregs of mem.  Do not use, deprecated.  When old reload goes away
> >   this will go away.
> > 2) Subregs of hard registers.  Do not use, there are much better ways to
> >   write subregs of a non-zero byte offset, and for zero offset this is
> >   non-canonical RTL.
> > 3) Bitcast subregs.  In principle they go from one mode to another mode
> >   of the same size (but read on).
> > 4) Paradoxical subregs.  A concept completely separate from the rest,
> >   different rules for everything, it has to be special cased almost
> >   everywhere, it would be better if it was a separate rtx_code imo.
> > 5) Finally, normal subregs, taking a contiguous span of bits from some
> >   value.
> > 
> > Now, it is invalid to have a subreg of a subreg, so a 3) of a 5) is
> > written as just one subreg, as you say.  And a 4) of a 5) is just
> > invalid afaics (and let's not talk about 0)..2) anymore :-) )
> > 
> >> Note whether targets actually support subreg operations needs to be 
> >> queried and I’m not sure how subreg with offset validation should work 
> >> there.
> > 
> > But 3) is always valid, no?  On pseudos.
> 
> Yes, but it will eventually result in a spill/reload which is undesirable 
> when we created this from CSE from a load.  So I think for CSE we do want to 
> know whether a spill will definitely not occur.

Does it cause reloads though?  On any sane backend?  If no movsf pattern
allows integer registers, can things work at all?

Anyway, the normal way to test if some RTL is valid is to just generate
it (using validate_change) and then do apply_change_group, which then
cancels the changes if they do not work.  CSE already does some of this.

(I am doubtful doing any of this in CSE is a good idea fwiw).


Segher

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Alexander Monakov via Gcc-patches

On Fri, 23 Dec 2022, Qing Zhao wrote:

> Then, sched2 still can move insn across calls? 
> So does sched2 have the same issue of incorrectly moving  the insn across a 
> call which has unknown control flow?

I think problems are unlikely because register allocator assigns pseudos that
cross setjmp to memory.

I think you hit the problem with sched1 because most testing is done on x86 and
sched1 is not enabled there, otherwise the problem would have been noticed much
earlier.

Alexander
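
For readers following the thread, here is a minimal sketch (not taken from the
patch or from PR57067) of the kind of source code under discussion: a function
that calls setjmp and then makes a call whose control flow the compiler cannot
see. All names are hypothetical. The local is declared volatile so that its
value after the longjmp is well defined; it is an illustration of the
"abnormal edge" only, not a reproducer for the sched1 bug.

```cpp
#include <csetjmp>

// Hypothetical names, for illustration only.
static std::jmp_buf resume_point;

// This call never returns normally: control reappears at the setjmp call.
static void bail_out() { std::longjmp(resume_point, 1); }

int value_after_longjmp() {
    // volatile keeps the variable in memory, so the store made between
    // setjmp and longjmp is still visible after the jump.
    volatile int stage = 0;
    if (setjmp(resume_point) == 0) {
        stage = 1;   // must stay between the setjmp call and bail_out()
        bail_out();  // abnormal edge back to the setjmp call site
    }
    return stage;
}
```

Calling value_after_longjmp() goes through the setjmp call twice: once
directly, and once when bail_out() jumps back to it.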

[Bug c++/108216] Wrong offset for (already-constructed) virtual base during construction of full object

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108216

--- Comment #2 from Andrew Pinski  ---
Right; reading bug 70644, this code might be undefined.

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Qing Zhao via Gcc-patches

Then, sched2 still can move insn across calls? 
So does sched2 have the same issue of incorrectly moving  the insn across a 
call which has unknown control flow?

Qing

> On Dec 23, 2022, at 12:31 PM, Alexander Monakov  wrote:
> 
> 
> On Fri, 23 Dec 2022, Jose E. Marchesi wrote:
> 
>>> (scheduling across calls in sched2 is somewhat dubious as well, but
>>> it doesn't risk register pressure issues, and on VLIW CPUs it at least
>>> can result in better VLIW packing)
>> 
>> Does sched2 actually schedule across calls?  All the comments in the
>> source code stress the fact that the second scheduler pass (after
>> register allocation) works in regions that correspond to basic blocks:
>> "(after reload, each region is of one block)".
> 
> A call instruction does not end a basic block.
> 
> (also, with -fsched2-use-superblocks sched2 works on regions like sched1)
> 
> Alexander

[Bug c++/108216] Wrong offset for (already-constructed) virtual base during construction of full object

2022-12-23 Thread arthur.j.odwyer at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108216

--- Comment #1 from Arthur O'Dwyer  ---
Possibly tangentially related: #70644, #81051

[Bug c++/108216] New: Wrong offset for (already-constructed) virtual base during construction of full object

2022-12-23 Thread arthur.j.odwyer at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108216

            Bug ID: 108216
           Summary: Wrong offset for (already-constructed) virtual base
                    during construction of full object
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: arthur.j.odwyer at gmail dot com
  Target Milestone: ---

// https://godbolt.org/z/6qMTY6bGn
#include <stdio.h>

struct A *ga = nullptr;
struct B *gb = nullptr;
struct C *gc = nullptr;
struct D *gd = nullptr;

struct A {
    explicit A() {
        printf("Constructing A at %p\n", (void*)this);
        ga = this;
        printf(" A is %p\n", (void*)ga);
    }
    virtual void f() {}
    void *a() { return this; }
};

struct B : virtual A {
    explicit B() {
        printf("Constructing B at %p\n", (void*)this);
        gb = this;
        printf(" B.A is %p\n", (void*)(A*)gb);
    }
    void *b() { return this; }
};

struct C : virtual A {
    explicit C() {
        printf("Constructing C at %p\n", (void*)this);
        gc = this;
        printf(" B.A is %p -- look here!\n", (void*)(A*)gb);
        printf(" C.A is %p\n", (void*)(A*)gc);
    }
    // void f() override {}  // give Clang trouble, too
    void *c() { return this; }
};

struct D : B, C {
    explicit D(): B(), C() {
        printf("Constructing D at %p\n", (void*)this);
        gd = this;
        printf(" D.B.A is %p\n", (void*)(A*)(B*)gd);
        printf(" D.C.A is %p\n", (void*)(A*)(C*)gd);
    }
};

int main() {
    D d;
    printf(" is %p\n", (void*)&d);
    printf(" is %p\n", d.c());
    printf(" is %p\n", d.b());
    printf(" is %p\n", d.a());
}

==

Constructing A at 0x7ffd2ef4db10
 A is 0x7ffd2ef4db10
Constructing B at 0x7ffd2ef4db10
 B.A is 0x7ffd2ef4db10
Constructing C at 0x7ffd2ef4db18
 B.A is 0x7ffd2f34ed1c -- look here!
 C.A is 0x7ffd2ef4db10
Constructing D at 0x7ffd2ef4db10
 D.B.A is 0x7ffd2ef4db10
 D.C.A is 0x7ffd2ef4db10
 is 0x7ffd2ef4db10
 is 0x7ffd2ef4db18
 is 0x7ffd2ef4db10
 is 0x7ffd2ef4db10

==

Before the line marked "look here":
- The `A` object was constructed at 0x7ffd2ef4db10.
- The `B` object pointed to by `gb` has been completely constructed.
- So `gb->a()` ought to return the address of that `A` object, 0x7ffd2ef4db10.
But instead it returns 0x7ffd2f34ed1c, which is 0x40120c bytes away from the
correct value!

I wonder if this is caused by the B-in-D and C-in-D vptrs having the same
offset, so that when we think we're accessing the B vtable of `*gb`, we're
actually accessing the C vtable of that empty C object...? But then, still, the
offset from the beginning of the B object or the beginning of the C object, to
the A virtual base, ought to be exactly the same number. I can't figure out a
reason for the answer to be off by 0x40120c.

==

Notice that Clang passes this test case as shown; BUT, if you uncomment the
line marked "give Clang trouble, too", then Clang will join GCC in producing
wrong results for the line marked "look here".  MSVC passes both test cases,
but that's not surprising because MSVC has a radically different ABI for struct
layout.

Originally reported by @caster on Slack, here:
https://cpplang.slack.com/archives/CBTFTLR9R/p1671750342552189

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Richard Biener via Gcc-patches




> On 23.12.2022 at 17:55, Segher Boessenkool wrote:
> 
> On Fri, Dec 23, 2022 at 05:20:09PM +0100, Richard Biener wrote:
>> On 23.12.2022 at 15:48, Segher Boessenkool wrote:
>>> None of this belongs in generic code at all imo.  At expand time it
>>> should be expanded to something that works and can be optimised well,
>>> so not anything with :BLK (which has to be put in memory, something with
>>> unbounded size cannot be put in registers), not anything specifically
>>> tailored to any cpu, something nice and regular.  Using a subreg (of a
>>> pseudo!) is the standard way of writing a bitcast.
>>> 
>>> So generic code would do a  (subreg:SF (reg:SI) 0)  to express a 32-bit
>>> integer bitcast to an IEEE SP number, and our machine description should
>>> make it work nicely.
>> 
>> There’s also a byte offset in subreg, so (subreg:sf (reg:di) 4) is a 
>> Highpart bitcast.
> 
> There are at least six very different kinds of subreg:
> 
> 0) Lvalue subregs.  Most archs have no use for it, and it can be
>   expressed much more clearly and cleanly always.
> 1) Subregs of mem.  Do not use, deprecated.  When old reload goes away
>   this will go away.
> 2) Subregs of hard registers.  Do not use, there are much better ways to
>   write subregs of a non-zero byte offset, and for zero offset this is
>   non-canonical RTL.
> 3) Bitcast subregs.  In principle they go from one mode to another mode
>   of the same size (but read on).
> 4) Paradoxical subregs.  A concept completely separate from the rest,
>   different rules for everything, it has to be special cased almost
>   everywhere, it would be better if it was a separate rtx_code imo.
> 5) Finally, normal subregs, taking a contiguous span of bits from some
>   value.
> 
> Now, it is invalid to have a subreg of a subreg, so a 3) of a 5) is
> written as just one subreg, as you say.  And a 4) of a 5) is just
> invalid afaics (and let's not talk about 0)..2) anymore :-) )
> 
>> Note whether targets actually support subreg operations needs to be queried 
>> and I’m not sure how subreg with offset validation should work there.
> 
> But 3) is always valid, no?  On pseudos.

Yes, but it will eventually result in a spill/reload which is undesirable when 
we created this from CSE from a load.  So I think for CSE we do want to know 
whether a spill will definitely not occur.

Richard 
> 
> Segher

[Bug tree-optimization/108215] New: Does not optimize trivial case with bit operations

2022-12-23 Thread socketpair at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108215

            Bug ID: 108215
           Summary: Does not optimize trivial case with bit operations
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: tree-optimization
          Assignee: unassigned at gcc dot gnu.org
          Reporter: socketpair at gmail dot com
  Target Milestone: ---

https://godbolt.org/z/5e3eKqPqs

```C
#include <stdint.h>

int firewall3(const uint8_t *restrict data) {
    const uint32_t src = *((const uint32_t *)data);
    if ((src & 0xFFFF0000) == 0x11220000) return 1;
    if ((src & 0xFFFFFF00) == 0x11223300) return 1;
    return 0;
}

int firewall4(const uint8_t *restrict data) {
    const uint32_t src = *((const uint32_t *)data);
    if ((src & 0xFFFFFF00) == 0x11223300) return 1;
    if ((src & 0xFFFF0000) == 0x11220000) return 1;
    return 0;
}
}
```

```
firewall3:
        movl    (%rdi), %eax
        xorw    %ax, %ax
        cmpl    $287440896, %eax
        sete    %al
        movzbl  %al, %eax
        ret
firewall4:
        movl    (%rdi), %eax
        movl    $1, %edx
        movl    %eax, %ecx
        xorb    %cl, %cl
        cmpl    $287453952, %ecx
        je      .L3
        xorw    %ax, %ax
        xorl    %edx, %edx
        cmpl    $287440896, %eax
        sete    %dl
.L3:
        movl    %edx, %eax
        ret
```

firewall3(): Excellent!
firewall4(): FAIL!

It's obvious that the order of the comparisons in this example does not matter,
so I think the missed optimization in firewall4() is a bug.
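
The claim that the order does not matter can be checked directly. The sketch
below is an illustration only; it assumes the masks are 0xFFFFFF00 and
0xFFFF0000, which is what the xorb/xorw instructions and the constants
287453952 (0x11223300) and 287440896 (0x11220000) in the listing suggest. The
function names are hypothetical. Any value that passes the 24-bit test also
passes the 16-bit test, so the pair of tests collapses to the 16-bit test
alone, which is the code GCC emits for firewall3().

```cpp
#include <cstdint>

// The two tests from the report, in firewall4() order.
bool matches_either(std::uint32_t src) {
    if ((src & 0xFFFFFF00u) == 0x11223300u) return true;
    if ((src & 0xFFFF0000u) == 0x11220000u) return true;
    return false;
}

// The reduced form: only the upper 16 bits decide the result.
bool matches_prefix16(std::uint32_t src) {
    return (src & 0xFFFF0000u) == 0x11220000u;
}

// Compares both forms for every value sharing the given upper 16 bits.
bool agree_for_upper_half(std::uint32_t upper16) {
    for (std::uint32_t low = 0; low <= 0xFFFFu; ++low) {
        const std::uint32_t src = (upper16 << 16) | low;
        if (matches_either(src) != matches_prefix16(src)) return false;
    }
    return true;
}
```

agree_for_upper_half(0x1122) covers every input that either test can accept;
any other upper half is rejected by both forms.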

[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-rust

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
     Ever confirmed|1                           |0
             Status|WAITING                     |UNCONFIRMED

--- Comment #3 from Andrew Pinski  ---
Moved to middle-end since the code that is causing issues is c++ code.

Can you attach the preprocessed source? I wonder if this is a -g0 vs -g issue
...


[Bug middle-end/108102] rust bootstrap comparison failure on s390x-linux-gnu

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-rust

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |build
          Component|rust                        |middle-end


[Bug rust/108102] rust bootstrap comparison failure on s390x-linux-gnu

2022-12-23 Thread stefansf at linux dot ibm.com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108102

Stefan Schulze Frielinghaus changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |stefansf at linux dot ibm.com

--- Comment #2 from Stefan Schulze Frielinghaus ---
Can confirm. Happens with --with-arch=arch13 and started when rust was added to
the languages via commit r13-4676-ga75f038c069cc3.

$ diff <(objdump -d stage2-gcc/rust/rust-hir-trait-resolve.o) \
   <(objdump -d stage3-gcc/rust/rust-hir-trait-resolve.o)
2c2
< stage2-gcc/rust/rust-hir-trait-resolve.o: file format elf64-s390
---
> stage3-gcc/rust/rust-hir-trait-resolve.o: file format elf64-s390
1939,1940c1939,1940
< 24ec: e3 20 f2 50 00 24   stg %r2,592(%r15)
< 24f2: e3 30 f1 28 00 04   lg  %r3,296(%r15)
---
> 24ec: e3 30 f1 28 00 04   lg  %r3,296(%r15)
> 24f2: e3 20 f2 50 00 24   stg %r2,592(%r15)

[Bug target/107998] [13 Regression] gcc-13-20221204 failure to build on Cygwin No dirname for option: m32

2022-12-23 Thread clyon at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107998

--- Comment #10 from Christophe Lyon  ---
Can you try to revert my patches:
f0d3b6e384a68f8b58bc750f240a15cad92600cd
ccb9c7b129206209cfc315ab1a0432b5f517bdd9
and remove your patch at comment #5 ?
You should still see the problem you reported in bug #108011


However, I don't understand why you had to do what you describe in comment #8.
When multilibs are disabled, the build shouldn't try to use MULTILIB_OPTIONS
etc...

[Bug middle-end/108209] goof in genmatch.cc:commutative_op

2022-12-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108209

--- Comment #1 from Alexander Monakov  ---
Keeping notes as I go...

Duplicated checks for 'op0' in lower_for are duplicated.

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Jose E. Marchesi via Gcc-patches



> On Fri, 23 Dec 2022, Jose E. Marchesi wrote:
>
>> > (scheduling across calls in sched2 is somewhat dubious as well, but
>> > it doesn't risk register pressure issues, and on VLIW CPUs it at least
>> > can result in better VLIW packing)
>> 
>> Does sched2 actually schedule across calls?  All the comments in the
>> source code stress the fact that the second scheduler pass (after
>> register allocation) works in regions that correspond to basic blocks:
>> "(after reload, each region is of one block)".
>
> A call instruction does not end a basic block.

Ok, so my original assumption in the patch explaining why I disabled
sched1 but not sched2 was not correct.  Good to know.

> (also, with -fsched2-use-superblocks sched2 works on regions like sched1)
>
> Alexander

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Alexander Monakov via Gcc-patches



On Fri, 23 Dec 2022, Jose E. Marchesi wrote:

> > (scheduling across calls in sched2 is somewhat dubious as well, but
> > it doesn't risk register pressure issues, and on VLIW CPUs it at least
> > can result in better VLIW packing)
> 
> Does sched2 actually schedule across calls?  All the comments in the
> source code stress the fact that the second scheduler pass (after
> register allocation) works in regions that correspond to basic blocks:
> "(after reload, each region is of one block)".

A call instruction does not end a basic block.

(also, with -fsched2-use-superblocks sched2 works on regions like sched1)

Alexander

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Jose E. Marchesi via Gcc-patches



> On Fri, 23 Dec 2022, Qing Zhao wrote:
>> >> I am a little confused, you mean pre-RA scheduler does not look at the
>> >> data flow information at all when scheduling insns across calls currently?
>> > 
>> > I think it does not inspect liveness info, and may extend lifetime of a
>> > pseudo across a call, transforming
>> > 
>> >  call foo
>> >  reg = 1
>> >  ...
>> >  use reg
>> > 
>> > to
>> > 
>> >  reg = 1
>> >  call foo
>> >  ...
>> >  use reg
>> > 
>> > but this is undesirable, because now register allocation cannot select a
>> > call-clobbered register for 'reg'.
>> Okay, thanks for the explanation.
>> 
>> Then, why not just check the liveness info instead of inhibiting all 
>> scheduling across calls?
>
> Because there's almost nothing to gain from pre-RA scheduling across calls in
> the first place. Remember that the call transfers control flow elsewhere and
> therefore the scheduler has no idea about the pipeline state after the call
> and after the return, so modeling-wise it's a gamble.
>
> For instructions that lie on a critical path such scheduling can be useful
> when it substantially reduces the difference between the priority of the
> call and nearby instructions of the critical path. But we don't track which
> instructions are on critical path(s) and which are not.
>
> (scheduling across calls in sched2 is somewhat dubious as well, but
> it doesn't risk register pressure issues, and on VLIW CPUs it at least
> can result in better VLIW packing)

Does sched2 actually schedule across calls?  All the comments in the
source code stress the fact that the second scheduler pass (after
register allocation) works in regions that correspond to basic blocks:
"(after reload, each region is of one block)".

[Bug libstdc++/108214] [13 Regression] writinng bitset to stringstream fails

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108214

Andrew Pinski changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2022-12-23
     Ever confirmed|0                           |1
   Target Milestone|---                         |13.0
            Summary|writinng bitset to          |[13 Regression] writinng
                   |stringstream fails          |bitset to stringstream
                   |                            |fails
             Status|UNCONFIRMED                 |NEW

--- Comment #1 from Andrew Pinski  ---
I suspect r13-2998-g1c12a3cfdfabf6 is causing this.

Confirmed.

[Bug target/106877] [12 Regression] ICE in move_for_stack_reg, at reg-stack.cc:1076 since r12-248-gb58dc0b803057c0e

2022-12-23 Thread roger at nextmovesoftware dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106877

Roger Sayle changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Target Milestone|12.3                        |13.0
         Resolution|---                         |FIXED
             Status|ASSIGNED                    |RESOLVED

--- Comment #6 from Roger Sayle  ---
Please let me know if you'd like this backported to the gcc-12 release branch,
but I'm assuming that a low priority ICE-on-invalid means we can now consider
this closed.

Re: [x86 PATCH] Use movss/movsd to implement V4SI/V2DI VEC_PERM.

2022-12-23 Thread Uros Bizjak via Gcc-patches

On Fri, Dec 23, 2022 at 5:46 PM Roger Sayle  wrote:
>
>
> This patch tweaks the x86 backend to use the movss and movsd instructions
> to perform some vector permutations on integer vectors (V4SI and V2DI) in
> the same way they are used for floating point vectors (V4SF and V2DF).
>
> As a motivating example, consider:
>
> typedef unsigned int v4si __attribute__((vector_size(16)));
> typedef float v4sf __attribute__((vector_size(16)));
> v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
> v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }
>
> which is currently compiled with -O2 to:
>
> foo:    movdqa  %xmm0, %xmm2
>         shufps  $80, %xmm0, %xmm1
>         movdqa  %xmm1, %xmm0
>         shufps  $232, %xmm2, %xmm0
>         ret
>
> bar:    movss   %xmm1, %xmm0
>         ret
>
> with this patch both functions compile to the same form.
> Likewise for the V2DI case:
>
> typedef unsigned long v2di __attribute__((vector_size(16)));
> typedef double v2df __attribute__((vector_size(16)));
>
> v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
> v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }
>
> which currently generates:
>
> foo:    shufpd  $2, %xmm0, %xmm1
>         movdqa  %xmm1, %xmm0
>         ret
>
> bar:    movsd   %xmm1, %xmm0
>         ret
>
> There are two possible approaches to adding integer vector forms of the
> sse_movss and sse2_movsd instructions.  One is to use a mode iterator
> (VI4F_128 or VI8F_128) on the existing define_insn patterns, but this
> requires renaming the patterns to sse_movss_<mode> which then requires
> changes to i386-builtins.def and throughout the backend to reflect the
> new naming of gen_sse_movss_v4sf.  The alternate approach (taken here)
> is to simply clone and specialize the existing patterns.  Uros, if you'd
> prefer the first approach, I'm happy to make/test/commit those changes.

I would really prefer the variant with VI4F_128/VI8F_128; these two
iterators were introduced specifically for this case (see e.g.
sse_shufps_<mode> and sse2_shufpd_<mode>).  The internal name of the
pattern is fairly irrelevant, and a trivial search and replace
operation can replace the grand total of 6 occurrences.

Also, changing sse2_movsd to use the VI8F_128 mode iterator would enable
more alternatives besides movsd, giving the combine pass some more
opportunities with memory operands.

So, the patch with those two iterators is pre-approved.

Uros.

> This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
> and make -k check, both with and without --target_board=unix{-m32},
> with no new failures.  Ok for mainline?
>
> 2022-12-23  Roger Sayle  
>
> gcc/ChangeLog
> * config/i386/i386-expand.cc (expand_vec_perm_movs): Also allow
> V4SImode with TARGET_SSE and V2DImode with TARGET_SSE2.
> * config/i386/sse.md (sse_movss_v4si): New define_insn, a V4SI
> specialization of sse_movss.
> (sse2_movsd_v2di): Likewise, a V2DI specialization of sse2_movsd.
>
> gcc/testsuite/ChangeLog
> * gcc.target/i386/sse-movss-4.c: New test case.
> * gcc.target/i386/sse2-movsd-3.c: New test case.
>
>
> Thanks in advance,
> Roger
> --
>
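
As a reading aid for the permutation discussed in this thread, here is a
scalar model of what the foo/bar examples compute: lane 0 is taken from y and
lanes 1..3 from x, which is the merge a register-to-register movss performs.
This is only a sketch with a hypothetical helper name; it uses std::array
rather than GCC vector types and says nothing about which instructions the
compiler selects.

```cpp
#include <array>

// Scalar model (hypothetical helper): lane 0 from y, lanes 1..3 from x,
// i.e. the value of (v4si){y[0], x[1], x[2], x[3]} in the quoted example.
template <typename T>
std::array<T, 4> merge_low_lane(const std::array<T, 4>& x,
                                const std::array<T, 4>& y) {
    return {{y[0], x[1], x[2], x[3]}};
}
```

The same function body serves both the integer and the floating point element
types, which mirrors the point of the patch: the permutation is identical for
V4SI and V4SF.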

[Bug c++/108214] New: writinng bitset to stringstream fails

2022-12-23 Thread rhalbersma at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108214

            Bug ID: 108214
           Summary: writinng bitset to stringstream fails
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rhalbersma at gmail dot com
  Target Milestone: ---

#include <bitset>
#include <sstream>

int main() {
    using T = std::bitset<1>;
    T a(1);
    T b;
    std::stringstream sstr;
    sstr << a;
    sstr >> b;
}

The above program works correctly for g++ until version 12, but for version 13
(trunk) it errors out with: "terminate called after throwing an instance of
'std::invalid_argument' what():  bitset::_M_copy_from_ptr"

Godbolt link: https://godbolt.org/z/nnKT6cddb
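
A small variation on the reproducer, offered as a sketch only, makes the
expected behaviour explicit: writing a bitset to a stream and reading it back
should yield the same value. The helper name is hypothetical. On a library
with the regression described above, the extraction would throw
std::invalid_argument instead of returning.

```cpp
#include <bitset>
#include <cstddef>
#include <sstream>

// Hypothetical helper: true when the value survives a write/read round trip.
template <std::size_t N>
bool survives_round_trip(const std::bitset<N>& value) {
    std::stringstream stream;
    stream << value;            // writes N characters, each '0' or '1'
    std::bitset<N> parsed;
    stream >> parsed;           // reads at most N characters back
    return !stream.fail() && parsed == value;
}
```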

[PATCH] libstdc++, configure: Fix GLIBCXX_ZONEINFO_DIR configuration macro.

2022-12-23 Thread Iain Sandoe via Gcc-patches

 This is a patch for comment on the approach - tested on x86_64-darwin21
 thoughts?
 Iain
 
 --- 8< ---

Testing on Darwin revealed that the GLIBCXX_ZONEINFO_DIR was not doing quite
the right thing (we ended up with ${withval} in the config.h file).

This patch proposes revising the behaviour of the configure flag thus:

--with-libstdcxx-zoneinfo-dir=
 unspecified : Set _GLIBCXX_ZONEINFO_DIR to a default suitable for $host
 yes : Set _GLIBCXX_ZONEINFO_DIR to a default suitable for $host
 no  : Do not set _GLIBCXX_ZONEINFO_DIR
 /some/path  : set _GLIBCXX_ZONEINFO_DIR = "/some/path"

Signed-off-by: Iain Sandoe 

libstdc++-v3/ChangeLog:

* acinclude.m4 (GLIBCXX_ZONEINFO_DIR): Revise configure flag
handling.
* configure: Regenerate.
* src/c++20/tzdb.cc: Add a comment that an unset _GLIBCXX_ZONEINFO_DIR
implies that the configuration specified that no directory should be
used.
---
 libstdc++-v3/acinclude.m4  | 21 ++---
 libstdc++-v3/configure | 28 +++-
 libstdc++-v3/src/c++20/tzdb.cc |  1 +
 3 files changed, 34 insertions(+), 16 deletions(-)

diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index f73946a4918..3653822aed4 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -5153,18 +5153,25 @@ AC_DEFUN([GLIBCXX_ZONEINFO_DIR], [
   AC_ARG_WITH([libstdcxx-zoneinfo-dir],
 AC_HELP_STRING([--with-libstdcxx-zoneinfo-dir],
   [the directory to search for tzdata files]),
-[zoneinfo_dir="${withval}"
- AC_DEFINE(_GLIBCXX_ZONEINFO_DIR, "${withval}",
-   [Define if a non-default location should be used for tzdata files.])
-],
-[
+[],[with_libstdcxx_zoneinfo_dir=yes])
+
+  # Pick a default when no specific path is set.
+  if test x${with_libstdcxx_zoneinfo_dir} = xyes; then
 case "$host" in
   # *-*-aix*) zoneinfo_dir="/usr/share/lib/zoneinfo" ;;
+  *-*-darwin2*) zoneinfo_dir="/usr/share/lib/zoneinfo.default" ;;
   *) zoneinfo_dir="/usr/share/zoneinfo" ;;
 esac
-])
-
+  elif test x${with_libstdcxx_zoneinfo_dir} = xno; then
+zoneinfo_dir=none
+  else
+zoneinfo_dir=${with_libstdcxx_zoneinfo_dir}
+  fi
   AC_MSG_NOTICE([zoneinfo data directory: ${zoneinfo_dir}])
+  if test x${zoneinfo_dir} != xnone; then
+AC_DEFINE_UNQUOTED(_GLIBCXX_ZONEINFO_DIR, "${zoneinfo_dir}",
+   [Define if a non-default location should be used for tzdata files.])
+  fi
 ])
 
 # Macros from the top-level gcc directory.

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index 5f5c4199f65..c4311d0902a 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -52,6 +52,7 @@
 # endif
 #endif
 
+// This is a bit odd; the configure-time setting was 'no zoneinfo directory'
 #ifndef _GLIBCXX_ZONEINFO_DIR
 # define _GLIBCXX_ZONEINFO_DIR "/usr/share/zoneinfo"
 #endif
-- 
2.37.1 (Apple Git-137.1)
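
The decision table in the patch description can be restated as a small
function. This is a sketch for readers, not part of the patch: the function
name is hypothetical, an empty first argument stands for "flag not given", an
empty result stands for "_GLIBCXX_ZONEINFO_DIR is not defined", and the
substring test is a simplified stand-in for the shell pattern *-*-darwin2*.

```cpp
#include <string>

// Hypothetical helper mirroring the --with-libstdcxx-zoneinfo-dir table.
// Returns the directory _GLIBCXX_ZONEINFO_DIR would be set to, or an empty
// string when the macro would be left undefined.
std::string resolve_zoneinfo_dir(const std::string& with_value,
                                 const std::string& host) {
    if (with_value == "no")
        return "";                       // no : do not set the macro
    if (with_value.empty() || with_value == "yes") {
        // unspecified / yes : pick a default suitable for the host
        if (host.find("-darwin2") != std::string::npos)
            return "/usr/share/lib/zoneinfo.default";
        return "/usr/share/zoneinfo";
    }
    return with_value;                   // /some/path : use it verbatim
}
```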

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Segher Boessenkool

On Fri, Dec 23, 2022 at 05:20:09PM +0100, Richard Biener wrote:
> > On 23.12.2022 at 15:48, Segher Boessenkool wrote:
> > None of this belongs in generic code at all imo.  At expand time it
> > should be expanded to something that works and can be optimised well,
> > so not anything with :BLK (which has to be put in memory, something with
> > unbounded size cannot be put in registers), not anything specifically
> > tailored to any cpu, something nice and regular.  Using a subreg (of a
> > pseudo!) is the standard way of writing a bitcast.
> > 
> > So generic code would do a  (subreg:SF (reg:SI) 0)  to express a 32-bit
> > integer bitcast to an IEEE SP number, and our machine description should
> > make it work nicely.
> 
> There’s also a byte offset in subreg, so (subreg:sf (reg:di) 4) is a Highpart 
> bitcast.

There are at least six very different kinds of subreg:

0) Lvalue subregs.  Most archs have no use for it, and it can be
   expressed much more clearly and cleanly always.
1) Subregs of mem.  Do not use, deprecated.  When old reload goes away
   this will go away.
2) Subregs of hard registers.  Do not use, there are much better ways to
   write subregs of a non-zero byte offset, and for zero offset this is
   non-canonical RTL.
3) Bitcast subregs.  In principle they go from one mode to another mode
   of the same size (but read on).
4) Paradoxical subregs.  A concept completely separate from the rest,
   different rules for everything, it has to be special cased almost
   everywhere, it would be better if it was a separate rtx_code imo.
5) Finally, normal subregs, taking a contiguous span of bits from some
   value.

Now, it is invalid to have a subreg of a subreg, so a 3) of a 5) is
written as just one subreg, as you say.  And a 4) of a 5) is just
invalid afaics (and let's not talk about 0)..2) anymore :-) )

> Note whether targets actually support subreg operations needs to be queried 
> and I’m not sure how subreg with offset validation should work there.

But 3) is always valid, no?  On pseudos.

Segher
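
As a C-level analogy only (RTL subregs are not C, and the helper names below are made up for illustration), a same-size bitcast such as (subreg:SF (reg:SI) 0) corresponds to reinterpreting the 32 bits of an integer as a float. The "highpart" case corresponds to selecting the upper 32 bits of a 64-bit value first. The sketch assumes IEEE 754 single precision for `float` and uses a shift purely to show the semantics; it says nothing about which instructions a target should emit, and it sidesteps the endianness-dependent byte offset a real subreg would carry.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: reinterpret the bits of a 32-bit integer as a float,
   the C-level analogue of (subreg:SF (reg:SI) 0).  */
static float
bitcast_u32_to_float (uint32_t bits)
{
  float f;
  memcpy (&f, &bits, sizeof f);   /* well-defined bit reinterpretation */
  return f;
}

/* Hypothetical helper: take the upper 32 bits of a 64-bit value and
   reinterpret them as a float (the "highpart bitcast" from the thread).  */
static float
highpart_u64_to_float (uint64_t v)
{
  return bitcast_u32_to_float ((uint32_t) (v >> 32));
}
```

With IEEE 754 floats, the bit pattern 0x3f800000 reads back as 1.0f, and the high half of 0x4000000000000000 reads back as 2.0f.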

[x86 PATCH] Use movss/movsd to implement V4SI/V2DI VEC_PERM.

2022-12-23 Thread Roger Sayle


This patch tweaks the x86 backend to use the movss and movsd instructions
to perform some vector permutations on integer vectors (V4SI and V2DI) in
the same way they are used for floating point vectors (V4SF and V2DF).

As a motivating example, consider:

typedef unsigned int v4si __attribute__((vector_size(16)));
typedef float v4sf __attribute__((vector_size(16)));
v4si foo(v4si x,v4si y) { return (v4si){y[0],x[1],x[2],x[3]}; }
v4sf bar(v4sf x,v4sf y) { return (v4sf){y[0],x[1],x[2],x[3]}; }

which is currently compiled with -O2 to:

foo:    movdqa  %xmm0, %xmm2
        shufps  $80, %xmm0, %xmm1
        movdqa  %xmm1, %xmm0
        shufps  $232, %xmm2, %xmm0
        ret

bar:    movss   %xmm1, %xmm0
        ret

with this patch both functions compile to the same form.
Likewise for the V2DI case:

typedef unsigned long v2di __attribute__((vector_size(16)));
typedef double v2df __attribute__((vector_size(16)));

v2di foo(v2di x,v2di y) { return (v2di){y[0],x[1]}; }
v2df bar(v2df x,v2df y) { return (v2df){y[0],x[1]}; }

which currently generates:

foo:    shufpd  $2, %xmm0, %xmm1
        movdqa  %xmm1, %xmm0
        ret

bar:    movsd   %xmm1, %xmm0
        ret

There are two possible approaches to adding integer vector forms of the
sse_movss and sse2_movsd instructions.  One is to use a mode iterator
(VI4F_128 or VI8F_128) on the existing define_insn patterns, but this
requires renaming the patterns to sse_movss_<mode> which then requires
changes to i386-builtins.def and throughout the backend to reflect the
new naming of gen_sse_movss_v4sf.  The alternate approach (taken here)
is to simply clone and specialize the existing patterns.  Uros, if you'd
prefer the first approach, I'm happy to make/test/commit those changes.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32},
with no new failures.  Ok for mainline?
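
For readers unfamiliar with the RTL involved, here is a scalar sketch of what a vec_merge with (const_int 1) computes: lane 0 comes from the first vector operand and every other lane from the second. The function names are hypothetical and plain arrays stand in for vector registers; this models the semantics only, not the code the backend generates.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of RTL vec_merge: lane i is taken from A when
   bit i of MASK is set, otherwise from B.  */
static void
vec_merge_u32 (uint32_t *out, const uint32_t *a, const uint32_t *b,
               unsigned mask, size_t nelts)
{
  for (size_t i = 0; i < nelts; i++)
    out[i] = ((mask >> i) & 1u) ? a[i] : b[i];
}

/* Model of foo() from the message: the result is {y[0], x[1], x[2], x[3]},
   i.e. a merge of Y into X with mask 1.  */
static void
movss_model (uint32_t out[4], const uint32_t x[4], const uint32_t y[4])
{
  vec_merge_u32 (out, y, x, 1u, 4);
}
```

Only the lowest lane changes, which is why a single movss (or movsd for two 64-bit lanes) is enough.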

2022-12-23  Roger Sayle  

gcc/ChangeLog
* config/i386/i386-expand.cc (expand_vec_perm_movs): Also allow
V4SImode with TARGET_SSE and V2DImode with TARGET_SSE2.
* config/i386/sse.md (sse_movss_v4si): New define_insn, a V4SI
specialization of sse_movss.
(sse2_movsd_v2di): Likewise, a V2DI specialization of sse2_movsd.

gcc/testsuite/ChangeLog
* gcc.target/i386/sse-movss-4.c: New test case.
* gcc.target/i386/sse2-movsd-3.c: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc
index a45640f..ad7745a 100644
--- a/gcc/config/i386/i386-expand.cc
+++ b/gcc/config/i386/i386-expand.cc
@@ -18903,8 +18903,10 @@ expand_vec_perm_movs (struct expand_vec_perm_d *d)
 return false;
 
   if (!(TARGET_SSE && vmode == V4SFmode)
+  && !(TARGET_SSE && vmode == V4SImode)
   && !(TARGET_MMX_WITH_SSE && vmode == V2SFmode)
-  && !(TARGET_SSE2 && vmode == V2DFmode))
+  && !(TARGET_SSE2 && vmode == V2DFmode)
+  && !(TARGET_SSE2 && vmode == V2DImode))
 return false;
 
   /* Only the first element is changed.  */
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index de632b2..f5860f2c 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -10513,6 +10513,21 @@
(set_attr "prefix" "orig,maybe_evex")
(set_attr "mode" "SF")])
 
+(define_insn "sse_movss_v4si"
+  [(set (match_operand:V4SI 0 "register_operand"   "=x,v")
+   (vec_merge:V4SI
+ (match_operand:V4SI 2 "register_operand" " x,v")
+ (match_operand:V4SI 1 "register_operand" " 0,v")
+ (const_int 1)))]
+  "TARGET_SSE"
+  "@
+   movss\t{%2, %0|%0, %2}
+   vmovss\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "orig,maybe_evex")
+   (set_attr "mode" "SF")])
+
 (define_insn "avx2_vec_dup"
   [(set (match_operand:VF1_128_256 0 "register_operand" "=v")
(vec_duplicate:VF1_128_256
@@ -13523,6 +13538,21 @@
   (const_string "orig")))
(set_attr "mode" "DF,DF,V1DF,V1DF,V1DF,V2DF,V1DF,V1DF,V1DF")])
 
+(define_insn "sse2_movsd_v2di"
+  [(set (match_operand:V2DI 0 "register_operand"   "=x,v")
+   (vec_merge:V2DI
+ (match_operand:V2DI 2 "register_operand" " x,v")
+ (match_operand:V2DI 1 "register_operand" " 0,v")
+ (const_int 1)))]
+  "TARGET_SSE2"
+  "@
+   movsd\t{%2, %0|%0, %2}
+   vmovsd\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix" "orig,maybe_evex")
+   (set_attr "mode" "DF")])
+
 (define_insn "vec_dupv2df"
   [(set (match_operand:V2DF 0 "register_operand" "=x,x,v")
(vec_duplicate:V2DF
diff --git a/gcc/testsuite/gcc.target/i386/sse-movss-4.c b/gcc/testsuite/gcc.target/i386/sse-movss-4.c
new file mode 100644
index 000..ec3019c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse-movss-4.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse" } */
+
+typedef unsigned int v4si

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Richard Biener via Gcc-patches




> On 23.12.2022 at 15:48, Segher Boessenkool wrote:
> 
> Hi!
> 
>> On Fri, Dec 23, 2022 at 08:36:36PM +0800, Jiufu Guo wrote:
>> It seems some limitations there. e.g. 1. "subreg:DF on DI register"
>> may not work well on pseudo,
> 
> It is perfectly normal:
>  A hard register may be accessed in various modes throughout one
>  function, but each pseudo register is given a natural mode
>  and is accessed only in that mode.  When it is necessary to describe
>  an access to a pseudo register using a nonnatural mode, a @code{subreg}
>  expression is used.
> 
> and:
>  @code{subreg} expressions are used to refer to a register in a machine
>  mode other than its natural one, or to refer to one register of
>  a multi-part @code{reg} that actually refers to several registers.
> 
>  Each pseudo register has a natural mode.  If it is necessary to
>  operate on it in a different mode, the register must be
>  enclosed in a @code{subreg}.
> 
> and we even have:
>  @item hard registers
>  It is seldom necessary to wrap hard registers in @code{subreg}s; such
>  registers would normally reduce to a single @code{reg} rtx.  This use of
>  @code{subreg}s is discouraged and may not be supported in the future.
> 
>> and 2. to convert high-part:DI to SF,
>> a "shift/rotate" is needed, and then we need to "emit shift insn"
>> in cse. I may need to update this patch.
> 
> Hrm.  The machine insns to do this are just mtvsrd;xscvspdpn, but for
> converting the lowpart it is mtvsrws;xscvspdpn (this needs p9 or
> later).  We should arrive at those patterns, and we should try to not
> go via the more expensive formulations with shifts, which don't describe
> the hardware well, and which overestimate the cost of it.
> 
> None of this belongs in generic code at all imo.  At expand time it
> should be expanded to something that works and can be optimised well,
> so not anything with :BLK (which has to be put in memory, something with
> unbounded size cannot be put in registers), not anything specifically
> tailored to any cpu, something nice and regular.  Using a subreg (of a
> pseudo!) is the standard way of writing a bitcast.
> 
> So generic code would do a  (subreg:SF (reg:SI) 0)  to express a 32-bit
> integer bitcast to an IEEE SP number, and our machine description should
> make it work nicely.

There’s also a byte offset in subreg, so (subreg:sf (reg:di) 4) is a Highpart 
bitcast.  Note whether targets actually support subreg operations needs to be 
queried and I’m not sure how subreg with offset validation should work there.

Richard 

> 
> 
> Segher

[Bug c++/108116] [12 Regression] ICE in check_noexcept_r, at cp/except.cc:1074 since r12-6897-gdec8d0e5fa00ceb2

2022-12-23 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108116

Patrick Palka  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Known to work|                            |13.0
            Summary|[12/13 Regression] ICE in   |[12 Regression] ICE in
                   |check_noexcept_r, at        |check_noexcept_r, at
                   |cp/except.cc:1074 since     |cp/except.cc:1074 since
                   |r12-6897-gdec8d0e5fa00ceb2  |r12-6897-gdec8d0e5fa00ceb2

--- Comment #5 from Patrick Palka  ---
Fixed on trunk so far

[Bug c++/108116] [12/13 Regression] ICE in check_noexcept_r, at cp/except.cc:1074 since r12-6897-gdec8d0e5fa00ceb2

2022-12-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108116

--- Comment #4 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:cf59c8983ef6590f0d69014f8dc8778b5b7691c6

commit r13-4879-gcf59c8983ef6590f0d69014f8dc8778b5b7691c6
Author: Patrick Palka 
Date:   Fri Dec 23 11:17:45 2022 -0500

c++: get_nsdmi in template context [PR108116]

Here during ahead of time checking of C{}, we indirectly call get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer, during
which we build a call to A::~A and check expr_noexcept_p for it (from
build_vec_delete_1).  But this is all done with processing_template_decl
set, so the built A::~A call is templated (whose form was recently
changed by r12-6897-gdec8d0e5fa00ceb2) which expr_noexcept_p doesn't
expect, and we crash.

This patch fixes this by clearing processing_template_decl before
the call to break_out_target_exprs from get_nsdmi.  And since it more
generally seems we shouldn't be seeing (or producing) non-templated
trees in break_out_target_exprs, this patch also adds an assert to
that effect.

PR c++/108116

gcc/cp/ChangeLog:

* constexpr.cc (maybe_constant_value): Clear
processing_template_decl before calling break_out_target_exprs.
* init.cc (get_nsdmi): Likewise.
* tree.cc (break_out_target_exprs): Assert processing_template_decl
is cleared.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/nsdmi-template24.C: New test.

PING [PATCH] Fortran: incorrect array bounds when bound intrinsic used in decl [PR108131]

2022-12-23 Thread Harald Anlauf via Gcc-patches


On 17.12.22 at 22:21, Harald Anlauf via Gcc-patches wrote:

Dear all,

the previous fix for pr103505 introduced a regression that could lead
to wrong array bounds when LBOUND/UBOUND were used in the array spec
of a declaration.  The reason was that we tried to simplify too early
the array element spec, which appears to have interfered with the
subtle semantics of the bound intrinsics.

The solution is to undo the fix for pr103505.  It turns out that other
code changes, put in place to fix related ICEs, handle that one too,
and only lead to a change of the emitted error diagnostics.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

As this is a 10/11/12/13 regression, I would like to backport
as seems fit.

Thanks,
Harald

Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-23 Thread Jason Merrill via Gcc-patches


On 12/23/22 10:48, Patrick Palka wrote:

On Thu, 22 Dec 2022, Patrick Palka wrote:


On Thu, 22 Dec 2022, Jason Merrill wrote:


On 12/22/22 16:41, Patrick Palka wrote:

On Thu, 22 Dec 2022, Jason Merrill wrote:


On 12/22/22 11:31, Patrick Palka wrote:

On Wed, 21 Dec 2022, Jason Merrill wrote:


On 12/21/22 09:52, Patrick Palka wrote:

Here during ahead of time checking of C{}, we indirectly call
get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer,
during
which we end up building a call to A::~A and checking
expr_noexcept_p
for it (from build_vec_delete_1).  But this is all done with
processing_template_decl set, so the built A::~A call is templated
(whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
expr_noexcept_p doesn't expect and we crash.

In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
expr_noexcept_p call with !processing_template_decl, which works
here
too.  But it seems to me since the initializer we obtain in
get_nsdmi is
always non-templated, it should be calling break_out_target_exprs
with
processing_template_decl cleared since otherwise the function might
end
up mixing templated and non-templated trees.

I'm not sure about this though, perhaps this is not the best fix
here.
Alternatively, when processing_template_decl we could make get_nsdmi
avoid calling break_out_target_exprs at all or something.
Additionally,
perhaps break_out_target_exprs should be a no-op more generally when
processing_template_decl since we shouldn't see any TARGET_EXPRs
inside
a template?


Hmm.

Any time we would call break_out_target_exprs we're dealing with
non-dependent
expressions; if we're in a template, we're building up an initializer
or a
call that we'll soon throw away, just for the purpose of checking or
type
computation.

Furthermore, as you say, the argument is always a non-template tree,
whether
in get_nsdmi or convert_default_arg.  So having
processing_template_decl
cleared would be correct.

I don't think we can get away with not calling break_out_target_exprs
at
all
in a template; if nothing else, we would lose immediate invocation
expansion.
However, we could probably skip the bot_manip tree walk, which should
avoid
the problem.

Either way we end up returning non-template trees, as we do now, and
callers
have to deal with transient CONSTRUCTORs containing such (as we do in
massage_init_elt).


Ah I see, makes sense.



Does convert_default_arg not run into the same problem, e.g. when
calling

 void g(B = {0});


In practice it seems not, because we don't call convert_default_arg
when processing_template_decl is set (verified with an assert to
that effect).  In build_over_call for example we exit early when
processing_template_decl is set, and return a templated CALL_EXPR
that doesn't include default arguments at all.  A consequence of
this is that we don't reject ahead of time a call that would use
an ill-formed dependent default argument, e.g.

 template
 void g(B = T{0});

 template
 void f() {
   g();
 }

since the default argument instantiation would be the responsibility
of convert_default_arg.

Thinking hypothetically here, if we do in the future want to include
default
arguments in the templated form of a CALL_EXPR,


We definitely do not want to; the templated form should be as close as
possible to the source.


Ah, sounds good.



We might want to perform non-dependent conversions to get any errors (such
as
this one) before throwing away the result.  Which would be parallel to
what we
currently do in calling get_nsdmi, and would want the same behavior.


*nod*




[snip]



shall we go with the original approach to clear
processing_template_decl directly from get_nsdmi?


OK, but then we should also checking_assert !processing_template_decl in
b_o_t_e.


Unfortunately we'd trigger that assert from maybe_constant_value, which
potentially calls b_o_t_e with processing_template_decl set.


maybe_constant_value could also clear processing_template_decl; entries in
cv_cache are non-templated.


Aha!  I'll try that.


How does this look?  Bootstrapped and regtested on x86_64-pc-linux-gnu.


OK.


-- >8 --

Subject: [PATCH] c++: get_nsdmi in template context [PR108116]

Here during ahead of time checking of C{}, we indirectly call get_nsdmi
for C::m from finish_compound_literal, which in turn calls
break_out_target_exprs for C::m's (non-templated) initializer, during
which we build a call to A::~A and check expr_noexcept_p for it (from
build_vec_delete_1).  But this is all done with processing_template_decl
set, so the built A::~A call is templated (whose form was recently
changed by r12-6897-gdec8d0e5fa00ceb2) which expr_noexcept_p doesn't
expect, and we crash.

This patch fixes this by clearing processing_template_decl before
the call to break_out_target_exprs from get_nsdmi.  And since it more
generally seems we shouldn't be seeing (or producing)

Re: [PATCH] c++: get_nsdmi in template context [PR108116]

2022-12-23 Thread Patrick Palka via Gcc-patches

On Thu, 22 Dec 2022, Patrick Palka wrote:

> On Thu, 22 Dec 2022, Jason Merrill wrote:
> 
> > On 12/22/22 16:41, Patrick Palka wrote:
> > > On Thu, 22 Dec 2022, Jason Merrill wrote:
> > > 
> > > > On 12/22/22 11:31, Patrick Palka wrote:
> > > > > On Wed, 21 Dec 2022, Jason Merrill wrote:
> > > > > 
> > > > > > On 12/21/22 09:52, Patrick Palka wrote:
> > > > > > > Here during ahead of time checking of C{}, we indirectly call
> > > > > > > get_nsdmi
> > > > > > > for C::m from finish_compound_literal, which in turn calls
> > > > > > > break_out_target_exprs for C::m's (non-templated) initializer,
> > > > > > > during
> > > > > > > which we end up building a call to A::~A and checking
> > > > > > > expr_noexcept_p
> > > > > > > for it (from build_vec_delete_1).  But this is all done with
> > > > > > > processing_template_decl set, so the built A::~A call is templated
> > > > > > > (whose form r12-6897-gdec8d0e5fa00ceb2 recently changed) which
> > > > > > > expr_noexcept_p doesn't expect and we crash.
> > > > > > > 
> > > > > > > In r10-6183-g20afdcd3698275 we fixed a similar issue by guarding a
> > > > > > > expr_noexcept_p call with !processing_template_decl, which works
> > > > > > > here
> > > > > > > too.  But it seems to me since the initializer we obtain in
> > > > > > > get_nsdmi is
> > > > > > > always non-templated, it should be calling break_out_target_exprs
> > > > > > > with
> > > > > > > processing_template_decl cleared since otherwise the function 
> > > > > > > might
> > > > > > > end
> > > > > > > up mixing templated and non-templated trees.
> > > > > > > 
> > > > > > > I'm not sure about this though, perhaps this is not the best fix
> > > > > > > here.
> > > > > > > Alternatively, when processing_template_decl we could make 
> > > > > > > get_nsdmi
> > > > > > > avoid calling break_out_target_exprs at all or something.
> > > > > > > Additionally,
> > > > > > > perhaps break_out_target_exprs should be a no-op more generally 
> > > > > > > when
> > > > > > > processing_template_decl since we shouldn't see any TARGET_EXPRs
> > > > > > > inside
> > > > > > > a template?
> > > > > > 
> > > > > > Hmm.
> > > > > > 
> > > > > > Any time we would call break_out_target_exprs we're dealing with
> > > > > > non-dependent
> > > > > > expressions; if we're in a template, we're building up an 
> > > > > > initializer
> > > > > > or a
> > > > > > call that we'll soon throw away, just for the purpose of checking or
> > > > > > type
> > > > > > computation.
> > > > > > 
> > > > > > Furthermore, as you say, the argument is always a non-template tree,
> > > > > > whether
> > > > > > in get_nsdmi or convert_default_arg.  So having
> > > > > > processing_template_decl
> > > > > > cleared would be correct.
> > > > > > 
> > > > > > I don't think we can get away with not calling 
> > > > > > break_out_target_exprs
> > > > > > at
> > > > > > all
> > > > > > in a template; if nothing else, we would lose immediate invocation
> > > > > > expansion.
> > > > > > However, we could probably skip the bot_manip tree walk, which 
> > > > > > should
> > > > > > avoid
> > > > > > the problem.
> > > > > > 
> > > > > > Either way we end up returning non-template trees, as we do now, and
> > > > > > callers
> > > > > > have to deal with transient CONSTRUCTORs containing such (as we do 
> > > > > > in
> > > > > > massage_init_elt).
> > > > > 
> > > > > Ah I see, makes sense.
> > > > > 
> > > > > > 
> > > > > > Does convert_default_arg not run into the same problem, e.g. when
> > > > > > calling
> > > > > > 
> > > > > > void g(B = {0});
> > > > > 
> > > > > In practice it seems not, because we don't call convert_default_arg
> > > > > when processing_template_decl is set (verified with an assert to
> > > > > that effect).  In build_over_call for example we exit early when
> > > > > processing_template_decl is set, and return a templated CALL_EXPR
> > > > > that doesn't include default arguments at all.  A consequence of
> > > > > this is that we don't reject ahead of time a call that would use
> > > > > an ill-formed dependent default argument, e.g.
> > > > > 
> > > > > template
> > > > > void g(B = T{0});
> > > > > 
> > > > > template
> > > > > void f() {
> > > > >   g();
> > > > > }
> > > > > 
> > > > > since the default argument instantiation would be the responsibility
> > > > > of convert_default_arg.
> > > > > 
> > > > > Thinking hypothetically here, if we do in the future want to include
> > > > > default
> > > > > arguments in the templated form of a CALL_EXPR,
> > > > 
> > > > We definitely do not want to; the templated form should be as close as
> > > > possible to the source.
> > > 
> > > Ah, sounds good.
> > > 
> > > > 
> > > > We might want to perform non-dependent conversions to get any errors 
> > > > (such
> > > > as
> > > > this one) before throwing away the result.  Which would be parallel to
> > > > what we
> > > > currently do in calling get_nsdmi, and would want the same behavior.
> > >

[Bug c/107947] __has_c_attribute incorrectly identifies attribute as supported

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107947

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |marxin at gcc dot gnu.org

--- Comment #6 from Andrew Pinski  ---
*** Bug 108213 has been marked as a duplicate of this bug. ***

[Bug c/108213] [[noreturn]] cannot be used after static keyword

2022-12-23 Thread pinskia at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108213

Andrew Pinski  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
         Resolution|---                         |DUPLICATE
             Status|UNCONFIRMED                 |RESOLVED

--- Comment #1 from Andrew Pinski  ---
It is a bug in the timezone sources.

*** This bug has been marked as a duplicate of bug 107947 ***

[Bug c/107993] ICE: tree check: expected string_cst, have integer_cst in get_target_clone_attr_len, at tree.cc:14872

2022-12-23 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107993

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch

--- Comment #3 from Martin Liška  ---
Patch candidate:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609060.html

[PATCH] strlen: do not use cond_expr for boundaries

2022-12-23 Thread Martin Liška

Hi.

We reach cond_expr and then we get an ICE in tree_int_cst_lt.
The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

PR tree-optimization/108137

gcc/ChangeLog:

* tree-ssa-strlen.cc (get_range_strlen_phi): Reject anything
different from INTEGER_CST.

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr108137.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr108137.c |  8 
 gcc/tree-ssa-strlen.cc   | 13 +++--
 2 files changed, 15 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr108137.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr108137.c b/gcc/testsuite/gcc.dg/tree-ssa/pr108137.c
new file mode 100644
index 000..f0cb71b2267
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr108137.c
@@ -0,0 +1,8 @@
+// PR tree-optimization/108137
+// { dg-do compile }
+// { dg-options "-Wformat-overflow" }
+
+void f(unsigned short x_port, unsigned int x_host)
+{
+__builtin_printf("missing %s", x_port ? "host" : &"host:port"[x_host ? 5 : 0]);
+}
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index abec225566d..a2edac4c77f 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -1136,14 +1136,15 @@ get_range_strlen_phi (tree src, gphi *phi,
 
   /* Adjust the minimum and maximum length determined so far and
 the upper bound on the array size.  */
-  if (!pdata->minlen
- || tree_int_cst_lt (argdata.minlen, pdata->minlen))
+  if (TREE_CODE (argdata.minlen) == INTEGER_CST
+ && (!pdata->minlen
+ || tree_int_cst_lt (argdata.minlen, pdata->minlen)))
pdata->minlen = argdata.minlen;
 
-  if (!pdata->maxlen
- || (argdata.maxlen
- && TREE_CODE (argdata.maxlen) == INTEGER_CST
- && tree_int_cst_lt (pdata->maxlen, argdata.maxlen)))
+  if (TREE_CODE (argdata.maxlen) == INTEGER_CST
+ && (!pdata->maxlen
+ || (argdata.maxlen
+ && tree_int_cst_lt (pdata->maxlen, argdata.maxlen))))
pdata->maxlen = argdata.maxlen;
 
   if (!pdata->maxbound
-- 
2.39.0

[Bug c/108213] New: [[noreturn]] cannot be used after static keyword

2022-12-23 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108213

            Bug ID: 108213
           Summary: [[noreturn]] cannot be used after static keyword
           Product: gcc
           Version: 13.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: marxin at gcc dot gnu.org
                CC: jsm28 at gcc dot gnu.org
  Target Milestone: ---

I noticed this in the timezone package:

$ cat zic.i && gcc-12 zic.i -c
static _Noreturn void
time_overflow(void)
{
  __builtin_abort ();
}

$ cat zic.i && gcc-12 zic.i -c
static [[noreturn]] void
time_overflow(void)
{
  __builtin_abort ();
}
zic.i:1:1: warning: ‘noreturn’ attribute ignored [-Wattributes]
1 | static [[noreturn]] void
  | ^~
zic.i:1:21: error: expected identifier or ‘(’ before ‘void’
1 | static [[noreturn]] void
  | ^~~~

Is it really an invalid construction?
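
For comparison, a minimal sketch of placements that should be accepted, assuming a compiler that implements the C23 attribute syntax: the `[[noreturn]]` attribute specifier goes at the very start of the declaration, before `static`, whereas `_Noreturn` is a function specifier and may appear among the other declaration specifiers. The `NORETURN_ATTR` macro and the `checked_add` helper are illustrative names and are not taken from the timezone sources.

```c
#include <limits.h>
#include <stdlib.h>

/* Pick the spelling by language version (assumption: anything newer than
   C17 understands the [[noreturn]] attribute).  */
#if defined __STDC_VERSION__ && __STDC_VERSION__ > 201710L
# define NORETURN_ATTR [[noreturn]]
#else
# define NORETURN_ATTR _Noreturn
#endif

/* The attribute (or specifier) comes first, then the storage class.  */
NORETURN_ATTR static void
time_overflow (void)
{
  abort ();
}

/* Hypothetical caller: add two ints, aborting instead of overflowing.  */
static int
checked_add (int a, int b)
{
  if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
    time_overflow ();
  return a + b;
}
```

Keeping the attribute in front means the same declaration works with either spelling, because both are valid in that position.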

[Bug sanitizer/108085] gcc trunk's ASAN at -O3 missed a stack-use-after-scope

2022-12-23 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108085

Martin Liška  changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|marxin at gcc dot gnu.org   |unassigned at gcc dot gnu.org
             Status|ASSIGNED                    |NEW

[committed] tree-ssa-dom: can_infer_simple_equiv fixes [PR108068]

2022-12-23 Thread Jakub Jelinek via Gcc-patches

Hi!

As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find
if a floating point constant is zero and it shouldn't try to infer
equivalences from comparison against it if signed zeros are honored.
This doesn't work at all for decimal types, because real_zerop always
returns false for them (one can have different representations of decimal
zero beyond -0/+0), and it doesn't work for vector compares either,
as real_zerop checks if all elements are zero, while we need to avoid
inferring equivalences from comparison against vector constants which have
at least one zero element in it (if signed zeros are honored).
Furthermore, as mentioned by Joseph, for decimal types many other values
aren't singleton.

So, this patch stops inferring anything if element mode is decimal, and
otherwise uses instead of real_zerop a new function, real_maybe_zerop,
which will work even for decimal types and for complex or vector will
return true if any element is or might be zero (so it returns true
for anything but constants for now).
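
The underlying hazard can be seen with ordinary binary floating point. In this small sketch (function names are illustrative, and `double` stands in for the decimal types the patch is about), `x == 0.0` is true for both +0.0 and -0.0, so knowing that the comparison succeeded does not justify replacing `x` with the constant 0.0.

```c
#include <math.h>

/* Nonzero when X compares equal to zero; true for both +0.0 and -0.0.  */
static int
compares_equal_to_zero (double x)
{
  return x == 0.0;
}

/* Nonzero only for negative zero: equal to zero, but with the sign bit set.
   Substituting the constant 0.0 for X after "x == 0.0" would lose this.  */
static int
is_negative_zero (double x)
{
  return x == 0.0 && signbit (x) != 0;
}
```

This is the same property the new test checks with `__builtin_signbit` after a loop exits on `x != 0`.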

Bootstrapped/regtested on x86_64-linux and i686-linux, acked by Richi
in the PR, committed to trunk.

2022-12-23  Jakub Jelinek  

PR tree-optimization/108068
* tree.h (real_maybe_zerop): Declare.
* tree.cc (real_maybe_zerop): Define.
* tree-ssa-dom.cc (record_edge_info): Use it instead of
real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop.  Always set
can_infer_simple_equiv to false for decimal floating point types.

* gcc.dg/dfp/pr108068.c: New test.

--- gcc/tree.h.jj   2022-12-21 09:03:45.722562726 +0100
+++ gcc/tree.h  2022-12-21 16:34:56.316622678 +0100
@@ -5497,6 +5497,7 @@ extern bool needs_to_live_in_memory (con
 extern tree reconstruct_complex_type (tree, tree);
 extern bool real_onep (const_tree);
 extern bool real_minus_onep (const_tree);
+extern bool real_maybe_zerop (const_tree);
 extern void init_ttree (void);
 extern void build_common_tree_nodes (bool);
 extern void build_common_builtin_nodes (void);
--- gcc/tree.cc.jj  2022-12-21 09:03:45.719562769 +0100
+++ gcc/tree.cc 2022-12-21 16:35:46.567899636 +0100
@@ -3180,6 +3180,35 @@ real_minus_onep (const_tree expr)
 }
 }
 
+/* Return true if T could be a floating point zero.  */
+
+bool
+real_maybe_zerop (const_tree expr)
+{
+  switch (TREE_CODE (expr))
+{
+case REAL_CST:
+  /* Can't use real_zerop here, as it always returns false for decimal
+floats.  And can't use TREE_REAL_CST (expr).cl == rvc_zero
+either, as decimal zeros are rvc_normal.  */
+  return real_equal (&TREE_REAL_CST (expr), &dconst0);
+case COMPLEX_CST:
+  return (real_maybe_zerop (TREE_REALPART (expr))
+ || real_maybe_zerop (TREE_IMAGPART (expr)));
+case VECTOR_CST:
+  {
+   unsigned count = vector_cst_encoded_nelts (expr);
+   for (unsigned int i = 0; i < count; ++i)
+ if (real_maybe_zerop (VECTOR_CST_ENCODED_ELT (expr, i)))
+   return true;
+   return false;
+  }
+default:
+  /* Perhaps for SSA_NAMEs we could query frange.  */
+  return true;
+}
+}
+
 /* Nonzero if EXP is a constant or a cast of a constant.  */
 
 bool
--- gcc/tree-ssa-dom.cc.jj  2022-11-23 09:24:48.781253319 +0100
+++ gcc/tree-ssa-dom.cc 2022-12-21 16:36:37.756163125 +0100
@@ -615,9 +615,9 @@ record_edge_info (basic_block bb)
 {
   tree cond = build2 (code, boolean_type_node, op0, op1);
   tree inverted = invert_truthvalue_loc (loc, cond);
-  bool can_infer_simple_equiv
-= !(HONOR_SIGNED_ZEROS (op0)
-&& real_zerop (op0));
+ bool can_infer_simple_equiv
+   = !(HONOR_SIGNED_ZEROS (op0) && real_maybe_zerop (op0))
+ && !DECIMAL_FLOAT_MODE_P (element_mode (TREE_TYPE (op0)));
  class edge_info *edge_info;
 
  edge_info = new class edge_info (true_edge);
@@ -639,9 +639,9 @@ record_edge_info (basic_block bb)
 {
   tree cond = build2 (code, boolean_type_node, op0, op1);
   tree inverted = invert_truthvalue_loc (loc, cond);
-  bool can_infer_simple_equiv
-= !(HONOR_SIGNED_ZEROS (op1)
-&& (TREE_CODE (op1) == SSA_NAME || real_zerop (op1)));
+ bool can_infer_simple_equiv
+   = !(HONOR_SIGNED_ZEROS (op1) && real_maybe_zerop (op1))
+ && !DECIMAL_FLOAT_MODE_P (element_mode (TREE_TYPE (op1)));
  class edge_info *edge_info;
 
  edge_info = new class edge_info (true_edge);
--- gcc/testsuite/gcc.dg/dfp/pr108068.c.jj  2022-12-21 16:41:45.243738850 +0100
+++ gcc/testsuite/gcc.dg/dfp/pr108068.c 2022-12-21 16:41:38.267839223 +0100
@@ -0,0 +1,14 @@
+/* PR tree-optimization/108068 */
+/* { dg-do run } */
+/* { dg-options "-O2" } */
+
+int
+main ()
+{
+  _Decimal64 x = -1;
+  while (x != 0)
+x /= 10;
+  double d = x;
+  if (!__builtin_signbit (d))

[Bug sanitizer/108085] gcc trunk's ASAN at -O3 missed a stack-use-after-scope

2022-12-23 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108085

--- Comment #3 from Martin Liška  ---
Created attachment 54153
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54153=edit
pr108085.c.216t.uncprop1.dot.svg

So no, it's a real issue where we optimize out .ASAN_CHECK (6, , 4, 8); in
the exit block. As seen in the dump file, we have the very ASAN_CHECK in bb_3:
.ASAN_CHECK (7, , 4, 8), however, there are 2 ASAN_MARK (POISON, , 4) calls
that are on the path from bb_3 to the exit block.

@Jakub: Can you please take a look at the optimization algorithm why the check
is not preserved?

[Bug tree-optimization/108068] [10/11/12 Regression] decimal floating point signed zero is not honored

2022-12-23 Thread jakub at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108068

Jakub Jelinek  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11/12 Regression]
   |decimal floating point  |decimal floating point
   |signed zero is not honored  |signed zero is not honored

--- Comment #12 from Jakub Jelinek  ---
Fixed on the trunk for now.

[Bug tree-optimization/108068] [10/11/12/13 Regression] decimal floating point signed zero is not honored

2022-12-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108068

--- Comment #11 from CVS Commits  ---
The master branch has been updated by Jakub Jelinek :

https://gcc.gnu.org/g:fd1b0aefda5b65f3f841ca6e61ccea6a72daa060

commit r13-4877-gfd1b0aefda5b65f3f841ca6e61ccea6a72daa060
Author: Jakub Jelinek 
Date:   Fri Dec 23 16:12:21 2022 +0100

tree-ssa-dom: can_infer_simple_equiv fixes [PR108068]

As reported in the PR, tree-ssa-dom.cc uses real_zerop call to find
if a floating point constant is zero and it shouldn't try to infer
equivalences from comparison against it if signed zeros are honored.
This doesn't work at all for decimal types, because real_zerop always
returns false for them (one can have different representations of decimal
zero beyond -0/+0), and it doesn't work for vector compares either,
as real_zerop checks if all elements are zero, while we need to avoid
inferring equivalences from comparison against vector constants which have
at least one zero element in it (if signed zeros are honored).
Furthermore, as mentioned by Joseph, for decimal types many other values
aren't singleton.

So, this patch stops inferring anything if element mode is decimal, and
otherwise uses instead of real_zerop a new function, real_maybe_zerop,
which will work even for decimal types and for complex or vector will
return true if any element is or might be zero (so it returns true
for anything but constants for now).

2022-12-23  Jakub Jelinek  

PR tree-optimization/108068
* tree.h (real_maybe_zerop): Declare.
* tree.cc (real_maybe_zerop): Define.
* tree-ssa-dom.cc (record_edge_info): Use it instead of
real_zerop or TREE_CODE (op1) == SSA_NAME || real_zerop.  Always set
can_infer_simple_equiv to false for decimal floating point types.

* gcc.dg/dfp/pr108068.c: New test.
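
To illustrate why a successful comparison against zero does not justify substituting the constant for the value when signed zeros are honored, here is a minimal sketch. It uses binary `double` instead of a decimal type to stay portable, and all function names are hypothetical; it is not taken from the patch.

```cpp
#include <cmath>

// Hypothetical illustration, not code from the patch.
// x == 0.0 is true for both +0.0 and -0.0.
bool compares_equal_to_zero(double x)
{
  return x == 0.0;
}

// What a wrongly inferred "simple equivalence" would amount to: on the
// path where x == 0.0 is known to hold, use the literal 0.0 instead of x.
double substitute_constant_when_zero(double x)
{
  return x == 0.0 ? 0.0 : x;
}

// True if the substitution changed an observable property of x (its sign).
bool substitution_changes_sign(double x)
{
  return std::signbit(x) != std::signbit(substitute_constant_when_zero(x));
}
```

For `-0.0` the comparison succeeds, yet the substitution flips the sign bit, which is the kind of change the testcase above detects through `__builtin_signbit`.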

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Alexander Monakov via Gcc-patches

On Fri, 23 Dec 2022, Qing Zhao wrote:
> >> I am a little confused, you mean pre-RA scheduler does not look at the
> >> data flow information at all when scheduling insns across calls currently?
> > 
> > I think it does not inspect liveness info, and may extend lifetime of a
> > pseudo across a call, transforming
> > 
> >  call foo
> >  reg = 1
> >  ...
> >  use reg
> > 
> > to
> > 
> >  reg = 1
> >  call foo
> >  ...
> >  use reg
> > 
> > but this is undesirable, because now register allocation cannot select a
> > call-clobbered register for 'reg'.
> Okay, thanks for the explanation.
> 
> Then, why not just check the liveness info instead of inhibiting all
> scheduling across calls?

Because there's almost nothing to gain from pre-RA scheduling across calls in
the first place. Remember that the call transfers control flow elsewhere and
therefore the scheduler has no idea about the pipeline state after the call
and after the return, so modeling-wise it's a gamble.

For instructions that lie on a critical path such scheduling can be useful when
it substantially reduces the difference between the priority of the call and
nearby instructions of the critical path. But we don't track which instructions
are on critical path(s) and which are not.

(scheduling across calls in sched2 is somewhat dubious as well, but
it doesn't risk register pressure issues, and on VLIW CPUs it at least
can result in better VLIW packing)

Alexander

Re: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-23 Thread 钟居哲

Thank you. Would you mind testing this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609045.html
to see whether the issue is fixed?
Thanks



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2022-12-23 22:54
To: 钟居哲
CC: gcc-patches; kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support
On Dec 23 2022, 钟居哲 wrote:
 
> Would you mind telling me how you reproduce these errors?
 
make bootstrap
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

[Bug c++/87697] Casting a base class to derived gives no warning

2022-12-23 Thread arthur.j.odwyer at gmail dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87697

Arthur O'Dwyer  changed:

   What|Removed |Added

 CC||arthur.j.odwyer at gmail dot com

--- Comment #4 from Arthur O'Dwyer  ---
jynelson: Static-casting from Base& to Derived& is the foundation of the
"Curiously Recurring Template Pattern" in C++, and therefore can't be allowed
to trigger any diagnostic with -Wall -Wextra. (Many industry codebases build
with -Wall -Wextra, and also use the CRTP.)
*Aside* from that practical consideration, I don't think there's anything wrong
with casting from one type to another. The point of type-cast syntax is to say
"Don't worry, compiler, I know what I'm doing." If one doesn't know what one's
doing, then one shouldn't use casts at all, and just stick to the implicit
conversions. It's already an error to *implicitly convert* from Base& to
Derived&, so if you stick to implicit conversions you'll get exactly the
behavior you want.

Suggest closing this issue as NOTABUG.
But see also #96765 (for this kind of cast specifically *inside a
constructor*).
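
For readers unfamiliar with the pattern mentioned above, a minimal CRTP sketch follows. The class names `Shape` and `Square` are hypothetical and chosen only for illustration; the point is the `static_cast` from the base to the derived type inside the base class template.

```cpp
#include <string>

// Minimal CRTP sketch with hypothetical names.
template <typename Derived>
struct Shape
{
  std::string describe() const
  {
    // The base-to-derived static_cast that the CRTP relies on.
    const Derived& self = static_cast<const Derived&>(*this);
    return "shape with area " + std::to_string(self.area());
  }
};

struct Square : Shape<Square>
{
  int side;
  explicit Square(int s) : side(s) {}
  int area() const { return side * side; }
};
```

Calling `Square(3).describe()` dispatches to `Square::area` without any virtual function, which is why a warning on this cast would fire throughout CRTP-based code.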

Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-23 Thread Andreas Schwab

On Dec 23 2022, 钟居哲 wrote:

> Would you mind telling me how you reproduce these errors?

make bootstrap

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: [PATCH v5 3/4] OpenMP: Pointers and member mappings

2022-12-23 Thread Julian Brown

On Thu, 15 Dec 2022 16:46:50 +
Julian Brown  wrote:

> On Thu, 15 Dec 2022 14:54:58 +
> Julian Brown  wrote:
> 
> > On Wed, 7 Dec 2022 17:31:20 +0100
> > Tobias Burnus  wrote:
> >   
> > > Hi Julian,
> > > 
> > > I think this patch is OK; however, at least for gimplify.cc Jakub
> > > needs to have a second look.
> > 
> > Thanks for the review!  Here's a new version that hopefully
> > addresses your comments.  (The gimplify bits change a bit more in
> > this version!)  
> 
> FYI, this is the current dependency list for this patch:
> 
> (1) "OpenMP/OpenACC: Reindent TO/FROM/_CACHE_ stanza in
> {c_}finish_omp_clause"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603791.html
> Approved.
> 
> (2) "OpenMP/OpenACC: Rework clause expansion and nested struct
> handling"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603792.html
> Approved, but waiting for *this* patch to avoid regressing Fortran
> pointer-mapping behaviour, and which Tobias noticed an issue with
> prior to committing, addressed by (4).
> 
> (3) "OpenMP/OpenACC: Refine condition for when map clause expansion
> happens"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607543.html
> Not reviewed (partly OpenACC).
> 
> (4) "OpenMP: implicitly map base pointer for array-section pointer
> components"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608318.html
> Not reviewed.
> 
> The following patches also depend on this one and the above:
> 
> (5) "OpenMP: lvalue parsing for map clauses (C++)"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/605367.html
> Mostly approved.
> 
> (6) "OpenMP: C++ "declare mapper" support"
> https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607544.html
> Revised version unreviewed.
> 
> ...and the to-be-revised "lvalue parsing for C", and C/Fortran
> "declare mapper" patches.

Followup:

https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609031.html

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Segher Boessenkool

Hi!

On Fri, Dec 23, 2022 at 08:36:36PM +0800, Jiufu Guo wrote:
> There seem to be some limitations there, e.g. 1. "subreg:DF on DI register"
> may not work well on pseudo,

It is perfectly normal:
  A hard register may be accessed in various modes throughout one
  function, but each pseudo register is given a natural mode
  and is accessed only in that mode.  When it is necessary to describe
  an access to a pseudo register using a nonnatural mode, a @code{subreg}
  expression is used.

and:
  @code{subreg} expressions are used to refer to a register in a machine
  mode other than its natural one, or to refer to one register of
  a multi-part @code{reg} that actually refers to several registers.

  Each pseudo register has a natural mode.  If it is necessary to
  operate on it in a different mode, the register must be
  enclosed in a @code{subreg}.

and we even have:
  @item hard registers
  It is seldom necessary to wrap hard registers in @code{subreg}s; such
  registers would normally reduce to a single @code{reg} rtx.  This use of
  @code{subreg}s is discouraged and may not be supported in the future.

> and 2. to convert high-part:DI to SF,
> a "shift/rotate" is needed, and then we need to "emit shift insn"
> in cse. I may need to update this patch.

Hrm.  The machine insns to do this are just mtvsrd;xscvspdpn, but for
converting the lowpart it is mtvsrws;xscvspdpn (this needs p9 or
later).  We should arrive at those patterns, and we should try to not
go via the more expensive formulations with shifts, which don't describe
the hardware well, and which overestimate the cost of it.

None of this belongs in generic code at all imo.  At expand time it
should be expanded to something that works and can be optimised well,
so not anything with :BLK (which has to be put in memory, something with
unbounded size cannot be put in registers), not anything specifically
tailored to any cpu, something nice and regular.  Using a subreg (of a
pseudo!) is the standard way of writing a bitcast.

So generic code would do a  (subreg:SF (reg:SI) 0)  to express a 32-bit
integer bitcast to an IEEE SP number, and our machine description should
make it work nicely.

Segher
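
As a source-level analogue of the `(subreg:SF (reg:SI) 0)` bitcast described above, here is a minimal sketch. It assumes `float` is a 32-bit IEEE single-precision type, and the function names are hypothetical; it says nothing about which machine instructions a given target would emit.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit float");

// Reinterpret the 32 bits of an integer as a float (no value conversion).
float bits_to_float(std::uint32_t bits)
{
  float result;
  std::memcpy(&result, &bits, sizeof result);
  return result;
}

// The inverse direction: view the bits of a float as an integer.
std::uint32_t float_to_bits(float value)
{
  std::uint32_t bits;
  std::memcpy(&bits, &value, sizeof bits);
  return bits;
}
```

For example, the bit pattern `0x3f800000` reinterprets to `1.0f` under the IEEE assumption stated above.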

[Bug libstdc++/108212] pretty printers don't work with Python 2

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108212

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
 Ever confirmed|0   |1
   Target Milestone|--- |13.0
 Status|UNCONFIRMED |ASSIGNED
   Last reconfirmed||2022-12-23

[Bug libstdc++/108212] New: pretty printers don't work with Python 2

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108212

Bug ID: 108212
   Summary:  pretty printers don't work with Python 2
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

ImportError: cannot import name timezone

[Bug libstdc++/108211] std::chrono::current_zone() fails if zone only has one component

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108211

Jonathan Wakely  changed:

   What|Removed |Added

   Assignee|unassigned at gcc dot gnu.org  |redi at gcc dot gnu.org
   Target Milestone|--- |13.0
   Last reconfirmed||2022-12-23
 Status|UNCONFIRMED |ASSIGNED
 Ever confirmed|0   |1

Re: [PATCH V2] Disable sched1 in functions that call setjmp

2022-12-23 Thread Qing Zhao via Gcc-patches



> On Dec 23, 2022, at 2:33 AM, Alexander Monakov  wrote:
> 
> 
> On Thu, 22 Dec 2022, Qing Zhao wrote:
> 
>>> I think scheduling across calls in the pre-RA scheduler is simply an
>>> oversight, we do not look at dataflow information and with 50% chance risk
>>> extending lifetime of a pseudoregister across a call, causing higher
>>> register pressure at the point of the call, and potentially an extra spill.
>> 
>> I am a little confused, you mean pre-RA scheduler does not look at the data
>> flow information at all when scheduling insns across calls currently?
> 
> I think it does not inspect liveness info, and may extend lifetime of a pseudo
> across a call, transforming
> 
>  call foo
>  reg = 1
>  ...
>  use reg
> 
> to
> 
>  reg = 1
>  call foo
>  ...
>  use reg
> 
> but this is undesirable, because now register allocation cannot select a
> call-clobbered register for 'reg'.
Okay, thanks for the explanation.

Then, why not just check the liveness info instead of inhibiting all
scheduling across calls?

Qing
> 
> Alexander

[Bug libstdc++/108211] New: std::chrono::current_zone() fails if zone only has one component

2022-12-23 Thread redi at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108211

Bug ID: 108211
   Summary: std::chrono::current_zone() fails if zone only has one
component
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: redi at gcc dot gnu.org
  Target Milestone: ---

$ ls -l /etc/localtime
lrwxrwxrwx. 1 root root 25 Mar 15  2017 /etc/localtime -> ../usr/share/zoneinfo/UTC


Libstdc++ incorrectly assumes the local timezone will be "foo/bar" and so it
breaks for "UTC" or any other name without a slash in it.

terminate called after throwing an instance of 'std::runtime_error'
  what():  tzdb: cannot determine current zone


This causes:

FAIL: std/time/tzdb/1.cc execution test
FAIL: std/time/zoned_time/custom.cc execution test
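
One way to derive a zone name from the symlink target without assuming it contains a slash is sketched below. This is only an illustration under stated assumptions: the helper name is hypothetical, it assumes the target path contains a `zoneinfo/` component, and it is not the libstdc++ implementation or its eventual fix.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper, not libstdc++ code: return everything after the last
// "zoneinfo/" component of the link target, or an empty string if absent.
std::string zone_name_from_link_target(std::string_view target)
{
  constexpr std::string_view marker = "zoneinfo/";
  const auto pos = target.rfind(marker);
  if (pos == std::string_view::npos)
    return {};
  return std::string(target.substr(pos + marker.size()));
}
```

With this approach both `../usr/share/zoneinfo/UTC` and a two-component name such as `Europe/London` are handled the same way.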

[Bug c++/107853] [10/11/12 Regression] variadic template with a variadic template friend with a requires of fold expression

2022-12-23 Thread ppalka at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107853

Patrick Palka  changed:

   What|Removed |Added

Summary|[10/11/12/13 Regression]|[10/11/12 Regression]
   |variadic template with a|variadic template with a
   |variadic template friend|variadic template friend
   |with a requires of fold |with a requires of fold
   |expression  |expression
  Known to work||13.0

--- Comment #6 from Patrick Palka  ---
Fixed on trunk so far

[Bug c++/107853] [10/11/12/13 Regression] variadic template with a variadic template friend with a requires of fold expression

2022-12-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107853

--- Comment #5 from CVS Commits  ---
The master branch has been updated by Patrick Palka :

https://gcc.gnu.org/g:bd1fc4a219d8c0fad0ec41002e895b49e384c1c2

commit r13-4876-gbd1fc4a219d8c0fad0ec41002e895b49e384c1c2
Author: Patrick Palka 
Date:   Fri Dec 23 09:18:37 2022 -0500

c++: template friend with variadic constraints [PR107853]

When instantiating a constrained hidden template friend, we substitute
into its template-head requirements in tsubst_friend_function.  For this
substitution we use the template's full argument vector whose outer
levels correspond to the instantiated class's arguments and innermost
level corresponds to the template's own level-lowered generic arguments.

But for A::f here, for which the relevant argument vector is
{{int}, {Us...}}, the substitution into (C && ...) triggers the
assert in use_pack_expansion_extra_args_p since one argument is a pack
expansion and the other isn't.

And for A::f, for which the relevant argument vector is
{{int, int}, {Us...}}, the use_pack_expansion_extra_args_p assert would
also trigger but we first get a bogus "mismatched argument pack lengths"
error from tsubst_pack_expansion.

Sidestepping the question of whether tsubst_pack_expansion should be
able to handle such substitutions, it seems we can work around this by
using only the instantiated class's arguments and not also the template
friend's own generic arguments, which is consistent with how we normally
substitute into the signature of a member template.

PR c++/107853

gcc/cp/ChangeLog:

* constraint.cc (maybe_substitute_reqs_for): Substitute into
the template-head requirements of a template friend using only
its outer arguments via outer_template_args.
* cp-tree.h (outer_template_args): Declare.
* pt.cc (outer_template_args): Define, factored out and
generalized from ...
(ctor_deduction_guides_for): ... here.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend12.C: New test.
* g++.dg/cpp2a/concepts-friend13.C: New test.

nvptx: '-mframe-malloc-threshold', '-Wframe-malloc-threshold' (was: Handling of large stack objects in GPU code generation -- maybe transform into heap allocation?)

2022-12-23 Thread Thomas Schwinge

Hi!

On 2022-11-11T15:35:44+0100, Richard Biener via Fortran wrote:
> On Fri, Nov 11, 2022 at 3:13 PM Thomas Schwinge wrote:
>> For example, for Fortran code like:
>>
>> write (*,*) "Hello world"
>>
>> ..., 'gfortran' creates:
>>
>> struct __st_parameter_dt dt_parm.0;
>>
>> try
>>   {
>> dt_parm.0.common.filename = &"source-gcc/libgomp/testsuite/libgomp.oacc-fortran/print-1_.f90"[1]{lb: 1 sz: 1};
>> dt_parm.0.common.line = 29;
>> dt_parm.0.common.flags = 128;
>> dt_parm.0.common.unit = 6;
>> _gfortran_st_write (&dt_parm.0);
>> _gfortran_transfer_character_write (&dt_parm.0, &"Hello world"[1]{lb: 1 sz: 1}, 11);
>> _gfortran_st_write_done (&dt_parm.0);
>>   }
>> finally
>>   {
>> dt_parm.0 = {CLOBBER(eol)};
>>   }
>>
>> The issue: the stack object 'dt_parm.0' is a half-KiB in size (yes,
>> really! -- there's a lot of state in Fortran I/O apparently).  That's a
>> problem for GPU execution -- here: OpenACC/nvptx -- where typically you
>> have small stacks.  (For example, GCC/OpenACC/nvptx: 1 KiB per thread;
>> GCC/OpenMP/nvptx is an exception, because of its use of '-msoft-stack'
>> "Use custom stacks instead of local memory for automatic storage".)
>>
>> Now, the Nvidia Driver tries to accommodate such largish stack usage,
>> and dynamically increases the per-thread stack as necessary (thereby
>> potentially reducing parallelism) -- if it manages to understand the call
>> graph.  In case of libgfortran I/O, it evidently doesn't.  Not being able
>> to disprove existence of recursion is the common problem, as I've read.
>> At run time, via 'CU_JIT_INFO_LOG_BUFFER' you then get, for example:
>>
>> warning : Stack size for entry function 'MAIN__$_omp_fn$0' cannot be statically determined
>>
>> That's still not an actual problem: if the GPU kernel's stack usage still
>> fits into 1 KiB.  Very often it does, but if, as happens in libgfortran
>> I/O handling, there is another such 'dt_parm' put onto the stack, the
>> stack then overflows; device-side SIGSEGV.
>>
>> (There is, by the way, some similar analysis by Tom de Vries in
>>  "[nvptx, openacc, openmp, testsuite]
>> Recursive tests may fail due to thread stack limit".)
>>
>> Of course, you shouldn't really be doing I/O in GPU kernels, but people
>> do like their occasional "'printf' debugging", so we ought to make that
>> work (... without pessimizing any "normal" code).
>>
>> I assume that generally reducing the size of 'dt_parm' etc. is out of
>> scope.
>>
>> There is a way to manually set a per-thread stack size, but it's not
>> obvious which size to set: that size needs to work for the whole GPU
>> kernel, and should be as low as possible (to maximize parallelism).
>> I assume that even if GCC did an accurate call graph analysis of the GPU
>> kernel's maximum stack usage, that still wouldn't help: that's before the
>> PTX JIT does its own code transformations, including stack spilling.
>>
>> There exists a 'CU_JIT_LTO' flag to "Enable link-time optimization
>> (-dlto) for device code".  This might help, assuming that it manages to
>> simplify the libgfortran I/O code such that the PTX JIT then understands
>> the call graph.  But: that's available only starting with recent
>> CUDA 11.4, so not a general solution -- if it works at all, which I've
>> not tested.
>>
>> Similarly, we could enable GCC's LTO for device code generation -- but
>> that's a big project, out of scope at this time.  And again, we don't
>> know if that at all helps this case.
>>
>> I see a few options:
>>
>> (a) Figure out what it is in the libgfortran I/O implementation that
>> causes "Stack size [...] cannot be statically determined", and re-work
>> that code to avoid that, or even disable certain things for nvptx, if
>> feasible.

> Shrink st_parameter_dt (it's part of the ABI though, kind of).  Lots of the
> bloat is from things that are unused for simpler I/O cases (so some
> "inheritance" could help), and lots of the bloat is from using
> string/length pairs using char * + size_t for what looks like could be
> encoded a lot more efficiently.
>
> There's probably not much low-hanging fruit.

(Similar comments in Janne's email.)


Well, as had to be expected, libgfortran I/O is really just one example,
but the underlying problem may also be triggered in other ways (via other
newlib/libc functions, for example).

So, really a generic solution seems to be called for.

>> (b) Also for GCC/OpenACC/nvptx use the GCC/OpenMP/nvptx '-msoft-stack'.
>> I don't really want to do that however: it does introduce a bit of
>> complexity in all the generated device code and run-time overhead that we
>> generally would like to avoid.

Directly using '-msoft-stack' isn't actually possible: it does implement
"one stack per 32-threads warp", but for OpenACC we need "one stack per
thread of a warp" (that is, each OpenACC 'vector' independently), and

[Bug tree-optimization/107767] [13 Regression] switch to table conversion happening even though using btq is better

2022-12-23 Thread marxin at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107767

--- Comment #15 from Martin Liška  ---
@Richi: Please send the patch for switch conversion in the next stage 1.

[patch, fortran] ICE on automatic array of derived type with DTIO

2022-12-23 Thread Jerry D via Gcc-patches


I have committed this as obvious and simple.

The master branch has been updated by Jerry DeLisle :

https://gcc.gnu.org/g:7e76cd96950f49ce21246d44780e972d86b2bcdd

commit r13-4862-g7e76cd96950f49ce21246d44780e972d86b2bcdd
Author: Steve Kargl 
Date:   Thu Dec 22 20:38:57 2022 -0800

Remove not needed assert macro which fails.

PR fortran/106731

gcc/fortran/ChangeLog:

* trans-array.cc (gfc_trans_auto_array_allocation): Remove
gcc_assert (!TREE_STATIC()).

gcc/testsuite/ChangeLog:

* gfortran.dg/pr106731.f90: New test.

[Bug c++/108203] Format string checking with __USE_MINGW_ANSI_STDIO

2022-12-23 Thread lh_mouse at 126 dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108203

--- Comment #2 from LIU Hao  ---
(In reply to nightstrike from comment #0)
> Bug report that came from it:
> https://sourceforge.net/p/mingw-w64/bugs/292/
> 

I think this should no longer be the case. Two years ago I submitted a patch
that made the `ms_printf` attribute accept `%lld`, so there is now a universal
modifier for `long long`.

The issue here appears to be that `printf`, which is a 'well known' function to
GCC, has a pre-defined attribute. If a new `*_printf` attribute is declared, it
gets both, as pointed out by Andrew Pinski.

[committed] libstdc++: Fix Darwin bootstrap error in src/c++20/tzdb.cc

2022-12-23 Thread Jonathan Wakely via Gcc-patches

A fix for another bootstrap error caused by yesterday's C++20 time zone
commit, for macOS this time.

I have only tested on x86_64-linux but Iain confirmed this works for his
darwin testers. Pushed to trunk.

-- >8 --

Mach-O requires weak symbols to have a definition, so add a default
implementation of __gnu_cxx::zoneinfo_dir_override.

libstdc++-v3/ChangeLog:

* src/c++20/tzdb.cc [__APPLE__] (zoneinfo_dir_override): Add
definition.
---
 libstdc++-v3/src/c++20/tzdb.cc | 13 ++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/libstdc++-v3/src/c++20/tzdb.cc b/libstdc++-v3/src/c++20/tzdb.cc
index a02bcd4aec7..5f5c4199f65 100644
--- a/libstdc++-v3/src/c++20/tzdb.cc
+++ b/libstdc++-v3/src/c++20/tzdb.cc
@@ -52,6 +52,10 @@
 # endif
 #endif
 
+#ifndef _GLIBCXX_ZONEINFO_DIR
+# define _GLIBCXX_ZONEINFO_DIR "/usr/share/zoneinfo"
+#endif
+
 namespace __gnu_cxx
 {
 #ifdef _AIX
@@ -59,6 +63,12 @@ namespace __gnu_cxx
   const char* (*zoneinfo_dir_override)() = nullptr;
 #else
   [[gnu::weak]] const char* zoneinfo_dir_override();
+
+#ifdef __APPLE__
+  // Need a weak definition for Mach-O.
+  [[gnu::weak]] const char* zoneinfo_dir_override()
+  { return _GLIBCXX_ZONEINFO_DIR; }
+#endif
 #endif
 }
 
@@ -934,9 +944,6 @@ namespace std::chrono
 return info;
   }
 
-#ifndef _GLIBCXX_ZONEINFO_DIR
-# define _GLIBCXX_ZONEINFO_DIR "/usr/share/zoneinfo"
-#endif
  namespace
  {
 string
-- 
2.38.1
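
The idea of a weak default definition that another translation unit may override can be sketched in isolation as follows. All names here are hypothetical (prefixed `example_`) and merely mirror the shape of the patch; `[[gnu::weak]]` is a GCC/Clang extension, so this assumes one of those compilers.

```cpp
#include <string>

// Hypothetical names; the default directory mirrors the one in the patch.
#ifndef EXAMPLE_ZONEINFO_DIR
# define EXAMPLE_ZONEINFO_DIR "/usr/share/zoneinfo"
#endif

// Weak default definition. A non-weak definition of the same function in
// another translation unit would take precedence at link time.
[[gnu::weak]] const char* example_zoneinfo_dir_override()
{
  return EXAMPLE_ZONEINFO_DIR;
}

// Build the path of a zone file below whichever directory is in effect.
std::string example_zoneinfo_file(const std::string& name)
{
  return std::string(example_zoneinfo_dir_override()) + "/" + name;
}
```

Because the weak function always has a definition, callers never need to test its address against null, which fits the Mach-O requirement described in the commit message.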

[Bug libstdc++/108210] error: 'mutex' does not name a type; did you mean 'minutes'? for x86_64-w64-mingw32 target with win32 thread model

2022-12-23 Thread unlvsur at live dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108210

cqwrteur  changed:

   What|Removed |Added

 CC||unlvsur at live dot com

--- Comment #1 from cqwrteur  ---
Created attachment 54152
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54152=edit
config

nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2')

2022-12-23 Thread Thomas Schwinge

Hi!

On 2022-12-23T14:35:16+0100, I wrote:
> On 2022-12-02T14:35:35+0100, I wrote:
>> On 2022-12-01T22:13:38+0100, I wrote:
>>> I'm working on support for global constructors/destructors with
>>> GCC/nvptx
>>
>> See "nvptx: Support global constructors/destructors via 'collect2'"
>> [posted before]
>
> Building on that, attached is now the additional "for offloading" piece:
> "nvptx: Support global constructors/destructors via 'collect2' for offloading".
> OK to push?

Now really attached.

> I did manually test this (by putting a few constructors/destructors into
> 'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed),
> and also in my WIP development tree with standard libgfortran
> constructors (with 'LIBGFOR_MINIMAL' disabled).


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company; Managing Directors: Thomas Heurung,
Frank Thürauf; Registered office: München; Registration court: München,
HRB 106955
From fb67006eeca0c8e2bfdf86576ed3109dacaf6868 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Wed, 30 Nov 2022 22:09:35 +0100
Subject: [PATCH] nvptx: Support global constructors/destructors via 'collect2'
 for offloading

This extends "nvptx: Support global constructors/destructors via 'collect2'"
for offloading.

	libgcc/
	* config/nvptx/crtstuff.c ["mgomp"]
	(__do_global_ctors__entry__mgomp)
	(__do_global_dtors__entry__mgomp): New.
	[!"mgomp"] (__do_global_ctors__entry, __do_global_dtors__entry):
	New.
	libgomp/
	* plugin/plugin-nvptx.c (nvptx_do_global_cdtors): New.
	(nvptx_close_device, GOMP_OFFLOAD_load_image)
	(GOMP_OFFLOAD_unload_image): Call it.
---
 libgcc/config/nvptx/crtstuff.c |  64 ++-
 libgomp/plugin/plugin-nvptx.c  | 113 -
 2 files changed, 175 insertions(+), 2 deletions(-)

diff --git a/libgcc/config/nvptx/crtstuff.c b/libgcc/config/nvptx/crtstuff.c
index 0823fc49901..8dc80687e0a 100644
--- a/libgcc/config/nvptx/crtstuff.c
+++ b/libgcc/config/nvptx/crtstuff.c
@@ -29,6 +29,14 @@
files (via 'CRT_BEGIN' and 'CRT_END'): 'crtbegin.o' and 'crtend.o', but we
do so anyway, for symmetry with other configurations.  */
 
+
+/* See 'crt0.c', 'mgomp.c'.  */
+#if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+extern void *__nvptx_stacks[32] __attribute__((shared,nocommon));
+extern unsigned __nvptx_uni[32] __attribute__((shared,nocommon));
+#endif
+
+
 #ifdef CRT_BEGIN
 
 void
@@ -37,6 +45,33 @@ __do_global_ctors (void)
   DO_GLOBAL_CTORS_BODY;
 }
 
+/* Need '.entry' wrapper for offloading.  */
+
+# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+
+__attribute__((kernel)) void __do_global_ctors__entry__mgomp (void *);
+
+void
+__do_global_ctors__entry__mgomp (void *nvptx_stacks_0)
+{
+  __nvptx_stacks[0] = nvptx_stacks_0;
+  __nvptx_uni[0] = 0;
+
+  __do_global_ctors ();
+}
+
+# else
+
+__attribute__((kernel)) void __do_global_ctors__entry (void);
+
+void
+__do_global_ctors__entry (void)
+{
+  __do_global_ctors ();
+}
+
+# endif
+
 #elif defined(CRT_END) /* ! CRT_BEGIN */
 
 void
@@ -45,7 +80,7 @@ __do_global_dtors (void)
   /* In this configuration here, there's no way that "this routine is run more
  than once [...] when exit is called recursively": for nvptx target, the
  call to '__do_global_dtors' is registered via 'atexit', which doesn't
- re-enter a function already run.
+ re-enter a function already run, and neither does nvptx offload target.
  Therefore, we do *not* "arrange to remember where in the list we left off
  processing".  */
   func_ptr *p;
@@ -53,6 +88,33 @@ __do_global_dtors (void)
 (*p++) ();
 }
 
+/* Need '.entry' wrapper for offloading.  */
+
+# if defined(__nvptx_softstack__) && defined(__nvptx_unisimt__)
+
+__attribute__((kernel)) void __do_global_dtors__entry__mgomp (void *);
+
+void
+__do_global_dtors__entry__mgomp (void *nvptx_stacks_0)
+{
+  __nvptx_stacks[0] = nvptx_stacks_0;
+  __nvptx_uni[0] = 0;
+
+  __do_global_dtors ();
+}
+
+# else
+
+__attribute__((kernel)) void __do_global_dtors__entry (void);
+
+void
+__do_global_dtors__entry (void)
+{
+  __do_global_dtors ();
+}
+
+# endif
+
 #else /* ! CRT_BEGIN && ! CRT_END */
 #error "One of CRT_BEGIN or CRT_END must be defined."
 #endif
diff --git a/libgomp/plugin/plugin-nvptx.c b/libgomp/plugin/plugin-nvptx.c
index fcc97c6e0d5..395639537e8 100644
--- a/libgomp/plugin/plugin-nvptx.c
+++ b/libgomp/plugin/plugin-nvptx.c
@@ -338,6 +338,11 @@ struct ptx_device
 
 static struct ptx_device **ptx_devices;
 
+static bool nvptx_do_global_cdtors (CUmodule, struct ptx_device *,
+const char *);
+static size_t nvptx_stacks_size ();
+static void *nvptx_stacks_acquire (struct ptx_device *, size_t, int);
+
 static inline struct nvptx_thread *
 nvptx_thread (void)
 {
@@ -557,6 +562,17 @@ nvptx_close_device (struct ptx_device *ptx_dev)
   if (!ptx_dev)
 return true;
 
+  bool ret = true;
+
+

[Bug libstdc++/108210] New: error: 'mutex' does not name a type; did you mean 'minutes'? for x86_64-w64-mingw32 target with win32 thread model

2022-12-23 Thread unlvsur at live dot com via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108210

Bug ID: 108210
   Summary: error: 'mutex' does not name a type; did you mean
'minutes'? for x86_64-w64-mingw32 target with win32
thread model
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: libstdc++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: unlvsur at live dot com
  Target Milestone: ---

/home/cqwrteur/toolchains_build/gcc/libstdc++-v3/src/c++20/tzdb.cc:565:5:
error: 'mutex' does not name a type; did you mean 'minutes'?
  565 | mutex infos_mutex;
  | ^
  | minutes

The win32 thread model does not provide mutex, lock_guard, and other threading
mechanisms.

However, this can be implemented easily with the Win32 CriticalSection API.
https://github.com/cppfastio/fast_io/blob/master/include/fast_io_hosted/threads/mutex/win32_critical_section.h

nvptx: Support global constructors/destructors via 'collect2' for offloading (was: nvptx: Support global constructors/destructors via 'collect2')

2022-12-23 Thread Thomas Schwinge

Hi!

On 2022-12-02T14:35:35+0100, I wrote:
> On 2022-12-01T22:13:38+0100, I wrote:
>> I'm working on support for global constructors/destructors with
>> GCC/nvptx
>
> See "nvptx: Support global constructors/destructors via 'collect2'"
> [posted before]

Building on that, attached is now the additional "for offloading" piece:
"nvptx: Support global constructors/destructors via 'collect2' for offloading".
OK to push?

I did manually test this (by putting a few constructors/destructors into
'libgomp/config/nvptx/oacc-parallel.c', and observing them be executed),
and also in my WIP development tree with standard libgfortran
constructors (with 'LIBGFOR_MINIMAL' disabled).

Regards
 Thomas
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company (GmbH); Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Commercial register: 
München, HRB 106955

[Bug middle-end/108209] New: goof in genmatch.cc:commutative_op

2022-12-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108209

Bug ID: 108209
   Summary: goof in genmatch.cc:commutative_op
   Product: gcc
   Version: 13.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: middle-end
  Assignee: unassigned at gcc dot gnu.org
  Reporter: amonakov at gcc dot gnu.org
  Target Milestone: ---

It pretends that define_operator_list is commutative when its first member is
NOT commutative:

  if (user_id *uid = dyn_cast<user_id *> (id))
    {
      int res = commutative_op (uid->substitutes[0]);
      if (res < 0)
        return 0;
      for (unsigned i = 1; i < uid->substitutes.length (); ++i)
        if (res != commutative_op (uid->substitutes[i]))
          return -1;
      return res;
    }

The first 'return 0' should be 'return -1' instead.
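
The effect of the proposed change can be seen in a small stand-alone model.
This is only a sketch, not genmatch code: it assumes (from the snippet above)
that a negative result means "not commutative" and a non-negative result is
the index of the commutative operand, and it replaces each operator in the
list by that per-operator result.  The names commutative_op_reported and
commutative_op_proposed are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each element stands in for commutative_op() of one member of a
// define_operator_list: -1 = not commutative, >= 0 = commutative operand index.
// (Hypothetical model; the real function works on id_base objects.)

// Behaviour as reported: a non-commutative first member yields 0,
// which callers read as "commutative in operand 0".
int commutative_op_reported(const std::vector<int>& members) {
  int res = members[0];
  if (res < 0)
    return 0;
  for (std::size_t i = 1; i < members.size(); ++i)
    if (res != members[i])
      return -1;
  return res;
}

// Behaviour with the suggested fix: a non-commutative first member
// makes the whole list non-commutative.
int commutative_op_proposed(const std::vector<int>& members) {
  int res = members[0];
  if (res < 0)
    return -1;
  for (std::size_t i = 1; i < members.size(); ++i)
    if (res != members[i])
      return -1;
  return res;
}
```

With a list whose first member is not commutative, the reported version
answers 0 while the proposed version answers -1; lists whose members all
agree are unaffected.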

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Jiufu Guo via Gcc-patches

Hi,

Jiufu Guo via Gcc-patches  writes:

> Hi,
>
> Richard Biener  writes:
>
>> On Thu, 22 Dec 2022, guojiufu wrote:
>>
>>> Hi,
>>> 
>>> On 2022-12-21 15:30, Richard Biener wrote:
>>> > On Wed, 21 Dec 2022, Jiufu Guo wrote:
>>> > 
>>> >> Hi,
>>> >> 
>>> >> This patch fixes an issue with parameter access when the
>>> >> parameter is of struct type, is passed through integer registers,
>>> >> and a floating-point member is accessed, as in the code below:
>>> >> 
>>> >> typedef struct DF {double a[4]; long l; } DF;
>>> >> double foo_df (DF arg){return arg.a[3];}
>>> >> 
>>> >> On ppc64le, with trunk gcc, "std 6,-24(1) ; lfd 1,-24(1)" is
>>> >> generated.  While instruction "mtvsrd 1, 6" would be enough for
>>> >> this case.
>>> > 
>>> > So why do we end up spilling for PPC?
>>> 
>>> Good question! According to GCC source code (in function.cc/expr.cc),
>>> it is common behavior: using "word_mode" to store the parameter to stack,
>>> And using the field's mode (e.g. float mode) to load from the stack.
>>> But with some tries, I fail to construct cases on many platforms.
>>> So, I convert the fix to a target hook and implemented the rs6000 part
>>> first.
>>> 
>>> > 
>>> > struct X { int i; float f; };
>>> > 
>>> > float foo (struct X x)
>>> > {
>>> >   return x.f;
>>> > }
>>> > 
>>> > does pass the structure in $RDI on x86_64 and we manage (with
>>> > optimization, with -O0 we spill) to generate
>>> > 
>>> > shrq    $32, %rdi
>>> > movd    %edi, %xmm0
>>> > 
>>> > and RTL expansion generates
>>> > 
>>> > (note 4 1 2 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>> > (insn 2 4 3 2 (set (reg/v:DI 83 [ x ])
>>> > (reg:DI 5 di [ x ])) "t.c":4:1 -1
>>> >  (nil))
>>> > (note 3 2 6 2 NOTE_INSN_FUNCTION_BEG)
>>> > (insn 6 3 7 2 (parallel [
>>> > (set (reg:DI 85)
>>> > (ashiftrt:DI (reg/v:DI 83 [ x ])
>>> > (const_int 32 [0x20])))
>>> > (clobber (reg:CC 17 flags))
>>> > ]) "t.c":5:11 -1
>>> >  (nil))
>>> > (insn 7 6 8 2 (set (reg:SI 86)
>>> > (subreg:SI (reg:DI 85) 0)) "t.c":5:11 -1
>>> >  (nil))
>>> > 
>>> > I would imagine that for the ppc case we only see the subreg here
>>> > which should be even easier to optimize.
>>> > 
>>> > So how's this not fixable by providing proper patterns / subreg
>>> > capabilities?  Looking a bit at the RTL we have the issue might
>>> > be that nothing seems to handle CSE of
>>> > 
>>> 
>>> This case is also related to 'parameter on struct', PR89310 is
>>> just for this case. On trunk, it is fixed.
>>> One difference: the parameter is in DImode, and passed via an
>>> integer register for "{int i; float f;}".
>>> But for "{double a[4]; long l;}", the parameter is in BLKmode,
>>> and stored to stack during the argument setup.
>>
>> OK, so this would be another case for "heuristics" to use
>> sth different than word_mode for storing, but of course
>> the arguments are in integer registers and using different
>> modes can for example prohibit store-multiple instruction use.
>>
>> As I said in the related thread an RTL expansion time "SRA"
>> with the incoming argument assignment in mind could make
>> more optimal decisions for these kind of special-cases.
>
> Thanks a lot for your comments!
>
> Yeap! Using SRA-like analysis during expansion for parameter
> and returns (and may also some field accessing) would be a
> generic improvement for this kind of issue (PR101926 collected
> a lot of them).
> While we may still need some work for various ABIs and different
> targets, to analyze where the 'struct field' come from
> (int/float/vector/.. registers, or stack) and how the struct
> need to be handled (keep in pseudo or store in the stack).
> This may indicate a large amount of changes for param_setup code.
>
> To reduce risk, I'm just drafting straightforward patches for
> special cases currently, like:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
> and this patch.
>
>>
>>> > (note 8 0 5 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
>>> > (insn 5 8 7 2 (set (mem/c:DI (plus:DI (reg/f:DI 110 sfp)
>>> > (const_int 56 [0x38])) [2 arg+24 S8 A64])
>>> > (reg:DI 6 6)) "t.c":2:23 679 {*movdi_internal64}
>>> >  (expr_list:REG_DEAD (reg:DI 6 6)
>>> > (nil)))
>>> > (note 7 5 10 2 NOTE_INSN_FUNCTION_BEG)
>>> > (note 10 7 15 2 NOTE_INSN_DELETED)
>>> > (insn 15 10 16 2 (set (reg/i:DF 33 1)
>>> > (mem/c:DF (plus:DI (reg/f:DI 110 sfp)
>>> > (const_int 56 [0x38])) [1 arg.a[3]+0 S8 A64])) "t.c":2:40
>>> > 576 {*movdf_hardfloat64}
>>> >  (nil))
>>> > 
>>> > Possibly because the store and load happen in a different mode?  Can
>>> > you see why CSE doesn't handle this (producing a subreg)?  On
>>> 
>>> Yes, exactly! For "{double a[4]; long l;}", because the store and load
>>> are using a different mode, and then CSE does not optimize it.  This
>>> patch makes the store and load using the same mode (DImode), and then
>>> leverage CSE to handle it.
>>
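
The mode mismatch described above can be pictured at source level.  This is
a sketch under assumptions, not part of the patch: it assumes a 64-bit IEEE
double, and the helper names word_to_double and double_to_word are
hypothetical.  The 64-bit word that arrives in an integer register already
holds the bits of arg.a[3], so obtaining the double only needs a bit-for-bit
reinterpretation (which a direct register move such as mtvsrd provides), not
a store in one mode followed by a load in another.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(double) == sizeof(std::uint64_t),
              "sketch assumes a 64-bit double");

// View the bits of a double as the 64-bit word an integer register would carry.
std::uint64_t double_to_word(double d) {
  std::uint64_t w;
  std::memcpy(&w, &d, sizeof w);
  return w;
}

// Reinterpret a 64-bit word as a double without changing any bit.
// Conceptually this is the DImode -> DFmode step the thread discusses.
double word_to_double(std::uint64_t w) {
  double d;
  std::memcpy(&d, &w, sizeof d);
  return d;
}
```

Compilers commonly turn such a fixed-size memcpy into a plain register move,
which is the kind of code the thread is asking for in the BLKmode case.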

Re: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-23 Thread 钟居哲

Hi, Andreas. Thank you for reporting this.
Even though I didn't reproduce this error, I have an idea to fix it:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/609045.html 
Would you mind testing this patch for me before merging it?
Thanks.


juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2022-12-23 18:53
To: juzhe.zhong
CC: gcc-patches; kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support
How has this been tested?
 
In file included from ../../gcc/config/riscv/riscv-vsetvl.cc:89:
../../gcc/config/riscv/riscv-vsetvl.h: In member function 
'riscv_vector::avl_info riscv_vector::vl_vtype_info::get_avl_info() const':
../../gcc/config/riscv/riscv-vsetvl.h:175:43: error: implicitly-declared 
'constexpr riscv_vector::avl_info::avl_info(const riscv_vector::avl_info&)' is 
deprecated [-Werror=deprecated-copy]
  175 |   avl_info get_avl_info () const { return m_avl; }
  |   ^
../../gcc/config/riscv/riscv-vsetvl.h:131:13: note: because 
'riscv_vector::avl_info' has user-provided 'riscv_vector::avl_info& 
riscv_vector::avl_info::operator=(const riscv_vector::avl_info&)'
  131 |   avl_info &operator= (const avl_info &);
  | ^~~~
../../gcc/config/riscv/riscv-vsetvl.cc: In function 'bool 
change_insn(rtl_ssa::function_info*, rtl_ssa::insn_change, rtl_ssa::insn_info*, 
rtx)':
../../gcc/config/riscv/riscv-vsetvl.cc:823:27: error: unquoted whitespace 
character '\x0a' in format [-Werror=format-diag]
  823 |   pp_printf (&pp, "\n");
  |   ^~~~
../../gcc/config/riscv/riscv-vsetvl.cc:847:27: error: unquoted whitespace 
character '\x0a' in format [-Werror=format-diag]
  847 |   pp_printf (&pp, "\n");
  |   ^~~~
../../gcc/config/riscv/riscv-vsetvl.cc: In constructor 
'riscv_vector::vl_vtype_info::vl_vtype_info(riscv_vector::avl_info, uint8_t, 
riscv_vector::vlmul_type, uint8_t, bool, bool)':
../../gcc/config/riscv/riscv-vsetvl.cc:905:5: error: implicitly-declared 
'constexpr riscv_vector::avl_info::avl_info(const riscv_vector::avl_info&)' is 
deprecated [-Werror=deprecated-copy]
  905 |   : m_avl (avl_in), m_sew (sew_in), m_vlmul (vlmul_in), m_ratio 
(ratio_in),
  | ^~
../../gcc/config/riscv/riscv-vsetvl.cc:859:1: note: because 
'riscv_vector::avl_info' has user-provided 'riscv_vector::avl_info& 
riscv_vector::avl_info::operator=(const riscv_vector::avl_info&)'
  859 | avl_info::operator= (const avl_info &other)
  | ^~~~
../../gcc/config/riscv/riscv-vsetvl.cc: In member function 'void 
riscv_vector::vector_insn_info::dump(FILE*) const':
../../gcc/config/riscv/riscv-vsetvl.cc:1366:27: error: unquoted whitespace 
character '\x0a' in format [-Werror=format-diag]
 1366 |   pp_printf (&pp, "\n");
  |   ^~~~
cc1plus: all warnings being treated as errors
make[3]: *** [../../gcc/config/riscv/t-riscv:59: riscv-vsetvl.o] Error 1
 
-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

Re: testsuite under wine

2022-12-23 Thread Eric Pouech


On 23/12/2022 at 11:36, NightStrike wrote:

On Wed, Dec 21, 2022 at 11:37 PM Jacob Bachmeyer  wrote:

NightStrike wrote:

[...]
Second, the problems with extra \r's still remain, but I think we've
generally come to think that that part isn't Wine and is instead
either the testsuite or deja.  So I'll keep those replies to Jacob's
previous message.


Most likely, it is a combination of the MinGW libc (which emits "\r\n"
for end-of-line in accordance with Windows convention) and the kernel
terminal driver (which passes "\r" and translates "\n" to "\r\n" in
accordance with POSIX convention).  Wine, short of trying to translate
"\r\n" back to "\n" in accordance with POSIX conventions (and likely
making an even bigger mess---does Wine know if a handle is supposed to
be text or binary?) cannot really fix this, so the testsuite needs to
handle non-POSIX-standard line endings.  (The Rust tests probably have
an outright bug if the newlines are being duplicated.)

You may be onto something here.  I ran wine under script as `script -c
"wine64 ./a.exe" out` (thanks, Arsen!), and it had the same extra \r
prepended to the \r\n.  I was making the mistake previously of running
wine manually and capturing it to a file as `wine64 ./a.exe > out`,
which, as several have pointed out in this thread, disables
the quirk, so of course it didn't reveal any problems.  I'm behind,
but I'll catch up to you guys eventually :)

So at least we know for sure that this particular instance of extra
characters is coming from Wine.  Maybe Wine can be smart enough to
only translate \n into \r\n instead of translating \r\n into \r\r\n.
Jacek / Eric, comments here?  I'm happy to try another patch, the
first one was great.

Actually, it depends on how the file has been opened by the application. 
If it's opened in binary mode, no \n => \r\n translation takes place.

But if the file is opened in text mode, Wine just does what the 
application requires, which is the \n => \r\n translation.

(And by default, stdout and stderr are opened in text mode.)

IMO, you should not expect more from Wine. Wine's goal is to run Windows 
applications on Unix; it is not to run Windows applications on Unix while 
requiring that they behave as if they had been written for Linux semantics.

Anyway, we (Wine) have to go back to the blackboard to figure out a solution 
for disabling conhost nicely.



in the meantime, you could use (without any patch to wine), some wrapper 
(bash) script like:


#!/bin/bash

# assumes wine is in $PATH
# "$@" (rather than $*) keeps arguments containing spaces intact

case "$1" in
    --unix-raw) shift; cat | wine "$@" 1> >(tee /dev/null) 2> >(tee /dev/null >&2);;
    --unix-lf)  shift; cat | wine "$@" 1> >(tee /dev/null | sed 's/\r$//') 2> >(tee /dev/null | sed 's/\r$//' >&2);;
    *) wine "$@";;
esac

using --unix-raw will just disable conhost (hence shall remove most of 
the ansi sequences reported)


using --unix-lf will also disable conhost and replace \r\n with \n

this should make wine behave as you expect, and still allow using 
proper redirection and piping on the shell script if needed (without 
altering wine's behavior)


(this could be further improved by not adding a pipe for fd:s that are 
not tty, or adapted to be triggered when, say, TERM=dumb)
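
For a harness that would rather normalise line endings itself than pipe
through sed, the per-line effect of sed 's/\r$//' can be sketched as below.
This is an illustration only; strip_trailing_cr is a hypothetical helper,
and it assumes the input has already been split into lines with the final
\n removed.

```cpp
#include <cassert>
#include <string>

// Remove at most one trailing carriage return from a line whose newline
// has already been stripped, mirroring what sed 's/\r$//' does per line.
std::string strip_trailing_cr(std::string line) {
  if (!line.empty() && line.back() == '\r')
    line.pop_back();
  return line;
}
```

Note that, like the sed expression, this removes a single \r only: a line
that ended in \r\r\n (the doubled form mentioned earlier in the thread)
would still carry one \r afterwards, and a \r in the middle of a line is
left alone.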


HTH

[PATCH] RISC-V: Fix ICE for avl_info deprecated copy and pp_print error.

2022-12-23 Thread juzhe . zhong

From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vsetvl.cc (change_insn): Remove pp_print.
(avl_info::avl_info): Add copy function.
(vector_insn_info::dump): Remove pp_print.
* config/riscv/riscv-vsetvl.h: Add copy function.

---
 gcc/config/riscv/riscv-vsetvl.cc | 32 
 gcc/config/riscv/riscv-vsetvl.h  |  1 +
 2 files changed, 9 insertions(+), 24 deletions(-)

diff --git a/gcc/config/riscv/riscv-vsetvl.cc b/gcc/config/riscv/riscv-vsetvl.cc
index 01530c1ae75..a55b5a1c394 100644
--- a/gcc/config/riscv/riscv-vsetvl.cc
+++ b/gcc/config/riscv/riscv-vsetvl.cc
@@ -810,15 +810,6 @@ change_insn (function_info *ssa, insn_change change, 
insn_info *insn,
   fprintf (dump_file, "\nChange PATTERN of insn %d from:\n",
   INSN_UID (rinsn));
   print_rtl_single (dump_file, PATTERN (rinsn));
-  if (dump_flags & TDF_DETAILS)
-   {
- fprintf (dump_file, "RTL_SSA info:\n");
- pretty_printer pp;
- pp.buffer->stream = dump_file;
- insn->print_full (&pp);
- pp_printf (&pp, "\n");
- pp_flush (&pp);
-   }
 }
 
   insn_change_watermark watermark;
@@ -834,19 +825,16 @@ change_insn (function_info *ssa, insn_change change, 
insn_info *insn,
 {
   fprintf (dump_file, "\nto:\n");
   print_rtl_single (dump_file, PATTERN (rinsn));
-  if (dump_flags & TDF_DETAILS)
-   {
- fprintf (dump_file, "RTL_SSA info:\n");
- pretty_printer pp;
- pp.buffer->stream = dump_file;
- insn->print_full (&pp);
- pp_printf (&pp, "\n");
- pp_flush (&pp);
-   }
 }
   return true;
 }
 
+avl_info::avl_info (const avl_info &other)
+{
+  m_value = other.get_value ();
+  m_source = other.get_source ();
+}
+
 avl_info::avl_info (rtx value_in, set_info *source_in)
   : m_value (value_in), m_source (source_in)
 {}
@@ -1355,12 +1343,8 @@ vector_insn_info::dump (FILE *file) const
 {
   if (get_insn ())
{
- fprintf (file, "RTL_SSA insn_info=");
- pretty_printer pp;
- pp.buffer->stream = file;
- get_insn ()->print_full (&pp);
- pp_printf (&pp, "\n");
- pp_flush (&pp);
+ fprintf (file, "The real INSN=");
+ print_rtl_single (file, get_insn ()->rtl ());
}
   if (get_dirty_pat ())
{
diff --git a/gcc/config/riscv/riscv-vsetvl.h b/gcc/config/riscv/riscv-vsetvl.h
index ad9bb27cebf..6f27004fab1 100644
--- a/gcc/config/riscv/riscv-vsetvl.h
+++ b/gcc/config/riscv/riscv-vsetvl.h
@@ -125,6 +125,7 @@ private:
 
 public:
   avl_info () : m_value (NULL_RTX), m_source (nullptr) {}
+  avl_info (const avl_info &);
   avl_info (rtx, rtl_ssa::set_info *);
   rtx get_value () const { return m_value; }
   rtl_ssa::set_info *get_source () const { return m_source; }
-- 
2.36.3
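
For readers unfamiliar with the -Wdeprecated-copy diagnostic this patch
addresses, here is a minimal stand-alone sketch.  It is not the riscv-vsetvl
code: the class avl_like and the function copy_of are hypothetical, and an
int stands in for the real members.  A class with a user-provided copy
assignment operator but no declared copy constructor triggers the warning
wherever the implicit copy constructor is used; declaring the copy
constructor explicitly, as the patch does for avl_info, avoids it.

```cpp
#include <cassert>

// Hypothetical miniature of a class with a user-provided operator=.
class avl_like {
public:
  avl_like() : m_value(0) {}
  explicit avl_like(int v) : m_value(v) {}

  // Explicitly declared copy constructor.  Without this declaration,
  // copying an avl_like would rely on the implicitly-declared copy
  // constructor, which GCC flags under -Wdeprecated-copy because
  // operator= below is user-provided.
  avl_like(const avl_like& other) : m_value(other.m_value) {}

  avl_like& operator=(const avl_like& other) {
    m_value = other.m_value;
    return *this;
  }

  int get_value() const { return m_value; }

private:
  int m_value;
};

// Returning by value copies the argument, the same pattern as
// "avl_info get_avl_info () const { return m_avl; }" in the log.
avl_like copy_of(const avl_like& a) { return a; }
```

Compiling a variant without the copy constructor using a recent g++ and
-Wextra should show the warning quoted in the report; treat that as
something to verify locally rather than a guarantee for every GCC version.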

Re: [PATCH] loading float member of parameter stored via int registers

2022-12-23 Thread Jiufu Guo via Gcc-patches

Hi,

Segher Boessenkool  writes:

> On Thu, Dec 22, 2022 at 11:28:01AM +, Richard Biener wrote:
>> On Thu, 22 Dec 2022, Jiufu Guo wrote:
>> > To reduce risk, I'm just draft straightforward patches for
>> > special cases currently, Like:
>> > https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608081.html
>> > and this patch.
>> 
>> Heh, yes - though I'm not fond of special-casing things.  RTL
>> expansion is already full of special cases :/
>
> And many of those are not useful at all (would be done by later passes),
> or are actively harmful.  Not to mention that expand is currently one of
> the most impregnable and undebuggable RTL passes.
>
> But there are also many things done during expand that although they
> should be done somewhat later, aren't actually done later at all
> currently.  So that needs fixing.
>
> Maybe things should go via an intermediate step, where all the decisions
> can be made, and then later we just have to translate the "low Gimple"
> or "RTL-Gimple" ("Rimple"?) to RTL.  A format that is looser in many
> ways than either RTL or Gimple.  A bit like Generic in that way.

Thanks for all your great comments!

BR,
Jeff (Jiufu)
>
>
> Segher

Re: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support

2022-12-23 Thread 钟居哲

Would you mind telling me how you reproduced these errors?
I failed to reproduce them. Thanks.



juzhe.zh...@rivai.ai
 
From: Andreas Schwab
Date: 2022-12-23 18:53
To: juzhe.zhong
CC: gcc-patches; kito.cheng; palmer
Subject: Re: [PATCH] RISC-V: Support VSETVL PASS for RVV support
How has this been tested?
 

[PATCH v6 10/11] OpenMP: Support OpenMP 5.0 "declare mapper" directives for C

2022-12-23 Thread Julian Brown

This patch adds support for "declare mapper" directives (and the "mapper"
modifier on "map" clauses) for C.  As for C++, arrays of custom-mapped
objects are not supported yet.

I've taken hints from the existing C support for "declare reduction"
directives: this works a little differently from C++ for things such as
looking up user-defined reductions (or user-defined mappers, in our case).

This version of the patch removes some unnecessary function setup/teardown
code from c_parser_omp_declare_mapper, and has been rebased (hence
simplified) wrt. refactoring done higher up this patch series.

2022-12-23  Julian Brown  

gcc/c/
* c-decl.cc (c_omp_mapper_id, c_omp_mapper_decl, c_omp_mapper_lookup,
c_omp_extract_mapper_directive, c_omp_map_array_section,
c_omp_scan_mapper_bindings_r, c_omp_scan_mapper_bindings): New
functions.
* c-objc-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks for C.
* c-parser.cc (c_parser_omp_clause_map): Add KIND parameter.  Handle
mapper modifier.
(c_parser_omp_all_clauses): Update call to c_parser_omp_clause_map with
new kind argument.
(c_parser_omp_target): Instantiate explicit mappers and record bindings
for implicit mappers.
(c_parser_omp_declare_mapper): Parse "declare mapper" directives.
(c_parser_omp_declare): Support "declare mapper".
* c-tree.h (c_omp_finish_mapper_clauses, c_omp_mapper_lookup,
c_omp_extract_mapper_directive, c_omp_map_array_section,
c_omp_mapper_id, c_omp_mapper_decl, c_omp_scan_mapper_bindings,
c_omp_instantiate_mappers): Add prototypes.
* c-typeck.cc (c_finish_omp_clauses): Handle GOMP_MAP_PUSH_MAPPER_NAME
and GOMP_MAP_POP_MAPPER_NAME.
(c_omp_finish_mapper_clauses): New function (langhook).

gcc/testsuite/
* c-c++-common/gomp/declare-mapper-4.c: Enable for C.
* c-c++-common/gomp/declare-mapper-5.c: Likewise.
* c-c++-common/gomp/declare-mapper-6.c: Likewise.
* c-c++-common/gomp/declare-mapper-7.c: Likewise.
* c-c++-common/gomp/declare-mapper-8.c: Likewise.
* c-c++-common/gomp/declare-mapper-9.c: Likewise.
* c-c++-common/gomp/declare-mapper-12.c: Enable for C.
* gcc.dg/gomp/declare-mapper-10.c: New test.
* gcc.dg/gomp/declare-mapper-11.c: New test.

libgomp/
* testsuite/libgomp.c-c++-common/declare-mapper-9.c: Enable for C.
* testsuite/libgomp.c-c++-common/declare-mapper-10.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-11.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-12.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-13.c: Likewise.
* testsuite/libgomp.c-c++-common/declare-mapper-14.c: Likewise.
---
 gcc/c/c-decl.cc   | 169 +++
 gcc/c/c-objc-common.h |  12 +
 gcc/c/c-parser.cc | 277 +-
 gcc/c/c-tree.h|   8 +
 gcc/c/c-typeck.cc |  15 +
 .../c-c++-common/gomp/declare-mapper-12.c |   2 +-
 .../c-c++-common/gomp/declare-mapper-4.c  |   2 +-
 .../c-c++-common/gomp/declare-mapper-5.c  |   2 +-
 .../c-c++-common/gomp/declare-mapper-6.c  |   2 +-
 .../c-c++-common/gomp/declare-mapper-7.c  |   2 +-
 .../c-c++-common/gomp/declare-mapper-8.c  |   2 +-
 .../c-c++-common/gomp/declare-mapper-9.c  |   2 +-
 gcc/testsuite/gcc.dg/gomp/declare-mapper-10.c |  61 
 gcc/testsuite/gcc.dg/gomp/declare-mapper-11.c |  33 +++
 .../libgomp.c-c++-common/declare-mapper-10.c  |   2 +-
 .../libgomp.c-c++-common/declare-mapper-11.c  |   2 +-
 .../libgomp.c-c++-common/declare-mapper-12.c  |   2 +-
 .../libgomp.c-c++-common/declare-mapper-13.c  |   2 +-
 .../libgomp.c-c++-common/declare-mapper-14.c  |   2 +-
 .../libgomp.c-c++-common/declare-mapper-9.c   |   2 +-
 20 files changed, 572 insertions(+), 29 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gomp/declare-mapper-10.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/declare-mapper-11.c

diff --git a/gcc/c/c-decl.cc b/gcc/c/c-decl.cc
index e47ca6718b3e..de5a41ee0c02 100644
--- a/gcc/c/c-decl.cc
+++ b/gcc/c/c-decl.cc
@@ -13084,6 +13084,175 @@ c_check_omp_declare_reduction_r (tree *tp, int *, 
void *data)
   return NULL_TREE;
 }
 
+/* Return identifier to look up for omp declare mapper.  */
+
+tree
+c_omp_mapper_id (tree mapper_id)
+{
+  const char *p = NULL;
+
+  const char prefix[] = "omp declare mapper ";
+
+  if (mapper_id == NULL_TREE)
+p = "";
+  else if (TREE_CODE (mapper_id) == IDENTIFIER_NODE)
+p = IDENTIFIER_POINTER (mapper_id);
+  else
+return error_mark_node;
+
+  size_t lenp = sizeof (prefix);
+  size_t len = strlen (p);
+  char *name = XALLOCAVEC (char, lenp

[PATCH v6 08/11] OpenMP: C++ "declare mapper" support

2022-12-23 Thread Julian Brown

This is a new version of the patch to support OpenMP 5.0 "declare mapper"
functionality for C++.  As with the previously-posted version, arrays
of structs whose elements would be mapped via a user-defined mapper
remain unsupported.

This version of the patch uses a magic VAR_DECL instead of a magic
FUNCTION_DECL for representing mappers, which simplifies parsing
somewhat, and slightly reduces the number of places that need special-case
handling in the FE.  We use the DECL_INITIAL of the VAR_DECL to hold the
OMP_DECLARE_MAPPER definition.  To make types agree, we use the type of
the object to be mapped for both the var decl and the OMP_DECLARE_MAPPER
node itself.  Hence the OMP_DECLARE_MAPPER looks like a magic constant
of struct type in this particular case.

The magic var decl can go in all the places that the "declare mapper"
function decl went previously: at the top level of the program,
within a class definition (including template classes), and within a
function definition (including template functions).  In the class case
we conceptually use the C++-17-ism of defining the var decl "inline
static", equivalent to e.g.:

   [template ...]
   class bla {
 static inline omp declare mapper ... = #define omp declare mapper ..."
   };

(though of course we don't restrict the "declare mapper"-in-class syntax
to C++-17.)

The new representation necessitates some changes to template instantiation
-- declare mappers may trigger implicitly, so we must make sure they
are instantiated before they are needed (see changes to mark_used, etc.).

I've rearranged the processing done by the gimplify_scan_omp_clauses and
gimplify_adjust_omp_clauses functions so the order of the phases can
remain intact in the presence of declared mappers.  To do this, most
gimplification of clauses in gimplify_scan_omp_clauses has been moved
to gimplify_adjust_omp_clauses.  This allows e.g. struct sibling-list
handling and topological clause sorting to work with the non-gimplified
form of clauses in the latter function -- including those that arise
from mapper expansion.  This seems to work well now.

Relative to the last-posted version, this patch brings forward various
refactoring that was previously done by the C and Fortran "declare mapper"
support patches -- aiming to reduce churn.  E.g. nested mapper finding
and mapper instantiation has been moved to c-family/c-omp.cc so it can
be shared between C and C++, and omp_name_type in omp-general.h (used
as the key to hash mapper definitions) is already templatized ready for
Fortran support.

This patch does not synthesize default mappers that map each of a struct's
elements individually: whole-struct mappings are still done by copying
the block of memory containing the struct.  That works fine apart from
cases where a struct has a member that is a reference (to a pointer).
We could fix that by synthesizing a mapper for such cases (only), but
that hasn't been attempted yet.  (I think that means Jakub's concerns
about blow-up of element mappings won't be a problem until that's done.)

New tests added in {gcc,libgomp}/c-c++-common have been restricted to
C++ for now.

2022-11-30  Julian Brown  

gcc/c-family/
* c-common.h (omp_mapper_list): Add forward declaration.
(c_omp_find_nested_mappers, c_omp_instantiate_mappers): Add prototypes.
* c-omp.cc (c_omp_find_nested_mappers): New function.
(remap_mapper_decl_info): New struct.
(remap_mapper_decl_1, omp_instantiate_mapper,
c_omp_instantiate_mappers): New functions.

gcc/cp/
* constexpr.cc (reduced_constant_expression_p): Add OMP_DECLARE_MAPPER
case.
(cxx_eval_constant_expression, potential_constant_expression_1):
Likewise.
* cp-gimplify.cc (cxx_omp_finish_mapper_clauses): New function.
* cp-objcp-common.h (LANG_HOOKS_OMP_FINISH_MAPPER_CLAUSES,
LANG_HOOKS_OMP_MAPPER_LOOKUP, LANG_HOOKS_OMP_EXTRACT_MAPPER_DIRECTIVE,
LANG_HOOKS_OMP_MAP_ARRAY_SECTION): Define langhooks.
* cp-tree.h (lang_decl_base): Add omp_declare_mapper_p field.  Recount
spare bits comment.
(DECL_OMP_DECLARE_MAPPER_P): New macro.
(omp_mapper_id, cp_check_omp_declare_mapper, omp_instantiate_mappers,
cxx_omp_finish_mapper_clauses, cxx_omp_mapper_lookup,
cxx_omp_extract_mapper_directive, cxx_omp_map_array_section): Add
prototypes.
* decl.cc (check_initializer): Add OpenMP declare mapper support.
(cp_finish_decl): Set DECL_INITIAL for OpenMP declare mapper var decls
as appropriate.
* decl2.cc (mark_used): Instantiate OpenMP "declare mapper" magic var
decls.
* error.cc (dump_omp_declare_mapper): New function.
(dump_simple_decl): Use above.
* parser.cc (cp_parser_omp_clause_map): Add KIND parameter.  Support
"mapper" modifier.
(cp_parser_omp_all_clauses): Add KIND argument to
cp_parser_omp_clause_map call.

[PATCH v6 09/11] OpenMP: lvalue parsing for map clauses (C)

2022-12-23 Thread Julian Brown

This patch adds support for parsing general lvalues for OpenMP "map", "to"
and "from" clauses to the C front-end, similar to the previously-posted
patch for C++.

This version of the patch incorporates the patch to change uses of
TREE_LIST to the new OMP_ARRAY_SECTION tree code to represent OpenMP
array sections, and rejects array sections in certain expressions where
they make no sense (see new tests).

2022-12-22  Julian Brown  

gcc/c/
* c-pretty-print.cc (c_pretty_printer::postfix_expression,
c_pretty_printer::expression): Add OMP_ARRAY_SECTION support.
* c-parser.cc (c_parser_braced_init, c_parser_conditional_expression):
Don't allow OpenMP array section.
(c_parser_postfix_expression): Don't allow array section in statement
expression.
(c_parser_postfix_expression_after_primary): Add support
for OpenMP array section parsing.
(c_parser_expr_list): Don't allow OpenMP array section here.
(c_parser_omp_variable_list): Change ALLOW_DEREF parameter to
MAP_LVALUE.  Support parsing of general lvalues in "map", "to" and
"from" clauses.
(c_parser_omp_var_list_parens): Change ALLOW_DEREF parameter to
MAP_LVALUE.  Update call to c_parser_omp_variable_list.
(c_parser_oacc_data_clause, c_parser_omp_clause_to,
c_parser_omp_clause_from): Update calls to
c_parser_omp_var_list_parens.
* c-tree.h (c_omp_array_section_p): Add extern declaration.
(build_omp_array_section): Add prototype.
* c-typeck.cc (c_omp_array_section_p): Add flag.
(mark_exp_read): Support OMP_ARRAY_SECTION.
(build_omp_array_section): Add function.
(build_external_ref): Tweak error path for OpenMP array sections.
(handle_omp_array_sections_1): Use OMP_ARRAY_SECTION tree code instead
of TREE_LIST.  Handle more kinds of expressions.
(c_finish_omp_clauses): Use OMP_ARRAY_SECTION instead of TREE_LIST.
Check for supported expression types.

gcc/testsuite/
* gcc.dg/gomp/bad-array-section-c-1.c: New test.
* gcc.dg/gomp/bad-array-section-c-2.c: New test.
* gcc.dg/gomp/bad-array-section-c-3.c: New test.
* gcc.dg/gomp/bad-array-section-c-4.c: New test.
* gcc.dg/gomp/bad-array-section-c-5.c: New test.
* gcc.dg/gomp/bad-array-section-c-6.c: New test.
* gcc.dg/gomp/bad-array-section-c-7.c: New test.
* gcc.dg/gomp/bad-array-section-c-8.c: New test.

libgomp/
* testsuite/libgomp.c-c++-common/ind-base-4.c: New test.
* testsuite/libgomp.c-c++-common/unary-ptr-1.c: New test.
---
 gcc/c-family/c-pretty-print.cc          |  12 ++
 gcc/c/c-parser.cc                       | 187 +++---
 gcc/c/c-tree.h                          |   2 +
 gcc/c/c-typeck.cc                       | 109 --
 .../gcc.dg/gomp/bad-array-section-c-1.c |  16 ++
 .../gcc.dg/gomp/bad-array-section-c-2.c |  13 ++
 .../gcc.dg/gomp/bad-array-section-c-3.c |  24 +++
 .../gcc.dg/gomp/bad-array-section-c-4.c |  26 +++
 .../gcc.dg/gomp/bad-array-section-c-5.c |  15 ++
 .../gcc.dg/gomp/bad-array-section-c-6.c |  16 ++
 .../gcc.dg/gomp/bad-array-section-c-7.c |  26 +++
 .../gcc.dg/gomp/bad-array-section-c-8.c |  21 ++
 .../libgomp.c-c++-common/ind-base-4.c   |  50 +
 .../libgomp.c-c++-common/unary-ptr-1.c  |  16 ++
 14 files changed, 486 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-1.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-2.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-3.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-4.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-5.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-6.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-7.c
 create mode 100644 gcc/testsuite/gcc.dg/gomp/bad-array-section-c-8.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/ind-base-4.c
 create mode 100644 libgomp/testsuite/libgomp.c-c++-common/unary-ptr-1.c

diff --git a/gcc/c-family/c-pretty-print.cc b/gcc/c-family/c-pretty-print.cc
index c99b2ceffe65..d9954bd2b951 100644
--- a/gcc/c-family/c-pretty-print.cc
+++ b/gcc/c-family/c-pretty-print.cc
@@ -1615,6 +1615,17 @@ c_pretty_printer::postfix_expression (tree e)
       pp_c_right_bracket (this);
       break;
 
+    case OMP_ARRAY_SECTION:
+      postfix_expression (TREE_OPERAND (e, 0));
+      pp_c_left_bracket (this);
+      if (TREE_OPERAND (e, 1))
+        expression (TREE_OPERAND (e, 1));
+      pp_colon (this);
+      if (TREE_OPERAND (e, 2))
+        expression (TREE_OPERAND (e, 2));
+      pp_c_right_bracket (this);
+      break;
+
     case CALL_EXPR:
       {
         call_expr_arg_iterator iter;
@@ -2664,6 +2675,7 @@ c_pretty_printer::expression (tree e)

[PATCH v6 05/11] OpenMP: Pointers and member mappings

2022-12-23 Thread Julian Brown

This patch changes the mapping node arrangement used for array components
of derived types, e.g.:

  type T
    integer, pointer, dimension(:) :: arrptr
  end type T

  type(T) :: tvar
  [...]
  !$omp target map(tofrom: tvar%arrptr)

This will currently be mapped using three mapping nodes:

  GOMP_MAP_TO              tvar%arrptr        (the descriptor)
  GOMP_MAP_TOFROM          *tvar%arrptr%data  (the actual array data)
  GOMP_MAP_ALWAYS_POINTER  tvar%arrptr%data   (a pointer to the array data)

This follows OpenMP 5.0, 2.19.7.1 (or OpenMP 5.2, 5.8.3) "map Clause":

  "If a list item in a map clause is an associated pointer and the
   pointer is not the base pointer of another list item in a map clause
   on the same construct, then it is treated as if its pointer target
   is implicitly mapped in the same clause. For the purposes of the map
   clause, the mapped pointer target is treated as if its base pointer
   is the associated pointer."

However, we can also write this:

  map(to: tvar%arrptr) map(tofrom: tvar%arrptr(3:8))

and then instead we should follow (OpenMP 5.2, 5.8.3 "map Clause"):

  "For map clauses on map-entering constructs, if any list item has a base
   pointer for which a corresponding pointer exists in the data environment
   upon entry to the region and either a new list item or the corresponding
   pointer is created in the device data environment on entry to the region,
   then:
   1. [Fortran] The corresponding pointer variable is associated with
  a pointer target that has the same rank and bounds as the pointer
  target of the original pointer, such that the corresponding list item
  can be accessed through the pointer in a target region.
   2. The corresponding pointer variable becomes an attached pointer
  for the corresponding list item."

With this patch you can write the above mappings, and the mapping nodes
used to map pointers to array sections (with descriptors) now look
like this:

  1) map(to: tvar%arrptr)   -->
  GOMP_MAP_TO [implicit]  *tvar%arrptr%data  (the array data)
  GOMP_MAP_TO_PSET        tvar%arrptr        (the descriptor)
  GOMP_MAP_ATTACH_DETACH  tvar%arrptr%data

  2) map(tofrom: tvar%arrptr(3:8))   -->
  GOMP_MAP_TOFROM         *tvar%arrptr%data(3)  (size 8-3+1, etc.)
  GOMP_MAP_TO_PSET        tvar%arrptr
  GOMP_MAP_ATTACH_DETACH  tvar%arrptr%data      (bias 3, etc.)

In this case, we can determine in the front-end that the
whole-array/pointer mapping (1) is only needed to map the pointer --
so we drop it entirely.  (Note also that we set -- early -- the
OMP_CLAUSE_MAP_RUNTIME_IMPLICIT_P flag for whole-array-via-pointer
mappings. See below.)

In the middle end, we process mappings using the struct sibling-list
handling machinery by moving the "GOMP_MAP_TO_PSET" node from the middle
of the group of three mapping nodes to the proper sorted position after
the GOMP_MAP_STRUCT mapping:

  GOMP_MAP_STRUCT         tvar                  (len: 1)
  GOMP_MAP_TO_PSET        tvar%arrptr           (size: 64, etc.)  <--. moved here
  [...]                                                              |
  GOMP_MAP_TOFROM         *tvar%arrptr%data(3)  _____________________|
  GOMP_MAP_ATTACH_DETACH  tvar%arrptr%data

In another case, if we have an array of derived-type values "dtarr",
and mappings like:

  i = 1
  j = 1
  map(to: dtarr(i)%arrptr) map(tofrom: dtarr(j)%arrptr(3:8))

We still map the same way, but this time we cannot prove that the base
expressions "dtarr(i)" and "dtarr(j)" are the same in the front-end.
So we keep both mappings, but we move the "[implicit]" mapping of the
full-array reference to the end of the clause list in gimplify.cc (by
adjusting the topological sorting algorithm):

  GOMP_MAP_STRUCT         dtarr  (len: 2)
  GOMP_MAP_TO_PSET        dtarr(i)%arrptr
  GOMP_MAP_TO_PSET        dtarr(j)%arrptr
  [...]
  GOMP_MAP_TOFROM         *dtarr(j)%arrptr%data(3)  (size: 8-3+1)
  GOMP_MAP_ATTACH_DETACH  dtarr(j)%arrptr%data
  GOMP_MAP_TO [implicit]  *dtarr(i)%arrptr%data(1)  (size: whole array)
  GOMP_MAP_ATTACH_DETACH  dtarr(i)%arrptr%data

Always moving "[implicit]" full-array mappings after array-section
mappings (without that bit set) means that we'll avoid copying the whole
array unnecessarily -- even in cases where we can't prove that the arrays
are the same.

This version of the patch fixes some bugs with "enter data" and "exit
data" directives with this new mapping arrangement.  Also now if you
have mappings like this:

  !$omp target enter data map(to: dv, dv%arr(1:20))

The whole of the derived-type variable "dv" is mapped, so the
GOMP_MAP_TO_PSET for the array-section mapping can be dropped:

  GOMP_MAP_TO             dv

  GOMP_MAP_TO             *dv%arr%data
  GOMP_MAP_TO_PSET        dv%arr       <-- deleted (array section mapping)
  GOMP_MAP_ATTACH_DETACH  dv%arr%data

For struct components, the GOMP_MAP_TO_PSET mapping is turned into
GOMP_MAP_RELEASE at gimplify time for "exit data" directives.

2022-12-15  Julian Brown  

gcc/fortran/
* dependency.cc (gfc_omp_expr_prefix_same): New function.
*
