Re: Intel AVX10.1 Compiler Design and Support

2023-08-19 Thread Richard Biener via Gcc-patches



> On 20.08.2023 at 00:45, ZiNgA BuRgA via Gcc-patches wrote:
> 
> Hi,
> 
> With the proposed design of these switches, how would I restrict AVX10.1 to 
> particular AVX-512 subsets?
> 
> For example, usage of the |_mm256_rol_epi32| intrinsic should be compatible 
> on any AVX10/256 implementation, /as well as/ any AVX-512VL without AVX10 
> implementation (e.g. Skylake-X).  But how do I signal that I want 
> compatibility with both these targets?
> 
> * |-mavx512vl| lets the compiler use 512-bit registers -> incompatible
>   with 256-bit AVX10.
> * |-mavx512vl -mprefer-vector-width=256| might steer the compiler away
>   from 512-bit registers, but I don't think it guarantees it.

We’ve been taking these cases as bugs (but yes, intrinsics are still allowed, 
so in some cases it might prove difficult to guarantee this).

I don’t see any other way of doing what you want within the constraints of this 
design.

> * |-mavx10.1-256| lets the compiler use all Sapphire Rapids AVX-512
>   features at 256-bit wide (so in theory, it could choose to compile
>   it with |vpshldd|) -> incompatible with Skylake-X.
> * |-mavx10.1-256 -mno-avx512fp16 -mno-avx512...| will emit a warning
>   and ignore the attempts at disabling AVX-512 subsets.
> * |-mavx10.1-256 -mavx512vl| takes the /union/ of the features, not
>   the /intersection./
> 
> Is there something like |-mavx512vl -mmax-vector-width=256|, or am I 
> misunderstanding the situation?
> 
> Thanks!


Re: [PATCH] improve error for when /usr/include isn't found [PR90835]

2023-08-19 Thread Eric Gallager via Gcc-patches
On Thu, Aug 17, 2023 at 11:38 PM Eric Gallager  wrote:
>
> On Thu, Aug 17, 2023 at 4:05 PM Iain Sandoe  wrote:
> >
> > Hi Eric,
> >
> > thanks for working on this.
> >
> > > On 17 Aug 2023, at 20:35, Eric Gallager  wrote:
> > >
> > > This is a pretty simple patch that ought to help Darwin users understand
> > > better why their build is failing when they forget to pass the
> > > --with-sysroot= flag to configure.
> > >
> > > gcc/ChangeLog:
> > >
> > >PR target/90835
> > >* Makefile.in: improve error message when /usr/include is
> > >missing
> >
> > > 1. the main issue with this approach is that the error does not happen
> > >    until after the user has waited for the whole of the stage 1 build.
> > >
> > >    (I had in mind the idea that top level configure can identify that
> > >    the platform is Darwin, and that there is no sysroot configured;
> > >      then [for bootstrap] complain if there is no /usr/include
> > >      else [for non-bootstrap] complain always)
> >
> > - this would mean that the fail occurs at initial configure time.
> >
> > 2. if we went with this patch as an incremental improvement:
> >
> > + case ${build_os} in \
> > +   darwin*) \
> > > + echo "(on darwin this usually means you need to pass the --with-sysroot flag to configure to point it to where the system headers are actually put)" >&2; \
> >
> > I think we need to put this in terms that relate to the system and things 
> > the user can find, so ;
> > “on Darwin this usually means you need to pass the --with-sysroot= flag to 
> > point to a valid MacOS SDK”
> >
>
> OK, so would it be ok with that change in wording?
>
> > (In practice, the headers cause the first fail, but we also need to find 
> > the libraries when linking)
> >
> > Iain
> >

Committed with your proposed change in wording as
r14-3335-g9a5d1fceb86a61:
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9a5d1fceb86a61c9ead380df89ce3c4ba387d2e5
(Jeff approved in reply to one of the other copies)


Re: [PATCH] improve error when /usr/include isn't found [PR90835]

2023-08-19 Thread Eric Gallager via Gcc-patches
Thanks, I committed the version with Iain's suggested change in wording:
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/627796.html
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=9a5d1fceb86a61c9ead380df89ce3c4ba387d2e5
(sorry that this got sent multiple times; I thought the email hadn't
gone through properly...)

On Sat, Aug 19, 2023 at 4:11 PM Jeff Law  wrote:
>
>
>
> On 8/17/23 12:59, Eric Gallager via Gcc-patches wrote:
> > Subject:
> > [PATCH] improve error when /usr/include isn't found [PR90835]
> > From:
> > Eric Gallager via Gcc-patches 
> > Date:
> > 8/17/23, 12:59
> >
> > To:
> > gcc-patches@gcc.gnu.org
> > CC:
> > ia...@gcc.gnu.org, Eric Gallager 
> >
> >
> > This is a pretty simple patch that ought to help Darwin users understand
> > better why their build is failing when they forget to pass the
> > --with-sysroot= flag to configure.
> >
> > gcc/ChangeLog:
> >
> >  PR target/90835
> >  * Makefile.in: improve error message when /usr/include is
> >  missing
> OK.
> jeff


Re: Intel AVX10.1 Compiler Design and Support

2023-08-19 Thread ZiNgA BuRgA via Gcc-patches

Hi,

With the proposed design of these switches, how would I restrict AVX10.1 
to particular AVX-512 subsets?


For example, usage of the |_mm256_rol_epi32| intrinsic should be 
compatible on any AVX10/256 implementation, /as well as/ any AVX-512VL 
without AVX10 implementation (e.g. Skylake-X).  But how do I signal that 
I want compatibility with both these targets?


 * |-mavx512vl| lets the compiler use 512-bit registers -> incompatible
   with 256-bit AVX10.
 * |-mavx512vl -mprefer-vector-width=256| might steer the compiler away
   from 512-bit registers, but I don't think it guarantees it.
 * |-mavx10.1-256| lets the compiler use all Sapphire Rapids AVX-512
   features at 256-bit wide (so in theory, it could choose to compile
   it with |vpshldd|) -> incompatible with Skylake-X.
 * |-mavx10.1-256 -mno-avx512fp16 -mno-avx512...| will emit a warning
   and ignore the attempts at disabling AVX-512 subsets.
 * |-mavx10.1-256 -mavx512vl| takes the /union/ of the features, not
   the /intersection./

Is there something like |-mavx512vl -mmax-vector-width=256|, or am I 
misunderstanding the situation?


Thanks!
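
For reference, the per-lane operation behind the question can be modelled in portable C++. This is only a scalar sketch of what a 256-bit rotate of 32-bit lanes computes; `rol32` and `rol_lanes` are hypothetical helper names, and nothing here selects an ISA or settles which switches to use.

```cpp
#include <array>
#include <cstdint>

// Rotate one 32-bit value left by n bits (n is reduced modulo 32).
// Hypothetical helper, not part of any intrinsic header.
inline std::uint32_t rol32(std::uint32_t x, unsigned n) {
  n &= 31u;
  return n == 0 ? x : (x << n) | (x >> (32u - n));
}

// Apply the same rotate to each of eight 32-bit lanes (256 bits in total).
inline std::array<std::uint32_t, 8> rol_lanes(std::array<std::uint32_t, 8> v,
                                              unsigned n) {
  for (std::uint32_t &lane : v)
    lane = rol32(lane, n);
  return v;
}
```

The point of the question is that this one operation is available both on AVX-512VL hardware and on AVX10/256 hardware, so a build restricted to the common subset should be able to use it.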


Re: [PATCH] Testsuite: fix analyzer tests on Darwin

2023-08-19 Thread Iain Sandoe via Gcc-patches
Hi FX,

thanks for chasing these fails down,

> On 19 Aug 2023, at 22:28, FX Coudert  wrote:
> 

> gcc.dg/analyzer/ currently has 80 failures on Darwin (both 
> x86_64-apple-darwin and aarch64-apple-darwin). All those come from two issues:
> 
> 1. Many tests use memset() without including the <string.h> header. We can 
> fix that easily.
> 
> 2. Other tests fail because of the use of macOS headers, which redefine 
> functions like memcpy and others to “checked”/fortified versions 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104042 Instead of correcting 
> this on a case-by-case basis, add the -D_FORTIFY_SOURCE=0 flag systematically 
> on Darwin.
> 
> With that, all 80 failures are silenced and that part of the testsuite is now 
> clean:
> 
> # of expected passes 5238
> # of expected failures 194
> # of unsupported tests 12

> OK to commit?

LGTM,

Iain

> FX
> 
> <0001-Testsuite-fix-analyzer-tests-on-Darwin.patch>



[PATCH] Testsuite: fix analyzer tests on Darwin

2023-08-19 Thread FX Coudert via Gcc-patches
Hi,

gcc.dg/analyzer/ currently has 80 failures on Darwin (both x86_64-apple-darwin 
and aarch64-apple-darwin). All those come from two issues:

1. Many tests use memset() without including the <string.h> header. We can fix 
that easily.

2. Other tests fail because of the use of macOS headers, which redefine 
functions like memcpy and others to “checked”/fortified versions 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104042 Instead of correcting this 
on a case-by-case basis, add the -D_FORTIFY_SOURCE=0 flag systematically on 
Darwin.
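
As a rough illustration of the first issue (not one of the actual analyzer tests), a test that calls memset() should include the declaring header itself rather than rely on another header pulling it in. The `buffer` type and `clear` function below are hypothetical names used only for this sketch.

```cpp
#include <string.h>  // declares memset; the failing tests omitted this include

// Hypothetical test fixture, used only for this sketch.
struct buffer {
  char data[16];
};

// Zero the buffer through the properly declared memset.
inline void clear(buffer &b) {
  memset(b.data, 0, sizeof b.data);
}
```

The second issue is handled on the command line rather than in the source, by building the tests with -D_FORTIFY_SOURCE=0 on Darwin as described above.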

With that, all 80 failures are silenced and that part of the testsuite is now 
clean:

# of expected passes 5238
# of expected failures 194
# of unsupported tests 12



OK to commit?
FX



0001-Testsuite-fix-analyzer-tests-on-Darwin.patch
Description: Binary data


Testsuite: fix constructor priority test

2023-08-19 Thread FX Coudert via Gcc-patches
Bordering on obvious, tested on darwin where the test case fails before (and 
now passes).

OK to commit?
FX



0001-Testsuite-fix-contructor-priority-test.patch
Description: Binary data


Re: [PATCH] improve error when /usr/include isn't found [PR90835]

2023-08-19 Thread Jeff Law via Gcc-patches




On 8/17/23 12:59, Eric Gallager via Gcc-patches wrote:

Subject:
[PATCH] improve error when /usr/include isn't found [PR90835]
From:
Eric Gallager via Gcc-patches 
Date:
8/17/23, 12:59

To:
gcc-patches@gcc.gnu.org
CC:
ia...@gcc.gnu.org, Eric Gallager 


This is a pretty simple patch that ought to help Darwin users understand
better why their build is failing when they forget to pass the
--with-sysroot= flag to configure.

gcc/ChangeLog:

 PR target/90835
 * Makefile.in: improve error message when /usr/include is
 missing

OK.
jeff


Re: [PATCH] sso-string@gnu-versioned-namespace [PR83077]

2023-08-19 Thread François Dumont via Gcc-patches

Here is a rebased patch following the resize_and_overwrite change.

I confirm that tests are now fixed after the change in tzdb.cc.

I'll still prepare a fix for those tests, and I'm also preparing a test to 
detect allocations in the library.


François

On 17/08/2023 21:44, Jonathan Wakely wrote:

On Thu, 17 Aug 2023 at 20:37, Jonathan Wakely  wrote:

On Thu, 17 Aug 2023 at 19:59, Jonathan Wakely  wrote:

On Thu, 17 Aug 2023 at 18:40, François Dumont  wrote:


On 17/08/2023 19:22, Jonathan Wakely wrote:

On Sun, 13 Aug 2023 at 14:27, François Dumont via Libstdc++
 wrote:

Here is the fixed patch tested in all 3 modes:

- _GLIBCXX_USE_DUAL_ABI

- !_GLIBCXX_USE_DUAL_ABI && !_GLIBCXX_USE_CXX11_ABI

- !_GLIBCXX_USE_DUAL_ABI && _GLIBCXX_USE_CXX11_ABI

I don't know what you have in mind for the change below but I wanted to
let you know that I tried to put COW std::basic_string into a nested
__cow namespace when _GLIBCXX_USE_CXX11_ABI. But it had more impact on
string-inst.cc so I preferred the macro substitution approach.

I was thinking of implementing the necessary special member functions
of __cow_string directly, so they are ABI compatible with the COW
std::basic_string but don't actually reuse the code. That would mean
we don't need to compile and instantiate the whole COW string just to
use a few members from it. But that can be done later, the macro
approach seems OK for now.

You'll see that when cow_string.h is included while
_GLIBCXX_USE_CXX11_ABI == 1 then I am hiding a big part of the
basic_string definition. Initially it was to avoid having to include
basic_string.tcc but it is also a lot of useless code indeed.



There are some tests failing when !_GLIBCXX_USE_CXX11_ABI that are
unrelated to my changes. I'll propose fixes in the coming days.

Which tests? I run the entire testsuite with
-D_GLIBCXX_USE_CXX11_ABI=0 several times per day and I'm not seeing
failures.

I'll review the patch ASAP, thanks for working on it.


So far the only issues I found are in the mode !_GLIBCXX_USE_DUAL_ABI &&
!_GLIBCXX_USE_CXX11_ABI. They are:

23_containers/unordered_map/96088.cc
23_containers/unordered_multimap/96088.cc
23_containers/unordered_multiset/96088.cc
23_containers/unordered_set/96088.cc
ext/debug_allocator/check_new.cc
ext/malloc_allocator/check_new.cc
ext/malloc_allocator/deallocate_local.cc
ext/new_allocator/deallocate_local.cc
ext/pool_allocator/allocate_chunk.cc
ext/throw_allocator/deallocate_local.cc

Ah yes, they fail for !USE_DUAL_ABI builds, I wonder why.

/home/test/src/gcc/libstdc++-v3/testsuite/23_containers/unordered_map/96088.cc:44:
void test01(): Assertion '__gnu_test::counter::count() == 3' failed.
FAIL: 23_containers/unordered_map/96088.cc execution test

It's due to this global object in src/c++20/tzdb.cc (line 1081):
    const string tzdata_file = "/tzdata.zi";

When the library uses COW strings that requires an allocation before
main, which uses the replacement operator new in the tests, which
fails to allocate. For example, in 22_locale/locale/cons/12352.cc we
have this function used by operator new:

int times_to_fail = 0;

void* allocate(std::size_t n)
{
   if (!times_to_fail--)
 return 0;

The counter is initially zero, so if we try to allocate before it gets
set to a non-zero value in test01() then we fail.

The test should not assume no allocations before main() begins. The
simplest way to do that is with another global that says "we have
started testing" e.g.

--- a/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
+++ b/libstdc++-v3/testsuite/22_locale/locale/cons/12352.cc
@@ -26,11 +26,12 @@
  #include 
  #include 

+bool tests_started = false;
  int times_to_fail = 0;

  void* allocate(std::size_t n)
  {
-  if (!times_to_fail--)
+  if (tests_started && !times_to_fail--)
  return 0;

void* ret = std::malloc(n ? n : 1);
@@ -106,6 +107,8 @@ void operator delete[](void* p, const
std::nothrow_t&) throw()
  // libstdc++/12352
  void test01(int iters)
  {
+  tests_started = true;
+
for (int j = 0; j < iters; ++j)
  {
for (int i = 0; i < 100; ++i)


This way the replacement operator new doesn't start intentionally
failing until we ask it to do so.
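
A self-contained sketch of the same idea, reduced to just the allocation hook from the diff above (the real test wires `allocate` into the replacement operator new, which is left out here):

```cpp
#include <cstddef>
#include <cstdlib>

bool tests_started = false;  // set by the test once it begins
int times_to_fail = 0;       // allocation that should fail, counted from now

// Fail on purpose only after the test has started; allocations made
// earlier (e.g. by library globals before main) always succeed.
void* allocate(std::size_t n) {
  if (tests_started && !times_to_fail--)
    return nullptr;
  return std::malloc(n ? n : 1);
}
```

Because of the short-circuit `&&`, the counter is not touched until `tests_started` is true, so early allocations by the library do not consume the failure budget.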

I'll replace the global std::string objects with std::string_view
objects, so that they don't allocate even if the library only uses COW
strings.
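
Roughly what that replacement could look like; this is a sketch and not the actual change to tzdb.cc. The `tzdata_path` helper and the directory used in the usage note are hypothetical.

```cpp
#include <string>
#include <string_view>

// A constexpr string_view refers to the literal's static storage, so
// initializing it at namespace scope performs no heap allocation.
constexpr std::string_view tzdata_file = "/tzdata.zi";

// Build a full path on demand (hypothetical helper). Any allocation now
// happens at the call site instead of during static initialization.
inline std::string tzdata_path(std::string_view dir) {
  std::string path(dir);
  path += tzdata_file;
  return path;
}
```

For example, `tzdata_path("/usr/share/zoneinfo")` yields "/usr/share/zoneinfo/tzdata.zi", and nothing is allocated before that call is made.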

We should still fix those tests though.
diff --git a/libstdc++-v3/acinclude.m4 b/libstdc++-v3/acinclude.m4
index b25378eaace..322f1e42611 100644
--- a/libstdc++-v3/acinclude.m4
+++ b/libstdc++-v3/acinclude.m4
@@ -4875,12 +4875,16 @@ dnl
 AC_DEFUN([GLIBCXX_ENABLE_LIBSTDCXX_DUAL_ABI], [
   GLIBCXX_ENABLE(libstdcxx-dual-abi,$1,,[support two versions of std::string])
   if test x$enable_symvers = xgnu-versioned-namespace; then
-# gnu-versioned-namespace is incompatible with the dual ABI.
-enable_libstdcxx_dual_abi="no"
-  fi
-  if test x"$enable_libstdcxx_dual_abi" != xyes; then
+# gnu-versioned-namespace is incompatible with the dual ABI...
 AC_MSG_NOTICE([dual

Re: [PATCH][RFC] tree-optimization/92335 - Improve sinking heuristics for vectorization

2023-08-19 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 18 Aug 2023 at 17:11, Richard Biener  wrote:
>
> On Fri, 18 Aug 2023, Richard Biener wrote:
>
> > On Thu, 17 Aug 2023, Prathamesh Kulkarni wrote:
> >
> > > On Tue, 15 Aug 2023 at 14:28, Richard Sandiford
> > >  wrote:
> > > >
> > > > Richard Biener  writes:
> > > > > On Mon, 14 Aug 2023, Prathamesh Kulkarni wrote:
> > > > >> On Mon, 7 Aug 2023 at 13:19, Richard Biener 
> > > > >>  wrote:
> > > > >> > It doesn't seem to make a difference for x86.  That said, the 
> > > > >> > "fix" is
> > > > >> > probably sticking the correct target on the dump-check, it seems
> > > > >> > that vect_fold_extract_last is no longer correct here.
> > > > >> Um sorry, I did go thru various checks in target-supports.exp, but 
> > > > >> not
> > > > >> sure which one will be appropriate for this case,
> > > > >> and am stuck here :/ Could you please suggest how to proceed ?
> > > > >
> > > > > Maybe Richard S. knows the magic thing to test, he originally
> > > > > implemented the direct conversion support.  I suggest to implement
> > > > > such dg-checks if they are not present (I can't find them),
> > > > > possibly quite specific to the modes involved (like we have
> > > > > other checks with _qi_to_hi suffixes, for float modes maybe
> > > > > just _float).
> > > >
> > > > Yeah, can't remember specific selectors for that feature.  TBH I think
> > > > most (all?) of the tests were AArch64-specific.
> > > Hi,
> > > As Richi mentioned above, the test now vectorizes on AArch64 because
> > > it has support for direct conversion
> > > between vectors while x86 doesn't. IIUC this is because
> > > supportable_convert_operation returns true
> > > for V4HI -> V4SI on Aarch64 since it can use extend_v4hiv4si2 for
> > > doing the conversion ?
> > >
> > > In the attached patch, I added a new target check vect_extend which
> > > (currently) returns 1 only for aarch64*-*-*,
> > > which makes the test PASS on both the targets, although I am not sure if
> > > this is entirely correct.
> > > Does the patch look OK ?
> >
> > Can you make vect_extend more specific, say vect_extend_hi_si or
> > what is specifically needed here?  Note I'll have to investigate
> > why x86 cannot vectorize here since in fact it does have
> > the extend operation ... it might be also worth splitting the
> > sign/zero extend case, so - vect_sign_extend_hi_si or
> > vect_extend_short_int?
>
> And now having analyzed _why_ x86 doesn't vectorize it's rather
> why we get this vectorized with NEON which is because
>
> static opt_machine_mode
> aarch64_vectorize_related_mode (machine_mode vector_mode,
> scalar_mode element_mode,
> poly_uint64 nunits)
> {
> ...
>   /* Prefer to use 1 128-bit vector instead of 2 64-bit vectors.  */
>   if (TARGET_SIMD
>   && (vec_flags & VEC_ADVSIMD)
>   && known_eq (nunits, 0U)
>   && known_eq (GET_MODE_BITSIZE (vector_mode), 64U)
>   && maybe_ge (GET_MODE_BITSIZE (element_mode)
>* GET_MODE_NUNITS (vector_mode), 128U))
> {
>   machine_mode res = aarch64_simd_container_mode (element_mode, 128);
>   if (VECTOR_MODE_P (res))
> return res;
>
> which makes us get a V4SImode vector for a V4HImode loop vector_mode.
Thanks for the explanation!
>
> So I think the appropriate effective dejagnu target is
> aarch64-*-* (there's none specifically to advsimd, not sure if one
> can disable that?)
The attached patch uses aarch64*-*-* target check, and additionally
for SVE (and other targets supporting vect_fold_extract_last) it
checks
if the condition reduction was carried out using FOLD_EXTRACT_LAST.
Does that look OK ?
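
For readers without the testcase at hand, the kind of loop under discussion is a condition reduction in which 16-bit elements are widened to 32 bits, which is where the V4HI to V4SI conversion comes in. The function below is a hypothetical example of that shape, not the contents of pr65947-7.c.

```cpp
// Return the last element below the threshold, widened to int,
// or -1 if no element qualifies. Hypothetical example loop.
inline int last_below(const short *a, int n, short threshold) {
  int last = -1;
  for (int i = 0; i < n; ++i)
    if (a[i] < threshold)
      last = a[i];  // implicit short -> int extension
  return last;
}
```

Whether a given target vectorizes such a loop depends on whether it can pair a vector of shorts with a vector of ints having the same number of elements, as explained in the quoted analysis.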

Thanks,
Prathamesh
>

> Richard.
>
> > > Thanks,
> > > Prathamesh
> > > >
> > > > Thanks,
> > > > Richard
> > >
> >
> >
>
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-7.c b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
index 16cdcd1c6eb..58c46df5c54 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-7.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-7.c
@@ -52,5 +52,5 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target vect_fold_extract_last } } } */
-/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" { target { ! vect_fold_extract_last } } } } */
+/* { dg-final { scan-tree-dump "optimizing condition reduction with FOLD_EXTRACT_LAST" "vect" { target vect_fold_extract_last } } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target aarch64*-*-* } } } */


Re: [PATCH] Loongarch: Fix plugin header missing install.

2023-08-19 Thread Huacai Chen via Gcc-patches
Thank you very much, I think this should also be backported to GCC 12/13.

Huacai

On Sat, Aug 19, 2023 at 11:59 AM Chenghua Xu  wrote:
>
> Pushed as r14-3331.
>
> Thanks.
> chenglulu writes:
>
> > LGTM!
> >
> > On 2023/8/16 at 9:48 AM, Guo Jie wrote:
> >> gcc/ChangeLog:
> >>
> >>  * config/loongarch/t-loongarch: Add loongarch-driver.h into
> >>  TM_H. Add loongarch-def.h and loongarch-tune.h into
> >>  OPTIONS_H_EXTRA.
> >>
> >> Co-authored-by: Lulu Cheng 
> >> ---
> >>   gcc/config/loongarch/t-loongarch | 4 
> >>   1 file changed, 4 insertions(+)
> >>
> >> diff --git a/gcc/config/loongarch/t-loongarch b/gcc/config/loongarch/t-loongarch
> >> index 6d6e3435d59..e73f4f437ef 100644
> >> --- a/gcc/config/loongarch/t-loongarch
> >> +++ b/gcc/config/loongarch/t-loongarch
> >> @@ -16,6 +16,10 @@
> >>   # along with GCC; see the file COPYING3.  If not see
> >>   # .
> >>
> >> +TM_H += $(srcdir)/config/loongarch/loongarch-driver.h
> >> +OPTIONS_H_EXTRA += $(srcdir)/config/loongarch/loongarch-def.h \
> >> +   $(srcdir)/config/loongarch/loongarch-tune.h
> >> +
> >>   # Canonical target triplet from config.gcc
> >>   LA_MULTIARCH_TRIPLET = $(patsubst LA_MULTIARCH_TRIPLET=%,%,$\
> >>   $(filter LA_MULTIARCH_TRIPLET=%,$(tm_defines)))
>


[PING][PATCH] arm: Remove unsigned variant of vcaddq_m

2023-08-19 Thread Stam Markianos-Wright via Gcc-patches


(Pinging since I realised that this is required for my later Low Overhead Loop 
patch series to work)

Ok for trunk with the updated changelog that Christophe mentioned?

Thanks,
Stamatis/Stam Markianos-Wright


From: Stam Markianos-Wright
Sent: Tuesday, August 1, 2023 6:21 PM
To: gcc-patches@gcc.gnu.org 
Cc: Richard Earnshaw ; Kyrylo Tkachov 

Subject: arm: Remove unsigned variant of vcaddq_m

Hi all,

The unsigned variants of the vcaddq_m operation are not needed within the
compiler, as the assembly output of the signed and unsigned versions of the
ops is identical: with a `.i` suffix (as opposed to separate `.s` and `.u`
suffixes).
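
The underlying reason can be checked in plain C++: with two's-complement lanes, wrapping addition and subtraction produce the same bit pattern whether the lanes are read as signed or unsigned. The sketch below assumes the usual definition of a complex add with the second operand rotated by 90 degrees (re = a.re - b.im, im = a.im + b.re); the type and function names are hypothetical and this is not the MVE intrinsic itself.

```cpp
#include <cstdint>

// One complex element with 32-bit integer parts, stored as raw bits.
struct complex_u32 {
  std::uint32_t re, im;
};

// Rotate-by-90 complex add using modular unsigned arithmetic.
inline complex_u32 cadd_rot90_unsigned(complex_u32 a, complex_u32 b) {
  return {a.re - b.im, a.im + b.re};
}

// Same operation with the lanes read as signed values, computed exactly
// in 64 bits and then truncated back to 32 bits. The uint32 -> int32
// conversion is modular on GCC (and guaranteed so since C++20).
inline complex_u32 cadd_rot90_signed(complex_u32 a, complex_u32 b) {
  auto s = [](std::uint32_t x) {
    return static_cast<std::int64_t>(static_cast<std::int32_t>(x));
  };
  std::int64_t re = s(a.re) - s(b.im);
  std::int64_t im = s(a.im) + s(b.re);
  return {static_cast<std::uint32_t>(re), static_cast<std::uint32_t>(im)};
}
```

Both functions return identical bits for every input, which is consistent with a single instruction form serving both the signed and unsigned intrinsics.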

Tested with baremetal arm-none-eabi on Arm's fastmodels.

Ok for trunk?

Thanks,
Stamatis Markianos-Wright

gcc/ChangeLog:

 * config/arm/arm-mve-builtins-base.cc (vcaddq_rot90, vcaddq_rot270):
   Use common insn for signed and unsigned front-end definitions.
 * config/arm/arm_mve_builtins.def
   (vcaddq_rot90_m_u, vcaddq_rot270_m_u): Make common.
   (vcaddq_rot90_m_s, vcaddq_rot270_m_s): Remove.
 * config/arm/iterators.md (mve_insn): Merge signed and unsigned defs.
   (isu): Likewise.
   (rot): Likewise.
   (mve_rot): Likewise.
   (supf): Likewise.
   (VxCADDQ_M): Likewise.
 * config/arm/unspecs.md (unspec): Likewise.
---
  gcc/config/arm/arm-mve-builtins-base.cc |  4 ++--
  gcc/config/arm/arm_mve_builtins.def |  6 ++---
  gcc/config/arm/iterators.md | 30 +++--
  gcc/config/arm/mve.md   |  4 ++--
  gcc/config/arm/unspecs.md   |  6 ++---
  5 files changed, 21 insertions(+), 29 deletions(-)

diff --git a/gcc/config/arm/arm-mve-builtins-base.cc b/gcc/config/arm/arm-mve-builtins-base.cc
index e31095ae112..426a87e9852 100644
--- a/gcc/config/arm/arm-mve-builtins-base.cc
+++ b/gcc/config/arm/arm-mve-builtins-base.cc
@@ -260,8 +260,8 @@ FUNCTION_PRED_P_S_U (vaddvq, VADDVQ)
  FUNCTION_PRED_P_S_U (vaddvaq, VADDVAQ)
  FUNCTION_WITH_RTX_M (vandq, AND, VANDQ)
  FUNCTION_ONLY_N (vbrsrq, VBRSRQ)
-FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M_S, VCADDQ_ROT90_M_U, VCADDQ_ROT90_M_F))
-FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M_S, VCADDQ_ROT270_M_U, VCADDQ_ROT270_M_F))
+FUNCTION (vcaddq_rot90, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD90, UNSPEC_VCADD90, UNSPEC_VCADD90, VCADDQ_ROT90_M, VCADDQ_ROT90_M, VCADDQ_ROT90_M_F))
+FUNCTION (vcaddq_rot270, unspec_mve_function_exact_insn_rot, (UNSPEC_VCADD270, UNSPEC_VCADD270, UNSPEC_VCADD270, VCADDQ_ROT270_M, VCADDQ_ROT270_M, VCADDQ_ROT270_M_F))
  FUNCTION (vcmlaq, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA, -1, -1, VCMLAQ_M_F))
  FUNCTION (vcmlaq_rot90, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA90, -1, -1, VCMLAQ_ROT90_M_F))
  FUNCTION (vcmlaq_rot180, unspec_mve_function_exact_insn_rot, (-1, -1, UNSPEC_VCMLA180, -1, -1, VCMLAQ_ROT180_M_F))
diff --git a/gcc/config/arm/arm_mve_builtins.def b/gcc/config/arm/arm_mve_builtins.def
index 43dacc3dda1..6ac1812c697 100644
--- a/gcc/config/arm/arm_mve_builtins.def
+++ b/gcc/config/arm/arm_mve_builtins.def
@@ -523,8 +523,8 @@ VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vhsubq_m_n_u, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vhaddq_m_u, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vhaddq_m_n_u, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, veorq_m_u, v16qi, v8hi, v4si)
-VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot90_m_u, v16qi, v8hi, v4si)
-VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot270_m_u, v16qi, v8hi, v4si)
+VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot90_m_, v16qi, v8hi, v4si)
+VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vcaddq_rot270_m_, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vbicq_m_u, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vandq_m_u, v16qi, v8hi, v4si)
  VAR3 (QUADOP_UNONE_UNONE_UNONE_UNONE_PRED, vaddq_m_u, v16qi, v8hi, v4si)
@@ -587,8 +587,6 @@ VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vhcaddq_rot270_m_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vhaddq_m_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vhaddq_m_n_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, veorq_m_s, v16qi, v8hi, v4si)
-VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vcaddq_rot90_m_s, v16qi, v8hi, v4si)
-VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vcaddq_rot270_m_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vbrsrq_m_n_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vbicq_m_s, v16qi, v8hi, v4si)
  VAR3 (QUADOP_NONE_NONE_NONE_NONE_PRED, vandq_m_s, v16qi, v8hi, v4si)
diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index b13ff53d36f..2edd0b06370 100644
--- a/gcc/config/arm/iterators

Re: [PATCH] tree-optimization/111048 - avoid flawed logic in fold_vec_perm

2023-08-19 Thread Prathamesh Kulkarni via Gcc-patches
On Fri, 18 Aug 2023 at 14:52, Richard Biener  wrote:
>
> On Fri, 18 Aug 2023, Richard Sandiford wrote:
>
> > Richard Biener  writes:
> > > The following avoids running into somehow flawed logic in fold_vec_perm
> > > for non-VLA vectors.
> > >
> > > Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> > >
> > > Richard.
> > >
> > > PR tree-optimization/111048
> > > * fold-const.cc (fold_vec_perm_cst): Check for non-VLA
> > > vectors first.
> > >
> > > * gcc.dg/torture/pr111048.c: New testcase.
> >
> > Please don't do this as a permanent thing.  It was a deliberate choice
> > to have the is_constant be the fallback, so that the "generic" (VLA+VLS)
> > logic gets more coverage.  Like you say, if something is wrong for VLS
> > then the chances are that it's also wrong for VLA.
>
> Sure, feel free to undo this change together with the fix for the
> VLA case.
Hi,
The attached patch reverts the workaround, and fixes the issue.
Bootstrapped+tested on aarch64-linux-gnu with and without SVE, and
x86_64-linux-gnu.
OK to commit ?

Thanks,
Prathamesh
>
> Richard.
>
> > Thanks,
> > Richard
> >
> >
> > > ---
> > >  gcc/fold-const.cc   | 12 ++--
> > >  gcc/testsuite/gcc.dg/torture/pr111048.c | 24 
> > >  2 files changed, 30 insertions(+), 6 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.dg/torture/pr111048.c
> > >
> > > diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
> > > index 5c51c9d91be..144fd7481b3 100644
> > > --- a/gcc/fold-const.cc
> > > +++ b/gcc/fold-const.cc
> > > @@ -10625,6 +10625,11 @@ fold_vec_perm_cst (tree type, tree arg0, tree 
> > > arg1, const vec_perm_indices &sel,
> > >unsigned res_npatterns, res_nelts_per_pattern;
> > >unsigned HOST_WIDE_INT res_nelts;
> > >
> > > +  if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts))
> > > +{
> > > +  res_npatterns = res_nelts;
> > > +  res_nelts_per_pattern = 1;
> > > +}
> > >/* (1) If SEL is a suitable mask as determined by
> > >   valid_mask_for_fold_vec_perm_cst_p, then:
> > >   res_npatterns = max of npatterns between ARG0, ARG1, and SEL
> > > @@ -10634,7 +10639,7 @@ fold_vec_perm_cst (tree type, tree arg0, tree 
> > > arg1, const vec_perm_indices &sel,
> > >   res_npatterns = nelts in result vector.
> > >   res_nelts_per_pattern = 1.
> > > >   This exception is made so that VLS ARG0, ARG1 and SEL work as before.  */
> > > -  if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason))
> > > +  else if (valid_mask_for_fold_vec_perm_cst_p (arg0, arg1, sel, reason))
> > >  {
> > >res_npatterns
> > > = std::max (VECTOR_CST_NPATTERNS (arg0),
> > > @@ -10648,11 +10653,6 @@ fold_vec_perm_cst (tree type, tree arg0, tree 
> > > arg1, const vec_perm_indices &sel,
> > >
> > >res_nelts = res_npatterns * res_nelts_per_pattern;
> > >  }
> > > -  else if (TYPE_VECTOR_SUBPARTS (type).is_constant (&res_nelts))
> > > -{
> > > -  res_npatterns = res_nelts;
> > > -  res_nelts_per_pattern = 1;
> > > -}
> > >else
> > >  return NULL_TREE;
> > >
> > > > diff --git a/gcc/testsuite/gcc.dg/torture/pr111048.c b/gcc/testsuite/gcc.dg/torture/pr111048.c
> > > new file mode 100644
> > > index 000..475978aae2b
> > > --- /dev/null
> > > +++ b/gcc/testsuite/gcc.dg/torture/pr111048.c
> > > @@ -0,0 +1,24 @@
> > > +/* { dg-do run } */
> > > +/* { dg-additional-options "-mavx2" { target avx2_runtime } } */
> > > +
> > > +typedef unsigned char u8;
> > > +
> > > +__attribute__((noipa))
> > > +static void check(const u8 * v) {
> > > +if (*v != 15) __builtin_trap();
> > > +}
> > > +
> > > +__attribute__((noipa))
> > > +static void bug(void) {
> > > +u8 in_lanes[32];
> > > +for (unsigned i = 0; i < 32; i += 2) {
> > > +  in_lanes[i + 0] = 0;
> > > +  in_lanes[i + 1] = ((u8)0xff) >> (i & 7);
> > > +}
> > > +
> > > +check(&in_lanes[13]);
> > > +  }
> > > +
> > > +int main() {
> > > +bug();
> > > +}
> >
>
> --
> Richard Biener 
> SUSE Software Solutions Germany GmbH,
> Frankenstrasse 146, 90461 Nuernberg, Germany;
> GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)
PR111048: Set arg_npatterns correctly.

In valid_mask_for_fold_vec_perm_cst_p we always set arg_npatterns
to VECTOR_CST_NPATTERNS (arg0), because (q1 & 0) == 0 is always true
in the following condition:

 /* Ensure that the stepped sequence always selects from the same
 input pattern.  */
  unsigned arg_npatterns
= ((q1 & 0) == 0) ? VECTOR_CST_NPATTERNS (arg0)
  : VECTOR_CST_NPATTERNS (arg1);

resulting in wrong code generation.
The patch fixes this by changing the condition to (q1 & 1) == 0.
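
A stand-alone illustration of the difference, with the two VECTOR_CST_NPATTERNS values passed in as plain integers (the function names here are made up for the example and do not exist in fold-const.cc):

```cpp
// Buggy form: (q1 & 0) is always 0, so the arg0 value is always chosen.
constexpr unsigned npatterns_buggy(unsigned q1, unsigned np_arg0,
                                   unsigned np_arg1) {
  return ((q1 & 0) == 0) ? np_arg0 : np_arg1;
}

// Fixed form: the low bit of q1 decides which input the sequence
// selects from, so odd q1 picks the arg1 value.
constexpr unsigned npatterns_fixed(unsigned q1, unsigned np_arg0,
                                   unsigned np_arg1) {
  return ((q1 & 1) == 0) ? np_arg0 : np_arg1;
}

static_assert(npatterns_buggy(1, 4, 8) == 4, "buggy form ignores q1");
static_assert(npatterns_fixed(1, 4, 8) == 8, "fixed form honours odd q1");
```

With the buggy form, a stepped sequence that selects from arg1 is validated against the pattern count of arg0, which is how a mismatched mask could slip through.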

gcc/ChangeLog:
PR tree-optimization/111048
* fold-const.cc (valid_mask_for_fold_vec_perm_cst_p): Set arg_npatterns
correctly.
(fold_vec_perm_cst): Remove workaround and again call
valid_mask_fold_vec_perm_cst_p for bot

Re: [committed] libstdc++: Fix std::format("{:F}", inf) to use uppercase

2023-08-19 Thread Jonathan Wakely via Gcc-patches
On Thu, 17 Aug 2023 at 13:22, Jonathan Wakely via Libstdc++
 wrote:
>
> Tested x86_64-linux. Pushed to trunk. Backport to gcc-13 will follow.

Re the backport, I forgot to say that this changes the order/values of
the enumerators for _Pres_type. In theory that could cause
incompatibilities between GCC 13.2 and 13.3, if one object uses the
old definition of std::formatter::parse and another object uses the
new definition of std::formatter::format, or vice versa. But given
that 99.999% of uses of std::formatter are via std::format (not using
the formatter class directly), I expect that the calls to parse and
format will always be instantiated together at the same time, and so
every object will contain both symbols. That will mean that the linker
will always pick a "matching pair" of symbols, i.e. both symbols will
use the new enumerator values, or both will use the old enumerator
values, and so in practice there won't be a mismatch.

I could have added the new _Pres_F enumerator at the end, so it would
not alter the values of the other enumerators. But that wouldn't
completely avoid the problem anyway, because a new object that uses
_Pres_F in formatter::parse would be incompatible with an old object
that didn't know about the new _Pres_F value in formatter::format. So
I would prefer to keep the _Pres_F enumerator adjacent to _Pres_f and
the other ones for floating-point presentation types.

There have been so many other fixes to std::format that I think it
will be reasonable to tell anybody using it that they should just use
GCC 13.3 consistently anyway, and not mix code built with 13.2 and
13.3 if they're using the experimental C++20 std::format
implementation.
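
To make the enumerator shift concrete, here is a simplified model of the floating-point part of the enum before and after the change. The names drop the leading underscore and live in made-up namespaces; only the ordering is taken from the patch below.

```cpp
// Simplified model of the enumerators before the fix.
namespace before_fix {
enum Pres_type { Pres_a = 1, Pres_A, Pres_e, Pres_E, Pres_f, Pres_g, Pres_G };
}

// Simplified model after the fix: Pres_F sits next to Pres_f, so
// Pres_g and Pres_G each move up by one.
namespace after_fix {
enum Pres_type {
  Pres_a = 1, Pres_A, Pres_e, Pres_E, Pres_f, Pres_F, Pres_g, Pres_G
};
}

static_assert(before_fix::Pres_g == 6, "old value of Pres_g");
static_assert(after_fix::Pres_F == 6, "new Pres_F reuses that value");
static_assert(after_fix::Pres_g == 7, "Pres_g is shifted by one");
```

So an object whose parse() stores the old value 6 for 'g' would be read as 'F' by a format() built against the new header, which is the mismatch described above and the reason it only matters if the two halves come from different compiler versions.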


>
> -- >8 --
>
> std::format was treating {:f} and {:F} identically on the basis that for
> the fixed 1.234567 format there are no alphabetical characters that need
> to be in uppercase. But that's wrong for infinities and NaNs, which
> should be formatted as "INF" and "NAN" for {:F}.
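
The same lowercase/uppercase distinction exists in printf, so it can be used as a stand-in on toolchains without a std::format implementation. This sketch uses snprintf only, it does not exercise std::format, and `format_with` and `to_upper` are hypothetical helpers.

```cpp
#include <cctype>
#include <cstdio>
#include <limits>
#include <string>

// Format a double with the given printf conversion ("%f" or "%F").
inline std::string format_with(const char *conversion, double value) {
  char buf[64];
  int n = std::snprintf(buf, sizeof buf, conversion, value);
  if (n <= 0 || n >= static_cast<int>(sizeof buf))
    return std::string();
  return std::string(buf, static_cast<std::size_t>(n));
}

// Uppercase a string one character at a time.
inline std::string to_upper(std::string s) {
  for (char &c : s)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  return s;
}
```

Formatting infinity with "%f" and "%F" gives two different strings, the second being the uppercase form of the first, which is the behaviour the fix brings to {:f} and {:F}.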
>
> libstdc++-v3/ChangeLog:
>
> * include/std/format (__format::_Pres_type): Add _Pres_F.
> (__formatter_fp::parse): Use _Pres_F for 'F'.
> (__formatter_fp::format): Set __upper for _Pres_F.
> * testsuite/std/format/functions/format.cc: Check formatting of
> infinity and NaN for each presentation type.
> ---
>  libstdc++-v3/include/std/format  | 10 ++++++++--
>  .../testsuite/std/format/functions/format.cc | 12 ++++++++++++
>  2 files changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/format b/libstdc++-v3/include/std/format
> index a8db10d6460..40c7d6128f6 100644
> --- a/libstdc++-v3/include/std/format
> +++ b/libstdc++-v3/include/std/format
> @@ -309,7 +309,7 @@ namespace __format
>  // Presentation types for integral types (including bool and charT).
>  _Pres_d = 1, _Pres_b, _Pres_B, _Pres_o, _Pres_x, _Pres_X, _Pres_c,
>  // Presentation types for floating-point types.
> -_Pres_a = 1, _Pres_A, _Pres_e, _Pres_E, _Pres_f, _Pres_g, _Pres_G,
> +_Pres_a = 1, _Pres_A, _Pres_e, _Pres_E, _Pres_f, _Pres_F, _Pres_g, _Pres_G,
>  _Pres_p = 0, _Pres_P,   // For pointers.
>  _Pres_s = 0,// For strings and bool.
>  _Pres_esc = 0xf,// For strings and charT.
> @@ -1382,10 +1382,13 @@ namespace __format
> ++__first;
> break;
>   case 'f':
> - case 'F':
> __spec._M_type = _Pres_f;
> ++__first;
> break;
> + case 'F':
> +   __spec._M_type = _Pres_F;
> +   ++__first;
> +   break;
>   case 'g':
> __spec._M_type = _Pres_g;
> ++__first;
> @@ -1442,6 +1445,9 @@ namespace __format
>   __use_prec = true;
>   __fmt = chars_format::scientific;
>   break;
> +   case _Pres_F:
> + __upper = true;
> + [[fallthrough]];
> case _Pres_f:
>   __use_prec = true;
>   __fmt = chars_format::fixed;
> diff --git a/libstdc++-v3/testsuite/std/format/functions/format.cc b/libstdc++-v3/testsuite/std/format/functions/format.cc
> index 4db5202815d..59ed3be8baa 100644
> --- a/libstdc++-v3/testsuite/std/format/functions/format.cc
> +++ b/libstdc++-v3/testsuite/std/format/functions/format.cc
> @@ -159,6 +159,18 @@ test_alternate_forms()
>VERIFY( s == "1.e+01 1.e+01 1.e+01" );
>  }
>
> +void
> +test_infnan()
> +{
> +  double inf = std::numeric_limits<double>::infinity();
> +  double nan = std::numeric_limits<double>::quiet_NaN();
> +  std::string s;
> +  s = std::format("{0} {0:e} {0:E} {0:f} {0:F} {0:g} {0:G} {0:a} {0:A}", inf);
> +  VERIFY( s == "inf inf INF inf INF inf INF inf INF" );
> +  s = std::format("{0} {0:e} {0:E} {0:f} {0:F} {0:g} {0:G} {0:a} {0:A}", nan);
> +  VERIFY( s == "nan nan NAN nan NAN nan NAN nan NAN" );
> +}
> +
>  struct euro_punc : std::numpunct<char>
>  {
>std::string do_grouping() const override { return

[PATCH][Ada] Fix syntax errors in expect.c

2023-08-19 Thread Andris Pavēnis
I noticed trivial syntax errors in gcc/ada/expect.c when I tried to build GCC 13.2 as a 
cross-compiler for the target i686-pc-msdosdjgpp.



In file included from expect.c:54:
expect.c: In function '__gnat_waitpid':
expect.c:353:13: error: expected '(' before numeric constant
  353 |   } else if WIFSTOPPED(status) {
      |             ^~
expect.c:358:1: warning: control reaches end of non-void function [-Wreturn-type]
  358 | }
      | ^
make[5]: *** [../gcc-interface/Makefile:297: expect.o] Error 1

The errors have been there since commit 9e6274e0a3b60e77a42784c3fb6ef2aa3cfc071a 
(Wed Dec 15 19:26:50 2021 +0600).


Fixing these errors (see the attached patch for the master branch) was not sufficient to build the 
Ada cross-compiler, but it did fix these compile errors.
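
The "before numeric constant" diagnostic hints at why the unparenthesised form went unnoticed for so long: where the wait macros expand to a parenthesised expression, "if MACRO (status) {" happens to be valid after preprocessing, whereas a macro that expands to a bare constant yields "if 0 {". The sketch below uses hypothetical HYP_* macros to show the two styles. They are assumptions for illustration, not the definitions from any particular libc or from the DJGPP headers, which I have not checked.

```cpp
#include <cassert>

// Hypothetical status macros, loosely modelled on the POSIX wait macros.
// HYP_EXITED supplies its own outer parentheses; HYP_STOPPED expands to a
// bare constant, as a target without stopped processes might define it.
#define HYP_EXITED(s)   (((s) & 0x7f) == 0)
#define HYP_EXITCODE(s) (((s) >> 8) & 0xff)
#define HYP_STOPPED(s)  0

// Decode a wait-style status word. Returns the exit code for a normal
// exit, -2 for a stopped process and -1 for anything else.
int decode_status(int status)
{
  // "if HYP_EXITED (status) {" would compile only because the macro
  // brings its own parentheses. "else if HYP_STOPPED (status) {" would
  // expand to "else if 0 {" and be rejected. Writing the parentheses
  // explicitly is correct for both styles of definition.
  if (HYP_EXITED (status)) {
    return HYP_EXITCODE (status);
  } else if (HYP_STOPPED (status)) {
    return -2;
  }
  return -1;
}

void check_decode()
{
  assert(decode_status(7 << 8) == 7);   // normal exit with code 7
  assert(decode_status(1) == -1);       // low bits set: not a normal exit
}
```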


This would perhaps qualify as a trivial change, but it seems that I no longer have write access (I got 
it in 2015 but have not used it for a long time; perhaps I do not really need it).



Andris




commit 64c48aa99656e06d5728bf5837da3bbc50ae4cc5
Author: Andris Pavēnis 
Date:   Sat Aug 19 10:40:22 2023 +0300

Fix syntax error

gcc/ada/expect.c(__gnat_waitpid):
fix syntax errors

diff --git a/gcc/ada/expect.c b/gcc/ada/expect.c
index e6899632bc9..7333c11d954 100644
--- a/gcc/ada/expect.c
+++ b/gcc/ada/expect.c
@@ -346,11 +346,11 @@ __gnat_waitpid (int pid)
  return -1;
   }
 
-  if WIFEXITED (status) {
+  if (WIFEXITED (status)) {
  status = WEXITSTATUS (status);
-  } else if WIFSIGNALED (status) {
+  } else if (WIFSIGNALED (status)) {
  status = WTERMSIG (status);
-  } else if WIFSTOPPED (status) {
+  } else if (WIFSTOPPED (status)) {
  status = WSTOPSIG (status);
   }