date:20221020

RE: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Jiang, Haochen via Gcc-patches

> -----Original Message-----
> From: Segher Boessenkool 
> Sent: Thursday, October 20, 2022 5:07 AM
> To: Jiang, Haochen 
> Cc: gcc-patches@gcc.gnu.org; rguent...@suse.de; Liu, Hongtao
> ; ubiz...@gmail.com; richard.earns...@arm.com;
> richard.sandif...@arm.com; marcus.shawcr...@arm.com;
> kyrylo.tkac...@arm.com; r...@gcc.gnu.org; g...@amylaar.uk;
> claz...@synopsys.com; ni...@redhat.com; ramana.radhakrish...@arm.com;
> aol...@gcc.gnu.org; hubi...@ucw.cz; mfort...@gmail.com;
> dje@gmail.com; li...@gcc.gnu.org; uweig...@de.ibm.com;
> kreb...@linux.ibm.com; olege...@gcc.gnu.org; da...@redhat.com;
> ebotca...@libertysurf.fr; jeffreya...@gmail.com; dave.ang...@bell.net
> Subject: Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch
> to align with LLVM
> 
> On Fri, Oct 14, 2022 at 04:34:05PM +0800, Haochen Jiang wrote:
> > * config/s390/s390.cc (s390_expand_cpymem): Generate fourth
> parameter for
> 
> (Many too long lines here, this is the first one.  Changelog lines are
> max. 80 positions; a tab is eight).

I will change that in the next patch.

> 
> > +  /* Argument 3 must be either zero or one.  */
> > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > +{
> > +  warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > +   " using one");
> 
> "using 1" makes sense maybe, but "using one" reads as "using an
> argument", not very sane.
> 
> An error would be better here anyway?

I will change it to 1 to avoid that confusion. This is a warning because the
existing checks for out-of-range constant arguments to prefetch also use
warnings:

  /* Argument 2 must be 0, 1, 2, or 3.  */
  if (INTVAL (op2) < 0 || INTVAL (op2) > 3)
    {
      warning (0, "invalid third argument to %<__builtin_prefetch%>; using zero");
      op2 = const0_rtx;
    }

Therefore I used a warning to stay consistent with them.
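
(A minimal sketch for readers of this thread, not part of the patch: it shows
the existing three-argument form __builtin_prefetch (addr, rw, locality) as
documented for GCC, where rw and locality must be compile-time constants in
the ranges checked above. The fourth argument proposed here is deliberately
not used, since it is not part of released GCC. The names sum_with_prefetch
and PREFETCH_DISTANCE are illustrative assumptions.)

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative tuning value (an assumption, not taken from the patch).  */
#define PREFETCH_DISTANCE 16

/* Sum an array while hinting the cache about an element a few
   iterations ahead.  */
long
sum_with_prefetch (const long *data, size_t n)
{
  assert (data != NULL || n == 0);

  long total = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (i + PREFETCH_DISTANCE < n)
        /* rw = 0 (read), locality = 3 (high temporal locality).  */
        __builtin_prefetch (&data[i + PREFETCH_DISTANCE], 0, 3);
      total += data[i];
    }
  return total;
}
```

Passing a non-constant or out-of-range rw/locality value here is what
triggers the diagnostics quoted above.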

> 
> > --- a/gcc/config/rs6000/rs6000.md
> > +++ b/gcc/config/rs6000/rs6000.md
> > @@ -14060,10 +14060,25 @@
> >DONE;
> >  })
> >
> > -(define_insn "prefetch"
> > +(define_expand "prefetch"
> > +  [(prefetch (match_operand 0 "indexed_or_indirect_address")
> > +(match_operand:SI 1 "const_int_operand")
> > +(match_operand:SI 2 "const_int_operand")
> > +(match_operand:SI 3 "const_int_operand"))]
> > +  ""
> > +{
> > +  if (INTVAL (operands[3]) == 0)
> > +  {
> 
> Broken indentation.

I will fix that in the updated patch.

> 
> > +warning (0, "instruction prefetch is not supported; using data prefetch");
> 
> Please use a separate pattern for this, and leave prefetch to mean data
> prefetch, as documented!  Documentation you didn't change btw.  Call the
> new one instruction_prefetch or something equally boring maybe :-)
> 

Actually, I did change the documentation for prefetch, but it got buried in
the patch (sorry about that).

In gcc/doc/rtl.texi

-@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality})
+@item (prefetch:@var{m} @var{addr} @var{rw} @var{locality} @var{cache})
 
+Operand @var{cache} is 1 if the prefetch is prefetching data, 0 for prefetching
+instruction;
+targets that do not support instruction prefetch should treat all as data
+prefetch.
 
As for implementing instruction prefetch as a separate pattern, I did consider
that approach. I chose the way the patch currently works for the following
reasons.

1. We already use operands to indicate r/w and locality in prefetch, and this
case is quite similar. Since the pattern is already there, I prefer reusing it.

2. It will be more natural for developers to extend their prefetch support in
the future.

If anyone has other thoughts, further discussion is welcome.

> When you send an updated patch, please split it up better?  Generic
> changes and documentation in one patch, target changes in a separate
> patch or patches, and testsuite is distinct as well.  It isn't nice to
> have to scroll through thousands of lines to see if there is anything
> relevant to you.

Really sorry about that. Hongtao has explained why the patch is arranged this
way, and I will split the testcases into a separate patch.

Also, if the testsuite changes in this patch are reduced to the minimum,
the patch will be much smaller than the current one.

BRs,
Haochen

> 
> Thanks,
> 
> 
> Segher

Re: [PATCH] Enable shrink wrapping for the RISC-V target.

2022-10-20 Thread Manolis Tsamis

On Tue, Oct 18, 2022 at 8:35 PM Palmer Dabbelt  wrote:
>
> On Tue, 18 Oct 2022 08:57:37 PDT (-0700), j...@ventanamicro.com wrote:
> >
> > Just a couple more comments in-line.
> >
> > On 10/18/22 09:18, Manolis Tsamis wrote:
> >>
>  +/* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS.  */
>  +
>  +static sbitmap
>  +riscv_get_separate_components (void)
>  +{
>  +  HOST_WIDE_INT offset;
>  +  sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
>  +  bitmap_clear (components);
>  +
>  +  if (riscv_use_save_libcall (&cfun->machine->frame)
>  +  || cfun->machine->interrupt_handler_p)
> >>> riscv_use_save_libcall() already checks interrupt_handler_p, so that's
> >>> redundant.  That said, I'm not sure riscv_use_save_libcall() is the
> >>> right check here as unless I'm missing something we don't have all those
> >>> other constraints when shrink-wrapping.
> >>>
> >> riscv_use_save_libcall returns false when interrupt_handler_p is true, so the
> >> check for interrupt_handler_p in the branch is not redundant in this case.
> >>
> >> I encountered some issues when shrink wrapping and libcall was used in the same
> >> function. Thinking that libcall replaces the prologue/epilogue I didn't see a
> >> reason to have both at the same time and hence I opted to disable
> >> shrink wrapping in that case. From my understanding this should be harmless?
> >
> > I would have expected things to work fine with libcalls, perhaps with
> > the exception of the save/restore libcalls.  So that needs deeper
> > investigation.
>
> The save/restore libcalls only support saving/restoring a handful of
> register configurations (just the saved X registers in the order they're
> usually saved in by GCC).  It should be OK for correctness to over-save
> registers, but it kind of just un-does the shrink wrapping so not sure
> it's worth worrying about at that point.
>
> There's also some oddness around the save/restore libcall ABI, it's not
> the standard function ABI but instead a GCC-internal one.  IIRC it just
> uses the alternate link register (ie, t0 instead of ra) but I may have
> forgotten something else.
>
> >>> It seems kind of clunky to have two copies of all these loops (and we'll
> >>> need a third to make this work with the V stuff), but we've got that
> >>> issue elsewhere in the port so I don't think you need to fix it here
> >>> (though the V stuff will be there for the v2, so you'll need the third
> >>> copy of each loop).
> >>>
> >> Indeed, I was following the other ports here. Do you think it would be
> >> better to refactor this when the code for the V extension is added?
> >> By taking into account what code will be needed for V, a proper refactored
> >> function could be made to handle all cases.
> >
> > I think refactoring when V gets added would be fine.  While we could
> > probably refactor it correctly now (it isn't terribly complex code after
> > all), but we're more likely to get it right with the least amount of
> > work if we do it when V is submitted.
>
> Some of the V register blocks are already there, but ya I agree we can
> just wait.  There's going to be a bunch of V-related churn for a bit,
> juggling those patches is already enough of a headache ;)
>
> >>> Either way, this deserves a test case.  I think it should be possible to
> >>> write one by introducing some register pressure around a
> >>> shrink-wrappable block that needs a long stack offset and making sure
> >>> in-flight registers don't get trashed.
> >>>
> >> I tried to think of some way to introduce a test like that but couldn't and
> >> I don't see how it would be done. Shrink wrapping only affects saved registers
> >> so there are always available temporaries that are not affected by
> >> shrink wrapping.
> >> (Register pressure should be irrelevant in this case if I understand correctly).
> >> Also, since the implementation checks SMALL_OPERAND (offset), shrink wrapping
> >> should be unaffected by long stack offsets. If you see some way to write
> >> a test for that based on what I explained please explain how I could do that.
> >
> > I think the register pressure was just to ensure that some saves were
> > needed to trigger an attempt to shrink wrap something.  You'd also need
> > something to eat stack space (local array which gets referenced as an
> > asm operand, but where the asm doesn't generate any code perhaps)?
> > Whether or not that works depends on stack layout though which I don't
> > know well enough for riscv.
>
> Sorry for being a bit vague, but it's because I always find it takes a
> bit of time to write up tests like this.  I think something like this
> might do it, but that almost certainly won't work as-is:
>
> // Some extern bits to try and trip up the optimizer.
> extern long helper(long *sa, long a, long b, long c, ...);
> extern long glob_array[1024];
>
> // The function takes a bunch of arguments to fi

Re: [PATCH] Enable shrink wrapping for the RISC-V target.

2022-10-20 Thread Manolis Tsamis

On Wed, Oct 19, 2022 at 8:16 PM Jeff Law via Gcc-patches
 wrote:
>
>
> On 10/18/22 11:35, Palmer Dabbelt wrote:
> >
> >> I would have expected things to work fine with libcalls, perhaps with
> >> the exception of the save/restore libcalls.  So that needs deeper
> >> investigation.
> >
> > The save/restore libcalls only support saving/restoring a handful of
> > register configurations (just the saved X registers in the order
> > they're usually saved in by GCC).  It should be OK for correctness to
> > over-save registers, but it kind of just un-does the shrink wrapping
> > so not sure it's worth worrying about at that point.
> >
> > There's also some oddness around the save/restore libcall ABI, it's
> > not the standard function ABI but instead a GCC-internal one.  IIRC it
> > just uses the alternate link register (ie, t0 instead of ra) but I may
> > have forgotten something else.
>
> I hadn't really dug into it -- I was pretty sure they weren't following
> the standard ABI based on its name and how I've used similar routines to
> save space on some targets in the past.  So if we're having problems
> with shrink-wrapping and libcalls, those two might be worth investigating.
>
>
> But I think the most important takeaway is that shrink wrapping should
> work with libcalls, there's nothing radically different about libcalls
> that would make them inherently interact poorly with shrink-wrapping.
> So that aspect of the shrink-wrapping patch needs deeper investigation.
>

Based on the feedback from both of you that shrink wrapping and libcall
save/restore should work fine, I'll try to lift that restriction again and
investigate further what happens.

Manolis

> Jeff

Re: [COMMITTED] PR c++/106654 - Add assume support to VRP.

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Wed, Oct 19, 2022 at 08:37:57PM -0400, Andrew MacLeod wrote:
> This patch adds basic support for ASSUME functions to VRP.
> 
> Based on the previous set of patches, I've cleaned them up, and this provides
> the basic support from ranger's generalized model. It does not support
> non-ssa name parameters, I think you might be on your own for that.
> 
> I modified Jakub's assumption pass to use GORI to query parameter ranges in
> assumption functions and set the global range for those, and then ranger's
> infer infrastructure is used to inject these ranges at assume call
> locations in VRP.
> 
> I also added an optimization testcase that tests the basic functionality in
> VRP2.  For instance it can reduce:
> 
> int
> f2 (int x, int y, int z)
> {
>   [[assume (x+12 == 14 && y >= 0 && y + 10 < 13 && z + 4 >= 4 && z - 2 < 18)]];
>   unsigned q = x + y + z;
>   if (q*2 > 46)
>     return 0;
>   return 1;
> }
> 
> to:
> 
> return 1;
> 
> 
> It's good to get us going, but I think there's still lots of room for
> improvement.
> 
> Bootstraps on x86_64-pc-linux-gnu with no regressions.  Pushed.

Thanks.

Jakub
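
(A worked check of the f2 example, added for readers and not part of the
committed patch: the assumption pins x to 2, y to [0, 2] and z to [0, 19],
so q = x + y + z is at most 23 and q*2 is at most 46, which is why the
comparison q*2 > 46 can never be true. The sketch below brute-forces that
reasoning in plain C without [[assume]]; the search window of -64..64 and
the names assumed and count_assumed_triples are arbitrary assumptions.)

```c
#include <assert.h>

/* The condition from the [[assume (...)]] in f2, as an ordinary predicate.  */
int
assumed (int x, int y, int z)
{
  return x + 12 == 14 && y >= 0 && y + 10 < 13 && z + 4 >= 4 && z - 2 < 18;
}

/* Visit every triple in a small window, assert that each one satisfying
   the assumption keeps q*2 within 46, and return how many satisfy it.  */
int
count_assumed_triples (void)
{
  int count = 0;
  for (int x = -64; x <= 64; x++)
    for (int y = -64; y <= 64; y++)
      for (int z = -64; z <= 64; z++)
        if (assumed (x, y, z))
          {
            unsigned q = x + y + z;
            assert (q * 2 <= 46);
            count++;
          }
  return count;
}
```

The count is 1 * 3 * 20 = 60 triples, matching the ranges derived above.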

GCC 13.0.0 Status Report (2022-10-20), Stage 1 ends Nov 13th

2022-10-20 Thread Richard Biener via Gcc-patches

Status
======

The GCC development branch which will become GCC 13 is open for
general development (Stage 1).  Stage 1 will end at the end of
November 13th after which we will accept no new features that
have not yet been submitted.  Starting with November 14th we
are in a two month general bugfixing period (Stage 3).

I have gone over the set of unprioritized regression bugs that are in
confirmed state; please help by updating regressions that are still
UNCONFIRMED and consider fixing bugs that are in your area of
interest.  Please make sure to finish and submit features you
want to see included in GCC 13 in a timely manner and actively look for
reviewers.


Quality Data
============

Priority          #   Change from last report
--------        ---   -----------------------
P1               33   +  33
P2              473   + 102
P3               84   +   7
P4              247   -   6
P5               25   -   1
--------        ---   -----------------------
Total P1-P3     590   + 142
Total           862   + 135


Previous Report
===============

https://gcc.gnu.org/pipermail/gcc/2022-April/238619.html

Re: [PATCH 0/6] Add Intel Sierra Forest Instructions

2022-10-20 Thread Hongtao Liu via Gcc-patches

On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu  wrote:
>
> On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe  wrote:
> >
> > Hi Hongtao
> >
> > > On 17 Oct 2022, at 02:56, Hongtao Liu  wrote:
> > >
> > > On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
> > >  wrote:
> > >>
> > >> On 17 October 2022 03:02:22 CEST, Hongtao Liu via Gcc-patches
> > >>
> > >> Do you have this series as a branch somewhere that I can try on one 
> > >> of the
> > >> like affected platforms?
> > >
> > > Not yet.
> > > Do we have any external place to put those patches so folks from the
> > > community can validate before it's committed, HJ?
> > >>
> > >>
> > >> https://gcc.gnu.org/gitwrite.html#vendor
> > >>
> > >> Not sure where in cgit the user branches are visible, though? But they 
> > >> can certainly be cloned and worked with.
> > > Thanks for the reminder, I've pushed to remotes/vendors/ix86/ise046.
> > > * [new ref] refs/vendors/ix86/heads/ise046 ->
> > > vendors/ix86/ise046
> >
> > thanks for pushing this branch, much better to test these things before 
> > committing rather than a panic
> > to fix after…
> >
> >
> > with
> > f90df941532 (HEAD -> ise046, vendors/ix86/ise046) Add m_CORE_ATOM for atom 
> > cores
> >
> >  - on x86_64 Darwin19  I get the following bootstrap fail:
> >
> > In file included from 
> > /src-local/gcc-master/gcc/config/i386/driver-i386.cc:31:
> > /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h: In function ‘const 
> > char* get_intel_cpu(__processor_model*, __processor_model2*, unsigned 
> > int*)’:
> > /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:532:32: error: this 
> > statement may fall through [-Werror=implicit-fallthrough=]
> >   532 |   cpu_model->__cpu_subtype = INTEL_COREI7_GRANITERAPIDS;
> >   |   ~^~~~
> > /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:533:5: note: here
> >   533 | case 0xb6:
> >   | ^~~~
> > cc1plus: all warnings being treated as errors
> >
> > 
> > Will try to look later, if that does not immediately ring some bell.
> This should be a bug, thanks!
I've updated the branch, please try that.
> > thanks
> > Iain
> >
>
>
> --
> BR,
> Hongtao



-- 
BR,
Hongtao

Re: [PATCH 0/6] Add Intel Sierra Forest Instructions

2022-10-20 Thread Iain Sandoe via Gcc-patches




> On 20 Oct 2022, at 10:09, Hongtao Liu via Gcc-patches 
>  wrote:
> 
> On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu  wrote:
>> 
>> On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe  wrote:
>>> 
>>> Hi Hongtao
>>> 
 On 17 Oct 2022, at 02:56, Hongtao Liu  wrote:
 
 On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
  wrote:
> 
> On 17 October 2022 03:02:22 CEST, Hongtao Liu via Gcc-patches
> 
> Do you have this series as a branch somewhere that I can try on one 
> of the
> like affected platforms?
 
 Not yet.
 Do we have any external place to put those patches so folks from the
 community can validate before it's committed, HJ?
> 
> 
> https://gcc.gnu.org/gitwrite.html#vendor
> 
> Not sure where in cgit the user branches are visible, though? But they 
> can certainly be cloned and worked with.
 Thanks for the reminder, I've pushed to remotes/vendors/ix86/ise046.
 * [new ref] refs/vendors/ix86/heads/ise046 ->
 vendors/ix86/ise046
>>> 
>>> thanks for pushing this branch, much better to test these things before 
>>> committing rather than a panic
>>> to fix after…
>>> 
>>> 
>>> with
>>> f90df941532 (HEAD -> ise046, vendors/ix86/ise046) Add m_CORE_ATOM for atom 
>>> cores
>>> 
>>> - on x86_64 Darwin19  I get the following bootstrap fail:
>>> 
>>> In file included from 
>>> /src-local/gcc-master/gcc/config/i386/driver-i386.cc:31:
>>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h: In function ‘const 
>>> char* get_intel_cpu(__processor_model*, __processor_model2*, unsigned 
>>> int*)’:
>>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:532:32: error: this 
>>> statement may fall through [-Werror=implicit-fallthrough=]
>>>  532 |   cpu_model->__cpu_subtype = INTEL_COREI7_GRANITERAPIDS;
>>>  |   ~^~~~
>>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:533:5: note: here
>>>  533 | case 0xb6:
>>>  | ^~~~
>>> cc1plus: all warnings being treated as errors
>>> 
>>> 
>>> Will try to look later, if that does not immediately ring some bell.
>> This should be a bug, thanks!
> I've updated the branch, please try that.

I had made the same fix locally (adding the “break”, right?) and testing is
ongoing.

It would not be surprising if some tests failed (asm matches for different ABIs
are rarely identical) - a few tests to be fixed in stage 3 is fine ...

... but what I wanted to avoid was the case like the bf16 changes where every
single new test fails (I have a draft patch to fix the bf16 stuff to be posted
soon).

thanks
Iain
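
(For readers who have not hit this diagnostic: a minimal sketch of the
pattern behind the bootstrap failure and the “break” fix mentioned above.
It is not the code from cpuinfo.h; the types, the subtype names and the
model value 0x01 are hypothetical, and only 0xb6 is taken from the log.)

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the real processor-model types.  */
enum demo_subtype { DEMO_UNKNOWN, DEMO_SUBTYPE_A, DEMO_SUBTYPE_B };
struct demo_cpu_model { enum demo_subtype subtype; };

void
demo_classify (unsigned int model, struct demo_cpu_model *cpu_model)
{
  assert (cpu_model != NULL);

  cpu_model->subtype = DEMO_UNKNOWN;
  switch (model)
    {
    case 0x01: /* Hypothetical model number.  */
      cpu_model->subtype = DEMO_SUBTYPE_A;
      /* Without this break, control falls into the next case, subtype A
         is overwritten, and -Wimplicit-fallthrough reports it (an error
         under -Werror, as in the bootstrap log).  */
      break;
    case 0xb6:
      cpu_model->subtype = DEMO_SUBTYPE_B;
      break;
    default:
      break;
    }
}
```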

Re: [PATCH 0/6] Add Intel Sierra Forest Instructions

2022-10-20 Thread Hongtao Liu via Gcc-patches

On Thu, Oct 20, 2022 at 5:17 PM Iain Sandoe  wrote:
>
>
>
> > On 20 Oct 2022, at 10:09, Hongtao Liu via Gcc-patches 
> >  wrote:
> >
> > On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu  wrote:
> >>
> >> On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe  
> >> wrote:
> >>>
> >>> Hi Hongtao
> >>>
>  On 17 Oct 2022, at 02:56, Hongtao Liu  wrote:
> 
>  On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
>   wrote:
> >
> > On 17 October 2022 03:02:22 CEST, Hongtao Liu via Gcc-patches
> >
> > Do you have this series as a branch somewhere that I can try on one 
> > of the
> > like affected platforms?
> 
>  Not yet.
>  Do we have any external place to put those patches so folks from the
>  community can validate before it's committed, HJ?
> >
> >
> > https://gcc.gnu.org/gitwrite.html#vendor
> >
> > Not sure where in cgit the user branches are visible, though? But they 
> > can certainly be cloned and worked with.
>  Thanks for the reminder, I've pushed to remotes/vendors/ix86/ise046.
>  * [new ref] refs/vendors/ix86/heads/ise046 ->
>  vendors/ix86/ise046
> >>>
> >>> thanks for pushing this branch, much better to test these things before 
> >>> committing rather than a panic
> >>> to fix after…
> >>>
> >>>
> >>> with
> >>> f90df941532 (HEAD -> ise046, vendors/ix86/ise046) Add m_CORE_ATOM for 
> >>> atom cores
> >>>
> >>> - on x86_64 Darwin19  I get the following bootstrap fail:
> >>>
> >>> In file included from 
> >>> /src-local/gcc-master/gcc/config/i386/driver-i386.cc:31:
> >>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h: In function 
> >>> ‘const char* get_intel_cpu(__processor_model*, __processor_model2*, 
> >>> unsigned int*)’:
> >>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:532:32: error: 
> >>> this statement may fall through [-Werror=implicit-fallthrough=]
> >>>  532 |   cpu_model->__cpu_subtype = INTEL_COREI7_GRANITERAPIDS;
> >>>  |   ~^~~~
> >>> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:533:5: note: here
> >>>  533 | case 0xb6:
> >>>  | ^~~~
> >>> cc1plus: all warnings being treated as errors
> >>>
> >>> 
> >>> Will try to look later, if that does not immediately ring some bell.
> >> This should be a bug, thanks!
> > I've updated the branch, please try that.
>
> I had made the same fix locally (adding the “break”, right?) and testing is 
> ongoing
Yes, please go ahead.
>
> it would not be surprising if some tests failed (asm matches for different 
> ABIs are rarely
> identical) - a few tests to be fixed in stage 3 is fine ...
>
> ... but what I wanted to avoid was the case like the bf16 changes where every
> single new test fails (I have a draft patch to fix the bf16 stuff to be 
> posted soon).
Understood.
>
> thanks
> Iain
>
>


-- 
BR,
Hongtao

Re: [PATCH v3, rs6000] Change mode and insn condition for VSX scalar extract/insert instructions

2022-10-20 Thread Kewen.Lin via Gcc-patches

Hi Haochen,

Sorry for the late review; some comments are inline.

on 2022/9/7 15:44, HAO CHEN GUI wrote:
> Hi,
> 
>   For scalar extract/insert instructions, exponent field can be stored in a
> 32-bit register. So this patch changes the mode of exponent field from DI to
> SI. The instructions using DI registers can be invoked with -mpowerpc64 in a
> 32-bit environment. The patch changes insn condition from TARGET_64BIT to
> TARGET_POWERPC64 for those instructions.
> 
>   This patch also changes prototypes of relevant built-ins and effective
> target of test cases.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> Is this okay for trunk? Any recommendations? Thanks a lot.
> 
> ChangeLog
> 2022-09-07  Haochen Gui  
> 
> gcc/
>   * config/rs6000/rs6000-builtins.def
>   (__builtin_vsx_scalar_extract_exp): Set return type to const unsigned
>   int.
>   (__builtin_vsx_scalar_extract_sig): Set return type to const unsigned
>   long long.
>   (__builtin_vsx_scalar_insert_exp): Set type of second argument to
>   unsigned int.
>   (__builtin_vsx_scalar_insert_exp_dp): Likewise.
>   * config/rs6000/vsx.md (xsxexpdp): Set mode of first operand to
>   SImode.  Remove TARGET_64BIT from insn condition.
>   (xsxsigdp): Change insn condition from TARGET_64BIT to TARGET_POWERPC64.
>   (xsiexpdp): Change insn condition from TARGET_64BIT to
>   TARGET_POWERPC64.  Set mode of third operand to SImode.
>   (xsiexpdpf): Set mode of third operand to SImode.  Remove TARGET_64BIT
>   from insn condition.
> 
> gcc/testsuite/
>   * gcc.target/powerpc/bfp/scalar-extract-exp-0.c: Change effective
>   target from lp64 to has_arch_ppc64.
>   * gcc.target/powerpc/bfp/scalar-extract-exp-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-extract-sig-6.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-0.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-12.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-13.c: Likewise.
>   * gcc.target/powerpc/bfp/scalar-insert-exp-3.c: Likewise.
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000-builtins.def 
> b/gcc/config/rs6000/rs6000-builtins.def
> index f76f54793d7..ca2a1d7657e 100644
> --- a/gcc/config/rs6000/rs6000-builtins.def
> +++ b/gcc/config/rs6000/rs6000-builtins.def
> @@ -2847,17 +2847,17 @@
>pure vsc __builtin_vsx_lxvl (const void *, signed long);
>  LXVL lxvl {}
> 
> -  const signed long __builtin_vsx_scalar_extract_exp (double);
> +  const unsigned int __builtin_vsx_scalar_extract_exp (double);
>  VSEEDP xsxexpdp {}
> 

With the relevant define_insn condition change and this prototype
change, I think this bif can work in a 32-bit environment.  So should
it be moved to section [power9] instead of [power9-64]?

If we want this supported on 32-bit, the related documentation
and test cases need to be updated accordingly.

For the documentation, see for example "The *scalar_extract_exp* and
scalar_extract_sig functions require *a 64-bit environment*
supporting ISA 3.0 " in [1].

For the test cases, please see the separate comments in the test case
part below.

[1] 
https://gcc.gnu.org/onlinedocs//gcc/PowerPC-AltiVec-Built-in-Functions-Available-on-ISA-3_002e0.html

The above comments also apply to the bif
__builtin_vsx_scalar_insert_exp_dp.
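
(A small illustration of the patch's premise that the exponent fits in a
32-bit value: an IEEE 754 double has an 11-bit biased exponent, so
unsigned int is wide enough. The sketch below is a portable stand-in that
reads the field with memcpy; it is not the builtin and does not use the
xsxexpdp instruction. It assumes double is IEEE 754 binary64, and the name
biased_exponent is hypothetical.)

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return the 11-bit biased exponent field of a double.  */
unsigned int
biased_exponent (double d)
{
  uint64_t bits;
  assert (sizeof bits == sizeof d);

  /* memcpy avoids strict-aliasing problems when reading the raw bits.  */
  memcpy (&bits, &d, sizeof bits);

  /* Bits 62..52 hold the exponent; the mask keeps only those 11 bits.  */
  return (unsigned int) ((bits >> 52) & 0x7ff);
}
```

For example, 1.0 has biased exponent 1023 and 2.0 has 1024, both far below
the 32-bit limit.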


> -  const signed long __builtin_vsx_scalar_extract_sig (double);
> +  const unsigned long long __builtin_vsx_scalar_extract_sig (double);
>  VSESDP xsxsigdp {}
> 
>const double __builtin_vsx_scalar_insert_exp (unsigned long long, \
> -unsigned long long);
> + unsigned int);
>  VSIEDP xsiexpdp {}
> 
> -  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned long 
> long);
> +  const double __builtin_vsx_scalar_insert_exp_dp (double, unsigned int);
>  VSIEDPF xsiexpdpf {}
> 
>pure vsc __builtin_vsx_xl_len_r (void *, signed long);
> diff --git a/gcc/config/rs6000/vsx.md b/gcc/config/rs6000/vsx.md
> index e226a93bbe5..9d3a2340a79 100644
> --- a/gcc/config/rs6000/vsx.md
> +++ b/gcc/config/rs6000/vsx.md
> @@ -5095,10 +5095,10 @@ (define_insn "xsxexpqp_"
> 
>  ;; VSX Scalar Extract Exponent Double-Precision
>  (define_insn "xsxexpdp"
> -  [(set (match_operand:DI 0 "register_operand" "=r")
> - (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
> +  [(set (match_operand:SI 0 "register_operand" "=r")
> + (unspec:SI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UNSPEC_VSX_SXEXPDP))]
> -  "TARGET_P9_VECTOR && TARGET_64BIT"
> +  "TARGET_P9_VECTOR"
>"xsxexpdp %0,%x1"
>[(set_attr "type" "integer")])
> 
> @@ -5116,7 +5116,7 @@ (define_insn "xsxsigdp"
>[(set (match_operand:DI 0 "register_operand" "=r")
>   (unspec:DI [(match_operand:DF 1 "vsx_register_operand" "wa")]
>UN

[PATCH] c/107305 - avoid ICEing with invalid GIMPLE input to the GIMPLE FE

2022-10-20 Thread Richard Biener via Gcc-patches

The GIMPLE FE was designed to defer semantic error checking to the
GIMPLE IL verifier.  But that can end up causing spurious ICEs
earlier and in fact it will report an internal error.  The following
tries to improve the situation by explicitly calling into the
verifier from the parser and instructing it to not ICE but instead
zap the parsed body after an error is discovered.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR c/107305
PR c/107306
gcc/c/
* gimple-parser.cc (c_parser_parse_gimple_body): Verify
the parsed IL and zap the body on error.

gcc/
* tree-cfg.h (verify_gimple_in_seq): Add parameter to
indicate whether to emit an ICE.  Add return value.
(verify_gimple_in_cfg): Likewise.
* tree-cfg.cc (verify_gimple_in_seq): Likewise.
(verify_gimple_in_cfg): Likewise.

gcc/testsuite/
* gcc.dg/gimplefe-error-15.c: New testcase.
---
 gcc/c/gimple-parser.cc                   | 10 ++++++++++
 gcc/testsuite/gcc.dg/gimplefe-error-15.c | 13 +++++++++++++
 gcc/tree-cfg.cc                          | 16 ++++++++++------
 gcc/tree-cfg.h                           |  4 ++--
 4 files changed, 35 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/gimplefe-error-15.c

diff --git a/gcc/c/gimple-parser.cc b/gcc/c/gimple-parser.cc
index 5a2da2cfa0e..18ed4d4236d 100644
--- a/gcc/c/gimple-parser.cc
+++ b/gcc/c/gimple-parser.cc
@@ -364,6 +364,16 @@ c_parser_parse_gimple_body (c_parser *cparser, char *gimple_pass,
   cgraph_node::get_create (cfun->decl);
   cgraph_edge::rebuild_edges ();
 }
+
+  /* Perform IL validation and if any error is found abort compilation
+ of this function by zapping its body.  */
+  if ((cfun->curr_properties & PROP_cfg)
+  && verify_gimple_in_cfg (cfun, false, false))
+init_empty_tree_cfg ();
+  else if (!(cfun->curr_properties & PROP_cfg)
+  && verify_gimple_in_seq (gimple_body (current_function_decl), false))
+gimple_set_body (current_function_decl, NULL);
+
   dump_function (TDI_gimple, current_function_decl);
 }
 
diff --git a/gcc/testsuite/gcc.dg/gimplefe-error-15.c b/gcc/testsuite/gcc.dg/gimplefe-error-15.c
new file mode 100644
index 000..066cd845d31
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/gimplefe-error-15.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-fgimple" } */
+
+unsigned a;
+static double *d;
+static _Bool b;
+__GIMPLE int
+foo (int n)
+{
+  b = __builtin_add_overflow (n, *d, &a);
+} /* { dg-error "invalid argument" } */
+
+/* { dg-message "" "" { target *-*-* } 0 } */
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index 9b2c0f6956c..d982988048f 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -5300,13 +5300,15 @@ verify_gimple_transaction (gtransaction *stmt)
 
 /* Verify the GIMPLE statements inside the statement list STMTS.  */
 
-DEBUG_FUNCTION void
-verify_gimple_in_seq (gimple_seq stmts)
+DEBUG_FUNCTION bool
+verify_gimple_in_seq (gimple_seq stmts, bool ice)
 {
   timevar_push (TV_TREE_STMT_VERIFY);
-  if (verify_gimple_in_seq_2 (stmts))
+  bool res = verify_gimple_in_seq_2 (stmts);
+  if (res && ice)
 internal_error ("%<verify_gimple%> failed");
   timevar_pop (TV_TREE_STMT_VERIFY);
+  return res;
 }
 
 /* Return true when the T can be shared.  */
@@ -5496,8 +5498,8 @@ collect_subblocks (hash_set<tree> *blocks, tree block)
 
 /* Verify the GIMPLE statements in the CFG of FN.  */
 
-DEBUG_FUNCTION void
-verify_gimple_in_cfg (struct function *fn, bool verify_nothrow)
+DEBUG_FUNCTION bool
+verify_gimple_in_cfg (struct function *fn, bool verify_nothrow, bool ice)
 {
   basic_block bb;
   bool err = false;
@@ -5652,11 +5654,13 @@ verify_gimple_in_cfg (struct function *fn, bool verify_nothrow)
 eh_table->traverse<hash_set<gimple *> *, verify_eh_throw_stmt_node>
   (&visited_throwing_stmts);
 
-  if (err || eh_error_found)
+  if (ice && (err || eh_error_found))
 internal_error ("verify_gimple failed");
 
   verify_histograms ();
   timevar_pop (TV_TREE_STMT_VERIFY);
+
+  return (err || eh_error_found);
 }
 
 
diff --git a/gcc/tree-cfg.h b/gcc/tree-cfg.h
index 95ec93e3a91..8c22c3dbbe3 100644
--- a/gcc/tree-cfg.h
+++ b/gcc/tree-cfg.h
@@ -63,8 +63,8 @@ extern gphi *get_virtual_phi (basic_block);
 extern gimple *first_stmt (basic_block);
 extern gimple *last_stmt (basic_block);
 extern gimple *last_and_only_stmt (basic_block);
-extern void verify_gimple_in_seq (gimple_seq);
-extern void verify_gimple_in_cfg (struct function *, bool);
+extern bool verify_gimple_in_seq (gimple_seq, bool = true);
+extern bool verify_gimple_in_cfg (struct function *, bool, bool = true);
 extern tree gimple_block_label (basic_block);
 extern void add_phi_args_after_copy_bb (basic_block);
 extern void add_phi_args_after_copy (basic_block *, unsigned, edge);
-- 
2.35.3
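
(A reader's sketch of the pattern this patch introduces, reduced to plain C:
the verifier reports problems, returns whether it found any, and only aborts
when the caller asks for that, so a front end can discard the bad body
instead of crashing. This is not GCC code. The names demo_verify and
demo_parse_body, and the rule that negative values are invalid, are
hypothetical stand-ins; C has no default arguments, so the flag is always
passed explicitly.)

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Check N "statements"; a negative value stands in for invalid IL.
   Returns true if an error was found.  Aborts only when ICE is true,
   mirroring the internal_error call guarded by the new parameter.  */
bool
demo_verify (const int *stmts, size_t n, bool ice)
{
  assert (stmts != NULL || n == 0);

  bool err = false;
  for (size_t i = 0; i < n; i++)
    if (stmts[i] < 0)
      {
        fprintf (stderr, "invalid statement at index %zu\n", i);
        err = true;
      }

  if (err && ice)
    abort ();
  return err;
}

/* Parser side: verify without aborting and zap the body on error.
   Returns true if the body was kept.  */
bool
demo_parse_body (const int *stmts, size_t *n)
{
  if (demo_verify (stmts, *n, false))
    {
      *n = 0; /* Discard the invalid body.  */
      return false;
    }
  return true;
}
```

Existing callers that want the old behaviour pass true for the flag, which
is what the defaulted parameters in tree-cfg.h give them in the real patch.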

Re: [PATCH] rs6000: using li/lis+oris/xoris to build constants

2022-10-20 Thread Kewen.Lin via Gcc-patches

Hi Jeff,

Sorry for the late review; some comments are inline.

on 2022/8/24 16:13, Jiufu Guo via Gcc-patches wrote:
> Hi,
> 
> PR106708 contains some constants which can be supported by li/lis + oris/xoris.
> 
> For constant C:
> if ((c & 0x80008000ULL) == 0x8000ULL) or say:
> 32(0)+1(1)+15(x)+1(0)+15(x), we could use li+oris to build constant 'C'.
> 
> if ((c & 0x8000ULL) == 0x8000ULL) or say:
> 32(1)+16(x)+1(1)+15(x), using li+xoris would be ok.
> 
> if ((c & 0xULL) == 0x) or say:
> 32(1)+1(0)+15(x)+16(0), using lis+xoris would be ok.
> 

Maybe it's good to add some explanation of the proposed notation "N(M)":
N contiguous bits of value M (x for M means either 1 or 0).  I'm not sure
if it's good to use "||" for concatenation just like the ISA does; the
downside is that it can be misinterpreted as logical "or".

Or maybe just expand all of the low 32 bits and use "1..." or "0..." for the
high 32 bits.
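
(A worked sketch of the first two forms for readers, not code from the
patch. The hex masks in the archived message appear truncated, so the
predicates below are derived from the N(M) bit-layout strings instead:
32(0)+1(1)+15(x)+1(0)+15(x) for li+oris and 32(1)+16(x)+1(1)+15(x) for
li+xoris. The functions li, oris and xoris only model the instruction
semantics on a 64-bit value, and the xoris immediate, the high halfword
XORed with 0xffff, is my own derivation; all names are hypothetical.)

```c
#include <assert.h>
#include <stdint.h>

/* li: load a sign-extended 16-bit immediate into a 64-bit register.  */
uint64_t
li (uint16_t imm)
{
  return (imm & 0x8000) ? (0xffffffffffff0000ULL | imm) : imm;
}

/* oris / xoris: OR / XOR with a 16-bit immediate shifted left by 16.  */
uint64_t
oris (uint64_t reg, uint16_t imm)
{
  return reg | ((uint64_t) imm << 16);
}

uint64_t
xoris (uint64_t reg, uint16_t imm)
{
  return reg ^ ((uint64_t) imm << 16);
}

/* 32(0)+1(1)+15(x)+1(0)+15(x): high word zero, bit 31 set, bit 15 clear.  */
int
matches_li_oris (uint64_t c)
{
  return (c >> 32) == 0 && (c & 0x80000000ULL) != 0 && (c & 0x8000ULL) == 0;
}

/* 32(1)+16(x)+1(1)+15(x): high word all ones, bit 15 set.  */
int
matches_li_xoris (uint64_t c)
{
  return (c >> 32) == 0xffffffffULL && (c & 0x8000ULL) != 0;
}

uint64_t
build_li_oris (uint64_t c)
{
  assert (matches_li_oris (c));
  /* Bit 15 is clear, so li leaves the upper bits zero for oris to fill.  */
  return oris (li (c & 0xffff), (c >> 16) & 0xffff);
}

uint64_t
build_li_xoris (uint64_t c)
{
  assert (matches_li_xoris (c));
  /* li sets bits 63..16 to one; xoris flips bits 31..16 where C has zeros.  */
  return xoris (li (c & 0xffff), ((c >> 16) & 0xffff) ^ 0xffff);
}
```

For instance, 0x80001234 is built as li 0x1234 then oris 0x8000, and
0xFFFFFFFF12348765 as li 0x8765 (giving 0xFFFFFFFFFFFF8765) then
xoris 0xEDCB.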

> This patch update rs6000_emit_set_long_const to support these forms.
> Bootstrap and regtest pass on ppc64 and ppc64le.
> 
> Is this ok for trunk?
> 
> BR,
> Jeff(Jiufu)
> 
> 
>   PR target/106708
> 
> gcc/ChangeLog:
> 
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Using li/lis +
>   oris/xoris to build constants.

Nit: Support constants which can be built with li + oris or li/lis + xoris?

> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/powerpc/pr106708.c: New test.
>   * gcc.target/powerpc/pr106708.h: New file.
>   * gcc.target/powerpc/pr106708_1.c: New test.
> 
> ---
>  gcc/config/rs6000/rs6000.cc   | 22 +++
>  gcc/testsuite/gcc.target/powerpc/pr106708.c   | 10 +
>  gcc/testsuite/gcc.target/powerpc/pr106708.h   |  9 
>  gcc/testsuite/gcc.target/powerpc/pr106708_1.c | 17 ++
>  4 files changed, 58 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106708.c
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106708.h
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/pr106708_1.c
> 
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index df491bee2ea..243247fb838 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10112,6 +10112,7 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>  {
>rtx temp;
>HOST_WIDE_INT ud1, ud2, ud3, ud4;
> +  HOST_WIDE_INT orig_c = c;
>  
>ud1 = c & 0x;
>c = c >> 16;
> @@ -10137,6 +10138,27 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>   gen_rtx_IOR (DImode, copy_rtx (temp),
>GEN_INT (ud1)));
>  }
> +  else if (ud4 == 0 && ud3 == 0 && (ud2 & 0x8000) && !(ud1 & 0x8000))
> +{
> +  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> +
> +  /* li+oris */
> +  emit_move_insn (copy_rtx (temp), GEN_INT (ud1));

Nit: in previous discussion on some other patch, copy_rtx is not necessary?

> +  emit_move_insn (dest, gen_rtx_IOR (DImode, copy_rtx (temp),
> +  GEN_INT (ud2 << 16)));
> +}

I think the hunk above can be moved into the existing "(ud3 == 0 && ud4 == 0)"
handling branch (see the diff context below); ud2 & 0x8000 is already
asserted there, so it also saves a check.

> +  else if ((ud4 == 0x && ud3 == 0x)
> +&& ((ud1 & 0x8000) || (ud1 == 0 && !(ud2 & 0x8000
> +{
> +  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> +
> +  HOST_WIDE_INT imm = (ud1 & 0x8000) ? ((ud1 ^ 0x8000) - 0x8000)
> +  : ((ud2 << 16) - 0x8000);
> +  /* li/lis + xoris */
> +  emit_move_insn (copy_rtx (temp), GEN_INT (imm));
> +  emit_move_insn (dest, gen_rtx_XOR (DImode, copy_rtx (temp),
> +  GEN_INT (orig_c ^ imm)));
> +}

Same comment for copy_rtx.

>else if (ud3 == 0 && ud4 == 0)
>  {
>temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c b/gcc/testsuite/gcc.target/powerpc/pr106708.c
> new file mode 100644
> index 000..6445fa47747
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
> @@ -0,0 +1,10 @@
> +/* PR target/106708 */
> +/* { dg-options "-O2 -mdejagnu-cpu=power8" } */
> +/* { dg-do compile { target has_arch_ppc64 } } */
> +

Put dg-do on the first line; if you want has_arch_ppc64 to come after dg-options,
split it out into a dg-require-effective-target.

> +#include "pr106708.h"
> +
> +/* { dg-final { scan-assembler-times {\mli\M} 2 } } */
> +/* { dg-final { scan-assembler-times {\mlis\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\moris\M} 1 } } */
> +/* { dg-final { scan-assembler-times {\mxoris\M} 2 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.h b/gcc/testsuite/gcc.target/powerpc/pr106708.h
> new file mode 100644
> index 000..a

Re: [PATCH][AArch64] Improve immediate expansion [PR106583]

2022-10-20 Thread Richard Sandiford via Gcc-patches

Wilco Dijkstra  writes:
> ping
>
>
>
> Hi Richard,
>
>>>> Sounds good, but could you put it before the mode version,
>>>> to avoid the forward declaration?
>>>
>>> I can swap them around but the forward declaration is still required as
>>> aarch64_check_bitmask is 5000 lines before aarch64_bitmask_imm.
>>
>> OK, how about moving them both above aarch64_check_bitmask?
>
> Sure, I've moved them as well as all related helper functions - it makes the
> diff quite large but they are all together now which makes sense. I also refactored
> aarch64_mov_imm to handle the case of a 64-bit immediate being generated
> by a 32-bit MOVZ/MOVN - this simplifies aarch64_internal_move_immediate
> and movdi patterns even further.

Can you do the aarch64_mov_imm changes as a separate patch?  It's difficult
to review the two changes folded together like this.

Thanks,
Richard

>
> Cheers,
> Wilco
>
> v3: move immediate code together and avoid forward declarations,
> further cleanups and simplifications.
>
> Improve immediate expansion of immediates which can be created from a
> bitmask immediate and 2 MOVKs.  Simplify, refactor and improve
> efficiency of bitmask checks and move immediate. Move various immediate
> handling functions together to avoid forward declarations.
> Include 32-bit MOVZ/N as valid 64-bit immediates. Add new constraint so
> the movdi pattern only needs a single alternative for move immediate.
>
> This reduces the number of 4-instruction immediates in SPECINT/FP by 10-15%.
>
> Passes bootstrap & regress, OK for commit?
>
> gcc/ChangeLog:
>
> PR target/106583
> * config/aarch64/aarch64.cc (aarch64_internal_mov_immediate):
> Add support for a bitmask immediate with 2 MOVKs.
> (aarch64_check_bitmask): New function after refactorization.
> (aarch64_replicate_bitmask_imm): Remove function, merge into...
> (aarch64_bitmask_imm): Simplify replication of small modes.
> Split function into 64-bit only version for efficiency.
> (aarch64_zeroextended_move_imm): New function.
> (aarch64_move_imm): Refactor code.
> (aarch64_uimm12_shift): Move near other immediate functions.
> (aarch64_clamp_to_uimm12_shift): Likewise.
> (aarch64_movk_shift): Likewise.
> (aarch64_replicate_bitmask_imm): Likewise.
> (aarch64_and_split_imm1): Likewise.
> (aarch64_and_split_imm2): Likewise.
> (aarch64_and_bitmask_imm): Likewise.
> (aarch64_movw_imm): Remove.
> * config/aarch64/aarch64.md (movdi_aarch64): Merge 'N' and 'M'
> constraints into single 'O'.
> (mov_aarch64): Likewise.
> * config/aarch64/aarch64-protos.h (aarch64_move_imm): Use unsigned.
> (aarch64_bitmask_imm): Likewise.
> (aarch64_uimm12_shift): Likewise.
> (aarch64_zeroextended_move_imm): New prototype.
> * config/aarch64/constraints.md: Add 'O' for 32/64-bit immediates,
> limit 'N' to 64-bit only moves.
>
> gcc/testsuite:
> PR target/106583
> * gcc.target/aarch64/pr106583.c: Add new test.
>
> ---
>
> diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
> index 3e4005c9f4ff1f999f1811c6fb0b2252878dc4ae..b82f9ba7c2bb4cffa16abbf45f87061f72015083 100644
> --- a/gcc/config/aarch64/aarch64-protos.h
> +++ b/gcc/config/aarch64/aarch64-protos.h
> @@ -755,7 +755,7 @@ void aarch64_post_cfi_startproc (void);
>  poly_int64 aarch64_initial_elimination_offset (unsigned, unsigned);
>  int aarch64_get_condition_code (rtx);
>  bool aarch64_address_valid_for_prefetch_p (rtx, bool);
> -bool aarch64_bitmask_imm (HOST_WIDE_INT val, machine_mode);
> +bool aarch64_bitmask_imm (unsigned HOST_WIDE_INT val, machine_mode);
>  unsigned HOST_WIDE_INT aarch64_and_split_imm1 (HOST_WIDE_INT val_in);
>  unsigned HOST_WIDE_INT aarch64_and_split_imm2 (HOST_WIDE_INT val_in);
>  bool aarch64_and_bitmask_imm (unsigned HOST_WIDE_INT val_in, machine_mode 
> mode);
> @@ -792,7 +792,7 @@ bool aarch64_masks_and_shift_for_bfi_p (scalar_int_mode, 
> unsigned HOST_WIDE_INT,
>  unsigned HOST_WIDE_INT,
>  unsigned HOST_WIDE_INT);
>  bool aarch64_zero_extend_const_eq (machine_mode, rtx, machine_mode, rtx);
> -bool aarch64_move_imm (HOST_WIDE_INT, machine_mode);
> +bool aarch64_move_imm (unsigned HOST_WIDE_INT, machine_mode);
>  machine_mode aarch64_sve_int_mode (machine_mode);
>  opt_machine_mode aarch64_sve_pred_mode (unsigned int);
>  machine_mode aarch64_sve_pred_mode (machine_mode);
> @@ -842,8 +842,9 @@ bool aarch64_sve_float_arith_immediate_p (rtx, bool);
>  bool aarch64_sve_float_mul_immediate_p (rtx);
>  bool aarch64_split_dimode_const_store (rtx, rtx);
>  bool aarch64_symbolic_address_p (rtx);
> -bool aarch64_uimm12_shift (HOST_WIDE_INT);
> +bool aarch64_uimm12_shift (unsigned HOST_WIDE_INT);
>  int aarch64_movk_shift (const wide_int_ref &, const wide_int_ref &);
> +bool aa

[v4 PATCH 1/4] RISC-V: Minimal support of z*inx extension.

2022-10-20 Thread jiawei

From: Jiawei 

Minimal support of the z*inx extensions, including 'zfinx', 'zdinx' and
'zhinx/zhinxmin', corresponding to 'f', 'd' and 'zfh/zfhmin'. 'zdinx' implies
'zfinx' just as 'd' implies 'f', 'zhinx' also implies 'zfinx', and all z*inx
extensions imply 'zicsr'.
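
The imply relations amount to a transitive closure over (extension, implied
extension) pairs. A minimal C sketch of that idea follows; the table mirrors
the z*inx entries this patch adds to riscv_implied_info, while the names
implied_pair, ext_set, ext_set_has, ext_set_add and expand_implied are
hypothetical, and the fixed-size set is a simplification rather than how
riscv-common.cc actually stores subsets.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct implied_pair { const char *ext; const char *implied; };

/* Same pairs as the z*inx additions in this patch.  */
static const struct implied_pair implied_info[] = {
  {"zdinx", "zfinx"},
  {"zfinx", "zicsr"},
  {"zdinx", "zicsr"},
  {"zhinx", "zhinxmin"},
  {"zhinxmin", "zfinx"},
  {NULL, NULL}
};

#define MAX_EXTS 16

struct ext_set { const char *names[MAX_EXTS]; size_t count; };

static bool ext_set_has (const struct ext_set *s, const char *name)
{
  for (size_t i = 0; i < s->count; i++)
    if (strcmp (s->names[i], name) == 0)
      return true;
  return false;
}

static void ext_set_add (struct ext_set *s, const char *name)
{
  if (!ext_set_has (s, name) && s->count < MAX_EXTS)
    s->names[s->count++] = name;
}

/* Add NAME and everything it implies.  New names are appended to the
   list being walked, so one pass over the growing list reaches the
   fixed point.  */
static void expand_implied (struct ext_set *s, const char *name)
{
  ext_set_add (s, name);
  for (size_t i = 0; i < s->count; i++)
    for (const struct implied_pair *p = implied_info; p->ext; p++)
      if (strcmp (p->ext, s->names[i]) == 0)
        ext_set_add (s, p->implied);
}
```

With this table, expanding "zhinx" yields zhinx, zhinxmin, zfinx and zicsr,
which matches the chain described in the commit message.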

Co-Authored-By: Sinan Lin.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: New extensions.
* config/riscv/arch-canonicalize: New imply relations.
* config/riscv/riscv-opts.h (MASK_ZFINX): New mask.
(MASK_ZDINX): Ditto.
(MASK_ZHINX): Ditto.
(MASK_ZHINXMIN): Ditto.
(TARGET_ZFINX): New target.
(TARGET_ZDINX): Ditto.
(TARGET_ZHINX): Ditto.
(TARGET_ZHINXMIN): Ditto.
* config/riscv/riscv.opt: New target variable.

---
 gcc/common/config/riscv/riscv-common.cc | 18 ++
 gcc/config/riscv/arch-canonicalize  |  5 +
 gcc/config/riscv/riscv-opts.h   | 10 ++
 gcc/config/riscv/riscv.opt  |  3 +++
 4 files changed, 36 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index c39ed2e2696..55f3328df7a 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -51,6 +51,11 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"d", "f"},
   {"f", "zicsr"},
   {"d", "zicsr"},
+
+  {"zdinx", "zfinx"},
+  {"zfinx", "zicsr"},
+  {"zdinx", "zicsr"},
+
   {"zk", "zkn"},
   {"zk", "zkr"},
   {"zk", "zkt"},
@@ -99,6 +104,9 @@ static const riscv_implied_info_t riscv_implied_info[] =
 
   {"zfh", "zfhmin"},
   {"zfhmin", "f"},
+  
+  {"zhinx", "zhinxmin"},
+  {"zhinxmin", "zfinx"},
 
   {NULL, NULL}
 };
@@ -158,6 +166,11 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"zbc", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbs", ISA_SPEC_CLASS_NONE, 1, 0},
 
+  {"zfinx", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zdinx", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zhinx", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zhinxmin", ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"zbkb",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbkc",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbkx",  ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1168,6 +1181,11 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"zbc",&gcc_options::x_riscv_zb_subext, MASK_ZBC},
   {"zbs",&gcc_options::x_riscv_zb_subext, MASK_ZBS},
 
+  {"zfinx",&gcc_options::x_riscv_zinx_subext, MASK_ZFINX},
+  {"zdinx",&gcc_options::x_riscv_zinx_subext, MASK_ZDINX},
+  {"zhinx",&gcc_options::x_riscv_zinx_subext, MASK_ZHINX},
+  {"zhinxmin", &gcc_options::x_riscv_zinx_subext, MASK_ZHINXMIN},
+
   {"zbkb",   &gcc_options::x_riscv_zk_subext, MASK_ZBKB},
   {"zbkc",   &gcc_options::x_riscv_zk_subext, MASK_ZBKC},
   {"zbkx",   &gcc_options::x_riscv_zk_subext, MASK_ZBKX},
diff --git a/gcc/config/riscv/arch-canonicalize b/gcc/config/riscv/arch-canonicalize
index fd7651ac491..2498db506b7 100755
--- a/gcc/config/riscv/arch-canonicalize
+++ b/gcc/config/riscv/arch-canonicalize
@@ -41,6 +41,11 @@ LONG_EXT_PREFIXES = ['z', 's', 'h', 'x']
 IMPLIED_EXT = {
   "d" : ["f", "zicsr"],
   "f" : ["zicsr"],
+  "zdinx" : ["zfinx", "zicsr"],
+  "zfinx" : ["zicsr"],
+  "zhinx" : ["zhinxmin", "zfinx", "zicsr"],
+  "zhinxmin" : ["zfinx", "zicsr"],
+
   "zk" : ["zkn", "zkr", "zkt"],
   "zkn" : ["zbkb", "zbkc", "zbkx", "zkne", "zknd", "zknh"],
   "zks" : ["zbkb", "zbkc", "zbkx", "zksed", "zksh"],
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 55e0bc0a0e9..bb2322ad182 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -83,6 +83,16 @@ enum stack_protector_guard {
 #define TARGET_ZBC((riscv_zb_subext & MASK_ZBC) != 0)
 #define TARGET_ZBS((riscv_zb_subext & MASK_ZBS) != 0)
 
+#define MASK_ZFINX  (1 << 0)
+#define MASK_ZDINX  (1 << 1)
+#define MASK_ZHINX  (1 << 2)
+#define MASK_ZHINXMIN   (1 << 3)
+
+#define TARGET_ZFINX((riscv_zinx_subext & MASK_ZFINX) != 0)
+#define TARGET_ZDINX((riscv_zinx_subext & MASK_ZDINX) != 0)
+#define TARGET_ZHINX((riscv_zinx_subext & MASK_ZHINX) != 0)
+#define TARGET_ZHINXMIN ((riscv_zinx_subext & MASK_ZHINXMIN) != 0)
+
 #define MASK_ZBKB (1 << 0)
 #define MASK_ZBKC (1 << 1)
 #define MASK_ZBKX (1 << 2)
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 8923a11a97d..7c1e0ed5f2d 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -206,6 +206,9 @@ int riscv_zi_subext
 TargetVariable
 int riscv_zb_subext
 
+TargetVariable
+int riscv_zinx_subext
+
 TargetVariable
 int riscv_zk_subext
 
-- 
2.25.1

[v4 PATCH 0/4] RISC-V: Support z*inx extensions.

2022-10-20 Thread jiawei

The Zfinx extension[1] has already been ratified. Here is the
implementation patch set, which reuses the floating-point patterns and bans
the use of FPRs when z*inx is the target.

The current work can be found at the following link; binutils and simulator
support is already upstream.
  https://github.com/pz9115/riscv-gcc/tree/zfinx-rebase

Thanks to Tariq Kurd, Kito Cheng, Jim Willson and
Jeremy Bennett, who helped us a lot with this work.

[1] https://github.com/riscv/riscv-zfinx/blob/main/zfinx-1.0.0-rc.pdf

Version log:

v2: Following Kito Cheng's comments, add ChangeLog parts to the patches,
update the imply info in riscv-common.c, remove a useless check and update
the annotation in riscv.c.

v3: Update for the new isa-spec version 20191213, make zfinx imply zicsr by
default, and fix the missing fcsr use in zfinx.

v4: Rebase the patches on upstream and add zhinx/zhinxmin extension support,
handled the same way as zfh/zfhmin.

Jiawei (4):
  RISC-V: Minimal support of z*inx extension.
  RISC-V: Target support for z*inx extension.
  RISC-V: Limit regs use for z*inx extension.
  RISC-V: Add zhinx/zhinxmin testcases.

 gcc/common/config/riscv/riscv-common.cc   | 18 +
 gcc/config/riscv/arch-canonicalize|  5 ++
 gcc/config/riscv/constraints.md   |  5 +-
 gcc/config/riscv/iterators.md |  6 +-
 gcc/config/riscv/riscv-builtins.cc|  4 +-
 gcc/config/riscv/riscv-c.cc   |  2 +-
 gcc/config/riscv/riscv-opts.h | 10 +++
 gcc/config/riscv/riscv.cc | 21 -
 gcc/config/riscv/riscv.md | 78 ++-
 gcc/config/riscv/riscv.opt|  3 +
 .../gcc.target/riscv/_Float16-zhinx-1.c   | 10 +++
 .../gcc.target/riscv/_Float16-zhinx-2.c   |  9 +++
 .../gcc.target/riscv/_Float16-zhinx-3.c   |  9 +++
 .../gcc.target/riscv/_Float16-zhinxmin-1.c| 10 +++
 .../gcc.target/riscv/_Float16-zhinxmin-2.c| 10 +++
 .../gcc.target/riscv/_Float16-zhinxmin-3.c| 10 +++
 16 files changed, 160 insertions(+), 50 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c

-- 
2.25.1

[v4 PATCH 2/4] RISC-V: Target support for z*inx extension.

2022-10-20 Thread jiawei

From: Jiawei 

Support 'TARGET_ZFINX' in the float instruction patterns and builtin functions.
Reuse the 'TARGET_HARD_FLOAT', 'TARGET_DOUBLE_FLOAT' and 'TARGET_ZHINX' patterns.

gcc/ChangeLog:

* config/riscv/iterators.md (TARGET_ZFINX): New target.
(TARGET_ZDINX): Ditto.
(TARGET_ZHINX): Ditto.
* config/riscv/riscv-builtins.cc (AVAIL): Ditto.
(riscv_atomic_assign_expand_fenv): Ditto.
* config/riscv/riscv-c.cc (riscv_cpu_cpp_builtins): Ditto.
* config/riscv/riscv.md: Ditto.

---
 gcc/config/riscv/iterators.md  |  6 +--
 gcc/config/riscv/riscv-builtins.cc |  4 +-
 gcc/config/riscv/riscv-c.cc|  2 +-
 gcc/config/riscv/riscv.md  | 78 +++---
 4 files changed, 46 insertions(+), 44 deletions(-)

diff --git a/gcc/config/riscv/iterators.md b/gcc/config/riscv/iterators.md
index 39dffabc235..50380ecfac9 100644
--- a/gcc/config/riscv/iterators.md
+++ b/gcc/config/riscv/iterators.md
@@ -59,9 +59,9 @@
 (define_mode_iterator ANYI [QI HI SI (DI "TARGET_64BIT")])
 
 ;; Iterator for hardware-supported floating-point modes.
-(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT")
-   (DF "TARGET_DOUBLE_FLOAT")
-   (HF "TARGET_ZFH")])
+(define_mode_iterator ANYF [(SF "TARGET_HARD_FLOAT || TARGET_ZFINX")
+   (DF "TARGET_DOUBLE_FLOAT || TARGET_ZDINX")
+   (HF "TARGET_ZFH || TARGET_ZHINX")])
 
 ;; Iterator for floating-point modes that can be loaded into X registers.
 (define_mode_iterator SOFTF [SF (DF "TARGET_64BIT") (HF "TARGET_ZFHMIN")])
diff --git a/gcc/config/riscv/riscv-builtins.cc b/gcc/config/riscv/riscv-builtins.cc
index 14865d70955..1534cfd860b 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -87,7 +87,7 @@ struct riscv_builtin_description {
   unsigned int (*avail) (void);
 };
 
-AVAIL (hard_float, TARGET_HARD_FLOAT)
+AVAIL (hard_float, TARGET_HARD_FLOAT || TARGET_ZFINX)
 
 
 AVAIL (clean32, TARGET_ZICBOM && !TARGET_64BIT)
@@ -322,7 +322,7 @@ riscv_expand_builtin (tree exp, rtx target, rtx subtarget ATTRIBUTE_UNUSED,
 void
 riscv_atomic_assign_expand_fenv (tree *hold, tree *clear, tree *update)
 {
-  if (!TARGET_HARD_FLOAT)
+  if (!(TARGET_HARD_FLOAT || TARGET_ZFINX))
 return;
 
   tree frflags = GET_BUILTIN_DECL (CODE_FOR_riscv_frflags);
diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 78f6eacb068..826ae0067bb 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -61,7 +61,7 @@ riscv_cpu_cpp_builtins (cpp_reader *pfile)
   if (TARGET_HARD_FLOAT)
 builtin_define_with_int_value ("__riscv_flen", UNITS_PER_FP_REG * 8);
 
-  if (TARGET_HARD_FLOAT && TARGET_FDIV)
+  if ((TARGET_HARD_FLOAT || TARGET_ZFINX) && TARGET_FDIV)
 {
   builtin_define ("__riscv_fdiv");
   builtin_define ("__riscv_fsqrt");
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 2d1cda2b98f..09ca91fb2c3 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -434,7 +434,7 @@
   [(set (match_operand:ANYF0 "register_operand" "=f")
(plus:ANYF (match_operand:ANYF 1 "register_operand" " f")
   (match_operand:ANYF 2 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fadd.\t%0,%1,%2"
   [(set_attr "type" "fadd")
(set_attr "mode" "")])
@@ -565,7 +565,7 @@
   [(set (match_operand:ANYF 0 "register_operand" "=f")
(minus:ANYF (match_operand:ANYF 1 "register_operand" " f")
(match_operand:ANYF 2 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT || TARGET_ZFINX"
   "fsub.\t%0,%1,%2"
   [(set_attr "type" "fadd")
(set_attr "mode" "")])
@@ -735,7 +735,7 @@
   [(set (match_operand:ANYF   0 "register_operand" "=f")
(mult:ANYF (match_operand:ANYF1 "register_operand" " f")
  (match_operand:ANYF 2 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT"
+  "TARGET_HARD_FLOAT  || TARGET_ZFINX"
   "fmul.\t%0,%1,%2"
   [(set_attr "type" "fmul")
(set_attr "mode" "")])
@@ -1042,7 +1042,7 @@
   [(set (match_operand:ANYF   0 "register_operand" "=f")
(div:ANYF (match_operand:ANYF 1 "register_operand" " f")
  (match_operand:ANYF 2 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT && TARGET_FDIV"
+  "(TARGET_HARD_FLOAT || TARGET_ZFINX) && TARGET_FDIV"
   "fdiv.\t%0,%1,%2"
   [(set_attr "type" "fdiv")
(set_attr "mode" "")])
@@ -1057,7 +1057,7 @@
 (define_insn "sqrt2"
   [(set (match_operand:ANYF0 "register_operand" "=f")
(sqrt:ANYF (match_operand:ANYF 1 "register_operand" " f")))]
-  "TARGET_HARD_FLOAT && TARGET_FDIV"
+  "(TARGET_HARD_FLOAT || TARGET_ZFINX) && TARGET_FDIV"
 {
 return "fsqrt.\t%0,%1";
 }
@@ -1072,7 +1072,7 @@
(fma:ANYF (match_operand:ANYF 1 "register_operand" " f")

[v4 PATCH 4/4] RISC-V: Add zhinx/zhinxmin testcases.

2022-10-20 Thread jiawei

From: Jiawei 

Test zhinx/zhinxmin support. These are the same as the zfh/zfhmin testcases,
but use GPRs and do not use fmv instructions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/_Float16-zhinx-1.c: New test.
* gcc.target/riscv/_Float16-zhinx-2.c: New test.
* gcc.target/riscv/_Float16-zhinx-3.c: New test.
* gcc.target/riscv/_Float16-zhinxmin-1.c: New test.
* gcc.target/riscv/_Float16-zhinxmin-2.c: New test.
* gcc.target/riscv/_Float16-zhinxmin-3.c: New test.

---
 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c| 10 ++
 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c|  9 +
 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c|  9 +
 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c | 10 ++
 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c | 10 ++
 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c | 10 ++
 6 files changed, 58 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c

diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
new file mode 100644
index 000..90172b57e05
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_zhinx -mabi=lp64 -O" } */
+
+_Float16 foo1 (_Float16 a, _Float16 b)
+{
+return b;
+}
+
+/* { dg-final { scan-assembler-not "fmv.h" } } */
+/* { dg-final { scan-assembler-times "mv" 1 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c
new file mode 100644
index 000..26f01198c97
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-2.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_zhinx -mabi=lp64 -O" } */
+
+_Float16 foo1 (_Float16 a, _Float16 b)
+{
+/* { dg-final { scan-assembler-not "fadd.h fa" } } */
+/* { dg-final { scan-assembler-times "fadd.h   a" 1 } } */
+return a + b;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c
new file mode 100644
index 000..573913568e7
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinx-3.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_zhinx -mabi=lp64 -O" } */
+
+int foo1 (_Float16 a, _Float16 b)
+{
+/* { dg-final { scan-assembler-not "fgt.h  fa" } } */
+/* { dg-final { scan-assembler-times "fgt.ha" 1 } } */
+return a > b;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c 
b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c
new file mode 100644
index 000..0070ebf616c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-1.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_zhinxmin -mabi=lp64 -O" } */
+
+_Float16 foo1 (_Float16 a, _Float16 b)
+{
+/* { dg-final { scan-assembler-not "fmv.h" } } */
+/* { dg-final { scan-assembler-not "fmv.s" } } */
+/* { dg-final { scan-assembler-times "mv" 1 } } */
+return b;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c
new file mode 100644
index 000..17f45a938d5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-2.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64i_zhinxmin -mabi=lp64 -O" } */
+
+_Float16 foo1 (_Float16 a, _Float16 b)
+{
+/* { dg-final { scan-assembler-not "fadd.h" } } */
+/* { dg-final { scan-assembler-not "fadd.s fa" } } */
+/* { dg-final { scan-assembler-times "fadd.s   a" 1 } } */
+return a + b;
+}
diff --git a/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c
new file mode 100644
index 000..7a43641a5a6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/_Float16-zhinxmin-3.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64if_zfhmin -mabi=lp64f -O" } */
+
+int foo1 (_Float16 a, _Float16 b)
+{
+/* { dg-final { scan-assembler-not "fgt.h" } } */
+/* { dg-final { scan-assembler-not "fgt.s  fa" } } */
+/* { dg-final { scan-assembler-times "fgt.sa" 1 } } */
+return a > b;
+}
-- 
2.25.1

[v4 PATCH 3/4] RISC-V: Limit regs use for z*inx extension.

2022-10-20 Thread jiawei

From: Jiawei 

Limit z*inx ABI support to 'ilp32', 'ilp32e' and 'lp64' only.
Use GPRs instead of FPRs when 'zfinx' is enabled, and only use even registers
in RV32 when 'zdinx' is enabled.
Enable FLOAT16 when Zhinx/Zhinxmin is enabled.
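
As a rough illustration of the two restrictions: with z*inx the floating-point
values live in GPRs, so only the integer ABIs are accepted, and on RV32 a
double-precision value under 'zdinx' may only start at an even register
number. The sketch below restates both rules as standalone C; the enum,
struct and function names are hypothetical stand-ins for GCC's target macros
and are not the actual riscv.cc code.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the riscv_abi values.  */
enum abi { ABI_ILP32, ABI_ILP32E, ABI_ILP32F, ABI_ILP32D,
           ABI_LP64, ABI_LP64F, ABI_LP64D };

/* Hypothetical snapshot of the relevant target flags.  */
struct target
{
  bool is_64bit;  /* models TARGET_64BIT */
  bool zfinx;     /* models TARGET_ZFINX */
  bool zdinx;     /* models TARGET_ZDINX */
  enum abi abi;   /* models riscv_abi */
};

/* z*inx is only accepted with the ilp32, ilp32e or lp64 ABI.  */
static bool zinx_abi_ok (const struct target *t)
{
  if (!t->zfinx)
    return true;
  return t->abi == ABI_ILP32 || t->abi == ABI_ILP32E || t->abi == ABI_LP64;
}

/* Whether a double-precision value may start at register REGNO.
   Only RV32 with zdinx imposes the even-register rule.  */
static bool df_regno_ok (const struct target *t, unsigned int regno)
{
  if (!t->is_64bit && t->zdinx)
    return (regno & 1) == 0;
  return true;
}
```

On RV64 the second check never rejects a register, which corresponds to the
!TARGET_64BIT guard in the riscv_hard_regno_mode_ok hunk.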

Co-Authored-By: Sinan Lin.

gcc/ChangeLog:

* config/riscv/constraints.md (TARGET_ZFINX ? GR_REGS): Use GPRs
  when Zfinx is enabled.
* config/riscv/riscv.cc (riscv_hard_regno_mode_ok): Limit the use of
  odd registers when Zdinx is enabled on RV32.
(riscv_option_override): Also enable MASK_FDIV for the new target.
(riscv_libgcc_floating_mode_supported_p): New error info when
  using an incompatible arch and ABI.
(riscv_excess_precision): Enable FLOAT16 for the new target.

---
 gcc/config/riscv/constraints.md |  5 +++--
 gcc/config/riscv/riscv.cc   | 21 +
 2 files changed, 20 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 8997284f32e..c53e0f38920 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -21,8 +21,9 @@
 
 ;; Register constraints
 
-(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS : NO_REGS"
-  "A floating-point register (if available).")
+(define_register_constraint "f" "TARGET_HARD_FLOAT ? FP_REGS :
+  (TARGET_ZFINX ? GR_REGS : NO_REGS)"
+  "A floating-point register (if available, reuse GPR as FPR when use zfinx).")
 
 (define_register_constraint "j" "SIBCALL_REGS"
   "@internal")
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ad57b995e7b..38631605b2c 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -5356,6 +5356,13 @@ riscv_hard_regno_mode_ok (unsigned int regno, machine_mode mode)
!= call_used_or_fixed_reg_p (regno + i))
   return false;
 
+  /* Only use even registers in RV32 ZDINX */
+  if (!TARGET_64BIT && TARGET_ZDINX){
+if (GET_MODE_CLASS (mode) == MODE_FLOAT &&
+ GET_MODE_UNIT_SIZE (mode) == GET_MODE_SIZE (DFmode))
+return !(regno & 1);
+  }
+
   return true;
 }
 
@@ -5595,7 +5602,7 @@ riscv_option_override (void)
 error ("%<-mdiv%> requires %<-march%> to subsume the % extension");
 
   /* Likewise floating-point division and square root.  */
-  if (TARGET_HARD_FLOAT && (target_flags_explicit & MASK_FDIV) == 0)
+  if ((TARGET_HARD_FLOAT || TARGET_ZFINX) && (target_flags_explicit & MASK_FDIV) == 0)
 target_flags |= MASK_FDIV;
 
   /* Handle -mtune, use -mcpu if -mtune is not given, and use default -mtune
@@ -5641,6 +5648,11 @@ riscv_option_override (void)
   if (TARGET_RVE && riscv_abi != ABI_ILP32E)
 error ("rv32e requires ilp32e ABI");
 
+  // Zfinx require abi ilp32,ilp32e or lp64.
+  if (TARGET_ZFINX && riscv_abi != ABI_ILP32
+  && riscv_abi != ABI_LP64 && riscv_abi != ABI_ILP32E)
+error ("z*inx requires ABI ilp32, ilp32e or lp64");
+
   /* We do not yet support ILP32 on RV64.  */
   if (BITS_PER_WORD != POINTER_SIZE)
 error ("ABI requires %<-march=rv%d%>", POINTER_SIZE);
@@ -6273,7 +6285,7 @@ riscv_libgcc_floating_mode_supported_p (scalar_float_mode mode)
precision of the _FloatN type; evaluate all other operations and
constants to the range and precision of the semantic type;
 
-   If we have the zfh extensions then we support _Float16 in native
+   If we have the zfh/zhinx extensions then we support _Float16 in native
precision, so we should set this to 16.  */
 static enum flt_eval_method
 riscv_excess_precision (enum excess_precision_type type)
@@ -6282,8 +6294,9 @@ riscv_excess_precision (enum excess_precision_type type)
 {
 case EXCESS_PRECISION_TYPE_FAST:
 case EXCESS_PRECISION_TYPE_STANDARD:
-  return (TARGET_ZFH ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
-: FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
+  return ((TARGET_ZFH || TARGET_ZHINX || TARGET_ZHINXMIN) 
+   ? FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16
+   : FLT_EVAL_METHOD_PROMOTE_TO_FLOAT);
 case EXCESS_PRECISION_TYPE_IMPLICIT:
 case EXCESS_PRECISION_TYPE_FLOAT16:
   return FLT_EVAL_METHOD_PROMOTE_TO_FLOAT16;
-- 
2.25.1

[pushed] aarch64: Fix matching of BRKNS

2022-10-20 Thread Richard Sandiford via Gcc-patches

Unlike other flag-setting SVE instructions, BRKNS sets the flags
based on an all-true governing predicate, rather than the GP operand.

Tested on aarch64-linux-gnu & pushed to trunk so far.  I'll backport
to release branches soon.

Richard


gcc/
* config/aarch64/iterators.md (SVE_BRKP): New iterator.
* config/aarch64/aarch64-sve.md (*aarch64_brkn_cc): New pattern.
(*aarch64_brkn_ptest): Likewise.
(*aarch64_brk_cc): Restrict to SVE_BRKP.
(*aarch64_brk_ptest): Likewise.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/brkn_1.c: Expect separate
PTEST instructions.
* gcc.target/aarch64/sve/acle/general/brkn_2.c: New test.
---
 gcc/config/aarch64/aarch64-sve.md | 70 ---
 gcc/config/aarch64/iterators.md   |  2 +
 .../aarch64/sve/acle/general/brkn_1.c |  5 +-
 .../aarch64/sve/acle/general/brkn_2.c | 23 ++
 4 files changed, 90 insertions(+), 10 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/sve/acle/general/brkn_2.c

diff --git a/gcc/config/aarch64/aarch64-sve.md b/gcc/config/aarch64/aarch64-sve.md
index e08bee197d8..e2bb80268e5 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -9677,7 +9677,61 @@ (define_insn "@aarch64_brk"
   "brk\t%0.b, %1/z, %2.b, %.b"
 )
 
-;; Same, but also producing a flags result.
+;; BRKN, producing both a predicate and a flags result.  Unlike other
+;; flag-setting instructions, these flags are always set wrt a ptrue.
+(define_insn_and_rewrite "*aarch64_brkn_cc"
+  [(set (reg:CC_NZC CC_REGNUM)
+   (unspec:CC_NZC
+ [(match_operand:VNx16BI 4)
+  (match_operand:VNx16BI 5)
+  (const_int SVE_KNOWN_PTRUE)
+  (unspec:VNx16BI
+[(match_operand:VNx16BI 1 "register_operand" "Upa")
+ (match_operand:VNx16BI 2 "register_operand" "Upa")
+ (match_operand:VNx16BI 3 "register_operand" "0")]
+UNSPEC_BRKN)]
+ UNSPEC_PTEST))
+   (set (match_operand:VNx16BI 0 "register_operand" "=Upa")
+   (unspec:VNx16BI
+ [(match_dup 1)
+  (match_dup 2)
+  (match_dup 3)]
+ UNSPEC_BRKN))]
+  "TARGET_SVE"
+  "brkns\t%0.b, %1/z, %2.b, %0.b"
+  "&& (operands[4] != CONST0_RTX (VNx16BImode)
+   || operands[5] != CONST0_RTX (VNx16BImode))"
+  {
+operands[4] = CONST0_RTX (VNx16BImode);
+operands[5] = CONST0_RTX (VNx16BImode);
+  }
+)
+
+;; Same, but with only the flags result being interesting.
+(define_insn_and_rewrite "*aarch64_brkn_ptest"
+  [(set (reg:CC_NZC CC_REGNUM)
+   (unspec:CC_NZC
+ [(match_operand:VNx16BI 4)
+  (match_operand:VNx16BI 5)
+  (const_int SVE_KNOWN_PTRUE)
+  (unspec:VNx16BI
+[(match_operand:VNx16BI 1 "register_operand" "Upa")
+ (match_operand:VNx16BI 2 "register_operand" "Upa")
+ (match_operand:VNx16BI 3 "register_operand" "0")]
+UNSPEC_BRKN)]
+ UNSPEC_PTEST))
+   (clobber (match_scratch:VNx16BI 0 "=Upa"))]
+  "TARGET_SVE"
+  "brkns\t%0.b, %1/z, %2.b, %0.b"
+  "&& (operands[4] != CONST0_RTX (VNx16BImode)
+   || operands[5] != CONST0_RTX (VNx16BImode))"
+  {
+operands[4] = CONST0_RTX (VNx16BImode);
+operands[5] = CONST0_RTX (VNx16BImode);
+  }
+)
+
+;; BRKPA and BRKPB, producing both a predicate and a flags result.
 (define_insn "*aarch64_brk_cc"
   [(set (reg:CC_NZC CC_REGNUM)
(unspec:CC_NZC
@@ -9687,17 +9741,17 @@ (define_insn "*aarch64_brk_cc"
   (unspec:VNx16BI
 [(match_dup 1)
  (match_operand:VNx16BI 2 "register_operand" "Upa")
- (match_operand:VNx16BI 3 "register_operand" "")]
-SVE_BRK_BINARY)]
+ (match_operand:VNx16BI 3 "register_operand" "Upa")]
+SVE_BRKP)]
  UNSPEC_PTEST))
(set (match_operand:VNx16BI 0 "register_operand" "=Upa")
(unspec:VNx16BI
  [(match_dup 1)
   (match_dup 2)
   (match_dup 3)]
- SVE_BRK_BINARY))]
+ SVE_BRKP))]
   "TARGET_SVE"
-  "brks\t%0.b, %1/z, %2.b, %.b"
+  "brks\t%0.b, %1/z, %2.b, %3.b"
 )
 
 ;; Same, but with only the flags result being interesting.
@@ -9710,12 +9764,12 @@ (define_insn "*aarch64_brk_ptest"
   (unspec:VNx16BI
 [(match_dup 1)
  (match_operand:VNx16BI 2 "register_operand" "Upa")
- (match_operand:VNx16BI 3 "register_operand" "")]
-SVE_BRK_BINARY)]
+ (match_operand:VNx16BI 3 "register_operand" "Upa")]
+SVE_BRKP)]
  UNSPEC_PTEST))
(clobber (match_scratch:VNx16BI 0 "=Upa"))]
   "TARGET_SVE"
-  "brks\t%0.b, %1/z, %2.b, %.b"
+  "brks\t%0.b, %1/z, %2.b, %3.b"
 )
 
 ;; -
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 9354dbec866..a8ad4e5ff21 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/confi

[pushed] aarch64: Prevent generation of /M BRKAS and BRKBS

2022-10-20 Thread Richard Sandiford via Gcc-patches

Bit of a brown-paper-bag bug, but: GCC was generating
non-existent merging forms of BRKAS and BRKBS.  Those
instructions only support zero predication (although
BRKA and BRKB support both).

Tested on aarch64-linux-gnu & pushed to trunk so far.  I'll backport
to release branches soon.

Richard


gcc/
* config/aarch64/aarch64-sve.md (*aarch64_brk_cc): Remove
merging alternative.
(*aarch64_brk_ptest): Likewise.

gcc/testsuite/
* gcc.target/aarch64/sve/acle/general/brka_1.c: Expect a separate
PTEST instruction.
* gcc.target/aarch64/sve/acle/general/brkb_1.c: Likewise.
---
 gcc/config/aarch64/aarch64-sve.md | 24 ---
 .../aarch64/sve/acle/general/brka_1.c |  5 ++--
 .../aarch64/sve/acle/general/brkb_1.c |  5 ++--
 3 files changed, 16 insertions(+), 18 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve.md 
b/gcc/config/aarch64/aarch64-sve.md
index e2bb80268e5..b8cc47ef5fc 100644
--- a/gcc/config/aarch64/aarch64-sve.md
+++ b/gcc/config/aarch64/aarch64-sve.md
@@ -9612,45 +9612,41 @@ (define_insn "@aarch64_brk"
 (define_insn "*aarch64_brk_cc"
   [(set (reg:CC_NZC CC_REGNUM)
(unspec:CC_NZC
- [(match_operand:VNx16BI 1 "register_operand" "Upa, Upa")
+ [(match_operand:VNx16BI 1 "register_operand" "Upa")
   (match_dup 1)
   (match_operand:SI 4 "aarch64_sve_ptrue_flag")
   (unspec:VNx16BI
 [(match_dup 1)
- (match_operand:VNx16BI 2 "register_operand" "Upa, Upa")
- (match_operand:VNx16BI 3 "aarch64_simd_reg_or_zero" "Dz, 0")]
+ (match_operand:VNx16BI 2 "register_operand" "Upa")
+ (match_operand:VNx16BI 3 "aarch64_simd_imm_zero")]
 SVE_BRK_UNARY)]
  UNSPEC_PTEST))
-   (set (match_operand:VNx16BI 0 "register_operand" "=Upa, Upa")
+   (set (match_operand:VNx16BI 0 "register_operand" "=Upa")
(unspec:VNx16BI
  [(match_dup 1)
   (match_dup 2)
   (match_dup 3)]
  SVE_BRK_UNARY))]
   "TARGET_SVE"
-  "@
-   brks\t%0.b, %1/z, %2.b
-   brks\t%0.b, %1/m, %2.b"
+  "brks\t%0.b, %1/z, %2.b"
 )
 
 ;; Same, but with only the flags result being interesting.
 (define_insn "*aarch64_brk_ptest"
   [(set (reg:CC_NZC CC_REGNUM)
(unspec:CC_NZC
- [(match_operand:VNx16BI 1 "register_operand" "Upa, Upa")
+ [(match_operand:VNx16BI 1 "register_operand" "Upa")
   (match_dup 1)
   (match_operand:SI 4 "aarch64_sve_ptrue_flag")
   (unspec:VNx16BI
 [(match_dup 1)
- (match_operand:VNx16BI 2 "register_operand" "Upa, Upa")
- (match_operand:VNx16BI 3 "aarch64_simd_reg_or_zero" "Dz, 0")]
+ (match_operand:VNx16BI 2 "register_operand" "Upa")
+ (match_operand:VNx16BI 3 "aarch64_simd_imm_zero")]
 SVE_BRK_UNARY)]
  UNSPEC_PTEST))
-   (clobber (match_scratch:VNx16BI 0 "=Upa, Upa"))]
+   (clobber (match_scratch:VNx16BI 0 "=Upa"))]
   "TARGET_SVE"
-  "@
-   brks\t%0.b, %1/z, %2.b
-   brks\t%0.b, %1/m, %2.b"
+  "brks\t%0.b, %1/z, %2.b"
 )
 
 ;; -
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brka_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brka_1.c
index 24aa8f317c4..6146ef82849 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brka_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brka_1.c
@@ -18,5 +18,6 @@ test2 (svbool_t pg, svbool_t x, svbool_t y, int *any)
   return svptest_any (pg, res);
 }
 
-/* { dg-final { scan-assembler-times {\tbrkas\tp[0-9]+\.b, p[0-9]+/m,} 2 } } */
-/* { dg-final { scan-assembler-not {\tbrka\t} } } */
+/* { dg-final { scan-assembler-times {\tbrka\t} 2 } } */
+/* { dg-final { scan-assembler-times {\tptest\t} 2 } } */
+/* { dg-final { scan-assembler-not {\tbrkas\t} } } */
diff --git a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brkb_1.c 
b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brkb_1.c
index 07e3622ed6a..7f1c75837de 100644
--- a/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brkb_1.c
+++ b/gcc/testsuite/gcc.target/aarch64/sve/acle/general/brkb_1.c
@@ -18,5 +18,6 @@ test2 (svbool_t pg, svbool_t x, svbool_t y, int *any)
   return svptest_any (pg, res);
 }
 
-/* { dg-final { scan-assembler-times {\tbrkbs\tp[0-9]+\.b, p[0-9]+/m,} 2 } } */
-/* { dg-final { scan-assembler-not {\tbrkb\t} } } */
+/* { dg-final { scan-assembler-times {\tbrkb\t} 2 } } */
+/* { dg-final { scan-assembler-times {\tptest\t} 2 } } */
+/* { dg-final { scan-assembler-not {\tbrkbs\t} } } */
-- 
2.25.1

[pushed] aarch64: Commonise some folding code

2022-10-20 Thread Richard Sandiford via Gcc-patches

Add an aarch64_sve::gimple_folder helper for folding calls
to integer constants.  SME will make more use of this.

Tested on aarch64-linux-gnu & pushed.

Richard


gcc/
* config/aarch64/aarch64-sve-builtins.h
(gimple_folder::fold_to_cstu): New member function.
* config/aarch64/aarch64-sve-builtins.cc
(gimple_folder::fold_to_cstu): Define.
* config/aarch64/aarch64-sve-builtins-base.cc
(svcnt_bhwd_impl::fold): Use it.
---
 gcc/config/aarch64/aarch64-sve-builtins-base.cc | 9 ++---
 gcc/config/aarch64/aarch64-sve-builtins.cc  | 7 +++
 gcc/config/aarch64/aarch64-sve-builtins.h   | 1 +
 3 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 141f44d4d94..23b4d42822a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -517,9 +517,7 @@ public:
   gimple *
   fold (gimple_folder &f) const override
   {
-tree count = build_int_cstu (TREE_TYPE (f.lhs),
-GET_MODE_NUNITS (m_ref_mode));
-return gimple_build_assign (f.lhs, count);
+return f.fold_to_cstu (GET_MODE_NUNITS (m_ref_mode));
   }
 
   rtx
@@ -553,10 +551,7 @@ public:
 unsigned int elements_per_vq = 128 / GET_MODE_UNIT_BITSIZE (m_ref_mode);
 HOST_WIDE_INT value = aarch64_fold_sve_cnt_pat (pattern, elements_per_vq);
 if (value >= 0)
-  {
-   tree count = build_int_cstu (TREE_TYPE (f.lhs), value);
-   return gimple_build_assign (f.lhs, count);
-  }
+  return f.fold_to_cstu (value);
 
 return NULL;
   }
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.cc 
b/gcc/config/aarch64/aarch64-sve-builtins.cc
index 63b1358c138..37228f6389a 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins.cc
@@ -2615,6 +2615,13 @@ gimple_folder::redirect_call (const function_instance 
&instance)
   return call;
 }
 
+/* Fold the call to constant VAL.  */
+gimple *
+gimple_folder::fold_to_cstu (poly_uint64 val)
+{
+  return gimple_build_assign (lhs, build_int_cstu (TREE_TYPE (lhs), val));
+}
+
 /* Fold the call to a PTRUE, taking the element size from type suffix 0.  */
 gimple *
 gimple_folder::fold_to_ptrue ()
diff --git a/gcc/config/aarch64/aarch64-sve-builtins.h 
b/gcc/config/aarch64/aarch64-sve-builtins.h
index 63d1db776f7..0d130b871d0 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins.h
+++ b/gcc/config/aarch64/aarch64-sve-builtins.h
@@ -500,6 +500,7 @@ public:
   tree load_store_cookie (tree);
 
   gimple *redirect_call (const function_instance &);
+  gimple *fold_to_cstu (poly_uint64);
   gimple *fold_to_pfalse ();
   gimple *fold_to_ptrue ();
   gimple *fold_to_vl_pred (unsigned int);
-- 
2.25.1
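
For readers who do not have the GCC tree at hand, the shape of this refactoring can be sketched outside the compiler.  The following is only an illustration under stated assumptions: `mock_folder`, `mock_assign` and `fold_count` are hypothetical stand-ins invented for this sketch and are not part of GCC; only the idea (one helper that turns "fold this call to the unsigned constant VAL" into a single call) is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a GIMPLE assignment "lhs = <constant>".
struct mock_assign
{
  int lhs_id;
  std::uint64_t value;
};

// Hypothetical stand-in for gimple_folder; it only remembers the call's lhs.
struct mock_folder
{
  int lhs_id;

  // Counterpart of fold_to_cstu: fold the call to the constant VAL.
  mock_assign
  fold_to_cstu (std::uint64_t val) const
  {
    return mock_assign { lhs_id, val };
  }
};

// Counterpart of a fold hook: fold only if the value is known (non-negative),
// otherwise report "no folding", as the patched code does with NULL.
std::optional<mock_assign>
fold_count (const mock_folder &f, std::int64_t value)
{
  if (value >= 0)
    return f.fold_to_cstu (static_cast<std::uint64_t> (value));
  return std::nullopt;
}
```

With the helper in place, each caller shrinks to one line, which is the effect visible in the two hunks of aarch64-sve-builtins-base.cc above.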

[pushed] aarch64: Replace CONSTEXPR with constexpr

2022-10-20 Thread Richard Sandiford via Gcc-patches

Move away from the pre-C++11 compatibility macro CONSTEXPR.

Tested on aarch64-linux-gnu & pushed.
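
As a rough illustration of what such a compatibility macro does, here is a stand-alone sketch.  `COMPAT_CONSTEXPR`, `with_macro` and `with_keyword` are hypothetical names made up for this example; the exact definition of GCC's own CONSTEXPR macro is not shown in this message and is not reproduced here.

```cpp
#include <cassert>

// Assumed shape of a pre-C++11 compatibility macro: it expands to the
// keyword when the language supports it and to nothing otherwise.
#if __cplusplus >= 201103L
# define COMPAT_CONSTEXPR constexpr
#else
# define COMPAT_CONSTEXPR
#endif

// Before: the constructor is spelled with the macro.
class with_macro
{
public:
  COMPAT_CONSTEXPR with_macro (int unspec) : m_unspec (unspec) {}
  int m_unspec;
};

// After: once C++11 is the baseline, the keyword can be used directly.
class with_keyword
{
public:
  constexpr with_keyword (int unspec) : m_unspec (unspec) {}
  int m_unspec;
};

// The keyword form is usable in constant expressions.
static_assert (with_keyword (5).m_unspec == 5, "constexpr constructor");
```

Under a C++11 or later compiler the two classes behave identically, so the replacement is purely textual.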

Richard


gcc/
* config/aarch64/aarch64-sve-builtins-base.cc: Replace CONSTEXPR
with constexpr throughout.
* config/aarch64/aarch64-sve-builtins-functions.h: Likewise.
* config/aarch64/aarch64-sve-builtins-shapes.cc: Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc: Likewise.
* config/aarch64/aarch64-sve-builtins.cc: Likewise.
---
 .../aarch64/aarch64-sve-builtins-base.cc  | 82 +--
 .../aarch64/aarch64-sve-builtins-functions.h  | 50 +--
 .../aarch64/aarch64-sve-builtins-shapes.cc|  8 +-
 .../aarch64/aarch64-sve-builtins-sve2.cc  | 18 ++--
 gcc/config/aarch64/aarch64-sve-builtins.cc|  8 +-
 5 files changed, 83 insertions(+), 83 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index 82f9eba5c39..d52454bcf27 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -177,7 +177,7 @@ public:
 class svac_impl : public function_base
 {
 public:
-  CONSTEXPR svac_impl (int unspec) : m_unspec (unspec) {}
+  constexpr svac_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander &e) const override
@@ -209,7 +209,7 @@ public:
 class svadr_bhwd_impl : public function_base
 {
 public:
-  CONSTEXPR svadr_bhwd_impl (unsigned int shift) : m_shift (shift) {}
+  constexpr svadr_bhwd_impl (unsigned int shift) : m_shift (shift) {}
 
   rtx
   expand (function_expander &e) const override
@@ -259,7 +259,7 @@ public:
 class svbrk_binary_impl : public function_base
 {
 public:
-  CONSTEXPR svbrk_binary_impl (int unspec) : m_unspec (unspec) {}
+  constexpr svbrk_binary_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander &e) const override
@@ -275,7 +275,7 @@ public:
 class svbrk_unary_impl : public function_base
 {
 public:
-  CONSTEXPR svbrk_unary_impl (int unspec) : m_unspec (unspec) {}
+  constexpr svbrk_unary_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander &e) const override
@@ -309,7 +309,7 @@ public:
 class svclast_impl : public quiet
 {
 public:
-  CONSTEXPR svclast_impl (int unspec) : m_unspec (unspec) {}
+  constexpr svclast_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander &e) const override
@@ -381,7 +381,7 @@ public:
 class svcmp_impl : public function_base
 {
 public:
-  CONSTEXPR svcmp_impl (tree_code code, int unspec_for_fp)
+  constexpr svcmp_impl (tree_code code, int unspec_for_fp)
 : m_code (code), m_unspec_for_fp (unspec_for_fp) {}
 
   gimple *
@@ -437,7 +437,7 @@ public:
 class svcmp_wide_impl : public function_base
 {
 public:
-  CONSTEXPR svcmp_wide_impl (tree_code code, int unspec_for_sint,
+  constexpr svcmp_wide_impl (tree_code code, int unspec_for_sint,
 int unspec_for_uint)
 : m_code (code), m_unspec_for_sint (unspec_for_sint),
   m_unspec_for_uint (unspec_for_uint) {}
@@ -512,7 +512,7 @@ public:
 class svcnt_bhwd_impl : public function_base
 {
 public:
-  CONSTEXPR svcnt_bhwd_impl (machine_mode ref_mode) : m_ref_mode (ref_mode) {}
+  constexpr svcnt_bhwd_impl (machine_mode ref_mode) : m_ref_mode (ref_mode) {}
 
   gimple *
   fold (gimple_folder &f) const override
@@ -536,7 +536,7 @@ public:
 class svcnt_bhwd_pat_impl : public svcnt_bhwd_impl
 {
 public:
-  CONSTEXPR svcnt_bhwd_pat_impl (machine_mode ref_mode)
+  constexpr svcnt_bhwd_pat_impl (machine_mode ref_mode)
 : svcnt_bhwd_impl (ref_mode) {}
 
   gimple *
@@ -588,7 +588,7 @@ public:
 class svcreate_impl : public quiet
 {
 public:
-  CONSTEXPR svcreate_impl (unsigned int vectors_per_tuple)
+  constexpr svcreate_impl (unsigned int vectors_per_tuple)
 : quiet (vectors_per_tuple) {}
 
   gimple *
@@ -722,7 +722,7 @@ public:
 class svdotprod_lane_impl : public unspec_based_function_base
 {
 public:
-  CONSTEXPR svdotprod_lane_impl (int unspec_for_sint,
+  constexpr svdotprod_lane_impl (int unspec_for_sint,
 int unspec_for_uint,
 int unspec_for_float)
 : unspec_based_function_base (unspec_for_sint,
@@ -961,7 +961,7 @@ public:
 class svext_bhw_impl : public function_base
 {
 public:
-  CONSTEXPR svext_bhw_impl (scalar_int_mode from_mode)
+  constexpr svext_bhw_impl (scalar_int_mode from_mode)
 : m_from_mode (from_mode) {}
 
   rtx
@@ -1003,7 +1003,7 @@ public:
 class svget_impl : public quiet
 {
 public:
-  CONSTEXPR svget_impl (unsigned int vectors_per_tuple)
+  constexpr svget_impl (unsigned int vectors_per_tuple)
 : quiet (vectors_per_tuple) {}
 
   gimple *
@@ -1066,7 +1066,7 @@ public:
 class svlast_impl : public quiet
 {
 public:
-  CONSTEXPR svlast_impl (int unspec) : m_unspec (unspec) {}
+  constexpr svlast_impl (int unspec) : m_unspec (unspec) {}
 
   rtx
   expand (function_expander &e) c

[pushed] aarch64: Use using directives to inherit constructors

2022-10-20 Thread Richard Sandiford via Gcc-patches

Now that the codebase is C++11, we can use using directives
to inherit constructors from base classes.

Tested on aarch64-linux-gnu & pushed.
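
A minimal stand-alone sketch of the idiom follows.  The class names `tuple_base`, `old_style_impl` and `new_style_impl` are hypothetical and only loosely modelled on the classes touched by the patch; in standard C++ terminology the construct is a using-declaration that names the base class constructor.

```cpp
#include <cassert>

// Hypothetical base class with a single-argument constexpr constructor.
class tuple_base
{
public:
  constexpr tuple_base (unsigned int vectors_per_tuple)
    : m_vectors_per_tuple (vectors_per_tuple) {}

  constexpr unsigned int vectors () const { return m_vectors_per_tuple; }

private:
  unsigned int m_vectors_per_tuple;
};

// Before: the derived class forwards the constructor by hand.
class old_style_impl : public tuple_base
{
public:
  constexpr old_style_impl (unsigned int vectors_per_tuple)
    : tuple_base (vectors_per_tuple) {}
};

// After: the base class constructors are inherited in one line.
class new_style_impl : public tuple_base
{
public:
  using tuple_base::tuple_base;
};

// Inherited constructors keep the constexpr property of the base ones.
static_assert (old_style_impl (3).vectors () == 3, "forwarded by hand");
static_assert (new_style_impl (3).vectors () == 3, "inherited");
```

The inherited form also picks up any constructor added to the base class later, which is why the diffstat below is almost entirely deletions.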

Richard


gcc/
* config/aarch64/aarch64-sve-builtins-functions.h (quiet)
(rtx_code_function, rtx_code_function_rotated, unspec_based_function)
(unspec_based_function_rotated, unspec_based_function_exact_insn)
(unspec_based_fused_function, unspec_based_fused_lane_function):
Replace constructors with using directives.
* config/aarch64/aarch64-sve-builtins-base.cc (svcnt_bhwd_pat_impl)
(svcreate_impl, svdotprod_lane_impl, svget_impl, svld1_extend_impl)
(svld1_gather_extend_impl, svld234_impl, svldff1_gather_extend)
(svset_impl, svst1_scatter_truncate_impl, svst1_truncate_impl)
(svst234_impl, svundef_impl): Likewise.
* config/aarch64/aarch64-sve-builtins-sve2.cc
(svldnt1_gather_extend_impl, svmovl_lb_impl): Likewise.
(svstnt1_scatter_truncate_impl): Likewise.
---
 .../aarch64/aarch64-sve-builtins-base.cc  | 43 +-
 .../aarch64/aarch64-sve-builtins-functions.h  | 56 +++
 .../aarch64/aarch64-sve-builtins-sve2.cc  | 12 +---
 3 files changed, 24 insertions(+), 87 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-sve-builtins-base.cc 
b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
index d52454bcf27..141f44d4d94 100644
--- a/gcc/config/aarch64/aarch64-sve-builtins-base.cc
+++ b/gcc/config/aarch64/aarch64-sve-builtins-base.cc
@@ -536,8 +536,7 @@ public:
 class svcnt_bhwd_pat_impl : public svcnt_bhwd_impl
 {
 public:
-  constexpr svcnt_bhwd_pat_impl (machine_mode ref_mode)
-: svcnt_bhwd_impl (ref_mode) {}
+  using svcnt_bhwd_impl::svcnt_bhwd_impl;
 
   gimple *
   fold (gimple_folder &f) const override
@@ -588,8 +587,7 @@ public:
 class svcreate_impl : public quiet
 {
 public:
-  constexpr svcreate_impl (unsigned int vectors_per_tuple)
-: quiet (vectors_per_tuple) {}
+  using quiet::quiet;
 
   gimple *
   fold (gimple_folder &f) const override
@@ -722,12 +720,7 @@ public:
 class svdotprod_lane_impl : public unspec_based_function_base
 {
 public:
-  constexpr svdotprod_lane_impl (int unspec_for_sint,
-int unspec_for_uint,
-int unspec_for_float)
-: unspec_based_function_base (unspec_for_sint,
- unspec_for_uint,
- unspec_for_float) {}
+  using unspec_based_function_base::unspec_based_function_base;
 
   rtx
   expand (function_expander &e) const override
@@ -1003,8 +996,7 @@ public:
 class svget_impl : public quiet
 {
 public:
-  constexpr svget_impl (unsigned int vectors_per_tuple)
-: quiet (vectors_per_tuple) {}
+  using quiet::quiet;
 
   gimple *
   fold (gimple_folder &f) const override
@@ -1118,8 +1110,7 @@ public:
 class svld1_extend_impl : public extending_load
 {
 public:
-  constexpr svld1_extend_impl (type_suffix_index memory_type)
-: extending_load (memory_type) {}
+  using extending_load::extending_load;
 
   rtx
   expand (function_expander &e) const override
@@ -1158,8 +1149,7 @@ public:
 class svld1_gather_extend_impl : public extending_load
 {
 public:
-  constexpr svld1_gather_extend_impl (type_suffix_index memory_type)
-: extending_load (memory_type) {}
+  using extending_load::extending_load;
 
   rtx
   expand (function_expander &e) const override
@@ -1289,8 +1279,7 @@ public:
 class svld234_impl : public full_width_access
 {
 public:
-  constexpr svld234_impl (unsigned int vectors_per_tuple)
-: full_width_access (vectors_per_tuple) {}
+  using full_width_access::full_width_access;
 
   unsigned int
   call_properties (const function_instance &) const override
@@ -1372,8 +1361,7 @@ public:
 class svldff1_gather_extend : public extending_load
 {
 public:
-  constexpr svldff1_gather_extend (type_suffix_index memory_type)
-: extending_load (memory_type) {}
+  using extending_load::extending_load;
 
   rtx
   expand (function_expander &e) const override
@@ -2070,8 +2058,7 @@ public:
 class svset_impl : public quiet
 {
 public:
-  constexpr svset_impl (unsigned int vectors_per_tuple)
-: quiet (vectors_per_tuple) {}
+  using quiet::quiet;
 
   gimple *
   fold (gimple_folder &f) const override
@@ -2199,8 +2186,7 @@ public:
 class svst1_scatter_truncate_impl : public truncating_store
 {
 public:
-  constexpr svst1_scatter_truncate_impl (scalar_int_mode to_mode)
-: truncating_store (to_mode) {}
+  using truncating_store::truncating_store;
 
   rtx
   expand (function_expander &e) const override
@@ -2219,8 +2205,7 @@ public:
 class svst1_truncate_impl : public truncating_store
 {
 public:
-  constexpr svst1_truncate_impl (scalar_int_mode to_mode)
-: truncating_store (to_mode) {}
+  using truncating_store::truncating_store;
 
   rtx
   expand (function_expander &e) const override
@@ -2235,8 +2220,7 @@ public:
 class svst234_impl : public full_width_access

Make 'autoreconf' work for 'gcc', 'libobjc' (was: [PATCH] regenerate configure files and config.h.in files)

2022-10-20 Thread Thomas Schwinge

Hi!

Given:

On 2022-09-20T14:42:53+0100, Iain Sandoe via Gcc-patches wrote:
> +1 from me …
> ..  I have been maintaining something similar locally.

On 2022-09-20T17:23:15+0200, Martin Liška  wrote:
> I do support that as well.

..., I have now pushed to master branch
commit 25861cf3a88a07c8dca3fb32d098c0ad756bbe38
"Make 'autoreconf' work for 'gcc', 'libobjc'", see attached.


> What will be the only command invocation that will be needed once you're done?

Just plain 'autoreconf' (per each relevant directory).  See
.


Regards
 Thomas


-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634
München; limited liability company (Gesellschaft mit beschränkter Haftung);
Managing Directors: Thomas Heurung, Frank Thürauf; Registered office: München;
Register court: München, HRB 106955
From 25861cf3a88a07c8dca3fb32d098c0ad756bbe38 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 20 Sep 2022 14:27:05 +0200
Subject: [PATCH] Make 'autoreconf' work for 'gcc', 'libobjc'

With that, we may then run plain 'autoreconf' for all of GCC's subpackages,
instead of for some of those (that don't use Automake) manually having to run
the applicable combination of 'aclocal', 'autoconf', 'autoheader'.

See also 'AC_CONFIG_MACRO_DIRS'/'AC_CONFIG_MACRO_DIR' usage elsewhere.

	gcc/
	* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
	* configure: Regenerate.
	libobjc/
	* configure.ac (AC_CONFIG_MACRO_DIRS): Instantiate.
	* configure: Regenerate.
---
 gcc/configure| 5 +++--
 gcc/configure.ac | 1 +
 libobjc/configure| 5 +++--
 libobjc/configure.ac | 1 +
 4 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/gcc/configure b/gcc/configure
index 2ce51a8a458..c6def4c88e5 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -,6 +,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
 
+
 ac_config_headers="$ac_config_headers auto-host.h:config.in"
 
 
@@ -19713,7 +19714,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19716 "configure"
+#line 19717 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -19819,7 +19820,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 19822 "configure"
+#line 19823 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/gcc/configure.ac b/gcc/configure.ac
index e1ef2ecf026..45bf7560e6f 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -25,6 +25,7 @@
 
 AC_INIT
 AC_CONFIG_SRCDIR(tree.cc)
+AC_CONFIG_MACRO_DIRS([../config] [..])
 AC_CONFIG_HEADER(auto-host.h:config.in)
 
 gcc_version=`cat $srcdir/BASE-VER`
diff --git a/libobjc/configure b/libobjc/configure
index a8fdc643349..6da20b8e4ff 100755
--- a/libobjc/configure
+++ b/libobjc/configure
@@ -2218,6 +2218,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 
 
+
 ac_aux_dir=
 for ac_dir in "$srcdir" "$srcdir/.." "$srcdir/../.."; do
   if test -f "$ac_dir/install-sh"; then
@@ -10795,7 +10796,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 10798 "configure"
+#line 10799 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
@@ -10901,7 +10902,7 @@ else
   lt_dlunknown=0; lt_dlno_uscore=1; lt_dlneed_uscore=2
   lt_status=$lt_dlunknown
   cat > conftest.$ac_ext <<_LT_EOF
-#line 10904 "configure"
+#line 10905 "configure"
 #include "confdefs.h"
 
 #if HAVE_DLFCN_H
diff --git a/libobjc/configure.ac b/libobjc/configure.ac
index f8f577cfbef..6f58a45d4cb 100644
--- a/libobjc/configure.ac
+++ b/libobjc/configure.ac
@@ -20,6 +20,7 @@
 
 AC_INIT(package-unused, version-unused,, libobjc)
 AC_CONFIG_SRCDIR([objc/objc.h])
+AC_CONFIG_MACRO_DIRS([../config] [..])
 GCC_TOPLEV_SUBDIRS
 
 # We need the following definitions because AC_PROG_LIBTOOL relies on them
-- 
2.35.1

amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421] (was: [PATCH] [og12] amdgcn: Use FLAT addressing for all functions with pointer arguments)

2022-10-20 Thread Thomas Schwinge

Hi!

On 2022-10-14T13:38:55+, Julian Brown  wrote:
> The GCN backend uses a heuristic to determine whether to use FLAT or
> GLOBAL addressing in a particular (offload) function: namely, if a
> function takes a pointer-to-scalar parameter, it is assumed that the
> pointer may refer to "flat scratch" space, and thus FLAT addressing must
> be used instead of GLOBAL.
>
> I came up with this heuristic initially whilst working on support for
> moving OpenACC gang-private variables into local-data share (scratch)
> memory. The assumption that only scalar variables would be transformed in
> that way turned out to be wrong.  For example, prior to the next patch in
> the series, Fortran compiler-generated temporary structures were treated
> as gang private and moved to LDS space, typically overflowing the region
> allocated for such variables.  That will no longer happen after that
> patch is applied, but there may be other cases of structs moving to LDS
> space now or in the future that this patch may be needed for.
>
> Tested with offloading to AMD GCN. I will apply shortly (to og12).

Thanks.  I've verified that this does resolve PR105421
"GCN offloading, raised '-mgang-private-size': 
'HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION'"
and have thus added PR105421 tags to your commit log, and with that
pushed to master branch commit 7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments 
[PR105421]",
see attached.


Regards
 Thomas


From 7c55755d4c760de326809636531478fd7419e1e5 Mon Sep 17 00:00:00 2001
From: Julian Brown 
Date: Fri, 14 Oct 2022 11:06:07 +
Subject: [PATCH] amdgcn: Use FLAT addressing for all functions with pointer
 arguments [PR105421]

The GCN backend uses a heuristic to determine whether to use FLAT or
GLOBAL addressing in a particular (offload) function: namely, if a
function takes a pointer-to-scalar parameter, it is assumed that the
pointer may refer to "flat scratch" space, and thus FLAT addressing must
be used instead of GLOBAL.

I came up with this heuristic initially whilst working on support for
moving OpenACC gang-private variables into local-data share (scratch)
memory. The assumption that only scalar variables would be transformed in
that way turned out to be wrong.  For example, prior to the next patch in
the series, Fortran compiler-generated temporary structures were treated
as gang private and moved to LDS space, typically overflowing the region
allocated for such variables.  That will no longer happen after that
patch is applied, but there may be other cases of structs moving to LDS
space now or in the future that this patch may be needed for.

2022-10-14  Julian Brown  

	PR target/105421
gcc/
	* config/gcn/gcn.cc (gcn_detect_incoming_pointer_arg): Any pointer
	argument forces FLAT addressing mode, not just
	pointer-to-non-aggregate.
---
 gcc/config/gcn/gcn.cc | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/gcc/config/gcn/gcn.cc b/gcc/config/gcn/gcn.cc
index 8777255a5c6..a9ef5c3dc02 100644
--- a/gcc/config/gcn/gcn.cc
+++ b/gcc/config/gcn/gcn.cc
@@ -2809,10 +2809,14 @@ gcn_arg_partial_bytes (cumulative_args_t cum_v, const function_arg_info &arg)
   return (NUM_PARM_REGS - cum_num) * regsize;
 }
 
-/* A normal function which takes a pointer argument (to a scalar) may be
-   passed a pointer to LDS space (via a high-bits-set aperture), and that only
-   works with FLAT addressing, not GLOBAL.  Force FLAT addressing if the
-   function has an incoming pointer-to-scalar parameter.  */
+/* A normal function which takes a pointer argument may be passed a pointer to
+   LDS space (via a high-bits-set aperture), and that only works with FLAT
+   addressing, not GLOBAL.  Force FLAT addressing if the function has an
+   incoming pointer parameter.  NOTE: This is a heuristic that works in the
+   offloading case, but in general, a function might read global pointer
+   variables, etc. that may refer to LDS space or other special memory areas
+   not supported by GLOBAL instructions, and then this argument check would not
+   suffice.  */
 
 static void
 gcn_detect_incoming_pointer_arg (tree fndecl)
@@ -2822,8 +2826,7 @@ gcn_detect_incoming_pointer_arg (tree fndecl)
   for (tree arg = TYPE_ARG_TYPES (TREE_TYPE (fndecl));
arg;
arg = TREE_CHAIN (arg))
-if (POINTER_TYPE_P (TREE_VALUE (arg))
-	&& !AGGREGATE_TYPE_P (TREE_TYPE (TREE_VALUE (arg))))
+if (POINTER_TYPE_P (TREE_VALUE (arg)))
   cfun->machine->use_flat_addressing = true;
 }
 
-- 
2.35.1
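
To make the change in the heuristic concrete, here is a small stand-alone model of the old and new checks.  It is a sketch under stated assumptions: `param_type`, `needs_flat_old` and `needs_flat_new` are hypothetical names invented for this illustration, and the two boolean fields merely stand in for what POINTER_TYPE_P and AGGREGATE_TYPE_P report on the real tree types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified description of one parameter type.
struct param_type
{
  bool is_pointer;            // models POINTER_TYPE_P on the parameter
  bool pointee_is_aggregate;  // models AGGREGATE_TYPE_P on the pointee
};

// Old heuristic: only pointers to non-aggregates force FLAT addressing.
bool
needs_flat_old (const std::vector<param_type> &params)
{
  for (const param_type &p : params)
    if (p.is_pointer && !p.pointee_is_aggregate)
      return true;
  return false;
}

// New heuristic: any pointer parameter forces FLAT addressing.
bool
needs_flat_new (const std::vector<param_type> &params)
{
  for (const param_type &p : params)
    if (p.is_pointer)
      return true;
  return false;
}
```

For a function whose only parameter is a pointer to a struct, the old check answers "no" and the new one answers "yes".  That is the case described in the commit message, where structures moved to LDS space were then accessed with GLOBAL addressing.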

Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421] (was: amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421])

2022-10-20 Thread Thomas Schwinge

Hi!

On 2022-10-20T12:05:28+0200, I wrote:
> On 2022-10-14T13:38:55+, Julian Brown  wrote:
>> The GCN backend uses a heuristic to determine whether to use FLAT or
>> GLOBAL addressing in a particular (offload) function: namely, if a
>> function takes a pointer-to-scalar parameter, it is assumed that the
>> pointer may refer to "flat scratch" space, and thus FLAT addressing must
>> be used instead of GLOBAL.
>>
>> I came up with this heuristic initially whilst working on support for
>> moving OpenACC gang-private variables into local-data share (scratch)
>> memory. The assumption that only scalar variables would be transformed in
>> that way turned out to be wrong.  For example, [...]
>> Fortran compiler-generated temporary structures were treated
>> as gang private and moved to LDS space, typically overflowing the region
>> allocated for such variables.  [...]
>> there may be other cases of structs moving to LDS
>> space now or in the future that this patch may be needed for.

When I originally looked into PR105421
"GCN offloading, raised '-mgang-private-size': 
'HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION'",
I experimented with several test cases, none of which exhibited this
problem.  Now I understand that a 'struct' (as implied by
PR105421's Fortran 'write', for example) was the crucial ingredient
(that is, 'AGGREGATE_TYPE_P (TREE_TYPE (TREE_VALUE (arg)))' in the context of
the previous code).  With...

> pushed to master branch commit 7c55755d4c760de326809636531478fd7419e1e5
> "amdgcn: Use FLAT addressing for all functions with pointer arguments 
> [PR105421]"

... that addressed, I've now pushed to master branch
commit c7ebee2378426eeca425ca5406af213a926f154c
"Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421]", see
attached.


Regards
 Thomas


From c7ebee2378426eeca425ca5406af213a926f154c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Tue, 18 Oct 2022 00:13:47 +0200
Subject: [PATCH] Add 'libgomp.oacc-c-c++-common/private-big-1.c' [PR105421]

After commit r13-3404-g7c55755d4c760de326809636531478fd7419e1e5
"amdgcn: Use FLAT addressing for all functions with pointer arguments [PR105421]",
"big" private data now works for GCN offloading, too.

	PR target/105421
	libgomp/
	* testsuite/libgomp.oacc-c-c++-common/private-big-1.c: New.
---
 .../libgomp.oacc-c-c++-common/private-big-1.c | 100 ++
 1 file changed, 100 insertions(+)
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c

diff --git a/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c
new file mode 100644
index 000..c0e8db0c894
--- /dev/null
+++ b/libgomp/testsuite/libgomp.oacc-c-c++-common/private-big-1.c
@@ -0,0 +1,100 @@
+/* Test "big" private data.  */
+
+/* { dg-additional-options -fno-inline } for stable results regarding OpenACC 'routine'.  */
+
+/* { dg-additional-options -fopt-info-all-omp }
+   { dg-additional-options --param=openacc-privatization=noisy }
+   { dg-additional-options -foffload=-fopt-info-all-omp }
+   { dg-additional-options -foffload=--param=openacc-privatization=noisy }
+   for testing/documenting aspects of that functionality.  */
+
+/* { dg-additional-options -Wopenacc-parallelism } for testing/documenting
+   aspects of that functionality.  */
+
+/* For GCN offloading compilation, we (expectedly) run into a
+   'gang-private data-share memory exhausted' error: the default
+   '-mgang-private-size' is too small.  Raise it so that 'uint32_t x[344]' plus
+   some internal-use data fits in:
+   { dg-additional-options -foffload-options=amdgcn-amdhsa=-mgang-private-size=1555 { target openacc_radeon_accel_selected } } */
+
+/* It's only with Tcl 8.5 (released in 2007) that "the variable 'varName'
+   passed to 'incr' may be unset, and in that case, it will be set to [...]",
+   so to maintain compatibility with earlier Tcl releases, we manually
+   initialize counter variables:
+   { dg-line l_dummy[variable c_compute 0 c_loop 0] }
+   { dg-message dummy {} { target iN-VAl-Id } l_dummy } to avoid
+   "WARNING: dg-line var l_dummy defined, but not used".  */
+
+#include 
+#include 
+
+
+/* Based on 'private-variables.c:loop_g_5'.  */
+
+/* To demonstrate PR105421 "GCN offloading, raised '-mgang-private-size':
+   'HSA_STATUS_ERROR_MEMORY_APERTURE_VIOLATION'", a 'struct' indirection, for
+   example, has been necessary in combination with a separate routine.  */
+
+struct data
+{
+  uint32_t *x;
+  uint32_t *arr;
+  uint32_t i;
+};
+
+#pragma acc routine worker
+static void
+loop_g_5_r(struct data *data)
+{
+  uint32_t *x = data->x;
+  uint32_t *arr = data->arr;
+  uint32_t i = data->i;
+
+#pragma acc loop /* { dg-line

Re: [Patch] libgomp: Add offload_device_gcn check, add requires-4a.c test

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Wed, Oct 12, 2022 at 04:05:32PM +0200, Tobias Burnus wrote:
> include/ChangeLog:
> 
>   * gomp-constants.h (GOMP_DEVICE_HSA): Comment (unused).

Comment out unused define.
or so, please.

> libgomp/ChangeLog:
> 
>   * testsuite/lib/libgomp.exp (check_effective_target_offload_device_gcn):
>   New.
>   * testsuite/libgomp.c-c++-common/on_device_arch.h (device_arch_gcn,
>   on_device_arch_gcn): New.
>   * testsuite/libgomp.c-c++-common/requires-4a.c: New test; copied from
>   requires-4.c but using heap-allocated memory.

Otherwise LGTM.

Jakub

[PATCH] Avoid PHI - PHI recurrence in vectorization

2022-10-20 Thread Richard Biener via Gcc-patches

The reported regression of libgomp loop-14.C shows that there isn't
generally a good reliable place to insert the permute upfront so
the following simply restricts recurrence vectorization to the cases
where the latch value isn't defined by a PHI.
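
For readers unfamiliar with the term, the following is a minimal sketch of the two loop shapes involved; it is an illustration, not the libgomp loop-14.C test case, and the function names are hypothetical. In the first loop the value carried to the next iteration comes from a load; in the second it comes from another loop-carried variable, which in SSA form I would expect to be a loop-header PHI, i.e. the shape this patch stops treating as a first-order recurrence.

```c
#include <stddef.h>

/* First-order recurrence: each iteration reads the value the previous
   iteration stored in 'prev'.  The value carried over the latch is a
   plain load of a[i].  */
void
adjacent_diff (const int *a, int *out, size_t n, int init)
{
  int prev = init;
  for (size_t i = 0; i < n; ++i)
    {
      out[i] = a[i] - prev;
      prev = a[i];
    }
}

/* Here 'prev2' takes its next value from 'prev1', which is itself
   carried around the loop.  Assumption: in SSA form the latch value of
   'prev2' is then defined by the header PHI of 'prev1'.  */
void
second_order_diff (const int *a, int *out, size_t n, int init1, int init2)
{
  int prev1 = init1, prev2 = init2;
  for (size_t i = 0; i < n; ++i)
    {
      out[i] = a[i] - prev2;
      prev2 = prev1;
      prev1 = a[i];
    }
}
```

Both functions are ordinary scalar code; whether either loop is actually vectorized depends on the target and options.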

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

* tree-vect-loop.cc (vect_phi_first_order_recurrence_p):
Disallow latch PHI defs.
(vectorizable_recurr): Revert previous change.
---
 gcc/tree-vect-loop.cc | 11 +++
 1 file changed, 3 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-vect-loop.cc b/gcc/tree-vect-loop.cc
index 92790bd8095..d5c2bff80be 100644
--- a/gcc/tree-vect-loop.cc
+++ b/gcc/tree-vect-loop.cc
@@ -543,6 +543,7 @@ vect_phi_first_order_recurrence_p (loop_vec_info 
loop_vinfo, class loop *loop,
   tree ldef = PHI_ARG_DEF_FROM_EDGE (phi, latch);
   if (TREE_CODE (ldef) != SSA_NAME
   || SSA_NAME_IS_DEFAULT_DEF (ldef)
+  || is_a <gphi *> (SSA_NAME_DEF_STMT (ldef))
   || !flow_bb_inside_loop_p (loop, gimple_bb (SSA_NAME_DEF_STMT (ldef
 return false;
 
@@ -8486,14 +8487,8 @@ vectorizable_recurr (loop_vec_info loop_vinfo, 
stmt_vec_info stmt_info,
  vectorized the latch definition.  */
   edge le = loop_latch_edge (LOOP_VINFO_LOOP (loop_vinfo));
   gimple *latch_def = SSA_NAME_DEF_STMT (PHI_ARG_DEF_FROM_EDGE (phi, le));
-  gimple_stmt_iterator gsi2;
-  if (is_a <gphi *> (latch_def))
-gsi2 = gsi_after_labels (gimple_bb (latch_def));
-  else
-{
-  gsi2 = gsi_for_stmt (latch_def);
-  gsi_next (&gsi2);
-}
+  gimple_stmt_iterator gsi2 = gsi_for_stmt (latch_def);
+  gsi_next (&gsi2);
 
   for (unsigned i = 0; i < ncopies; ++i)
 {
-- 
2.35.3

[committed] wwwdocs: *: Omit trailing slash for <img> tags

2022-10-20 Thread Gerald Pfeifer

HTML 5 now recommends against trailing slashes on void elements, so
<img> it is instead of <img />.
---
 htdocs/index.html   | 2 +-
 htdocs/news/egcs-vcg.html   | 2 +-
 htdocs/news/gcse.html   | 2 +-
 htdocs/projects/gupc.html   | 2 +-
 htdocs/projects/tree-ssa/index.html | 2 +-
 htdocs/style.mhtml  | 7 +++
 6 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/htdocs/index.html b/htdocs/index.html
index 761a598a..f4eb4817 100644
--- a/htdocs/index.html
+++ b/htdocs/index.html
@@ -13,7 +13,7 @@
 
 GCC, the GNU Compiler Collection
 
-
+
 
 The GNU Compiler Collection includes front ends for
 C,
diff --git a/htdocs/news/egcs-vcg.html b/htdocs/news/egcs-vcg.html
index 5213152d..2817ad06 100644
--- a/htdocs/news/egcs-vcg.html
+++ b/htdocs/news/egcs-vcg.html
@@ -141,7 +141,7 @@ test.c.cse.vcgtest.c.greg.vcg  test.c.lreg.vcg   
test.c.stack.vcg
 If you view these files using a suitable program, you'll get output
 similar to the following:
 
-
+
 
 These are nodes representing all the functions in the file.  If
 you expand the nodes you can get a picture like
diff --git a/htdocs/news/gcse.html b/htdocs/news/gcse.html
index 80244281..5da4749b 100644
--- a/htdocs/news/gcse.html
+++ b/htdocs/news/gcse.html
@@ -52,7 +52,7 @@ Fred Chow's thesis.
 flow graph:
 
 
-
+
 
 
 
diff --git a/htdocs/projects/gupc.html b/htdocs/projects/gupc.html
index a48d44c3..e38cd240 100644
--- a/htdocs/projects/gupc.html
+++ b/htdocs/projects/gupc.html
@@ -12,7 +12,7 @@
 GNU Unified Parallel C (GUPC)
 
 
+ alt="GUPC Logo" title="GUPC">
 
 The GNU UPC project implements a compilation and execution environment for
 programs written in the
diff --git a/htdocs/projects/tree-ssa/index.html 
b/htdocs/projects/tree-ssa/index.html
index afe7ac35..7d3740e4 100644
--- a/htdocs/projects/tree-ssa/index.html
+++ b/htdocs/projects/tree-ssa/index.html
@@ -192,7 +192,7 @@ based on the algorithms described by Cytron et. al.
 
 The graph below describes the process:
 
-
+
 
 The front ends described in the graph are just an example.  In
 general, any front end that can emit functions-as-trees can be
diff --git a/htdocs/style.mhtml b/htdocs/style.mhtml
index 4fc5e8ad..0790a972 100644
--- a/htdocs/style.mhtml
+++ b/htdocs/style.mhtml
@@ -74,14 +74,13 @@
   
   https://twitter.com/gnutools";>
 @gnutools
+  height="42" width="42" class="middle"
+  alt="@gnutools on Twitter">@gnutools
   
   
   https://my.fsf.org/civicrm/contribute/transact?reset=1&id=57";>
 
+  height="23" width="100" alt="Donate to GNU Toolchain Fund">
   
   
   
-- 
2.38.0

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/19/22 18:42, Joseph Myers wrote:
> On Wed, 19 Oct 2022, Martin Liška wrote:
> 
>>> Currently, there is a tarball with texinfo sources for all the manuals
>>> for each version.
>>
>> Well, then equivalent would be packaging all .rst files together with the 
>> corresponding
>> conf.py, logo.* and other files. But I don't see it much useful.
> 
> I think we should have such a source tarball when the sources are .rst, as 
> the successor to the Texinfo source tarball.

All right, putting that on my TODO list after the conversion happens. Won't be
difficult to achieve.

Cheers,
Martin

> 
> (Unfortunately when I added that source tarball - 
> https://gcc.gnu.org/legacy-ml/gcc-patches/2004-01/msg00140.html - I didn't 
> give specific references to any of the individual requests that resulted 
> in adding it.)

Remove support for Intel MIC offloading (was: [PATCH] Remove dead code.)

2022-10-20 Thread Thomas Schwinge

Hi!

On 2021-11-12T06:41:41-0800, "H.J. Lu via Gcc-patches" wrote:
> On Fri, Nov 12, 2021 at 6:27 AM Martin Liška  wrote:
>> On 11/8/21 15:19, Jeff Law wrote:
>> > On 11/8/2021 2:59 AM, Jakub Jelinek via Gcc-patches wrote:
>> >> liboffloadmic is copied from upstream [...]
>> >> But I have no idea where it even lives upstream.
>> > I thought MIC as an architecture was dead, so it could well be the case 
>> > that there isn't a viable upstream anymore for that code.
>>
>> @H.J. ?
>
> We'd like to deprecate MIC offload in GCC 12.

This had been done in
wwwdocs commit 5c7ecfb5627e412a3d142d8dc212f4cd39b3b73f
"Document deprecation of OpenMP MIC offloading in GCC 12".

I'm sad about this, because -- in theory -- such a plugin is very useful
for offloading simulation/debugging (separate host/device memory spaces,
allow sanitizers to run on offloaded code (like LLVM a while ago
implemented), and so on), but all that doesn't help -- in practice -- if
nobody is maintaining that code.  Also, currently that (very "bulky")
code is buildable for x86/x86_64 GNU/Linux only (again for no particular
reason, as far as I can tell).

> We will remove all traces of
> MIC offload in GCC 13.

This had come up again at the GNU Tools Cauldron 2022 (relevant folks
CCed), and I had been tasked to execute that.  Explicitly note that this
does not bear any relationship with our ongoing work to support
offloading to AMD and Nvidia GPUs: the more, the merrier, as far as I'm
concerned, and actually I had been testing Intel MIC (emulated)
offloading until a few days ago.  (Also, I had been curious about support
for Intel GPUs --

"GCC/OpenMP offloading for Intel GPUs?" -- but Intel don't seem
interested in working on that themselves?)

I'm proposing the attached "Remove support for Intel MIC offloading"
(generated with 'git format-patch --irreversible-delete', and 'diff's for
regenerated files manually snipped, to reduce its size).

Regards
 Thomas

>From cedd4fa1ad2ee78355d75b30696669716cc9546e Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 17 Oct 2022 22:19:55 +0200
Subject: [PATCH] Remove support for Intel MIC offloading

---
 Makefile.def  | 8 -
 Makefile.in   |   520 +-
 configure |66 +-
 configure.ac  |57 +-
 contrib/gcc-changelog/git_commit.py   | 1 -
 contrib/gcc_update| 6 -
 contrib/update-copyright.py   | 1 -
 gcc/config.gcc| 8 -
 gcc/config/i386/i386-options.cc   | 4 -
 gcc/config/i386/intelmic-mkoffload.cc |   728 -
 gcc/config/i386/intelmic-offload.h|35 -
 gcc/config/i386/t-intelmic|10 -
 gcc/config/i386/t-omp-device  | 6 -
 gcc/configure |14 +-
 gcc/configure.ac  |10 -
 gcc/doc/install.texi  | 2 +-
 gcc/doc/sourcebuild.texi  | 3 -
 include/gomp-constants.h  | 3 +-
 libgomp/configure | 3 -
 libgomp/libgomp-plugin.h  | 1 -
 libgomp/libgomp.texi  | 4 -
 libgomp/plugin/configfrag.ac  | 3 -
 libgomp/testsuite/lib/libgomp.exp |37 -
 .../libgomp.c-c++-common/on_device_arch.h |35 -
 .../libgomp.c-c++-common/target-45.c  | 2 -
 .../testsuite/libgomp.fortran/target10.f90| 1 -
 liboffloadmic/ChangeLog   |   765 -
 liboffloadmic/Makefile.am |   160 -
 liboffloadmic/Makefile.in |  1310 --
 liboffloadmic/aclocal.m4  |  1180 --
 liboffloadmic/configure   | 17512 
 liboffloadmic/configure.ac|   143 -
 liboffloadmic/configure.tgt   |39 -
 liboffloadmic/doc/doxygen/config  |  2328 --
 liboffloadmic/doc/doxygen/header.tex  |90 -
 .../include/coi/common/COIEngine_common.h |   121 -
 .../include/coi/common/COIEvent_common.h  |84 -
 .../include/coi/common/COIMacros_common.h |   229 -
 .../include/coi/common/COIPerf_common.h   |87 -
 .../include/coi/common/COIResult_common.h |   148 -
 .../include/coi/common/COISysInfo_common.h|   126 -
 .../include/coi/common/COITypes_common.h  |88 -
 .../include/coi/sink/COIBuffer_sink.h |   133 -
 .../include/coi/sink/COIPipeline

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/19/22 18:30, Sandra Loosemore wrote:
> On 10/19/22 05:09, Martin Liška wrote:
>> On 10/18/22 00:26, Sandra Loosemore wrote:
>>> On 10/17/22 07:28, Martin Liška wrote:
>>>> Hello.
>>>>
>>>> Based on the very positive feedback I was given at the Cauldron Sphinx Documentation BoF,
>>>> I'm planning migrating the documentation on 9th November. There are still some minor comments
>>>> from Sandra when it comes to the PDF output, but we can address that once the conversion is done.
>>>
>>> My main complaint about the PDF is that the blue color used for link text 
>>> is so light it interferes with readability.  Few people are going to print 
>>> the document on paper any more, but I did try printing a sample page on a 
>>> grayscale printer and the blue link text came out so faint that it was 
>>> barely visible at all.
>>
>> Sure, I've just added support for monochromatic PDF output where one needs 
>> to use
>> MONOCHROMATIC=1 make latexpdf ...
>>
>> and I linked the file here:
>> https://splichal.eu/scripts/sphinx/gcc/_build/latexmonochromatic/gcc.pdf
>>
>> right now I build only one PDF in this mode and it's mentioned here:
>> https://splichal.eu/scripts/sphinx/
>>
>> What do you think about it now?
> 
> Hmmm, removing *all* visual cues that something is a link does not seem so 
> great either, especially since the new format has changed the link text for 
> @xref to remove the page and section information.  E.g. we used to get "See 
> Section 3.4 [Options Controlling C Dialect], page 44." and now it just reads 
> "See Options Controlling C Dialect."
> 

Hey.

> I realize there is a can of worms here involving philosophical issues about 
> whether the PDF manual is intended to be formatted for reading as a book or 
> is just a handy way to repackage the hyperlinked web presentation for offline 
> reference.  Also there is another can of worms involving making the 
> documentation accessible to people who have visual disabilities, specifically 
> color blindness issues.  Just speaking for myself, I'd be happy if the PDF 
> just used a darker blue color for links that is both distinguishing and 
> higher contrast with the background than the current light blue, but I think 
> it is one of the principles of accessible design that color really shouldn't 
> be the *only* indication of something that initiates an action.  Maybe 
> underlining, or a little link glyph, or restoring the section/page info to 
> the link text?

I've just tweaked the monochrome PDF so that a dark blue color is used for links.
As for the links, there are multiple PDF viewers (like Evince) which can show a
preview if you hover over a link. Plus, the page number is shown in a toolbar.

When it comes to the philosophical issues of the monochrome PDF, well, I would
recommend discussing that with the Sphinx upstream project. I bet they have other
projects whose readers have similar needs. The intention of my monochrome PDF was
to show that the Sphinx PDF output can be adjusted quite easily.

> 
>>
>>>    An E-ink reader device would probably have similar problems.
>>
>> There ePUB would be likely better output format. What do you think?
> 
> Ooof, a lot of problems there.  I looked at your new generated .epub in both 
> the "ebook-viewer" utility on my laptop and on my Kobo Forma.  The Kobo uses 
> the default proportionally-spaced font for everything; even the code examples 
> fail to come out in a fixed-width font.  ebook-viewer shows fixed-width fonts 
> for code examples and inline references to e.g. command line options, but the 
> names of options in the option tables sections are in the proportional body 
> font.  Also in both viewers I see hyperlinks to https://splicha.eu/...  in 
> place of internal links in some references to command-line options and the 
> like, and the formatting of the option summary tables really sucks, with 
> lines breaking at hyphens in the middle of option names.

Sure, let's leave it for now and keep it as a might-have thing for the future!

Appreciate the feedback,
Cheers,
Martin

> 
> I suggest we try to focus our efforts on the currently-supported formats 
> before adding EPUB as a new format.
> 
> -Sandra

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/20/22 04:26, Xi Ruoyao wrote:
> On Mon, 2022-10-17 at 15:28 +0200, Martin Liška wrote:
>> Hello.
>>
>> Based on the very positive feedback I was given at the Cauldron Sphinx 
>> Documentation BoF,
>> I'm planning migrating the documentation on 9th November. There are still 
>> some minor comments
>> from Sandra when it comes to the PDF output, but we can address that once 
>> the conversion is done.
>>
>> The reason I'm sending the email now is that I was waiting for latest Sphinx 
>> release (5.3.0) that
>> simplifies reference format for options and results in much simpler Option 
>> summary section ([1])
>>
>> The current GCC master (using Sphinx 5.3.0) converted docs can be seen here:
>> https://splichal.eu/scripts/sphinx/
>>
>> If you see any issues with the converted documentation, or have a feedback 
>> about it,
>> please reply to this email.
> 
> Ouch.  This will be very painful for Linux From Scratch.  We'll need to
> add 23 Python modules to build the documentation, while we only have 88
> packages in total currently...  And we don't want to omit GCC
> documentation in our system.

Various other distros will have to face it too. The proper solution is a multi-build
package (gcc:doc) which can be built later in the dependency chain. Btw. do you also
provide PDF documentation in your system?

> 
> Could generated man and info pages be provided as a tarball on
> gcc.gnu.org or ftp.gnu.org?

Not planning doing that.

Cheers,
Martin

Re: [DOCS] Python Language Conventions

2022-10-20 Thread Gerald Pfeifer

On Mon, 17 Oct 2022, Martin Liška wrote:
> All right, let me install my initial patch with the improved wording.

The validator noticed a small issue which I addressed thusly (by
moving up the </p> - the beginning of <ul> implicitly closes a <p>).

No worries - that's what validators are for. :-)

Gerald

commit e9164572d233645b51ed8fa27729a52a0e242984
Author: Gerald Pfeifer 
Date:   Thu Oct 20 13:04:48 2022 +0200

codingconventions: Fix markup

diff --git a/htdocs/codingconventions.html b/htdocs/codingconventions.html
index 9d0a3f14..f88ef019 100644
--- a/htdocs/codingconventions.html
+++ b/htdocs/codingconventions.html
@@ -1486,17 +1486,15 @@ Definitions within the body of a namespace are not 
indented.
 
 Python Language Conventions
 
-
-Python scripts should follow https://peps.python.org/pep-0008/";>PEP 8 
– Style Guide for Python Code
+Python scripts should follow https://peps.python.org/pep-0008/";>PEP 8 – Style Guide for Python Code
 which can be verified by the flake8 tool.
-We recommend using the following flake8 plug-ins:
+We recommend using the following flake8 plug-ins:
 
 
 flake8-builtins
 flake8-import-order
 flake8-quotes
 
-

Re: Remove support for Intel MIC offloading (was: [PATCH] Remove dead code.)

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Thu, Oct 20, 2022 at 01:15:43PM +0200, Thomas Schwinge wrote:
> I'm proposing the attached "Remove support for Intel MIC offloading"
> (generated with 'git format-patch --irreversible-delete', and 'diff's for
> regenerated files manually snipped, to reduce its size).

ChangeLog missing, you'll need one for a successful commit.

Otherwise LGTM.  But we'll need to update the offloading wiki too.

Jakub

Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))

2022-10-20 Thread Thomas Schwinge

Hi!

On 2022-10-18T07:41:29+0200, Aldy Hernandez  wrote:
> On Mon, Oct 17, 2022 at 4:47 PM Thomas Schwinge  
> wrote:
>> On 2022-10-17T15:58:47+0200, Aldy Hernandez  wrote:
>> > On Mon, Oct 17, 2022 at 9:44 AM Thomas Schwinge  
>> > wrote:
>> >> On 2022-10-11T10:31:37+0200, Aldy Hernandez via Gcc-patches 
>> >>  wrote:
>> >> > When solving 0 = _15 & 1, we calculate _15 as:
>> >> >
>> >> >   [irange] int [-INF, -2][0, +INF] NONZERO 0xfffe
>> >> >
>> >> > The known value of _15 is [0, 1] NONZERO 0x1 which is intersected with
>> >> > the above, yielding:
>> >> >
>> >> >   [0, 1] NONZERO 0x0
>> >> >
>> >> > This eventually gets copied to a _Bool [0, 1] NONZERO 0x0.
>> >> >
>> >> > This is problematic because here we have a bool which is zero, but
>> >> > returns false for irange::zero_p, since the latter does not look at
>> >> > nonzero bits.  This causes logical_combine to assume the range is
>> >> > not-zero, and all hell breaks loose.
>> >> >
>> >> > I think we should just normalize a nonzero mask of 0 to [0, 0] at
>> >> > creation, thus avoiding all this.
>> >>
>> >> 1. This commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0" broke a GCC/nvptx
>> >> offloading test case:
>> >>
>> >> UNSUPPORTED: 
>> >> libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c 
>> >> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
>> >> PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c 
>> >> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
>> >> (test for excess errors)
>> >> PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c 
>> >> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
>> >> execution test
>> >> [-PASS:-]{+FAIL:+} 
>> >> libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-sese-1.c 
>> >> -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   
>> >> scan-nvptx-none-offload-rtl-dump mach "SESE regions:.* 
>> >> [0-9]+{[0-9]+->[0-9]+(\\.[0-9]+)+}"
>> >>
>> >> Same for C++.
>> >>
>> >> I'll later send a patch (for the test case!) to fix that up.
>> >>
>> >> 2. Looking into this, I found that this
>> >> commit r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0" actually enables a
>> >> code transformation/optimization that GCC apparently has not been doing
>> >> before!  I've tried to capture that in the attached
>> >> "Add 'c-c++-common/torture/pr107195-1.c' [PR107195]".
>> >
>> > Nice.
>> >
>> >> Will you please verify that one?  In its current '#if 1' configuration,
>> >> it's all-PASS after commit
>> >> r13-3217-gc4d15dddf6b9eacb36f535807ad2ee364af46e04
>> >> "[PR107195] Set range to zero when nonzero mask is 0", whereas before, we
>> >> get two calls to 'foo', because GCC apparently didn't understand the
>> >> relation (optimization opportunity) between 'r *= 2;' and the subsequent
>> >> 'if (r & 1)'.
>> >
>> > Yeah, that looks correct.  We keep better track of nonzero masks.
>>
>> OK, next observation: this also works for split-up expressions
>> 'if ((r & 2) && (r & 1))' (same rationale as for 'if (r & 1)' alone).
>> I've added such a variant in my test case.
>
> Unless I'm missing something, your testcase doesn't have a body for
> foo[123], so GCC has no way to know what any of those functions did or
> what bits are set/unset.

Ah, there seems to be some confusion what's happening here.  :-)

First, these functions, 'foo[...]', are '__attribute__((const))', and
their argument, 'r' doesn't change if the first 'foo[...]' call returns
zero.  Thus, GCC can infer that the second 'foo[...]' call also must
return zero, and thus may elide that second function call.  Second,
should the first 'foo[...]' call return non-zero, 'r *= 2;' is executed,
and thus GCC can infer that 'if (r & 1)' can never hold, and thus the
'if' branch is not executed, and thus it may elide the second function
call for that scenario, too.  Thus, the second function is completely
elided.
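
To make that reasoning concrete, here is a minimal sketch of the pattern; it is an illustration under my own assumptions, not the attached test case. The body given to 'foo' is hypothetical and only there so the example links; what matters is the 'const' attribute. 'f1_elided' is what the compiler may reduce 'f1' to.

```c
/* Hypothetical stand-in: any function whose result depends only on
   its argument would do.  */
static int __attribute__ ((const))
foo (int x)
{
  return x > 10;
}

int
f1 (int r)
{
  if (foo (r))
    r *= 2;          /* Bit 0 of 'r' is now known to be zero.  */
  if (r & 1)         /* Only reachable if the first 'foo (r)' was zero...  */
    return foo (r);  /* ...so this call, same argument, must be zero.  */
  return r;
}

/* The same function with the second call replaced by its known
   result.  */
int
f1_elided (int r)
{
  if (foo (r))
    r *= 2;
  if (r & 1)
    return 0;
  return r;
}
```

The two functions return the same value for every 'r' for which 'r * 2' does not overflow.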

The attached "Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195]" demonstrates
that this does work for 'if (r & 1)' in 'f1', 'foo1', and also does work
for 'if ((r & 2) && (r & 1))' in 'f2', 'foo2', but:

>> But: it doesn't work for logically equal 'if (r & 3)'.

... in 'f3', 'foo3'.

I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
right?
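
A quick check of that question, as a sketch with hypothetical helper names: in C, 'r & 3' used as a condition holds when either of the two low bits is set, so it matches '(r & 2) || (r & 1)' rather than the '&&' form. After 'r *= 2;' bit 0 is clear but bit 1 may still be set, so 'if (r & 3)' can hold where 'if ((r & 2) && (r & 1))' cannot.

```c
/* Nonzero if at least one of the two low bits is set.  */
int
mask3 (int r)
{
  return (r & 3) != 0;
}

/* Nonzero only if both low bits are set.  */
int
both_bits (int r)
{
  return (r & 2) && (r & 1);
}

/* Nonzero if either low bit is set; agrees with mask3 for all r.  */
int
either_bit (int r)
{
  return (r & 2) || (r & 1);
}
```

For example, with r == 1 'mask3' is true while 'both_bits' is false, and with r == 6 (an even value, as after 'r *= 2;') 'mask3' is still true.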

>> I've added such
>> an XFAILed variant in my test case.  Do you have guidance what needs to
>> be done to make such cases work, too?

Thus my question, where/how GCC would learn this?


Otherwise:

>> >> I've left in the other '#if' variants in case you'd like to experiment
>> >> with these, but would otherwise clean that up before pushing.
>> >>
>> >> Where does one put such a test case?
>> >>
>> >> Should the file be named 'pr107195' or something else?
>> >
>> > The aforementioned patch already has:
>> >
>> > * gcc.dg/t

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Xi Ruoyao via Gcc-patches

(CC our team members.)

On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote:
> > Ouch.  This will be very painful for Linux From Scratch.  We'll need to
> > add 23 Python modules to build the documentation, while we only have 88
> > packages in total currently...  And we don't want to omit GCC
> > documentation in our system.
> 
> Various other distros will have to face it too. The proper solution is a 
> multi-build
> package (gcc:doc) which can be built later in the dependency chain. Btw. do 
> you also
> provide PDF documentation in your system?

No (texlive is much heavier than Sphinx).  But generally we expect man
pages and info pages.

We can separate man and info into the second-time build in BLFS (we're
already doing this now for Go, Objective C, etc.), but I don't really
like to omit the man and info pages...

> > Could generated man and info pages be provided as a tarball on
> > gcc.gnu.org or ftp.gnu.org?
> 
> Not planning doing that.

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/20/22 13:49, Xi Ruoyao wrote:
> (CC our team members.)
> 
> On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote:
>>> Ouch.  This will be very painful for Linux From Scratch.  We'll need to
>>> add 23 Python modules to build the documentation, while we only have 88
>>> packages in total currently...  And we don't want to omit GCC
>>> documentation in our system.
>>
>> Various other distros will have to face it too. The proper solution is a 
>> multi-build
>> package (gcc:doc) which can be built later in the dependency chain. Btw. do 
>> you also
>> provide PDF documentation in your system?
> 
> No (texlive is much heavier than Sphinx).  But generally we expect man
> pages and info pages.
> 
> We can separate man and info into the second-time build in BLFS (we're
> already doing this now for Go, Objective C, etc.),

Do the same for GCC.

> but I don't really
> like to omit the man and info pages..

What should I do about it? We want to switch to a more modern documentation tool
called Sphinx and yes, it will make packaging of GCC more complicated.

Martin

> 
>>> Could generated man and info pages be provided as a tarball on
>>> gcc.gnu.org or ftp.gnu.org?
>>
>> Not planning doing that.

Re: [DOCS] Python Language Conventions

2022-10-20 Thread Martin Liška

On 10/20/22 13:34, Gerald Pfeifer wrote:
> On Mon, 17 Oct 2022, Martin Liška wrote:
>> All right, let me install my initial patch with the improved wording.
> 
> The validator noticed a small issue which I addressed thusly (by 
> moving up the </p> - the beginning of <ul> implicitly closes a <p>).
> 
> No worries - that's what validators are for. :-)

Heh ;) Thanks for the fix.

Martin

> 
> Gerald
> 
> commit e9164572d233645b51ed8fa27729a52a0e242984
> Author: Gerald Pfeifer 
> Date:   Thu Oct 20 13:04:48 2022 +0200
> 
> codingconventions: Fix markup
> 
> diff --git a/htdocs/codingconventions.html b/htdocs/codingconventions.html
> index 9d0a3f14..f88ef019 100644
> --- a/htdocs/codingconventions.html
> +++ b/htdocs/codingconventions.html
> @@ -1486,17 +1486,15 @@ Definitions within the body of a namespace are not 
> indented.
>  
>  Python Language Conventions
>  
> -
> -Python scripts should follow https://peps.python.org/pep-0008/";>PEP 
> 8 – Style Guide for Python Code
> +Python scripts should follow  href="https://peps.python.org/pep-0008/";>PEP 8 – Style Guide for Python 
> Code
>  which can be verified by the flake8 tool.
> -We recommend using the following flake8 plug-ins:
> +We recommend using the following flake8 plug-ins:
>  
>  
>  flake8-builtins
>  flake8-import-order
>  flake8-quotes
>  
> -
>  
>  
>

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Xi Ruoyao via Gcc-patches

On Thu, 2022-10-20 at 13:53 +0200, Martin Liška wrote:
> On 10/20/22 13:49, Xi Ruoyao wrote:
> > (CC our team members.)
> > 
> > On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote:
> > > > Ouch.  This will be very painful for Linux From Scratch.  We'll need to
> > > > add 23 Python modules to build the documentation, while we only have 88
> > > > packages in total currently...  And we don't want to omit GCC
> > > > documentation in our system.
> > > 
> > > Various other distros will have to face it too. The proper solution is a 
> > > multi-build
> > > package (gcc:doc) which can be built later in the dependency chain. Btw. 
> > > do you also
> > > provide PDF documentation in your system?
> > 
> > No (texlive is much heavier than Sphinx).  But generally we expect man
> > pages and info pages.
> > 
> > We can separate man and info into the second-time build in BLFS (we're
> > already doing this now for Go, Objective C, etc.),
> 
> Do the same for GCC.
> 
> > but I don't really
> > like to omit the man and info pages..
> 
> What should I do about it? We want to switch to a more modern documentation 
> tool
> called Sphinx and yes, it will make packaging of the GCC more complicated.

Nothing, I guess.  We'll handle it on our side (if we finally decide to
ship the man/info tarballs we can generate them by ourselves).

I was just trying to find a simpler solution before beginning all the
work :).

Thanks!

Re: [PATCH] libstdc++: Redefine __from_chars_alnum_to_val's table

2022-10-20 Thread Jonathan Wakely via Gcc-patches

On Mon, 17 Oct 2022 at 17:27, Patrick Palka via Libstdc++
 wrote:
>
> It looks like the constexpr  commit r13-3313-g378a0f1840e694
> caused some modules regressions:
>
>   FAIL: g++.dg/modules/xtreme-header-4_b.C -std=c++2b (test for excess errors)
>   FAIL: g++.dg/modules/xtreme-header_b.C -std=c++2b (test for excess errors)
>
> Like PR105297, the problem seems to be the local class from
> __from_chars_alnum_to_val ending up as the type of a namespace-scope
> entity (the variable template __detail::__table in this case).
>
> This patch works around this modules issue by using an ordinary class
> instead of a local class.  Also, I suppose we might as well use a static
> data member to define the table once for all dialects instead of having
> to define it twice in C++23 mode, once as a static local variable (which
> isn't usable during constexpr evaluation) and again as a variable template
> (which is).
>
> Tested on x86_64-pc-linux-gnu, does this look OK for trunk?  Diff
> generated with -w to ignore noisy whitespace changes.

OK, thanks.


>
> libstdc++-v3/ChangeLog:
>
> * include/std/charconv (__detail::__from_chars_alnum_to_val_table):
> Redefine as a class template containing type, value and _S_table
> members.  Don't use a local class as the table type.
> (__detail::__table): Remove.
> (__detail::__from_chars_alnum_to_val): Adjust after the above.
> ---
>  libstdc++-v3/include/std/charconv | 31 ++-
>  1 file changed, 14 insertions(+), 17 deletions(-)
>
> diff --git a/libstdc++-v3/include/std/charconv 
> b/libstdc++-v3/include/std/charconv
> index 7aefdd3298c..c157d4c74ab 100644
> --- a/libstdc++-v3/include/std/charconv
> +++ b/libstdc++-v3/include/std/charconv
> @@ -413,14 +413,19 @@ namespace __detail
>return true;
>  }
>
> +  template
> +struct __from_chars_alnum_to_val_table
> +{
> +  struct type { unsigned char __data[1u << __CHAR_BIT__] = {}; };
> +
>// Construct and return a lookup table that maps 0-9, A-Z and a-z to 
> their
>// corresponding base-36 value and maps all other characters to 127.
> -  constexpr auto
> -  __from_chars_alnum_to_val_table()
> +  static constexpr type
> +  _S_table()
>{
> constexpr unsigned char __lower_letters[27] = 
> "abcdefghijklmnopqrstuvwxyz";
> constexpr unsigned char __upper_letters[27] = 
> "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
> -struct { unsigned char __data[1u << __CHAR_BIT__] = {}; } __table;
> +   type __table;
> for (auto& __entry : __table.__data)
>   __entry = 127;
> for (int __i = 0; __i < 10; ++__i)
> @@ -433,10 +438,11 @@ namespace __detail
> return __table;
>}
>
> -#if __cpp_lib_constexpr_charconv
> -  template
> -inline constexpr auto __table = __from_chars_alnum_to_val_table();
> -#endif
> +  // This initializer is made superficially dependent in order
> +  // to prevent the compiler from wastefully constructing the
> +  // table ahead of time when it's not needed.
> +  static constexpr type value = (_DecOnly, _S_table());
> +};
>
>// If _DecOnly is true: if the character is a decimal digit, then
>// return its corresponding base-10 value, otherwise return a value >= 127.
> @@ -449,16 +455,7 @@ namespace __detail
>if _GLIBCXX17_CONSTEXPR (_DecOnly)
> return static_cast(__c - '0');
>else
> -   {
> -#if __cpp_lib_constexpr_charconv
> - if (std::__is_constant_evaluated())
> -   return __table<_DecOnly>.__data[__c];
> -#endif
> - // This initializer is deliberately made dependent in order to work
> - // around modules bug PR105322.
> - static constexpr auto __table = (_DecOnly, 
> __from_chars_alnum_to_val_table());
> - return __table.__data[__c];
> -   }
> +   return __from_chars_alnum_to_val_table<_DecOnly>::value.__data[__c];
>  }
>
>/// std::from_chars implementation for integers in a power-of-two base.
> --
> 2.38.0.68.ge85701b4af
>
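
For reference, the table described in the patch comment (0-9, A-Z and a-z mapped to their base-36 value, everything else to 127) can be sketched in plain C as below. This is only an illustration of the mapping, not the libstdc++ code; the names 'alnum_table' and 'make_alnum_table' are hypothetical.

```c
#include <limits.h>

/* One entry per possible unsigned char value.  */
struct alnum_table
{
  unsigned char data[1u << CHAR_BIT];
};

/* Build a table that maps '0'-'9', 'A'-'Z' and 'a'-'z' to their
   base-36 value and every other character to 127.  */
struct alnum_table
make_alnum_table (void)
{
  static const char lower[] = "abcdefghijklmnopqrstuvwxyz";
  static const char upper[] = "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
  struct alnum_table t;

  for (unsigned i = 0; i < sizeof t.data; ++i)
    t.data[i] = 127;
  for (int i = 0; i < 10; ++i)
    t.data['0' + i] = i;
  for (int i = 0; i < 26; ++i)
    {
      t.data[(unsigned char) lower[i]] = 10 + i;
      t.data[(unsigned char) upper[i]] = 10 + i;
    }
  return t;
}
```

A caller would index the table with the character converted to 'unsigned char' and treat any result of 127 as "not a digit in any base up to 36".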

[committed] passes: Fix a comment typo

2022-10-20 Thread Jakub Jelinek via Gcc-patches

Hi!

This patch fixes a single typo in a comment.

Committed as obvious to trunk.

2022-10-20  Jakub Jelinek  

* passes.cc (pass_manager::register_pass): Fix a comment
typo - copmilation -> compilation.

--- gcc/passes.cc.jj2022-10-18 10:38:48.150406180 +0200
+++ gcc/passes.cc   2022-10-20 13:07:33.891705807 +0200
@@ -1559,7 +1559,7 @@ pass_manager::register_pass (struct regi
compile ()
ipa_passes ()   -> all_small_ipa_passes
-> Analysis of all_regular_ipa_passes
-   * possible LTO streaming at copmilation time *
+   * possible LTO streaming at compilation time *
-> Execution of all_regular_ipa_passes
* possible LTO streaming at link time *
-> all_late_ipa_passes

Jakub

[committed] testsuite: Add some missing -Wno-psabi options

2022-10-20 Thread Jakub Jelinek via Gcc-patches

Hi!

The following testcases FAIL on i686-linux due to excess diagnostics
for -Wpsabi.

Tested on x86_64-linux with
make check-gcc RUNTESTFLAGS='--target_board=unix\{-m32/-msse2,-m32/-mno-mmx/-mno-sse,-m64\} i386.exp=pr107271.c btf.exp=btf-function-3.c'
and committed to trunk as obvious.

2022-10-20  Jakub Jelinek  

* gcc.target/i386/pr107271.c: Add -Wno-psabi to dg-options.
* gcc.dg/debug/btf/btf-function-3.c: Likewise.

--- gcc/testsuite/gcc.target/i386/pr107271.c.jj 2022-10-19 11:20:54.633878743 +0200
+++ gcc/testsuite/gcc.target/i386/pr107271.c    2022-10-20 13:52:47.000966060 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O0" } */
+/* { dg-options "-O0 -Wno-psabi" } */
 
 typedef int __attribute__((__vector_size__ (16))) V;
 
--- gcc/testsuite/gcc.dg/debug/btf/btf-function-3.c.jj  2021-07-02 21:55:48.696990745 +0200
+++ gcc/testsuite/gcc.dg/debug/btf/btf-function-3.c     2022-10-20 13:17:58.793270607 +0200
@@ -7,7 +7,7 @@
has type_id=0.  */
 
 /* { dg-do compile } */
-/* { dg-options "-O0 -gbtf -dA" } */
+/* { dg-options "-O0 -gbtf -dA -Wno-psabi" } */
 
 /* { dg-final { scan-assembler-times "\[\t \]0xd03\[\t \]+\[^\n\]*btt_info" 1 } } */
 /* { dg-final { scan-assembler-times "farg_name" 3 } } */

Jakub

Re: [PATCH][AArch64] Improve immediate expansion [PR106583]

2022-10-20 Thread Wilco Dijkstra via Gcc-patches

Hi Richard,

> Can you do the aarch64_mov_imm changes as a separate patch?  It's difficult
> to review the two changes folded together like this.

Sure, I'll send a separate patch. So here is version 2 again:

[PATCH v2][AArch64] Improve immediate expansion [PR106583]

Improve immediate expansion of immediates which can be created from a
bitmask immediate and 2 MOVKs.  Simplify, refactor and improve
efficiency of bitmask checks.  This reduces the number of 4-instruction
immediates in SPECINT/FP by 10-15%.

Passes regress, OK for commit?
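The idea of "a bitmask immediate plus MOVKs" can be sketched in plain C. This is a minimal illustration under stated assumptions, not the code in aarch64.cc: `is_bitmask_imm` is a hypothetical stand-in for aarch64_bitmask_imm (it treats a value as valid when it is a replicated 2..64-bit element holding a rotated run of ones), and `check_bitmask` mirrors the shape of the new aarch64_check_bitmask helper shown in the diff below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Portable population count (no compiler builtins). */
static int popcount64(uint64_t x)
{
    int n = 0;
    for (; x != 0; x &= x - 1)
        n++;
    return n;
}

/* Hypothetical stand-in for aarch64_bitmask_imm: true if VAL is an
   element of 2, 4, ..., 64 bits replicated across 64 bits, where the
   element is a rotated run of ones (neither all zeros nor all ones). */
static bool is_bitmask_imm(uint64_t val)
{
    for (unsigned e = 2; e <= 64; e *= 2) {
        uint64_t emask = (e == 64) ? ~UINT64_C(0) : (UINT64_C(1) << e) - 1;
        uint64_t elt = val & emask;

        /* Replicate the element across all 64 bits and compare. */
        uint64_t rep = elt;
        for (unsigned s = e; s < 64; s *= 2)
            rep |= rep << s;
        if (rep != val)
            continue;

        /* Viewed as a ring, a rotated run of ones has exactly two 0/1
           transitions; all-zeros and all-ones have none. */
        uint64_t rot = ((elt >> 1) | (elt << (e - 1))) & emask;
        if (popcount64(elt ^ rot) == 2)
            return true;
    }
    return false;
}

/* Same shape as the patch's helper: try to turn VAL into a bitmask
   immediate by setting the MASK bits to zeros, to ones, or to bits
   taken from elsewhere in VAL.  The candidate is stored in *VAL2. */
static bool check_bitmask(uint64_t val, uint64_t *val2, uint64_t mask)
{
    *val2 = val & ~mask;
    if (*val2 != val && is_bitmask_imm(*val2))
        return true;
    *val2 = val | mask;
    if (*val2 != val && is_bitmask_imm(*val2))
        return true;
    val &= ~mask;
    *val2 = val | (((val >> 32) | (val << 32)) & mask);
    if (*val2 != val && is_bitmask_imm(*val2))
        return true;
    *val2 = val | (((val >> 16) | (val << 48)) & mask);
    if (*val2 != val && is_bitmask_imm(*val2))
        return true;
    return false;
}
```

For example, with val = 0x1234ffffffffffff and mask = 0xffff << 48 the helper yields 0x0000ffffffffffff, so the constant could be built as one bitmask move followed by one MOVK that patches in 0x1234.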

gcc/ChangeLog:

PR target/106583
* config/aarch64/aarch64.cc (aarch64_internal_mov_immediate)
Add support for a bitmask immediate with 2 MOVKs.
(aarch64_check_bitmask): New function after refactorization.
(aarch64_replicate_bitmask_imm): Remove function, merge into...
(aarch64_bitmask_imm): Simplify replication of small modes.
Split function into 64-bit only version for efficiency.

gcc/testsuite:
PR target/106583
* gcc.target/aarch64/pr106583.c: Add new test.

---

diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 926e81f028c82aac9a5fecc18f921f84399c24ae..b2d9c7380975028131d0fe731a97b3909874b87b 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -306,6 +306,7 @@ static machine_mode aarch64_simd_container_mode (scalar_mode, poly_int64);
 static bool aarch64_print_address_internal (FILE*, machine_mode, rtx,
 aarch64_addr_query_type);
 static HOST_WIDE_INT aarch64_clamp_to_uimm12_shift (HOST_WIDE_INT val);
+static bool aarch64_bitmask_imm (unsigned HOST_WIDE_INT);
 
 /* The processor for which instructions should be scheduled.  */
 enum aarch64_processor aarch64_tune = cortexa53;
@@ -5502,6 +5503,30 @@ aarch64_output_sve_vector_inc_dec (const char *operands, rtx x)
  factor, nelts_per_vq);
 }
 
+/* Return true if the immediate VAL can be a bitfield immediate
+   by changing the given MASK bits in VAL to zeroes, ones or bits
+   from the other half of VAL.  Return the new immediate in VAL2.  */
+static inline bool
+aarch64_check_bitmask (unsigned HOST_WIDE_INT val,
+  unsigned HOST_WIDE_INT &val2,
+  unsigned HOST_WIDE_INT mask)
+{
+  val2 = val & ~mask;
+  if (val2 != val && aarch64_bitmask_imm (val2))
+return true;
+  val2 = val | mask;
+  if (val2 != val && aarch64_bitmask_imm (val2))
+return true;
+  val = val & ~mask;
+  val2 = val | (((val >> 32) | (val << 32)) & mask);
+  if (val2 != val && aarch64_bitmask_imm (val2))
+return true;
+  val2 = val | (((val >> 16) | (val << 48)) & mask);
+  if (val2 != val && aarch64_bitmask_imm (val2))
+return true;
+  return false;
+}
+
 static int
 aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
 scalar_int_mode mode)
@@ -5568,36 +5593,43 @@ aarch64_internal_mov_immediate (rtx dest, rtx imm, bool generate,
   one_match = ((~val & mask) == 0) + ((~val & (mask << 16)) == 0) +
 ((~val & (mask << 32)) == 0) + ((~val & (mask << 48)) == 0);
 
-  if (zero_match != 2 && one_match != 2)
+  if (zero_match < 2 && one_match < 2)
 {
   /* Try emitting a bitmask immediate with a movk replacing 16 bits.
  For a 64-bit bitmask try whether changing 16 bits to all ones or
  zeroes creates a valid bitmask.  To check any repeated bitmask,
  try using 16 bits from the other 32-bit half of val.  */
 
-  for (i = 0; i < 64; i += 16, mask <<= 16)
-   {
- val2 = val & ~mask;
- if (val2 != val && aarch64_bitmask_imm (val2, mode))
-   break;
- val2 = val | mask;
- if (val2 != val && aarch64_bitmask_imm (val2, mode))
-   break;
- val2 = val2 & ~mask;
- val2 = val2 | (((val2 >> 32) | (val2 << 32)) & mask);
- if (val2 != val && aarch64_bitmask_imm (val2, mode))
-   break;
-   }
-  if (i != 64)
-   {
- if (generate)
+  for (i = 0; i < 64; i += 16)
+   if (aarch64_check_bitmask (val, val2, mask << i))
+ {
+   if (generate)
+ {
+   emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
+   emit_insn (gen_insv_immdi (dest, GEN_INT (i),
+  GEN_INT ((val >> i) & 0x)));
+ }
+   return 2;
+ }
+}
+
+  /* Try a bitmask plus 2 movk to generate the immediate in 3 instructions.  */
+  if (zero_match + one_match == 0)
+{
+  for (i = 0; i < 48; i += 16)
+   for (int j = i + 16; j < 64; j += 16)
+ if (aarch64_check_bitmask (val, val2, (mask << i) | (mask << j)))
 {
- emit_insn (gen_rtx_SET (dest, GEN_INT (val2)));
- emit_insn (gen_insv_immdi (dest, GEN_INT (i),
-GEN_INT ((val >> i) & 0x))

Re: Add 'gcc.dg/tree-ssa/pr107195-3.c' [PR107195] (was: Add 'c-c++-common/torture/pr107195-1.c' [PR107195] (was: [COMMITTED] [PR107195] Set range to zero when nonzero mask is 0.))

2022-10-20 Thread Aldy Hernandez via Gcc-patches

> I understand 'r & 3' to be logically equivalent to '(r & 2) && (r & 1)',
> right?

For r == 2, r & 3 == 2, whereas (r & 2) && (r & 1) == 0, so no?

Aldy
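A small C sketch of the distinction, with function names chosen only for illustration: as a truth value, `r & 3` asks whether either of the two low bits is set, which corresponds to `(r & 2) || (r & 1)` rather than `&&`.

```c
#include <stdbool.h>

/* Truth value of `r & 3`: at least one of the two low bits is set. */
static bool any_low_bit(unsigned r)
{
    return (r & 3) != 0;
}

/* `(r & 2) && (r & 1)`: both low bits must be set. */
static bool both_low_bits(unsigned r)
{
    return (r & 2) && (r & 1);
}
```

The two functions disagree for r == 1 and r == 2 and agree for r == 0 and r == 3.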

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/20/22 13:55, Xi Ruoyao wrote:
> On Thu, 2022-10-20 at 13:53 +0200, Martin Liška wrote:
>> On 10/20/22 13:49, Xi Ruoyao wrote:
>>> (CC our team members.)
>>>
>>> On Thu, 2022-10-20 at 13:27 +0200, Martin Liška wrote:
>>>>> Ouch.  This will be very painful for Linux From Scratch.  We'll need to
>>>>> add 23 Python modules to build the documentation, while we only have 88
>>>>> packages in total currently...  And we don't want to omit GCC
>>>>> documentation in our system.
>>>>
>>>> Various other distros will have to face it too. The proper solution is a
>>>> multi-build package (gcc:doc) which can be built later in the dependency
>>>> chain. Btw. do you also provide PDF documentation in your system?
>>>
>>> No (texlive is much heavier than Sphinx).  But generally we expect man
>>> pages and info pages.
>>>
>>> We can separate man and info into the second-time build in BLFS (we're
>>> already doing this now for Go, Objective C, etc.),
>>
>> Do the same for GCC.
>>
>>> but I don't really
>>> like to omit the man and info pages..
>>
>> What should I do about it? We want to switch to a more modern documentation
>> tool called Sphinx and yes, it will make packaging of the GCC more complicated.
> 
> Nothing, I guess.  We'll handle it on our side (if we finally decide to
> ship the man/info tarballs we can generate them by ourselves).

Good!

> 
> I was just trying to find a simpler solution before beginning all the
> work :).

Sure, makes sense.

Martin

> 
> Thanks!
>

Re: Remove support for Intel MIC offloading (was: [PATCH] Remove dead code.)

2022-10-20 Thread Michael Matz via Gcc-patches

Hey,

On Thu, 20 Oct 2022, Thomas Schwinge wrote:

> This had been done in
> wwwdocs commit 5c7ecfb5627e412a3d142d8dc212f4cd39b3b73f
> "Document deprecation of OpenMP MIC offloading in GCC 12".
> 
> I'm sad about this, because -- in theory -- such a plugin is very useful
> for offloading simulation/debugging (separate host/device memory spaces,
> allow sanitizers to run on offloaded code

Yeah, I think that's a _very_ useful feature, but indeed ...

> (like LLVM a while ago
> implemented), and so on), but all that doesn't help -- in practice -- if
> nobody is maintaining that code.

... it should then be somewhat maintained properly.  Maybe the 
MIC-specifics could be removed from the code, and it could be transformed 
into a "null"-offload target, as example and testing vehicle (and implying 
that such new liboffloadmic^H^H^Hnull would have its upstream in the GCC 
repo).  Alas, if no one is going to do that work, removing is the right 
choice.

Ciao,
Michael.

Re: [PATCH] bpf: add preserve_field_info builtin

2022-10-20 Thread Jose E. Marchesi via Gcc-patches



Hi David.
Thanks for the patch.  Please see a few comments below.

> @@ -975,6 +978,161 @@ static tree bpf_core_compute (tree, vec 
> *);
>  static int bpf_core_get_index (const tree);
>  static bool is_attr_preserve_access (tree);
>  
> +static void
> +maybe_make_core_relo (tree expr, enum btf_core_reloc_kind kind)

This function is missing a comment explaining what it does.

> +{
> +  /* If we are not targetting BPF CO-RE, do not make a relocation. We
> + might not be generating any debug info at all.  */
> +  if (!TARGET_BPF_CORE)
> +return;
> +
> +  auto_vec accessors;
> +  tree container = bpf_core_compute (expr, &accessors);
> +
> +  /* Any valid use of the builtin must have at least one access. Otherwise,
> + there is nothing to record and nothing to do. This is primarily a
> + guard against optimizations leading to unexpected expressions in the
> + argument of the builtin. For example, if the builtin is used to read
> + a field of a structure which can be statically determined to hold a
> + constant value, the argument to the builtin will be optimized to that
> + constant. This is OK, and means the builtin call is superfluous.
> + e.g.
> + struct S foo;
> + foo.a = 5;
> + int x = __preserve_access_index (foo.a);
> + ... do stuff with x
> + 'foo.a' in the builtin argument will be optimized to '5' with -01+.
> + This sequence does not warrant recording a CO-RE relocation.  */
> +
> +  if (accessors.length () < 1)
> +return;
> +  accessors.reverse ();
> +
> +  rtx_code_label *label = gen_label_rtx ();
> +  LABEL_PRESERVE_P (label) = 1;
> +  emit_label (label);
> +
> +  /* Determine what output section this relocation will apply to.
> + If this function is associated with a section, use that. Otherwise,
> + fall back on '.text'.  */
> +  const char * section_name;
> +  if (current_function_decl && DECL_SECTION_NAME (current_function_decl))
> +section_name = DECL_SECTION_NAME (current_function_decl);
> +  else
> +section_name = ".text";
> +
> +  /* Add the CO-RE relocation information to the BTF container.  */
> +  bpf_core_reloc_add (TREE_TYPE (container), section_name, &accessors, label,
> +   kind);
> +}
> +
> +/* Expand a call to __builtin_preserve_field_info by evaluating the requested
> +   information about SRC according to KIND, and return a tree holding
> +   the result.  */
> +
> +static tree
> +bpf_core_field_info (tree src, enum btf_core_reloc_kind kind)
> +{
> +  unsigned int result;
> +  poly_int64 bitsize, bitpos;
> +  tree var_off;
> +  machine_mode mode;
> +  int unsignedp, reversep, volatilep;
> +
> +  get_inner_reference (src, &bitsize, &bitpos, &var_off, &mode, &unsignedp,
> +&reversep, &volatilep);

Since the information returned by the builtin is always constant
(positions, sizes) I think you will want to adjust the code for the
eventuality of variable positioned fields and also variable sized
fields.

get_inner_reference sets var_off to a tree if the position of the field
is variable.  In these cases `bitpos' is relative to that position.

Likewise, for variable-sized fields get_inner_reference sets `mode' to
BLKmode and `bitsize' to -1.

I'm not sure what the built-in is supposed to do/return in these cases.
I guess it makes sense to error out, but what does LLVM do?
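For the constant-position case, the offset and size arithmetic in the quoted hunk that follows can be sketched as below. This is only an illustration: the function names and plain unsigned parameters are assumptions, and the variable-position and variable-size cases discussed above are deliberately not handled.

```c
#include <stdbool.h>

/* Byte offset reported for a field starting at bit BITPOS.  For a
   bitfield the position is first rounded down to ALIGN_BITS, the
   alignment in bits (a power of two) of the declared type. */
static unsigned field_byte_offset(unsigned bitpos, unsigned align_bits,
                                  bool bitfield)
{
    unsigned start_bitpos = bitpos & ~(align_bits - 1);
    return (bitfield ? start_bitpos : bitpos) / 8;
}

/* Byte size reported for a field of BITSIZE bits.  For a bitfield the
   full size of the base type is reported, as the patch's comment about
   matching LLVM describes. */
static unsigned field_byte_size(unsigned bitsize, unsigned align_bits,
                                bool bitfield)
{
    return (bitfield ? align_bits : bitsize) / 8;
}
```

With a 32-bit int, a 3-bit bitfield starting at bit 35 is reported at byte offset 4 with byte size 4.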

> +
> +  /* Note: Use DECL_BIT_FIELD_TYPE rather than DECL_BIT_FIELD here, because it
> + remembers whether the field in question was originally declared as a
> + bitfield, regardless of how it has been optimized.  */
> +  bool bitfieldp = (TREE_CODE (src) == COMPONENT_REF
> + && DECL_BIT_FIELD_TYPE (TREE_OPERAND (src, 1)));
> +
> +  unsigned int align = TYPE_ALIGN (TREE_TYPE (src));
> +  if (TREE_CODE (src) == COMPONENT_REF)
> +{
> +  tree field = TREE_OPERAND (src, 1);
> +  if (DECL_BIT_FIELD_TYPE (field))
> + align = TYPE_ALIGN (DECL_BIT_FIELD_TYPE (field));
> +  else
> + align = TYPE_ALIGN (TREE_TYPE (field));
> +}
> +
> +  unsigned int start_bitpos = bitpos & ~(align - 1);
> +  unsigned int end_bitpos = start_bitpos + align;
> +
> +  switch (kind)
> +{
> +case BPF_RELO_FIELD_BYTE_OFFSET:
> +  {
> + if (bitfieldp)
> +   result = start_bitpos / 8;
> + else
> +   result = bitpos / 8;
> +  }
> +  break;
> +
> +case BPF_RELO_FIELD_BYTE_SIZE:
> +  {
> + if (bitfieldp)
> +   {
> + /* To match LLVM behavior, byte size of bitfields is recorded as
> +the full size of the base type. A 3-bit bitfield of type int is
> +therefore recorded as having a byte size of 4 bytes. */
> + result = end_bitpos - start_bitpos;
> + if (result & (result - 1))
> +   error ("unsupported field expression");
> + result = result / 8;
> +   }
> + else
> +   result = bitsize / 8;
> +  }
> +  break;
> +
> +case BPF_RELO_FIELD_EXISTS:
> +  /* T

Re: Remove support for Intel MIC offloading (was: [PATCH] Remove dead code.)

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Thu, Oct 20, 2022 at 12:33:28PM +, Michael Matz wrote:
> Hey,
> 
> On Thu, 20 Oct 2022, Thomas Schwinge wrote:
> 
> > This had been done in
> > wwwdocs commit 5c7ecfb5627e412a3d142d8dc212f4cd39b3b73f
> > "Document deprecation of OpenMP MIC offloading in GCC 12".
> > 
> > I'm sad about this, because -- in theory -- such a plugin is very useful
> > for offloading simulation/debugging (separate host/device memory spaces,
> > allow sanitizers to run on offloaded code
> 
> Yeah, I think that's a _very_ useful feature, but indeed ...
> 
> > (like LLVM a while ago
> > implemented), and so on), but all that doesn't help -- in practice -- if
> > nobody is maintaining that code.
> 
> ... it should then be somewhat maintained properly.  Maybe the 
> MIC-specifics could be removed from the code, and it could be transformed 
> into a "null"-offload target, as example and testing vehicle (and implying 
> that such new liboffloadmic^H^H^Hnull would have its upstream in the GCC 
> repo).  Alas, if no one is going to do that work, removing is the right 
> choice.

Yeah.  But we really shouldn't need a large MIC specific library for that,
everything should be implementable with a simple portable plugin that just
forks + execs the offloading ELF and transfers data to/out of it etc.
And the config/i386/intelmic-mkoffload etc. stuff would need to be done
somewhere in generic code, such that we can do it for all targets.
Also ideally by using just the normal lto1 with some special option that
it acts as an offloading compiler, so that we don't need to bother with
building a separate offloading compiler for it.
True, everything guarded with #ifdef ACCEL_COMPILER etc. would need to
change into code guarded with some option.

Jakub

Re: [PATCH] 16/19 modula2 front end: bootstrap and documentation tools

2022-10-20 Thread Martin Liška

Hello.

I noticed the devel/modula-2 branch contains the following dead links:

- http://www.gccsummit.org/2006
- http://www.gccsummit.org/2010/speakers.php?types=LIGHTNING
- http://floppsie.comp.glam.ac.uk/Papers/paper23/gaius-mulley-gnu-m2.pdf
- http://floppsie.comp.glam.ac.uk/Papers/paper15/mulley-proc.pdf
- http://floppsie.comp.glam.ac.uk/Papers/paper22/gaius-gcc-cauldron-2016.pdf

Thanks,
Martin

[PATCH] match.pd: Fix up gcc.dg/pr54346.c on i686-linux [PR54346]

2022-10-20 Thread Jakub Jelinek via Gcc-patches

Hi!

The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
reasons.  One is the trivial missing -Wno-psabi which the following patch
adds, but that isn't enough.  The thing is that without native vector
support, we have VEC_PERM_EXPRs in the IL and are actually considering
the nested VEC_PERM_EXPRs into one VEC_PERM_EXPR optimization, but punt
because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.

Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
that can be handled by the backend natively into one VEC_PERM_EXPR
that can't be handled.  But if both of the original VEC_PERM_EXPRs
can't be handled natively either, having just one VEC_PERM_EXPR that will be
lowered by generic vec lowering is IMHO still better than 2.
Or even if we trade just one VEC_PERM_EXPR that can't be handled plus
one that can to one that can't be handled.
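The rule described above can be written as a small predicate. This is a sketch in C with hypothetical names; each boolean stands for the result of the corresponding can_vec_perm_const_p query (or single_use) in the patch below.

```c
#include <stdbool.h>

/* Decide whether two nested permutations should be merged into one.
   new_ok:     the target can handle the combined permutation (sel2)
   sel0_ok:    the target can handle the permutation selected by sel0
   sel1_ok:    the target can handle the permutation selected by sel1
   single_use: the nested permutation has no other users, so merging
               removes it entirely */
static bool should_merge_perms(bool new_ok, bool sel0_ok, bool sel1_ok,
                               bool single_use)
{
    if (new_ok)
        return true;            /* never worse than before */
    if (single_use)
        return !sel0_ok || !sel1_ok;  /* trade >= 1 unhandled for 1 */
    return !sel1_ok;            /* only the sel1 permutation is replaced */
}
```

The count of permutations the target cannot handle never increases: merging with an unhandled result is allowed only when at least one unhandled permutation disappears in exchange.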

Lightly tested so far, ok for trunk if it passes full bootstrap/regtest
on x86_64-linux and i686-linux?

BTW, the testcase also needs to have executable permissions removed...

2022-10-20  

PR tree-optimization/54346
* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
Optimize nested VEC_PERM_EXPRs even if target can't handle the
new one provided we don't increase number of VEC_PERM_EXPRs the
target can't handle.

* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.

--- gcc/match.pd.jj 2022-10-19 11:28:35.111654555 +0200
+++ gcc/match.pd        2022-10-20 13:45:57.489512189 +0200
@@ -8118,7 +8118,16 @@ and,
vec_perm_indices sel2 (builder2, 2, nelts);
 
tree op0 = NULL_TREE;
-   if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
+   /* If the new VEC_PERM_EXPR can't be handled but both
+ original VEC_PERM_EXPRs can, punt.
+ If one or both of the original VEC_PERM_EXPRs can't be
+ handled and the new one can't be either, don't increase
+ number of VEC_PERM_EXPRs that can't be handled.  */
+   if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)
+  || (single_use (@0)
+  ? (!can_vec_perm_const_p (result_mode, op_mode, sel0, false)
+ || !can_vec_perm_const_p (result_mode, op_mode, sel1, false))
+  : !can_vec_perm_const_p (result_mode, op_mode, sel1, false)))
 op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
  }
  (if (op0)
--- gcc/testsuite/gcc.dg/pr54346.c.jj   2022-10-11 10:00:07.456124822 +0200
+++ gcc/testsuite/gcc.dg/pr54346.c  2022-10-20 13:46:10.90119 +0200
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O -fdump-tree-dse1" } */
+/* { dg-options "-O -fdump-tree-dse1 -Wno-psabi" } */
 
 typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
 

Jakub

[PATCH] libstdc++: Don't use gstdint.h anymore

2022-10-20 Thread Arsen Arsenović via Gcc-patches

libstdc++-v3/ChangeLog:

* configure.ac: Stop generating gstdint.h.
* src/c++11/compatibility-atomic-c++0x.cc: Stop using gstdint.h.
---
Tested on x86_64-pc-linux-gnu.

 libstdc++-v3/configure.ac| 6 --
 libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc | 9 +
 2 files changed, 5 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 81d914b434a..c5ec976c026 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -440,12 +440,6 @@ GCC_CHECK_UNWIND_GETIPINFO
 
 GCC_LINUX_FUTEX([AC_DEFINE(HAVE_LINUX_FUTEX, 1, [Define if futex syscall is available.])])
 
-if test "$is_hosted" = yes; then
-# TODO: remove this and change src/c++11/compatibility-atomic-c++0x.cc to
-# use  instead of .
-GCC_HEADER_STDINT(include/gstdint.h)
-fi
-
 GLIBCXX_ENABLE_SYMVERS([yes])
 AC_SUBST(libtool_VERSION)
 
diff --git a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
index 5a0c5459088..2065eb517db 100644
--- a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
+++ b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
@@ -22,10 +22,11 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
-#include "gstdint.h"
 #include 
 #include 
 
+using guintptr_t = __UINTPTR_TYPE__;
+
 // XXX GLIBCXX_ABI Deprecated
 // gcc-4.7.0
 
@@ -119,13 +120,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX_CONST __atomic_flag_base*
   __atomic_flag_for_address(const volatile void* __z) _GLIBCXX_NOTHROW
   {
-uintptr_t __u = reinterpret_cast(__z);
+guintptr_t __u = reinterpret_cast(__z);
 __u += (__u >> 2) + (__u << 4);
 __u += (__u >> 7) + (__u << 5);
 __u += (__u >> 17) + (__u << 13);
-if (sizeof(uintptr_t) > 4)
+if (sizeof(guintptr_t) > 4)
   __u += (__u >> 31);
-__u &= ~((~uintptr_t(0)) << LOGSIZE);
+__u &= ~((~guintptr_t(0)) << LOGSIZE);
 return flag_table + __u;
   }
 
-- 
2.38.1

Re: [PATCH] match.pd: Fix up gcc.dg/pr54346.c on i686-linux [PR54346]

2022-10-20 Thread Richard Biener via Gcc-patches




> On 20.10.2022 at 14:49, Jakub Jelinek via Gcc-patches wrote:
> 
> Hi!
> 
> The pr54346.c testcase FAILs on i686-linux (without -msse*) for multiple
> reasons.  One is the trivial missing -Wno-psabi which the following patch
> adds, but that isn't enough.  The thing is that without native vector
> support, we have VEC_PERM_EXPRs in the IL and are actually considering
> the nested VEC_PERM_EXPRs into one VEC_PERM_EXPR optimization, but punt
> because can_vec_perm_const_p (result_mode, op_mode, sel2, false) is false.
> 
> Such a test makes sense to prevent "optimizing" two VEC_PERM_EXPRs
> that can be handled by the backend natively into one VEC_PERM_EXPR
> that can't be handled.  But if both of the original VEC_PERM_EXPRs
> can't be handled natively either, having just one VEC_PERM_EXPR that will be
> lowered by generic vec lowering is IMHO still better than 2.
> Or even if we trade just one VEC_PERM_EXPR that can't be handled plus
> one that can to one that can't be handled.
> 
> Lightly tested so far, ok for trunk if it passes full bootstrap/regtest
> on x86_64-linux and i686-linux?

Ok

Richard 

> BTW, the testcase also needs to have executable permissions removed...
> 
> 2022-10-20  
> 
>PR tree-optimization/54346
>* match.pd ((vec_perm (vec_perm@0 @1 @2 VECTOR_CST) @0 VECTOR_CST)):
>Optimize nested VEC_PERM_EXPRs even if target can't handle the
>new one provided we don't increase number of VEC_PERM_EXPRs the
>target can't handle.
> 
>* gcc.dg/pr54346.c: Add -Wno-psabi to dg-options.
>
> --- gcc/match.pd.jj2022-10-19 11:28:35.111654555 +0200
> +++ gcc/match.pd2022-10-20 13:45:57.489512189 +0200
> @@ -8118,7 +8118,16 @@ and,
>vec_perm_indices sel2 (builder2, 2, nelts);
> 
>tree op0 = NULL_TREE;
> -   if (can_vec_perm_const_p (result_mode, op_mode, sel2, false))
> +   /* If the new VEC_PERM_EXPR can't be handled but both
> +  original VEC_PERM_EXPRs can, punt.
> +  If one or both of the original VEC_PERM_EXPRs can't be
> +  handled and the new one can't be either, don't increase
> +  number of VEC_PERM_EXPRs that can't be handled.  */
> +   if (can_vec_perm_const_p (result_mode, op_mode, sel2, false)
> +   || (single_use (@0)
> +   ? (!can_vec_perm_const_p (result_mode, op_mode, sel0, false)
> +  || !can_vec_perm_const_p (result_mode, op_mode, sel1, false))
> +   : !can_vec_perm_const_p (result_mode, op_mode, sel1, false)))
> op0 = vec_perm_indices_to_tree (TREE_TYPE (@4), sel2);
>  }
>  (if (op0)
> --- gcc/testsuite/gcc.dg/pr54346.c.jj2022-10-11 10:00:07.456124822 +0200
> +++ gcc/testsuite/gcc.dg/pr54346.c2022-10-20 13:46:10.90119 +0200
> @@ -1,5 +1,5 @@
> /* { dg-do compile } */
> -/* { dg-options "-O -fdump-tree-dse1" } */
> +/* { dg-options "-O -fdump-tree-dse1 -Wno-psabi" } */
> 
> typedef int veci __attribute__ ((vector_size (4 * sizeof (int))));
> 
> 
>Jakub
>

Re: [PATCH] libstdc++: Don't use gstdint.h anymore

2022-10-20 Thread Jonathan Wakely via Gcc-patches

On Thu, 20 Oct 2022 at 13:58, Arsen Arsenović via Libstdc++
 wrote:
>
> libstdc++-v3/ChangeLog:
>
> * configure.ac: Stop generating gstdint.h.
> * src/c++11/compatibility-atomic-c++0x.cc: Stop using gstdint.h.
> ---
> Tested on x86_64-pc-linux-gnu.
>
>  libstdc++-v3/configure.ac| 6 --
>  libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc | 9 +
>  2 files changed, 5 insertions(+), 10 deletions(-)
>
> diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
> index 81d914b434a..c5ec976c026 100644
> --- a/libstdc++-v3/configure.ac
> +++ b/libstdc++-v3/configure.ac
> @@ -440,12 +440,6 @@ GCC_CHECK_UNWIND_GETIPINFO
>
>  GCC_LINUX_FUTEX([AC_DEFINE(HAVE_LINUX_FUTEX, 1, [Define if futex syscall is available.])])
>
> -if test "$is_hosted" = yes; then
> -# TODO: remove this and change src/c++11/compatibility-atomic-c++0x.cc to
> -# use  instead of .
> -GCC_HEADER_STDINT(include/gstdint.h)
> -fi
> -

Yes, I said in r12-6409-g68c2e9e9234cb3 that removing that could wait
for stage 1, so let's do it now.

>  GLIBCXX_ENABLE_SYMVERS([yes])
>  AC_SUBST(libtool_VERSION)
>
> diff --git a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
> index 5a0c5459088..2065eb517db 100644
> --- a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
> +++ b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
> @@ -22,10 +22,11 @@
>  // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
>  // .
>
> -#include "gstdint.h"
>  #include 
>  #include 
>
> +using guintptr_t = __UINTPTR_TYPE__;

I think this should be local in the only function that uses it.
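A rough sketch of what "local" could look like. The patch itself is C++ and uses a `using` alias; this illustration swaps in a function-scope C typedef, relies on the GCC/Clang predefined macro __UINTPTR_TYPE__, and assumes LOGSIZE is 4 (its real value is not shown in the quoted hunk). The name `flag_index` is hypothetical.

```c
#include <stddef.h>

#define LOGSIZE 4  /* assumed value, for illustration only */

/* Hash an address into a table index in [0, 1 << LOGSIZE).  The
   integer alias is declared inside the only function that needs it,
   so it does not leak into file scope. */
static size_t flag_index(const volatile void *z)
{
    typedef __UINTPTR_TYPE__ guintptr_t;  /* function-local alias */

    guintptr_t u = (guintptr_t) z;
    u += (u >> 2) + (u << 4);
    u += (u >> 7) + (u << 5);
    u += (u >> 17) + (u << 13);
    if (sizeof(guintptr_t) > 4)
        u += (u >> 31);
    u &= ~((~(guintptr_t) 0) << LOGSIZE);
    return (size_t) u;
}
```

Because the alias lives in function scope, nothing else in the translation unit can start depending on it.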

> +
>  // XXX GLIBCXX_ABI Deprecated
>  // gcc-4.7.0
>
> @@ -119,13 +120,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>_GLIBCXX_CONST __atomic_flag_base*
>__atomic_flag_for_address(const volatile void* __z) _GLIBCXX_NOTHROW
>{
> -uintptr_t __u = reinterpret_cast(__z);
> +guintptr_t __u = reinterpret_cast(__z);
>  __u += (__u >> 2) + (__u << 4);
>  __u += (__u >> 7) + (__u << 5);
>  __u += (__u >> 17) + (__u << 13);
> -if (sizeof(uintptr_t) > 4)
> +if (sizeof(guintptr_t) > 4)
>__u += (__u >> 31);
> -__u &= ~((~uintptr_t(0)) << LOGSIZE);
> +__u &= ~((~guintptr_t(0)) << LOGSIZE);
>  return flag_table + __u;
>}
>
> --
> 2.38.1
>

Re: Remove support for Intel MIC offloading (was: [PATCH] Remove dead code.)

2022-10-20 Thread Richard Biener via Gcc-patches




> On 20.10.2022 at 14:41, Jakub Jelinek via Gcc-patches wrote:
> 
> On Thu, Oct 20, 2022 at 12:33:28PM +, Michael Matz wrote:
>> Hey,
>> 
>>> On Thu, 20 Oct 2022, Thomas Schwinge wrote:
>>> 
>>> This had been done in
>>> wwwdocs commit 5c7ecfb5627e412a3d142d8dc212f4cd39b3b73f
>>> "Document deprecation of OpenMP MIC offloading in GCC 12".
>>> 
>>> I'm sad about this, because -- in theory -- such a plugin is very useful
>>> for offloading simulation/debugging (separate host/device memory spaces,
>>> allow sanitizers to run on offloaded code
>> 
>> Yeah, I think that's a _very_ useful feature, but indeed ...
>> 
>>> (like LLVM a while ago
>>> implemented), and so on), but all that doesn't help -- in practice -- if
>>> nobody is maintaining that code.
>> 
>> ... it should then be somewhat maintained properly.  Maybe the 
>> MIC-specifics could be removed from the code, and it could be transformed 
>> into a "null"-offload target, as example and testing vehicle (and implying 
>> that such new liboffloadmic^H^H^Hnull would have its upstream in the GCC 
>> repo).  Alas, if no one is going to do that work, removing is the right 
>> choice.
> 
> Yeah.  But we really shouldn't need a large MIC specific library for that,
> everything should be implementable with a simple portable plugin that just
> forks + execs the offloading ELF and transfers data to/out of it etc.
> And the config/i386/intelmic-mkoffload etc. stuff would need to be done
> somewhere in generic code, such that we can do it for all targets.
> Also ideally by using just the normal lto1 with some special option that
> it acts as an offloading compiler, so that we don't need to bother with
> building a separate offloading compiler for it.
> True, everything guarded with #ifdef ACCEL_COMPILER etc. would need to
> change into code guarded with some option.

Might be a nice GSoC project …

Richard 

>Jakub
>

[PATCH]vect: Fix vectype when widening container type in bitfield pattern [PR107326]

2022-10-20 Thread Andre Vieira (lists) via Gcc-patches


Hi,

The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the 
vectype when widening the container.


I thought the original tests covered that code-path but they didn't, so 
I added a new run-test that covers it too.


Bootstrapped and regression tested on x86_64 and aarch64.

gcc/ChangeLog:

    PR tree-optimization/107326
    * tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Change
    vectype when widening container.

gcc/testsuite/ChangeLog:

    * gcc.dg/vect/pr107326.c: New test.
    * gcc.dg/vect/vect-bitfield-read-7.c
diff --git a/gcc/testsuite/gcc.dg/vect/pr107326.c b/gcc/testsuite/gcc.dg/vect/pr107326.c
new file mode 100644
index 
..333a515e7410a5b257a9f225b56b14b619af3118
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr107326.c
@@ -0,0 +1,24 @@
+/* PR107326 */
+/* { dg-do compile } */
+struct Gsymtab {
+  unsigned int : 8;
+  unsigned int visited_somewhere : 1;
+};
+
+extern struct Gsymtab glob_symtab[];
+
+int
+visit_children (int i)
+{
+  int numvisited = 0;
+
+  while (i < 1)
+{
+  if (glob_symtab[i].visited_somewhere)
+++numvisited;
+
+  ++i;
+}
+
+  return numvisited;
+}
diff --git a/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
new file mode 100644
index 
..3b505db2bd3eb6938d2f3b6f7426765333c271a4
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-bitfield-read-7.c
@@ -0,0 +1,43 @@
+/* { dg-require-effective-target vect_int } */
+
+#include 
+#include "tree-vect.h"
+
+extern void abort(void);
+
+struct s {
+unsigned i : 8;
+char a : 4;
+};
+
+#define N 32
+#define ELT0 {0xFUL, 0}
+#define ELT1 {0xFUL, 1}
+#define ELT2 {0xFUL, 2}
+#define ELT3 {0xFUL, 3}
+#define RES 48
+struct s A[N]
+  = { ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3,
+  ELT0, ELT1, ELT2, ELT3, ELT0, ELT1, ELT2, ELT3};
+
+int __attribute__ ((noipa))
+f(struct s *ptr, unsigned n) {
+int res = 0;
+for (int i = 0; i < n; ++i)
+  res += ptr[i].a;
+return res;
+}
+
+int main (void)
+{
+  check_vect ();
+
+  if (f(&A[0], N) != RES)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 6afd57a50c4bcb5aec7ccca6e5dc069caa4a5a30..24673f8d4d92e34706fa6c4ed2cf2ed85d6bb517 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -1922,7 +1922,8 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
   tree ret = gimple_assign_lhs (first_stmt);
   tree ret_type = TREE_TYPE (ret);
   bool shift_first = true;
-  tree vectype = get_vectype_for_scalar_type (vinfo, TREE_TYPE (container));
+  tree container_type = TREE_TYPE (container);
+  tree vectype = get_vectype_for_scalar_type (vinfo, container_type);
 
   /* We move the conversion earlier if the loaded type is smaller than the
  return type to enable the use of widening loads.  */
@@ -1933,15 +1934,15 @@ vect_recog_bitfield_ref_pattern (vec_info *vinfo, stmt_vec_info stmt_info,
= gimple_build_assign (vect_recog_temp_ssa_var (ret_type),
   NOP_EXPR, container);
   container = gimple_get_lhs (pattern_stmt);
-  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt);
+  container_type = TREE_TYPE (container);
+  vectype = get_vectype_for_scalar_type (vinfo, container_type);
+  append_pattern_def_seq (vinfo, stmt_info, pattern_stmt, vectype);
 }
   else if (!useless_type_conversion_p (TREE_TYPE (container), ret_type))
 /* If we are doing the conversion last then also delay the shift as we may
be able to combine the shift and conversion in certain cases.  */
 shift_first = false;
 
-  tree container_type = TREE_TYPE (container);
-
   /* If the only use of the result of this BIT_FIELD_REF + CONVERT is a
  PLUS_EXPR then do the shift last as some targets can combine the shift and
  add into a single instruction.  */

Re: [PATCH]vect: Fix vectype when widening container type in bitfield pattern [PR107326]

2022-10-20 Thread Richard Biener via Gcc-patches




> On 20.10.2022 at 15:59, Andre Vieira (lists) via Gcc-patches wrote:
> 
> Hi,
> 
> The 'vect_recog_bitfield_ref_pattern' was not correctly adapting the vectype 
> when widening the container.
> 
> I thought the original tests covered that code-path but they didn't, so I 
> added a new run-test that covers it too.
> 
> Bootstrapped and regression tested on x86_64 and aarch64.

Ok,

Thanks,
Richard 

> gcc/ChangeLog:
> 
> PR tree-optimization/107326
> * tree-vect-patterns.cc (vect_recog_bitfield_ref_pattern): Change
> vectype when widening container.
> 
> gcc/testsuite/ChangeLog:
> 
> * gcc.dg/vect/pr107326.c: New test.
> * gcc.dg/vect/vect-bitfield-read-7.c
>
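
For readers following the fix, here is a minimal scalar sketch of the two orders in which a bit-field read can be lowered: shift and mask in the container type and then widen, or widen the container first and then shift and mask. It is not the vectorizer code. The explicit 16-bit layout, the names pack, extract_shift_first, extract_widen_first and sum_fields are all assumptions made for illustration, since real bit-field layout is implementation-defined.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical explicit layout standing in for `struct s` from the test:
// bits 0-7 hold `i`, bits 8-11 hold `a`.  Real bit-field layout is
// implementation-defined, so this is only a model.
constexpr unsigned A_SHIFT = 8;
constexpr unsigned A_MASK = 0xF;

inline std::uint16_t pack (unsigned i, unsigned a)
{
  return static_cast<std::uint16_t> ((i & 0xFF) | ((a & A_MASK) << A_SHIFT));
}

// Shift and mask while still in the narrow container type, then widen.
inline int extract_shift_first (std::uint16_t container)
{
  std::uint16_t field
    = static_cast<std::uint16_t> ((container >> A_SHIFT) & A_MASK);
  return static_cast<int> (field);
}

// Widen the container first, then shift and mask in the wider type.
// After this step the element type is no longer the original container
// type, which is why a vector type chosen earlier would be stale.
inline int extract_widen_first (std::uint16_t container)
{
  std::uint32_t wide = container;
  return static_cast<int> ((wide >> A_SHIFT) & A_MASK);
}

// Scalar equivalent of the loop in the new test.
inline int sum_fields (const std::uint16_t *p, unsigned n, bool widen_first)
{
  int res = 0;
  for (unsigned k = 0; k < n; ++k)
    res += widen_first ? extract_widen_first (p[k])
                       : extract_shift_first (p[k]);
  return res;
}
```

Both orders give the same scalar result; the difference that matters for the patch is only which element type the intermediate statements have.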

[PATCH] c++ modules: handle CONCEPT_DECL in node_template_info [PR102963]

2022-10-20 Thread Patrick Palka via Gcc-patches

Here node_template_info is overlooking that CONCEPT_DECL has TEMPLATE_INFO
too, which makes get_originating_module_decl for the CONCEPT_DECL fail to
return the corresponding TEMPLATE_DECL, which leads to an ICE from
import_entity_index while pretty printing the CONCEPT_DECL's module
suffix as part of the failed static assert diagnostic.

Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/102963

gcc/cp/ChangeLog:

* module.cc (node_template_info): Handle CONCEPT_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/modules/concept-7_a.C: New test.
* g++.dg/modules/concept-7_b.C: New test.
---
 gcc/cp/module.cc   | 3 ++-
 gcc/testsuite/g++.dg/modules/concept-7_a.C | 7 +++
 gcc/testsuite/g++.dg/modules/concept-7_b.C | 7 +++
 3 files changed, 16 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-7_a.C
 create mode 100644 gcc/testsuite/g++.dg/modules/concept-7_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index bb406a5cf01..dfed0a5ef89 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -4046,7 +4046,8 @@ node_template_info (tree decl, int &use)
   || TREE_CODE (decl) == TYPE_DECL
   || TREE_CODE (decl) == FUNCTION_DECL
   || TREE_CODE (decl) == FIELD_DECL
-  || TREE_CODE (decl) == TEMPLATE_DECL))
+  || TREE_CODE (decl) == TEMPLATE_DECL
+  || TREE_CODE (decl) == CONCEPT_DECL))
 {
   use_tpl = DECL_USE_TEMPLATE (decl);
   ti = DECL_TEMPLATE_INFO (decl);
diff --git a/gcc/testsuite/g++.dg/modules/concept-7_a.C b/gcc/testsuite/g++.dg/modules/concept-7_a.C
new file mode 100644
index 000..a39b31bf7f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-7_a.C
@@ -0,0 +1,7 @@
+// PR c++/102963
+// { dg-additional-options "-fmodules-ts -fconcepts" }
+// { dg-module-cmi pr102963 }
+
+export module pr102963;
+
+export template<class T> concept C = __is_same(T, int);
diff --git a/gcc/testsuite/g++.dg/modules/concept-7_b.C b/gcc/testsuite/g++.dg/modules/concept-7_b.C
new file mode 100644
index 000..1f81208ebd5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-7_b.C
@@ -0,0 +1,7 @@
+// PR c++/102963
+// { dg-additional-options "-fmodules-ts -fconcepts" }
+
+import pr102963;
+
+static_assert(C);
+static_assert(C); // { dg-error "static assert" }
-- 
2.38.1.130.g45c9f05c44
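
As a rough illustration of the one-line change, the sketch below models the tree-code test as a plain predicate before and after the patch. The enum and both function names are toy stand-ins invented here, not GCC's tree codes or the real node_template_info, and only the four codes visible in the hunk context are modelled.

```cpp
#include <cassert>

// Toy stand-ins for a few tree codes; not the real GCC enum.
enum toy_code
{
  TOY_TYPE_DECL,
  TOY_FUNCTION_DECL,
  TOY_FIELD_DECL,
  TOY_TEMPLATE_DECL,
  TOY_CONCEPT_DECL,
  TOY_PARM_DECL
};

// Shape of the condition before the patch: CONCEPT_DECL falls through,
// so no template info would be looked up for it.
inline bool toy_has_template_info_before (toy_code c)
{
  return c == TOY_TYPE_DECL
         || c == TOY_FUNCTION_DECL
         || c == TOY_FIELD_DECL
         || c == TOY_TEMPLATE_DECL;
}

// Shape of the condition after the patch: CONCEPT_DECL is accepted too.
inline bool toy_has_template_info_after (toy_code c)
{
  return toy_has_template_info_before (c) || c == TOY_CONCEPT_DECL;
}
```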

Re: [PATCH v3] Re: OpenMP: Generate SIMD clones for functions with "declare target"

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Sun, Oct 16, 2022 at 07:23:05PM -0600, Sandra Loosemore wrote:
> My sense is that the first approach would be more straightforward than the
> second one, and I am willing to continue to work on that.  However, I think
> I need some direction to get started, as I presently know nothing about
> cgraph and I was unable to find any useful overview or interface
> documentation in the GCC internals manual.  Is this as simple as inserting
> an existing pass into the passlist to clean up after vectorization, or does
> it involve writing something more or less from scratch?

We (as I've discovered during the work on assumptions) have
TODO_discard_function which when returned from an execute pass throws away
a function completely (except now assumption functions for which it doesn't
release body; this could be done in some pass shortly after IPA, or
alternatively before expansion).  But another thing that needs to be done is
for the non-public declare simd clones (both explicit and implicit from your
patch) to be ordered in cgraph after anything that has a cgraph edge to its
original function.  I don't know how to do that, you should talk to Honza,
Richi or Martin about that.
I think the current behavior is that callees are processed before callers
if possible (unless there are cycles), which is certainly what we want for
say assume functions, or IPA RA etc.  But in case of non-public simd clones
we want to do it the other way around (at the expense of IPA RA), so that
we can throw away functions which aren't needed.

> > I admit I don't remember where exactly the simd clone happens wrt. other
> > IPA passes, but I think it is late pass; so, does it happen for GCN
> > offloading only in the lto1 offloading compiler?
> > Shouldn't the auto optimization be then done only in the offloading
> > lto1 for GCN then (say guard on targetm boolean)?
> 
> I'm afraid I don't know much about offloading, but I was under the
> impression it all goes through the same compilation process, just with a
> different target?

I've looked at it today and it seems late ipa passes are executed after LTO
bytecode is streamed back in.
If you say try:
#pragma omp declare simd
int foo (int x) { return x; }

int
main ()
{
  int a[64] = {};
  #pragma omp target map(a)
  #pragma omp simd
  for (int i = 0; i < 64; i++)
    a[i] = foo (a[i]);
}
with
gcc -foffload-options='-fdump-tree-all -fdump-ipa-all' -fdump-tree-all -fdump-ipa-all -O2 -fopenmp a.c -o a
you ought to see the simdclone dump both as a.c.*i.simdclone and
a.x*.mkoffload.*i.simdclone
where the former is what is done for the host code (and host fallback),
while the latter is what is done in the offloading lto.
Can't verify it 100% because I have only nvptx-none offloading configured
and in that case pass_omp_simd_clone::gate is disabled in offloading lto
because targetm.simd_clone.compute_vecsize_and_simdlen is NULL for nvptx.
But it is non-NULL for gcn.

Thus, IMHO it is exactly the pass_omp_simd_clone pass where you want to
implement this auto-simdization discovery, guarded with
#ifdef ACCEL_COMPILER and the new option (which means it will be done
only for gcn and not on the host right now).  And do it at the start of
ipa_omp_simd_clone, before the
  FOR_EACH_FUNCTION (node)
    expand_simd_clones (node);
loop, or, if it is purely local decision for each function, at the
start of expand_simd_clones with similar guarding, punt on functions
with "noclone" attribute, or !node->definition.  You need to repeat the
  if (node->has_gimple_body_p ())
    node->get_body ();
to get body before you analyze it.

And please put the new functions for such analysis into omp-simd-clone.cc
where they belong.

Jakub
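
To make the ordering question above concrete, here is a small sketch of the two processing orders on a toy call graph: callees before callers (a depth-first postorder) and the reverse. It uses none of the cgraph API. The toy_graph type and the functions visit, callees_first and callers_first are hypothetical names chosen for this illustration only.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Toy call graph: g[f] lists the functions that f calls.
using toy_graph = std::vector<std::vector<int>>;

// Depth-first postorder: a function is emitted after everything it calls.
// The `seen` vector keeps cycles from recursing forever.
static void visit (const toy_graph &g, int f, std::vector<bool> &seen,
                   std::vector<int> &out)
{
  if (seen[f])
    return;
  seen[f] = true;
  for (int callee : g[f])
    visit (g, callee, seen, out);
  out.push_back (f);
}

// Callees are processed before their callers where possible.
inline std::vector<int> callees_first (const toy_graph &g)
{
  std::vector<bool> seen (g.size (), false);
  std::vector<int> out;
  for (int f = 0; f < static_cast<int> (g.size ()); ++f)
    visit (g, f, seen, out);
  return out;
}

// The reverse order: callers are processed before their callees, so by the
// time a callee is reached every caller has already been looked at.
inline std::vector<int> callers_first (const toy_graph &g)
{
  std::vector<int> out = callees_first (g);
  std::reverse (out.begin (), out.end ());
  return out;
}
```

With function 0 calling function 1, callees_first yields 1 then 0, and callers_first yields 0 then 1; the second order is the one under which an unused callee could be discarded once its callers are done.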

[COMMITTED] Do not set NAN flags for VARYING ranges when !HONOR_NANS.

2022-10-20 Thread Aldy Hernandez via Gcc-patches

Since NANs can't appear in ranges for !HONOR_NANS, there's no reason
to set them in a VARYING range.

gcc/ChangeLog:

* value-range.h (frange::set_varying): Do not set NAN flags for
!HONOR_NANS.
* value-range.cc (frange::normalize_kind): Adjust for no NAN when
!HONOR_NANS.
(frange::verify_range): Same.
* range-op-float.cc (maybe_isnan): Remove flag_finite_math_only check.
---
 gcc/range-op-float.cc |  3 ---
 gcc/value-range.cc| 11 ---
 gcc/value-range.h | 12 ++--
 3 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 2a4a99ba467..a9e74c86877 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -171,9 +171,6 @@ range_operator_float::op1_op2_relation (const frange &lhs ATTRIBUTE_UNUSED) cons
 static inline bool
 maybe_isnan (const frange &op1, const frange &op2)
 {
-  if (flag_finite_math_only)
-return false;
-
   return op1.maybe_isnan () || op2.maybe_isnan ();
 }
 
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 511cd0ad767..bcda4987307 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -388,7 +388,7 @@ frange::normalize_kind ()
   && frange_val_is_min (m_min, m_type)
   && frange_val_is_max (m_max, m_type))
 {
-  if (m_pos_nan && m_neg_nan)
+  if (!HONOR_NANS (m_type) || (m_pos_nan && m_neg_nan))
{
  set_varying (m_type);
  return true;
@@ -396,7 +396,7 @@ frange::normalize_kind ()
 }
   else if (m_kind == VR_VARYING)
 {
-  if (!m_pos_nan || !m_neg_nan)
+  if (HONOR_NANS (m_type) && (!m_pos_nan || !m_neg_nan))
{
  m_kind = VR_RANGE;
  m_min = frange_val_min (m_type);
@@ -712,14 +712,19 @@ frange::supports_type_p (const_tree type) const
 void
 frange::verify_range ()
 {
+  if (flag_finite_math_only)
+gcc_checking_assert (!maybe_isnan ());
   switch (m_kind)
 {
 case VR_UNDEFINED:
   gcc_checking_assert (!m_type);
   return;
 case VR_VARYING:
+  if (flag_finite_math_only)
+   gcc_checking_assert (!m_pos_nan && !m_neg_nan);
+  else
+   gcc_checking_assert (m_pos_nan && m_neg_nan);
   gcc_checking_assert (m_type);
-  gcc_checking_assert (m_pos_nan && m_neg_nan);
   gcc_checking_assert (frange_val_is_min (m_min, m_type));
   gcc_checking_assert (frange_val_is_max (m_max, m_type));
   return;
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 60b989b2b50..b48542a68aa 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -1103,8 +1103,16 @@ frange::set_varying (tree type)
   m_type = type;
   m_min = frange_val_min (type);
   m_max = frange_val_max (type);
-  m_pos_nan = true;
-  m_neg_nan = true;
+  if (HONOR_NANS (m_type))
+{
+  m_pos_nan = true;
+  m_neg_nan = true;
+}
+  else
+{
+  m_pos_nan = false;
+  m_neg_nan = false;
+}
 }
 
 inline void
-- 
2.37.3
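
A minimal model of the set_varying change may help when reading the hunks above. The struct below is a toy with only the fields needed for the point, not GCC's frange, and HONOR_NANS is passed in as a plain bool, which is an assumption of the sketch.

```cpp
#include <cassert>

// Toy model of the NAN flags on a floating-point range; not GCC's frange.
struct toy_frange
{
  bool varying = false;
  bool pos_nan = false;
  bool neg_nan = false;

  // Stand-in for frange::set_varying: the NAN flags are set only when the
  // type honors NANs.  `honor_nans` replaces HONOR_NANS (m_type) here.
  void set_varying (bool honor_nans)
  {
    varying = true;
    pos_nan = honor_nans;
    neg_nan = honor_nans;
  }

  bool maybe_isnan () const { return pos_nan || neg_nan; }
};
```

Under this model a VARYING range built without NAN support never reports maybe_isnan, which matches the removal of the separate flag_finite_math_only check in range-op-float.cc.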

[COMMITTED] Replace finite_operands_p with maybe_isnan.

2022-10-20 Thread Aldy Hernandez via Gcc-patches

The finite_operands_p function was incorrectly named, as it only
returned TRUE when !NAN.  This was leftover from the initial
implementation of frange.  Using the maybe_isnan() nomenclature is
more consistent and easier to understand.

gcc/ChangeLog:

* range-op-float.cc (finite_operand_p): Remove.
(finite_operands_p): Rename to...
(maybe_isnan): ...this.
(frelop_early_resolve): Use maybe_isnan instead of finite_operands_p.
(foperator_equal::fold_range): Same.
(foperator_equal::op1_range): Same.
(foperator_not_equal::fold_range): Same.
(foperator_lt::fold_range): Same.
(foperator_le::fold_range): Same.
(foperator_gt::fold_range): Same.
(foperator_ge::fold_range): Same.
---
 gcc/range-op-float.cc | 41 ++---
 1 file changed, 18 insertions(+), 23 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index 0605a908684..2a4a99ba467 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -166,20 +166,15 @@ range_operator_float::op1_op2_relation (const frange &lhs ATTRIBUTE_UNUSED) cons
   return VREL_VARYING;
 }
 
-// Return TRUE if OP1 is known to be free of NANs.
+// Return TRUE if OP1 and OP2 may be a NAN.
 
 static inline bool
-finite_operand_p (const frange &op1)
+maybe_isnan (const frange &op1, const frange &op2)
 {
-  return flag_finite_math_only || !op1.maybe_isnan ();
-}
-
-// Return TRUE if OP1 and OP2 are known to be free of NANs.
+  if (flag_finite_math_only)
+return false;
 
-static inline bool
-finite_operands_p (const frange &op1, const frange &op2)
-{
-  return flag_finite_math_only || (!op1.maybe_isnan () && !op2.maybe_isnan ());
+  return op1.maybe_isnan () || op2.maybe_isnan ();
 }
 
 // Floating version of relop_early_resolve that takes into account NAN
@@ -196,7 +191,7 @@ frelop_early_resolve (irange &r, tree type,
 
   // We can fold relations from the oracle when we know both operands
   // are free of NANs, or when -ffinite-math-only.
-  return (finite_operands_p (op1, op2)
+  return (!maybe_isnan (op1, op2)
  && relop_early_resolve (r, type, op1, op2, rel, my_rel));
 }
 
@@ -391,7 +386,7 @@ foperator_equal::fold_range (irange &r, tree type,
   else
r = range_false (type);
 }
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
@@ -441,7 +436,7 @@ foperator_equal::op1_range (frange &r, tree type,
   // If the result is false, the only time we know anything is
   // if OP2 is a constant.
   else if (op2.singleton_p ()
-  || (finite_operand_p (op2) && op2.zero_p ()))
+  || (!op2.maybe_isnan () && op2.zero_p ()))
{
  REAL_VALUE_TYPE tmp = op2.lower_bound ();
  r.set (type, tmp, tmp, VR_ANTI_RANGE);
@@ -494,7 +489,7 @@ foperator_not_equal::fold_range (irange &r, tree type,
   else
r = range_false (type);
 }
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   // If ranges do not intersect, we know the range is not equal,
   // otherwise we don't know anything for sure.
@@ -590,7 +585,7 @@ foperator_lt::fold_range (irange &r, tree type,
 
   if (op1.known_isnan () || op2.known_isnan ())
 r = range_false (type);
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   if (real_less (&op1.upper_bound (), &op2.lower_bound ()))
r = range_true (type);
@@ -706,7 +701,7 @@ foperator_le::fold_range (irange &r, tree type,
 
   if (op1.known_isnan () || op2.known_isnan ())
 r = range_false (type);
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   if (real_compare (LE_EXPR, &op1.upper_bound (), &op2.lower_bound ()))
r = range_true (type);
@@ -814,7 +809,7 @@ foperator_gt::fold_range (irange &r, tree type,
 
   if (op1.known_isnan () || op2.known_isnan ())
 r = range_false (type);
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   if (real_compare (GT_EXPR, &op1.lower_bound (), &op2.upper_bound ()))
r = range_true (type);
@@ -930,7 +925,7 @@ foperator_ge::fold_range (irange &r, tree type,
 
   if (op1.known_isnan () || op2.known_isnan ())
 r = range_false (type);
-  else if (finite_operands_p (op1, op2))
+  else if (!maybe_isnan (op1, op2))
 {
   if (real_compare (GE_EXPR, &op1.lower_bound (), &op2.upper_bound ()))
r = range_true (type);
@@ -1302,7 +1297,7 @@ public:
   return false;
 // The result is the same as the ordered version when the
 // comparison is true or when the operands cannot be NANs.
-if (finite_operands_p (op1, op2) || r == range_true (type))
+if (!maybe_isnan (op1, op2) || r == range_true (type))
   return true;
 else
   {
@@ -1331,
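
The rename flips the sense of the predicate, so every call site gains a negation. The sketch below checks on a toy operand type that the old finite_operands_p and the negated new maybe_isnan agree for all inputs. The toy_op struct, the explicit finite_math_only parameter (standing in for the global flag) and rename_is_equivalent are assumptions of the sketch.

```cpp
#include <cassert>

// Toy operand: only records whether the range may contain a NAN.
struct toy_op
{
  bool may_nan;
  bool maybe_isnan () const { return may_nan; }
};

// Old predicate, as removed by the patch.
inline bool finite_operands_p (bool finite_math_only, const toy_op &a,
                               const toy_op &b)
{
  return finite_math_only || (!a.maybe_isnan () && !b.maybe_isnan ());
}

// New predicate, as added by the patch.
inline bool maybe_isnan (bool finite_math_only, const toy_op &a,
                         const toy_op &b)
{
  if (finite_math_only)
    return false;
  return a.maybe_isnan () || b.maybe_isnan ();
}

// Exhaustively compare the old predicate with the negated new one.
inline bool rename_is_equivalent ()
{
  for (int bits = 0; bits < 8; ++bits)
    {
      bool f = (bits & 1) != 0;
      toy_op a = { (bits & 2) != 0 };
      toy_op b = { (bits & 4) != 0 };
      if (finite_operands_p (f, a, b) != !maybe_isnan (f, a, b))
        return false;
    }
  return true;
}
```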

Re: [PATCH] c++: Fix up mangling ICE with void{} [PR106863]

2022-10-20 Thread Jason Merrill via Gcc-patches


On 10/19/22 04:00, Jakub Jelinek wrote:

Hi!

We ICE on the following testcase during mangling, finish_compound_literal
returns for void{} void_node and the mangler doesn't handle it.
Handling void_node in the mangler seems problematic to me, because
we don't know for which case it has been created.
The following patch arranges to mangle just void{} the same as void()
if that is what we want to use, by doing what we do for void() when
processing void{}.
The code does that only if processing_template_decl, because otherwise
build_functional_cast will return void_node, so calling it looks like
wasted effort to me.  But if you want to call it unconditionally,
I can certainly do that too.


I think in a template we want the same early-return behavior as in the 
processing_template_decl block farther down in the function: 
specifically, we want to return a CONSTRUCTOR (for which 
COMPOUND_LITERAL_P is true), so it mangles as void{} rather than void().



Or do you want to mangle it differently?  How?

clang++ doesn't support DR2351, so I can't check what they are doing.

Bootstrapped/regtested on x86_64-linux and i686-linux.

2022-10-19  Jakub Jelinek  

PR c++/106863
* semantics.cc (finish_compound_literal): For void{}, if
processing_template_decl return build_functional_cast of NULL_TREE
to VOID_TYPE rather than void_node.

* g++.dg/cpp0x/dr2351-2.C: New test.

--- gcc/cp/semantics.cc.jj  2022-10-10 09:31:57.410985121 +0200
+++ gcc/cp/semantics.cc 2022-10-18 15:24:08.726026118 +0200
@@ -3164,7 +3164,12 @@ finish_compound_literal (tree type, tree
  {
/* DR2351 */
if (VOID_TYPE_P (type) && CONSTRUCTOR_NELTS (compound_literal) == 0)
-   return void_node;
+   {
+ if (!processing_template_decl)
+   return void_node;
+ location_t loc = cp_expr_loc_or_input_loc (compound_literal);
+ return build_functional_cast (loc, type, NULL_TREE, complain);
+   }
else if (VOID_TYPE_P (type)
   && processing_template_decl
   && maybe_zero_constructor_nelts (compound_literal))
--- gcc/testsuite/g++.dg/cpp0x/dr2351-2.C.jj  2022-10-18 15:27:01.146690132 +0200
+++ gcc/testsuite/g++.dg/cpp0x/dr2351-2.C   2022-10-18 15:27:39.909164970 +0200
@@ -0,0 +1,16 @@
+// DR2351
+// { dg-do compile { target c++11 } }
+
+void bar (int);
+
+template <typename T>
+auto foo (T t) -> decltype (bar (t), void{})
+{
+  return bar (t);
+}
+
+int
+main ()
+{
+  foo (0);
+}

Jakub
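
For context, the test case relies on the comma-expression idiom in a trailing return type: the first operand is only checked for validity and the second gives the type. The sketch below shows the same idiom with the void() spelling, chosen here only so that it builds on compilers without DR2351 support; the test in the patch spells it void{}. The counter bar_calls is an addition made for this illustration.

```cpp
#include <cassert>

// Counter added for the sketch so the call can be observed.
static int bar_calls = 0;

void bar (int) { ++bar_calls; }

// The return type is void, but only if `bar (t)` is a valid expression.
// void () is used here instead of void{} from the test case so that the
// sketch does not depend on DR2351 support.
template <typename T>
auto foo (T t) -> decltype (bar (t), void ())
{
  return bar (t);
}
```

Because the trailing return type is part of the function's signature, the expression inside decltype has to be mangled, which is where the void{} form ran into the ICE discussed above.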

[PATCH v2] libstdc++: Don't use gstdint.h anymore

2022-10-20 Thread Arsen Arsenović via Gcc-patches

libstdc++-v3/ChangeLog:

* configure.ac: Stop generating gstdint.h.
* src/c++11/compatibility-atomic-c++0x.cc: Stop using gstdint.h.
---

> > +using guintptr_t = __UINTPTR_TYPE__;
> 
> I think this should be local in the only function that uses it.
Sure.

Tested on x86_64-pc-linux-gnu.

 libstdc++-v3/configure.ac| 6 --
 libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc | 8 
 2 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/libstdc++-v3/configure.ac b/libstdc++-v3/configure.ac
index 81d914b434a..c5ec976c026 100644
--- a/libstdc++-v3/configure.ac
+++ b/libstdc++-v3/configure.ac
@@ -440,12 +440,6 @@ GCC_CHECK_UNWIND_GETIPINFO
 
 GCC_LINUX_FUTEX([AC_DEFINE(HAVE_LINUX_FUTEX, 1, [Define if futex syscall is available.])])
 
-if test "$is_hosted" = yes; then
-# TODO: remove this and change src/c++11/compatibility-atomic-c++0x.cc to
-# use  instead of .
-GCC_HEADER_STDINT(include/gstdint.h)
-fi
-
 GLIBCXX_ENABLE_SYMVERS([yes])
 AC_SUBST(libtool_VERSION)
 
diff --git a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
index 5a0c5459088..e21bd76245d 100644
--- a/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
+++ b/libstdc++-v3/src/c++11/compatibility-atomic-c++0x.cc
@@ -22,7 +22,6 @@
 // see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
 // .
 
-#include "gstdint.h"
 #include 
 #include 
 
@@ -119,13 +118,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   _GLIBCXX_CONST __atomic_flag_base*
   __atomic_flag_for_address(const volatile void* __z) _GLIBCXX_NOTHROW
   {
-uintptr_t __u = reinterpret_cast<uintptr_t>(__z);
+using guintptr_t = __UINTPTR_TYPE__;
+guintptr_t __u = reinterpret_cast<guintptr_t>(__z);
 __u += (__u >> 2) + (__u << 4);
 __u += (__u >> 7) + (__u << 5);
 __u += (__u >> 17) + (__u << 13);
-if (sizeof(uintptr_t) > 4)
+if (sizeof(guintptr_t) > 4)
   __u += (__u >> 31);
-__u &= ~((~uintptr_t(0)) << LOGSIZE);
+__u &= ~((~guintptr_t(0)) << LOGSIZE);
 return flag_table + __u;
   }
 
-- 
2.38.1
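
The following standalone sketch shows the same address-hashing steps with a function-local alias for __UINTPTR_TYPE__, which GCC and Clang predefine, so no stdint header is needed. The function name flag_index_for_address and the LOGSIZE value of 4 are assumptions made here; the real constant is defined elsewhere in the library source and is not shown in the hunk.

```cpp
#include <cassert>

// Assumed table size for the sketch: 1 << 4 = 16 entries.
constexpr int LOGSIZE = 4;

// Hash an address down to an index in [0, 1 << LOGSIZE).
inline unsigned flag_index_for_address (const volatile void *z)
{
  // Local alias, as suggested in the review; __UINTPTR_TYPE__ is a
  // compiler-predefined macro on GCC and Clang.
  using guintptr_t = __UINTPTR_TYPE__;
  guintptr_t u = reinterpret_cast<guintptr_t> (z);
  u += (u >> 2) + (u << 4);
  u += (u >> 7) + (u << 5);
  u += (u >> 17) + (u << 13);
  if (sizeof (guintptr_t) > 4)
    u += (u >> 31);
  // Keep only the low LOGSIZE bits.
  u &= ~((~guintptr_t (0)) << LOGSIZE);
  return static_cast<unsigned> (u);
}
```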

Re: [PATCH v2] aarch64: update Ampere-1 core definition

2022-10-20 Thread Richard Sandiford via Gcc-patches

Richard Sandiford  writes:
> Philipp Tomsich  writes:
>> This brings the extensions detected by -mcpu=native on Ampere-1 systems
>> in sync with the defaults generated for -mcpu=ampere1.
>>
>> Note that some early kernel versions on Ampere1 may misreport the
>> presence of PAUTH and PREDRES (i.e., -mcpu=native will add 'nopauth'
>> and 'nopredres').
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-cores.def (AARCH64_CORE): Update
>>   Ampere-1 core entry.
>>
>> Signed-off-by: Philipp Tomsich 
>
> OK, thanks.
>
>> Ok for backport?
>
> Yeah.  I'll try to backport the RCPC change soon -- think it would
> be best to get that in first.

Here's what I've committed to GCC 12.  Other branches coming soon :-)

Richard

gcc/
* config/aarch64/aarch64.h (AARCH64_FL_FOR_ARCH8_3): Add
AARCH64_FL_RCPC.
(AARCH64_ISA_RCPC): New macro.
* config/aarch64/aarch64-cores.def (thunderx3t110, zeus, neoverse-v1)
(neoverse-512tvb, saphira): Remove RCPC from these Armv8.3-A+ cores.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Define
__ARM_FEATURE_RCPC when appropriate.

gcc/testsuite/
* gcc.target/aarch64/pragma_cpp_predefs_1.c: Add RCPC tests.
---
 gcc/config/aarch64/aarch64-c.cc   |  1 +
 gcc/config/aarch64/aarch64-cores.def  | 10 +-
 gcc/config/aarch64/aarch64.h  |  4 +++-
 .../gcc.target/aarch64/pragma_cpp_predefs_1.c | 20 +++
 4 files changed, 29 insertions(+), 6 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-c.cc b/gcc/config/aarch64/aarch64-c.cc
index 767ee0c763c..a4c407724a7 100644
--- a/gcc/config/aarch64/aarch64-c.cc
+++ b/gcc/config/aarch64/aarch64-c.cc
@@ -202,6 +202,7 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
"__ARM_FEATURE_BF16_SCALAR_ARITHMETIC", pfile);
   aarch64_def_or_undef (TARGET_LS64,
"__ARM_FEATURE_LS64", pfile);
+  aarch64_def_or_undef (AARCH64_ISA_RCPC, "__ARM_FEATURE_RCPC", pfile);
 
   /* Not for ACLE, but required to keep "float.h" correct if we switch
  target between implementations that do or do not support ARMv8.2-A
diff --git a/gcc/config/aarch64/aarch64-cores.def b/gcc/config/aarch64/aarch64-cores.def
index 0402bfb748f..8da254f6924 100644
--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -133,17 +133,17 @@ AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A,  AARCH64_FL_FOR_ARCH8_2 | AARCH64_
 /* ARMv8.3-A Architecture Processors.  */
 
 /* Marvell cores (TX3). */
-AARCH64_CORE("thunderx3t110",  thunderx3t110,  thunderx3t110, 8_3A,  AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC | AARCH64_FL_SM4 | AARCH64_FL_SHA3 | AARCH64_FL_F16FML | AARCH64_FL_RCPC8_4, thunderx3t110, 0x43, 0x0b8, 0x0a)
+AARCH64_CORE("thunderx3t110",  thunderx3t110,  thunderx3t110, 8_3A,  AARCH64_FL_FOR_ARCH8_3 | AARCH64_FL_CRYPTO | AARCH64_FL_SM4 | AARCH64_FL_SHA3 | AARCH64_FL_F16FML | AARCH64_FL_RCPC8_4, thunderx3t110, 0x43, 0x0b8, 0x0a)
 
 /* ARMv8.4-A Architecture Processors.  */
 
 /* Arm ('A') cores.  */
-AARCH64_CORE("zeus", zeus, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_RCPC | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoversev1, 0x41, 0xd40, -1)
-AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_RCPC | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoversev1, 0x41, 0xd40, -1)
-AARCH64_CORE("neoverse-512tvb", neoverse512tvb, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_RCPC | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoverse512tvb, INVALID_IMP, INVALID_CORE, -1)
+AARCH64_CORE("zeus", zeus, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoversev1, 0x41, 0xd40, -1)
+AARCH64_CORE("neoverse-v1", neoversev1, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoversev1, 0x41, 0xd40, -1)
+AARCH64_CORE("neoverse-512tvb", neoverse512tvb, cortexa57, 8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_SVE | AARCH64_FL_I8MM | AARCH64_FL_BF16 | AARCH64_FL_F16 | AARCH64_FL_PROFILE | AARCH64_FL_SSBS | AARCH64_FL_RNG, neoverse512tvb, INVALID_IMP, INVALID_CORE, -1)
 
 /* Qualcomm ('Q') cores. */
-AARCH64_CORE("saphira", saphira,saphira,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO | AARCH64_FL_RCPC, saphira,   0x51, 0xC01, -1)
+AARCH64_CORE("saphira", saphira,saphira,8_4A,  AARCH64_FL_FOR_ARCH8_4 | AARCH64_FL_CRYPTO, saphira,   0x51, 0xC01, -1)
 
 /* ARMv8-A big.LITTLE implementations.  */
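
The core-table edits above follow from the ChangeLog: once the Armv8.3-A baseline mask carries RCPC, listing it again per core is redundant. The sketch below shows that with toy bit masks. Every TOY_* name and bit value is invented for illustration; the real AARCH64_FL_* definitions live in aarch64.h and are not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bit values, chosen only for the sketch.
constexpr std::uint64_t TOY_FL_CRYPTO = 1u << 0;
constexpr std::uint64_t TOY_FL_RCPC = 1u << 1;
constexpr std::uint64_t TOY_FL_BASE8_2 = 1u << 2;

// After the patch the Armv8.3-A baseline includes RCPC, and later
// architecture levels inherit it.
constexpr std::uint64_t TOY_FL_FOR_ARCH8_3 = TOY_FL_BASE8_2 | TOY_FL_RCPC;
constexpr std::uint64_t TOY_FL_FOR_ARCH8_4 = TOY_FL_FOR_ARCH8_3;

// A core entry with and without the explicit RCPC bit.
constexpr std::uint64_t toy_saphira_old
  = TOY_FL_FOR_ARCH8_4 | TOY_FL_CRYPTO | TOY_FL_RCPC;
constexpr std::uint64_t toy_saphira_new = TOY_FL_FOR_ARCH8_4 | TOY_FL_CRYPTO;

// Stand-in for a feature test such as AARCH64_ISA_RCPC.
constexpr bool toy_isa_rcpc (std::uint64_t flags)
{
  return (flags & TOY_FL_RCPC) != 0;
}
```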

[PATCH] c++, v2: Fix up mangling ICE with void{} [PR106863]

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Thu, Oct 20, 2022 at 10:19:59AM -0400, Jason Merrill wrote:
> I think in a template we want the same early-return behavior as in the
> processing_template_decl block farther down in the function: specifically,
> we want to return a CONSTRUCTOR (for which COMPOUND_LITERAL_P is true), so
> it mangles as void{} rather than void().

So like this then?

2022-10-20  Jakub Jelinek  

PR c++/106863
* semantics.cc (finish_compound_literal): For void{}, if
processing_template_decl return a COMPOUND_LITERAL_P
CONSTRUCTOR rather than void_node.

* g++.dg/cpp0x/dr2351-2.C: New test.

--- gcc/cp/semantics.cc.jj  2022-10-19 01:14:58.343483355 +0200
+++ gcc/cp/semantics.cc 2022-10-20 16:32:30.605571968 +0200
@@ -3164,7 +3164,16 @@ finish_compound_literal (tree type, tree
 {
   /* DR2351 */
   if (VOID_TYPE_P (type) && CONSTRUCTOR_NELTS (compound_literal) == 0)
-   return void_node;
+   {
+ if (!processing_template_decl)
+   return void_node;
+ TREE_TYPE (compound_literal) = type;
+ TREE_HAS_CONSTRUCTOR (compound_literal) = 1;
+ CONSTRUCTOR_IS_DEPENDENT (compound_literal) = 0;
+ if (fcl_context == fcl_c99)
+   CONSTRUCTOR_C99_COMPOUND_LITERAL (compound_literal) = 1;
+ return compound_literal;
+   }
   else if (VOID_TYPE_P (type)
   && processing_template_decl
   && maybe_zero_constructor_nelts (compound_literal))
--- gcc/testsuite/g++.dg/cpp0x/dr2351-2.C.jj  2022-10-20 16:27:19.645821706 +0200
+++ gcc/testsuite/g++.dg/cpp0x/dr2351-2.C   2022-10-20 16:27:19.645821706 +0200
@@ -0,0 +1,16 @@
+// DR2351
+// { dg-do compile { target c++11 } }
+
+void bar (int);
+
+template <typename T>
+auto foo (T t) -> decltype (bar (t), void{})
+{
+  return bar (t);
+}
+
+int
+main ()
+{
+  foo (0);
+}


Jakub

Re: [PATCH] c++, v2: Fix up mangling ICE with void{} [PR106863]

2022-10-20 Thread Jason Merrill via Gcc-patches


On 10/20/22 10:38, Jakub Jelinek wrote:

On Thu, Oct 20, 2022 at 10:19:59AM -0400, Jason Merrill wrote:

I think in a template we want the same early-return behavior as in the
processing_template_decl block farther down in the function: specifically,
we want to return a CONSTRUCTOR (for which COMPOUND_LITERAL_P is true), so
it mangles as void{} rather than void().


So like this then?

2022-10-20  Jakub Jelinek  

PR c++/106863
* semantics.cc (finish_compound_literal): For void{}, if
processing_template_decl return a COMPOUND_LITERAL_P
CONSTRUCTOR rather than void_node.

* g++.dg/cpp0x/dr2351-2.C: New test.

--- gcc/cp/semantics.cc.jj  2022-10-19 01:14:58.343483355 +0200
+++ gcc/cp/semantics.cc 2022-10-20 16:32:30.605571968 +0200
@@ -3164,7 +3164,16 @@ finish_compound_literal (tree type, tree
  {
/* DR2351 */
if (VOID_TYPE_P (type) && CONSTRUCTOR_NELTS (compound_literal) == 0)
-   return void_node;
+   {
+ if (!processing_template_decl)
+   return void_node;
+ TREE_TYPE (compound_literal) = type;
+ TREE_HAS_CONSTRUCTOR (compound_literal) = 1;
+ CONSTRUCTOR_IS_DEPENDENT (compound_literal) = 0;
+ if (fcl_context == fcl_c99)
+   CONSTRUCTOR_C99_COMPOUND_LITERAL (compound_literal) = 1;


I don't think it's possible to get here with the C compound literal 
syntax, so you can drop these two lines.  OK with that change.



+ return compound_literal;
+   }
else if (VOID_TYPE_P (type)
   && processing_template_decl
   && maybe_zero_constructor_nelts (compound_literal))
--- gcc/testsuite/g++.dg/cpp0x/dr2351-2.C.jj  2022-10-20 16:27:19.645821706 +0200
+++ gcc/testsuite/g++.dg/cpp0x/dr2351-2.C   2022-10-20 16:27:19.645821706 +0200
@@ -0,0 +1,16 @@
+// DR2351
+// { dg-do compile { target c++11 } }
+
+void bar (int);
+
+template <typename T>
+auto foo (T t) -> decltype (bar (t), void{})
+{
+  return bar (t);
+}
+
+int
+main ()
+{
+  foo (0);
+}


Jakub

RE: [PATCH 7/15] arm: Emit build attributes for PACBTI target feature

2022-10-20 Thread Kyrylo Tkachov via Gcc-patches

Hi Andrea,

> -Original Message-
> From: Gcc-patches On Behalf Of Andrea
> Corallo via Gcc-patches
> Sent: Friday, August 12, 2022 4:31 PM
> To: Andrea Corallo via Gcc-patches 
> Cc: Richard Earnshaw ; nd 
> Subject: [PATCH 7/15] arm: Emit build attributes for PACBTI target feature
> 
> This patch emits assembler directives for PACBTI build attributes as
> defined by the
> ABI.
> 
>  aa/releases/download/2021Q1/addenda32.pdf>
> 
> gcc/ChangeLog:
> 
>   * config/arm/arm.c (arm_file_start): Emit EABI attributes for
>   Tag_PAC_extension, Tag_BTI_extension, TAG_BTI_use,
> TAG_PACRET_use.
> 
> gcc/testsuite/ChangeLog:
> 
>   * gcc.target/arm/acle/pacbti-m-predef-1.c: New test.
>   * gcc.target/arm/acle/pacbti-m-predef-3: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise.
>   * gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.
> 
> Co-Authored-By: Tejas Belagod  

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 0068817b0f2..ceec14f84b6 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -28349,6 +28349,8 @@ static void
 arm_file_start (void)
 {
   int val;
+  bool pac = (aarch_ra_sign_scope != AARCH_FUNCTION_NONE);
+  bool bti = (aarch_enable_bti == 1);
 
   arm_print_asm_arch_directives
 (asm_out_file, TREE_TARGET_OPTION (target_option_default_node));
@@ -28419,6 +28421,22 @@ arm_file_start (void)
arm_emit_eabi_attribute ("Tag_ABI_FP_16bit_format", 38,
 (int) arm_fp16_format);
 
+  if (TARGET_HAVE_PACBTI)
+   {
+ arm_emit_eabi_attribute ("Tag_PAC_extension", 50, 2);
+ arm_emit_eabi_attribute ("Tag_BTI_extension", 52, 2);
+   }
+  else if (pac || bti)
+   {
+ arm_emit_eabi_attribute ("Tag_PAC_extension", 50, 1);
+ arm_emit_eabi_attribute ("Tag_BTI_extension", 52, 1);
+   }

This hunk will set both Tag_PAC_extension and Tag_BTI_extension if only one of
pac or bti is on. Is that intended?
Would it make sense to instead set the two Tag_*_extension tags individually,
as in the hunk below?
+
+  if (bti)
+arm_emit_eabi_attribute ("TAG_BTI_use", 74, 1);
+  if (pac)
+   arm_emit_eabi_attribute ("TAG_PACRET_use", 76, 1);
+
   if (arm_lang_output_object_attributes_hook)
arm_lang_output_object_attributes_hook();
 }

Thanks,
Kyrill
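
One possible reading of the question above, sketched outside of GCC: each extension tag is chosen from its own flag when the target lacks PACBTI. Neither the patch nor the review spells this out, so the function toy_pacbti_attrs, the toy_attrs type and the exact branching are assumptions; only the tag names and the values 1 and 2 are taken from the quoted hunk.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Collected (tag name, value) pairs instead of emitted assembler text.
using toy_attrs = std::vector<std::pair<std::string, int>>;

// Hypothetical variant of the hunk in which the two extension tags are
// decided independently when the target does not have PACBTI.
inline toy_attrs toy_pacbti_attrs (bool have_pacbti, bool pac, bool bti)
{
  toy_attrs out;
  if (have_pacbti)
    {
      out.emplace_back ("Tag_PAC_extension", 2);
      out.emplace_back ("Tag_BTI_extension", 2);
    }
  else
    {
      if (pac)
        out.emplace_back ("Tag_PAC_extension", 1);
      if (bti)
        out.emplace_back ("Tag_BTI_extension", 1);
    }
  return out;
}
```

With only pac enabled this variant records Tag_PAC_extension alone, whereas the hunk as posted would record both tags.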

Re: [PATCH] c++ modules: handle CONCEPT_DECL in node_template_info [PR102963]

2022-10-20 Thread Nathan Sidwell via Gcc-patches


On 10/20/22 10:07, Patrick Palka wrote:

Here node_template_info is overlooking that CONCEPT_DECL has TEMPLATE_INFO
too, which makes get_originating_module_decl for the CONCEPT_DECL fail to
return the corresponding TEMPLATE_DECL, which leads to an ICE from
import_entity_index while pretty printing the CONCEPT_DECL's module
suffix as part of the failed static assert diagnostic.


ok



Tested on x86_64-pc-linux-gnu, does this look OK for trunk?

PR c++/102963

gcc/cp/ChangeLog:

* module.cc (node_template_info): Handle CONCEPT_DECL.

gcc/testsuite/ChangeLog:

* g++.dg/modules/concept-7_a.C: New test.
* g++.dg/modules/concept-7_b.C: New test.
---
  gcc/cp/module.cc   | 3 ++-
  gcc/testsuite/g++.dg/modules/concept-7_a.C | 7 +++
  gcc/testsuite/g++.dg/modules/concept-7_b.C | 7 +++
  3 files changed, 16 insertions(+), 1 deletion(-)
  create mode 100644 gcc/testsuite/g++.dg/modules/concept-7_a.C
  create mode 100644 gcc/testsuite/g++.dg/modules/concept-7_b.C

diff --git a/gcc/cp/module.cc b/gcc/cp/module.cc
index bb406a5cf01..dfed0a5ef89 100644
--- a/gcc/cp/module.cc
+++ b/gcc/cp/module.cc
@@ -4046,7 +4046,8 @@ node_template_info (tree decl, int &use)
   || TREE_CODE (decl) == TYPE_DECL
   || TREE_CODE (decl) == FUNCTION_DECL
   || TREE_CODE (decl) == FIELD_DECL
-  || TREE_CODE (decl) == TEMPLATE_DECL))
+  || TREE_CODE (decl) == TEMPLATE_DECL
+  || TREE_CODE (decl) == CONCEPT_DECL))
  {
use_tpl = DECL_USE_TEMPLATE (decl);
ti = DECL_TEMPLATE_INFO (decl);
diff --git a/gcc/testsuite/g++.dg/modules/concept-7_a.C 
b/gcc/testsuite/g++.dg/modules/concept-7_a.C
new file mode 100644
index 000..a39b31bf7f0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-7_a.C
@@ -0,0 +1,7 @@
+// PR c++/102963
+// { dg-additional-options "-fmodules-ts -fconcepts" }
+// { dg-module-cmi pr102963 }
+
+export module pr102963;
+
+export template concept C = __is_same(T, int);
diff --git a/gcc/testsuite/g++.dg/modules/concept-7_b.C 
b/gcc/testsuite/g++.dg/modules/concept-7_b.C
new file mode 100644
index 000..1f81208ebd5
--- /dev/null
+++ b/gcc/testsuite/g++.dg/modules/concept-7_b.C
@@ -0,0 +1,7 @@
+// PR c++/102963
+// { dg-additional-options "-fmodules-ts -fconcepts" }
+
+import pr102963;
+
+static_assert(C);
+static_assert(C); // { dg-error "static assert" }


--
Nathan Sidwell

Re: [PATCH] c++: Don't shortcut TREE_CONSTANT vector type CONSTRUCTORs in cxx_eval_constant_expression [PR107295]

2022-10-20 Thread Jason Merrill via Gcc-patches


On 10/19/22 03:48, Jakub Jelinek wrote:

Hi!

The excess precision support broke building skia (dependency of firefox)
on ia32 (it has something like the 'a' constexpr variable), but as the other
cases show, it is actually a preexisting problem if one uses casts from
constants with wider floating point types.
The problem is that cxx_eval_constant_expression tries to short-cut
processing of TREE_CONSTANT CONSTRUCTORs if they satisfy
reduced_constant_expression_p - instead of calling cxx_eval_bare_aggregate
on them it just verifies flags and if they are TREE_CONSTANT even after
that, just fold.
Now, on the testcase we have a TREE_CONSTANT CONSTRUCTOR containing
TREE_CONSTANT NOP_EXPR of REAL_CST.  And, fold, which isn't recursive,
doesn't optimize that into VECTOR_CST, while later on we are only able
to optimize VECTOR_CST arithmetics, not arithmetics with vector
CONSTRUCTORs.
The following patch fixes that by only returning what fold returned
if for vector types it returned VECTOR_CST, otherwise let us
call cxx_eval_bare_aggregate.  That function will try to constant
evaluate all the elements and if anything changes, return a CONSTRUCTOR,
in the vector type cases with fold called on it at the end.
Now, just calling cxx_eval_bare_aggregate for vector types doesn't work
either (e.g. constexpr-builtin4.C breaks), because cxx_eval_bare_aggregate
if nothing changes (like all elts are already REAL_CSTs or INTEGER_CSTs)
will return the old CONSTRUCTOR and nothing folds it into a VECTOR_CST.


That seems like a bug; for VECTOR_TYPE we should fold even if !changed.


Also, the reason for the short-cutting is I think trying to avoid
allocating a new CONSTRUCTOR when nothing changes and we just create
GC garbage by it.


We might limit the shortcut to non-vector types by hoisting the vector 
check in reduced_constant_expression_p out of the 
CONSTRUCTOR_NO_CLEARING condition:



  if (CONSTRUCTOR_NO_CLEARING (t))
{
  if (TREE_CODE (TREE_TYPE (t)) == VECTOR_TYPE)
/* An initialized vector would have a VECTOR_CST.  */
return false;


then we could remove the fold in the shortcut.


Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2022-10-19  Jakub Jelinek  

PR c++/107295
* constexpr.cc (cxx_eval_constant_expression) :
Don't short-cut TREE_CONSTANT vector ctors if fold doesn't turn them
into VECTOR_CST.

* g++.dg/ext/vector42.C: New test.

--- gcc/cp/constexpr.cc.jj  2022-10-17 12:29:33.518016420 +0200
+++ gcc/cp/constexpr.cc 2022-10-19 01:29:28.761935708 +0200
@@ -7391,7 +7391,12 @@ cxx_eval_constant_expression (const cons
 VECTOR_CST if applicable.  */
  verify_constructor_flags (t);
  if (TREE_CONSTANT (t))
-   return fold (t);
+   {
+ r = fold (t);
+ if (TREE_CODE (TREE_TYPE (t)) != VECTOR_TYPE
+ || TREE_CODE (r) == VECTOR_CST)
+   return r;
+   }
}
r = cxx_eval_bare_aggregate (ctx, t, lval,
   non_constant_p, overflow_p);
--- gcc/testsuite/g++.dg/ext/vector42.C.jj  2022-10-18 12:33:42.938510483 
+0200
+++ gcc/testsuite/g++.dg/ext/vector42.C 2022-10-18 12:32:27.448544476 +0200
@@ -0,0 +1,12 @@
+// PR c++/107295
+// { dg-do compile { target c++11 } }
+
+template  struct A {
+  typedef T __attribute__((vector_size (sizeof (int V;
+};
+template  using B = typename A::V;
+template  using V = B<4, T>;
+using F = V;
+constexpr F a = F () + 0.0f;
+constexpr F b = F () + (float) 0.0;
+constexpr F c = F () + (float) 0.0L;

Jakub
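
The control flow of the patched shortcut can be illustrated with a toy model.
This is only a sketch under stated assumptions: DemoExpr, demo_fold,
demo_eval_bare_aggregate and demo_eval_constructor are hypothetical stand-ins,
not GCC's tree representation or the real constexpr.cc functions, and the
"slow path" here always evaluates the elements.

```cpp
// Toy stand-ins for tree codes; none of this is GCC's real data model.
enum class DemoCode { Constructor, VectorCst };

struct DemoExpr
{
  DemoCode code;
  bool vector_type;   // models TREE_CODE (TREE_TYPE (t)) == VECTOR_TYPE
  bool elts_literal;  // true when every element is already a literal
};

// Non-recursive "fold": only a vector CONSTRUCTOR whose elements are all
// literals becomes a VECTOR_CST; elements hidden behind casts are left alone.
DemoExpr demo_fold (DemoExpr t)
{
  if (t.code == DemoCode::Constructor && t.vector_type && t.elts_literal)
    t.code = DemoCode::VectorCst;
  return t;
}

// Slow path: evaluate every element, then fold the result for vector types.
DemoExpr demo_eval_bare_aggregate (DemoExpr t)
{
  t.elts_literal = true;
  return t.vector_type ? demo_fold (t) : t;
}

// The patched shortcut: trust fold unless a vector type failed to become a
// VECTOR_CST, in which case continue to the slow path.
DemoExpr demo_eval_constructor (DemoExpr t)
{
  DemoExpr r = demo_fold (t);
  if (!t.vector_type || r.code == DemoCode::VectorCst)
    return r;
  return demo_eval_bare_aggregate (t);
}
```

In this model a vector constructor with unevaluated elements only reaches
VectorCst through the slow path, while non-vector constructors still return
early without further work.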

RE: [PATCH 10/12 V2] arm: Implement cortex-M return signing address codegen

2022-10-20 Thread Kyrylo Tkachov via Gcc-patches

(Sorry, this is a very late reply)

> -Original Message-
> From: Gcc-patches On Behalf Of Andrea
> Corallo via Gcc-patches
> Sent: Monday, August 8, 2022 10:34 AM
> To: Richard Earnshaw 
> Cc: Richard Earnshaw ; nd ;
> Andrea Corallo via Gcc-patches 
> Subject: Re: [PATCH 10/12 V2] arm: Implement cortex-M return signing
> address codegen
> 
> Richard Earnshaw  writes:
> 
> [...]
> 
> > +(define_insn "pac_nop"
> > +  [(set (reg:SI IP_REGNUM)
> > +   (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> > +   UNSPEC_PAC_NOP))]
> > +  "TARGET_THUMB2"
> > +  "pac\t%|ip, %|lr, %|sp"
> > +  [(set_attr "length" "2")])
> >
> > This pattern is missing a type.
> 
> Which type do you think is missing?
> 
> > The length is also incorrect as the
> > instruction is 32-bits (4 bytes).
> 
> Ack.
> 
> > Similarly for the other
> > instructions below.  Also, you need to mark them as incompatible with
> > conditional execution (they're constrained-unpredictable in IT
> > blocks).
> 
> I guess this would translate into setting it with '(set_attr "predicable" "no")'
> 
> But isn't this already the default?

I think Richard means the "conds" attribute. It's something I'd like to see 
cleaned from the arm backend eventually, but for now there's a (very) late 
condexec pass that can generate conditional instructions based on that 
attribute.
Basically, it needs to be set to "unconditional" for these instructions.
Thanks,
Kyrill

> 
> Thanks
> 
>   Andrea

RE: [PATCH 12/15 V2] arm: implement bti injection

2022-10-20 Thread Kyrylo Tkachov via Gcc-patches




> -Original Message-
> From: Andrea Corallo 
> Sent: Thursday, September 29, 2022 4:46 PM
> To: Kyrylo Tkachov 
> Cc: Andrea Corallo via Gcc-patches ; Richard
> Earnshaw ; nd 
> Subject: [PATCH 12/15 V2] arm: implement bti injection
> 
> Kyrylo Tkachov  writes:
> 
> > Hi Andrea,
> 
> [...]
> 
> > diff --git a/gcc/config/arm/aarch-bti-insert.cc b/gcc/config/arm/aarch-bti-
> insert.cc
> > index 2d1d2e334a9..8f045c247bf 100644
> > --- a/gcc/config/arm/aarch-bti-insert.cc
> > +++ b/gcc/config/arm/aarch-bti-insert.cc
> > @@ -41,6 +41,7 @@
> >  #include "cfgrtl.h"
> >  #include "tree-pass.h"
> >  #include "cgraph.h"
> > +#include "diagnostic-core.h"
> >
> > This change doesn't seem to match what's in the ChangeLog and doesn't
> make sense to me.
> 
> Change removed thanks.
> 
> > @@ -32985,6 +32979,58 @@ arm_current_function_pac_enabled_p (void)
> > && !crtl->is_leaf);
> >  }
> >
> > +/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
> > +bool
> > +aarch_bti_enabled (void)
> > +{
> > +  return aarch_enable_bti == 1;
> > +}
> > +
> > +/* Check if INSN is a BTI J insn.  */
> > +bool
> > +aarch_bti_j_insn_p (rtx_insn *insn)
> > +{
> > +  if (!insn || !INSN_P (insn))
> > +return false;
> > +
> > +  rtx pat = PATTERN (insn);
> > +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
> UNSPEC_BTI_NOP;
> > +}
> > +
> > +/* Check if X (or any sub-rtx of X) is a PACIASP/PACIBSP instruction.  */
> >
> > The arm instructions are not PACIASP/PACIBSP.
> > This comment should be rewritten.
> 
> This hunk belongs to aarch64.cc so it's aarch64 specific.
> 
> > +bool
> > +aarch_pac_insn_p (rtx x)
> > +{
> >
> > ..
> >
> > +rtx
> > +aarch_gen_bti_c (void)
> > +{
> > +  return gen_bti_nop ();
> > +}
> > +
> > +rtx
> > +aarch_gen_bti_j (void)
> > +{
> > +  return gen_bti_nop ();
> > +}
> > +
> >
> > A reader may be confused for why we have a bti_c and bti_j function that
> have identical functionality.
> > Please add function comments explaining the situation.
> 
> Done
> 
> > diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
> > index 92269a7819a..90c8c1d66f5 100644
> > --- a/gcc/config/arm/arm.md
> > +++ b/gcc/config/arm/arm.md
> > @@ -12913,6 +12913,13 @@
> >"aut\t%|ip, %|lr, %|sp"
> >[(set_attr "length" "4")])
> >
> > +(define_insn "bti_nop"
> > +  [(unspec_volatile [(const_int 0)] UNSPEC_BTI_NOP)]
> > +  "arm_arch7 && arm_arch_cmse"
> >
> > That seems like a copy-paste mistake. CMSE has nothing to do with this
> functionality?
> 
> This is because we don't have arm_arch8m_main, but this is equivalent to
> arm_arch7 && arm_arch_cmse.  IIUC it wasn't added because armv8-m is
> basically just armv7-m + cmse.
> 
> Any other preferred way to express this?

I think I'd prefer if we added an explicit arm_arch8m_main. It would help
readability.

> 
> > +  "bti"
> > +  [(set_attr "length" "4")
> >
> > The length of instructions in the arm backend is 4 by default, this set_attr
> can be omitted
> >
> > +   (set_attr "type" "mov_reg")])
> > +
> > Probably better to use the "nop" attribute here?
> 
> Done

Thanks, and as in patch 10/12 I think we'll want to set the "conds" attribute 
here to "unconditional".
Looks good to me otherwise!
Kyrill

> 
> Thanks for reviewing, please find attached the updated version.
> 
>   Andrea

[PATCH] libstdc++: Make placeholders inline when inline variables are available

2022-10-20 Thread Arsen Arsenović via Gcc-patches

This slightly lowers the dependency of generated code on libstdc++.so.

libstdc++-v3/ChangeLog:

* include/std/functional: Make placeholders inline, if possible.
---
 libstdc++-v3/include/std/functional | 66 -
 1 file changed, 37 insertions(+), 29 deletions(-)

diff --git a/libstdc++-v3/include/std/functional 
b/libstdc++-v3/include/std/functional
index d22acaa3cb8..b396e8dbbdc 100644
--- a/libstdc++-v3/include/std/functional
+++ b/libstdc++-v3/include/std/functional
@@ -285,35 +285,43 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
* simplify this with variadic templates, because we're introducing
* unique names for each.
*/
-extern const _Placeholder<1> _1;
-extern const _Placeholder<2> _2;
-extern const _Placeholder<3> _3;
-extern const _Placeholder<4> _4;
-extern const _Placeholder<5> _5;
-extern const _Placeholder<6> _6;
-extern const _Placeholder<7> _7;
-extern const _Placeholder<8> _8;
-extern const _Placeholder<9> _9;
-extern const _Placeholder<10> _10;
-extern const _Placeholder<11> _11;
-extern const _Placeholder<12> _12;
-extern const _Placeholder<13> _13;
-extern const _Placeholder<14> _14;
-extern const _Placeholder<15> _15;
-extern const _Placeholder<16> _16;
-extern const _Placeholder<17> _17;
-extern const _Placeholder<18> _18;
-extern const _Placeholder<19> _19;
-extern const _Placeholder<20> _20;
-extern const _Placeholder<21> _21;
-extern const _Placeholder<22> _22;
-extern const _Placeholder<23> _23;
-extern const _Placeholder<24> _24;
-extern const _Placeholder<25> _25;
-extern const _Placeholder<26> _26;
-extern const _Placeholder<27> _27;
-extern const _Placeholder<28> _28;
-extern const _Placeholder<29> _29;
+#if __cpp_inline_variables
+#  define _GLIBCXX_PLACEHOLDER inline
+#else
+#  define _GLIBCXX_PLACEHOLDER extern
+#endif
+
+_GLIBCXX_PLACEHOLDER const _Placeholder<1> _1;
+_GLIBCXX_PLACEHOLDER const _Placeholder<2> _2;
+_GLIBCXX_PLACEHOLDER const _Placeholder<3> _3;
+_GLIBCXX_PLACEHOLDER const _Placeholder<4> _4;
+_GLIBCXX_PLACEHOLDER const _Placeholder<5> _5;
+_GLIBCXX_PLACEHOLDER const _Placeholder<6> _6;
+_GLIBCXX_PLACEHOLDER const _Placeholder<7> _7;
+_GLIBCXX_PLACEHOLDER const _Placeholder<8> _8;
+_GLIBCXX_PLACEHOLDER const _Placeholder<9> _9;
+_GLIBCXX_PLACEHOLDER const _Placeholder<10> _10;
+_GLIBCXX_PLACEHOLDER const _Placeholder<11> _11;
+_GLIBCXX_PLACEHOLDER const _Placeholder<12> _12;
+_GLIBCXX_PLACEHOLDER const _Placeholder<13> _13;
+_GLIBCXX_PLACEHOLDER const _Placeholder<14> _14;
+_GLIBCXX_PLACEHOLDER const _Placeholder<15> _15;
+_GLIBCXX_PLACEHOLDER const _Placeholder<16> _16;
+_GLIBCXX_PLACEHOLDER const _Placeholder<17> _17;
+_GLIBCXX_PLACEHOLDER const _Placeholder<18> _18;
+_GLIBCXX_PLACEHOLDER const _Placeholder<19> _19;
+_GLIBCXX_PLACEHOLDER const _Placeholder<20> _20;
+_GLIBCXX_PLACEHOLDER const _Placeholder<21> _21;
+_GLIBCXX_PLACEHOLDER const _Placeholder<22> _22;
+_GLIBCXX_PLACEHOLDER const _Placeholder<23> _23;
+_GLIBCXX_PLACEHOLDER const _Placeholder<24> _24;
+_GLIBCXX_PLACEHOLDER const _Placeholder<25> _25;
+_GLIBCXX_PLACEHOLDER const _Placeholder<26> _26;
+_GLIBCXX_PLACEHOLDER const _Placeholder<27> _27;
+_GLIBCXX_PLACEHOLDER const _Placeholder<28> _28;
+_GLIBCXX_PLACEHOLDER const _Placeholder<29> _29;
+
+#undef _GLIBCXX_PLACEHOLDER
   }
 
   /**
-- 
2.38.1
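
For context, a small sketch of where the placeholder objects end up being
odr-used in user code. The function names are hypothetical; only std::bind and
std::placeholders are real library facilities. With the extern declarations
the definition of _1 and _2 has to come from the library, whereas with inline
variables the header itself can provide it.

```cpp
#include <functional>

// Plain function whose arguments get reordered below.
int demo_subtract (int a, int b)
{
  return a - b;
}

// Calls demo_subtract with its two arguments swapped.
int demo_swapped_subtract (int a, int b)
{
  using namespace std::placeholders;
  // std::bind takes its arguments by forwarding reference, so _1 and _2
  // are odr-used here and need a definition somewhere.
  auto swapped = std::bind (demo_subtract, _2, _1);
  return swapped (a, b);
}
```

For example, demo_swapped_subtract (10, 3) evaluates demo_subtract (3, 10).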

Re: [PATCH 7/15] arm: Emit build attributes for PACBTI target feature

2022-10-20 Thread Richard Earnshaw via Gcc-patches





On 20/10/2022 15:47, Kyrylo Tkachov via Gcc-patches wrote:

Hi Andrea,


-Original Message-
From: Gcc-patches  On Behalf Of Andrea
Corallo via Gcc-patches
Sent: Friday, August 12, 2022 4:31 PM
To: Andrea Corallo via Gcc-patches 
Cc: Richard Earnshaw ; nd 
Subject: [PATCH 7/15] arm: Emit build attributes for PACBTI target feature

This patch emits assembler directives for PACBTI build attributes as defined
by the ABI.



gcc/ChangeLog:

* config/arm/arm.c (arm_file_start): Emit EABI attributes for
Tag_PAC_extension, Tag_BTI_extension, TAG_BTI_use,
TAG_PACRET_use.

gcc/testsuite/ChangeLog:

* gcc.target/arm/acle/pacbti-m-predef-1.c: New test.
* gcc.target/arm/acle/pacbti-m-predef-3: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-6.c: Likewise.
* gcc.target/arm/acle/pacbti-m-predef-7.c: Likewise.

Co-Authored-By: Tejas Belagod  


diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index 0068817b0f2..ceec14f84b6 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -28349,6 +28349,8 @@ static void
  arm_file_start (void)
  {
int val;
+  bool pac = (aarch_ra_sign_scope != AARCH_FUNCTION_NONE);
+  bool bti = (aarch_enable_bti == 1);
  
arm_print_asm_arch_directives

  (asm_out_file, TREE_TARGET_OPTION (target_option_default_node));
@@ -28419,6 +28421,22 @@ arm_file_start (void)
arm_emit_eabi_attribute ("Tag_ABI_FP_16bit_format", 38,
 (int) arm_fp16_format);
  
+  if (TARGET_HAVE_PACBTI)

+   {
+ arm_emit_eabi_attribute ("Tag_PAC_extension", 50, 2);
+ arm_emit_eabi_attribute ("Tag_BTI_extension", 52, 2);
+   }
+  else if (pac || bti)
+   {
+ arm_emit_eabi_attribute ("Tag_PAC_extension", 50, 1);
+ arm_emit_eabi_attribute ("Tag_BTI_extension", 52, 1);
+   }

This hunk will set both Tag_PAC_extension and Tag_BTI_extension if only one of 
pac or bti is on. Is that intended?
Would it make sense to instead set the two Tag_*_extension tags individually 
as in the hunk below?


That's because they are one feature in armv8-m and these tags describe 
the presence of the feature in the architecture.



+
+  if (bti)
+arm_emit_eabi_attribute ("TAG_BTI_use", 74, 1);
+  if (pac)
+   arm_emit_eabi_attribute ("TAG_PACRET_use", 76, 1);
+


But this describes /use/ by the code of each feature.

R.


if (arm_lang_output_object_attributes_hook)
arm_lang_output_object_attributes_hook();
  }

Thanks,
Kyrill
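
The distinction drawn above (the Tag_*_extension pair describes the
architecture feature as a whole, the TAG_*_use tags describe what the code
uses) can be modelled as below. This is a sketch only: demo_pacbti_attributes
and DemoAttr are hypothetical, and the function just records the name/value
pairs that the quoted hunk would pass to arm_emit_eabi_attribute.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one emitted attribute: (name, value).
using DemoAttr = std::pair<std::string, int>;

// Mirrors the branch structure of the quoted hunk, nothing more.
std::vector<DemoAttr>
demo_pacbti_attributes (bool target_have_pacbti, bool pac, bool bti)
{
  std::vector<DemoAttr> out;

  // PAC and BTI are one architecture feature, so the two extension tags
  // are always emitted together.
  if (target_have_pacbti)
    {
      out.push_back ({"Tag_PAC_extension", 2});
      out.push_back ({"Tag_BTI_extension", 2});
    }
  else if (pac || bti)
    {
      out.push_back ({"Tag_PAC_extension", 1});
      out.push_back ({"Tag_BTI_extension", 1});
    }

  // The use tags are independent: each reflects one option.
  if (bti)
    out.push_back ({"TAG_BTI_use", 1});
  if (pac)
    out.push_back ({"TAG_PACRET_use", 1});

  return out;
}
```

With only pac enabled and no PACBTI target support, this yields both extension
tags with value 1 followed by TAG_PACRET_use alone.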

Re: [PATCH 0/6] Add Intel Sierra Forest Instructions

2022-10-20 Thread Iain Sandoe

Hi Hongtao,

> On 20 Oct 2022, at 10:20, Hongtao Liu  wrote:
> 
> On Thu, Oct 20, 2022 at 5:17 PM Iain Sandoe  wrote:
>> 
>> 
>> 
>>> On 20 Oct 2022, at 10:09, Hongtao Liu via Gcc-patches 
>>>  wrote:
>>> 
>>> On Thu, Oct 20, 2022 at 9:11 AM Hongtao Liu  wrote:
 
 On Wed, Oct 19, 2022 at 7:09 PM Iain Sandoe  
 wrote:

>> On 17 Oct 2022, at 02:56, Hongtao Liu  wrote:
>> 
>> On Mon, Oct 17, 2022 at 9:30 AM Bernhard Reutner-Fischer
>>  wrote:
>>> 
>>> On 17 October 2022 03:02:22 CEST, Hongtao Liu via Gcc-patches
>>> 
>>> Do you have this series as a branch somewhere that I can try on one of the
>>> likely affected platforms?
>> 
>> Not yet.
>> Do we have any external place to put those patches so folks from the
>> community can validate before it's committed, HJ?
>>> 
>>> 
>>> https://gcc.gnu.org/gitwrite.html#vendor
>>> 
>>> Not sure where in cgit the user branches are visible, though? But they 
>>> can certainly be cloned and worked with.
>> Thanks for the reminder, I've pushed to remotes/vendors/ix86/ise046.
>> * [new ref] refs/vendors/ix86/heads/ise046 ->
>> vendors/ix86/ise046
> 
> thanks for pushing this branch, much better to test these things before
> committing rather than a panic to fix after…
> 
> 
> with
> f90df941532 (HEAD -> ise046, vendors/ix86/ise046) Add m_CORE_ATOM for 
> atom cores
> 
> - on x86_64 Darwin19  I get the following bootstrap fail:
> 
> In file included from 
> /src-local/gcc-master/gcc/config/i386/driver-i386.cc:31:
> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h: In function 
> ‘const char* get_intel_cpu(__processor_model*, __processor_model2*, 
> unsigned int*)’:
> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:532:32: error: 
> this statement may fall through [-Werror=implicit-fallthrough=]
> 532 |   cpu_model->__cpu_subtype = INTEL_COREI7_GRANITERAPIDS;
> |   ~^~~~
> /src-local/gcc-master/gcc/common/config/i386/cpuinfo.h:533:5: note: here
> 533 | case 0xb6:
> | ^~~~
> cc1plus: all warnings being treated as errors
> 
> 
> Will try to look later, if that does not immediately ring some bell.
 This should be a bug, thanks!
>>> I've updated the branch, please try that.
>> 
>> I had made the same fix locally (adding the “break”, right?) and testing is 
>> ongoing
> Yes, please go ahead.

Thanks for giving me a chance to test, this seems OK on Darwin (no large-scale
fallout, anyway) ..  

I tested the ise046 branch, which looks like it collects several of the posted
patch series, so I’ve covered those too. (I have not had a chance to test on
AVX512 yet, but if the series is basically OK on Skylake, then there should not
be too much of an issue.)

Iain

P.S. I am usually able to test patches / series like this on Darwin if you
point me at a branch.
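
The bootstrap failure quoted above is the usual -Wimplicit-fallthrough
pattern: a case body assigns a value and then runs into the next case label
without a break. A minimal sketch, not the actual cpuinfo.h code; the enum,
the function name and the 0x01 model number are hypothetical, only the 0xb6
label is taken from the log.

```cpp
// Hypothetical subtype values standing in for the real __cpu_subtype constants.
enum DemoSubtype { DEMO_NONE, DEMO_FIRST, DEMO_SECOND };

// Map a (made-up) model number to a subtype.
DemoSubtype demo_classify (unsigned model)
{
  DemoSubtype subtype = DEMO_NONE;
  switch (model)
    {
    case 0x01:
      subtype = DEMO_FIRST;
      // Without this break, control would run into the 0xb6 case, the
      // assignment above would be overwritten, and -Wimplicit-fallthrough
      // (an error under -Werror) would diagnose the missing break.
      break;
    case 0xb6:
      subtype = DEMO_SECOND;
      break;
    default:
      break;
    }
  return subtype;
}
```

Where falling through is intended, a C++17 [[fallthrough]]; statement before
the next label documents that and silences the warning.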

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Joseph Myers

On Thu, 20 Oct 2022, Martin Liška wrote:

> > Could generated man and info pages be provided as a tarball on
> > gcc.gnu.org or ftp.gnu.org?
> 
> Not planning to do that.

Release tarballs (but not snapshots) currently include the info files and 
man pages, via gcc_release running a build with 
--enable-generated-files-in-srcdir before building the tarball.

I think they should continue to do so.  This means:

(a) --enable-generated-files-in-srcdir needs to cause those files to be 
generated in the source directory, as it does at present.

(b) gcc_release, for building a release but not a snapshot, needs to give 
an error if Sphinx is missing or too old and so those files weren't built 
properly (and thus people running gcc_release to build a release tarball 
will need new-enough Sphinx).

(c) It needs to be verified that building and installing from such a 
release tarball works even if Sphinx is missing or too old - that is, that 
it installs the prebuilt info / man files rather than giving an error or 
failing to install them.

Also, but not strictly part of the release issue:

(d) Builds with missing or old Sphinx should work regardless of whether 
such files are in the source directory - but if they aren't in the source 
directory, the effect of missing or old Sphinx (detected at configure 
time) should be to disable building and installing documentation.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH RESEND 1/1] p1689r5: initial support

2022-10-20 Thread Jason Merrill via Gcc-patches


On 10/18/22 08:18, Ben Boeckel wrote:

On Tue, Oct 11, 2022 at 07:42:43 -0400, Ben Boeckel wrote:

On Mon, Oct 10, 2022 at 17:04:09 -0400, Jason Merrill wrote:

Can we share utf8 parsing code with decode_utf8_char in pretty-print.cc?


I can look at factoring that out. I'll have to decode its logic to see
how much overlap there is.


There is some mismatch. First, that is in `gcc` and this is in `libcpp`.


Oops, I was thinking this was in gcc as well.  In libcpp there's 
_cpp_valid_utf8 (which calls one_utf8_to_cppchar).


Jason

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Martin Liška

On 10/20/22 17:35, Joseph Myers wrote:
> On Thu, 20 Oct 2022, Martin Liška wrote:
> 
>>> Could generated man and info pages be provided as a tarball on
>>> gcc.gnu.org or ftp.gnu.org?
>>
>> Not planning to do that.
> 
> Release tarballs (but not snapshots) currently include the info files and 
> man pages, via gcc_release running a build with 
> --enable-generated-files-in-srcdir before building the tarball.
> 
> I think they should continue to do so.  This means:
> 
> (a) --enable-generated-files-in-srcdir needs to cause those files to be 
> generated in the source directory, as it does at present.
> 
> (b) gcc_release, for building a release but not a snapshot, needs to give 
> an error if Sphinx is missing or too old and so those files weren't built 
> properly (and thus people running gcc_release to build a release tarball 
> will need new-enough Sphinx).
> 
> (c) It needs to be verified that building and installing from such a 
> release tarball works even if Sphinx is missing or too old - that is, that 
> it installs the prebuilt info / man files rather than giving an error or 
> failing to install them.
> 
> Also, but not strictly part of the release issue:
> 
> (d) Builds with missing or old Sphinx should work regardless of whether 
> such files are in the source directory - but if they aren't in the source 
> directory, the effect of missing or old Sphinx (detected at configure 
> time) should be to disable building and installing documentation.

All right Joseph, is it something you're willing to help me with once we start
using Sphinx? Apparently, there will be many subsequent steps after we switch.

Cheers,
Martin

[PATCH] libstdc++: Enable building libstdc++.{a,so} when !HOSTED

2022-10-20 Thread Arsen Arsenović via Gcc-patches

This enables us to provide symbols for placeholders and numeric limits,
and allows users to mess about with linker flags less.

libstdc++-v3/ChangeLog:

* Makefile.am [!_GLIBCXX_HOSTED]: Enable src/ subdirectory.
* Makefile.in: Regenerate.
* src/Makefile.am [!_GLIBCXX_HOSTED]: Omit compatibility files.
There's no history to be compatible with.
* src/c++11/Makefile.am [!_GLIBCXX_HOSTED]: Omit hosted-only
source files from the build.
* src/c++17/Makefile.am [!_GLIBCXX_HOSTED]: Likewise.
* src/c++20/Makefile.am [!_GLIBCXX_HOSTED]: Likewise.
* src/c++98/Makefile.am [!_GLIBCXX_HOSTED]: Likewise.
* src/Makefile.in: Regenerate.
* src/c++11/Makefile.in: Regenerate.
* src/c++17/Makefile.in: Regenerate.
* src/c++20/Makefile.in: Regenerate.
* src/c++98/Makefile.in: Regenerate.
---
Afternoon,

With these changes, when we aren't hosted, we get a libstdc++ library that
contains only library facilities available in freestanding (i.e. placeholders
and limits.cc).  This is, AFAICT, the only code in libstdc++.{a,so} that can
(and should) be available in freestanding.

As an implementation note, this could be a little bit faster (at
build/configure time), though not necessarily nicer, by having
src/Makefile.am not try to build convenience libraries for versions of
C++ that provide nothing.  I opted not to do this since it'd make
src/Makefile.am even more complex, and make future changes harder to implement.
libstdc++ also isn't that slow to build, anyway.

Tested on i686-elf.

Have a good day!

 libstdc++-v3/Makefile.am   |  4 ++--
 libstdc++-v3/Makefile.in   |  4 ++--
 libstdc++-v3/src/Makefile.am   |  6 +
 libstdc++-v3/src/Makefile.in   |  8 +--
 libstdc++-v3/src/c++11/Makefile.am | 16 ++---
 libstdc++-v3/src/c++11/Makefile.in | 37 +++---
 libstdc++-v3/src/c++17/Makefile.am |  4 
 libstdc++-v3/src/c++17/Makefile.in |  6 +++--
 libstdc++-v3/src/c++20/Makefile.am |  4 
 libstdc++-v3/src/c++20/Makefile.in |  6 +++--
 libstdc++-v3/src/c++98/Makefile.am |  4 
 libstdc++-v3/src/c++98/Makefile.in |  6 +++--
 12 files changed, 77 insertions(+), 28 deletions(-)

diff --git a/libstdc++-v3/Makefile.am b/libstdc++-v3/Makefile.am
index 0d147ad3ffe..d7f2b6e76a5 100644
--- a/libstdc++-v3/Makefile.am
+++ b/libstdc++-v3/Makefile.am
@@ -24,11 +24,11 @@ include $(top_srcdir)/fragment.am
 
 if GLIBCXX_HOSTED
 ## Note that python must come after src.
-  hosted_source = src doc po testsuite python
+  hosted_source = doc po testsuite python
 endif
 
 ## Keep this list sync'd with acinclude.m4:GLIBCXX_CONFIGURE.
-SUBDIRS = include libsupc++ $(hosted_source)
+SUBDIRS = include libsupc++ src $(hosted_source)
 
 ACLOCAL_AMFLAGS = -I . -I .. -I ../config
 
diff --git a/libstdc++-v3/src/Makefile.am b/libstdc++-v3/src/Makefile.am
index b83c222d51d..4eb78e76297 100644
--- a/libstdc++-v3/src/Makefile.am
+++ b/libstdc++-v3/src/Makefile.am
@@ -121,7 +121,13 @@ cxx11_sources = \
${cxx0x_compat_sources} \
${ldbl_alt128_compat_sources}
 
+if GLIBCXX_HOSTED
 libstdc___la_SOURCES = $(cxx98_sources) $(cxx11_sources)
+else
+# When freestanding, there's currently no compatibility to preserve.  Should
+# that change, any compatibility sources can be added here.
+libstdc___la_SOURCES =
+endif
 
 libstdc___la_LIBADD = \
$(GLIBCXX_LIBS) \
diff --git a/libstdc++-v3/src/c++11/Makefile.am 
b/libstdc++-v3/src/c++11/Makefile.am
index ecd46aafc01..72f05100c98 100644
--- a/libstdc++-v3/src/c++11/Makefile.am
+++ b/libstdc++-v3/src/c++11/Makefile.am
@@ -51,6 +51,10 @@ else
 cxx11_abi_sources =
 endif
 
+sources_freestanding = \
+   limits.cc \
+   placeholders.cc
+
 sources = \
chrono.cc \
codecvt.cc \
@@ -66,9 +70,7 @@ sources = \
hashtable_c++0x.cc \
ios.cc \
ios_errcat.cc \
-   limits.cc \
mutex.cc \
-   placeholders.cc \
random.cc \
regex.cc  \
shared_ptr.cc \
@@ -118,7 +120,15 @@ endif
 
 vpath % $(top_srcdir)/src/c++11
 
-libc__11convenience_la_SOURCES = $(sources)  $(inst_sources)
+if !GLIBCXX_HOSTED
+libc__11convenience_la_SOURCES = $(sources_freestanding)
+else
+libc__11convenience_la_SOURCES = \
+   $(sources_freestanding) \
+   $(sources) \
+   $(inst_sources)
+endif
+
 
 # Use special rules for the hashtable.cc file so that all
 # the generated template functions are also instantiated.
diff --git a/libstdc++-v3/src/c++17/Makefile.am 
b/libstdc++-v3/src/c++17/Makefile.am
index 3d53f652fac..72095f5b087 100644
--- a/libstdc++-v3/src/c++17/Makefile.am
+++ b/libstdc++-v3/src/c++17/Makefile.am
@@ -60,7 +60,11 @@ sources = \
 
 vpath % $(top_srcdir)/src/c++17
 
+if GLIBCXX_HOSTED
 libc__17convenience_la_SOURCES = $(sources)  $(inst_sources)
+else
+libc__17convenience_la_SOURCES =
+endif
 
 if GLIBCXX_LDBL_ALT128_COMPAT
 floating_from_chars.lo: floating_from_chars.cc

[PATCH (pushed)] Remove dead link to Buildbot.

2022-10-20 Thread Martin Liška

Removed after a discussion with Jan-Benedict. He's currently working
on his testing infrastructure.

Cheers,
Martin

---
 htdocs/style.mhtml | 1 -
 1 file changed, 1 deletion(-)

diff --git a/htdocs/style.mhtml b/htdocs/style.mhtml
index 0790a972..08def35e 100644
--- a/htdocs/style.mhtml
+++ b/htdocs/style.mhtml
@@ -126,7 +126,6 @@
   Back ends
   Extensions
   Benchmarks
-  http://toolchain.lug-owl.de/buildbot/";>Buildbot
   Translations
   
   
-- 
2.38.0

[PATCH] OpenMP: Duplicate checking for map clauses in Fortran (PR107214)

2022-10-20 Thread Julian Brown

This patch adds duplicate checking for OpenMP "map" clauses, taking some
cues from the implementation for C in c-typeck.cc:c_finish_omp_clauses
(and similar for C++).

In addition to the existing use of the "mark" and "comp_mark" bitfields
in the gfc_symbol structure, the patch adds several new bits handling
duplicate checking within various categories of clause types.  If "mark"
is being used for map clauses, we need to use different bits for other
clauses for cases where "map" and some other clause can refer to the
same symbol (e.g. "map(n) shared(n)").

Tested with offloading to NVPTX. OK?
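
The idea of keeping a separate mark bit per clause category can be sketched
as follows. This is a simplified illustration, not the gfortran code:
DemoSymbol, DemoClause, DemoItem and demo_check_duplicates are hypothetical,
only two categories are modelled, and the cross-category checks of the real
patch are left out.

```cpp
#include <string>
#include <vector>

// Cut-down, hypothetical symbol: one bit per clause category.
struct DemoSymbol
{
  std::string name;
  unsigned mark : 1;      // seen in a map clause
  unsigned gen_mark : 1;  // seen in a private/shared style clause
};

enum class DemoClause { Map, Shared };

struct DemoItem
{
  DemoClause clause;
  DemoSymbol *sym;
};

// Returns one message per symbol repeated within the same category.
std::vector<std::string>
demo_check_duplicates (const std::vector<DemoItem> &items)
{
  // Clear all marks first, so stale bits from earlier checks do not count.
  for (const DemoItem &it : items)
    {
      it.sym->mark = 0;
      it.sym->gen_mark = 0;
    }

  std::vector<std::string> errors;
  for (const DemoItem &it : items)
    {
      bool seen = (it.clause == DemoClause::Map) ? it.sym->mark
                                                 : it.sym->gen_mark;
      if (seen)
        errors.push_back ("Symbol '" + it.sym->name
                          + "' present on multiple clauses");
      else if (it.clause == DemoClause::Map)
        it.sym->mark = 1;
      else
        it.sym->gen_mark = 1;
    }
  return errors;
}
```

Because the two categories use different bits, map(n) shared(n) passes in this
model while map(n) map(n) is reported once.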

2022-10-20  Julian Brown  

gcc/fortran/
PR fortran/107214
* gfortran.h (gfc_symbol): Add data_mark, dev_mark, gen_mark and
reduc_mark bitfields.
* openmp.cc (resolve_omp_clauses): Use above bitfields to improve
duplicate clause detection.

gcc/testsuite/
PR fortran/107214
* gfortran.dg/gomp/pr107214.f90: New test.
---
 gcc/fortran/gfortran.h  | 16 +-
 gcc/fortran/openmp.cc   | 63 +
 gcc/testsuite/gfortran.dg/gomp/pr107214.f90 |  7 +++
 3 files changed, 72 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/pr107214.f90

diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index fe8c4e131f3..511a1ec3623 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -1861,9 +1861,21 @@ typedef struct gfc_symbol
  the current statement.  Otherwise, old_symbol points to a copy of
  the old symbol. gfc_new is used in symbol.cc to flag new symbols.
  comp_mark is used to indicate variables which have component accesses
- in OpenMP/OpenACC directive clauses.  */
+ in OpenMP/OpenACC directive clauses (cf. c-typeck.cc:c_finish_omp_clauses,
+ map_field_head).
+ data_mark is used to check duplicate mappings for OpenMP data-sharing
+ clauses (see firstprivate_head/lastprivate_head in the above function).
+ dev_mark is used to check duplicate mappings for OpenMP
+ is_device_ptr/has_device_addr clauses (see is_on_device_head in above
+ function).
+ gen_mark is used to check duplicate mappings for OpenMP
+ use_device_ptr/use_device_addr/private/shared clauses (see generic_head in
+ above function).
+ reduc_mark is used to check duplicate mappings for OpenMP reduction
+ clauses.  */
   struct gfc_symbol *old_symbol;
-  unsigned mark:1, comp_mark:1, gfc_new:1;
+  unsigned mark:1, comp_mark:1, data_mark:1, dev_mark:1, gen_mark:1;
+  unsigned reduc_mark:1, gfc_new:1;
 
   /* The tlink field is used in the front end to carry the module
  declaration of separate module procedures so that the characteristics
diff --git a/gcc/fortran/openmp.cc b/gcc/fortran/openmp.cc
index ce719bd5d92..d4595aae23e 100644
--- a/gcc/fortran/openmp.cc
+++ b/gcc/fortran/openmp.cc
@@ -6738,6 +6738,10 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
  continue;
n->sym->mark = 0;
n->sym->comp_mark = 0;
+   n->sym->data_mark = 0;
+   n->sym->dev_mark = 0;
+   n->sym->gen_mark = 0;
+   n->sym->reduc_mark = 0;
if (n->sym->attr.flavor == FL_VARIABLE
|| n->sym->attr.proc_pointer
|| (!code && (!n->sym->attr.dummy || n->sym->ns != ns)))
@@ -6806,7 +6810,6 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
&& list != OMP_LIST_LASTPRIVATE
&& list != OMP_LIST_ALIGNED
&& list != OMP_LIST_DEPEND
-   && (list != OMP_LIST_MAP || openacc)
&& list != OMP_LIST_FROM
&& list != OMP_LIST_TO
&& (list != OMP_LIST_REDUCTION || !openacc)
@@ -6825,10 +6828,43 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses 
*omp_clauses,
for (gfc_ref *ref = n->expr->ref; ref; ref = ref->next)
  if (ref->type == REF_COMPONENT)
component_ref_p = true;
- if ((!component_ref_p && n->sym->comp_mark)
- || (component_ref_p && n->sym->mark))
-   gfc_error ("Symbol %qs has mixed component and non-component "
-  "accesses at %L", n->sym->name, &n->where);
+ if ((list == OMP_LIST_IS_DEVICE_PTR
+  || list == OMP_LIST_HAS_DEVICE_ADDR)
+ && !component_ref_p)
+   {
+ if (n->sym->gen_mark || n->sym->dev_mark || n->sym->reduc_mark)
+   gfc_error ("Symbol %qs present on multiple clauses at %L",
+  n->sym->name, &n->where);
+ else
+   n->sym->dev_mark = 1;
+   }
+ else if ((list == OMP_LIST_USE_DEVICE_PTR
+   || list == OMP_LIST_USE_DEVICE_ADDR
+   || list == OMP_LIST_PRIVATE
+   || list == OMP_LIST_SHARED)
+  && !component_ref_p)
+   {
+ if (n->sym->gen_mark || n->sym->dev_mark || n->sym->reduc_mark)
+   gfc_error ("Sy

Re: [PATCH] c++: constraint matching, TEMPLATE_ID_EXPR, current inst

2022-10-20 Thread Jason Merrill via Gcc-patches


On 9/17/22 10:31, Patrick Palka wrote:

On Sat, 17 Sep 2022, Jason Merrill wrote:


On 9/16/22 10:59, Patrick Palka wrote:

On Fri, 16 Sep 2022, Jason Merrill wrote:


On 9/15/22 11:58, Patrick Palka wrote:

Here we're crashing during constraint matching for the instantiated
hidden friends due to two issues with dependent substitution into a
TEMPLATE_ID_EXPR naming a template from the current instantiation
(as performed from maybe_substitute_reqs_for for C<3> with T=T):

 * tsubst_copy substitutes into such a TEMPLATE_DECL by looking it
   up from the substituted class scope.  But for this to not fail when
   the args are dependent, we need to pass entering_scope=true for the
   class scope substitution so that we obtain the primary template type
   A (which has TYPE_BINFO) instead of the implicit instantiation
   A (which doesn't).
 * lookup_and_finish_template_variable shouldn't instantiate a
   TEMPLATE_ID_EXPR that names a TEMPLATE_DECL which has more than
   one level of (unsubstituted) parameters (such as A::C).

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?

gcc/cp/ChangeLog:

* pt.cc (lookup_and_finish_template_variable): Don't
instantiate if the template's scope is dependent.
(tsubst_copy) : Pass entering_scope=true
when substituting the class scope.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/concepts-friend10.C: New test.
---
gcc/cp/pt.cc  | 14 +++--
.../g++.dg/cpp2a/concepts-friend10.C  | 21
+++
2 files changed, 29 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/g++.dg/cpp2a/concepts-friend10.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index db4e808adec..bfcbe0b8670 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -10475,14 +10475,15 @@ tree
lookup_and_finish_template_variable (tree templ, tree targs,
 tsubst_flags_t complain)
{
-  templ = lookup_template_variable (templ, targs);
-  if (!any_dependent_template_arguments_p (targs))
+  tree var = lookup_template_variable (templ, targs);
+  if (TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (templ)) == 1
+  && !any_dependent_template_arguments_p (targs))


I notice that finish_id_expression_1 uses the equivalent of
type_dependent_expression_p (var).  Does that work here?


Hmm, it does, but kind of by accident: type_dependent_expression_p
returns true for all variable TEMPLATE_ID_EXPRs because of their empty
TREE_TYPE (as set by finish_template_variable).  So testing t_d_e_p here
is equivalent to testing processing_template_decl, it seems -- maximally
conservative.

We can improve type_dependent_expression_p for variable TEMPLATE_ID_EXPR
by ignoring its (always empty) TREE_TYPE and just considering dependence
of its template and args directly.


I guess the problem is that a variable template specialization is 
type-dependent until it's instantiated or specialized, and here we're 
trying to instantiate if that hasn't happened yet, so using 
type_dependent_expression_p would be wrong.


The patch is OK as is.

Jason

PATCH: c++tools: fix compilation

2022-10-20 Thread Guillaume Gomez via Gcc-patches

Hi,

This patch fixes the following compilation error:

../.././c++tools/server.cc: In function ‘void server(bool, int,
module_resolver*)’:
../.././c++tools/server.cc:756:69: error: ‘readers’ was not declared
in this scope; did you mean ‘read’?
  756 |   if (active < 0 && sock_fd >= 0 && FD_ISSET
(sock_fd, &readers))
  |
 ^~~

The code was missing a preprocessor guard: the "readers" variable is
only declared under a preprocessor condition, so any use of it must be
wrapped in the same condition.
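
The failure mode can be reproduced in miniature. The following is only a sketch: HAVE_FEATURE_X is a hypothetical macro standing in for HAVE_PSELECT / HAVE_SELECT, and poll_once is an invented function, not code from c++tools/server.cc.

```cpp
#include <cassert>

// HAVE_FEATURE_X is a hypothetical configure-style macro (an assumption for
// this sketch).  Leave it undefined to simulate a host without select ().
// #define HAVE_FEATURE_X 1

int poll_once (int pending)
{
  int active = -1;

#if defined (HAVE_FEATURE_X)
  // 'readers' is declared only when the feature is available.
  int readers = pending;
#endif

#if defined (HAVE_FEATURE_X)
  // Without this guard, the use of 'readers' below would fail to compile
  // whenever HAVE_FEATURE_X is undefined.
  if (active < 0 && readers > 0)
    active = 0;
#else
  (void) pending;  // silence the unused-parameter warning
#endif

  return active;
}
```

With the macro undefined, the function compiles and simply leaves active at -1; removing the second guard makes the build fail in the same way as the error quoted above.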

Signed-off-by: Guillaume Gomez 
From 39279d8b37287c09708d910921ce5cfa5b87ac01 Mon Sep 17 00:00:00 2001
From: Guillaume Gomez 
Date: Thu, 20 Oct 2022 18:18:52 +0200
Subject: [PATCH] Add missing preprocessor condition to fix c++tools/server.cc
 file compilation

---
 c++tools/server.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/c++tools/server.cc b/c++tools/server.cc
index 00154a05925..693aec6820a 100644
--- a/c++tools/server.cc
+++ b/c++tools/server.cc
@@ -753,8 +753,10 @@ server (bool ipv6, int sock_fd, module_resolver *resolver)
 		  }
 		  }
 
+#if defined (HAVE_PSELECT) || defined (HAVE_SELECT)
 	  if (active < 0 && sock_fd >= 0 && FD_ISSET (sock_fd, &readers))
 		active = -1;
+#endif
 	}
 
 	  if (active >= 0)
-- 
2.34.1

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Joseph Myers

On Thu, 20 Oct 2022, Martin Liška wrote:

> > Also, but not strictly part of the release issue:
> > 
> > (d) Builds with missing or old Sphinx should work regardless of whether 
> > such files are in the source directory - but if they aren't in the source 
> > directory, the effect of missing or old Sphinx (detected at configure 
> > time) should be to disable building and installing documentation.
> 
> All right Joseph, is it something you're willing to help me once we start
> using Sphinx? Apparently, there will be many consequent steps after we switch.

Sure, but most of the conditionals are *already* present, just need 
updating as part of the Sphinx transition.  E.g. gcc/Makefile.in has 
BUILD_INFO and GENERATED_MANPAGES conditionals based on configure tests 
for whether relevant tools are present and new enough; the rules for 
$(DESTDIR)$(infodir)/%.info quietly allow the info files not to be 
present, so installing also works without the info files or tools to build 
them, and the rules for installing man pages similarly ignore errors; and 
there are srcinfo and srcman rules, enabled based on @GENINSRC@, to copy 
those built files to the source directory, which are what's used when 
--enable-generated-files-in-srcdir is used as part of building a release 
tarball.

The main thing I've suggested that I think may actually be new is an error 
for trying to build a release tarball without new-enough Sphinx (I think 
the current rules would quietly not copy info / man pages to the source 
directory if build tools were missing - but having those tools missing 
when building a release tarball is much less likely than not having 
new-enough Sphinx).

-- 
Joseph S. Myers
jos...@codesourcery.com

[COMMITTED] A false UNORDERED_ means neither operand can be a NAN.

2022-10-20 Thread Aldy Hernandez via Gcc-patches

The false side of UNORDERED_ means neither operand can be a NAN.
Adjust all the op[12]_range entries for the UNORDERED operators such
that a known NAN on one operand means the other operand is
undefined.
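
The underlying property can be checked with plain floating-point code. This is a minimal sketch using only the standard library; unordered_le and false_side_excludes_nan are illustrative names, not functions from range-op-float.cc.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Models UNORDERED_LE: true if either operand is a NaN, or if x <= y.
bool unordered_le (double x, double y)
{
  return std::isunordered (x, y) || x <= y;
}

// Checks the property the patch relies on: whenever unordered_le (x, y)
// is false, neither x nor y is a NaN.
bool false_side_excludes_nan (double x, double y)
{
  if (unordered_le (x, y))
    return true;  // the property only constrains the false side
  return !std::isnan (x) && !std::isnan (y);
}
```

Since a NaN operand always makes the comparison true, a range that is known to be NaN on the false side describes an impossible path, which is why the patch sets the other operand's range to undefined.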

gcc/ChangeLog:

* range-op-float.cc (foperator_unordered_le::op1_range): Adjust
false side with a NAN operand.
(foperator_unordered_le::op2_range): Same.
(foperator_unordered_gt::op1_range): Same.
(foperator_unordered_gt::op2_range): Same.
(foperator_unordered_ge::op1_range): Same.
(foperator_unordered_ge::op2_range): Same.
(foperator_unordered_equal::op1_range): Same.
---
 gcc/range-op-float.cc | 51 ++-
 1 file changed, 41 insertions(+), 10 deletions(-)

diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
index a9e74c86877..0cb07c2ec29 100644
--- a/gcc/range-op-float.cc
+++ b/gcc/range-op-float.cc
@@ -1351,7 +1351,11 @@ foperator_unordered_le::op1_range (frange &r, tree type,
   break;
 
 case BRS_FALSE:
-  if (build_gt (r, type, op2))
+  // A false UNORDERED_LE means both operands are !NAN, so it's
+  // impossible for op2 to be a NAN.
+  if (op2.known_isnan ())
+   r.set_undefined ();
+  else if (build_gt (r, type, op2))
r.clear_nan ();
   break;
 
@@ -1375,7 +1379,11 @@ foperator_unordered_le::op2_range (frange &r,
   break;
 
 case BRS_FALSE:
-  if (build_lt (r, type, op1))
+  // A false UNORDERED_LE means both operands are !NAN, so it's
+  // impossible for op1 to be a NAN.
+  if (op1.known_isnan ())
+   r.set_undefined ();
+  else if (build_lt (r, type, op1))
r.clear_nan ();
   break;
 
@@ -1434,7 +1442,11 @@ foperator_unordered_gt::op1_range (frange &r,
   break;
 
 case BRS_FALSE:
-  if (build_le (r, type, op2))
+  // A false UNORDERED_GT means both operands are !NAN, so it's
+  // impossible for op2 to be a NAN.
+  if (op2.known_isnan ())
+   r.set_undefined ();
+  else if (build_le (r, type, op2))
r.clear_nan ();
   break;
 
@@ -1458,7 +1470,11 @@ foperator_unordered_gt::op2_range (frange &r,
   break;
 
 case BRS_FALSE:
-  if (build_ge (r, type, op1))
+  // A false UNORDERED_GT means both operands are !NAN, so it's
+  // impossible for op1 to be a NAN.
+  if (op1.known_isnan ())
+   r.set_undefined ();
+  else if (build_ge (r, type, op1))
r.clear_nan ();
   break;
 
@@ -1517,7 +1533,11 @@ foperator_unordered_ge::op1_range (frange &r,
   break;
 
 case BRS_FALSE:
-  if (build_lt (r, type, op2))
+  // A false UNORDERED_GE means both operands are !NAN, so it's
+  // impossible for op2 to be a NAN.
+  if (op2.known_isnan ())
+   r.set_undefined ();
+  else if (build_lt (r, type, op2))
r.clear_nan ();
   break;
 
@@ -1540,7 +1560,11 @@ foperator_unordered_ge::op2_range (frange &r, tree type,
   break;
 
 case BRS_FALSE:
-  if (build_gt (r, type, op1))
+  // A false UNORDERED_GE means both operands are !NAN, so it's
+  // impossible for op1 to be a NAN.
+  if (op1.known_isnan ())
+   r.set_undefined ();
+  else if (build_gt (r, type, op1))
r.clear_nan ();
   break;
 
@@ -1606,10 +1630,17 @@ foperator_unordered_equal::op1_range (frange &r, tree 
type,
   break;
 
 case BRS_FALSE:
-  // The false side indictates !NAN and not equal.  We can at least
-  // represent !NAN.
-  r.set_varying (type);
-  r.clear_nan ();
+  // A false UNORDERED_EQ means both operands are !NAN, so it's
+  // impossible for op2 to be a NAN.
+  if (op2.known_isnan ())
+   r.set_undefined ();
+  else
+   {
+ // The false side indictates !NAN and not equal.  We can at least
+ // represent !NAN.
+ r.set_varying (type);
+ r.clear_nan ();
+   }
   break;
 
 default:
-- 
2.37.3

Re: Announcement: Porting the Docs to Sphinx - 9. November 2022

2022-10-20 Thread Jakub Jelinek via Gcc-patches

On Thu, Oct 20, 2022 at 04:43:06PM +, Joseph Myers wrote:
> On Thu, 20 Oct 2022, Martin Liška wrote:
> 
> > > Also, but not strictly part of the release issue:
> > > 
> > > (d) Builds with missing or old Sphinx should work regardless of whether 
> > > such files are in the source directory - but if they aren't in the source 
> > > directory, the effect of missing or old Sphinx (detected at configure 
> > > time) should be to disable building and installing documentation.
> > 
> > All right Joseph, is it something you're willing to help me once we start
> > using Sphinx? Apparently, there will be many consequent steps after we 
> > switch.
> 
> Sure, but most of the conditionals are *already* present, just need 
> updating as part of the Sphinx transition.  E.g. gcc/Makefile.in has 
> BUILD_INFO and GENERATED_MANPAGES conditionals based on configure tests 
> for whether relevant tools are present and new enough; the rules for 
> $(DESTDIR)$(infodir)/%.info quietly allow the info files not to be 
> present, so installing also works without the info files or tools to build 
> them, and the rules for installing man pages similarly ignore errors; and 
> there are srcinfo and srcman rules, enabled based on @GENINSRC@, to copy 
> those built files to the source directory, which are what's used when 
> --enable-generated-files-in-srcdir is used as part of building a release 
> tarball.
> 
> The main thing I've suggested that I think may actually be new is an error 
> for trying to build a release tarball without new-enough Sphinx (I think 
> the current rules would quietly not copy info / man pages to the source 
> directory if build tools were missing - but having those tools missing 
> when building a release tarball is much less likely than not having 
> new-enough Sphinx).

But perhaps that test should go to maintainer-scripts/gcc_release.
It could either check whether Sphinx is new enough, or check whether
make actually built the documentation before creating the tarballs.

Jakub

[PATCH] Fix uninitialized variable warnings

2022-10-20 Thread Michael Eager


The attached patch corrects a couple of uninitialized variable warnings.
The variables are now initialized to NULL and tested for NULL, calling
gcc_unreachable() if the test fails.  The patch also replaces other calls
to abort() with gcc_unreachable().
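
The pattern can be sketched outside of GCC as follows. Everything here is illustrative: my_unreachable is a hypothetical stand-in for GCC's internal gcc_unreachable (), and legitimize and kind are invented names.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical stand-in for gcc_unreachable (); it simply aborts.
[[noreturn]] inline void my_unreachable ()
{
  std::abort ();
}

enum class kind { tls, pic };

const char *legitimize (kind k)
{
  // Initialized up front, so the compiler can see that no path returns an
  // indeterminate value.
  const char *reg = nullptr;

  if (k == kind::tls)
    reg = "tls";
  else if (k == kind::pic)
    reg = "pic";
  else
    my_unreachable ();  // every valid kind is handled above

  return reg;
}
```

Initializing the variable silences the warning, and the explicit unreachable branch documents that a NULL result would indicate a logic error rather than a valid outcome.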


Thanks to Jan-Benedict Glaw for bringing this to my attention.

** I'm receiving a "service not enabled" error when I push.
** Can someone apply this patch while I resolve this issue?

--
Michael Eager
From a0fd2e9baa51e85f61cebd6e78bef8b5c55199b5 Mon Sep 17 00:00:00 2001
From: Michael Eager 
Date: Thu, 20 Oct 2022 09:33:13 -0700
Subject: [PATCH] Fix uninitialized variable warnings

gcc/ChangeLog:

	* gcc/config/microblaze/microblaze.cc
	(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
	(microblaze_address_insns): Replace abort() with gcc_unreachable().
	(print_operand_address): Same.
	(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
	(get_branch_target): Replace abort() with gcc_unreachable().
---
 gcc/ChangeLog   |  9 +
 gcc/config/microblaze/microblaze.cc | 19 ++-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a50293c780..8271fafe033 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2022-10-20  Michael Eager  
+
+	* gcc/config/microblaze/microblaze.cc
+	(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
+	(microblaze_address_insns): Replace abort() with gcc_unreachable().
+	(print_operand_address): Same.
+	(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
+	(get_branch_target): Replace abort() with gcc_unreachable().
+
 2022-10-19  Aldy Hernandez  
 
 	* range-op-float.cc (build_le): Document result.
diff --git a/gcc/config/microblaze/microblaze.cc b/gcc/config/microblaze/microblaze.cc
index 8fcca1829f6..9290a1f3958 100644
--- a/gcc/config/microblaze/microblaze.cc
+++ b/gcc/config/microblaze/microblaze.cc
@@ -1103,7 +1103,7 @@ microblaze_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 
   if (GET_CODE (xinsn) == SYMBOL_REF)
 {
-  rtx reg;
+  rtx reg = NULL;
   if (microblaze_tls_symbol_p(xinsn))
 {
   reg = microblaze_legitimize_tls_address (xinsn, NULL_RTX);
@@ -1133,6 +1133,11 @@ microblaze_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 	  reg = pic_ref;
 	}
 	}
+  else
+	{
+	  /* This should never happen.  */
+	  gcc_unreachable ();
+	}
   return reg;
 }
 
@@ -1474,7 +1479,7 @@ microblaze_address_insns (rtx x, machine_mode mode)
 	  case TLS_DTPREL:
 		return 1;
 	  default :
-		abort();
+		gcc_unreachable ();
 	}
 	default:
 	  break;
@@ -2624,7 +2629,7 @@ print_operand_address (FILE * file, rtx addr)
 		fputs ("@TLSDTPREL", file);
 		break;
 	  default :
-		abort();
+		gcc_unreachable ();
 		break;
 	}
 	}
@@ -3413,7 +3418,7 @@ microblaze_expand_move (machine_mode mode, rtx operands[])
 }
   if (GET_CODE (op1) == PLUS && GET_CODE (XEXP (op1,1)) == CONST)
 {
-  rtx p0, p1, result, temp;
+  rtx p0, p1 = NULL, result, temp;
 
   p0 = XEXP (XEXP (op1,1), 0);
 
@@ -3423,6 +3428,10 @@ microblaze_expand_move (machine_mode mode, rtx operands[])
 	  p0 = XEXP (p0, 0);
 	}
 
+  /* This should never happen.  */
+  if (p1 == NULL)
+	gcc_unreachable ();
+
   if (GET_CODE (p0) == UNSPEC && GET_CODE (p1) == CONST_INT
 	  && flag_pic && TARGET_PIC_DATA_TEXT_REL)
 	{
@@ -3799,7 +3808,7 @@ get_branch_target (rtx branch)
   if (GET_CODE (call) == SET)
 call = SET_SRC (call);
   if (GET_CODE (call) != CALL)
-abort ();
+	gcc_unreachable ();
   return XEXP (XEXP (call, 0), 0);
 }
 
-- 
2.31.1

[PATCH] Always use TYPE_MODE instead of DECL_MODE for vector field

2022-10-20 Thread H.J. Lu via Gcc-patches

commit e034c5c895722e0092d2239cd8c2991db77d6d39
Author: Jakub Jelinek 
Date:   Sat Dec 2 08:54:47 2017 +0100

PR target/78643
PR target/80583
* expr.c (get_inner_reference): If DECL_MODE of a non-bitfield
is BLKmode for vector field with vector raw mode, use TYPE_MODE
instead of DECL_MODE.

fixed the case where DECL_MODE of a vector field is BLKmode and its
TYPE_MODE is a vector mode because of target attribute.  Remove the
BLKmode check for the case where DECL_MODE of a vector field is a vector
mode and its TYPE_MODE is BLKmode because of target attribute.

gcc/

PR target/107304
* expr.c (get_inner_reference): Always use TYPE_MODE for vector
field with vector raw mode.

gcc/testsuite/

PR target/107304
* gcc.target/i386/pr107304.c: New test.
---
 gcc/expr.cc  |  3 +-
 gcc/testsuite/gcc.target/i386/pr107304.c | 39 
 2 files changed, 40 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr107304.c

diff --git a/gcc/expr.cc b/gcc/expr.cc
index efe387e6173..9145193c2c1 100644
--- a/gcc/expr.cc
+++ b/gcc/expr.cc
@@ -7905,8 +7905,7 @@ get_inner_reference (tree exp, poly_int64_pod *pbitsize,
  /* For vector fields re-check the target flags, as DECL_MODE
 could have been set with different target flags than
 the current function has.  */
- if (mode == BLKmode
- && VECTOR_TYPE_P (TREE_TYPE (field))
+ if (VECTOR_TYPE_P (TREE_TYPE (field))
  && VECTOR_MODE_P (TYPE_MODE_RAW (TREE_TYPE (field
mode = TYPE_MODE (TREE_TYPE (field));
}
diff --git a/gcc/testsuite/gcc.target/i386/pr107304.c 
b/gcc/testsuite/gcc.target/i386/pr107304.c
new file mode 100644
index 000..24d68795e7f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr107304.c
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O0 -march=tigerlake" } */
+
+#include 
+
+typedef union {
+  uint8_t v __attribute__((aligned(256))) __attribute__ ((vector_size(64 * 
sizeof(uint8_t;
+  uint8_t i[64] __attribute__((aligned(256)));
+} stress_vec_u8_64_t;
+
+typedef struct {
+ struct {
+  stress_vec_u8_64_t s;
+  stress_vec_u8_64_t o;
+  stress_vec_u8_64_t mask1;
+  stress_vec_u8_64_t mask2;
+ } u8_64;
+} stress_vec_data_t;
+
+__attribute__((target_clones("arch=alderlake", "default"))) 
+void
+stress_vecshuf_u8_64(stress_vec_data_t *data)
+{
+  stress_vec_u8_64_t *__restrict s;
+  stress_vec_u8_64_t *__restrict mask1;
+  stress_vec_u8_64_t *__restrict mask2;
+  register int i;
+
+  s = &data->u8_64.s;
+  mask1 = &data->u8_64.mask1;
+  mask2 = &data->u8_64.mask2;
+
+  for (i = 0; i < 256; i++) {  /* was i < 65536 */
+  stress_vec_u8_64_t tmp;
+
+  tmp.v = __builtin_shuffle(s->v, mask1->v);
+  s->v = __builtin_shuffle(tmp.v, mask2->v);
+  }
+}
-- 
2.37.3

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool

On Thu, Oct 20, 2022 at 01:44:15AM +, Jiang, Haochen wrote:
> Maybe the testcase change cause some misunderstanding and concern.
> 
> Actually, the patch did not disrupt the previous builtins, as the
> builtin_prefetch uses vargs. I set the default value of the new parameter
> as data prefetch, which means that if we are not using the fourth
> parameter, just like how we use prefetch previously, it is still what it is.

I still think it is a mistake to have one builtin do two very distinct
operations, only very superficially related.  Instruction fetch and data
demand loads are almost entirely unrelated, and so is the prefetch
machinery for them, on all machines I am familiar with.  Which makes
sense anyway, since instruction prefetch and data prefetch have
completely different performance characteristics and considerations.
Maybe if you start with the mistake of having unified L1 caches it
seems natural, but thankfully most machines do not do that.

Segher
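
For reference, this is how the existing three-argument form of the builtin is used for data prefetch. It is a minimal sketch that assumes GCC or a compatible compiler; sum_with_prefetch and the prefetch distance of 16 elements are arbitrary choices for illustration, and the proposed fourth argument is deliberately not shown.

```cpp
#include <cassert>
#include <cstddef>

long sum_with_prefetch (const int *data, std::size_t n)
{
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    {
      // Second argument: 0 = prefetch for read.
      // Third argument: 3 = high temporal locality.
      // The bounds check keeps the address expression itself valid.
      if (i + 16 < n)
        __builtin_prefetch (&data[i + 16], 0, 3);
      total += data[i];
    }
  return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; only the memory access timing may change.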

Re: [PATCH] mips: Add appropriate linker flags when compiling with -static-pie

2022-10-20 Thread linted via Gcc-patches

On Wed, Oct 12, 2022 at 2:18 PM Jeff Law via Gcc-patches <
gcc-patches@gcc.gnu.org> wrote:

>
> On 9/25/22 09:49, linted via Gcc-patches wrote:
> > Hello,
> > I'm just checking to see if anyone has had a chance to look at this.
> >
> > Thank you
> >
> > On Wed, Sep 14, 2022 at 2:09 PM linted  wrote:
> >
> >> Hello,
> >>
> >> This patch fixes missing flags when compiling with -static-pie on mips. I
> >> made these modifications based on the previously submitted static pie patch
> >> for arm as well as the working code for aarch64.
> >>
> >> I tested with a host of mips-elf and checked with mips-sim. This patch was
> >> also tested and used with uclibc-ng to generate static pie elfs.
> >>
> >> This is my first patch for gcc, so please let me know if there is anything
> >> I missed.
> >>
> >>
> >>
> >> Signed-off-by: linted 
> >> ---
> >>   gcc/config/mips/gnu-user.h | 5 +++--
> >>   1 file changed, 3 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/gcc/config/mips/gnu-user.h b/gcc/config/mips/gnu-user.h
> >> index 6aad7192e69..b1c665b7f37 100644
> >> --- a/gcc/config/mips/gnu-user.h
> >> +++ b/gcc/config/mips/gnu-user.h
> >> @@ -56,11 +56,12 @@ along with GCC; see the file COPYING3.  If not see
> >>   #define GNU_USER_TARGET_LINK_SPEC "\
> >> %{G*} %{EB} %{EL} %{mips*} %{shared} \
> >> %{!shared: \
> >> -%{!static: \
> >> +%{!static:%{!static-pie: \
> >> %{rdynamic:-export-dynamic} \
> >> %{mabi=n32: -dynamic-linker " GNU_USER_DYNAMIC_LINKERN32 "} \
> >> %{mabi=64: -dynamic-linker " GNU_USER_DYNAMIC_LINKER64 "} \
> >> -  %{mabi=32: -dynamic-linker " GNU_USER_DYNAMIC_LINKER32 "}} \
> >> +  %{mabi=32: -dynamic-linker " GNU_USER_DYNAMIC_LINKER32 "}}} \
> >> +%{static-pie:-Bstatic -pie --no-dynamic-linker -z text} \
> >>   %{static}} \
>
> This is a bit out of my usual areas of expertise.  But what I find odd
> here is that for -static we essentially do nothing, but for -static-pie
> we need "-Bstatic -pie --no-dynamic-linker -z text".  Is the -Bstatic
> really needed for static-pie?  And if it is, then wouldn't it be needed
> for -static as well?  If you look carefully at aarch64, you'll see it
> includes -Bstatic for -static.
>
>
> Jeff
>

This is a really good question. From what I can tell, the linker will treat
-Bstatic and -static identically. So when compiling with -static, adding a
-Bstatic is redundant since the -static will be passed on to the linker.

In the case of -static-pie, we need to explicitly specify -Bstatic so that
the linker will know that we want the binary to be statically linked since
there is no -static being passed on.

I am of the opinion that we should usually be explicit, but in this case, I
think it is more prudent to change less code.

Re: [PATCH RESEND 1/1] p1689r5: initial support

2022-10-20 Thread Ben Boeckel via Gcc-patches

On Thu, Oct 20, 2022 at 11:39:25 -0400, Jason Merrill wrote:
> Oops, I was thinking this was in gcc as well.  In libcpp there's 
> _cpp_valid_utf8 (which calls one_utf8_to_cppchar).

This routine has a lot more logic (including UCN decoding) and the
`one_utf8_to_cppchar` also supports out-of-bounds codepoints above
`0x10`.
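
To illustrate what a strict decoder would do differently, here is a minimal sketch. It is not libcpp's implementation: decode_utf8_strict is an invented name, and the sketch assumes the bound in question is the Unicode maximum U+10FFFF.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>

// Decode one UTF-8 sequence at the start of 's'.  Rejects truncated
// sequences, bad continuation bytes, overlong forms, surrogates, and code
// points above U+10FFFF.  On success, 'len' receives the bytes consumed.
std::optional<char32_t>
decode_utf8_strict (std::string_view s, std::size_t &len)
{
  if (s.empty ())
    return std::nullopt;

  const auto b0 = static_cast<unsigned char> (s[0]);
  char32_t cp = 0;
  std::size_t n = 0;
  char32_t min_cp = 0;  // smallest code point allowed for this length

  if (b0 < 0x80)
    {
      len = 1;
      return static_cast<char32_t> (b0);
    }
  else if ((b0 & 0xE0) == 0xC0) { cp = b0 & 0x1F; n = 2; min_cp = 0x80; }
  else if ((b0 & 0xF0) == 0xE0) { cp = b0 & 0x0F; n = 3; min_cp = 0x800; }
  else if ((b0 & 0xF8) == 0xF0) { cp = b0 & 0x07; n = 4; min_cp = 0x10000; }
  else
    return std::nullopt;  // stray continuation byte or invalid lead byte

  if (s.size () < n)
    return std::nullopt;  // truncated sequence

  for (std::size_t i = 1; i < n; ++i)
    {
      const auto b = static_cast<unsigned char> (s[i]);
      if ((b & 0xC0) != 0x80)
        return std::nullopt;  // not a continuation byte
      cp = (cp << 6) | (b & 0x3F);
    }

  if (cp < min_cp)
    return std::nullopt;  // overlong encoding
  if (cp > 0x10FFFF)
    return std::nullopt;  // beyond the Unicode range
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return std::nullopt;  // surrogate code point

  len = n;
  return cp;
}
```

A decoder that accepts out-of-range values would return a code point for the byte sequence F4 90 80 80, whereas this sketch rejects it.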

--Ben

[PATCH] Microblaze: Fix uninitialized variable warnings

2022-10-20 Thread Michael Eager


The attached patch corrects a couple of uninitialized variable warnings.
The variables are now initialized to NULL and tested for NULL, calling
gcc_unreachable() if the test fails.  The patch also replaces other calls
to abort() with gcc_unreachable().


Thanks to Jan-Benedict Glaw for bringing this to my attention.

** I'm receiving a "service not enabled" error when I push.
** Can someone apply this patch while I resolve this issue?

--
Michael Eager
From a0fd2e9baa51e85f61cebd6e78bef8b5c55199b5 Mon Sep 17 00:00:00 2001
From: Michael Eager 
Date: Thu, 20 Oct 2022 09:33:13 -0700
Subject: [PATCH] Fix uninitialized variable warnings

gcc/ChangeLog:

	* gcc/config/microblaze/microblaze.cc
	(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
	(microblaze_address_insns): Replace abort() with gcc_unreachable().
	(print_operand_address): Same.
	(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
	(get_branch_target): Replace abort() with gcc_unreachable().
---
 gcc/ChangeLog   |  9 +
 gcc/config/microblaze/microblaze.cc | 19 ++-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 7a50293c780..8271fafe033 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,12 @@
+2022-10-20  Michael Eager  
+
+	* gcc/config/microblaze/microblaze.cc
+	(microblaze_legitimize_address): Initialize 'reg' to NULL, check for NULL.
+	(microblaze_address_insns): Replace abort() with gcc_unreachable().
+	(print_operand_address): Same.
+	(microblaze_expand_move): Initialize 'p1' to NULL, check for NULL.
+	(get_branch_target): Replace abort() with gcc_unreachable().
+
 2022-10-19  Aldy Hernandez  
 
 	* range-op-float.cc (build_le): Document result.
diff --git a/gcc/config/microblaze/microblaze.cc b/gcc/config/microblaze/microblaze.cc
index 8fcca1829f6..9290a1f3958 100644
--- a/gcc/config/microblaze/microblaze.cc
+++ b/gcc/config/microblaze/microblaze.cc
@@ -1103,7 +1103,7 @@ microblaze_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 
   if (GET_CODE (xinsn) == SYMBOL_REF)
 {
-  rtx reg;
+  rtx reg = NULL;
   if (microblaze_tls_symbol_p(xinsn))
 {
   reg = microblaze_legitimize_tls_address (xinsn, NULL_RTX);
@@ -1133,6 +1133,11 @@ microblaze_legitimize_address (rtx x, rtx oldx ATTRIBUTE_UNUSED,
 	  reg = pic_ref;
 	}
 	}
+  else
+	{
+	  /* This should never happen.  */
+	  gcc_unreachable ();
+	}
   return reg;
 }
 
@@ -1474,7 +1479,7 @@ microblaze_address_insns (rtx x, machine_mode mode)
 	  case TLS_DTPREL:
 		return 1;
 	  default :
-		abort();
+		gcc_unreachable ();
 	}
 	default:
 	  break;
@@ -2624,7 +2629,7 @@ print_operand_address (FILE * file, rtx addr)
 		fputs ("@TLSDTPREL", file);
 		break;
 	  default :
-		abort();
+		gcc_unreachable ();
 		break;
 	}
 	}
@@ -3413,7 +3418,7 @@ microblaze_expand_move (machine_mode mode, rtx operands[])
 }
   if (GET_CODE (op1) == PLUS && GET_CODE (XEXP (op1,1)) == CONST)
 {
-  rtx p0, p1, result, temp;
+  rtx p0, p1 = NULL, result, temp;
 
   p0 = XEXP (XEXP (op1,1), 0);
 
@@ -3423,6 +3428,10 @@ microblaze_expand_move (machine_mode mode, rtx operands[])
 	  p0 = XEXP (p0, 0);
 	}
 
+  /* This should never happen.  */
+  if (p1 == NULL)
+	gcc_unreachable ();
+
   if (GET_CODE (p0) == UNSPEC && GET_CODE (p1) == CONST_INT
 	  && flag_pic && TARGET_PIC_DATA_TEXT_REL)
 	{
@@ -3799,7 +3808,7 @@ get_branch_target (rtx branch)
   if (GET_CODE (call) == SET)
 call = SET_SRC (call);
   if (GET_CODE (call) != CALL)
-abort ();
+	gcc_unreachable ();
   return XEXP (XEXP (call, 0), 0);
 }
 
-- 
2.31.1

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Andrew Pinski via Gcc-patches

On Thu, Oct 20, 2022 at 10:28 AM Segher Boessenkool
 wrote:
>
> On Thu, Oct 20, 2022 at 01:44:15AM +, Jiang, Haochen wrote:
> > Maybe the testcase change cause some misunderstanding and concern.
> >
> > Actually, the patch did not disrupt the previous builtins, as the
> > builtin_prefetch uses vargs. I set the default value of the new parameter
> > as data prefetch, which means that if we are not using the fourth
> > parameter, just like how we use prefetch previously, it is still what it is.
>
> I still think it is a mistake to have one builtin do two very distinct
> operations, only very superficially related.  Instruction fetch and data
> demand loads are almost entirely unrelated, and so is the prefetch
> machinery for them, on all machines I am familiar with.

On aarch64 (armv8), it is actually the same instruction: PRFM. It
might be the only architecture that works that way, though.
PRFM even allows specifying the cache level for instruction prefetch
(which is actually useful for, say, OcteonTX2, which has an interesting
cache hierarchy).

Though I agree it is a mistake to have one builtin which handles both
data and instruction prefetch.

Thanks,
Andrew


> Which makes
> sense anyway, since instruction prefetch and data prefetch have
> completely different performance characteristics and considerations.
> Maybe if you start with the mistake of having unified L1 caches it
> seems natural, but thankfully most machines do not do that.
>
>
> Segher

[OG12] libgomp.c-c++-common/requires-4.c: dg-xfail-run-if for USM with -foffload-memory=

2022-10-20 Thread Tobias Burnus


Follow up to the mainline commit (https://gcc.gnu.org/r13-3407 + backported to 
OG12):
"libgomp: Add offload_device_gcn check, add requires-4a.c test"

This xfails requires-4.c on pseudo-USM systems.

As mentioned in the email for that patch, OG12's unified-shared-memory
implementation is for pseudo-USM systems, where only specially allocated
memory (managed, pinned) is device accessible. Thus, requires-4.c
failed as it used static memory. (requires-4a.c works as it uses
heap-allocated memory.)

Tobias

PS: For USM in mainline, see patch submission at 
https://gcc.gnu.org/pipermail/gcc-patches/2022-July/597976.html
-
Siemens Electronic Design Automation GmbH; Address: Arnulfstraße 201, 80634 
München; limited liability company; Managing Directors: Thomas 
Heurung, Frank Thürauf; Registered office: München; Registry court: 
München, HRB 106955
commit 0c47ae1c9283a812f832e80e451bfa82519c21e8
Author: Tobias Burnus 
Date:   Thu Oct 20 13:25:25 2022 +0200

libgomp.c-c++-common/requires-4.c: dg-xfail-run-if for USM with -foffload-memory=

The USM implementation uses -foffload-memory=... which allocates variables
in a special memory. This does not support static variables. Hence, XFAIL
this test on nvptx/gcn. The requires-4a.c testcase tests the same but uses
heap memory instead.

libgomp/
* testsuite/libgomp.c-c++-common/requires-4.c: dg-xfail-run-if on
nvptx and gcn.
---
 libgomp/testsuite/libgomp.c-c++-common/requires-4.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/libgomp/testsuite/libgomp.c-c++-common/requires-4.c b/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
index 5883eff0d93..c6b28d5442f 100644
--- a/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
+++ b/libgomp/testsuite/libgomp.c-c++-common/requires-4.c
@@ -2,6 +2,8 @@
 /* { dg-additional-options "-foffload-options=nvptx-none=-misa=sm_35" { target { offload_target_nvptx } } } */
 /* { dg-additional-sources requires-4-aux.c } */
 
+/* { dg-xfail-run-if "USM via -foffload-memory=... does not support static variables" { offload_device_nvptx || offload_device_gcn } } */
+
 /* Check no diagnostic by device-compiler's or host compiler's lto1.
Other file uses: 'requires reverse_offload', but that's inactive as
there are no declare target directives, device constructs nor device routines  */

[OG12] omp-oacc-kernels-decompose.cc: fix -fcompare-debug with GIMPLE_DEBUG

2022-10-20 Thread Tobias Burnus


Given that omp-oacc-kernels-decompose.cc only exists on OG12, the fix
only applies to OG12.

The failure has shown up since "Kernels loops annotation: C and C++.", as that
commit adds GIMPLE_DEBUG statements, which are not handled in
omp-oacc-kernels-decompose.cc at all. (Actually, compilation even fails with
a sorry when compiling with -g2; however, -fcompare-debug is supported and
was failing.) For details, see the patch.

Tobias
commit 807b755357c4eb03260d229f4a851009fe058e51
Author: Tobias Burnus 
Date:   Thu Oct 20 19:20:36 2022 +0200

omp-oacc-kernels-decompose.cc: fix -fcompare-debug with GIMPLE_DEBUG

GIMPLE_DEBUG statements were put in a parallel region of their own, which
is not only pointless but also breaks -fcompare-debug. With this commit,
they are handled like simple assignments: they are placed into the same
body as the loop, so that only one parallel region remains, just as
without debugging. This fixes the existing testcase
libgomp.oacc-c-c++-common/kernels-loop-g.c.

Note: GIMPLE_DEBUG are only accepted with -fcompare-debug; if they
appear otherwise, decompose_kernels_region_body rejects them with
a sorry (unchanged).

gcc/
* omp-oacc-kernels-decompose.cc (top_level_omp_for_in_stmt,
decompose_kernels_region_body): Handle GIMPLE_DEBUG like
simple assignment.
---
 gcc/omp-oacc-kernels-decompose.cc | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/gcc/omp-oacc-kernels-decompose.cc b/gcc/omp-oacc-kernels-decompose.cc
index 4e940c1ee0f..a7e3d764d52 100644
--- a/gcc/omp-oacc-kernels-decompose.cc
+++ b/gcc/omp-oacc-kernels-decompose.cc
@@ -120,7 +120,8 @@ top_level_omp_for_in_stmt (gimple *stmt)
 	  for (gsi = gsi_start (body); !gsi_end_p (gsi); gsi_next (&gsi))
 	{
 	  gimple *body_stmt = gsi_stmt (gsi);
-	  if (gimple_code (body_stmt) == GIMPLE_ASSIGN)
+	  if (gimple_code (body_stmt) == GIMPLE_ASSIGN
+		  || gimple_code (body_stmt) == GIMPLE_DEBUG)
 		continue;
 	  else if (gimple_code (body_stmt) == GIMPLE_OMP_FOR
 		   && gsi_one_before_end_p (gsi))
@@ -1398,7 +1399,7 @@ decompose_kernels_region_body (gimple *kernels_region, tree kernels_clauses)
 	= (gimple_code (stmt) == GIMPLE_ASSIGN
 	   && TREE_CODE (gimple_assign_lhs (stmt)) == VAR_DECL
 	   && DECL_ARTIFICIAL (gimple_assign_lhs (stmt)));
-	  if (!is_simple_assignment)
+	  if (!is_simple_assignment && gimple_code (stmt) != GIMPLE_DEBUG)
 	only_simple_assignments = false;
 	}
 }
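
The effect of the two hunks above can be sketched with a toy model. This is only an illustration under simplifying assumptions: `StmtCode`, `only_simple_statements` and `strip_debug` are hypothetical names, not GCC's gimple API, and the real check also requires an artificial VAR_DECL on the left-hand side. The point is that skipping debug statements makes the decision identical with and without debug info, which is what -fcompare-debug needs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for GCC's gimple statement codes; not the real enum.
enum class StmtCode { Assign, Debug, OmpFor, Call };

// Returns true when every statement in the body may stay in the same region
// as the loop.  Debug statements are skipped exactly like simple assignments,
// so their presence cannot change the result.
bool only_simple_statements (const std::vector<StmtCode> &body)
{
  for (StmtCode code : body)
    if (code != StmtCode::Assign && code != StmtCode::Debug)
      return false;
  return true;
}

// Drops all debug statements, modeling a compilation without debug info.
std::vector<StmtCode> strip_debug (const std::vector<StmtCode> &body)
{
  std::vector<StmtCode> out;
  for (StmtCode code : body)
    if (code != StmtCode::Debug)
      out.push_back (code);
  return out;
}
```

For any body, `only_simple_statements (body)` and `only_simple_statements (strip_debug (body))` agree in this model.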

Re: [PATCH v4] testsuite: Sanitize fails for SP FPU on Arm

2022-10-20 Thread Joseph Myers

On Wed, 19 Oct 2022, Torbjörn SVENSSON via Gcc-patches wrote:

> This patch stops reporting fails for Arm targets with single
> precision floating point unit for types wider than 32 bits (the width
> of float on arm-none-eabi).
> 
> As reported in PR102017, fenv is reported as supported in recent
> versions of newlib. At the same time, for some Arm targets, the
> implementation in libgcc does not support exceptions and thus, the
> test fails with a call to abort().

This patch is OK.

-- 
Joseph S. Myers
jos...@codesourcery.com

Re: [PATCH RESEND 1/1] p1689r5: initial support

2022-10-20 Thread Jason Merrill via Gcc-patches


On 10/20/22 13:31, Ben Boeckel wrote:
> On Thu, Oct 20, 2022 at 11:39:25 -0400, Jason Merrill wrote:
>> Oops, I was thinking this was in gcc as well.  In libcpp there's
>> _cpp_valid_utf8 (which calls one_utf8_to_cppchar).
>
> This routine has a lot more logic (including UCN decoding) and the
> `one_utf8_to_cppchar` also supports out-of-bounds codepoints above
> `0x10`.

The latter seems like a bug to be fixed; presumably it hasn't been
updated since the range of codepoints was restricted.  This sort of
thing is why I'd like to minimize the number of separate implementations
of UTF-8 parsing.


Jason
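
The range restriction discussed above can be sketched as follows. This is a minimal standalone decoder, not libcpp's `one_utf8_to_cppchar` or `_cpp_valid_utf8`; the name `decode_utf8` and its return convention are assumptions made for illustration. It rejects overlong forms, surrogates, and anything above U+10FFFF, the upper bound of the Unicode code space.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of libcpp.
// Decodes one UTF-8 sequence from s[0..len).  On success stores the code
// point in *out and returns the number of bytes consumed; returns 0 when
// the input is truncated or invalid.
std::size_t decode_utf8 (const unsigned char *s, std::size_t len,
                         std::uint32_t *out)
{
  if (len == 0)
    return 0;

  unsigned char b0 = s[0];
  if (b0 < 0x80)
    {
      *out = b0;  // plain ASCII
      return 1;
    }

  std::size_t n;       // total sequence length
  std::uint32_t cp;    // code point accumulated so far
  std::uint32_t min;   // smallest value allowed for this length
  if ((b0 & 0xE0) == 0xC0)      { n = 2; cp = b0 & 0x1F; min = 0x80; }
  else if ((b0 & 0xF0) == 0xE0) { n = 3; cp = b0 & 0x0F; min = 0x800; }
  else if ((b0 & 0xF8) == 0xF0) { n = 4; cp = b0 & 0x07; min = 0x10000; }
  else
    return 0;  // stray continuation byte or obsolete 5/6-byte lead

  if (len < n)
    return 0;  // truncated sequence

  for (std::size_t i = 1; i < n; ++i)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;  // not a continuation byte
      cp = (cp << 6) | (s[i] & 0x3F);
    }

  if (cp < min)
    return 0;  // overlong encoding
  if (cp >= 0xD800 && cp <= 0xDFFF)
    return 0;  // UTF-16 surrogate
  if (cp > 0x10FFFF)
    return 0;  // beyond the Unicode code space

  *out = cp;
  return n;
}
```

Keeping a check like the last one in a single shared routine is one way to avoid several parsers drifting apart.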

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool

On Thu, Oct 20, 2022 at 11:12:01AM +0800, Hongtao Liu wrote:
> On Thu, Oct 20, 2022 at 9:39 AM Hongtao Liu  wrote:
> > On Thu, Oct 20, 2022 at 5:08 AM Segher Boessenkool
> > > Please use a separate pattern for this, and leave prefetch to mean data
> > > prefetch, as documented!  Documentation you didn't change btw.  Call the
> > > new one instruction_prefetch or something equally boring maybe :-)
> Yes, Maybe we should add new rtl def named "iprefetch", so there will
> be no need to change other backend.

Spelling it out ("instruction_prefetch") is nicer.  This is not used
so often to warrant a cryptic short name.  But yes, that is what I am
asking for.


Segher

Re: [PATCH 1/2] Add a parameter for the builtin function of prefetch to align with LLVM

2022-10-20 Thread Segher Boessenkool

On Thu, Oct 20, 2022 at 07:34:13AM +, Jiang, Haochen wrote:
> > > +  /* Argument 3 must be either zero or one.  */
> > > +  if (INTVAL (op3) != 0 && INTVAL (op3) != 1)
> > > +{
> > > +  warning (0, "invalid fourth argument to %<__builtin_prefetch%>;"
> > > + " using one");
> > 
> > "using 1" makes sense maybe, but "using one" reads as "using an
> > argument", not very sane.
> > 
> > An error would be better here anyway?
> 
> Will change to 1 to avoid confusion in that. The reason why this is a warning
> is because previous ones related to constant arguments out of range in
> prefetch are also using warning.

Please don't repeat historical mistakes.  You might not want to fix the
existing code (since that can in theory break existing user code), but
that is not a reason to punish users of a new feature as well ;-)

> > Please use a separate pattern for this, and leave prefetch to mean data
> > prefetch, as documented!  Documentation you didn't change btw.  Call the
> > new one instruction_prefetch or something equally boring maybe :-)
> 
> Actually I changed documentation for prefetch but it is flooded in the patch
> (Sorry for that).

Oh huh, I looked for it but didn't find it.  Another argument for making
better patch series ;-)

> 1. Previously we are using parameter to indicate r/w and locality in
> prefetch. I suppose it is quite similar in this case. Since the pattern is
> already there, I prefer reusing them.

You can use the data prefetch RTL code for all data loads just as well,
it is more closely related than this -- but most people would call that
insanity!


Segher
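
For reference, the data prefetch that the thread wants to keep distinct from instruction prefetch looks like this from user code. The sketch uses only the documented three-argument form of `__builtin_prefetch` (address, read/write flag 0 or 1, locality 0 to 3) and assumes GCC or a compatible compiler; the function name and the prefetch distance of 16 elements are arbitrary choices for illustration. The fourth argument proposed in the patch is deliberately not used.

```cpp
#include <cassert>
#include <cstddef>

// Sums an array while hinting upcoming elements into the data cache.
// rw = 0 requests a prefetch for reading; locality = 3 asks to keep the
// line in all cache levels.  Both must be compile-time constants.
long sum_with_prefetch (const long *data, std::size_t n)
{
  const std::size_t ahead = 16;  // assumed prefetch distance, not tuned
  long total = 0;
  for (std::size_t i = 0; i < n; ++i)
    {
      if (i + ahead < n)
        __builtin_prefetch (&data[i + ahead], 0, 3);
      total += data[i];
    }
  return total;
}
```

The prefetch is only a hint: the result is the same whether or not the target emits a prefetch instruction.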

Re: [COMMITTED] Replace finite_operands_p with maybe_isnan.

2022-10-20 Thread Mikael Morin


Hello,

On 20/10/2022 at 16:13, Aldy Hernandez via Gcc-patches wrote:
> The finite_operands_p function was incorrectly named, as it only
> returned TRUE when !NAN.  This was leftover from the initial
> implementation of frange.  Using the maybe_isnan() nomenclature is
> more consistent and easier to understand.

(...)

> diff --git a/gcc/range-op-float.cc b/gcc/range-op-float.cc
> index 0605a908684..2a4a99ba467 100644
> --- a/gcc/range-op-float.cc
> +++ b/gcc/range-op-float.cc

(...)

> @@ -441,7 +436,7 @@ foperator_equal::op1_range (frange &r, tree type,
> // If the result is false, the only time we know anything is
> // if OP2 is a constant.
> else if (op2.singleton_p ()
> -  || (finite_operand_p (op2) && op2.zero_p ()))
> +  || (!op2.maybe_isnan () && op2.zero_p ()))
> {
>   REAL_VALUE_TYPE tmp = op2.lower_bound ();
>   r.set (type, tmp, tmp, VR_ANTI_RANGE);


Doesn't this miss a check of flag_finite_math_only to be strictly 
equivalent?  You keep that check for the two-arguments case, so I guess 
it's not redundant?

Re: [COMMITTED] Replace finite_operands_p with maybe_isnan.

2022-10-20 Thread Mikael Morin


On 20/10/2022 at 20:56, Mikael Morin wrote:
> Doesn't this miss a check of flag_finite_math_only to be strictly
> equivalent?  You keep that check for the two-arguments case, so I guess
> it's not redundant?

Well, the check is removed in the follow-up patch, so maybe it is after all.
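
Why the NaN check guards the zero case at all can be seen from plain IEEE comparison semantics. The snippet below is a toy illustration, not GCC's frange API; `eq` and `false_result_excludes` are hypothetical names, and it assumes default floating-point semantics (no -ffinite-math-only or -ffast-math).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Ordinary IEEE equality: false whenever either operand is a NaN,
// true for +0.0 compared with -0.0.
bool eq (double a, double b)
{
  return a == b;
}

// Toy model of the reasoning: knowing that "x == c" was false lets us
// exclude c from x's range only if c cannot be a NaN, because a
// comparison against NaN is false for every x.
bool false_result_excludes (double c)
{
  return !std::isnan (c);
}
```

So a zero operand of either sign is fine to exclude, but only once a possible NaN has been ruled out.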

Re: [PATCH] Fortran: error recovery with references of bad array constructors [PR105633]

2022-10-20 Thread Mikael Morin


On 19/10/2022 at 22:49, Harald Anlauf via Fortran wrote:
> Dear Fortranners,
>
> here's another patch that improves error recovery with references
> of bad array constructors leading to an ICE after a NULL pointer
> dereference.
>
> Original patch by Steve, which I amended with a logic cleanup.
>
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?

Yes, thanks.
