date:20240206

[PATCH] Arm: Fix incorrect tailcall-generation for indirect calls [PR113780]

2024-02-06 Thread Tejas Belagod

This patch fixes a bug that causes indirect calls in PAC-enabled functions
to be tailcalled incorrectly when all argument registers R0-R3 are used.

Tested on arm-none-eabi for armv8.1-m.main. OK for trunk?

2024-02-07  Tejas Belagod  

PR target/113780
* gcc/config/arm/arm.cc (arm_function_ok_for_sibcall): Don't allow tailcalls
for indirect calls with 4 or more arguments in PAC-enabled functions.

* gcc.target/arm/pac-sibcall.c: New.
---
 gcc/config/arm/arm.cc  | 12 
 gcc/testsuite/gcc.target/arm/pac-sibcall.c | 11 +++
 2 files changed, 19 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-sibcall.c

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index c44047c377a..c1f8286a4d4 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -7980,10 +7980,14 @@ arm_function_ok_for_sibcall (tree decl, tree exp)
   && DECL_WEAK (decl))
 return false;
 
-  /* We cannot do a tailcall for an indirect call by descriptor if all the
- argument registers are used because the only register left to load the
- address is IP and it will already contain the static chain.  */
-  if (!decl && CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
+  /* We cannot do a tailcall for an indirect call by descriptor or for an
+ indirect call in a pac-enabled function if all the argument registers
+ are used because the only register left to load the address is IP and
+ it will already contain the static chain or the PAC signature in the
+ case of PAC-enabled functions.  */
+  if (!decl
+  && ((CALL_EXPR_BY_DESCRIPTOR (exp) && !flag_trampolines)
+ || arm_current_function_pac_enabled_p()))
 {
   tree fntype = TREE_TYPE (TREE_TYPE (CALL_EXPR_FN (exp)));
   CUMULATIVE_ARGS cum;
diff --git a/gcc/testsuite/gcc.target/arm/pac-sibcall.c b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
new file mode 100644
index 00000000000..c57bf7a952c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/pac-sibcall.c
@@ -0,0 +1,11 @@
+/* Testing return address signing.  */
+/* { dg-do compile } */
+/* { dg-require-effective-target mbranch_protection_ok } */
+/* { dg-options " -mcpu=cortex-m85 -mbranch-protection=pac-ret+leaf -O2" } */
+
+void fail(void (*f)(int, int, int, int))
+{
+  f(1, 2, 3, 4);
+}
+
+/* { dg-final { scan-assembler-not "bx\tip\t@ indirect register sibling call" } } */
-- 
2.25.1

Re: [PATCH] wide-int: Fix mul_internal overflow handling [PR113753]

2024-02-06 Thread Richard Biener

On Tue, 6 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> As the following testcases show, the overflow (needs_overflow) and high
> handling in wi::mul_internal seem to only work correctly for either
> small precisions (less than or equal to 32, that is handled by earlier
> simpler code, not the full Knuth's multiplication algorithm) or for
> precisions which are multiple of HOST_BITS_PER_WIDE_INT, so it happened
> to work fine in most pre-_BitInt era types (and for large bitfields we
> probably didn't have good coverage or were lucky and nothing was asking
> if there was overflow or not; I think high multiplication is typically
> used only when we have optab in corresponding mode).
> E.g. on the gcc.dg/bitint-87.c testcase, there were overflow warnings
> emitted only in the / 2wb * 3wb _BitInt(128) cases, but not in the
> other precisions.
> 
> I found 3 issues when prec > HOST_BITS_PER_HALF_WIDE_INT and
> (prec % HOST_BITS_PER_WIDE_INT) != 0:
> 1) the checking for overflow was simply checking the values of the
>    r array members from half_blocks_needed to 2 * half_blocks_needed - 1,
>    for UNSIGNED overflow checking if any of them is non-zero, for
>    SIGNED comparing them if any is different from top where top is computed
>    from the sign bit of the result (see below); similarly, the highpart
>    multiplication simply picks the half limbs at r + half_blocks_needed
>    offset; and furthermore, for SIGNED there is an adjustment if either
>    operand was negative which also just walks r half-limbs from
>    half_blocks_needed onwards;
>    this works great for precisions like 64 or 128, but for precisions like
>    129, 159, 160 or 161 doesn't, it would need to walk the bits in the
>    half-limbs starting right above the most significant bit of the base
>    precision; that can be up to a whole half-limb and all but one bit from
>    the one below it earlier
> 2) as the comment says, the multiplication is originally done as unsigned
>    multiplication, with adjustment of the high bits which subtracts the
>    other operand once:
>       if (wi::neg_p (op1))
>         {
>           b = 0;
>           for (i = 0; i < half_blocks_needed; i++)
>             {
>               t = (unsigned HOST_WIDE_INT)r[i + half_blocks_needed]
>                 - (unsigned HOST_WIDE_INT)v[i] - b;
>               r[i + half_blocks_needed] = t & HALF_INT_MASK;
>               b = t >> (HOST_BITS_PER_WIDE_INT - 1);
>             }
>         }
>    and similarly for the other one.  Now, this also only works nicely if
>    a negative operand has just a single sign bit set in the given precision;
>    but we were unpacking the operands with wi_unpack (..., SIGNED);, so
>    say for the negative operand in 129-bit precision, that means the least
>    significant bit of u[half_blocks_needed - 2] (or v instead of u depending
>    on which argument it is) is the set sign bit, but then there are 31
>    further copies of the sign bit in u[half_blocks_needed - 2] and
>    further 32 copies in u[half_blocks_needed - 1]; the above adjustment
>    for signed operands doesn't really do the right job in such cases, it
>    would need to subtract many more times the other operand
> 3) the computation of top for SIGNED
>       top = r[(half_blocks_needed) - 1];
>       top = SIGN_MASK (top << (HOST_BITS_PER_WIDE_INT / 2));
>       top &= mask;
>    also uses the most significant bit which fits into prec of the result
>    only if prec is multiple of HOST_BITS_PER_WIDE_INT, otherwise we need
>    to look at a different bit and sometimes it can be also a bit in
>    r[half_blocks_needed - 2]
> 
> For 1), while for UNSIGNED overflow it could be fairly easy to check
> the bits above prec in r half-limbs for being non-zero, doing all the
> shifts also in the SIGNED adjustment code in 2 further locations and finally
> for the high handling (unless we want to assert one doesn't do the highpart
> multiply for such precisions) would be quite ugly and hard to maintain, so
> I instead chose (implemented in the second hunk) to shift the
> beyond-precision bits up such that the expectations of the rest of the
> code are met, that is the LSB of r[half_blocks_needed] after adjustment
> is the bit immediately above the precision, etc.  We don't need to care
> about the bits it shifts out, because the multiplication will yield at most
> 2 * prec bits.
> 
> For 2), the patch changes the wi_unpack argument from SIGNED to UNSIGNED,
> so that we get all zero bits above the precision.
> 
> And finally for 3) it does shifts and perhaps picks lower r half-limb so
> that it uses the actual MSB of the result within prec.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, and additionally
> tested with
> make check-gcc -k -j32 GCC_TEST_RUN_EXPENSIVE=1 RUNTESTFLAGS="GCC_TEST_RUN_EXPENSIVE=1 dg-torture.exp=*bitint*"
> Ok for trunk?

OK.

Thanks,
Richard.

> 2024-02-06  Jakub Jelinek  
> 
>   PR tree-optimization/113753
>   * wide-int

Re: Repost [PATCH 4/6] PowerPC: Make MMA insns support DMR registers.

2024-02-06 Thread Michael Meissner

On Sun, Feb 04, 2024 at 11:21:49AM +0800, Kewen.Lin wrote:
> Hi Mike,
> 
> > --- a/gcc/config/rs6000/mma.md
> > +++ b/gcc/config/rs6000/mma.md
> > @@ -559,190 +559,249 @@ (define_insn "*mma_disassemble_acc_dm"
> >"dmxxextfdmr256 %0,%1,2"
> >[(set_attr "type" "mma")])
> >  
> > -(define_insn "mma_"
> > +;; MMA instructions that do not use their accumulators as an input, still must
> > +;; not allow their vector operands to overlap the registers used by the
> > +;; accumulator.  We enforce this by marking the output as early clobber.  If we
> > +;; have dense math, we don't need the whole prime/de-prime action, so just make
> > +;; thse instructions be NOPs.
> 
> typo: thse.

Ok.

> > +
> > +(define_expand "mma_"
> > +  [(set (match_operand:XO 0 "register_operand")
> > +   (unspec:XO [(match_operand:XO 1 "register_operand")]
> 
> s/register_operand/accumulator_operand/?

Ok.

> > +  MMA_ACC))]
> > +  "TARGET_MMA"
> > +{
> > +  if (TARGET_DENSE_MATH)
> > +{
> > +  if (!rtx_equal_p (operands[0], operands[1]))
> > +   emit_move_insn (operands[0], operands[1]);
> > +  DONE;
> > +}
> > +
> > +  /* Generate the prime/de-prime code.  */
> > +})
> > +
> > +(define_insn "*mma_"
> 
> May be better to name with "*mma__nodm"?

Ok.

> >[(set (match_operand:XO 0 "fpr_reg_operand" "=&d")
> > (unspec:XO [(match_operand:XO 1 "fpr_reg_operand" "0")]
> > MMA_ACC))]
> > -  "TARGET_MMA"
> > +  "TARGET_MMA && !TARGET_DENSE_MATH"
> 
> I found that "TARGET_MMA && !TARGET_DENSE_MATH" is used much (like changes in
> function rs6000_split_multireg_move in this patch and some places in previous
> patches), maybe we can introduce a macro named as TARGET_MMA_NODM short for it?

As I said in the message about the last patch, I added
TARGET_MMA_NO_DENSE_MATH.

> >" %A0"
> >[(set_attr "type" "mma")])
> >  
> >  ;; We can't have integer constants in XOmode so we wrap this in an
> > -;; UNSPEC_VOLATILE.
> > +;; UNSPEC_VOLATILE for the non-dense math case.  For dense math, we don't need
> > +;; to disable optimization and we can do a normal UNSPEC.
> >  
> > -(define_insn "mma_xxsetaccz"
> > -  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
> > +(define_expand "mma_xxsetaccz"
> > +  [(set (match_operand:XO 0 "register_operand")
> 
> s/register_operand/accumulator_operand/?

Ok.

> > (unspec_volatile:XO [(const_int 0)]
> > UNSPECV_MMA_XXSETACCZ))]
> >"TARGET_MMA"
> > +{
> > +  if (TARGET_DENSE_MATH)
> > +{
> > +  emit_insn (gen_mma_xxsetaccz_dm (operands[0]));
> > +  DONE;
> > +}
> > +})
> > +
> > +(define_insn "*mma_xxsetaccz_vsx"
> 
> s/vsx/nodm/

Ok.

> > +  [(set (match_operand:XO 0 "fpr_reg_operand" "=d")
> > +   (unspec_volatile:XO [(const_int 0)]
> > +   UNSPECV_MMA_XXSETACCZ))]
> > +  "TARGET_MMA && !TARGET_DENSE_MATH"
> >"xxsetaccz %A0"
> >[(set_attr "type" "mma")])
> >  
> > +
> > +(define_insn "mma_xxsetaccz_dm"
> > +  [(set (match_operand:XO 0 "dmr_operand" "=wD")
> > +   (unspec:XO [(const_int 0)]
> > +  UNSPECV_MMA_XXSETACCZ))]
> > +  "TARGET_DENSE_MATH"
> > +  "dmsetdmrz %0"
> > +  [(set_attr "type" "mma")])
> > +
> >  (define_insn "mma_"
> > -  [(set (match_operand:XO 0 "fpr_reg_operand" "=&d,&d")
> > -   (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "v,?wa")
> > -   (match_operand:V16QI 2 "vsx_register_operand" "v,?wa")]
> > +  [(set (match_operand:XO 0 "accumulator_operand" "=wD,&d,&d")
> > +   (unspec:XO [(match_operand:V16QI 1 "vsx_register_operand" "wa,v,?wa")
> > +   (match_operand:V16QI 2 "vsx_register_operand" "wa,v,?wa")]
> > MMA_VV))]
> >"TARGET_MMA"
> >" %A0,%x1,%x2"
> > -  [(set_attr "type" "mma")])
> > +  [(set_attr "type" "mma")
> > +   (set_attr "isa" "dm,not_dm,not_dm")])
> 
> Like what's suggested in previous patches, s/not_dm/nodm/

Ok.

-- 
Michael Meissner, IBM
PO Box 98, Ayer, Massachusetts, USA, 01432
email: meiss...@linux.ibm.com

[PATCH RFA] build: drop target libs from LD_LIBRARY_PATH [PR105688]

2024-02-06 Thread Jason Merrill

Tested x86_64-pc-linux-gnu.  Any thoughts?

-- 8< --

The patch for PR22340 (r104978) moved the adding of TARGET_LIB_PATH to
RPATH_ENVVAR from POSTSTAGE1_HOST_EXPORTS to HOST_EXPORTS, but didn't
mention that in the ChangeLog; it also wasn't part of the patch that was
sent to gcc-patches.  I suspect it was included accidentally?

It also causes PR105688 when rebuilding stage1: once the stage1 libstdc++
has been built, if calling the system gcc to build host code involves
invoking any tool that links against libstdc++.so (gold, ccache) they get
the just-built library instead of the system library they expect.
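The ordering problem can be sketched as follows. This is only a model of what the `echo ... | sed` lines in HOST_EXPORTS appear to do (prepend a directory list, collapse repeated colons, strip leading and trailing ones); the function names and the example directories are hypothetical and not taken from the Makefile.

```python
import re


def normalize_search_path(value):
    """Mimic sed 's,::*,:,g;s,^:*,,;s,:*$,,' on a colon-separated list."""
    value = re.sub(r":+", ":", value)  # s,::*,:,g  collapse runs of colons
    return value.strip(":")            # s,^:*,,  and  s,:*$,,


def build_rpath_envvar(inherited, host_lib_path, target_lib_path=None):
    """Model HOST_EXPORTS: optionally prepend the target libs, then the host libs."""
    value = inherited
    if target_lib_path is not None:
        # The hunk under discussion: just-built target libraries go first.
        value = normalize_search_path(target_lib_path + value)
    return normalize_search_path(host_lib_path + value)


# Hypothetical directories, for illustration only.
system = "/usr/lib64"
target = "/build/x86_64-pc-linux-gnu/libstdc++-v3/src/.libs:"

with_target = build_rpath_envvar(system, "", target)
without_target = build_rpath_envvar(system, "", None)

print(with_target.split(":"))     # the just-built libstdc++ directory is searched first
print(without_target.split(":"))  # only the inherited system path remains
```

With the target directory first in the list, any host tool that loads libstdc++.so through the dynamic linker would pick up the freshly built copy, which matches the failure described above.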

Reverting that hunk of the change fixed my problem with bubblestrapping GCC
12 with ccache on a host with a newer system libstdc++.

But I believe that adding TARGET_LIB_PATH to RPATH_ENVVAR is not needed for
post-stage1 either, at this point.  Including TARGET_LIB_PATH goes back to
r37545, with the stated rationale of getting other C++ library configury to
succeed, but it looks to me like that is no longer necessary.

So I propose to stop adding target libraries to LD_LIBRARY_PATH; see
https://gcc.gnu.org/legacy-ml/gcc/2012-06/msg00325.html for a previous
proposal by Ian to make this change.

I have tried and failed to test this on a system without system libstdc++;
bootstrap on cfarm220 and cfarm240 failed for unrelated reasons.

PR bootstrap/105688

ChangeLog:

* Makefile.tpl (HOST_EXPORTS): Don't add TARGET_LIB_PATH to
RPATH_ENVVAR.
* Makefile.in: Regenerate.
---
 Makefile.in  | 3 ---
 Makefile.tpl | 3 ---
 2 files changed, 6 deletions(-)

diff --git a/Makefile.in b/Makefile.in
index edb0c8a9a42..c2843d5df89 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -242,9 +242,6 @@ HOST_EXPORTS = \
ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
XGCC_FLAGS_FOR_TARGET="$(XGCC_FLAGS_FOR_TARGET)"; export XGCC_FLAGS_FOR_TARGET; \
-@if gcc-bootstrap
-   $(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
-@endif gcc-bootstrap
$(RPATH_ENVVAR)=`echo "$(HOST_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR);
 
 POSTSTAGE1_CXX_EXPORT = \
diff --git a/Makefile.tpl b/Makefile.tpl
index adbcbdd1d57..cb39fbd0434 100644
--- a/Makefile.tpl
+++ b/Makefile.tpl
@@ -245,9 +245,6 @@ HOST_EXPORTS = \
ISLLIBS="$(HOST_ISLLIBS)"; export ISLLIBS; \
ISLINC="$(HOST_ISLINC)"; export ISLINC; \
XGCC_FLAGS_FOR_TARGET="$(XGCC_FLAGS_FOR_TARGET)"; export XGCC_FLAGS_FOR_TARGET; \
-@if gcc-bootstrap
-   $(RPATH_ENVVAR)=`echo "$(TARGET_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR); \
-@endif gcc-bootstrap
$(RPATH_ENVVAR)=`echo "$(HOST_LIB_PATH)$$$(RPATH_ENVVAR)" | sed 's,::*,:,g;s,^:*,,;s,:*$$,,'`; export $(RPATH_ENVVAR);
 
 POSTSTAGE1_CXX_EXPORT = \

base-commit: c5d34912ad576be1ef19be92f7eabde54b9089eb
-- 
2.43.0

Re: [PATCH] c++: Don't ICE for unknown parameter to constexpr'd switch-statement, PR113545

2024-02-06 Thread Hans-Peter Nilsson

> Date: Mon, 22 Jan 2024 14:33:59 -0500
> From: Marek Polacek 

> On Mon, Jan 22, 2024 at 06:02:32PM +0100, Hans-Peter Nilsson wrote:
> > I don't really know whether this is the right way to treat
> > CONVERT_EXPR as below, but...  Regtested native
> > x86_64-linux-gnu.  Ok to commit?
> 
> Thanks for taking a look at this problem.

Thanks for the initial review.

>  
> > brgds, H-P
> > 
> > -- >8 --
> > That gcc_unreachable at the default-label seems to be over
> > the top.  It seems more correct to just say "that's not
> > constant" to whatever's not known (to be constant), when
> > looking for matches in switch-statements.
> 
> Unfortunately this doesn't seem correct to me; I don't think we
> should have gotten that far.  It appears that we lose track of
> the reinterpret_cast, which is not allowed in a constant expression:
> .
> 
> cp_convert -> ... -> convert_to_integer_1 gives us a CONVERT_EXPR
> but we only set REINTERPRET_CAST_P on NOP_EXPRs:
> 
>   expr = cp_convert (type, expr, complain);
>   if (TREE_CODE (expr) == NOP_EXPR)
> /* Mark any nop_expr that created as a reintepret_cast.  */
> REINTERPRET_CAST_P (expr) = true;
> 
> so when evaluating baz we get (long unsigned int) &foo, which
> passes verify_constant.
>  
> I don't have a good suggestion yet, sorry.

But, with this patch, we're letting the non-constant case
take the same path and failing for the same reason, albeit
much later than desired, for the switch code as for the
if-chain code.  Isn't that better than the current ICE?

I mean, if there's a risk of accepts-invalid (like, some
non-constant case incorrectly "constexpr'd"), then that risk
is already there, for the if-chain case.

Anyway, this is a bit too late in the release season and
isn't a regression, thus I can't argue for it being a
suitable stop-gap measure...

I'm unassigning myself from the PR as I have no clue how to
fix the actual non-constexpr-operand-seen-too-late bug.

Though, I'm asking again; any clue regarding:

"I briefly considered one of the cpp[0-9a-z]* subdirectories
but found no rule.

Isn't constexpr c++11 and therefore cpp0x isn't a good match
(contrary to the many constexpr tests therein)?

What *is* the actual rule for putting a test in
g++.dg/cpp0x, cpp1x and cpp1y (et al)?
(I STFW but found nothing.)"


> > With this patch, the code generated for the (inlined) call to
> > ifbar equals that to swbar, except for the comparisons being
> > in another order.
> > 
> > gcc/cp:
> > PR c++/113545
> > * constexpr.cc (label_matches): Replace call to_unreachable with
> 
> "to gcc_unreachable"
> 
> > return false.
> 
> More like with "break" but that's not important.
>  
> > gcc/testsuite:

(Deleted -- see separate patch)

> > ---
> >  gcc/cp/constexpr.cc |  3 +-
> >  gcc/testsuite/g++.dg/expr/pr113545.C | 49 +
> >  2 files changed, 51 insertions(+), 1 deletion(-)
> >  create mode 100644 gcc/testsuite/g++.dg/expr/pr113545.C
> > 
> > diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
> > index 6350fe154085..30caf3322fff 100644
> > --- a/gcc/cp/constexpr.cc
> > +++ b/gcc/cp/constexpr.cc
> > @@ -6922,7 +6922,8 @@ label_matches (const constexpr_ctx *ctx, tree *jump_target, tree stmt)
> >break;
> >  
> >  default:
> > -  gcc_unreachable ();
> > +  /* Something else, like CONVERT_EXPR.  Unknown whether it matches.  */
> > +  break;
> >  }
> >return false;
> >  }
> > diff --git a/gcc/testsuite/g++.dg/expr/pr113545.C b/gcc/testsuite/g++.dg/expr/pr113545.C
> > new file mode 100644
> > index 000000000000..914ffdeb8e16

brgds, H-P

Re: Repost [PATCH 3/6] PowerPC: Add support for accumulators in DMR registers.

2024-02-06 Thread Michael Meissner

On Thu, Jan 25, 2024 at 05:28:49PM +0800, Kewen.Lin wrote:
> Hi Mike,
> 
> on 2024/1/6 07:38, Michael Meissner wrote:
> > The MMA subsystem added the notion of accumulator registers as an optional
> > feature of ISA 3.1 (power10).  In ISA 3.1, these accumulators overlapped with
> > the traditional floating point registers 0..31, but logically the accumulator
> > registers were separate from the FPR registers.  In ISA 3.1, it was anticipated
> 
> Using VSX register 0..31 rather than traditional floating point registers 0..31
> seems more clear, since floating point registers imply 64 bit long registers.

Ok.

> > that in future systems, the accumulator registers may not overlap with the FPR
> > registers.  This patch adds the support for dense math registers as separate
> > registers.
> > 
> > This particular patch does not change the MMA support to use the accumulators
> > within the dense math registers.  This patch just adds the basic support for
> > having separate DMRs.  The next patch will switch the MMA support to use the
> > accumulators if -mcpu=future is used.
> > 
> > For testing purposes, I added an undocumented option '-mdense-math' to enable
> > or disable the dense math support.
> 
> Can we avoid this and use one macro for it instead?  As you might have noticed
> that some previous temporary options like -mpower{8,9}-vector cause ICEs due to
> some unexpected combination and we are going to neuter them, so let's try our
> best to avoid it if possible.  I guess one macro TARGET_DENSE_MATH defined by
> TARGET_FUTURE && TARGET_MMA matches all use places? and specifying -mcpu=future
> can enable it while -mcpu=power10 can disable it.

That depends on whether there will be other things added in the future power
that are not in the MMA+ instruction set.

But I can switch to defining TARGET_DENSE_MATH as a test of TARGET_FUTURE and
TARGET_MMA.  That way if/when a new cpu comes out, we will just have to change
the definition of TARGET_DENSE_MATH and not all of the uses.

I will also add TARGET_MMA_NO_DENSE_MATH to handle the existing MMA code for
assemble and disassemble when we don't have dense math instructions.

> > 
> > This patch adds a new constraint (wD).  If MMA is selected but dense math is
> > not selected (i.e. -mcpu=power10), the wD constraint will allow access to
> > accumulators that overlap with the VSX vector registers 0..31.  If both MMA and
> 
> Sorry for nitpicking, it's more accurate with "VSX registers 0..31".

Ok.

> > diff --git a/gcc/config/rs6000/constraints.md b/gcc/config/rs6000/constraints.md
> > index c7bf82b..614e431c085 100644
> > --- a/gcc/config/rs6000/constraints.md
> > +++ b/gcc/config/rs6000/constraints.md
> > @@ -107,6 +107,9 @@ (define_constraint "wB"
> > (match_test "TARGET_P8_VECTOR")
> > (match_operand 0 "s5bit_cint_operand")))
> >  
> > +(define_register_constraint "wD" "rs6000_constraints[RS6000_CONSTRAINT_wD]"
> > +  "Accumulator register.")
> > +
> >  (define_constraint "wE"
> >    "@internal Vector constant that can be loaded with the XXSPLTIB instruction."
> >(match_test "xxspltib_constant_nosplit (op, mode)"))
> > diff --git a/gcc/config/rs6000/mma.md b/gcc/config/rs6000/mma.md
> > index 6a7d8a836db..bb898919ab5 100644
> > --- a/gcc/config/rs6000/mma.md
> > +++ b/gcc/config/rs6000/mma.md
> > @@ -91,6 +91,7 @@ (define_c_enum "unspec"
> > UNSPEC_MMA_XVI8GER4SPP
> > UNSPEC_MMA_XXMFACC
> > UNSPEC_MMA_XXMTACC
> > +   UNSPEC_DM_ASSEMBLE_ACC
> 
> The other UNSPEC.*ASSEMBLE like UNSPECV_MMA_ASSEMBLE don't have _ACC suffix,
> it's better to keep consistent if this suffix doesn't distinguish something.

Ok.

> >])
> >  
> >  (define_c_enum "unspecv"
> > @@ -321,7 +322,9 @@ (define_insn_and_split "*movoo"
> > (set_attr "length" "*,8,*,8,8")
> > (set_attr "isa" "lxvp,*,stxvp,*,*")])
> >  
> > -;; Vector quad support.  XOmode can only live in FPRs.
> > +;; Vector quad support.  Under the original MMA, XOmode can only live in VSX
> > +;; vector registers 0..31.  With dense math, XOmode can live in either VSX
> 
> Nit: s/vector//

Ok.

> > +;; registers (0..63) or DMR registers.
> >  (define_expand "movxo"
> >[(set (match_operand:XO 0 "nonimmediate_operand")
> > (match_operand:XO 1 "input_operand"))]
> > @@ -346,10 +349,10 @@ (define_expand "movxo"
> >  gcc_assert (false);
> >  })
> >  
> > -(define_insn_and_split "*movxo"
> > +(define_insn_and_split "*movxo_nodm"
> >[(set (match_operand:XO 0 "nonimmediate_operand" "=d,ZwO,d")
> > (match_operand:XO 1 "input_operand" "ZwO,d,d"))]
> > -  "TARGET_MMA
> > +  "TARGET_MMA && !TARGET_DENSE_MATH
> > && (gpc_reg_operand (operands[0], XOmode)
> > || gpc_reg_operand (operands[1], XOmode))"
> >"@
> > @@ -366,6 +369,31 @@ (define_insn_and_split "*movxo"
> > (set_attr "length" "*,*,16")
> > (set_attr "max_prefixed_insns" "2,2,*")])
> >  
> > +(define_insn_and_split "*movxo_dm"
> > +  [

Ping*2 PATCH: testcase for "ICE for unknown parameter to constexpr'd switch-statement, PR113545"

2024-02-06 Thread Hans-Peter Nilsson

> From: Hans-Peter Nilsson 
> Date: Tue, 30 Jan 2024 06:18:45 +0100

> Ping for the xfailed testsuite patch below the review
> (actual constexpr.cc patch to be handled separately):

Ping*2.  Again, this is for the xfailed test-case only.

> 
> > From: Hans-Peter Nilsson 
> > Date: Tue, 23 Jan 2024 05:55:00 +0100
> > 
> > > Date: Mon, 22 Jan 2024 14:33:59 -0500
> > > From: Marek Polacek 
> > 
> > > The problem seems to be more about conversion so g++.dg/conversion/reinterpret5.C
> > > or g++.dg/cpp0x/constexpr-reinterpret3.C seems more appropriate.
> > > 
> > > > @@ -0,0 +1,49 @@
> > > 
> > > Please add
> > > 
> > > PR c++/113545
> > 
> > > > +  unsigned const char c = swbar(reinterpret_cast<__UINTPTR_TYPE__>(&foo));
> > > > +  xyzzy(c);
> > > > +  unsigned const char d = ifbar(reinterpret_cast<__UINTPTR_TYPE__>(&foo));
> > > 
> > > I suppose we should also test a C-style cast (which leads to a reinterpret_cast
> > > in this case).
> > > 
> > > Maybe check we get an error when c/d are constexpr (that used to ICE).
> > 
> > Like this?  Not sure about the value of that variant, but here goes.
> > 
> > I checked that these behave as expected (xfail as ICE properly) without the
> > previously posted patch to cp/constexpr.cc and XPASS with it applied.
> > 
> > Ok to commit?
> > 
> > -- >8 --
> > Subject: [PATCH] c++: testcases for PR113545 (constexpr with switch and
> >  passing non-constexpr parameter)
> > 
> > gcc/testsuite:
> > PR c++/113545
> > * g++.dg/cpp0x/constexpr-reinterpret3.C,
> > g++.dg/cpp0x/constexpr-reinterpret4.C: New tests.
> > ---
> >  .../g++.dg/cpp0x/constexpr-reinterpret3.C | 55 +++
> >  .../g++.dg/cpp0x/constexpr-reinterpret4.C | 54 ++
> >  2 files changed, 109 insertions(+)
> >  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret3.C
> >  create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret4.C
> > 
> > diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret3.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret3.C
> > new file mode 100644
> > index 000000000000..319cc5e8bee9
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret3.C
> > @@ -0,0 +1,55 @@
> > +// PR c++/113545
> > +// { dg-do run { target c++11 } }
> > +// { dg-ice "PR112545 - constexpr function with switch called for reinterpret_cast" }
> > +
> > +char foo;
> > +
> > +// This one caught a call to gcc_unreachable in
> > +// cp/constexpr.cc:label_matches, when passed a convert_expr from the
> > +// cast in the call.
> > +constexpr unsigned char swbar(__UINTPTR_TYPE__ baz)
> > +{
> > +  switch (baz)
> > +{
> > +case 13:
> > +  return 11;
> > +case 14:
> > +  return 78;
> > +case 2048:
> > +  return 13;
> > +default:
> > +  return 42;
> > +}
> > +}
> > +
> > +// For reference, the equivalent* if-statements.
> > +constexpr unsigned char ifbar(__UINTPTR_TYPE__ baz)
> > +{
> > +  if (baz == 13)
> > +return 11;
> > +  else if (baz == 14)
> > +return 78;
> > +  else if (baz == 2048)
> > +return 13;
> > +  else
> > +return 42;
> > +}
> > +
> > +__attribute__ ((__noipa__))
> > +void xyzzy(int x)
> > +{
> > +  if (x != 42)
> > +__builtin_abort ();
> > +}
> > +
> > +int main()
> > +{
> > +  unsigned const char c = swbar(reinterpret_cast<__UINTPTR_TYPE__>(&foo));
> > +  xyzzy(c);
> > +  unsigned const char d = ifbar(reinterpret_cast<__UINTPTR_TYPE__>(&foo));
> > +  xyzzy(d);
> > +  unsigned const char e = swbar((__UINTPTR_TYPE__) &foo);
> > +  xyzzy(e);
> > +  unsigned const char f = ifbar((__UINTPTR_TYPE__) &foo);
> > +  xyzzy(f);
> > +}
> > diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret4.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret4.C
> > new file mode 100644
> > index 000000000000..4d0fdf2c0a78
> > --- /dev/null
> > +++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-reinterpret4.C
> > @@ -0,0 +1,54 @@
> > +// PR c++/113545
> > +// { dg-do compile { target c++11 } }
> > +
> > +char foo;
> > +
> > +// This one caught a call to gcc_unreachable in
> > +// cp/constexpr.cc:label_matches, when passed a convert_expr from the
> > +// cast in the call.
> > +constexpr unsigned char swbar(__UINTPTR_TYPE__ baz)
> > +{
> > +  switch (baz)
> > +{
> > +case 13:
> > +  return 11;
> > +case 14:
> > +  return 78;
> > +case 2048:
> > +  return 13;
> > +default:
> > +  return 42;
> > +}
> > +}
> > +
> > +// For reference, the equivalent* if-statements.
> > +constexpr unsigned char ifbar(__UINTPTR_TYPE__ baz)
> > +{
> > +  if (baz == 13)
> > +return 11;
> > +  else if (baz == 14)
> > +return 78;
> > +  else if (baz == 2048)
> > +return 13;
> > +  else
> > +return 42;
> > +}
> > +
> > +__attribute__ ((__noipa__))
> > +void xyzzy(int x)
> > +{
> > +  if (x != 42)
> > +__builtin_abort ();
> > +}
> > +
> > +int main()
> > +{
> > +  unsigned constexpr char c = 
> > swbar(reint

[PATCH RFA] gdbhooks: regex syntax error

2024-02-06 Thread Jason Merrill

Briefly tested that break-on-pass completion works.  Oddly, it also works
without the patch, but the fix still seems worthwhile.  OK for trunk?

-- 8< --

Recent python complains about this pattern with
  SyntaxWarning: invalid escape sequence '\s'
because \s is not a valid escape sequence in a regular string (Python
currently keeps the backslash, but warns); for it to reliably mean whitespace,
you need \\ or for the pattern to be a raw string.
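A small standalone check of the two spellings, as a sketch rather than part of gdbhooks.py: the helper names and the sample passes.def lines are made up for illustration. In current CPython an unrecognized escape such as \s is kept as a backslash followed by 's', which would explain why the unpatched pattern still matches; the warning is reported as SyntaxWarning on recent releases and as DeprecationWarning on older 3.x ones.

```python
import re
import warnings

# Two warning-free spellings of the same pattern.
RAW = r'\s*NEXT_PASS \(([^,]+).*\);'
ESCAPED = '\\s*NEXT_PASS \\(([^,]+).*\\);'


def pass_name(line):
    """Return the pass class name from a NEXT_PASS line, or None."""
    m = re.match(RAW, line)
    return m.group(1) if m else None


def warnings_for(source):
    """Compile a source snippet and list the warning categories it triggers."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return [w.category.__name__ for w in caught]


print(RAW == ESCAPED)                          # both spell the same regex text
print(pass_name("  NEXT_PASS (pass_build_cfg);\n"))
print(pass_name("  NEXT_PASS (pass_foo, 1);\n"))
print(warnings_for("m = '\\s*NEXT_PASS'"))     # non-raw literal containing \s
print(warnings_for("m = r'\\s*NEXT_PASS'"))    # raw literal: no warning
```

The raw-string form is the one the patch uses; doubling every backslash is equivalent but harder to read in a pattern with several escapes.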

gcc/ChangeLog:

* gdbhooks.py: Fix regex syntax.
---
 gcc/gdbhooks.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gdbhooks.py b/gcc/gdbhooks.py
index 3fa62652c61..92e38880a70 100644
--- a/gcc/gdbhooks.py
+++ b/gcc/gdbhooks.py
@@ -642,7 +642,7 @@ class PassNames:
         self.names = []
         with open(os.path.join(srcdir, 'passes.def')) as f:
             for line in f:
-                m = re.match('\s*NEXT_PASS \(([^,]+).*\);', line)
+                m = re.match(r'\s*NEXT_PASS \(([^,]+).*\);', line)
                 if m:
                     self.names.append(m.group(1))
 

base-commit: c5d34912ad576be1ef19be92f7eabde54b9089eb
-- 
2.43.0

[PATCH] c++: NTTP type CTAD w/ tmpl from current inst [PR113649]

2024-02-06 Thread Patrick Palka

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK
for trunk?

-- >8 --

Since template argument coercion happens relative to the most general
template (for a class template at least), during NTTP type CTAD we might
need to consider outer arguments particularly if the CTAD template is from
the current instantiation (and so depends on outer template parameters).

This patch makes do_class_deduction substitute as many levels of outer
template arguments into a CTAD template (from the current instantiation)
as it can take.
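The level arithmetic in the new hunk can be modelled roughly like this. It is a sketch only: template argument levels are represented as a plain list of lists, the helper names are hypothetical, and the example depths are illustrative rather than taken from the testcase.

```python
def outer_levels_to_substitute(args_depth, parms_depth):
    """How many outer argument levels the CTAD template can take.

    Mirrors `want = std::min (args_depth, parms_depth - 1)`; a template with a
    single parameter level has no outer levels to substitute.
    """
    if parms_depth <= 1:
        return 0
    return min(args_depth, parms_depth - 1)


def keep_outermost(outer_targs, want):
    """Drop the innermost levels so that only `want` outermost levels remain."""
    extra = len(outer_targs) - want
    return outer_targs[:len(outer_targs) - extra]


# Illustrative depths: a member template nested one level deep (parms_depth 2)
# and outer arguments that are one, then three, levels deep.
one_level = [["bool", "char", "long"]]
three_levels = [["bool", "char", "long"], ["int"], ["short"]]

for targs in (one_level, three_levels):
    want = outer_levels_to_substitute(len(targs), 2)
    print(want, keep_outermost(targs, want))
```

In both cases only the outermost level is kept, since a depth-2 template has exactly one enclosing level that outer arguments can fill.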

PR c++/113649

gcc/cp/ChangeLog:

* pt.cc (do_class_deduction): Add outer_targs parameter.
Substitute outer arguments into the CTAD template.
(do_auto_deduction): Pass outer_targs to do_class_deduction.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/nontype-class64.C: New test.
---
 gcc/cp/pt.cc | 21 ++--
 gcc/testsuite/g++.dg/cpp2a/nontype-class64.C | 16 +++
 2 files changed, 35 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/nontype-class64.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index 903a4a1c363..83c3b1920d6 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -30681,7 +30681,7 @@ ctad_template_p (tree tmpl)
type.  */
 
 static tree
-do_class_deduction (tree ptype, tree tmpl, tree init,
+do_class_deduction (tree ptype, tree tmpl, tree init, tree outer_targs,
int flags, tsubst_flags_t complain)
 {
   /* We should have handled this in the caller.  */
@@ -30743,6 +30743,23 @@ do_class_deduction (tree ptype, tree tmpl, tree init,
   if (type_dependent_expression_p (init))
 return ptype;
 
+  if (outer_targs)
+{
+  int args_depth = TMPL_ARGS_DEPTH (outer_targs);
+  int parms_depth = TMPL_PARMS_DEPTH (DECL_TEMPLATE_PARMS (tmpl));
+  if (parms_depth > 1)
+   {
+ /* Substitute outer arguments into this CTAD template from the
+current instantiation.  */
+ int want = std::min (args_depth, parms_depth - 1);
+ outer_targs = strip_innermost_template_args (outer_targs,
+  args_depth - want);
+ tmpl = tsubst (tmpl, outer_targs, complain, NULL_TREE);
+ if (tmpl == error_mark_node)
+   return error_mark_node;
+   }
+}
+
   /* Don't bother with the alias rules for an equivalent template.  */
   tmpl = get_underlying_template (tmpl);
 
@@ -30998,7 +31015,7 @@ do_auto_deduction (tree type, tree init, tree auto_node,
 
   if (tree ctmpl = CLASS_PLACEHOLDER_TEMPLATE (auto_node))
 /* C++17 class template argument deduction.  */
-return do_class_deduction (type, ctmpl, init, flags, complain);
+return do_class_deduction (type, ctmpl, init, outer_targs, flags, complain);
 
   if (init == NULL_TREE || TREE_TYPE (init) == NULL_TREE)
 /* Nothing we can do with this, even in deduction context.  */
diff --git a/gcc/testsuite/g++.dg/cpp2a/nontype-class64.C b/gcc/testsuite/g++.dg/cpp2a/nontype-class64.C
new file mode 100644
index 00000000000..8397ea5a886
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/nontype-class64.C
@@ -0,0 +1,16 @@
+// PR c++/113649
+// { dg-do compile { target c++20 } }
+
+template
+struct A {
+  template
+  struct Fun { constexpr Fun(Ret(*)(Args...)) { } };
+
+  template
+  struct B { using type = decltype(f); };
+};
+
+bool f(char, long);
+
+using type = A::B<&f>::type;
+using type = A::Fun;
-- 
2.43.0.522.g2a540e432f

Re: [PATCH] c++: Disallow this specifier except for parameter declarations [PR113788]

2024-02-06 Thread Jason Merrill


On 2/6/24 15:45, Marek Polacek wrote:
> On Tue, Feb 06, 2024 at 09:37:44PM +0100, Jakub Jelinek wrote:
>> Hi!
>>
>> The deducing this patchset added parsing of this specifier to
>> cp_parser_decl_specifier_seq unconditionally, but in the C++ grammar
>> this[opt] only appears in the parameter-declaration non-terminal, so
>> rather than checking in all the callers of cp_parser_decl_specifier_seq
>> except for cp_parser_parameter_declaration that this specifier didn't
>> appear I think it is far easier and closer to what the standard says
>> to only parse this specifier when called from
>> cp_parser_parameter_declaration.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
>
> FWIW, the patch looks good to me.

Agreed, OK.

Jason

[PATCH] wide-int: Fix mul_internal overflow handling [PR113753]

2024-02-06 Thread Jakub Jelinek

Hi!

As the following testcases show, the overflow (needs_overflow) and high
handling in wi::mul_internal seem to only work correctly for either
small precisions (less than or equal to 32, that is handled by earlier
simpler code, not the full Knuth's multiplication algorithm) or for
precisions which are multiple of HOST_BITS_PER_WIDE_INT, so it happened
to work fine in most pre-_BitInt era types (and for large bitfields we
probably didn't have good coverage or were lucky and nothing was asking
if there was overflow or not; I think high multiplication is typically
used only when we have optab in corresponding mode).
E.g. on the gcc.dg/bitint-87.c testcase, overflow warnings were
emitted only for the / 2wb * 3wb _BitInt(128) cases, but not for the
other precisions.

I found 3 issues when prec > HOST_BITS_PER_HALF_WIDE_INT and
(prec % HOST_BITS_PER_WIDE_INT) != 0:
1) the checking for overflow was simply checking the values of the
   r array members from half_blocks_needed to 2 * half_blocks_needed - 1,
   for UNSIGNED overflow checking if any of them is non-zero, for
   SIGNED comparing them if any is different from top where top is computed
   from the sign bit of the result (see below); similarly, the highpart
   multiplication simply picks the half limbs at r + half_blocks_needed
   offset; and furthermore, for SIGNED there is an adjustment if either
   operand was negative which also just walks r half-limbs from
   half_blocks_needed onwards;
   this works great for precisions like 64 or 128, but for precisions like
   129, 159, 160 or 161 doesn't, it would need to walk the bits in the
   half-limbs starting right above the most significant bit of the base
   precision; that can be up to a whole half-limb and all but one bit from
   the one below it earlier
2) as the comment says, the multiplication is originally done as unsigned
   multiplication, with adjustment of the high bits which subtracts the
   other operand once:
  if (wi::neg_p (op1))
{
  b = 0;
  for (i = 0; i < half_blocks_needed; i++)
{
  t = (unsigned HOST_WIDE_INT)r[i + half_blocks_needed]
- (unsigned HOST_WIDE_INT)v[i] - b;
  r[i + half_blocks_needed] = t & HALF_INT_MASK;
  b = t >> (HOST_BITS_PER_WIDE_INT - 1);
}
}
   and similarly for the other one.  Now, this also only works nicely if
   a negative operand has just a single sign bit set in the given precision;
   but we were unpacking the operands with wi_unpack (..., SIGNED);, so
   say for the negative operand in 129-bit precision, that means the least
   significant bit of u[half_blocks_needed - 2] (or v instead of u depending
   on which argument it is) is the set sign bit, but then there are 31
   further copies of the sign bit in u[half_blocks_needed - 2] and
   further 32 copies in u[half_blocks_needed - 1]; the above adjustment
   for signed operands doesn't really do the right job in such cases, it
   would need to subtract many more times the other operand
3) the computation of top for SIGNED
  top = r[(half_blocks_needed) - 1];
  top = SIGN_MASK (top << (HOST_BITS_PER_WIDE_INT / 2));
  top &= mask;
   also uses the most significant bit which fits into prec of the result
   only if prec is multiple of HOST_BITS_PER_WIDE_INT, otherwise we need
   to look at a different bit and sometimes it can be also a bit in
   r[half_blocks_needed - 2]

For 1), while for UNSIGNED overflow it could be fairly easy to check
the bits above prec in r half-limbs for being non-zero, doing all the
shifts also in the SIGNED adjustment code in 2 further locations and finally
for the high handling (unless we want to assert one doesn't do the highpart
multiply for such precisions) would be quite ugly and hard to maintain, so
I instead chose (implemented in the second hunk) to shift the
beyond-precision bits up such that the expectations of the rest of the
code are met, that is the LSB of r[half_blocks_needed] after adjustment
is the bit immediately above the precision, etc.  We don't need to care
about the bits it shifts out, because the multiplication will yield at most
2 * prec bits.

For 2), the patch changes the wi_unpack argument from SIGNED to UNSIGNED,
so that we get all zero bits above the precision.

And finally for 3) it does shifts and perhaps picks lower r half-limb so
that it uses the actual MSB of the result within prec.
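
The essence of 1) and 3) can be seen at a much smaller scale.  The sketch
below is illustrative only and is not the wide-int.cc code: the helper
names are hypothetical, it assumes the GCC/Clang __int128 extension and
1 <= prec <= 64, and it replaces the array of half-limbs with a single
128-bit product.  It shows that for a precision which is not a multiple of
the limb size, the overflow test has to start at bit prec (or at bit
prec - 1 for SIGNED) rather than at a limb boundary.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; not part of wide-int.cc.
// Assumes the __int128 extension and 1 <= prec <= 64.

// UNSIGNED: a and b are zero-extended prec-bit values.
bool
umul_overflows (uint64_t a, uint64_t b, unsigned prec)
{
  unsigned __int128 product = (unsigned __int128) a * b;
  // Overflow iff any bit at position prec or above is set.
  return (product >> prec) != 0;
}

// SIGNED: a and b are sign-extended prec-bit values.
bool
smul_overflows (int64_t a, int64_t b, unsigned prec)
{
  __int128 product = (__int128) a * b;
  // Bit prec - 1 is the sign bit of the result; it and every bit above
  // it must be identical.  This relies on right shifts of negative
  // values being arithmetic, which GCC guarantees.
  __int128 top = product >> (prec - 1);
  return top != 0 && top != -1;
}
```

With prec == 37, for example, 2^18 * 2^19 overflows as UNSIGNED even though
the product fits comfortably in one 64-bit limb, which is the kind of case
a limb-boundary check would miss.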

Bootstrapped/regtested on x86_64-linux and i686-linux, and additionally
tested with
make check-gcc -k -j32 GCC_TEST_RUN_EXPENSIVE=1 
RUNTESTFLAGS="GCC_TEST_RUN_EXPENSIVE=1 dg-torture.exp=*bitint*"
Ok for trunk?

2024-02-06  Jakub Jelinek  

PR tree-optimization/113753
* wide-int.cc (wi::mul_internal): Unpack op1val and op2val with
UNSIGNED rather than SIGNED.  If high or needs_overflow and prec is
not a multiple of HOST_BITS_PER_WIDE_INT, shift left bits above prec
so that they start with r[ha

Re: [PATCH] Fix disabling of year 2038 support on 32-bit hosts by default

2024-02-06 Thread Thiago Jung Bauermann



Hello Andrew,

Andrew Pinski  writes:

> On Mon, Feb 5, 2024 at 10:40 AM Thiago Jung Bauermann
>  wrote:
>>
>>
>> Thiago Jung Bauermann  writes:
>>
>> > Hello Luis,
>> >
>> > Luis Machado  writes:
>> >>
>> >> Approved-By: Luis Machado 
>> >
>> > Thanks! Since this is a patch for the repository top-level, is your
>> > approval sufficient to commit the patch, or should I have approval from
>> > a binutils maintainer as well?
>>
>> Answering my own question: binutils/MAINTAINERS says:
>>
>>   GDB global maintainers also have permission to commit and approve
>>   patches to the top level files and to those parts of bfd files
>>   primarily used by GDB.
>>
>> So pushed as commit 9c0aa4c53104.
>
> Please also submit/commit to the gcc trunk too since the toplevel
> configure should be in sync between the 2 repos.

I don't have commit access to the gcc repo so I sent a patch to the
gcc-patches mailing list.

-- 
Thiago

[PATCH] Fix disabling of year 2038 support on 32-bit hosts by default

2024-02-06 Thread Thiago Jung Bauermann

Commit e5f2f7d901ee ("Disable year 2038 support on 32-bit hosts by
default") fixed a mismatch between 64-bit time_t in GDB and system headers
and 32-bit time_t in BFD.

However, since commit 862776f26a59 ("Finalized intl-update patches")
gnulib's year 2038 support has been accidentally re-enabled, causing
problems for 32-bit hosts again.  The commit split baseargs into
{h,b}baseargs, but this hasn't been done for the code that handles
--disable-year2038.

This patch restores the intended behaviour.  With this change, the number
of unexpected core files goes from 18 to 4.

Tested on armv8l-linux-gnueabihf.

Approved-By: Luis Machado 
---

Hello,

Yesterday I committed this patch to the binutils-gdb repo. Since the
toplevel configure should be in sync between the 2 repos, could someone
please commit it to the gcc one? I don't have commit access.

 configure| 3 ++-
 configure.ac | 3 ++-
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/configure b/configure
index 874966fb9f09..1da0e67c28fa 100755
--- a/configure
+++ b/configure
@@ -10301,7 +10301,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi
 
diff --git a/configure.ac b/configure.ac
index 4f34004a0726..fa508a0612a3 100644
--- a/configure.ac
+++ b/configure.ac
@@ -3420,7 +3420,8 @@ hbaseargs="$hbaseargs --disable-option-checking"
 tbaseargs="$tbaseargs --disable-option-checking"
 
 if test "$enable_year2038" = no; then
-  baseargs="$baseargs --disable-year2038"
+  bbaseargs="$bbaseargs --disable-year2038"
+  hbaseargs="$hbaseargs --disable-year2038"
   tbaseargs="$tbaseargs --disable-year2038"
 fi

Re: [PATCH] c++: Disallow this specifier except for parameter declarations [PR113788]

2024-02-06 Thread Marek Polacek

On Tue, Feb 06, 2024 at 09:37:44PM +0100, Jakub Jelinek wrote:
> Hi!
> 
> The deducing this patchset added parsing of this specifier to
> cp_parser_decl_specifier_seq unconditionally, but in the C++ grammar
> this[opt] only appears in the parameter-declaration non-terminal, so
> rather than checking in all the callers of cp_parser_decl_specifier_seq
> except for cp_parser_parameter_declaration that this specifier didn't
> appear I think it is far easier and closer to what the standard says
> to only parse this specifier when called from
> cp_parser_parameter_declaration.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

FWIW, the patch looks good to me.

> 2024-02-06  Jakub Jelinek  
> 
>   PR c++/113788
>   * parser.cc (CP_PARSER_FLAGS_PARAMETER): New enumerator.
>   (cp_parser_decl_specifier_seq): Parse RID_THIS only if
>   CP_PARSER_FLAGS_PARAMETER is set in flags.
>   (cp_parser_parameter_declaration): Or in CP_PARSER_FLAGS_PARAMETER
>   when calling cp_parser_decl_specifier_seq.
> 
>   * g++.dg/parse/pr113788.C: New test.
> 
> --- gcc/cp/parser.cc.jj   2024-01-17 10:34:45.337660930 +0100
> +++ gcc/cp/parser.cc  2024-02-06 18:31:35.587193903 +0100
> @@ -2088,7 +2088,9 @@ enum
>/* When parsing of the noexcept-specifier should be delayed.  */
>CP_PARSER_FLAGS_DELAY_NOEXCEPT = 0x40,
>/* When parsing a consteval declarator.  */
> -  CP_PARSER_FLAGS_CONSTEVAL = 0x80
> +  CP_PARSER_FLAGS_CONSTEVAL = 0x80,
> +  /* When parsing a parameter declaration.  */
> +  CP_PARSER_FLAGS_PARAMETER = 0x100
>  };
>  
>  /* This type is used for parameters and variables which hold
> @@ -16342,7 +16344,7 @@ cp_parser_decl_specifier_seq (cp_parser*
>/* Special case for "this" specifier, indicating a parm is an xobj 
> parm.
>The "this" specifier must be the first specifier in the declaration,
>after any attributes.  */
> -  if (token->keyword == RID_THIS)
> +  if (token->keyword == RID_THIS && (flags & CP_PARSER_FLAGS_PARAMETER))
>   {
> cp_lexer_consume_token (parser->lexer);
> if (token != first_specifier)
> @@ -25607,7 +25609,7 @@ cp_parser_parameter_declaration (cp_pars
>/* Parse the declaration-specifiers.  */
>cp_token *decl_spec_token_start = cp_lexer_peek_token (parser->lexer);
>cp_parser_decl_specifier_seq (parser,
> - flags,
> + flags | CP_PARSER_FLAGS_PARAMETER,
>   &decl_specifiers,
>   &declares_class_or_enum);
>  
> --- gcc/testsuite/g++.dg/parse/pr113788.C.jj  2024-02-06 18:40:29.553791028 
> +0100
> +++ gcc/testsuite/g++.dg/parse/pr113788.C 2024-02-06 18:41:23.326045703 
> +0100
> @@ -0,0 +1,20 @@
> +// PR c++/113788
> +// { dg-do compile { target c++11 } }
> +
> +struct S { int a, b; };
> +struct U {
> +  void foo () { this int g = 1; }// { dg-error "expected ';' before 
> 'int'" }
> +};
> +this auto h = 1; // { dg-error "expected unqualified-id 
> before 'this'" }
> +
> +int
> +main ()
> +{
> +  S s = { 1, 2 };
> +  short t[3] = { 3, 4, 5 };
> +  this auto &[a, b] = s; // { dg-error "invalid use of 'this' in 
> non-member function" }
> +  this auto &[c, d, e] = t;  // { dg-error "invalid use of 'this' in 
> non-member function" }
> +  this int f = 1;// { dg-error "invalid use of 'this' in 
> non-member function" }
> +  for (this auto &i : t) // { dg-error "invalid use of 'this' in 
> non-member function" }
> +;// { dg-error "expected" }
> +}// { dg-error "expected" }
> 
>   Jakub
> 

Marek

[PATCH] range-op: Fix up ABSU_EXPR handling [PR113756]

2024-02-06 Thread Jakub Jelinek

Hi!

ABSU_EXPR unary expr is special because it has a signed integer
argument and unsigned integer result (of the same precision).

The following testcase is miscompiled since ABSU_EXPR handling has
been added to range-op because it uses widest_int::from with the
result sign (i.e. UNSIGNED) rather than the operand sign (i.e. SIGNED),
so e.g. for the 32-bit int argument mask ends up 0xffc1 or something
similar and even when it has most significant bit in the precision set,
in widest_int (tree-ssa-ccp.cc really should stop using widest_int, but
that is I think stage1 task) it doesn't appear to be negative and so
bit_value_unop ABSU_EXPR doesn't set the resulting mask/value from
oring of the argument and its negation.

Fixed thusly, not doing that for GIMPLE_BINARY_RHS because I don't know
about a binary op that would need something similar.
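
To make the sign mismatch concrete, here is a minimal sketch.  It is not
the range-op.cc code: the helpers are hypothetical, and a 64-bit integer
stands in for widest_int.  It shows that ABSU takes a signed argument and
yields an unsigned result, and that widening the operand using the
result's sign instead of its own loses the information that the operand
was negative.

```cpp
#include <cstdint>

// Hypothetical illustration; these helpers are not GCC internals.

// ABSU: signed argument, unsigned result of the same precision.
// Unlike abs (), it is well defined for INT32_MIN.
uint32_t
absu (int32_t x)
{
  return x < 0 ? 0u - (uint32_t) x : (uint32_t) x;
}

// Widening the operand with its own (signed) type keeps it negative.
int64_t
widen_with_operand_sign (int32_t x)
{
  return (int64_t) x;
}

// Widening it as if it had the result's (unsigned) type zero-extends,
// so the wide value no longer looks negative.
int64_t
widen_with_result_sign (int32_t x)
{
  return (int64_t) (uint32_t) x;
}
```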

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-06  Jakub Jelinek  

PR tree-optimization/113756
* range-op.cc (update_known_bitmask): For GIMPLE_UNARY_RHS,
use TYPE_SIGN (lh.type ()) instead of sign for widest_int::from
of lh_bits value and mask.

* gcc.dg/pr113756.c: New test.

--- gcc/range-op.cc.jj  2024-01-03 11:51:28.199777434 +0100
+++ gcc/range-op.cc 2024-02-06 16:51:55.549127825 +0100
@@ -435,8 +435,10 @@ update_known_bitmask (irange &r, tree_co
   bit_value_unop (code, sign, prec, &widest_value, &widest_mask,
  TYPE_SIGN (lh.type ()),
  TYPE_PRECISION (lh.type ()),
- widest_int::from (lh_bits.value (), sign),
- widest_int::from (lh_bits.mask (), sign));
+ widest_int::from (lh_bits.value (),
+   TYPE_SIGN (lh.type ())),
+ widest_int::from (lh_bits.mask (),
+   TYPE_SIGN (lh.type ())));
   break;
 case GIMPLE_BINARY_RHS:
   bit_value_binop (code, sign, prec, &widest_value, &widest_mask,
--- gcc/testsuite/gcc.dg/pr113756.c.jj  2024-02-06 17:00:52.835679796 +0100
+++ gcc/testsuite/gcc.dg/pr113756.c 2024-02-06 17:00:31.159980326 +0100
@@ -0,0 +1,36 @@
+/* PR tree-optimization/113756 */
+/* { dg-do run { target int32plus } } */
+/* { dg-options "-O2" } */
+
+int d, e, i, k, l = -8;
+signed char h, j;
+
+int
+bar (int n, int o, int p3)
+{
+  int a = o - p3, b = n - p3, c = a + b, f = -b, g = c < 0 ? -c : c;
+  return a <= f && a <= g ? o : p3;
+}
+
+void
+foo (int *n, unsigned short o)
+{
+  unsigned p = 8896;
+  for (; e >= 0; e--)
+p = 5377;
+  for (; h <= 0; h++)
+for (; j <= 0; j++)
+  {
+   *n = 1611581749;
+   i = bar (34, p - 5294, *n - 1611581687);
+   k = i + p + 65535 + o + *n - 1611718251;
+   if (k != 0)
+ __builtin_abort ();
+  }
+}
+
+int
+main ()
+{
+  foo (&l, l);
+}

Jakub

[PATCH] c++: Disallow this specifier except for parameter declarations [PR113788]

2024-02-06 Thread Jakub Jelinek

Hi!

The deducing this patchset added parsing of this specifier to
cp_parser_decl_specifier_seq unconditionally, but in the C++ grammar
this[opt] only appears in the parameter-declaration non-terminal, so
rather than checking in all the callers of cp_parser_decl_specifier_seq
except for cp_parser_parameter_declaration that this specifier didn't
appear I think it is far easier and closer to what the standard says
to only parse this specifier when called from
cp_parser_parameter_declaration.
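
The shape of the change can be sketched with a toy model.  Everything
below is hypothetical (the names only mirror the real ones and none of it
is the parser.cc interface): the specifier parser accepts a leading "this"
only when the caller passes a flag saying it is parsing a parameter
declaration, so every other caller needs no check of its own.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: the names below are hypothetical and merely mirror the
// shape of the change, not the real parser.cc interfaces.
enum toy_parser_flags
{
  TOY_FLAGS_NONE = 0,
  TOY_FLAGS_PARAMETER = 0x100
};

// Returns true if a leading "this" token was consumed as a specifier.
// Outside a parameter declaration the token is left alone, so the caller
// diagnoses it like any other unexpected token.
bool
toy_parse_this_specifier (const std::vector<std::string> &tokens,
                          std::size_t &pos, int flags)
{
  if (pos < tokens.size ()
      && tokens[pos] == "this"
      && (flags & TOY_FLAGS_PARAMETER))
    {
      ++pos;
      return true;
    }
  return false;
}
```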

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-06  Jakub Jelinek  

PR c++/113788
* parser.cc (CP_PARSER_FLAGS_PARAMETER): New enumerator.
(cp_parser_decl_specifier_seq): Parse RID_THIS only if
CP_PARSER_FLAGS_PARAMETER is set in flags.
(cp_parser_parameter_declaration): Or in CP_PARSER_FLAGS_PARAMETER
when calling cp_parser_decl_specifier_seq.

* g++.dg/parse/pr113788.C: New test.

--- gcc/cp/parser.cc.jj 2024-01-17 10:34:45.337660930 +0100
+++ gcc/cp/parser.cc2024-02-06 18:31:35.587193903 +0100
@@ -2088,7 +2088,9 @@ enum
   /* When parsing of the noexcept-specifier should be delayed.  */
   CP_PARSER_FLAGS_DELAY_NOEXCEPT = 0x40,
   /* When parsing a consteval declarator.  */
-  CP_PARSER_FLAGS_CONSTEVAL = 0x80
+  CP_PARSER_FLAGS_CONSTEVAL = 0x80,
+  /* When parsing a parameter declaration.  */
+  CP_PARSER_FLAGS_PARAMETER = 0x100
 };
 
 /* This type is used for parameters and variables which hold
@@ -16342,7 +16344,7 @@ cp_parser_decl_specifier_seq (cp_parser*
   /* Special case for "this" specifier, indicating a parm is an xobj parm.
 The "this" specifier must be the first specifier in the declaration,
 after any attributes.  */
-  if (token->keyword == RID_THIS)
+  if (token->keyword == RID_THIS && (flags & CP_PARSER_FLAGS_PARAMETER))
{
  cp_lexer_consume_token (parser->lexer);
  if (token != first_specifier)
@@ -25607,7 +25609,7 @@ cp_parser_parameter_declaration (cp_pars
   /* Parse the declaration-specifiers.  */
   cp_token *decl_spec_token_start = cp_lexer_peek_token (parser->lexer);
   cp_parser_decl_specifier_seq (parser,
-   flags,
+   flags | CP_PARSER_FLAGS_PARAMETER,
&decl_specifiers,
&declares_class_or_enum);
 
--- gcc/testsuite/g++.dg/parse/pr113788.C.jj2024-02-06 18:40:29.553791028 
+0100
+++ gcc/testsuite/g++.dg/parse/pr113788.C   2024-02-06 18:41:23.326045703 
+0100
@@ -0,0 +1,20 @@
+// PR c++/113788
+// { dg-do compile { target c++11 } }
+
+struct S { int a, b; };
+struct U {
+  void foo () { this int g = 1; }  // { dg-error "expected ';' before 
'int'" }
+};
+this auto h = 1;   // { dg-error "expected unqualified-id 
before 'this'" }
+
+int
+main ()
+{
+  S s = { 1, 2 };
+  short t[3] = { 3, 4, 5 };
+  this auto &[a, b] = s;   // { dg-error "invalid use of 'this' in 
non-member function" }
+  this auto &[c, d, e] = t;// { dg-error "invalid use of 'this' in 
non-member function" }
+  this int f = 1;  // { dg-error "invalid use of 'this' in 
non-member function" }
+  for (this auto &i : t)   // { dg-error "invalid use of 'this' in 
non-member function" }
+;  // { dg-error "expected" }
+}  // { dg-error "expected" }

Jakub

Re: [PATCH] x86-64: Return 10_REG if there is no scratch register

2024-02-06 Thread Jakub Jelinek

On Tue, Feb 06, 2024 at 10:57:24AM -0800, H.J. Lu wrote:
> If we can't find a scratch register for large model profiling, return
> R10_REG.
> 
>   PR target/113689
>   * config/i386/i386.cc (x86_64_select_profile_regnum): Return
>   R10_REG after sorry.

Ok, thanks.

Jakub

[PATCH] x86-64: Return 10_REG if there is no scratch register

2024-02-06 Thread H.J. Lu

If we can't find a scratch register for large model profiling, return
R10_REG.

PR target/113689
* config/i386/i386.cc (x86_64_select_profile_regnum): Return
R10_REG after sorry.
---
 gcc/config/i386/i386.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index f02c6c02ac6..10bd5347dcf 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -22788,7 +22788,7 @@ x86_64_select_profile_regnum (bool r11_ok 
ATTRIBUTE_UNUSED)
   sorry ("no register available for profiling %<-mcmodel=large%s%>",
 ix86_cmodel == CM_LARGE_PIC ? " -fPIC" : "");
 
-  return INVALID_REGNUM;
+  return R10_REG;
 }
 
 /* Output assembler code to FILE to increment profiler label # LABELNO
-- 
2.43.0

Re: [PATCH] RISC-V: Add support for B standard extension

2024-02-06 Thread Andrew Pinski

On Tue, Feb 6, 2024 at 9:39 AM Edwin Lu  wrote:
>
> This patch adds support for recognizing the B standard extension as the
> collection of the Zba, Zbb and Zbs extensions, for consistency and
> conciseness across toolchains.
>
> * https://github.com/riscv/riscv-b/tags

Note this is/was recorded as PR 106531.

Thanks,
Andrew Pinski

>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add imply rules for B extension
> * config/riscv/arch-canonicalize: ditto
>
> Signed-off-by: Edwin Lu 
> ---
>  gcc/common/config/riscv/riscv-common.cc | 7 +++
>  gcc/config/riscv/arch-canonicalize  | 1 +
>  2 files changed, 8 insertions(+)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 631ce8309a0..31117a7b0fd 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -77,6 +77,10 @@ static const riscv_implied_info_t riscv_implied_info[] =
>{"f", "zicsr"},
>{"d", "zicsr"},
>
> +  {"b", "zba"},
> +  {"b", "zbb"},
> +  {"b", "zbs"},
> +
>{"zdinx", "zfinx"},
>{"zfinx", "zicsr"},
>{"zdinx", "zicsr"},
> @@ -235,6 +239,8 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>{"c", ISA_SPEC_CLASS_20190608, 2, 0},
>{"c", ISA_SPEC_CLASS_2P2,  2, 0},
>
> +  {"b",   ISA_SPEC_CLASS_NONE, 1, 0},
> +
>{"h",   ISA_SPEC_CLASS_NONE, 1, 0},
>
>{"v",   ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -388,6 +394,7 @@ static const struct riscv_ext_version 
> riscv_ext_version_table[] =
>  /* Combine extensions defined in this table  */
>  static const struct riscv_ext_version riscv_combine_info[] =
>  {
> +  {"b",  ISA_SPEC_CLASS_NONE, 1, 0},
>{"zk",  ISA_SPEC_CLASS_NONE, 1, 0},
>{"zkn",  ISA_SPEC_CLASS_NONE, 1, 0},
>{"zks",  ISA_SPEC_CLASS_NONE, 1, 0},
> diff --git a/gcc/config/riscv/arch-canonicalize 
> b/gcc/config/riscv/arch-canonicalize
> index 629bed85347..dcfae732714 100755
> --- a/gcc/config/riscv/arch-canonicalize
> +++ b/gcc/config/riscv/arch-canonicalize
> @@ -41,6 +41,7 @@ LONG_EXT_PREFIXES = ['z', 's', 'h', 'x']
>  IMPLIED_EXT = {
>"d" : ["f", "zicsr"],
>"f" : ["zicsr"],
> +  "b" : ["zba", "zbb", "zbs"],
>"zdinx" : ["zfinx", "zicsr"],
>"zfinx" : ["zicsr"],
>"zhinx" : ["zhinxmin", "zfinx", "zicsr"],
> --
> 2.34.1
>

[PATCH] RISC-V: Add support for B standard extension

2024-02-06 Thread Edwin Lu

This patch adds support for recognizing the B standard extension as the
collection of the Zba, Zbb and Zbs extensions, for consistency and
conciseness across toolchains.

* https://github.com/riscv/riscv-b/tags
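
As a rough illustration of what an imply rule does, the sketch below
expands a set of extensions to a fixed point.  It is an assumption-laden
model, not the riscv-common.cc implementation: the table is a small
excerpt and expand_implied is a hypothetical name.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; the table is a small excerpt and the function is
// not the riscv-common.cc implementation.
static const std::vector<std::pair<std::string, std::string>> implied_info = {
  {"d", "f"},
  {"f", "zicsr"},
  {"d", "zicsr"},
  {"b", "zba"},
  {"b", "zbb"},
  {"b", "zbs"},
};

// Repeatedly apply the rules until no new extension is added.
std::set<std::string>
expand_implied (std::set<std::string> exts)
{
  bool changed = true;
  while (changed)
    {
      changed = false;
      for (const auto &rule : implied_info)
        if (exts.count (rule.first) && exts.insert (rule.second).second)
          changed = true;
    }
  return exts;
}
```

Under this model, asking for "b" on top of "i" yields zba, zbb and zbs as
well, which is the behaviour the patch description asks for.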

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add imply rules for B extension
* config/riscv/arch-canonicalize: ditto

Signed-off-by: Edwin Lu 
---
 gcc/common/config/riscv/riscv-common.cc | 7 +++
 gcc/config/riscv/arch-canonicalize  | 1 +
 2 files changed, 8 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 631ce8309a0..31117a7b0fd 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -77,6 +77,10 @@ static const riscv_implied_info_t riscv_implied_info[] =
   {"f", "zicsr"},
   {"d", "zicsr"},
 
+  {"b", "zba"},
+  {"b", "zbb"},
+  {"b", "zbs"},
+
   {"zdinx", "zfinx"},
   {"zfinx", "zicsr"},
   {"zdinx", "zicsr"},
@@ -235,6 +239,8 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
   {"c", ISA_SPEC_CLASS_20190608, 2, 0},
   {"c", ISA_SPEC_CLASS_2P2,  2, 0},
 
+  {"b",   ISA_SPEC_CLASS_NONE, 1, 0},
+
   {"h",   ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"v",   ISA_SPEC_CLASS_NONE, 1, 0},
@@ -388,6 +394,7 @@ static const struct riscv_ext_version 
riscv_ext_version_table[] =
 /* Combine extensions defined in this table  */
 static const struct riscv_ext_version riscv_combine_info[] =
 {
+  {"b",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zk",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zkn",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"zks",  ISA_SPEC_CLASS_NONE, 1, 0},
diff --git a/gcc/config/riscv/arch-canonicalize 
b/gcc/config/riscv/arch-canonicalize
index 629bed85347..dcfae732714 100755
--- a/gcc/config/riscv/arch-canonicalize
+++ b/gcc/config/riscv/arch-canonicalize
@@ -41,6 +41,7 @@ LONG_EXT_PREFIXES = ['z', 's', 'h', 'x']
 IMPLIED_EXT = {
   "d" : ["f", "zicsr"],
   "f" : ["zicsr"],
+  "b" : ["zba", "zbb", "zbs"],
   "zdinx" : ["zfinx", "zicsr"],
   "zfinx" : ["zicsr"],
   "zhinx" : ["zhinxmin", "zfinx", "zicsr"],
-- 
2.34.1

Re: [PATCH] libsanitizer: workaround libtool error when building in Yocto Kirkstone

2024-02-06 Thread Alibek Omarov

Thanks for quick reply!

If it's an inappropriate patch for GCC, should I try to send it to Yocto then?

Alibek.

[PATCH] testsuite: Pattern does not match when using --specs=nano.specs

2024-02-06 Thread Torbjörn SVENSSON

Ok for trunk and releases/gcc-13?

---

When running the testsuite for newlib nano, the --specs=nano.specs
option is used.  This option prepends cpp_unique_options with
"-isystem =/include/newlib-nano" so that the newlib nano header files
override the newlib standard ones.  As the -isystem option is prepended,
the -quiet option is no longer the first option to cc1.  Adjust the test
accordingly.
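
The effect of the pattern change can be reproduced outside the testsuite.
The sketch below is only an approximation: std::regex (ECMAScript syntax)
stands in for Tcl's regexp, and "cc1" and "-O2" are placeholder values for
${compiler} and $compiler_pattern.  The old pattern requires -quiet to
follow the compiler name directly, while the new one tolerates options in
between.

```cpp
#include <regex>
#include <string>

// Illustration only: "cc1" and "-O2" stand in for ${compiler} and
// $compiler_pattern, and std::regex (ECMAScript) stands in for Tcl regexp.
bool
old_pattern_matches (const std::string &gcc_output)
{
  static const std::regex re ("/cc1(\\.exe)? -quiet.*-O2");
  return std::regex_search (gcc_output, re);
}

bool
new_pattern_matches (const std::string &gcc_output)
{
  static const std::regex re ("/cc1(\\.exe)? .*-quiet.*-O2");
  return std::regex_search (gcc_output, re);
}
```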

Patch has been verified on Windows and Linux.

gcc/testsuite/ChangeLog:

* gcc.misc-tests/options.exp: Allow other options before the
-quiet option for cc1.

Signed-off-by: Torbjörn SVENSSON 
---
 gcc/testsuite/gcc.misc-tests/options.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.misc-tests/options.exp 
b/gcc/testsuite/gcc.misc-tests/options.exp
index ec026ecf77d..e7fcde87585 100644
--- a/gcc/testsuite/gcc.misc-tests/options.exp
+++ b/gcc/testsuite/gcc.misc-tests/options.exp
@@ -57,7 +57,7 @@ proc check_for_all_options {language gcc_options 
compiler_pattern as_pattern ld_
remote_file build delete $dumpfile
 }   
 
-if {![regexp -- "/${compiler}(\\.exe)? -quiet.*$compiler_pattern" 
$gcc_output]} {
+if {![regexp -- "/${compiler}(\\.exe)? .*-quiet.*$compiler_pattern" 
$gcc_output]} {
fail "$test (compiler options)"
return
 }
-- 
2.25.1

Re: [PATCH] libsanitizer: workaround libtool error when building in Yocto Kirkstone

2024-02-06 Thread Andrew Pinski

On Tue, Feb 6, 2024 at 8:53 AM Alibek Omarov  wrote:
>
> Some libtool versions require --tag to be set and won't run compiler
> without it, throwing an `unable to infer tagged configuration` error.
>
> I'm not sure whether it's a good idea to tag assembly files as a C source,
> but it helps to avoid the issue.

This seems like an OE/Yocto issue as updating libtool inside GCC is
going to be problematic due to some local GCC patches to libtool and
some conflicts between libtool and GCC's understanding of --sysroot.

See https://gcc.gnu.org/legacy-ml/gcc-patches/2013-08/msg01465.html
(yes, 10 years ago, but as far as I know this still applies).

Thanks,
Andrew Pinski

>
> Signed-off-by: Alibek Omarov 
>
> ---
>  libsanitizer/asan/Makefile.in   | 2 +-
>  libsanitizer/hwasan/Makefile.in | 2 +-
>  libsanitizer/tsan/Makefile.in   | 2 +-
>  3 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
> index 25c7fd7b7..5992bafd3 100644
> --- a/libsanitizer/asan/Makefile.in
> +++ b/libsanitizer/asan/Makefile.in
> @@ -188,7 +188,7 @@ am__depfiles_maybe = depfiles
>  am__mv = mv -f
>  CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
> $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
> -LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
> +LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
> $(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
> $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
> $(AM_CCASFLAGS) $(CCASFLAGS)
> diff --git a/libsanitizer/hwasan/Makefile.in b/libsanitizer/hwasan/Makefile.in
> index 542af8f19..a000fe570 100644
> --- a/libsanitizer/hwasan/Makefile.in
> +++ b/libsanitizer/hwasan/Makefile.in
> @@ -180,7 +180,7 @@ am__depfiles_maybe = depfiles
>  am__mv = mv -f
>  CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
> $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
> -LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
> +LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
> $(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
> $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
> $(AM_CCASFLAGS) $(CCASFLAGS)
> diff --git a/libsanitizer/tsan/Makefile.in b/libsanitizer/tsan/Makefile.in
> index ce11d2497..40d39e31d 100644
> --- a/libsanitizer/tsan/Makefile.in
> +++ b/libsanitizer/tsan/Makefile.in
> @@ -184,7 +184,7 @@ am__depfiles_maybe = depfiles
>  am__mv = mv -f
>  CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
> $(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
> -LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
> +LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
> $(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
> $(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
> $(AM_CCASFLAGS) $(CCASFLAGS)
> --
> 2.34.1
>

[PATCH] libsanitizer: workaround libtool error when building in Yocto Kirkstone

2024-02-06 Thread Alibek Omarov

Some libtool versions require --tag to be set and won't run compiler
without it, throwing an `unable to infer tagged configuration` error.

I'm not sure whether it's a good idea to tag assembly files as a C source,
but it helps to avoid the issue.

Signed-off-by: Alibek Omarov 

---
 libsanitizer/asan/Makefile.in   | 2 +-
 libsanitizer/hwasan/Makefile.in | 2 +-
 libsanitizer/tsan/Makefile.in   | 2 +-
 3 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/libsanitizer/asan/Makefile.in b/libsanitizer/asan/Makefile.in
index 25c7fd7b7..5992bafd3 100644
--- a/libsanitizer/asan/Makefile.in
+++ b/libsanitizer/asan/Makefile.in
@@ -188,7 +188,7 @@ am__depfiles_maybe = depfiles
 am__mv = mv -f
 CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
-LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
+LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
$(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
$(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
$(AM_CCASFLAGS) $(CCASFLAGS)
diff --git a/libsanitizer/hwasan/Makefile.in b/libsanitizer/hwasan/Makefile.in
index 542af8f19..a000fe570 100644
--- a/libsanitizer/hwasan/Makefile.in
+++ b/libsanitizer/hwasan/Makefile.in
@@ -180,7 +180,7 @@ am__depfiles_maybe = depfiles
 am__mv = mv -f
 CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
-LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
+LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
$(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
$(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
$(AM_CCASFLAGS) $(CCASFLAGS)
diff --git a/libsanitizer/tsan/Makefile.in b/libsanitizer/tsan/Makefile.in
index ce11d2497..40d39e31d 100644
--- a/libsanitizer/tsan/Makefile.in
+++ b/libsanitizer/tsan/Makefile.in
@@ -184,7 +184,7 @@ am__depfiles_maybe = depfiles
 am__mv = mv -f
 CPPASCOMPILE = $(CCAS) $(DEFS) $(DEFAULT_INCLUDES) $(INCLUDES) \
$(AM_CPPFLAGS) $(CPPFLAGS) $(AM_CCASFLAGS) $(CCASFLAGS)
-LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) $(AM_LIBTOOLFLAGS) \
+LTCPPASCOMPILE = $(LIBTOOL) $(AM_V_lt) --tag=CC $(AM_LIBTOOLFLAGS) \
$(LIBTOOLFLAGS) --mode=compile $(CCAS) $(DEFS) \
$(DEFAULT_INCLUDES) $(INCLUDES) $(AM_CPPFLAGS) $(CPPFLAGS) \
$(AM_CCASFLAGS) $(CCASFLAGS)
-- 
2.34.1

[pushed] c++: add fixed test [PR94231]

2024-02-06 Thread Marek Polacek

Tested x86_64-pc-linux-gnu, applying to trunk.

-- >8 --
I was surprised to find out that r14-8759 fixed this accepts-invalid.

clang version 17.0.6 rejects the test as well; clang version 19.0.0git
crashes for which I opened
.

PR c++/94231

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/deleted17.C: New test.
---
 gcc/testsuite/g++.dg/cpp0x/deleted17.C | 20 
 1 file changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/g++.dg/cpp0x/deleted17.C

diff --git a/gcc/testsuite/g++.dg/cpp0x/deleted17.C 
b/gcc/testsuite/g++.dg/cpp0x/deleted17.C
new file mode 100644
index 000..3bebe881165
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/deleted17.C
@@ -0,0 +1,20 @@
+// PR c++/94231
+// { dg-do compile { target c++11 } }
+
+struct F {F(F&&)=delete;};
+
+template
+struct M {
+  F f;
+  M();
+  M(const M&);
+  M(M&&);
+};
+
+template
+M::M(M&&)=default; // { dg-error "use of deleted function" }
+
+M<> f() {
+  M<> m;
+  return m;
+}

base-commit: 8ec2f1922a14ee3636840d1ebc1c40d26e6043a4
-- 
2.43.0

Re: [PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao

On Tue, 2024-02-06 at 17:55 +0800, Xi Ruoyao wrote:
> Recently I've fixed two wrong FP vector negate implementations which
> caused wrong sign bits in zeros in targets (r14-8786 and r14-8801).  To
> prevent a similar issue from happening again, add a test case.
> 
> Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
> (with MSA), LoongArch (with LSX and LASX).
> 
> gcc/testsuite:
> 
>   * gcc.dg/vect/vect-neg-zero.c: New test.
> ---
> 
> Ok for trunk?
> 
>  gcc/testsuite/gcc.dg/vect/vect-neg-zero.c | 39 +++
>  1 file changed, 39 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c 
> b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
> new file mode 100644
> index 000..adb032f5c6a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
> @@ -0,0 +1,39 @@
> +/* { dg-do run } */

This patch fails on Linaro CI for ARM.  I guess I need to remove this
{ dg-do run } line and let the test framework decide whether to run or
compile.
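To illustrate the kind of check such a test performs, here is a minimal sketch. It is not the contents of vect-neg-zero.c; both helper names are hypothetical. It compares a true negation against subtraction from zero, a classic way to get the sign of zero wrong, and assumes default floating-point flags (no -ffast-math or -fno-signed-zeros).

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: negate each element and check that the sign bit
// flipped.  std::signbit tells -0.0f from +0.0f, which operator== cannot.
bool negation_flips_all_sign_bits (const float *in, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    {
      float negated = -in[i];
      if (std::signbit (negated) == std::signbit (in[i]))
        return false;
    }
  return true;
}

// Hypothetical contrast case: "0 - x" is not a negation for x == +0.0f,
// because 0.0f - 0.0f is +0.0f under round-to-nearest.
bool subtraction_flips_all_sign_bits (const float *in, std::size_t n)
{
  for (std::size_t i = 0; i < n; i++)
    {
      float negated = 0.0f - in[i];
      if (std::signbit (negated) == std::signbit (in[i]))
        return false;
    }
  return true;
}
```

A vectorized negate that behaves like the second helper would produce exactly the wrong-sign zeros described above.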

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

LoongArch: Backport r14-4674 "LoongArch: Delete macro definition ASM_OUTPUT_ALIGN_WITH_NOP."?

2024-02-06 Thread Xi Ruoyao

Hi Lulu,

I'm proposing to backport r14-4674 "LoongArch: Delete macro definition
ASM_OUTPUT_ALIGN_WITH_NOP." to releases/gcc-12 and releases/gcc-13.  The
reasons:

1. Strictly speaking, the old ASM_OUTPUT_ALIGN_WITH_NOP macro may cause
a correctness issue.  For example, a developer may use
-falign-functions=16 and then use the low 4 bits of a function pointer to
encode some metainfo.  Then ASM_OUTPUT_ALIGN_WITH_NOP causes the functions
to not really be aligned to a 16-byte boundary, causing some breakage.

2. With Binutils-2.42, ASM_OUTPUT_ALIGN_WITH_NOP can cause illegal
opcodes.  For example:

.globl _start
_start:
.balign 32
nop
nop
nop
addi.d $a0, $r0, 1
.balign 16,54525952,4
addi.d $a0, $a0, 1

is assembled and linked to:

0220 <_start>:
 220:   0340    nop
 224:   0340    nop
 228:   0340    nop
 22c:   02c00404    li.d    $a0, 1
 230:   .word   0x   # <== OOPS!
 234:   02c00484    addi.d  $a0, $a0, 1

Arguably this is a bug in GAS (it should at least error out for the
unsupported case where .balign 16,54525952,4 appears with -mrelax; I'd
prefer it to support the 3-operand .align directive even with -mrelax for
reasons I've given in [1]).  But we can at least work around it by
removing ASM_OUTPUT_ALIGN_WITH_NOP to allow using GCC 13.3 with Binutils
2.42.

3. Without ASM_OUTPUT_ALIGN_WITH_NOP, GCC just outputs something like
".align 5" which works as expected since Binutils-2.38.

4. GCC < 14 does not have a default setting of -falign-*, so changing
this won't affect anyone who does not specify -falign-* explicitly.

[1]:https://github.com/loongson-community/discussions/issues/41#issuecomment-1925872603

Is it OK to backport r14-4674 into releases/gcc-12 and releases/gcc-13
then?
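As an illustration of reason 1, here is a minimal sketch of the low-bit tagging scheme that relies on 16-byte function alignment.  All names are hypothetical, and it works on plain integer addresses rather than real function pointers so that the behaviour is deterministic.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical: with 16-byte alignment the low 4 address bits are zero,
// so they can carry a small tag.
constexpr std::uintptr_t kTagMask = 0xF;

constexpr bool is_aligned16 (std::uintptr_t addr)
{
  return (addr & kTagMask) == 0;
}

// Store TAG in the low bits of ADDR.  Only lossless if ADDR is aligned.
constexpr std::uintptr_t pack (std::uintptr_t addr, unsigned tag)
{
  return addr | (tag & kTagMask);
}

constexpr std::uintptr_t unpack_addr (std::uintptr_t packed)
{
  return packed & ~kTagMask;
}

constexpr unsigned unpack_tag (std::uintptr_t packed)
{
  return static_cast<unsigned> (packed & kTagMask);
}
```

If a function ends up only 4-byte aligned (say at 0x1234), both the recovered address and the recovered tag are corrupted, which is the breakage described above.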

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University

New Chinese (simplified) PO file for 'gcc' (version 13.2.0)

2024-02-06 Thread Translation Project Robot

Hello, gentle maintainer.

This is a message from the Translation Project robot.

A revised PO file for textual domain 'gcc' has been submitted
by the Chinese (simplified) team of translators.  The file is available at:

https://translationproject.org/latest/gcc/zh_CN.po

(This file, 'gcc-13.2.0.zh_CN.po', has just now been sent to you in
a separate email.)

All other PO files for your package are available in:

https://translationproject.org/latest/gcc/

Please consider including all of these in your next release, whether
official or a pretest.

Whenever you have a new distribution with a new version number ready,
containing a newer POT file, please send the URL of that distribution
tarball to the address below.  The tarball may be just a pretest or a
snapshot, it does not even have to compile.  It is just used by the
translators when they need some extra translation context.

The following HTML page has been updated:

https://translationproject.org/domain/gcc.html

If any question arises, please contact the translation coordinator.

Thank you for all your work,

The Translation Project robot, in the
name of your translation coordinator.

[committed] aarch64: Fix build against libc++ in c++11 mode [PR113763]

2024-02-06 Thread Jakub Jelinek

Hi!

The std::pair ctor used in the tiles constexpr variable is only constexpr
in C++14 and later; it works with libstdc++ because it is marked constexpr
there even in C++11 mode.

The following patch fixes it by using an unnamed local class instead of
std::pair, and additionally changes the first element from unsigned int to
unsigned char because 0xff has to fit into unsigned char on all hosts.

Bootstrapped/regtested on aarch64-linux, preapproved by Richard in the PR,
committed to trunk.
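For readers who want to try the idea outside GCC, here is a standalone sketch of the same loop using an unnamed aggregate, which is a literal type in C++11 regardless of the standard library.  The function name, the std::string return value and the fourth table entry ({ 0x01, 'd' }) are assumptions made for illustration; they are not visible in the hunk below, nor is the per-iteration shift and index increment.

```cpp
#include <cassert>
#include <string>

// Hypothetical standalone variant: returns the tile list for MASK_VAL,
// e.g. "{ za0.d, za1.d }", or an empty string if no tile matches.
std::string zero_za_operands (unsigned int mask_val)
{
  // Unnamed struct instead of std::pair: brace-initialised aggregate,
  // so no constexpr constructor is needed.
  static constexpr struct { unsigned char mask; char letter; } tiles[] = {
    { 0xff, 'b' },
    { 0x55, 'h' },
    { 0x11, 's' },
    { 0x01, 'd' }  // assumed entry, not shown in the patch context
  };

  std::string out;
  const char *prefix = "{ ";
  for (auto &tile : tiles)
    {
      // Widen to unsigned int so the shifted mask can reach 0x100.
      unsigned int tile_mask = tile.mask;
      unsigned int tile_index = 0;
      while (tile_mask < 0x100)
        {
          if ((mask_val & tile_mask) == tile_mask)
            {
              out += prefix;
              out += "za" + std::to_string (tile_index) + "." + tile.letter;
              prefix = ", ";
              mask_val &= ~tile_mask;
            }
          tile_mask <<= 1;
          tile_index += 1;
        }
    }
  return out.empty () ? out : out + " }";
}
```

The widening to unsigned int mirrors the patch: with an unsigned char element, `auto tile_mask` could never reach 0x100 and the loop would not terminate.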

2024-02-06  Jakub Jelinek  

PR target/113763
* config/aarch64/aarch64.cc (aarch64_output_sme_zero_za): Change tiles
element from std::pair to an unnamed struct.
Adjust uses of tile range variable.

--- gcc/config/aarch64/aarch64.cc.jj	2024-02-06 08:43:14.899888072 +0100
+++ gcc/config/aarch64/aarch64.cc   2024-02-06 11:41:47.855049148 +0100
@@ -13130,7 +13130,7 @@ aarch64_output_sme_zero_za (rtx mask)
   if (mask_val == 0xff)
 return "zero\t{ za }";
 
-  static constexpr std::pair<unsigned int, char> tiles[] = {
+  static constexpr struct { unsigned char mask; char letter; } tiles[] = {
 { 0xff, 'b' },
 { 0x55, 'h' },
 { 0x11, 's' },
@@ -13144,14 +13144,14 @@ aarch64_output_sme_zero_za (rtx mask)
   const char *prefix = "{ ";
   for (auto &tile : tiles)
 {
-  auto tile_mask = tile.first;
+  unsigned int tile_mask = tile.mask;
   unsigned int tile_index = 0;
   while (tile_mask < 0x100)
{
  if ((mask_val & tile_mask) == tile_mask)
{
  i += snprintf (buffer + i, sizeof (buffer) - i, "%sza%d.%c",
-prefix, tile_index, tile.second);
+prefix, tile_index, tile.letter);
  prefix = ", ";
  mask_val &= ~tile_mask;
}

Jakub

[committed] aarch64: Fix function multiversioning mangling

2024-02-06 Thread Andrew Carlotti

It would be neater if the middle end for target_clones used a target
hook for version name mangling, so we only do version name mangling
once.  However, that would require more intrusive refactoring that will
have to wait till Stage 1.

I've made the changes Richard Sandiford requested, and merged the new tests
into this patch. I'd have sent this sooner, but my initial testing failed due
to a broken master.  This is now successfully bootstrapped, regression tested
and pushed to master.

gcc/ChangeLog:

* config/aarch64/aarch64.cc (aarch64_mangle_decl_assembler_name):
Move before new caller, and add ".default" suffix.
(get_suffixed_assembler_name): New.
(make_resolver_func): Use get_suffixed_assembler_name.
(aarch64_generate_version_dispatcher_body): Redo name mangling.

gcc/testsuite/ChangeLog:

* g++.target/aarch64/mv-symbols1.C: New test.
* g++.target/aarch64/mv-symbols2.C: Ditto.
* g++.target/aarch64/mv-symbols3.C: Ditto.
* g++.target/aarch64/mv-symbols4.C: Ditto.
* g++.target/aarch64/mv-symbols5.C: Ditto.
* g++.target/aarch64/mvc-symbols1.C: Ditto.
* g++.target/aarch64/mvc-symbols2.C: Ditto.
* g++.target/aarch64/mvc-symbols3.C: Ditto.
* g++.target/aarch64/mvc-symbols4.C: Ditto.
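The naming scheme from the ChangeLog above can be sketched outside GCC with plain std::string.  This is an illustration only: both function names are hypothetical, the feature names in the usage are placeholders, and features are emitted in the order given rather than in GCC's internal feature order.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical analogue of the version mangling: the default version
// gets ".default", other versions get "._" followed by "M<feature>"
// for each enabled feature.
std::string mangle_version (const std::string &base,
                            const std::vector<std::string> &features)
{
  if (features.empty ())
    return base + ".default";

  std::string name = base + "._";
  for (const std::string &feature : features)
    {
      name += "M";
      name += feature;
    }
  return name;
}

// Hypothetical analogue of get_suffixed_assembler_name: strip a trailing
// ".default" if present, then append SUFFIX.
std::string with_suffix (std::string name, const std::string &suffix)
{
  const std::string def = ".default";
  if (name.size () >= def.size ()
      && name.compare (name.size () - def.size (), def.size (), def) == 0)
    name.resize (name.size () - def.size ());
  return name + suffix;
}
```

Stripping first makes the helper idempotent with respect to ".default", so the resolver name comes out the same whether or not the default version has already been mangled.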


diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index 4556b8dd5045cc992f9e392e0dff903267adca0e..356695feb06257a477c72eb359c7628f8ecea963 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -19870,6 +19870,62 @@ build_ifunc_arg_type ()
   return pointer_type;
 }
 
+/* Implement TARGET_MANGLE_DECL_ASSEMBLER_NAME, to add function multiversioning
+   suffixes.  */
+
+tree
+aarch64_mangle_decl_assembler_name (tree decl, tree id)
+{
+  /* For function version, add the target suffix to the assembler name.  */
+  if (TREE_CODE (decl) == FUNCTION_DECL
+  && DECL_FUNCTION_VERSIONED (decl))
+{
+      aarch64_fmv_feature_mask feature_mask = get_feature_mask_for_version (decl);
+
+  std::string name = IDENTIFIER_POINTER (id);
+
+  /* For the default version, append ".default".  */
+  if (feature_mask == 0ULL)
+   {
+ name += ".default";
+ return get_identifier (name.c_str());
+   }
+
+  name += "._";
+
+  for (int i = 0; i < FEAT_MAX; i++)
+   {
+ if (feature_mask & aarch64_fmv_feature_data[i].feature_mask)
+   {
+ name += "M";
+ name += aarch64_fmv_feature_data[i].name;
+   }
+   }
+
+  if (DECL_ASSEMBLER_NAME_SET_P (decl))
+   SET_DECL_RTL (decl, NULL);
+
+  id = get_identifier (name.c_str());
+}
+  return id;
+}
+
+/* Return an identifier for the base assembler name of a versioned function.
+   This is computed by taking the default version's assembler name, and
+   stripping off the ".default" suffix if it's already been appended.  */
+
+static tree
+get_suffixed_assembler_name (tree default_decl, const char *suffix)
+{
+  std::string name = IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (default_decl));
+
+  auto size = name.size ();
+  if (size >= 8 && name.compare (size - 8, 8, ".default") == 0)
+name.resize (size - 8);
+  name += suffix;
+  return get_identifier (name.c_str());
+}
+
 /* Make the resolver function decl to dispatch the versions of
a multi-versioned function,  DEFAULT_DECL.  IFUNC_ALIAS_DECL is
ifunc alias that will point to the created resolver.  Create an
@@ -19883,8 +19939,9 @@ make_resolver_func (const tree default_decl,
 {
   tree decl, type, t;
 
-  /* Create resolver function name based on default_decl.  */
-  tree decl_name = clone_function_name (default_decl, "resolver");
+  /* Create resolver function name based on default_decl.  We need to remove an
+ existing ".default" suffix if this has already been appended.  */
+  tree decl_name = get_suffixed_assembler_name (default_decl, ".resolver");
   const char *resolver_name = IDENTIFIER_POINTER (decl_name);
 
   /* The resolver function should have signature
@@ -20231,6 +20288,28 @@ aarch64_generate_version_dispatcher_body (void *node_p)
   dispatch_function_versions (resolver_decl, &fn_ver_vec, &empty_bb);
   cgraph_edge::rebuild_edges ();
   pop_cfun ();
+
+  /* Fix up symbol names.  First we need to obtain the base name, which may
+ have already been mangled.  */
+  tree base_name = get_suffixed_assembler_name (default_ver_decl, "");
+
+  /* We need to redo the version mangling on the non-default versions for the
+ target_clones case.  Redoing the mangling for the target_version case is
+ redundant but does no harm.  We need to skip the default version, because
+ expand_clones will append ".default" later; fortunately that suffix is the
+ one we want anyway.  */
+  for (versn_info = node_version_info->next->next; versn_info;
+   versn_info = versn_info->next)
+{
+  tree version_decl = versn_info->this_node->decl;
+  tree na

Re: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-06 Thread Richard Biener

On Tue, 6 Feb 2024, Jakub Jelinek wrote:

> On Mon, Feb 05, 2024 at 03:31:20PM +0100, Richard Biener wrote:
> > On Mon, 5 Feb 2024, Tamar Christina wrote:
> > > > It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
> > > > why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
> > > > at all?
> > > > 
> > > 
> > > As discussed on IRC, then how about this one.
> > > Incremental building passed all tests and bootstrap is running.
> > > 
> > > Ok for master if bootstrap and regtesting clean?
> > > 
> > > Thanks,
> > > Tamar
> > > 
> > > gcc/ChangeLog:
> > > 
> > >   PR tree-optimization/113731
> > >   * gimple-iterator.cc (gsi_move_before): Take new parameter for update
> > >   method.
> > >   * gimple-iterator.h (gsi_move_before): Default new param to
> > >   GSI_SAME_STMT.
> > >   * tree-vect-loop.cc (move_early_exit_stmts): Call gsi_move_before with
> > >   GSI_NEW_STMT.
> > > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > >   PR tree-optimization/113731
> > >   * gcc.dg/vect/vect-early-break_111-pr113731.c: New test.
> 
> So like following (Tobias asked for it on IRC)?
> Seems it FAILs for me on x86_64-linux unless using -msse4.2 (or -msse4),
> dg-add-options vect_early_break adds just -msse4.1.
> I believe the difference is in presence/absence of the PCMPGTQ instruction,
> the loop is comparing pointers (other than equality comparison) and so needs
> it, while in SSE4.1 one only has PCMPGT{B,W,D}, so it can > compare 16-byte
> vectors with 8-bit, 16-bit or 32-bit elements.
> 
> Shall we add
> /* { dg-additional-options "-msse4.2" { target i?86-*-* x86_64-*-* } } */
> to the testcase?

or add vect_long and/or vect_early_break_long?

> --- gcc/gimple-iterator.h.jj  2024-01-03 11:51:33.090709553 +0100
> +++ gcc/gimple-iterator.h 2024-02-06 15:23:40.732532207 +0100
> @@ -86,7 +86,8 @@ extern gimple_stmt_iterator gsi_for_stmt
>  extern gimple_stmt_iterator gsi_for_stmt (gimple *, gimple_seq *);
>  extern gphi_iterator gsi_for_phi (gphi *);
>  extern void gsi_move_after (gimple_stmt_iterator *, gimple_stmt_iterator *);
> -extern void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *);
> +extern void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *,
> +  gsi_iterator_update = GSI_SAME_STMT);
>  extern void gsi_move_to_bb_end (gimple_stmt_iterator *, basic_block);
>  extern void gsi_insert_on_edge (edge, gimple *);
>  extern void gsi_insert_seq_on_edge (edge, gimple_seq);
> --- gcc/gimple-iterator.cc.jj 2024-01-03 11:51:42.016585669 +0100
> +++ gcc/gimple-iterator.cc	2024-02-06 15:23:12.274926647 +0100
> @@ -666,10 +666,11 @@ gsi_move_after (gimple_stmt_iterator *fr
>  
>  
>  /* Move the statement at FROM so it comes right before the statement
> -   at TO.  */
> +   at TO using method M.  */
>  
>  void
> -gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to)
> +gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to,
> +  gsi_iterator_update m)
>  {
>gimple *stmt = gsi_stmt (*from);
>gsi_remove (from, false);
> @@ -677,7 +678,7 @@ gsi_move_before (gimple_stmt_iterator *f
>/* For consistency with gsi_move_after, it might be better to have
>   GSI_NEW_STMT here; however, that breaks several places that expect
>   that TO does not change.  */
> -  gsi_insert_before (to, stmt, GSI_SAME_STMT);
> +  gsi_insert_before (to, stmt, m);
>  }
>  
>  
> --- gcc/tree-vect-loop.cc.jj  2024-01-25 09:06:34.116831262 +0100
> +++ gcc/tree-vect-loop.cc 2024-02-06 15:22:49.268245536 +0100
> @@ -11800,8 +11800,7 @@ move_early_exit_stmts (loop_vec_info loo
>   dump_printf_loc (MSG_NOTE, vect_location, "moving stmt %G", stmt);
>  
>gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
> -  gsi_move_before (&stmt_gsi, &dest_gsi);
> -  gsi_prev (&dest_gsi);
> +  gsi_move_before (&stmt_gsi, &dest_gsi, GSI_NEW_STMT);
>  }
>  
>/* Update all the stmts with their new reaching VUSES.  */
> --- gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c.jj	2024-02-06 15:22:49.248245813 +0100
> +++ gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c	2024-02-06 15:22:49.248245813 +0100
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-add-options vect_early_break } */
> +/* { dg-require-effective-target vect_early_break } */
> +/* { dg-require-effective-target vect_int } */
> +
> +/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
> +
> +char* inet_net_pton_ipv4_bits;
> +char inet_net_pton_ipv4_odst;
> +void __errno_location();
> +void inet_net_pton_ipv4();
> +void inet_net_pton() { inet_net_pton_ipv4(); }
> +void inet_net_pton_ipv4(char *dst, int size) {
> +  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
> +if (size-- <= 0)
> +  goto emsgsize;
> +*dst++ = '\0';
> +  }
> +emsgsize:
> +  __errno_location();
> +}
> 
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstr

Re: [PATCH]middle-end: fix ICE when moving statements to empty BB [PR113731]

2024-02-06 Thread Jakub Jelinek

On Mon, Feb 05, 2024 at 03:31:20PM +0100, Richard Biener wrote:
> On Mon, 5 Feb 2024, Tamar Christina wrote:
> > > It looks like LOOP_VINFO_EARLY_BRK_STORES is "reverse"?  Is that
> > > why you are doing gsi_move_before + gsi_prev?  Why do gsi_prev
> > > at all?
> > > 
> > 
> > As discussed on IRC, then how about this one.
> > Incremental building passed all tests and bootstrap is running.
> > 
> > Ok for master if bootstrap and regtesting clean?
> > 
> > Thanks,
> > Tamar
> > 
> > gcc/ChangeLog:
> > 
> > PR tree-optimization/113731
> > * gimple-iterator.cc (gsi_move_before): Take new parameter for update
> > method.
> > * gimple-iterator.h (gsi_move_before): Default new param to
> > GSI_SAME_STMT.
> > * tree-vect-loop.cc (move_early_exit_stmts): Call gsi_move_before with
> > GSI_NEW_STMT.
> > 
> > gcc/testsuite/ChangeLog:
> > 
> > PR tree-optimization/113731
> > * gcc.dg/vect/vect-early-break_111-pr113731.c: New test.

So like following (Tobias asked for it on IRC)?
Seems it FAILs for me on x86_64-linux unless using -msse4.2 (or -msse4),
dg-add-options vect_early_break adds just -msse4.1.
I believe the difference is in presence/absence of the PCMPGTQ instruction,
the loop is comparing pointers (other than equality comparison) and so needs
it, while in SSE4.1 one only has PCMPGT{B,W,D}, so it can > compare 16-byte
vectors with 8-bit, 16-bit or 32-bit elements.

Shall we add
/* { dg-additional-options "-msse4.2" { target i?86-*-* x86_64-*-* } } */
to the testcase?
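For readers unfamiliar with the iterator-update argument mentioned in the ChangeLog, here is a rough analogy using std::list in place of a GIMPLE sequence.  All names are hypothetical and this is not GCC code; it only shows how "same statement" versus "new statement" changes where the next moved element lands.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <vector>

// Hypothetical stand-ins for GSI_SAME_STMT / GSI_NEW_STMT.
enum class IterUpdate { SameStmt, NewStmt };

using Seq = std::list<int>;

// Move the element at FROM so it comes right before TO.  With NewStmt,
// TO is updated to point at the moved element.  Requires FROM != TO.
void move_before (Seq &seq, Seq::iterator from, Seq::iterator &to,
                  IterUpdate m = IterUpdate::SameStmt)
{
  // splice relinks the node; FROM stays valid and now precedes TO.
  seq.splice (to, seq, from);
  if (m == IterUpdate::NewStmt)
    to = from;
}

// Move 3 and then 4 in front of a destination that starts at element 1.
std::vector<int> move_two (IterUpdate m)
{
  Seq seq = {1, 2, 3, 4};
  Seq::iterator dest = seq.begin ();
  move_before (seq, std::next (seq.begin (), 2), dest, m);
  move_before (seq, std::prev (seq.end ()), dest, m);
  return std::vector<int> (seq.begin (), seq.end ());
}
```

With NewStmt each later move lands in front of the previously moved element, which is the effect the old gsi_move_before + gsi_prev pair was trying to achieve.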

--- gcc/gimple-iterator.h.jj	2024-01-03 11:51:33.090709553 +0100
+++ gcc/gimple-iterator.h   2024-02-06 15:23:40.732532207 +0100
@@ -86,7 +86,8 @@ extern gimple_stmt_iterator gsi_for_stmt
 extern gimple_stmt_iterator gsi_for_stmt (gimple *, gimple_seq *);
 extern gphi_iterator gsi_for_phi (gphi *);
 extern void gsi_move_after (gimple_stmt_iterator *, gimple_stmt_iterator *);
-extern void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *);
+extern void gsi_move_before (gimple_stmt_iterator *, gimple_stmt_iterator *,
+gsi_iterator_update = GSI_SAME_STMT);
 extern void gsi_move_to_bb_end (gimple_stmt_iterator *, basic_block);
 extern void gsi_insert_on_edge (edge, gimple *);
 extern void gsi_insert_seq_on_edge (edge, gimple_seq);
--- gcc/gimple-iterator.cc.jj   2024-01-03 11:51:42.016585669 +0100
+++ gcc/gimple-iterator.cc  2024-02-06 15:23:12.274926647 +0100
@@ -666,10 +666,11 @@ gsi_move_after (gimple_stmt_iterator *fr
 
 
 /* Move the statement at FROM so it comes right before the statement
-   at TO.  */
+   at TO using method M.  */
 
 void
-gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to)
+gsi_move_before (gimple_stmt_iterator *from, gimple_stmt_iterator *to,
+gsi_iterator_update m)
 {
   gimple *stmt = gsi_stmt (*from);
   gsi_remove (from, false);
@@ -677,7 +678,7 @@ gsi_move_before (gimple_stmt_iterator *f
   /* For consistency with gsi_move_after, it might be better to have
  GSI_NEW_STMT here; however, that breaks several places that expect
  that TO does not change.  */
-  gsi_insert_before (to, stmt, GSI_SAME_STMT);
+  gsi_insert_before (to, stmt, m);
 }
 
 
--- gcc/tree-vect-loop.cc.jj	2024-01-25 09:06:34.116831262 +0100
+++ gcc/tree-vect-loop.cc   2024-02-06 15:22:49.268245536 +0100
@@ -11800,8 +11800,7 @@ move_early_exit_stmts (loop_vec_info loo
dump_printf_loc (MSG_NOTE, vect_location, "moving stmt %G", stmt);
 
   gimple_stmt_iterator stmt_gsi = gsi_for_stmt (stmt);
-  gsi_move_before (&stmt_gsi, &dest_gsi);
-  gsi_prev (&dest_gsi);
+  gsi_move_before (&stmt_gsi, &dest_gsi, GSI_NEW_STMT);
 }
 
   /* Update all the stmts with their new reaching VUSES.  */
--- gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c.jj	2024-02-06 15:22:49.248245813 +0100
+++ gcc/testsuite/gcc.dg/vect/vect-early-break_111-pr113731.c	2024-02-06 15:22:49.248245813 +0100
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-add-options vect_early_break } */
+/* { dg-require-effective-target vect_early_break } */
+/* { dg-require-effective-target vect_int } */
+
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" } } */
+
+char* inet_net_pton_ipv4_bits;
+char inet_net_pton_ipv4_odst;
+void __errno_location();
+void inet_net_pton_ipv4();
+void inet_net_pton() { inet_net_pton_ipv4(); }
+void inet_net_pton_ipv4(char *dst, int size) {
+  while ((inet_net_pton_ipv4_bits > dst) & inet_net_pton_ipv4_odst) {
+if (size-- <= 0)
+  goto emsgsize;
+*dst++ = '\0';
+  }
+emsgsize:
+  __errno_location();
+}


Jakub

Re: [PATCH 0/4] Add DF_LIVE_SUBREG data and apply to IRA and LRA

2024-02-06 Thread Vladimir Makarov




On 2/5/24 11:10, Jeff Law wrote:
> On 2/5/24 00:01, Lehua Ding wrote:
>>> For SPEC INT 2017, when using upstream GCC (without these patches),
>>> I get a coredump when training the peak case, so no data yet. The
>>> cause of the core dump still needs to be investigated.
>>
>> Typo, SPEC INT 2017 -> SPEC FP 2017
>> Also, there is bad news: the score of specint 2017 (with these
>> patches) has dropped, which is a bit strange, and I need to locate
>> the cause.
>
> Just a note.  I doubt this will get much traction from a review
> standpoint until gcc-14 is basically out the door.
>
> My recommendation is to continue development, bugfixing, cleanup, etc
> between now and then.  Consider creating a branch for the work in the
> upstream repo.



Thank you for posting this work.  The compilation time improvement is a 
surprise for me and very encouraging.


I agree with Jeff's recommendation to create a branch, as most probably
some people (at least me :) would like to try this on their own set of
benchmarks.


I am planning to review the RA part of these patches at the beginning of
April.  Still, when I have spare time I'll look at the patches and could
give some feedback even earlier.

[PATCH] AArch64: Update system register database.

2024-02-06 Thread Victor Do Nascimento

With the release of Binutils 2.42, this brings the level of
system-register support in GCC in line with the current
state-of-the-art in Binutils, ensuring everything available in
Binutils is plainly accessible from GCC.

Where Binutils uses a more detailed description of which features are
responsible for enabling a given system register, GCC aliases the
binutils-equivalent feature flag macro constant to that of the base
architecture implementing the feature, resulting in entries such as

  #define AARCH64_FL_S2PIE AARCH64_FL_V8_9A

in `aarch64.h', thus ensuring that the Binutils `aarch64-sys-regs.def'
file can be understood by GCC without the need for modification.
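A minimal sketch of how such aliasing lets an unmodified .def-style table compile is shown below.  The bit assignments, the CPENC packing, the F_ARCHEXT value, the table struct and the lookup helper are all assumptions made for illustration; only the SYSREG entries and the aliasing idea come from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical base-architecture bits.
#define AARCH64_FL_V8_1A (1ULL << 0)
#define AARCH64_FL_V8_9A (1ULL << 1)
// Alias the fine-grained Binutils feature to the base architecture.
#define AARCH64_FL_AIE AARCH64_FL_V8_9A

#define AARCH64_NO_FEATURES 0ULL
#define AARCH64_FEATURE(X) AARCH64_FL_##X
#define F_ARCHEXT 1u  /* hypothetical flag value */
/* Hypothetical packing of the five encoding fields.  */
#define CPENC(op0, op1, crn, crm, op2) \
  (((op0) << 14) | ((op1) << 11) | ((crn) << 7) | ((crm) << 3) | (op2))

struct sysreg_entry
{
  const char *name;
  unsigned encoding;
  unsigned flags;
  std::uint64_t arch;
};

// X-macro expansion: each SYSREG line becomes one table element.
#define SYSREG(NAME, ENC, FLAGS, ARCH) { NAME, ENC, FLAGS, ARCH },
static const sysreg_entry sysregs[] = {
  SYSREG ("amair_el12", CPENC (3,5,10,3,0), F_ARCHEXT, AARCH64_FEATURE (V8_1A))
  SYSREG ("amair_el2", CPENC (3,4,10,3,0), 0, AARCH64_NO_FEATURES)
  SYSREG ("amair2_el1", CPENC (3,0,10,3,1), F_ARCHEXT, AARCH64_FEATURE (AIE))
};
#undef SYSREG

// A register is usable if every feature bit it requires is enabled.
bool sysreg_available (const char *name, std::uint64_t enabled)
{
  for (const sysreg_entry &entry : sysregs)
    if (std::strcmp (entry.name, name) == 0)
      return (entry.arch & ~enabled) == 0;
  return false;
}
```

Because AARCH64_FEATURE (AIE) expands to the V8_9A bit, "amair2_el1" becomes available as soon as the base architecture is enabled, without touching the table itself.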

To accompany the addition of the new system registers, a new test is
added confirming they were successfully added to the list of
recognized registers.

gcc/ChangeLog:

* config/aarch64/aarch64-sys-regs.def: Copy from Binutils.
* config/aarch64/aarch64.h (AARCH64_FL_AIE): New.
(AARCH64_FL_DEBUGv8p9): Likewise.
(AARCH64_FL_FGT2): Likewise.
(AARCH64_FL_ITE): Likewise.
(AARCH64_FL_PFAR): Likewise.
(AARCH64_FL_PMUv3_ICNTR): Likewise.
(AARCH64_FL_PMUv3_SS): Likewise.
(AARCH64_FL_PMUv3p9): Likewise.
(AARCH64_FL_RASv2): Likewise.
(AARCH64_FL_S1PIE): Likewise.
(AARCH64_FL_S1POE): Likewise.
(AARCH64_FL_S2PIE): Likewise.
(AARCH64_FL_S2POE): Likewise.
(AARCH64_FL_SCTLR2): Likewise.
(AARCH64_FL_SEBEP): Likewise.
(AARCH64_FL_SPE_FDS): Likewise.
(AARCH64_FL_TCR2): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/aarch64/acle/rwsr-armv8p9.c: New.
---
 gcc/config/aarch64/aarch64-sys-regs.def   | 85 
 gcc/config/aarch64/aarch64.h  | 20 
 .../gcc.target/aarch64/acle/rwsr-armv8p9.c| 99 +++
 3 files changed, 204 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/aarch64/acle/rwsr-armv8p9.c

diff --git a/gcc/config/aarch64/aarch64-sys-regs.def b/gcc/config/aarch64/aarch64-sys-regs.def
index fffc35f72c8..6a948171d6e 100644
--- a/gcc/config/aarch64/aarch64-sys-regs.def
+++ b/gcc/config/aarch64/aarch64-sys-regs.def
@@ -54,6 +54,10 @@
   SYSREG ("amair_el12",	CPENC (3,5,10,3,0),	F_ARCHEXT,	AARCH64_FEATURE (V8_1A))
   SYSREG ("amair_el2",	CPENC (3,4,10,3,0),	0,	AARCH64_NO_FEATURES)
   SYSREG ("amair_el3",	CPENC (3,6,10,3,0),	0,	AARCH64_NO_FEATURES)
+  SYSREG ("amair2_el1",	CPENC (3,0,10,3,1),	F_ARCHEXT,	AARCH64_FEATURE (AIE))
+  SYSREG ("amair2_el12",	CPENC (3,5,10,3,1),	F_ARCHEXT,	AARCH64_FEATURE (AIE))
+  SYSREG ("amair2_el2",	CPENC (3,4,10,3,1),	F_ARCHEXT,	AARCH64_FEATURE (AIE))
+  SYSREG ("amair2_el3",	CPENC (3,6,10,3,1),	F_ARCHEXT,	AARCH64_FEATURE (AIE))
   SYSREG ("amcfgr_el0",	CPENC (3,3,13,2,1),	F_REG_READ|F_ARCHEXT,	AARCH64_FEATURE (V8_4A))
   SYSREG ("amcg1idr_el0",	CPENC (3,3,13,2,6),	F_REG_READ|F_ARCHEXT,	AARCH64_FEATURE (V8_6A))
   SYSREG ("amcgcr_el0",	CPENC (3,3,13,2,2),	F_REG_READ|F_ARCHEXT,	AARCH64_FEATURE (V8_4A))
@@ -400,6 +404,7 @@
   SYSREG ("erxaddr_el1",	CPENC (3,0,5,4,3),	F_ARCHEXT,	AARCH64_FEATURE (RAS))
   SYSREG ("erxctlr_el1",	CPENC (3,0,5,4,1),	F_ARCHEXT,	AARCH64_FEATURE (RAS))
   SYSREG ("erxfr_el1",	CPENC (3,0,5,4,0),	F_REG_READ|F_ARCHEXT,	AARCH64_FEATURE (RAS))
+  SYSREG ("erxgsr_el1",	CPENC (3,0,5,3,2),	F_REG_READ|F_ARCHEXT,	AARCH64_FEATURE (RASv2))
   SYSREG ("erxmisc0_el1",	CPENC (3,0,5,5,0),	F_ARCHEXT,	AARCH64_FEATURE (RAS))
   SYSREG ("erxmisc1_el1",	CPENC (3,0,5,5,1),	F_ARCHEXT,	AARCH64_FEATURE (RAS))
   SYSREG ("erxmisc2_el1",	CPENC (3,0,5,5,2),	F_ARCHEXT,	AARCH64_FEATURE (RAS))
@@ -438,10 +443,14 @@
   SYSREG ("hcr_el2",	CPENC (3,4,1,1,0),	0,	AARCH64_NO_FEATURES)
   SYSREG ("hcrx_el2",	CPENC (3,4,1,2,2),	F_ARCHEXT,	AARCH64_FEATURE (V8_7A))
   SYSREG ("hdfgrtr_el2",	CPENC (3,4,3,1,4),	F_ARCHEXT,	AARCH64_FEATURE (V8_6A))
+  SYSREG ("hdfgrtr2_el2",	CPENC (3,4,3,1,0),	F_ARCHEXT,	AARCH64_FEATURE (FGT2))
   SYSREG ("hdfgwtr_el2",	CPENC (3,4,3,1,5),	F_ARCHEXT,	AARCH64_FEATURE (V8_6A))
+  SYSREG ("hdfgwtr2_el2",	CPENC (3,4,3,1,1),	F_ARCHEXT,	AARCH64_FEATURE (FGT2))
   SYSREG ("hfgitr_el2",	CPENC (3,4,1,1,6),	F_ARCHEXT,	AARCH64_FEATURE (V8_6A))
   SYSREG ("hfgrtr_el2",	CPENC (3,4,1,1,4),	F_ARCHEXT,	AARCH64_FEATURE (V8_6A))
+  SYSREG ("hfgrt

Re: [PATCH] RISC-V: Allow LICM hoist POLY_INT configuration code sequence

2024-02-06 Thread Robin Dapp

> The root cause is this following RTL pattern, after fwprop1:
> 
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
>                 *(const_poly_int:SI [-16, -16])*))
>         (nil)))
> 
> The highlight *(const_poly_int:SI [-16, -16])*
> causes ICE.
> 
> This RTL is because:
> (insn 69 68 71 8 (set (reg:DI 221)
>         (const_poly_int:DI [16, 16])) 208 {*movdi_64bit}
>      (nil))
> (insn 82 78 84 9 (set (reg:DI 230)
>         (sign_extend:DI (minus:SI (subreg/s/v:SI (reg:DI 150 [ niters.10 ]) 0)
>                 (subreg:SI (reg:DI 221) 0 13 {subsi3_extended}            
>                               > (subreg:SI (const_poly_int:SI [-16, 
> -16])) fwprop1 add  (const_poly_int:SI [-16, -16]) reg_equal
>      (expr_list:REG_EQUAL (sign_extend:DI (plus:SI (subreg/s/v:SI (reg:DI 150 
> [ niters.10 ]) 0)
>                 (const_poly_int:SI [-16, -16])))
>         (nil)))

I'm seeing a slightly different pattern but that doesn't change
the problem.

> (set (reg:SI)  (subreg:SI (DI: poly value))) but it causes ICE that I
> mentioned above.

That's indeed a bit more idiomatic and I wouldn't oppose that.

The problem causing the ICE is that we want to simplify a PLUS
with (const_poly_int:SI [16, 16]) and (const_int 0) but the mode
is DImode.  My suspicion is that this is caused by our
addsi3_extended pattern and we fail to deduce the proper mode
for analysis.

I'm just speculating but maybe that's because we assert that a
plus is of the form simple_reg_p (op0) && CONSTANT_P (op1).
Usually, constants don't have a mode and can just be used.
poly_int_csts do have one and need to be explicitly converted
(kind of).

We can only analyze this zero_extended plus at all since Jeff
added the addsi3_extended handling for loop-iv.   Maybe we could
punt like

diff --git a/gcc/loop-iv.cc b/gcc/loop-iv.cc
index eb7e923a38b..796413c25a3 100644
--- a/gcc/loop-iv.cc
+++ b/gcc/loop-iv.cc
@@ -714,6 +714,9 @@ get_biv_step_1 (df_ref def, scalar_int_mode outer_mode, rtx reg,
  if (!simple_reg_p (op0) || !CONSTANT_P (op1))
return false;
 
+ if (CONST_POLY_INT_P (op1) && GET_MODE (op1) != outer_mode)
+   return false;
+

This helps for your test case but I haven't done any further
testing.  I'd think this is relatively safe because it's only
a missed analysis/optimization in the worst case.
Still, generally, I don't see a reason why we wouldn't be able
to analyze this?

Regards
 Robin

Re: [PATCH] libgccjit: Add ability to get CPU features

2024-02-06 Thread Antoni Boucher

David: Ping.

On Tue, 2024-01-30 at 10:50 -0500, Antoni Boucher wrote:
> David: I'm unsure what to do here. It seems we cannot find a
> reviewer.
> Would it help if I show you the code in gccrs that is similar?
> Would it help if I ask someone from gccrs to review this code?
> 
> On Sat, 2024-01-20 at 09:50 -0500, Antoni Boucher wrote:
> > CC-ing Iain in case they can do the review since it is based on how
> > they did it in the D frontend.
> > Could you please do the review?
> > Thanks!
> > 
> > On Thu, 2023-11-09 at 18:04 -0500, David Malcolm wrote:
> > > On Thu, 2023-11-09 at 17:27 -0500, Antoni Boucher wrote:
> > > > Hi.
> > > > This patch adds support for getting the CPU features in
> > > > libgccjit
> > > > (bug
> > > > 112466)
> > > > 
> > > > There's a TODO in the test:
> > > > I'm not sure how to test that gcc_jit_target_info_arch returns
> > > > the
> > > > correct value since it is dependant on the CPU.
> > > > Any idea on how to improve this?
> > > > 
> > > > Also, I created a CStringHash to be able to have a
> > > > std::unordered_set. Is there any built-in way of
> > > > doing
> > > > this?
> > > 
> > > Thanks for the patch.
> > > 
> > > Some high-level questions:
> > > 
> > > Is this specifically about detecting capabilities of the host
> > > that
> > > libgccjit is currently running on? or how the target was
> > > configured
> > > when libgccjit was built?
> > > 
> > > One of the benefits of libgccjit is that, in theory, we support
> > > all
> > > of
> > > the targets that GCC already supports.  Does this patch change
> > > that,
> > > or
> > > is this more about giving client code the ability to determine
> > > capabilities of the specific host being compiled for?
> > > 
> > > I'm nervous about having per-target jit code.  Presumably there's
> > > a
> > > reason that we can't reuse existing target logic here - can you
> > > please
> > > describe what the problem is.  I see that the ChangeLog has:
> > > 
> > > > * config/i386/i386-jit.cc: New file.
> > > 
> > > where i386-jit.cc has almost 200 lines of nontrivial code.  Where
> > > did
> > > this come from?  Did you base it on existing code in our source
> > > tree,
> > > making modifications to fit the new internal API, or did you
> > > write
> > > it
> > > from scratch?  In either case, how onerous would this be for
> > > other
> > > targets?
> > > 
> > > I'm not at expert at target hooks (or at the i386 backend), so if
> > > we
> > > do
> > > go with this approach I'd want someone else to review those parts
> > > of
> > > the patch.
> > > 
> > > Have you verified that GCC builds with this patch with jit *not*
> > > enabled in the enabled languages?
> > > 
> > > [...snip...]
> > > 
> > > A nitpick:
> > > 
> > > > +.. function:: const char * \
> > > > +  gcc_jit_target_info_arch (gcc_jit_target_info *info)
> > > > +
> > > > +   Get the architecture of the currently running CPU.
> > > 
> > > What does this string look like?
> > > How long does the pointer remain valid?
> > > 
> > > Thanks again; hope the above makes sense
> > > Dave
> > > 
> > 
>

RE: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread Li, Pan2

All passed, including overloaded and non-overloaded.

# of expected passes            10885

Pan

From: Li, Pan2
Sent: Tuesday, February 6, 2024 4:17 PM
To: juzhe.zh...@rivai.ai; gcc-patches 
Cc: Wang, Yanzhang ; kito.cheng 
Subject: RE: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when 
empty args

Not yet. It has been a long time since the last run, so I will make sure
there are no surprises from that.

Pan

From: juzhe.zh...@rivai.ai <juzhe.zh...@rivai.ai>
Sent: Tuesday, February 6, 2024 4:11 PM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches <gcc-patches@gcc.gnu.org>
Cc: Li, Pan2 <pan2...@intel.com>; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.cheng <kito.ch...@gmail.com>
Subject: Re: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

Did you run the C compiler on the C++ intrinsic tests?

juzhe.zh...@rivai.ai

From: pan2.li
Date: 2024-02-06 16:09
To: gcc-patches
CC: juzhe.zhong; 
pan2.li; 
yanzhang.wang; 
kito.cheng
Subject: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty 
args
From: Pan Li <pan2...@intel.com>

There is one corner case, similar to the example below:

void test (void)
{
  __riscv_vfredosum_tu ();
}

It triggers an ICE because of the implementation details of overloaded
functions in GCC.  According to the RVV intrinsic doc, there is no such
overloaded function with empty args.  Unfortunately, we register the
empty-args function as overloaded to avoid conflicts.  Thus, the registered
function is what remains after NULL_TREE is returned to the middle-end,
which finally results in an ICE when expanding.  For example:

1. First we register void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this function) returns NULL_TREE.
3. The function registered in step 1 bypasses the args check because it has
   empty args.
4. Finally, we fall into expand_builtin with empty args and hit the ICE.

Here we report an error when an overloaded function is called with empty
args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with empty args
    8 |   __riscv_vfredosum_tu();
      |   ^~~~
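
As a rough illustration of the approach (a standalone sketch, not the code
in riscv-vector-builtins.cc; the enum, the function name, and its signature
are hypothetical), a resolver can reject an empty argument list up front
rather than handing the call on to later stages:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical outcome of resolving an overloaded call.  */
enum resolve_status { RESOLVE_OK, RESOLVE_ERROR };

/* Resolve a call to the overloaded function NAME with NARGS arguments.
   An empty argument list can never match an overload, so diagnose it
   here instead of letting it reach expansion.  The message is written
   to DIAG, which holds DIAG_SIZE bytes.  */
static enum resolve_status
resolve_overloaded_call (const char *name, size_t nargs,
                         char *diag, size_t diag_size)
{
  assert (diag != NULL && diag_size > 0);
  diag[0] = '\0';
  if (nargs == 0)
    {
      snprintf (diag, diag_size,
                "no matching function call to '%s' with empty args", name);
      return RESOLVE_ERROR;
    }
  /* Real overload selection based on argument types would go here.  */
  return RESOLVE_OK;
}
```

The point of the sketch is only the early exit: the error is reported at
resolution time, so nothing with an empty argument list survives to the
expander.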

The tests below passed for this patch.

* The riscv regression tests.

PR target/113766

gcc/ChangeLog:

* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust
the signature of func.
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Make
overloaded func with empty args error.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li <pan2...@intel.com>
---
gcc/config/riscv/riscv-c.cc   |  3 +-
gcc/config/riscv/riscv-protos.h   |  2 +-
gcc/config/riscv/riscv-vector-builtins.cc | 23 -
.../gcc.target/riscv/rvv/base/pr113766-1.c| 85 +++
.../gcc.target/riscv/rvv/base/pr113766-2.c| 48 +++
5 files changed, 155 insertions(+), 6 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-2.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 2e306057347..94c3871c760 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -250,7 +250,8 @@ riscv_resolve_overloaded_builtin (unsigned int 
uncast_location, tree fndecl,
 case RISCV_BUILTIN_GENERAL:
   break;
 case RISCV_BUILTIN_VECTOR:
-  new_fndecl = riscv_vector::resolve_overloaded_builtin (subcode, arglist);
+  new_fndecl = riscv_vector::resolve_overloaded_builtin (loc, subcode,
+  fndecl, arglist);
   break;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b3f0bdb9924..ae1685850ac 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -560,7 +560,7 @@ gimple *gimple_fold_builtin (unsigned int, 
gimple_stmt_iterator *, gcall *);
rtx expand_builtin (unsigned int, tree, rtx);
bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
-tree resolve_overloaded_builtin (unsigned int, vec *);
+tree resolve_overloaded_builtin (location_t, unsigned int, tree, vec *);
bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
bool legitimize_move (rtx, rtx *);
void emit_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc 
b/gcc/config/riscv/riscv-vector-builtins.cc
index 403e1021fd1..efcdc8f1767 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc

Re: [PATCH] RISC-V: Fix infinite compilation of VSETVL PASS

2024-02-06 Thread Robin Dapp

> Testing is running. Ok for trunk if I passed the testing with no
> regression ?
OK.

Regards
 Robin

[PATCH] ranger: Grow BBs in relation oracle as needed [PR113735]

2024-02-06 Thread Aldy Hernandez

The relation oracle grows the internal vector of SSAs as needed, but
due to an oversight was not growing the basic block vector.  This
fixes the oversight.
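
For readers unfamiliar with this class of bug, here is a minimal sketch in
plain C.  It does not use GCC's data structures; `block_table`,
`ensure_capacity`, and `set_slot` are hypothetical names.  It shows a table
indexed by block number that has to be grown before it is indexed:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical table indexed by basic-block number.  */
struct block_table
{
  int *slots;     /* one slot per block index, zero-initialized */
  size_t length;  /* number of valid slots */
};

/* Grow T so that INDEX is a valid slot; new slots are zeroed.
   Returns 0 on success, -1 if allocation fails.  */
static int
ensure_capacity (struct block_table *t, size_t index)
{
  if (index < t->length)
    return 0;
  size_t new_length = t->length ? t->length : 1;
  while (new_length <= index)
    new_length *= 2;
  int *p = realloc (t->slots, new_length * sizeof *p);
  if (!p)
    return -1;
  memset (p + t->length, 0, (new_length - t->length) * sizeof *p);
  t->slots = p;
  t->length = new_length;
  return 0;
}

/* Store VALUE for block INDEX, growing the table first.  Omitting the
   ensure_capacity call is the kind of oversight described above.  */
static int
set_slot (struct block_table *t, size_t index, int value)
{
  if (ensure_capacity (t, index) != 0)
    return -1;
  assert (index < t->length);
  t->slots[index] = value;
  return 0;
}
```

Every write path into the table has to go through the growth check, which
is why adding a single call at the top of the function is enough.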

OK for trunk?

PR tree-optimization/113735

gcc/testsuite/ChangeLog:

* gcc.dg/tree-ssa/pr113735.c: New test.

gcc/ChangeLog:

* value-relation.cc (equiv_oracle::add_equiv_to_block): Call
limit_check().
---
 gcc/testsuite/gcc.dg/tree-ssa/pr113735.c | 19 +++
 gcc/value-relation.cc|  1 +
 2 files changed, 20 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr113735.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr113735.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr113735.c
new file mode 100644
index 000..7b864999277
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr113735.c
@@ -0,0 +1,19 @@
+// { dg-do compile { target bitint } }
+// { dg-options "-O1" }
+
+char b;
+void bar (void);
+
+#if __BITINT_MAXWIDTH__ >= 6110
+void
+foo (_BitInt(6110) j)
+{
+  for (;;)
+{
+  _BitInt(10) k = b % j;
+  for (j = 6; j; --j)
+if (k)
+  bar ();
+}
+}
+#endif
diff --git a/gcc/value-relation.cc b/gcc/value-relation.cc
index 27f9ad61c0e..619ee5f0867 100644
--- a/gcc/value-relation.cc
+++ b/gcc/value-relation.cc
@@ -718,6 +718,7 @@ equiv_oracle::add_equiv_to_block (basic_block bb, bitmap 
equiv_set)
 
   // Check if this is the first time a block has an equivalence added.
   // and create a header block. And set the summary for this block.
+  limit_check (bb);
   if (!m_equiv[bb->index])
 {
   ptr = (equiv_chain *) obstack_alloc (&m_chain_obstack,
-- 
2.43.0

Re: [PATCH] asan: Don't fold some strlens with -fsanitize=address [PR110676]

2024-02-06 Thread Richard Biener

On Tue, 6 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> The UB on the following testcase isn't diagnosed by -fsanitize=address,
> because we see that the array has a single element and optimize the
> strlen to 0.  I think it is fine to assume e.g. for range purposes the
> lower bound for the strlen as long as we don't try to optimize
> strlen (str)
> where we know that it returns [26, 42] to
> 26 + strlen (str + 26), but for the upper bound we really want to punt
> on optimizing that for -fsanitize=address to read all the bytes of the
> string and diagnose if we run to object end etc.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK

> 2024-02-06  Jakub Jelinek  
> 
>   PR sanitizer/110676
>   * gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address
>   reset maxlen to sizetype maximum.
> 
>   * gcc.dg/asan/pr110676.c: New test.
> 
> --- gcc/gimple-fold.cc.jj 2024-01-31 12:24:51.714239628 +0100
> +++ gcc/gimple-fold.cc2024-02-05 21:38:03.829964904 +0100
> @@ -4019,6 +4019,11 @@ gimple_fold_builtin_strlen (gimple_stmt_
>maxlen = wi::to_wide (max_object_size (), prec) - 2;
>  }
>  
> +  /* For -fsanitize=address, don't optimize the upper bound of the
> + length to be able to diagnose UB on non-zero terminated arrays.  */
> +  if (sanitize_flags_p (SANITIZE_ADDRESS))
> +maxlen = wi::max_value (TYPE_PRECISION (sizetype), UNSIGNED);
> +
>if (minlen == maxlen)
>  {
>/* Fold the strlen call to a constant.  */
> --- gcc/testsuite/gcc.dg/asan/pr110676.c.jj   2024-02-05 21:42:43.657104536 
> +0100
> +++ gcc/testsuite/gcc.dg/asan/pr110676.c  2024-02-05 21:42:39.091167524 
> +0100
> @@ -0,0 +1,14 @@
> +/* PR sanitizer/110676 */
> +/* { dg-do run } */
> +/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
> +/* { dg-shouldfail "asan" } */
> +
> +int
> +main ()
> +{
> +  char s[1] = "A";
> +  return __builtin_strlen (s);
> +}
> +
> +/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow on 
> address.*(\n|\r\n|\r)" } */
> +/* { dg-output "READ of size.*" } */
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] lower-bitint: Encode address space qualifiers in VIEW_CONVERT_EXPRs [PR113736]

2024-02-06 Thread Richard Biener

On Tue, 6 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> As discussed in the PR, e.g. build_fold_addr_expr needs TYPE_ADDR_SPACE
> on the outermost reference rather than just on the base, so the
> following patch makes sure to propagate the address space from
> the accessed var to the MEM_REFs and/or VIEW_CONVERT_EXPRs used to
> access those.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks,
Richard.

> 2024-02-06  Jakub Jelinek  
> 
>   PR tree-optimization/113736
>   * gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use
>   var's address space for MEM_REF or VIEW_CONVERT_EXPRs.
> 
>   * gcc.dg/bitint-86.c: New test.
> 
> --- gcc/gimple-lower-bitint.cc.jj 2024-02-05 10:57:32.946941767 +0100
> +++ gcc/gimple-lower-bitint.cc2024-02-05 11:41:28.352436669 +0100
> @@ -601,12 +601,17 @@ bitint_large_huge::limb_access (tree typ
>  {
>tree atype = (tree_fits_uhwi_p (idx)
>   ? limb_access_type (type, idx) : m_limb_type);
> +  tree ltype = m_limb_type;
> +  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (var));
> +  if (as != TYPE_ADDR_SPACE (ltype))
> +ltype = build_qualified_type (ltype, TYPE_QUALS (ltype)
> +  | ENCODE_QUAL_ADDR_SPACE (as));
>tree ret;
>if (DECL_P (var) && tree_fits_uhwi_p (idx))
>  {
>tree ptype = build_pointer_type (strip_array_types (TREE_TYPE (var)));
>unsigned HOST_WIDE_INT off = tree_to_uhwi (idx) * m_limb_size;
> -  ret = build2 (MEM_REF, m_limb_type,
> +  ret = build2 (MEM_REF, ltype,
>   build_fold_addr_expr (var),
>   build_int_cst (ptype, off));
>TREE_THIS_VOLATILE (ret) = TREE_THIS_VOLATILE (var);
> @@ -615,7 +620,7 @@ bitint_large_huge::limb_access (tree typ
>else if (TREE_CODE (var) == MEM_REF && tree_fits_uhwi_p (idx))
>  {
>ret
> - = build2 (MEM_REF, m_limb_type, TREE_OPERAND (var, 0),
> + = build2 (MEM_REF, ltype, TREE_OPERAND (var, 0),
> size_binop (PLUS_EXPR, TREE_OPERAND (var, 1),
> build_int_cst (TREE_TYPE (TREE_OPERAND (var, 1)),
>tree_to_uhwi (idx)
> @@ -633,10 +638,10 @@ bitint_large_huge::limb_access (tree typ
>   {
> unsigned HOST_WIDE_INT nelts
>   = CEIL (tree_to_uhwi (TYPE_SIZE (type)), limb_prec);
> -   tree atype = build_array_type_nelts (m_limb_type, nelts);
> +   tree atype = build_array_type_nelts (ltype, nelts);
> var = build1 (VIEW_CONVERT_EXPR, atype, var);
>   }
> -  ret = build4 (ARRAY_REF, m_limb_type, var, idx, NULL_TREE, NULL_TREE);
> +  ret = build4 (ARRAY_REF, ltype, var, idx, NULL_TREE, NULL_TREE);
>  }
>if (!write_p && !useless_type_conversion_p (atype, m_limb_type))
>  {
> --- gcc/testsuite/gcc.dg/bitint-86.c.jj   2024-02-05 12:11:03.582868774 
> +0100
> +++ gcc/testsuite/gcc.dg/bitint-86.c  2024-02-05 12:15:14.322401544 +0100
> @@ -0,0 +1,40 @@
> +/* PR tree-optimization/113736 */
> +/* { dg-do compile { target bitint } } */
> +/* { dg-options "-O2 -std=gnu23 -w" } */
> +
> +#if __BITINT_MAXWIDTH__ >= 710
> +struct S { _BitInt(710) a; };
> +struct T { struct S b[4]; };
> +
> +#ifdef __x86_64__
> +#define SEG __seg_gs
> +#elif defined __i386__
> +#define SEG __seg_fs
> +#else
> +#define SEG
> +#endif
> +
> +void
> +foo (__seg_gs struct T *p)
> +{
> +  struct S s;
> +  p->b[0] = s;
> +}
> +
> +void
> +bar (__seg_gs struct T *p, _BitInt(710) x, int y, double z)
> +{
> +  p->b[0].a = x + 42;
> +  p->b[1].a = x << y;
> +  p->b[2].a = x >> y;
> +  p->b[3].a = z;
> +}
> +
> +int
> +baz (__seg_gs struct T *p, _BitInt(710) x, _BitInt(710) y)
> +{
> +  return __builtin_add_overflow (x, y, &p->b[1].a);
> +}
> +#else
> +int i;
> +#endif
> 
>   Jakub
> 
> 

-- 
Richard Biener 
SUSE Software Solutions Germany GmbH,
Frankenstrasse 146, 90461 Nuernberg, Germany;
GF: Ivo Totev, Andrew McDonald, Werner Knoblich; (HRB 36809, AG Nuernberg)

Re: [PATCH] tree-ssa-math-opts: Fix up convert_{mult, plusminus}_to_widen [PR113759]

2024-02-06 Thread Richard Biener

On Tue, 6 Feb 2024, Jakub Jelinek wrote:

> Hi!
> 
> On the following testcase we emit invalid stmt:
> error: type mismatch in 'widen_mult_plus_expr'
> 6 | foo (int c, int b)
>   | ^~~
> unsigned long
> int
> unsigned int
> unsigned long
> _31 = WIDEN_MULT_PLUS_EXPR ;
> 
> The recent PR113560 r14-8680 changes tweaked convert_mult_to_widen,
> but didn't change convert_plusminus_to_widen for the
> TREE_TYPE (rhsN) != typeN cases, but looking at this, it was already
> before that change quite weird.
> 
> Earlier in those functions it determines actual_precision and from_unsignedN
> and wants to use that precision and signedness for the operands and
> it used build_and_insert_cast for that (which emits a cast stmt, even for
> INTEGER_CSTs) and later on for INTEGER_CST arguments fold_converted them
> to typeN (which is unclear to me why, because it seems to have assumed
> that TREE_TYPE (rhsN) is typeN, for the actual_precision or from_unsignedN
> cases it would be wrong except that build_and_insert_cast forced a SSA_NAME
> and so it doesn't trigger anymore).
> Now, since r14-8680 it is possible that rhsN also has some other type from
> typeN and we again want to cast.
> 
> The following patch changes this, so that for the differences in
> actual_precision and/or from_unsignedN we actually update typeN and then use
> it as the type to convert the arguments to if it isn't useless, for
> INTEGER_CSTs by just fold_converting, otherwise using build_and_insert_cast.
> And uses useless_type_conversion_p test so that we don't convert unless
> necessary.  Plus by doing that effectively also doing the important part of
> the r14-8680 convert_mult_to_widen changes in convert_plusminus_to_widen.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

OK.

Thanks for cleaning this up.
Richard.

> 2024-02-06  Jakub Jelinek  
> 
>   PR tree-optimization/113759
>   * tree-ssa-math-opts.cc (convert_mult_to_widen): If actual_precision
>   or from_unsignedN differs from properties of typeN, update typeN
>   to build_nonstandard_integer_type.  If TREE_TYPE (rhsN) is not
>   uselessly convertible to typeN, convert it using fold_convert or
>   build_and_insert_cast depending on if rhsN is INTEGER_CST or not.
>   (convert_plusminus_to_widen): Likewise.
> 
>   * gcc.c-torture/compile/pr113759.c: New test.
> 
> --- gcc/tree-ssa-math-opts.cc.jj  2024-02-02 11:26:43.730589763 +0100
> +++ gcc/tree-ssa-math-opts.cc 2024-02-05 10:23:16.068489814 +0100
> @@ -2865,25 +2865,25 @@ convert_mult_to_widen (gimple *stmt, gim
>if (2 * actual_precision > TYPE_PRECISION (type))
>  return false;
>if (actual_precision != TYPE_PRECISION (type1)
> -  || from_unsigned1 != TYPE_UNSIGNED (type1)
> -  || (TREE_TYPE (rhs1) != type1
> -   && TREE_CODE (rhs1) != INTEGER_CST))
> -rhs1 = build_and_insert_cast (gsi, loc,
> -   build_nonstandard_integer_type
> - (actual_precision, from_unsigned1), rhs1);
> +  || from_unsigned1 != TYPE_UNSIGNED (type1))
> +type1 = build_nonstandard_integer_type (actual_precision, 
> from_unsigned1);
> +  if (!useless_type_conversion_p (type1, TREE_TYPE (rhs1)))
> +{
> +  if (TREE_CODE (rhs1) == INTEGER_CST)
> + rhs1 = fold_convert (type1, rhs1);
> +  else
> + rhs1 = build_and_insert_cast (gsi, loc, type1, rhs1);
> +}
>if (actual_precision != TYPE_PRECISION (type2)
> -  || from_unsigned2 != TYPE_UNSIGNED (type2)
> -  || (TREE_TYPE (rhs2) != type2
> -   && TREE_CODE (rhs2) != INTEGER_CST))
> -rhs2 = build_and_insert_cast (gsi, loc,
> -   build_nonstandard_integer_type
> - (actual_precision, from_unsigned2), rhs2);
> -
> -  /* Handle constants.  */
> -  if (TREE_CODE (rhs1) == INTEGER_CST)
> -rhs1 = fold_convert (type1, rhs1);
> -  if (TREE_CODE (rhs2) == INTEGER_CST)
> -rhs2 = fold_convert (type2, rhs2);
> +  || from_unsigned2 != TYPE_UNSIGNED (type2))
> +type2 = build_nonstandard_integer_type (actual_precision, 
> from_unsigned2);
> +  if (!useless_type_conversion_p (type2, TREE_TYPE (rhs2)))
> +{
> +  if (TREE_CODE (rhs2) == INTEGER_CST)
> + rhs2 = fold_convert (type2, rhs2);
> +  else
> + rhs2 = build_and_insert_cast (gsi, loc, type2, rhs2);
> +}
>  
>gimple_assign_set_rhs1 (stmt, rhs1);
>gimple_assign_set_rhs2 (stmt, rhs2);
> @@ -3086,26 +3086,28 @@ convert_plusminus_to_widen (gimple_stmt_
>actual_precision = GET_MODE_PRECISION (actual_mode);
>if (actual_precision != TYPE_PRECISION (type1)
>|| from_unsigned1 != TYPE_UNSIGNED (type1))
> -mult_rhs1 = build_and_insert_cast (gsi, loc,
> -build_nonstandard_integer_type
> -  (actual_precision, from_unsigned1),
> -mult_rhs1);
> +type1 = build_nonstandar

[PATCH] middle-end/113576 - avoid out-of-bound vector element access

2024-02-06 Thread Richard Biener

The following avoids accessing out-of-bound vector elements when
native encoding a boolean vector with sub-BITS_PER_UNIT precision
elements.  The error was basing the number of elements to extract
on the rounded up total byte size involved and the patch bases
everything on the total number of elements to extract instead.

As a side-effect this now consistently results in zeros in the
padding of the last encoded byte which also avoids the failure
mode seen in PR113576.
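
The arithmetic can be tried outside the compiler with the following sketch.
It mirrors the element-based computation from the patch but is not the GCC
function: `encode_bool_elts`, its parameters, and the least-significant-bit
first packing order are assumptions made for illustration.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

static unsigned
min_u (unsigned a, unsigned b)
{
  return a < b ? a : b;
}

static unsigned
ceil_div (unsigned x, unsigned y)
{
  return (x + y - 1) / y;
}

/* Pack the low bit of each of the COUNT elements in ELTS into PTR, using
   ELT_BITS bits per element (ELT_BITS must divide CHAR_BIT).  Encoding
   starts at byte offset OFF and writes at most LEN bytes.  Returns the
   number of bytes written.  */
static unsigned
encode_bool_elts (const unsigned char *elts, unsigned count,
                  unsigned elt_bits, unsigned char *ptr,
                  unsigned len, unsigned off)
{
  unsigned elts_per_byte = CHAR_BIT / elt_bits;
  unsigned first_elt = off * elts_per_byte;
  assert (first_elt <= count);
  /* Bound the loop by the elements that exist, not by rounded-up bytes.  */
  unsigned extract_elts = min_u (len * elts_per_byte, count - first_elt);
  unsigned extract_bytes = ceil_div (elt_bits * extract_elts, CHAR_BIT);
  memset (ptr, 0, extract_bytes);
  for (unsigned i = 0; i < extract_elts; ++i)
    if (elts[first_elt + i] & 1)
      {
        unsigned bit = i * elt_bits;
        ptr[bit / CHAR_BIT] |= 1u << (bit % CHAR_BIT);
      }
  return extract_bytes;
}
```

With five one-bit elements, a byte-based bound would walk eight elements
and read three past the end; the element-based bound stops at five and
leaves the top three bits of the byte zero.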

Bootstrapped and tested on x86_64-unknown-linux-gnu.

OK?

Thanks,
Richard.

PR middle-end/113576
* fold-const.cc (native_encode_vector_part): Avoid accessing
out-of-bound elements.
---
 gcc/fold-const.cc | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc
index 80e211e18c0..8638757312b 100644
--- a/gcc/fold-const.cc
+++ b/gcc/fold-const.cc
@@ -8057,13 +8057,13 @@ native_encode_vector_part (const_tree expr, unsigned 
char *ptr, int len,
off = 0;
 
   /* Zero the buffer and then set bits later where necessary.  */
-  int extract_bytes = MIN (len, total_bytes - off);
+  unsigned elts_per_byte = BITS_PER_UNIT / elt_bits;
+  unsigned first_elt = off * elts_per_byte;
+  unsigned extract_elts = MIN (len * elts_per_byte, count - first_elt);
+  unsigned extract_bytes = CEIL (elt_bits * extract_elts, BITS_PER_UNIT);
   if (ptr)
memset (ptr, 0, extract_bytes);
 
-  unsigned int elts_per_byte = BITS_PER_UNIT / elt_bits;
-  unsigned int first_elt = off * elts_per_byte;
-  unsigned int extract_elts = extract_bytes * elts_per_byte;
   for (unsigned int i = 0; i < extract_elts; ++i)
{
  tree elt = VECTOR_CST_ELT (expr, first_elt + i);
-- 
2.35.3

[PATCH v5 RESEND] C, ObjC: Add -Wunterminated-string-initialization

2024-02-06 Thread Alejandro Colomar

Warn about the following:

char  s[3] = "foo";

Initializing a char array with a string literal of the same length as
the size of the array is usually a mistake.  Rarely is the case where
one wants to create a non-terminated character sequence from a string
literal.

In some cases, for writing faster code, one may want to use arrays
instead of pointers, since that removes the need for storing an array of
pointers apart from the strings themselves.

char  *log_levels[]   = { "info", "warning", "err" };
vs.
char  log_levels[][7] = { "info", "warning", "err" };

This forces the programmer to specify a size, which might change if a
new entry is later added.  Having no way to enforce null termination is
very dangerous, however, so it is useful to have a warning for this, so
that the compiler can make sure that the programmer didn't make any
mistakes.  This warning catches the bug above, so that the programmer
will be able to fix it and write:

char  log_levels[][8] = { "info", "warning", "err" };

This warning already existed as part of -Wc++-compat, but this patch
allows enabling it separately.  It is also included in -Wextra, since
it may not always be desired (when unterminated character sequences are
wanted), but it's likely to be desired in most cases.
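
The effect is easy to observe in a small C file.  The following sketch is
only an illustration; the helper `has_terminator` and the array names are
made up for it.  It checks which of the arrays discussed above actually
contain a terminating NUL:

```c
#include <assert.h>
#include <string.h>

/* Valid C, but the array holds 'f', 'o', 'o' and no terminating NUL.  */
static const char unterminated[3] = "foo";
/* One more byte leaves room for the terminator.  */
static const char terminated[4] = "foo";

/* "warning" has 7 characters, so with a row size of 7 that row is
   unterminated while "info" and "err" are fine.  */
static const char log_levels[][7] = { "info", "warning", "err" };

/* Return 1 if the SIZE-byte array BUF contains a NUL byte.  */
static int
has_terminator (const char *buf, size_t size)
{
  return memchr (buf, '\0', size) != NULL;
}
```

Calling `has_terminator (log_levels[1], sizeof log_levels[1])` returns 0,
which is exactly the situation the warning is meant to flag at compile
time.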

Since Wc++-compat now includes this warning, the test has to be modified
to expect the text of the new warning too, in .

Link: 
Link: 
Link: 

Acked-by: Doug McIlroy 
Cc: "G. Branden Robinson" 
Cc: Ralph Corderoy 
Cc: Dave Kemper 
Cc: Larry McVoy 
Cc: Andrew Pinski 
Cc: Jonathan Wakely 
Cc: Andrew Clayton 
Cc: Martin Uecker 
Cc: David Malcolm 
Signed-off-by: Alejandro Colomar 
---

v5:

-  Fix existing C++-compat tests.  [reported by ]


 gcc/c-family/c.opt | 4 
 gcc/c/c-typeck.cc  | 6 +++---
 gcc/testsuite/gcc.dg/Wcxx-compat-14.c  | 2 +-
 gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c | 6 ++
 4 files changed, 14 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c

diff --git a/gcc/c-family/c.opt b/gcc/c-family/c.opt
index 44b9c862c14..e8f6b836836 100644
--- a/gcc/c-family/c.opt
+++ b/gcc/c-family/c.opt
@@ -1407,6 +1407,10 @@ Wunsuffixed-float-constants
 C ObjC Var(warn_unsuffixed_float_constants) Warning
 Warn about unsuffixed float constants.
 
+Wunterminated-string-initialization
+C ObjC Var(warn_unterminated_string_initialization) Warning LangEnabledBy(C 
ObjC,Wextra || Wc++-compat)
+Warn about character arrays initialized as unterminated character sequences by 
a string literal.
+
 Wunused
 C ObjC C++ ObjC++ LangEnabledBy(C ObjC C++ ObjC++,Wall)
 ; documented in common.opt
diff --git a/gcc/c/c-typeck.cc b/gcc/c/c-typeck.cc
index e55e887da14..7df9de819ed 100644
--- a/gcc/c/c-typeck.cc
+++ b/gcc/c/c-typeck.cc
@@ -8399,11 +8399,11 @@ digest_init (location_t init_loc, tree type, tree init, 
tree origtype,
pedwarn_init (init_loc, 0,
  ("initializer-string for array of %qT "
   "is too long"), typ1);
- else if (warn_cxx_compat
+ else if (warn_unterminated_string_initialization
   && compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
-   warning_at (init_loc, OPT_Wc___compat,
+   warning_at (init_loc, OPT_Wunterminated_string_initialization,
("initializer-string for array of %qT "
-"is too long for C++"), typ1);
+"is too long"), typ1);
  if (compare_tree_int (TYPE_SIZE_UNIT (type), len) < 0)
{
  unsigned HOST_WIDE_INT size
diff --git a/gcc/testsuite/gcc.dg/Wcxx-compat-14.c 
b/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
index 23783711be6..6df0ee197cc 100644
--- a/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
+++ b/gcc/testsuite/gcc.dg/Wcxx-compat-14.c
@@ -2,5 +2,5 @@
 /* { dg-options "-Wc++-compat" } */
 
 char a1[] = "a";
-char a2[1] = "a";  /* { dg-warning "C\[+\]\[+\]" } */
+char a2[1] = "a";  /* { dg-warning "initializer-string for array of 'char' 
is too long" } */
 char a3[2] = "a";
diff --git a/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c 
b/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c
new file mode 100644
index 000..13d5dbc6640
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/Wunterminated-string-initialization.c
@@ -0,0 +1,6 @@
+/* { dg-do compile } */
+/* { dg-options "-Wunterminated-string-initialization" } */
+
+char a1[] = "a";
+char a2[1] = "a";  /* { dg-warning "initializer-string for array of 'char' 
is too long" } */
+char a3[2] = "a";
-- 
2.40.1




Re: [PATCH] libssp: Fix gets-chk.c compilation on Solaris

2024-02-06 Thread Jakub Jelinek

On Tue, Feb 06, 2024 at 11:33:17AM +0100, Rainer Orth wrote:
> 2023-12-07  Rainer Orth  
> 
>   libssp:
>   * configure.ac (AC_CHECK_DECLS): Check for gets.
>   * configure, config.h.in: Regenerate.
>   * gets-chk.c (gets): Guard declaration with !HAVE_DECL_GETS.

Ok, thanks.

Jakub

Re: [PATCH] libssp: Fix gets-chk.c compilation on Solaris

2024-02-06 Thread Rainer Orth

Hi Jakub,

sorry for dropping the ball on this.

> On Mon, Dec 04, 2023 at 11:42:09AM +0100, Rainer Orth wrote:
>> The recent warning patches broke the libssp build on Solaris:
>> 
>> /vol/gcc/src/hg/master/local/libssp/gets-chk.c: In function '__gets_chk':
>> /vol/gcc/src/hg/master/local/libssp/gets-chk.c:67:12: error: implicit
>> declaration of function 'gets'; did you mean 'getw'?
>> [-Wimplicit-function-declaration]
>>67 | return gets (s);
>>   |^~~~
>>   |getw 
>> /vol/gcc/src/hg/master/local/libssp/gets-chk.c:67:12: error: returning
>> 'int' from a function with return type 'char *' makes pointer from
>> integer without a cast [-Wint-conversion]
>>67 | return gets (s);
>>   |^~~~
>> /vol/gcc/src/hg/master/local/libssp/gets-chk.c:74:12: error: returning
>> 'int' from a function with return type 'char *' makes pointer from
>> integer without a cast [-Wint-conversion]
>>74 | return gets (s);
>>   |^~~~
>> 
>> The guard around the gets declaration in gets-chk.c is
>> 
>> #if !(!defined __USE_ISOC11 \
>>   || (defined __cplusplus && __cplusplus <= 201103L))
>> extern char *gets (char *);
>> #endif
>> 
>> __USE_ISOC11 is a glibc-only thing, while Solaris 
>> declares gets like
>> 
>> #if __STDC_VERSION__ < 201112L && __cplusplus < 201402L
>> extern char *gets(char *) __ATTR_DEPRECATED;
>> #endif
>> 
>> If one needs to check __USE_ISOC11 at all, one certainly needs to check
>> __STDC_VERSION__ to avoid breaking every non-glibc target.  Besides, I
>> don't see what's the use of checking __cplusplus when compiling a C-only
>> source file.  On top of all that, the double negation makes the guard
>> unnecessarily hard to understand.
>> 
>> I really don't know if it's useful/appropriate to check __USE_ISOC11 and
>> __cplusplus here at all; still I've left both for now.
>> 
>> Here's what I've used to complete the Solaris bootstrap.
>> 
>> Tested on i386-pc-solaris2.11, sparc-sun-solaris2.11,
>> x86_64-pc-linux-gnu, and x86_64-apple-darwin23.1.0.
>> 
>> -- 
>> -
>> Rainer Orth, Center for Biotechnology, Bielefeld University
>> 
>> 
>> 2023-12-03  Rainer Orth  
>> 
>>  libssp:
>>  * gets-chk.c (gets): Avoid double negation.
>>  Also check __STDC_VERSION__ >= 201112L.
>> 
>
>> # HG changeset patch
>> # Parent  334015ab01f6c0e5af821c1e9bc83b8677cc0bfb
>> libssp: Fix gets-chk.c compilation on Solaris
>> 
>> diff --git a/libssp/gets-chk.c b/libssp/gets-chk.c
>> --- a/libssp/gets-chk.c
>> +++ b/libssp/gets-chk.c
>> @@ -51,8 +51,9 @@ see the files COPYING3 and COPYING.RUNTI
>>  # include 
>>  #endif
>>  
>> -#if !(!defined __USE_ISOC11 \
>> -  || (defined __cplusplus && __cplusplus <= 201103L))
>> +#if (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)   \
>> + || !defined __USE_ISOC11   \
>> + || (defined __cplusplus && __cplusplus >= 201402L)
>
> The above isn't equivalent.  Avoiding double negation would mean
> #if (defined __USE_ISOC11 \
>  && !(defined __cplusplus && __cplusplus <= 201103L))
> or
> #if (defined __USE_ISOC11 \
>  && (!defined __cplusplus || __cplusplus > 201103L))
> No?
> __USE_ISOC11 is defined as
> /* This is to enable the ISO C11 extension.  */
> #if (defined _ISOC11_SOURCE || defined _ISOC2X_SOURCE \
>  || (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L))
> # define __USE_ISOC11   1
> #endif
> where _ISOC11_SOURCE or _ISOC2X_SOURCE are defined whenever _GNU_SOURCE
> is, or when the user defines them; __USE_ISOC11 is also defined
> if __cplusplus >= 201703L.
>
> Obviously, if you add that
>   (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L)
> it will mean it will be prototyped always (as I think we compile it without
> any -std= flags).
>
> What about using what we had for glibc (or even better, expect gets
> to be declared for glibc < 2.16) and use what you add for other libraries?
> The file is written and compiled as C, so we don't need to bother with C++
> though.
> So
> #if (defined (__GLIBC_PREREQ) \
>  ? (__GLIBC_PREREQ (2, 16) && defined (__USE_ISOC11)) \
>  : (defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L))
> ?
>
>>  extern char *gets (char *);
>>  #endif

this doesn't even compile on non-glibc targets:

/vol/gcc/src/hg/master/local/libssp/gets-chk.c:55:24: error: missing binary operator before token "("
   55 |  ? (__GLIBC_PREREQ (2, 16) && defined (__USE_ISOC11))   \
  |   

Unless one really wants to go for ugly contortions like

#ifdef __GLIBC_PREREQ
# if __GLIBC_PREREQ (2, 16) && defined (__USE_ISOC11)
#  define NEED_DECL_GETS
# endif
#elif defined __STDC_VERSION__ && __STDC_VERSION__ >= 201112L
#  define

[PATCH] asan: Don't fold some strlens with -fsanitize=address [PR110676]

2024-02-06 Thread Jakub Jelinek

Hi!

The UB on the following testcase isn't diagnosed by -fsanitize=address,
because we see that the array has a single element and optimize the
strlen to 0.  I think it is fine to assume e.g. for range purposes the
lower bound for the strlen as long as we don't try to optimize
strlen (str)
where we know that it returns [26, 42] to
26 + strlen (str + 26), but for the upper bound we really want to punt
on optimizing that for -fsanitize=address to read all the bytes of the
string and diagnose if we run to object end etc.
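
The decision can be modelled with a few lines of C.  This is a simplified
sketch, not the code in gimple-fold.cc: `can_fold_strlen` and its
parameters are hypothetical, and plain `size_t` stands in for the wide-int
values GCC uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* The length is known to lie in [MINLEN, MAXLEN].  Return true and store
   the constant in *RESULT if the call can be folded.  With ASAN_ENABLED
   the upper bound is discarded so the call stays and reads every byte.  */
static bool
can_fold_strlen (size_t minlen, size_t maxlen, bool asan_enabled,
                 size_t *result)
{
  assert (result != NULL);
  if (asan_enabled)
    maxlen = SIZE_MAX;
  if (minlen == maxlen)
    {
      *result = minlen;
      return true;
    }
  return false;
}
```

For a one-element array the range is [0, 0], so the call folds to 0
without the sanitizer; with it, the range becomes [0, SIZE_MAX] and the
strlen call is kept for the runtime check.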

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-06  Jakub Jelinek  

PR sanitizer/110676
* gimple-fold.cc (gimple_fold_builtin_strlen): For -fsanitize=address
reset maxlen to sizetype maximum.

* gcc.dg/asan/pr110676.c: New test.

--- gcc/gimple-fold.cc.jj   2024-01-31 12:24:51.714239628 +0100
+++ gcc/gimple-fold.cc  2024-02-05 21:38:03.829964904 +0100
@@ -4019,6 +4019,11 @@ gimple_fold_builtin_strlen (gimple_stmt_
   maxlen = wi::to_wide (max_object_size (), prec) - 2;
 }
 
+  /* For -fsanitize=address, don't optimize the upper bound of the
+ length to be able to diagnose UB on non-zero terminated arrays.  */
+  if (sanitize_flags_p (SANITIZE_ADDRESS))
+maxlen = wi::max_value (TYPE_PRECISION (sizetype), UNSIGNED);
+
   if (minlen == maxlen)
 {
   /* Fold the strlen call to a constant.  */
--- gcc/testsuite/gcc.dg/asan/pr110676.c.jj 2024-02-05 21:42:43.657104536 
+0100
+++ gcc/testsuite/gcc.dg/asan/pr110676.c2024-02-05 21:42:39.091167524 
+0100
@@ -0,0 +1,14 @@
+/* PR sanitizer/110676 */
+/* { dg-do run } */
+/* { dg-skip-if "" { *-*-* } { "*" } { "-O0" } } */
+/* { dg-shouldfail "asan" } */
+
+int
+main ()
+{
+  char s[1] = "A";
+  return __builtin_strlen (s);
+}
+
+/* { dg-output "ERROR: AddressSanitizer: stack-buffer-overflow on 
address.*(\n|\r\n|\r)" } */
+/* { dg-output "READ of size.*" } */

Jakub

[PATCH] lower-bitint: Encode address space qualifiers in VIEW_CONVERT_EXPRs [PR113736]

2024-02-06 Thread Jakub Jelinek

Hi!

As discussed in the PR, e.g. build_fold_addr_expr needs TYPE_ADDR_SPACE
on the outermost reference rather than just on the base, so the
following patch makes sure to propagate the address space from
the accessed var to the MEM_REFs and/or VIEW_CONVERT_EXPRs used to
access those.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-06  Jakub Jelinek  

PR tree-optimization/113736
* gimple-lower-bitint.cc (bitint_large_huge::limb_access): Use
var's address space for MEM_REF or VIEW_CONVERT_EXPRs.

* gcc.dg/bitint-86.c: New test.

--- gcc/gimple-lower-bitint.cc.jj   2024-02-05 10:57:32.946941767 +0100
+++ gcc/gimple-lower-bitint.cc  2024-02-05 11:41:28.352436669 +0100
@@ -601,12 +601,17 @@ bitint_large_huge::limb_access (tree typ
 {
   tree atype = (tree_fits_uhwi_p (idx)
? limb_access_type (type, idx) : m_limb_type);
+  tree ltype = m_limb_type;
+  addr_space_t as = TYPE_ADDR_SPACE (TREE_TYPE (var));
+  if (as != TYPE_ADDR_SPACE (ltype))
+ltype = build_qualified_type (ltype, TYPE_QUALS (ltype)
+| ENCODE_QUAL_ADDR_SPACE (as));
   tree ret;
   if (DECL_P (var) && tree_fits_uhwi_p (idx))
 {
   tree ptype = build_pointer_type (strip_array_types (TREE_TYPE (var)));
   unsigned HOST_WIDE_INT off = tree_to_uhwi (idx) * m_limb_size;
-  ret = build2 (MEM_REF, m_limb_type,
+  ret = build2 (MEM_REF, ltype,
build_fold_addr_expr (var),
build_int_cst (ptype, off));
   TREE_THIS_VOLATILE (ret) = TREE_THIS_VOLATILE (var);
@@ -615,7 +620,7 @@ bitint_large_huge::limb_access (tree typ
   else if (TREE_CODE (var) == MEM_REF && tree_fits_uhwi_p (idx))
 {
   ret
-   = build2 (MEM_REF, m_limb_type, TREE_OPERAND (var, 0),
+   = build2 (MEM_REF, ltype, TREE_OPERAND (var, 0),
  size_binop (PLUS_EXPR, TREE_OPERAND (var, 1),
  build_int_cst (TREE_TYPE (TREE_OPERAND (var, 1)),
 tree_to_uhwi (idx)
@@ -633,10 +638,10 @@ bitint_large_huge::limb_access (tree typ
{
  unsigned HOST_WIDE_INT nelts
= CEIL (tree_to_uhwi (TYPE_SIZE (type)), limb_prec);
- tree atype = build_array_type_nelts (m_limb_type, nelts);
+ tree atype = build_array_type_nelts (ltype, nelts);
  var = build1 (VIEW_CONVERT_EXPR, atype, var);
}
-  ret = build4 (ARRAY_REF, m_limb_type, var, idx, NULL_TREE, NULL_TREE);
+  ret = build4 (ARRAY_REF, ltype, var, idx, NULL_TREE, NULL_TREE);
 }
   if (!write_p && !useless_type_conversion_p (atype, m_limb_type))
 {
--- gcc/testsuite/gcc.dg/bitint-86.c.jj 2024-02-05 12:11:03.582868774 +0100
+++ gcc/testsuite/gcc.dg/bitint-86.c 2024-02-05 12:15:14.322401544 +0100
@@ -0,0 +1,40 @@
+/* PR tree-optimization/113736 */
+/* { dg-do compile { target bitint } } */
+/* { dg-options "-O2 -std=gnu23 -w" } */
+
+#if __BITINT_MAXWIDTH__ >= 710
+struct S { _BitInt(710) a; };
+struct T { struct S b[4]; };
+
+#ifdef __x86_64__
+#define SEG __seg_gs
+#elif defined __i386__
+#define SEG __seg_fs
+#else
+#define SEG
+#endif
+
+void
+foo (SEG struct T *p)
+{
+  struct S s;
+  p->b[0] = s;
+}
+
+void
+bar (SEG struct T *p, _BitInt(710) x, int y, double z)
+{
+  p->b[0].a = x + 42;
+  p->b[1].a = x << y;
+  p->b[2].a = x >> y;
+  p->b[3].a = z;
+}
+
+int
+baz (SEG struct T *p, _BitInt(710) x, _BitInt(710) y)
+{
+  return __builtin_add_overflow (x, y, &p->b[1].a);
+}
+#else
+int i;
+#endif

Jakub

[PATCH] tree-ssa-math-opts: Fix up convert_{mult, plusminus}_to_widen [PR113759]

2024-02-06 Thread Jakub Jelinek

Hi!

On the following testcase we emit an invalid stmt:
error: type mismatch in ‘widen_mult_plus_expr’
6 | foo (int c, int b)
  | ^~~
unsigned long
int
unsigned int
unsigned long
_31 = WIDEN_MULT_PLUS_EXPR ;

The recent PR113560 r14-8680 changes tweaked convert_mult_to_widen
but didn't change convert_plusminus_to_widen for the
TREE_TYPE (rhsN) != typeN cases.  Looking at this, though, the code was
already quite weird before that change.

Earlier in those functions it determines actual_precision and from_unsignedN
and wants to use that precision and signedness for the operands and
it used build_and_insert_cast for that (which emits a cast stmt, even for
INTEGER_CSTs) and later on for INTEGER_CST arguments fold_converted them
to typeN (which is unclear to me why, because it seems to have assumed
that TREE_TYPE (rhsN) is typeN, for the actual_precision or from_unsignedN
cases it would be wrong except that build_and_insert_cast forced a SSA_NAME
and so it doesn't trigger anymore).
Now, since r14-8680 it is possible that rhsN has a type other than
typeN, and we again want to cast.

The following patch changes this, so that for the differences in
actual_precision and/or from_unsignedN we actually update typeN and then use
it as the type to convert the arguments to if it isn't useless, for
INTEGER_CSTs by just fold_converting, otherwise using build_and_insert_cast.
It also uses a useless_type_conversion_p test so that we don't convert
unless necessary.  Doing so effectively also brings the important part of
the r14-8680 convert_mult_to_widen changes to convert_plusminus_to_widen.
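
For readers unfamiliar with the transformation, the following is a minimal
sketch of source shapes that can give rise to a widening multiply-add, where
the operand signedness and precision differ from the result type.  It is not
the testcase from the PR; the function names are made up for illustration,
and whether a given target actually uses WIDEN_MULT_PLUS_EXPR here is an
assumption.

```c
#include <stdint.h>

/* Both multiply operands are 32-bit unsigned, the product and the
   accumulator are 64-bit.  The casts make the multiplication happen in
   the wide type, which is the shape a widening multiply-add computes.  */
uint64_t
widen_mult_plus_unsigned (uint32_t a, uint32_t b, uint64_t acc)
{
  return (uint64_t) a * (uint64_t) b + acc;
}

/* Operands of different signedness.  Each operand is first converted to
   the wide signed type, so the narrow operand types do not agree with
   each other or with the result type.  */
int64_t
widen_mult_plus_mixed (int32_t a, uint32_t b, int64_t acc)
{
  return (int64_t) a * (int64_t) b + acc;
}
```

The point of the sketch is only that the narrow operand types, the wide
result type and any constants involved have to be brought to consistent
types before the statement is rewritten, which is what the patch is about.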

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2024-02-06  Jakub Jelinek  

PR tree-optimization/113759
* tree-ssa-math-opts.cc (convert_mult_to_widen): If actual_precision
or from_unsignedN differs from properties of typeN, update typeN
to build_nonstandard_integer_type.  If TREE_TYPE (rhsN) is not
uselessly convertible to typeN, convert it using fold_convert or
build_and_insert_cast depending on if rhsN is INTEGER_CST or not.
(convert_plusminus_to_widen): Likewise.

* gcc.c-torture/compile/pr113759.c: New test.

--- gcc/tree-ssa-math-opts.cc.jj 2024-02-02 11:26:43.730589763 +0100
+++ gcc/tree-ssa-math-opts.cc   2024-02-05 10:23:16.068489814 +0100
@@ -2865,25 +2865,25 @@ convert_mult_to_widen (gimple *stmt, gim
   if (2 * actual_precision > TYPE_PRECISION (type))
 return false;
   if (actual_precision != TYPE_PRECISION (type1)
-  || from_unsigned1 != TYPE_UNSIGNED (type1)
-  || (TREE_TYPE (rhs1) != type1
- && TREE_CODE (rhs1) != INTEGER_CST))
-rhs1 = build_and_insert_cast (gsi, loc,
- build_nonstandard_integer_type
-   (actual_precision, from_unsigned1), rhs1);
+  || from_unsigned1 != TYPE_UNSIGNED (type1))
+type1 = build_nonstandard_integer_type (actual_precision, from_unsigned1);
+  if (!useless_type_conversion_p (type1, TREE_TYPE (rhs1)))
+{
+  if (TREE_CODE (rhs1) == INTEGER_CST)
+   rhs1 = fold_convert (type1, rhs1);
+  else
+   rhs1 = build_and_insert_cast (gsi, loc, type1, rhs1);
+}
   if (actual_precision != TYPE_PRECISION (type2)
-  || from_unsigned2 != TYPE_UNSIGNED (type2)
-  || (TREE_TYPE (rhs2) != type2
- && TREE_CODE (rhs2) != INTEGER_CST))
-rhs2 = build_and_insert_cast (gsi, loc,
- build_nonstandard_integer_type
-   (actual_precision, from_unsigned2), rhs2);
-
-  /* Handle constants.  */
-  if (TREE_CODE (rhs1) == INTEGER_CST)
-rhs1 = fold_convert (type1, rhs1);
-  if (TREE_CODE (rhs2) == INTEGER_CST)
-rhs2 = fold_convert (type2, rhs2);
+  || from_unsigned2 != TYPE_UNSIGNED (type2))
+type2 = build_nonstandard_integer_type (actual_precision, from_unsigned2);
+  if (!useless_type_conversion_p (type2, TREE_TYPE (rhs2)))
+{
+  if (TREE_CODE (rhs2) == INTEGER_CST)
+   rhs2 = fold_convert (type2, rhs2);
+  else
+   rhs2 = build_and_insert_cast (gsi, loc, type2, rhs2);
+}
 
   gimple_assign_set_rhs1 (stmt, rhs1);
   gimple_assign_set_rhs2 (stmt, rhs2);
@@ -3086,26 +3086,28 @@ convert_plusminus_to_widen (gimple_stmt_
   actual_precision = GET_MODE_PRECISION (actual_mode);
   if (actual_precision != TYPE_PRECISION (type1)
   || from_unsigned1 != TYPE_UNSIGNED (type1))
-mult_rhs1 = build_and_insert_cast (gsi, loc,
-  build_nonstandard_integer_type
-(actual_precision, from_unsigned1),
-  mult_rhs1);
+type1 = build_nonstandard_integer_type (actual_precision, from_unsigned1);
+  if (!useless_type_conversion_p (type1, TREE_TYPE (mult_rhs1)))
+{
+  if (TREE_CODE (mult_rhs1) == INTEGER_CST)
+   mult_rhs1 = fold_convert (type1, mult_rhs1);
+  else
+   mult_rhs

Re: [PATCH V1] RISC-V: Add minimal support for zabha extension.

2024-02-06 Thread Kito Cheng

I am not sure it's worth adding dedicated instruction patterns for
these instructions.  In theory they should just be used by the atomic
builtins when zabha is enabled, but I think that would be a somewhat
bigger work item.
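
As a rough illustration of the point above, this is what byte and halfword
atomics look like from the user's side with the existing GCC __atomic
builtins.  The function and variable names are made up for this sketch, and
whether a particular compiler turns these into amoadd.b, amoor.h or
amoswap.h when zabha is enabled is an assumption, not something the patch
under review does.

```c
#include <stdint.h>

/* Shared byte- and halfword-sized objects, the widths Zabha targets.  */
uint8_t byte_counter = 250;
uint16_t half_flags = 0x00f0;

/* Atomically add V to *P and return the previous value.  */
uint8_t
fetch_add_u8 (uint8_t *p, uint8_t v)
{
  return __atomic_fetch_add (p, v, __ATOMIC_SEQ_CST);
}

/* Atomically OR MASK into *P and return the previous value.  */
uint16_t
fetch_or_u16 (uint16_t *p, uint16_t mask)
{
  return __atomic_fetch_or (p, mask, __ATOMIC_SEQ_CST);
}

/* Atomically replace *P with V and return the previous value.  */
uint16_t
exchange_u16 (uint16_t *p, uint16_t v)
{
  return __atomic_exchange_n (p, v, __ATOMIC_SEQ_CST);
}
```

Each builtin returns the old memory value and updates memory in place, so a
pattern that models the operation as a plain register-to-register plus or
smin would not describe the same semantics.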

On Tue, Feb 6, 2024 at 5:18 PM  wrote:
>
> From: yulong 
>
> This patch adds minimal support for the zabha extension.
> The documentation is at:
> https://github.com/riscv/riscv-zabha/blob/v1.0-rc1/zabha.adoc
> There are no amocas.[b|h] instructions, because the zacas extension is
> not merged yet.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Add zabha extension name.
> * config/riscv/riscv.md (amo_addqi3): New mode.
> (amo_addhi3): Ditto.
> (amo_minqi3): Ditto.
> (amo_minuqi3): Ditto.
> (amo_minhi3): Ditto.
> (amo_minuhi3): Ditto.
> (amo_maxqi3): Ditto.
> (amo_maxuqi3): Ditto.
> (amo_maxhi3): Ditto.
> (amo_maxuhi3): Ditto.
> (amo_andqi3): Ditto.
> (amo_andhi3): Ditto.
> (amo_orqi3): Ditto.
> (amo_orhi3): Ditto.
> (amo_xorqi3): Ditto.
> (amo_xorhi3): Ditto.
> (amo_swapqi3): Ditto.
> (amo_swaphi3): Ditto.
> * config/riscv/riscv.opt: Add zabha extension.
>
> ---
>  gcc/common/config/riscv/riscv-common.cc |   2 +
>  gcc/config/riscv/riscv.md   | 167 
>  gcc/config/riscv/riscv.opt  |   2 +
>  3 files changed, 171 insertions(+)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
> index 631ce8309a0..9c3be0d7651 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -250,6 +250,7 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
>{"za64rs",  ISA_SPEC_CLASS_NONE, 1, 0},
>{"za128rs", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
> +  {"zabha", ISA_SPEC_CLASS_NONE, 1, 0},
>
>{"zba", ISA_SPEC_CLASS_NONE, 1, 0},
>{"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
> @@ -1504,6 +1505,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
>{"za64rs", &gcc_options::x_riscv_za_subext, MASK_ZA64RS},
>{"za128rs", &gcc_options::x_riscv_za_subext, MASK_ZA128RS},
>{"zawrs", &gcc_options::x_riscv_za_subext, MASK_ZAWRS},
> +  {"zabha", &gcc_options::x_riscv_za_subext, MASK_ZABHA},
>
>{"zba",&gcc_options::x_riscv_zb_subext, MASK_ZBA},
>{"zbb",&gcc_options::x_riscv_zb_subext, MASK_ZBB},
> diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
> index 39b29795cd6..058b63ac7f0 100644
> --- a/gcc/config/riscv/riscv.md
> +++ b/gcc/config/riscv/riscv.md
> @@ -134,6 +134,9 @@
>;; XTheadInt unspec
>UNSPECV_XTHEADINT_PUSH
>UNSPECV_XTHEADINT_POP
> +
> +  ;; Zabha instructions.
> +  UNSPEC_AMO_SWAP
>  ])
>
>  (define_constants
> @@ -849,6 +852,24 @@
>[(set_attr "type" "arith")
> (set_attr "mode" "SI")])
>
> +(define_insn "amo_addqi3"
> +  [(set (match_operand:QI  0 "register_operand" "=r,r")
> +   (plus:QI (match_operand:QI 1 "register_operand" " r,r")
> +(match_operand:QI 2 "arith_operand"" r,r")))]
> +  "TARGET_ZABHA"
> +  "amoadd.b\t%0,%1,%2"
> +  [(set_attr "type" "atomic")
> +   (set_attr "mode" "QI")])
> +
> +(define_insn "amo_addhi3"
> +  [(set (match_operand:HI  0 "register_operand" "=r,r")
> +   (plus:HI (match_operand:HI 1 "register_operand" " r,r")
> +(match_operand:HI 2 "arith_operand"" r,r")))]
> +  "TARGET_ZABHA"
> +  "amoadd.h\t%0,%1,%2"
> +  [(set_attr "type" "atomic")
> +   (set_attr "mode" "HI")])
> +
>  ;;
>  ;;  
>  ;;
> @@ -1645,6 +1666,78 @@
>[(set_attr "type" "fmove")
> (set_attr "mode" "")])
>
> +(define_insn "amo_minqi3"
> +  [(set (match_operand:QI0 "register_operand" "=r")
> +   (smin:QI (match_operand:QI 1 "register_operand" " r")
> +  (match_operand:QI 2 "register_operand" " r")))]
> +  "TARGET_ZABHA"
> +  "amomin.b\t%0,%1,%2"
> +  [(set_attr "type" "atomic")
> +   (set_attr "mode" "QI")])
> +
> +(define_insn "amo_minuqi3"
> +  [(set (match_operand:QI0 "register_operand" "=r")
> +   (umin:QI (match_operand:QI 1 "register_operand" " r")
> +  (match_operand:QI 2 "register_operand" " r")))]
> +  "TARGET_ZABHA"
> +  "amominu.b\t%0,%1,%2"
> +  [(set_attr "type" "atomic")
> +   (set_attr "mode" "QI")])
> +
> +(define_insn "amo_minhi3"
> +  [(set (match_operand:HI0 "register_operand" "=r")
> +   (smin:HI (match_operand:HI 1 "register_operand" " r")
> +  (match_operand:HI 2 "register_operand" " r")))]
> +  "TARGET_ZABHA"
> +  "amomin.h\t%0,%1,%2"
> +  [(set_attr "type" "atomic")
> +   (set_attr "mode" "HI")])
> +
> +(define_insn "amo_minuhi3"
> +  [(set (match_operand:HI0 "register_operand" "=r")
> +   (umin:HI (match_operand:HI 1 "register_op

[PATCH] testsuite: Add a test case for negating FP vectors containing zeros

2024-02-06 Thread Xi Ruoyao

Recently I fixed two wrong FP vector negate implementations, which
produced zeros with the wrong sign bit on some targets (r14-8786 and
r14-8801).  To prevent a similar issue from happening again, add a test case.

Tested on x86_64 (with SSE2, AVX, AVX2, and AVX512F), AArch64, MIPS
(with MSA), LoongArch (with LSX and LASX).
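
To show why zeros need their own test, here is a minimal scalar sketch of
one plausible way a negate can go wrong: implementing -x as 0.0 - x.  That
this is the exact failure mode of the two fixed targets is an assumption of
this sketch; the helper names are made up for illustration.

```c
/* Return 1 if the result of a true negation of X has its sign bit set.  */
int
sign_after_negate (double x)
{
  return __builtin_signbit (-x) != 0;
}

/* Return 1 if the result of 0.0 - X has its sign bit set.  Under the
   default round-to-nearest mode 0.0 - 0.0 is +0.0, so this is not a
   valid replacement for negation when signed zeros matter.  */
int
sign_after_subtract (double x)
{
  return __builtin_signbit (0.0 - x) != 0;
}
```

For nonzero inputs the two functions agree, which is why such a bug can go
unnoticed; they differ only for x == +0.0, where negation yields -0.0 and
the subtraction yields +0.0.  The sketch assumes signed zeros are honoured,
i.e. no -ffast-math or -fno-signed-zeros.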

gcc/testsuite:

* gcc.dg/vect/vect-neg-zero.c: New test.
---

Ok for trunk?

 gcc/testsuite/gcc.dg/vect/vect-neg-zero.c | 39 +++
 1 file changed, 39 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/vect/vect-neg-zero.c

diff --git a/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
new file mode 100644
index 000..adb032f5c6a
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/vect-neg-zero.c
@@ -0,0 +1,39 @@
+/* { dg-do run } */
+/* { dg-add-options ieee } */
+/* { dg-additional-options "-fsigned-zeros" } */
+
+double x[4] = {-0.0, 0.0, -0.0, 0.0};
+float y[8] = {-0.0, 0.0, -0.0, 0.0, -0.0, -0.0, 0.0, 0.0};
+
+static __attribute__ ((always_inline)) inline void
+test (int factor)
+{
+  double a[4];
+  float b[8];
+
+  asm ("" ::: "memory");
+
+  for (int i = 0; i < 2 * factor; i++)
+a[i] = -x[i];
+
+  for (int i = 0; i < 4 * factor; i++)
+b[i] = -y[i];
+
+#pragma GCC novector
+  for (int i = 0; i < 2 * factor; i++)
+if (__builtin_signbit (a[i]) == __builtin_signbit (x[i]))
+  __builtin_abort ();
+
+#pragma GCC novector
+  for (int i = 0; i < 4 * factor; i++)
+if (__builtin_signbit (b[i]) == __builtin_signbit (y[i]))
+  __builtin_abort ();
+}
+
+int
+main (void)
+{
+  test (1);
+  test (2);
+  return 0;
+}
-- 
2.43.0

Re: [PATCH v2] openmp, fortran: Add Fortran support for indirect clause on the declare target directive

2024-02-06 Thread Kwok Cheung Yeung

Oops. I thought exactly the same thing yesterday, but forgot to add the 
changes to my commit! Here is the updated version.


Kwok

On 06/02/2024 9:03 am, Tobias Burnus wrote:
LGTM. I just wonder whether there should be a value test and not just a 
does-not-crash-when-called test for the latter testcase, i.e.




+++ b/libgomp/testsuite/libgomp.fortran/declare-target-indirect-3.f90
@@ -0,0 +1,25 @@
+! { dg-do run }
+
+! Check that indirect calls work on procedures passed in via a dummy argument
+
+module m
+contains
+  subroutine bar
+!$omp declare target enter(bar) indirect

e.g. "integer function bar()" ... " bar = 42"

+  end subroutine
+
+  subroutine foo(f)
+procedure(bar) :: f
+
+!$omp target
+  call f

And then: if (f() /= 42) stop 1

+!$omp end target
+  end subroutine
+end module


Thanks,

Tobias
From 83b734aa63aa63ea5bb438bb59ee09b00869e0fd Mon Sep 17 00:00:00 2001
From: Kwok Cheung Yeung 
Date: Mon, 5 Feb 2024 20:31:49 +
Subject: [PATCH] openmp, fortran: Add Fortran support for indirect clause on
 the declare target directive

2024-02-05  Kwok Cheung Yeung  

gcc/fortran/
* dump-parse-tree.cc (show_attr): Handle omp_declare_target_indirect
attribute.
* f95-lang.cc (gfc_gnu_attributes): Add entry for 'omp declare
target indirect'.
* gfortran.h (symbol_attribute): Add omp_declare_target_indirect
field.
(struct gfc_omp_clauses): Add indirect field.
* openmp.cc (omp_mask2): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_clauses): Match indirect clause.
(OMP_DECLARE_TARGET_CLAUSES): Add OMP_CLAUSE_INDIRECT.
(gfc_match_omp_declare_target): Check omp_device_type and apply
omp_declare_target_indirect attribute to symbol if indirect clause
active.  Show warning if there are only device_type and/or indirect
clauses on the directive.
* trans-decl.cc (add_attributes_to_decl): Add 'omp declare target
indirect' attribute if symbol has indirect attribute set.

gcc/testsuite/
* gfortran.dg/gomp/declare-target-4.f90 (f1): Update expected warning.
* gfortran.dg/gomp/declare-target-indirect-1.f90: New.
* gfortran.dg/gomp/declare-target-indirect-2.f90: New.

libgomp/
* testsuite/libgomp.fortran/declare-target-indirect-1.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-2.f90: New.
* testsuite/libgomp.fortran/declare-target-indirect-3.f90: New.
---
 gcc/fortran/dump-parse-tree.cc|  2 +
 gcc/fortran/f95-lang.cc   |  2 +
 gcc/fortran/gfortran.h|  3 +-
 gcc/fortran/openmp.cc | 50 ++-
 gcc/fortran/trans-decl.cc |  4 ++
 .../gfortran.dg/gomp/declare-target-4.f90 |  2 +-
 .../gomp/declare-target-indirect-1.f90| 62 +++
 .../gomp/declare-target-indirect-2.f90| 25 
 .../declare-target-indirect-1.f90 | 39 
 .../declare-target-indirect-2.f90 | 53 
 .../declare-target-indirect-3.f90 | 35 +++
 11 files changed, 272 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-target-indirect-1.f90
 create mode 100644 gcc/testsuite/gfortran.dg/gomp/declare-target-indirect-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-target-indirect-1.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-target-indirect-2.f90
 create mode 100644 libgomp/testsuite/libgomp.fortran/declare-target-indirect-3.f90

diff --git a/gcc/fortran/dump-parse-tree.cc b/gcc/fortran/dump-parse-tree.cc
index 1563b810b98..7b154eb3ca7 100644
--- a/gcc/fortran/dump-parse-tree.cc
+++ b/gcc/fortran/dump-parse-tree.cc
@@ -914,6 +914,8 @@ show_attr (symbol_attribute *attr, const char * module)
 fputs (" OMP-DECLARE-TARGET", dumpfile);
   if (attr->omp_declare_target_link)
 fputs (" OMP-DECLARE-TARGET-LINK", dumpfile);
+  if (attr->omp_declare_target_indirect)
+fputs (" OMP-DECLARE-TARGET-INDIRECT", dumpfile);
   if (attr->elemental)
 fputs (" ELEMENTAL", dumpfile);
   if (attr->pure)
diff --git a/gcc/fortran/f95-lang.cc b/gcc/fortran/f95-lang.cc
index 358cb17fce2..67fda27aa3e 100644
--- a/gcc/fortran/f95-lang.cc
+++ b/gcc/fortran/f95-lang.cc
@@ -96,6 +96,8 @@ static const attribute_spec gfc_gnu_attributes[] =
 gfc_handle_omp_declare_target_attribute, NULL },
   { "omp declare target link", 0, 0, true,  false, false, false,
 gfc_handle_omp_declare_target_attribute, NULL },
+  { "omp declare target indirect", 0, 0, true,  false, false, false,
+gfc_handle_omp_declare_target_attribute, NULL },
   { "oacc function", 0, -1, true,  false, false, false,
 gfc_handle_omp_declare_target_attribute, NULL },
 };
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index fd73e4ce431..fd843a3241d 100644
--- a/gcc/fortran/gfortran.h

Re: [PATCH] [testsuite] Fix pretty printers regexps for GDB output

2024-02-06 Thread Christophe Lyon

ping?

On Thu, 25 Jan 2024 at 16:54, Christophe Lyon
 wrote:
>
> On Wed, 24 Jan 2024 at 12:02, Jonathan Wakely  wrote:
> >
> > On Wed, 24 Jan 2024 at 10:48, Christophe Lyon wrote:
> > >
> > > GDB emits end of lines as \r\n, we currently match the reverse \n\r,
> >
> > We currently match [\n\r]+ which should match any of \n, \r, \n\r or \r\n
> >
>
> Hmm, right, sorry I had this patch in my tree for quite some time, but
> wrote the description just now, so I read a bit too quickly.
>
> >
> > > possibly leading to mismatches under racy conditions.
> >
> > What do we incorrectly match? Is the problem that a \r\n sequence
> > might be incompletely printed, due to buffering, and so the regex only
> > sees (and matches) the \r which then leaves an unwanted \n in the
> > stream, which then interferes with the next match? I don't understand
> > why that problem wouldn't just result in a failed match with your new
> > regex though.
> >
> Exactly: READ1 forces read() to return 1 byte at a time, so we leave
> an unwanted \r in front of a string that should otherwise match the
> "got" case.
>
> >
> > >
> > > > I noticed this while running the GCC testsuite using the equivalent of
> > > > GDB's READ1 feature [1], which helps detect buffering issues.
> > >
> > > Adjusting the first regexp to match the right order implied fixing the
> > > second one, to skip the empty lines.
> >
> > At the very least, this part of the description is misleading. The
> > existing regex matches "the right order" already. The change is to
> > match *exactly* \r\n instead of any mix of CR and LF characters.
> > That's not about matching "the right order", it's being more precise
> > in what we match.
> >
> > But I'm still confused about what the failure scenario is and how the
> > change fixes it.
> >
>
> I followed what the GDB testsuite does (it matches \r\n at the end of
> many regexps), but in fact it seems it's not needed:
> it works if I update the "got" regexp like this (ie. accept any number
> of leading \r or \n):
> -   -re {^(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> +   -re {^[\n\r]*(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> and leave the "skipping" regexp as it is currently.
>
> Is the new attached version OK?
>
> Thanks,
>
> Christophe
>
> > >
> > > Tested on aarch64-linux-gnu.
> > >
> > > [1] 
> > > https//github.com/bminor/binutils-gdb/blob/master/gdb/testsuite/README#L269
> > >
> > > 2024-01-24  Christophe Lyon  
> > >
> > > libstdc++-v3/
> > > * testsuite/lib/gdb-test.exp (gdb-test): Fix regexps.
> > > ---
> > >  libstdc++-v3/testsuite/lib/gdb-test.exp | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> > >
> > > > diff --git a/libstdc++-v3/testsuite/lib/gdb-test.exp b/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > index 31206f2fc32..0de8d9ee153 100644
> > > --- a/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > +++ b/libstdc++-v3/testsuite/lib/gdb-test.exp
> > > > @@ -194,7 +194,7 @@ proc gdb-test { marker {selector {}} {load_xmethods 0} } {
> > >
> > >  set test_counter 0
> > >  remote_expect target [timeout_value] {
> > > -   -re {^(type|\$([0-9]+)) = ([^\n\r]*)[\n\r]+} {
> > > +   -re {^(type|\$([0-9]+)) = ([^\n\r]*)\r\n} {
> > > send_log "got: $expect_out(buffer)"
> > >
> > > incr test_counter
> > > > @@ -250,7 +250,7 @@ proc gdb-test { marker {selector {}} {load_xmethods 0} } {
> > > return
> > > }
> > >
> > > -   -re {^[^$][^\n\r]*[\n\r]+} {
> > > +   -re {^[\r\n]*[^$][^\n\r]*\r\n} {
> > > send_log "skipping: $expect_out(buffer)"
> > > exp_continue
> > > }
> > > --
> > > 2.34.1
> > >
> >
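
The effect of a stray line-ending character in front of a value line, as
discussed in the thread above, can be sketched in C with POSIX extended
regular expressions.  This is only an approximation: the testsuite uses
Tcl/expect regexps, not POSIX ones, and the helper and macro names below are
made up for illustration.

```c
#include <regex.h>
#include <stddef.h>

/* The "got" pattern as it is today, and the relaxed variant that accepts
   any number of leading CR/LF characters.  */
#define GOT_STRICT  "^(type|\\$([0-9]+)) = ([^\n\r]*)[\n\r]+"
#define GOT_RELAXED "^[\n\r]*(type|\\$([0-9]+)) = ([^\n\r]*)[\n\r]+"

/* Return 1 if PATTERN (a POSIX ERE) matches TEXT, 0 otherwise,
   including when the pattern fails to compile.  */
static int
matches (const char *pattern, const char *text)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  int ok = regexec (&re, text, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}

/* Does the current "got" pattern accept buffer S?  */
int
strict_matches (const char *s)
{
  return matches (GOT_STRICT, s);
}

/* Does the relaxed "got" pattern accept buffer S?  */
int
relaxed_matches (const char *s)
{
  return matches (GOT_RELAXED, s);
}
```

With a buffer such as "\n$1 = 5\r\n", where the "\n" is left over from a
previous "\r\n" that was consumed one byte at a time, the strict pattern no
longer matches at the start of the buffer, while the relaxed one still does.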

[PATCH V1] RISC-V: Add minimal support for zabha extension.

2024-02-06 Thread shiyulong

From: yulong 

This patch adds minimal support for the zabha extension.
The documentation is at:
https://github.com/riscv/riscv-zabha/blob/v1.0-rc1/zabha.adoc
There are no amocas.[b|h] instructions, because the zacas extension is not
merged yet.

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Add zabha extension name.
* config/riscv/riscv.md (amo_addqi3): New mode.
(amo_addhi3): Ditto.
(amo_minqi3): Ditto.
(amo_minuqi3): Ditto.
(amo_minhi3): Ditto.
(amo_minuhi3): Ditto.
(amo_maxqi3): Ditto.
(amo_maxuqi3): Ditto.
(amo_maxhi3): Ditto.
(amo_maxuhi3): Ditto.
(amo_andqi3): Ditto.
(amo_andhi3): Ditto.
(amo_orqi3): Ditto.
(amo_orhi3): Ditto.
(amo_xorqi3): Ditto.
(amo_xorhi3): Ditto.
(amo_swapqi3): Ditto.
(amo_swaphi3): Ditto.
* config/riscv/riscv.opt: Add zabha extension.

---
 gcc/common/config/riscv/riscv-common.cc |   2 +
 gcc/config/riscv/riscv.md   | 167 
 gcc/config/riscv/riscv.opt  |   2 +
 3 files changed, 171 insertions(+)

diff --git a/gcc/common/config/riscv/riscv-common.cc b/gcc/common/config/riscv/riscv-common.cc
index 631ce8309a0..9c3be0d7651 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -250,6 +250,7 @@ static const struct riscv_ext_version riscv_ext_version_table[] =
   {"za64rs",  ISA_SPEC_CLASS_NONE, 1, 0},
   {"za128rs", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zawrs", ISA_SPEC_CLASS_NONE, 1, 0},
+  {"zabha", ISA_SPEC_CLASS_NONE, 1, 0},
 
   {"zba", ISA_SPEC_CLASS_NONE, 1, 0},
   {"zbb", ISA_SPEC_CLASS_NONE, 1, 0},
@@ -1504,6 +1505,7 @@ static const riscv_ext_flag_table_t riscv_ext_flag_table[] =
   {"za64rs", &gcc_options::x_riscv_za_subext, MASK_ZA64RS},
   {"za128rs", &gcc_options::x_riscv_za_subext, MASK_ZA128RS},
   {"zawrs", &gcc_options::x_riscv_za_subext, MASK_ZAWRS},
+  {"zabha", &gcc_options::x_riscv_za_subext, MASK_ZABHA},
 
   {"zba",&gcc_options::x_riscv_zb_subext, MASK_ZBA},
   {"zbb",&gcc_options::x_riscv_zb_subext, MASK_ZBB},
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 39b29795cd6..058b63ac7f0 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -134,6 +134,9 @@
   ;; XTheadInt unspec
   UNSPECV_XTHEADINT_PUSH
   UNSPECV_XTHEADINT_POP
+
+  ;; Zabha instructions.
+  UNSPEC_AMO_SWAP
 ])
 
 (define_constants
@@ -849,6 +852,24 @@
   [(set_attr "type" "arith")
(set_attr "mode" "SI")])
 
+(define_insn "amo_addqi3"
+  [(set (match_operand:QI  0 "register_operand" "=r,r")
+   (plus:QI (match_operand:QI 1 "register_operand" " r,r")
+(match_operand:QI 2 "arith_operand" " r,r")))]
+  "TARGET_ZABHA"
+  "amoadd.b\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "QI")])
+
+(define_insn "amo_addhi3"
+  [(set (match_operand:HI  0 "register_operand" "=r,r")
+   (plus:HI (match_operand:HI 1 "register_operand" " r,r")
+(match_operand:HI 2 "arith_operand" " r,r")))]
+  "TARGET_ZABHA"
+  "amoadd.h\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "HI")])
+
 ;;
 ;;  
 ;;
@@ -1645,6 +1666,78 @@
   [(set_attr "type" "fmove")
(set_attr "mode" "")])
 
+(define_insn "amo_minqi3"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+   (smin:QI (match_operand:QI 1 "register_operand" " r")
+  (match_operand:QI 2 "register_operand" " r")))]
+  "TARGET_ZABHA"
+  "amomin.b\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "QI")])
+
+(define_insn "amo_minuqi3"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+   (umin:QI (match_operand:QI 1 "register_operand" " r")
+  (match_operand:QI 2 "register_operand" " r")))]
+  "TARGET_ZABHA"
+  "amominu.b\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "QI")])
+
+(define_insn "amo_minhi3"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+   (smin:HI (match_operand:HI 1 "register_operand" " r")
+  (match_operand:HI 2 "register_operand" " r")))]
+  "TARGET_ZABHA"
+  "amomin.h\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "HI")])
+
+(define_insn "amo_minuhi3"
+  [(set (match_operand:HI 0 "register_operand" "=r")
+   (umin:HI (match_operand:HI 1 "register_operand" " r")
+  (match_operand:HI 2 "register_operand" " r")))]
+  "TARGET_ZABHA"
+  "amominu.h\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "HI")])
+
+(define_insn "amo_maxqi3"
+  [(set (match_operand:QI 0 "register_operand" "=r")
+   (smax:QI (match_operand:QI 1 "register_operand" " r")
+  (match_operand:QI 2 "register_operand" " r")))]
+  "TARGET_ZABHA"
+  "amomax.b\t%0,%1,%2"
+  [(set_attr "type" "atomic")
+   (set_attr "mode" "QI")])
+
+(define_insn "amo_max

Re: Ping: Re: [PATCH] libgcc: fix SEH C++ rethrow semantics [PR113337]

2024-02-06 Thread Jonathan Yong


On 2/6/24 05:31, NightStrike wrote:

On Mon, Feb 5, 2024, 06:53 Matteo Italia  wrote:


On 31/01/24 04:24, LIU Hao wrote:

On 2024-01-31 08:08, Jonathan Yong wrote:

On 1/24/24 15:17, Matteo Italia wrote:

Ping! That's a one-line fix, and you can find all the details in the
bugzilla entry. Also, I can provide executables built with the
affected toolchains, demonstrating the problem and the fix.

Thanks,
Matteo



I was away last week. LH, care to comment? Changes look fine to me.



The change looks good to me, too.

I haven't tested it though. According to a similar construction around
'libgcc/unwind.inc:265' it should be that way.


Hello,

thank you for the replies, is there anything else I can do to help push
this forward?



Remember to mention the pr with the right syntax in the ChangeLog so the
bot adds a comment field. I didn't see it in yours, but I might have missed
it.







Thanks all, pushed to master branch.

Re: [PATCH] libgcc: Export i386 symbols added after GCC_7.0.0 on Solaris [PR113700]

2024-02-06 Thread Jakub Jelinek

On Tue, Feb 06, 2024 at 10:00:18AM +0100, Rainer Orth wrote:
> 2024-02-01  Rainer Orth  
> 
>   libgcc:
>   * config/i386/libgcc-sol2.ver (GCC_14.0.0): Added all symbols from
>   i386/libgcc-glibc.ver (GCC_12.0.0, GCC_13.0.0, GCC_14.0.0).
>   * config/i386/libgcc-glibc.ver: Request notifications on updates.

LGTM, except for a nit:

> # HG changeset patch
> # Parent  e582765ce980229b4c3ae5afc6a28e5aa480cdaf
> libgcc: Export i386 symbols added after GCC_7.0.0 on Solaris [PR113700]
> 
> diff --git a/libgcc/config/i386/libgcc-glibc.ver b/libgcc/config/i386/libgcc-glibc.ver
> --- a/libgcc/config/i386/libgcc-glibc.ver
> +++ b/libgcc/config/i386/libgcc-glibc.ver
> @@ -236,3 +236,7 @@ GCC_14.0.0 {
>__floatbitintxf
>__floatbitinttf
>  }
> +
> +# Please notify the maintainers of libgcc-{bsd,darwin,sol2}.ver of any
> +# additions.  Those version scripts usually need to be kept in sync with
> +# libgcc-glibc.ver.
> diff --git a/libgcc/config/i386/libgcc-sol2.ver b/libgcc/config/i386/libgcc-sol2.ver
> --- a/libgcc/config/i386/libgcc-sol2.ver
> +++ b/libgcc/config/i386/libgcc-sol2.ver
> @@ -115,3 +115,39 @@ GCC_4.8.0 {
>  GCC_7.0.0 {
>__signbittf2
>  }
> +
> +GCC_14.0.0 {
> +  # Added to GCC_12.0.0 in i386/libgcc-glibc.
> +  # Added to GCC_13.0.0 in i386/libgcc-glibc.
> +  # Added to GCC_14.0.0 in i386/libgcc-glibc.

Please append "ver." to these comment lines, so that they name
i386/libgcc-glibc.ver.

Jakub

Re: [PATCH v2] openmp, fortran: Add Fortran support for indirect clause on the declare target directive

2024-02-06 Thread Tobias Burnus


Kwok Cheung Yeung wrote:
As previously discussed, this version of the patch adds code to emit a 
warning when a directive like this:


!$omp declare target indirect(.true.)

is encountered (i.e. a target directive containing at least one 
clause, but no to/enter clause, which appears to violate the OpenMP 
standard). A test is also added to 
gfortran.dg/gomp/declare-target-indirect-1.f90 to test for this.


Thanks. And indeed, the 5.1 spec requires under "Restrictions to the 
declare target directive are as follows:" "If the directive has a 
clause, it must contain at least one 'to' clause or at least one 'link' 
clause.". [5.2 replaced 'to' by its alias 'enter' and the 6.0 preview 
added 'local' to the list.]



I have also added a declare-target-indirect-3.f90 test to libgomp to 
check that procedures passed via a dummy argument work properly when 
used in an indirect call.


Okay for mainline?


LGTM. I just wonder whether there should be a value test and not just a 
does-not-crash-when-called test for the latter testcase, i.e.




+++ b/libgomp/testsuite/libgomp.fortran/declare-target-indirect-3.f90
@@ -0,0 +1,25 @@
+! { dg-do run }
+
+! Check that indirect calls work on procedures passed in via a dummy argument
+
+module m
+contains
+  subroutine bar
+!$omp declare target enter(bar) indirect

e.g. "integer function bar()" ... " bar = 42"

+  end subroutine
+
+  subroutine foo(f)
+procedure(bar) :: f
+
+!$omp target
+  call f

And then: if (f() /= 42) stop 1

+!$omp end target
+  end subroutine
+end module


Thanks,

Tobias

[PATCH] libgcc: Export i386 symbols added after GCC_7.0.0 on Solaris [PR113700]

2024-02-06 Thread Rainer Orth

As reported in the PR, all libgcc x86 symbol versions added after
GCC_7.0.0 were only added to i386/libgcc-glibc.ver, missing all of
libgcc-sol2.ver, libgcc-bsd.ver, and libgcc-darwin.ver.

This patch fixes this for Solaris/x86, adding all of them
(GCC_1[234].0.0) as GCC_14.0.0 to not retroactively change history.

Since this isn't the first time this has happened, I've added a note to the
end of libgcc-glibc.ver to request notifying other maintainers in case
of additions.

Tested on i386-pc-solaris2.11.

Ok for trunk?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2024-02-01  Rainer Orth  

libgcc:
* config/i386/libgcc-sol2.ver (GCC_14.0.0): Added all symbols from
i386/libgcc-glibc.ver (GCC_12.0.0, GCC_13.0.0, GCC_14.0.0).
* config/i386/libgcc-glibc.ver: Request notifications on updates.

# HG changeset patch
# Parent  e582765ce980229b4c3ae5afc6a28e5aa480cdaf
libgcc: Export i386 symbols added after GCC_7.0.0 on Solaris [PR113700]

diff --git a/libgcc/config/i386/libgcc-glibc.ver b/libgcc/config/i386/libgcc-glibc.ver
--- a/libgcc/config/i386/libgcc-glibc.ver
+++ b/libgcc/config/i386/libgcc-glibc.ver
@@ -236,3 +236,7 @@ GCC_14.0.0 {
   __floatbitintxf
   __floatbitinttf
 }
+
+# Please notify the maintainers of libgcc-{bsd,darwin,sol2}.ver of any
+# additions.  Those version scripts usually need to be kept in sync with
+# libgcc-glibc.ver.
diff --git a/libgcc/config/i386/libgcc-sol2.ver b/libgcc/config/i386/libgcc-sol2.ver
--- a/libgcc/config/i386/libgcc-sol2.ver
+++ b/libgcc/config/i386/libgcc-sol2.ver
@@ -115,3 +115,39 @@ GCC_4.8.0 {
 GCC_7.0.0 {
   __signbittf2
 }
+
+GCC_14.0.0 {
+  # Added to GCC_12.0.0 in i386/libgcc-glibc.
+  __divhc3
+  __mulhc3
+  __eqhf2
+  __nehf2
+  __extendhfdf2
+  __extendhfsf2
+  __extendhftf2
+  __extendhfxf2
+  __fixhfti
+  __fixunshfti
+  __floattihf
+  __floatuntihf
+  __truncdfhf2
+  __truncsfhf2
+  __trunctfhf2
+  __truncxfhf2
+  # Added to GCC_13.0.0 in i386/libgcc-glibc.
+  __extendbfsf2
+  __floattibf
+  __floatuntibf
+  __truncdfbf2
+  __truncsfbf2
+  __trunctfbf2
+  __truncxfbf2
+  __trunchfbf2
+  # Added to GCC_14.0.0 in i386/libgcc-glibc.
+  __fixxfbitint
+  __fixtfbitint
+  __floatbitintbf
+  __floatbitinthf
+  __floatbitintxf
+  __floatbitinttf
+}

Re: [PATCH 2/2] rtl-optimization/113255 - avoid re-associating REG_POINTER MINUS

2024-02-06 Thread Richard Biener

On Mon, 5 Feb 2024, Jeff Law wrote:

> 
> 
> On 2/5/24 01:15, Richard Biener wrote:
> 
> >>>
> >>>   PR rtl-optimization/113255
> >>>   * simplify-rtx.cc (simplify_context::simplify_binary_operation_1):
> >>>   Do not re-associate a MINUS with a REG_POINTER op0.
> >> Nasty little set of problems.  I don't think we ever pondered that we could
> >> have multiple REGNO_POINTER_FLAG objects in the same expression, but
> >> clearly that can happen once you introduce a 3rd term in the expression.
> >>
> >> I don't mind avoiding the reassociation, but it feels like we're papering
> >> over problems in alias.cc.  Conceptually it seems like if we have two
> >> objects with REG_POINTER set, then we can't know which one is the real
> >> base.  So your patch in the PR wasn't that bad.
> > 
> > It wasn't bad, it's the only correct fix.  The question is what we do
> > for branches (or whether we do anything there) and whether we just accept
> > that that fix causes some optimization regressions.
> For the branches, I'd go whatever you feel the safest change is.  While it
> looks like some of this is fundamentally broken, it can't be *that* bad since
> it's just rearing its ugly head now.

It did in the past as well, where we worked around it by tweaking either
code generation or heuristics.

> I could even make a case that going with the patch from the PR for the
> branches is reasonable.  It's attacking at least part of the root problem.
> 
> > 
> >> Alternately, just stop using REG_POINTER for alias analysis?   It looks
> >> fundamentally flawed to me in that context.  In fact, one might argue that
> >> the only legitimate use would be to indicate to the target that we know a
> >> pointer points into an object.  Some targets (the PA) need this because
> >> x + y is not the same as y + x when used as a memory address.
> >>
> >> If we wanted to be a bit more surgical, drop REG_POINTER from just the
> >> MINUS handling in alias.cc?
> > 
> > The problem is that REG_POINTER is just used as a heuristic
> > (and compile-time optimization) as to which of a binary operator's
> > operands we (preferably) use as the base.  find_base_{term,value}
> > happily look at operands that are not REG_POINTER (that are
> > not REG_P), since for the case in question, even w/o re-assoc
> > there would be no way to say the inner MINUS is not a pointer
> > (it's a REG flag).
> > 
> > The heuristics don't help much when passes like DSE use CSELIB
> > and combine operations like above, we then get to see that
> > the way find_base_{term,value} perform pointer analysis is
> > fundamentally flawed.  Any tweaking there has the chance to
> > make other cases run into wrong base discoveries.
> > 
> Exactly.  So maybe I'm missing something -- it sounds like we both agree that
> using REG_POINTER in the aliasing code is just fundamentally broken in the
> modern world (and perhaps has been for a long time).  So we "just" need to
> excise that code from alias.cc.

Btw, it's not so much REG_POINTER that is problematic - it's that
find_base_{term,value} for binary operators doesn't merge the
"points-to solution" for both operands and that if it doesn't
find a "base" in the part of the IL it sees it assumes the points-to
set is empty.  That is, it combines "is not a pointer" and "I have
no idea" in the NULL return value.  The fix that's now installed on
trunk "solves" this lack of merging by never handling any case that
would require merging (optimistically treating CONST_INT as known
not pointer).
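
To illustrate the point about the NULL return value, here is a minimal C++ sketch. It is not GCC code: `Kind`, `Expr`, `BaseResult`, `leaf`, `binary` and `find_base` are hypothetical names, and the expression tree is a heavily simplified stand-in for RTL. It only shows a lookup that keeps "is not a pointer" and "I have no idea" apart and merges the answers for both operands of a binary operator instead of preferring one of them.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>

// Hypothetical, heavily simplified expression kinds (not RTL codes).
enum class Kind { ConstInt, Base, Opaque, Binary };

struct Expr {
  Kind kind;
  std::string name;               // meaningful for Kind::Base only
  std::shared_ptr<Expr> op0, op1; // meaningful for Kind::Binary only
};

// Three-state answer.  An empty set with unknown == false means "is not a
// pointer"; unknown == true means "I have no idea".
struct BaseResult {
  bool unknown;
  std::set<std::string> bases;
};

// Build a leaf expression (constant, named base object, or opaque value).
Expr leaf(Kind k, const std::string &name = "") {
  return Expr{k, name, nullptr, nullptr};
}

// Build a binary expression over copies of the two operands.
Expr binary(const Expr &a, const Expr &b) {
  return Expr{Kind::Binary, "", std::make_shared<Expr>(a),
              std::make_shared<Expr>(b)};
}

BaseResult find_base(const Expr &e) {
  switch (e.kind) {
  case Kind::ConstInt:
    return BaseResult{false, {}};       // known not to be a pointer
  case Kind::Base:
    return BaseResult{false, {e.name}}; // a known base object
  case Kind::Opaque:
    return BaseResult{true, {}};        // nothing is known about this value
  case Kind::Binary: {
    assert(e.op0 && e.op1);
    BaseResult a = find_base(*e.op0);
    BaseResult b = find_base(*e.op1);
    // Merge both operands rather than picking a preferred one.
    BaseResult merged{a.unknown || b.unknown, a.bases};
    merged.bases.insert(b.bases.begin(), b.bases.end());
    return merged;
  }
  }
  return BaseResult{true, {}};          // not reached; be conservative
}
```

With this shape, an expression combining two named bases reports both of them as possible bases, and an opaque operand marks the whole result as unknown instead of producing an empty answer.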

I don't think proper PTA analysis on RTL will yield anything useful
and we're better off tracking the guarantees which it tries to
handle with the ADDRESS base values for stack, spill and argument
space when we create MEMs (and annotate MEMs).  But I have a hard time
deciphering RTL details which isn't really my main area of expertise ...
We do already tag some MEMs (like spills with their special MEM_ATTRs),
but argument setup seems lacking in this regard, and that's the
difficult to understand part since it involves three different
unique_base_value (based on STACK_POINTER_REGNUM, ARG_POINTER_REGNUM
and FRAME_POINTER_REGNUM) and the base of a MEM seems to vary based
on elimination state :/

> > I'll take it that we need to live with the regressions for GCC 14
> > and the wrong-code bug in GCC 13 and earlier.
> I'm not sure I agree with this statement.  Or maybe I thought the patch in the
> PR was more effective than it really is.  At some level we ought to be able to
> cut out the short-cuts enabled by REG_POINTER.  That runs the risk of
> perturbing more code, but it seems to me that's a risk we might need to take.

REG_POINTER just determines which operand we prefer, so it's a
heuristic for the case when multiple bases are involved in the computation 
of the final value.

Richard.

Re: [PATCH] libstdc++: /dev/null is not accessible on Windows

2024-02-06 Thread Jonathan Yong


On 2/5/24 19:38, Jonathan Wakely wrote:

On Mon, 5 Feb 2024, 19:07 Torbjörn SVENSSON wrote:


Ok for trunk and releases/gcc-13?



OK, thanks


Done, pushed to master and releases/gcc-13.

RE: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread Li, Pan2

Not yet. It has been a long time since the last run; I will make sure there
are no surprises from that.

Pan

From: juzhe.zh...@rivai.ai 
Sent: Tuesday, February 6, 2024 4:11 PM
To: Li, Pan2 ; gcc-patches 
Cc: Li, Pan2 ; Wang, Yanzhang ; 
kito.cheng 
Subject: Re: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when 
empty args

Did you run the C++ intrinsic tests compiled with the C compiler?

juzhe.zh...@rivai.ai


Re: [PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread juzhe.zh...@rivai.ai

Did you run the C++ intrinsic tests compiled with the C compiler?



juzhe.zh...@rivai.ai
 

[PATCH v1] RISC-V: Bugfix for RVV overloaded intrinisc ICE when empty args

2024-02-06 Thread pan2 . li

From: Pan Li 

There is one corner case, similar to the example below:

void test (void)
{
  __riscv_vfredosum_tu ();
}

It triggers an ICE because of the implementation details of overloaded
functions in GCC.  According to the RVV intrinsic doc, there is no such
overloaded function with empty args.  Unfortunately, we register the
empty-args function as overloaded to avoid a conflict.  Thus, there is
actually one registered function left after NULL_TREE is returned to the
middle-end, which finally results in an ICE when expanding.  For example:

1. First we register void __riscv_vfredmax () as the overloaded function.
2. Then resolve_overloaded_builtin (this func) returns NULL_TREE.
3. The function registered in step 1 bypasses the args check because it has
   empty args.
4. Finally, we fall into expand_builtin with empty args and hit the ICE.

Here we report an error when an overloaded function is called with empty
args.  For example:

test.c: In function 'foo':
test.c:8:3: error: no matching function call to '__riscv_vfredosum_tu' with empty args
8 |   __riscv_vfredosum_tu();
  |   ^~~~
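
The behaviour described above can be sketched in standalone C++ as follows. This is only a model under stated assumptions, not the code in riscv-vector-builtins.cc: `RegisteredFunction`, `Resolution`, `Result` and `resolve_overloaded` are hypothetical names, the argument list is a plain vector standing in for the tree arguments, and the real type-based overload matching is left out. The only thing taken from this patch is the diagnostic wording.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an entry in the table of registered builtins.
struct RegisteredFunction {
  std::string name;
  bool overloaded_p;
};

enum class Resolution { NotOverloaded, Error, Resolved };

struct Result {
  Resolution status;
  std::string message; // diagnostic text, set only when status == Error
};

// Model of the resolution step.  `args` stands in for the call's argument
// list; only its length matters in this sketch.
Result resolve_overloaded(const std::vector<RegisteredFunction> &registry,
                          unsigned code, const std::vector<int> &args) {
  // Unknown or non-overloaded builtins are left to the caller, which is
  // what returning NULL_TREE expresses in the description above.
  if (code >= registry.size() || !registry[code].overloaded_p)
    return Result{Resolution::NotOverloaded, ""};

  // No overloaded intrinsic takes zero arguments, so reject the call here
  // rather than letting it reach expansion.
  if (args.empty())
    return Result{Resolution::Error,
                  "no matching function call to '" + registry[code].name +
                      "' with empty args"};

  // The real resolver would now select an instance from the argument
  // types; that part is omitted.
  return Result{Resolution::Resolved, ""};
}
```

The design point is that the empty-args case gets its own error result instead of sharing the "not overloaded" return value, so the caller cannot mistake it for a call that should proceed to expansion.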

The tests below pass with this patch.

* The riscv regression tests.

PR target/113766

gcc/ChangeLog:

* config/riscv/riscv-protos.h (resolve_overloaded_builtin): Adjust
the signature of func.
* config/riscv/riscv-c.cc (riscv_resolve_overloaded_builtin): Ditto.
* config/riscv/riscv-vector-builtins.cc (resolve_overloaded_builtin): Make
overloaded func with empty args error.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/rvv/base/pr113766-1.c: New test.
* gcc.target/riscv/rvv/base/pr113766-2.c: New test.

Signed-off-by: Pan Li 
---
 gcc/config/riscv/riscv-c.cc   |  3 +-
 gcc/config/riscv/riscv-protos.h   |  2 +-
 gcc/config/riscv/riscv-vector-builtins.cc | 23 -
 .../gcc.target/riscv/rvv/base/pr113766-1.c| 85 +++
 .../gcc.target/riscv/rvv/base/pr113766-2.c| 48 +++
 5 files changed, 155 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-1.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/base/pr113766-2.c

diff --git a/gcc/config/riscv/riscv-c.cc b/gcc/config/riscv/riscv-c.cc
index 2e306057347..94c3871c760 100644
--- a/gcc/config/riscv/riscv-c.cc
+++ b/gcc/config/riscv/riscv-c.cc
@@ -250,7 +250,8 @@ riscv_resolve_overloaded_builtin (unsigned int uncast_location, tree fndecl,
 case RISCV_BUILTIN_GENERAL:
   break;
 case RISCV_BUILTIN_VECTOR:
-  new_fndecl = riscv_vector::resolve_overloaded_builtin (subcode, arglist);
+  new_fndecl = riscv_vector::resolve_overloaded_builtin (loc, subcode,
+fndecl, arglist);
   break;
 default:
   gcc_unreachable ();
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index b3f0bdb9924..ae1685850ac 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -560,7 +560,7 @@ gimple *gimple_fold_builtin (unsigned int, gimple_stmt_iterator *, gcall *);
 rtx expand_builtin (unsigned int, tree, rtx);
 bool check_builtin_call (location_t, vec, unsigned int,
   tree, unsigned int, tree *);
-tree resolve_overloaded_builtin (unsigned int, vec *);
+tree resolve_overloaded_builtin (location_t, unsigned int, tree, vec *);
 bool const_vec_all_same_in_range_p (rtx, HOST_WIDE_INT, HOST_WIDE_INT);
 bool legitimize_move (rtx, rtx *);
 void emit_vlmax_vsetvl (machine_mode, rtx);
diff --git a/gcc/config/riscv/riscv-vector-builtins.cc b/gcc/config/riscv/riscv-vector-builtins.cc
index 403e1021fd1..efcdc8f1767 100644
--- a/gcc/config/riscv/riscv-vector-builtins.cc
+++ b/gcc/config/riscv/riscv-vector-builtins.cc
@@ -4606,7 +4606,8 @@ check_builtin_call (location_t location, vec, unsigned int code,
 }
 
 tree
-resolve_overloaded_builtin (unsigned int code, vec *arglist)
+resolve_overloaded_builtin (location_t loc, unsigned int code, tree fndecl,
+   vec *arglist)
 {
   if (code >= vec_safe_length (registered_functions))
 return NULL_TREE;
@@ -4616,12 +4617,26 @@ resolve_overloaded_builtin (unsigned int code, vec *arglist)
   if (!rfun || !rfun->overloaded_p)
 return NULL_TREE;
 
+  /* According to the rvv intrinisc doc, we have no such overloaded function
+ with empty args.  Unfortunately, we register the empty args function as
+ overloaded for avoiding conflict.  Thus, there will actual one register
+ function after return NULL_TREE back to the middle-end, and finally result
+ in ICE when expanding.  For example:
+
+ 1. First we registered void __riscv_vfredmax () as the overloaded function.
+ 2. Then resolve_overloaded_builtin (this func) return NULL_TREE.
+ 3. The functions register in step 1 bypass the args check as empty args.
+ 4. Finally, fall into expand_builtin with empty args and meet ICE.
+
+ Here we report error whe
