Re: [PATCH] ppc: testsuite: vec-mul requires vsx runtime

2024-04-28 Thread Alexandre Oliva
On Apr 23, 2024, "Kewen.Lin"  wrote:

>> -/* { dg-do run } */
>> +/* { dg-do compile { target { ! vsx_hw } } } */
>> +/* { dg-do run { target vsx_hw } } */
>> /* { dg-require-effective-target powerpc_vsx_ok } */

> Nit: It's useless to check powerpc_vsx_ok for vsx_hw, so powerpc_vsx_ok check
> can be moved to be with ! vsx_hw.

> OK with this nit tweaked, thanks!

Thanks, here's what I'm pushing momentarily...


ppc: testsuite: vec-mul requires vsx runtime

vec-mul is an execution test, but it only requires a powerpc_vsx_ok
effective target, which is enough only for compile tests.  In order to
check for runtime and execution environment support, we need to
require vsx_hw.  Make that a condition for execution, but still
perform a compile test if the condition is not satisfied.
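
As a rough model of how the two dg-do lines in the patch below are meant to combine, here is a minimal sketch in C.  It reflects one reading of DejaGnu's dg-do handling (an action whose selector matched is not overridden by a later directive whose selector does not match); the enum and function names are invented for illustration and are not part of the testsuite.

```c
#include <stdbool.h>

/* Hypothetical names: this only models the selector logic, it is not
   DejaGnu code.  */
enum dg_action { DG_UNSUPPORTED, DG_COMPILE, DG_RUN };

/* vsx_hw: the hardware and runtime can execute VSX insns.
   vsx_ok: the toolchain accepts -mvsx (powerpc_vsx_ok).  */
enum dg_action
vec_mul_action (bool vsx_hw, bool vsx_ok)
{
  /* dg-do run { target vsx_hw } */
  if (vsx_hw)
    return DG_RUN;

  /* dg-do compile { target { { ! vsx_hw } && powerpc_vsx_ok } } */
  if (vsx_ok)
    return DG_COMPILE;

  /* Neither selector matched, so the test is reported as unsupported.  */
  return DG_UNSUPPORTED;
}
```

Under this model the test only executes where VSX code can really run, and still gets compile coverage on builders that merely have a VSX-capable toolchain.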


for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/vec-mul.c: Run on target vsx_hw, just
compile otherwise.
---
 gcc/testsuite/gcc.target/powerpc/vec-mul.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/vec-mul.c b/gcc/testsuite/gcc.target/powerpc/vec-mul.c
index bfcaf80719d1d..aa0ef7aa45acc 100644
--- a/gcc/testsuite/gcc.target/powerpc/vec-mul.c
+++ b/gcc/testsuite/gcc.target/powerpc/vec-mul.c
@@ -1,5 +1,5 @@
-/* { dg-do run } */
-/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-do compile { target { { ! vsx_hw } && powerpc_vsx_ok } } } */
+/* { dg-do run { target vsx_hw } } */
 /* { dg-options "-mvsx -O3" } */
 
 /* Test that the vec_mul builtin works as expected.  */


-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH v2] xfail fetestexcept test - ppc always uses fcmpu

2024-04-28 Thread Alexandre Oliva
On Apr 23, 2024, "Kewen.Lin"  wrote:

>> --- a/gcc/testsuite/gcc.dg/torture/pr91323.c
>> +++ b/gcc/testsuite/gcc.dg/torture/pr91323.c
>> @@ -1,4 +1,5 @@
>> -/* { dg-do run } */
>> +/* { dg-do run { xfail powerpc*-*-* } } */
>> +/* The ppc xfail is because of PR target/58684.  */

> OK, though the proposed comment is slightly different from what's in
> the related commit r8-6445-g86145a19abf39f. :)  Thanks!

Oh, thanks for the pointer, that was easy to fix.  Here's what I'm
pushing momentarily...


xfail fetestexcept test - ppc always uses fcmpu

gcc.dg/torture/pr91323.c tests that a compare with NaNf doesn't set an
exception using builtin compare intrinsics, and that it does when
using regular compare operators.

That doesn't seem to be expected to work on powerpc targets.  It fails
on GNU/Linux, it's marked to be skipped on AIX, and a similar test,
gcc.dg/torture/pr93133.c, has the execution test xfailed for all of
powerpc*-*-*.

In this test, the functions that use intrinsics for the compare end up
with the same code as the one that uses compare operators, using
fcmpu, a floating compare that, unlike fcmpo, does not set the invalid
operand exception for quiet NaN.  I couldn't find any evidence that
the rs6000 backend ever outputs fcmpo.  Therefore, I'm adding the same
execution xfail marker to this test.


for  gcc/testsuite/ChangeLog

PR target/58684
* gcc.dg/torture/pr91323.c: Expect execution fail on
powerpc*-*-*.
---
 gcc/testsuite/gcc.dg/torture/pr91323.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/torture/pr91323.c b/gcc/testsuite/gcc.dg/torture/pr91323.c
index 1411fcaa3966c..4574342e728db 100644
--- a/gcc/testsuite/gcc.dg/torture/pr91323.c
+++ b/gcc/testsuite/gcc.dg/torture/pr91323.c
@@ -1,4 +1,5 @@
-/* { dg-do run } */
+/* { dg-do run { xfail powerpc*-*-* } } */
+/* remove the xfail for powerpc when pr58684 is fixed */
 /* { dg-add-options ieee } */
 /* { dg-require-effective-target fenv_exceptions } */
 /* { dg-skip-if "fenv" { powerpc-ibm-aix* } } */




Re: [PATCH v2] [testsuite] require sqrt_insn effective target where needed

2024-04-28 Thread Alexandre Oliva
On Apr 23, 2024, Iain Sandoe  wrote:

>>> --- a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
>>> @@ -1,6 +1,7 @@
>>> /* { dg-do run } */
>>> /* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
>>> /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm 
>>> -mpowerpc-gpopt" } */
>>> +/* { dg-require-effective-target sqrt_insn } */
>> 
>> This change looks sensible to me.
>> 
>> Nit: With the proposed change, I'd expect that we can remove the line for 
>> powerpc*-*-darwin*.
>> 
>> CC Iain to confirm.

> Indeed, the check for sqrt_insn fails and so the test is unsupported without 
> needing the separate
> powerpc*-*-darwin* line,

Thanks, here's the adjusted version I'm just about to push.


[testsuite] require sqrt_insn effective target where needed

Some tests fail on ppc and ppc64 when testing a compiler [with options]
for a CPU [emulator] that doesn't support the sqrt insn.

gcc.dg/cdce3.c is one in which the expected shrink-wrap
optimization only takes place when the target CPU supports a sqrt
insn.

The gcc.target/powerpc/pr46728-1[0-4].c tests use -mpowerpc-gpopt and
call sqrt(), which involves the sqrt insn that the target CPU under
test may not support.

Require a sqrt_insn effective target for all the affected tests.


for  gcc/testsuite/ChangeLog

* gcc.dg/cdce3.c: Require sqrt_insn effective target.
* gcc.target/powerpc/pr46728-10.c: Likewise.  Drop darwin
explicit skipping.
* gcc.target/powerpc/pr46728-11.c: Likewise.  Likewise.
* gcc.target/powerpc/pr46728-13.c: Likewise.  Likewise.
* gcc.target/powerpc/pr46728-14.c: Likewise.  Likewise.
---
 gcc/testsuite/gcc.dg/cdce3.c  |3 ++-
 gcc/testsuite/gcc.target/powerpc/pr46728-10.c |2 +-
 gcc/testsuite/gcc.target/powerpc/pr46728-11.c |2 +-
 gcc/testsuite/gcc.target/powerpc/pr46728-13.c |2 +-
 gcc/testsuite/gcc.target/powerpc/pr46728-14.c |2 +-
 5 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/cdce3.c b/gcc/testsuite/gcc.dg/cdce3.c
index 601ddf055fd71..f759a95972e8b 100644
--- a/gcc/testsuite/gcc.dg/cdce3.c
+++ b/gcc/testsuite/gcc.dg/cdce3.c
@@ -1,7 +1,8 @@
 /* { dg-do compile } */
 /* { dg-require-effective-target hard_float } */
+/* { dg-require-effective-target sqrt_insn } */
 /* { dg-options "-O2 -fmath-errno -fdump-tree-cdce-details -fdump-tree-optimized" } */
-/* { dg-final { scan-tree-dump "cdce3.c:11: \[^\n\r]* function call is shrink-wrapped into error conditions\." "cdce" } } */
+/* { dg-final { scan-tree-dump "cdce3.c:12: \[^\n\r]* function call is shrink-wrapped into error conditions\." "cdce" } } */
 /* { dg-final { scan-tree-dump "sqrtf \\(\[^\n\r]*\\); \\\[tail call\\\]" "optimized" } } */
 /* { dg-skip-if "doesn't have a sqrtf insn" { mmix-*-* } } */
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
index 3be4728d333a4..c04a3101c113f 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-10.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
-/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
 /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
index 43b6728a4b812..d0e3d60212194 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-11.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
-/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
 /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
index b9fd63973b728..2b9df737a9b0d 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-13.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
-/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
 /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 
diff --git a/gcc/testsuite/gcc.target/powerpc/pr46728-14.c b/gcc/testsuite/gcc.target/powerpc/pr46728-14.c
index 5a13bdb6c..e6836f515e4f8 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr46728-14.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr46728-14.c
@@ -1,6 +1,6 @@
 /* { dg-do run } */
-/* { dg-skip-if "-mpowerpc-gpopt not supported" { powerpc*-*-darwin* } } */
 /* { dg-options "-O2 -ffast-math -fno-inline -fno-unroll-loops -lm -mpowerpc-gpopt" } */
+/* { dg-require-effective-target sqrt_insn } */
 
 #include 
 



Re: enable sqrt insns for cdce3.c

2024-04-28 Thread Alexandre Oliva
On Apr 23, 2024, Hans-Peter Nilsson  wrote:

> (We could also fix the predicate description to actually say 
> "for all floating-point modes" and/or split the predicate into 
> mode-specific variants, etc. ;-)

Yeah, I suppose that could make sense.

> MMIX has sqrtdf2 but not sqrtsf2, and the latter is what's used 
> in cdce3.c.

I see, thanks for the info.



Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-28 Thread Alexandre Oliva
On Apr 24, 2024, "Kewen.Lin"  wrote:

> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one line 
> above)
> shows the original intention of this case is to expect not profitable for 
> peeling
> so it's not expected to be handled here, can we just tweak the loop bound 
> instead,
> such as:

> -#define N 14
> +#define N 13
>  #define OFF 4 

> ?, it can make this loop not profitable to be vectorized for !vect_no_align 
> with
> peeling (both pwr7 and pwr6) and keep consistent.

Like this?  I didn't feel I could claim authorship of this one-liner
just because I turned it into a patch and tested it, so I took the
liberty of turning your own words above into the commit message.  So
far, tested on ppc64le-linux-gnu (ppc9).  Testing with vxworks targets
now.  Would you like to tweak the commit message to your liking?
Otherwise, is this ok to install?

Thanks,


adjust iteration count for ppc costmodel 76b

From: Kewen Lin 

The original intention of this case is to expect not profitable for
peeling.  Tweak the loop bound to make this loop not profitable to be
vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
keep consistent.


for  gcc/testsuite/ChangeLog

* gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
---
 .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
index cbbfbb24658f8..e48b0ab759e75 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
@@ -6,7 +6,7 @@
 
 /* On Power7 without misalign vector support, this case is to check it's not
profitable to perform vectorization by peeling to align the store.  */
-#define N 14
+#define N 13
 #define OFF 4
 
 /* Check handling of accesses for which the "initial condition" -




Re: [PATCH] [x86] Adjust alternative *k to ?k for avx512 mask in zero_extend patterns

2024-04-28 Thread Uros Bizjak
On Sun, Apr 28, 2024 at 7:47 AM liuhongt  wrote:
>
> When both the source and destination operands require AVX512 MASK_REGS,
> this lets RA allocate a MASK_REGS register instead of a GPR, avoiding a
> reload from GPR to MASK_REGS.
> It's similar to what was done for the logic patterns.
>
> Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
> Ok for trunk?
>
> gcc/ChangeLog:
>
> * config/i386/i386.md (zero_extendsidi2): Adjust
> alternative *k to ?k.
> (zero_extenddi2): Ditto.
> (*zero_extendsi2): Ditto.
> (*zero_extendqihi2): Ditto.

OK.

Thanks,
Uros.

> ---
>  gcc/config/i386/i386.md   | 16 +++
>  .../gcc.target/i386/zero_extendkmask.c| 43 +++
>  2 files changed, 51 insertions(+), 8 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/zero_extendkmask.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index d4ce3809e6d..f2ab7fdcd58 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4567,10 +4567,10 @@ (define_expand "zero_extendsidi2"
>
>  (define_insn "*zero_extendsidi2"
>[(set (match_operand:DI 0 "nonimmediate_operand"
> -   "=r,?r,?o,r   ,o,?*y,?!*y,$r,$v,$x,*x,*v,*r,*k")
> +   "=r,?r,?o,r   ,o,?*y,?!*y,$r,$v,$x,*x,*v,?r,?k")
> (zero_extend:DI
>  (match_operand:SI 1 "x86_64_zext_operand"
> -   "0 ,rm,r ,rmWz,0,r  ,m   ,v ,r ,m ,*x,*v,*k,*km")))]
> +   "0 ,rm,r ,rmWz,0,r  ,m   ,v ,r ,m ,*x,*v,?k,?km")))]
>""
>  {
>switch (get_attr_type (insn))
> @@ -4703,9 +4703,9 @@ (define_mode_attr kmov_isa
>[(QI "avx512dq") (HI "avx512f") (SI "avx512bw") (DI "avx512bw")])
>
>  (define_insn "zero_extenddi2"
> -  [(set (match_operand:DI 0 "register_operand" "=r,*r,*k")
> +  [(set (match_operand:DI 0 "register_operand" "=r,?r,?k")
> (zero_extend:DI
> -(match_operand:SWI12 1 "nonimmediate_operand" "m,*k,*km")))]
> +(match_operand:SWI12 1 "nonimmediate_operand" "m,?k,?km")))]
>"TARGET_64BIT"
>"@
> movz{l|x}\t{%1, %k0|%k0, %1}
> @@ -4758,9 +4758,9 @@ (define_insn_and_split "zero_extendsi2_and"
> (set_attr "mode" "SI")])
>
>  (define_insn "*zero_extendsi2"
> -  [(set (match_operand:SI 0 "register_operand" "=r,*r,*k")
> +  [(set (match_operand:SI 0 "register_operand" "=r,?r,?k")
> (zero_extend:SI
> - (match_operand:SWI12 1 "nonimmediate_operand" "m,*k,*km")))]
> + (match_operand:SWI12 1 "nonimmediate_operand" "m,?k,?km")))]
>"!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))"
>"@
> movz{l|x}\t{%1, %0|%0, %1}
> @@ -4813,8 +4813,8 @@ (define_insn_and_split "zero_extendqihi2_and"
>
>  ; zero extend to SImode to avoid partial register stalls
>  (define_insn "*zero_extendqihi2"
> -  [(set (match_operand:HI 0 "register_operand" "=r,*r,*k")
> -   (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" 
> "qm,*k,*km")))]
> +  [(set (match_operand:HI 0 "register_operand" "=r,?r,?k")
> +   (zero_extend:HI (match_operand:QI 1 "nonimmediate_operand" 
> "qm,?k,?km")))]
>"!(TARGET_ZERO_EXTEND_WITH_AND && optimize_function_for_speed_p (cfun))"
>"@
> movz{bl|x}\t{%1, %k0|%k0, %1}
> diff --git a/gcc/testsuite/gcc.target/i386/zero_extendkmask.c 
> b/gcc/testsuite/gcc.target/i386/zero_extendkmask.c
> new file mode 100644
> index 000..6b18980bbd1
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/zero_extendkmask.c
> @@ -0,0 +1,43 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-march=x86-64-v4 -O2" } */
> +/* { dg-final { scan-assembler-not {(?n)shr[bwl]} } } */
> +/* { dg-final { scan-assembler-not {(?n)movz[bw]} } } */
> +
> +#include
> +
> +__m512
> +foo (__m512d a, __m512d b, __m512 c, __m512 d)
> +{
> +  return _mm512_mask_mov_ps (c, (__mmask16) (_mm512_cmpeq_pd_mask (a, b) >> 
> 1), d);
> +}
> +
> +
> +__m512i
> +foo1 (__m512d a, __m512d b, __m512i c, __m512i d)
> +{
> +  return _mm512_mask_mov_epi16 (c, (__mmask32) (_mm512_cmpeq_pd_mask (a, b) 
> >> 1), d);
> +}
> +
> +__m512i
> +foo2 (__m512d a, __m512d b, __m512i c, __m512i d)
> +{
> +  return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmpeq_pd_mask (a, b) 
> >> 1), d);
> +}
> +
> +__m512i
> +foo3 (__m512 a, __m512 b, __m512i c, __m512i d)
> +{
> +  return _mm512_mask_mov_epi16 (c, (__mmask32) (_mm512_cmpeq_ps_mask (a, b) 
> >> 1), d);
> +}
> +
> +__m512i
> +foo4 (__m512 a, __m512 b, __m512i c, __m512i d)
> +{
> +  return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmpeq_ps_mask (a, b) 
> >> 1), d);
> +}
> +
> +__m512i
> +foo5 (__m512i a, __m512i b, __m512i c, __m512i d)
> +{
> +  return _mm512_mask_mov_epi8 (c, (__mmask64) (_mm512_cmp_epi16_mask (a, b, 
> 5) >> 1), d);
> +}
> --
> 2.31.1
>


Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-28 Thread Alexandre Oliva
On Apr 23, 2024, "Kewen.Lin"  wrote:

> This patch seemed to miss to CC gcc-patches list. :)

Oops, sorry, thanks for catching that.

Here it is.  FTR, you've already responded suggesting an apparent
preference for addressing PR105359, but since I meant to contribute it,
I'm reposting it to gcc-patches, now with a reference to the PR.


ppc: testsuite: pr79004 needs -mlong-double-128

Some of the asm opcodes expected by pr79004 depend on
-mlong-double-128 to be output.  E.g., without this flag, the
conditions of patterns @extenddf2 and extendsf2 do not
hold, and so GCC resorts to libcalls instead of even trying
rs6000_expand_float128_convert.

Perhaps the conditions are too strict, and they could enable the use
of conversion insns involving __ieee128/_Float128 even with 64-bit
long doubles.  Alas, for now, we need this flag for the test to pass
on target variants that use 64-bit long doubles.
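
For illustration only, a target-independent way to tell which case a given compiler configuration falls into is to compare the significand widths from <float.h>.  The function name below is invented for this sketch, and this is not a description of how the testsuite's effective-target checks are implemented.

```c
#include <float.h>
#include <stdbool.h>

/* Hypothetical helper: true when long double carries no more precision
   than double, i.e. the configuration uses 64-bit long doubles.  */
bool
long_double_is_64bit (void)
{
  return LDBL_MANT_DIG == DBL_MANT_DIG;
}
```

On a powerpc variant that defaults to 64-bit long doubles this would return true unless -mlong-double-128 is passed, which is the situation the flag added to the test addresses.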


for  gcc/testsuite/ChangeLog

* gcc.target/powerpc/pr79004.c: Add -mlong-double-128.
---
 gcc/testsuite/gcc.target/powerpc/pr79004.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.target/powerpc/pr79004.c b/gcc/testsuite/gcc.target/powerpc/pr79004.c
index e411702dc98a9..061a0e83fe2ad 100644
--- a/gcc/testsuite/gcc.target/powerpc/pr79004.c
+++ b/gcc/testsuite/gcc.target/powerpc/pr79004.c
@@ -1,6 +1,6 @@
 /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
 /* { dg-require-effective-target powerpc_p9vector_ok } */
-/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128" } */
+/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128 -mlong-double-128" } */
 /* { dg-prune-output ".-mfloat128. option may not be fully supported" } */
 
 #include 




[PATCH] make -freg-struct-return visibly a negative alias of -fpcc-struct-return

2024-04-28 Thread Alexandre Oliva


The fact that both options accept negative forms suggests that maybe
they aren't negative forms of each other.  They are, but that isn't
clear even by examining common.opt.  Use NegativeAlias to make it
abundantly clear.

The 'Optimization' keyword next to freg-struct-return was the only
thing that caused flag_pcc_struct_return to be a per-function flag,
and ipa-inline relied on that.  After making it an alias, the
Optimization keyword was no longer operational.  I'm not sure it was
sensible or desirable for flag_pcc_struct_return to be a per-function
setting, but this patch does not intend to change behavior.
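
For readers less familiar with these options, a minimal sketch of the kind of code they affect follows.  The struct and function names are made up for illustration; whether the value comes back in registers or through memory depends on the target ABI and on which of the two options is in effect.

```c
/* A small aggregate.  With -fpcc-struct-return a value of this type is
   returned in memory; with -freg-struct-return it is returned in
   registers where the target allows it.  */
struct pair
{
  short lo;
  short hi;
};

/* Returns the aggregate by value, so the struct-return convention
   applies to every call of this function.  */
struct pair
make_pair (short lo, short hi)
{
  struct pair p = { lo, hi };
  return p;
}
```

Since the two conventions are not call-compatible, code built with different settings of these options cannot safely exchange such return values.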

Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu.  Ok to install?


for  gcc/ChangeLog

* common.opt (freg-struct-return): Make it explicitly
fpcc-struct-return's NegativeAlias.  Copy Optimization...
(fpcc-struct-return): ... here.
---
 gcc/common.opt |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/common.opt b/gcc/common.opt
index ad3488447752b..12d93c76a1e63 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2406,7 +2406,7 @@ Common RejectNegative Joined UInteger Optimization
 -fpack-struct= Set initial maximum structure member alignment.
 
 fpcc-struct-return
-Common Var(flag_pcc_struct_return,1) Init(DEFAULT_PCC_STRUCT_RETURN)
+Common Var(flag_pcc_struct_return,1) Init(DEFAULT_PCC_STRUCT_RETURN) Optimization
 Return small aggregates in memory, not registers.
 
 fpeel-loops
@@ -2596,7 +2596,7 @@ Common Var(flag_record_gcc_switches)
 Record gcc command line switches in the object file.
 
 freg-struct-return
-Common Var(flag_pcc_struct_return,0) Optimization
+Common NegativeAlias Alias(fpcc_struct_return) Optimization
 Return small aggregates in registers.
 
 fregmove



[PATCH v3] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-04-28 Thread Jie Mei
This patch adds the smin/smax RTL mode for the
min/max.fmt instructions.

Also, since the min/max.fmt instructions apply to the
IEEE 754-2008 "minNum" and "maxNum" operations, this
patch also provides the new "fmin3" and
"fmax3" modes.

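As a reminder of what the IEEE 754-2008 "minNum" and "maxNum" operations mean for quiet NaNs, here is a minimal reference sketch in portable C.  The function names are invented for illustration, and the sketch deliberately ignores signed zeros and signaling NaNs; it is not the code the patterns expand to.

```c
#include <math.h>   /* only for the NAN macro */

/* minNum: if exactly one operand is a quiet NaN, return the other.  */
double
min_num (double a, double b)
{
  if (a != a)   /* a is NaN */
    return b;
  if (b != b)   /* b is NaN */
    return a;
  return a < b ? a : b;
}

/* maxNum: same NaN rule, larger operand otherwise.  */
double
max_num (double a, double b)
{
  if (a != a)
    return b;
  if (b != b)
    return a;
  return a > b ? a : b;
}
```

For example, min_num (NAN, 1.0) yields 1.0, whereas a plain "a < b ? a : b" would yield 1.0 or NaN depending on operand order.
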
gcc/ChangeLog:

* config/mips/i6400.md (i6400_fpu_minmax): New
define_insn_reservation.
* config/mips/mips.h (ISA_HAS_FMIN_FMAX): Define new macro.
* config/mips/mips.md (UNSPEC_FMIN): New unspec.
(UNSPEC_FMAX): Same as above.
(type): Add fminmax.
(smin3): Generates MIN.fmt instructions.
(smax3): Generates MAX.fmt instructions.
(fmin3): Generates MIN.fmt instructions.
(fmax3): Generates MAX.fmt instructions.
* config/mips/p6600.md (p6600_fpu_fabs): Include fminmax
type.

gcc/testsuite/ChangeLog:

* gcc.target/mips/mips-minmax1.c: New test for MIPS R6.
* gcc.target/mips/mips-minmax2.c: Same as above.
---
 gcc/config/mips/i6400.md |  6 +++
 gcc/config/mips/mips.h   |  2 +
 gcc/config/mips/mips.md  | 50 +++-
 gcc/config/mips/p6600.md |  4 +-
 gcc/testsuite/gcc.target/mips/mips-minmax1.c | 40 
 gcc/testsuite/gcc.target/mips/mips-minmax2.c | 36 ++
 6 files changed, 134 insertions(+), 4 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax1.c
 create mode 100644 gcc/testsuite/gcc.target/mips/mips-minmax2.c

diff --git a/gcc/config/mips/i6400.md b/gcc/config/mips/i6400.md
index 9f216fe0210..d6f691ee217 100644
--- a/gcc/config/mips/i6400.md
+++ b/gcc/config/mips/i6400.md
@@ -219,6 +219,12 @@
(eq_attr "type" "fabs,fneg,fmove"))
   "i6400_fpu_short, i6400_fpu_apu")
 
+;; min, max
+(define_insn_reservation "i6400_fpu_minmax" 2
+  (and (eq_attr "cpu" "i6400")
+   (eq_attr "type" "fminmax"))
+  "i6400_fpu_short+i6400_fpu_logic")
+
 ;; fadd, fsub, fcvt
 (define_insn_reservation "i6400_fpu_fadd" 4
   (and (eq_attr "cpu" "i6400")
diff --git a/gcc/config/mips/mips.h b/gcc/config/mips/mips.h
index 7145d23c650..5ce984ac99b 100644
--- a/gcc/config/mips/mips.h
+++ b/gcc/config/mips/mips.h
@@ -1259,6 +1259,8 @@ struct mips_cpu_info {
 #define ISA_HAS_9BIT_DISPLACEMENT  (mips_isa_rev >= 6  \
 || ISA_HAS_MIPS16E2)
 
+#define ISA_HAS_FMIN_FMAX  (mips_isa_rev >= 6)
+
 /* ISA has data indexed prefetch instructions.  This controls use of
'prefx', along with TARGET_HARD_FLOAT and TARGET_DOUBLE_FLOAT.
(prefx is a cop1x instruction, so can only be used if FP is
diff --git a/gcc/config/mips/mips.md b/gcc/config/mips/mips.md
index b0fb5850a9e..26f758c90dd 100644
--- a/gcc/config/mips/mips.md
+++ b/gcc/config/mips/mips.md
@@ -97,6 +97,10 @@
   UNSPEC_GET_FCSR
   UNSPEC_SET_FCSR
 
+  ;; Floating-point unspecs.
+  UNSPEC_FMIN
+  UNSPEC_FMAX
+
   ;; HI/LO moves.
   UNSPEC_MFHI
   UNSPEC_MTHI
@@ -370,6 +374,7 @@
 ;; frsqrt   floating point reciprocal square root
 ;; frsqrt1  floating point reciprocal square root step1
 ;; frsqrt2  floating point reciprocal square root step2
+;; fminmax  floating point min/max
 ;; dspmac   DSP MAC instructions not saturating the accumulator
 ;; dspmacsatDSP MAC instructions that saturate the accumulator
 ;; accext   DSP accumulator extract instructions
@@ -387,8 +392,8 @@
prefetch,prefetchx,condmove,mtc,mfc,mthi,mtlo,mfhi,mflo,const,arith,logical,
shift,slt,signext,clz,pop,trap,imul,imul3,imul3nc,imadd,idiv,idiv3,move,
fmove,fadd,fmul,fmadd,fdiv,frdiv,frdiv1,frdiv2,fabs,fneg,fcmp,fcvt,fsqrt,
-   frsqrt,frsqrt1,frsqrt2,dspmac,dspmacsat,accext,accmod,dspalu,dspalusat,
-   multi,atomic,syncloop,nop,ghost,multimem,
+   frsqrt,frsqrt1,frsqrt2,fminmax,dspmac,dspmacsat,accext,accmod,dspalu,
+   dspalusat,multi,atomic,syncloop,nop,ghost,multimem,
simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
@@ -7971,6 +7976,47 @@
   [(set_attr "move_type" "load")
(set_attr "insn_count" "2")])
 
+;;
+;;  Float point MIN/MAX
+;;
+
+(define_insn "smin3"
+  [(set (match_operand:SCALARF 0 "register_operand" "=f")
+   (smin:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
+ (match_operand:SCALARF 2 "register_operand" "f")))]
+  "ISA_HAS_FMIN_FMAX"
+  "min.\t%0,%1,%2"
+  [(set_attr "type" "fminmax")
+   (set_attr "mode" "")])
+
+(define_insn "smax3"
+  [(set (match_operand:SCALARF 0 "register_operand" "=f")
+   (smax:SCALARF (match_operand:SCALARF 1 "register_operand" "f")
+ (match_operand:SCALARF 2 "register_operand" "f")))]
+  "ISA_HAS_FMIN_FMAX"
+  "max.\t%0,%1,%2"
+  [(set_attr "type" "fminmax")
+  (set_attr "mode" "")])
+
+(define_insn "fmin3"
+  [(set (match_operand:SCALARF 0 "register_operand"

[PATCH] PR tree-opt/113673: Avoid load merging from potentially trapping additions.

2024-04-28 Thread Roger Sayle

This patch fixes PR tree-optimization/113673, a P2 ice-on-valid regression
caused by load merging of (ptr[0]<<8)+ptr[1] when -ftrapv has been
specified.  When the operator is | or ^ this is safe, but for addition
of signed integer types, a trap may be generated/required, so merging this
idiom into a single non-trapping instruction is inappropriate, confusing
the compiler by transforming a basic block with an exception edge into one
without.  One fix is to be more selective for PLUS_EXPR than for
BIT_IOR_EXPR or BIT_XOR_EXPR in gimple-ssa-store-merging.cc's
find_bswap_or_nop_1 function.

An alternate solution might be to notice that in this idiom the addition
can't overflow, but that this detail wasn't apparent when exception edges
were added to the CFG.  In which case, it's safe to remove (or mark for
removal) the problematic exceptional edge.  Unfortunately updating the
CFG is a part of the compiler that I'm less familiar with.
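
To make the idiom concrete, here is a minimal sketch, assuming 8-bit unsigned char and an int of at least 32 bits.  The function names are illustrative only and do not come from the PR or the patch.

```c
/* The idiom written with addition.  The two terms occupy disjoint
   bits, so the sum is at most 0xffff and cannot overflow.  */
int
load16_add (const unsigned char *p)
{
  return (p[0] << 8) + p[1];
}

/* The same load written with bitwise or, which can never trap.  */
int
load16_or (const unsigned char *p)
{
  return (p[0] << 8) | p[1];
}
```

Both functions compute the same value for every input; the difference is only that, with -ftrapv, the addition is a potentially trapping statement as far as the CFG is concerned.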

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2024-04-28  Roger Sayle  

gcc/ChangeLog
PR tree-optimization/113673
* gimple-ssa-store-merging.cc (find_bswap_or_nop_1) :
Don't perform load merging if a signed addition may trap.

gcc/testsuite/ChangeLog
PR tree-optimization/113673
* g++.dg/pr113673.C: New test case.


Thanks in advance,
Roger
--

diff --git a/gcc/gimple-ssa-store-merging.cc b/gcc/gimple-ssa-store-merging.cc
index cb0cb5f..41a1066 100644
--- a/gcc/gimple-ssa-store-merging.cc
+++ b/gcc/gimple-ssa-store-merging.cc
@@ -776,9 +776,16 @@ find_bswap_or_nop_1 (gimple *stmt, struct symbolic_number *n, int limit)
 
   switch (code)
{
+   case PLUS_EXPR:
+ /* Don't perform load merging if this addition can trap.  */
+ if (cfun->can_throw_non_call_exceptions
+ && INTEGRAL_TYPE_P (TREE_TYPE (rhs1))
+ && TYPE_OVERFLOW_TRAPS (TREE_TYPE (rhs1)))
+   return NULL;
+ /* Fallthru.  */
+
case BIT_IOR_EXPR:
case BIT_XOR_EXPR:
-   case PLUS_EXPR:
  source_stmt1 = find_bswap_or_nop_1 (rhs1_stmt, &n1, limit - 1);
 
  if (!source_stmt1)
diff --git a/gcc/testsuite/g++.dg/pr113673.C b/gcc/testsuite/g++.dg/pr113673.C
new file mode 100644
index 000..1148977
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr113673.C
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-Os -fnon-call-exceptions -ftrapv" } */
+
+struct s { ~s(); };
+void
+h (unsigned char *data, int c)
+{
+  s a1;
+  while (c)
+{
+  int m = *data++ << 8;
+  m += *data++;
+}
+}


Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-28 Thread Kewen.Lin
Hi,

on 2024/4/28 16:14, Alexandre Oliva wrote:
> On Apr 24, 2024, "Kewen.Lin"  wrote:
> 
>> For !has_arch_pwr7 case, it still adopts peeling but as the comment (one 
>> line above)
>> shows the original intention of this case is to expect not profitable for 
>> peeling
>> so it's not expected to be handled here, can we just tweak the loop bound 
>> instead,
>> such as:
> 
>> -#define N 14
>> +#define N 13
>>  #define OFF 4 
> 
>> ?, it can make this loop not profitable to be vectorized for !vect_no_align 
>> with
>> peeling (both pwr7 and pwr6) and keep consistent.
> 
> Like this?  I didn't feel I could claim authorship of this one-liner
> just because I turned it into a patch and tested it, so I took the
> liberty of turning your own words above into the commit message.  So

Feel free to do so!

> far, tested on ppc64le-linux-gnu (ppc9).  Testing with vxworks targets
> now.  Would you like to tweak the commit message to your liking?

OK, tweaked as below.

> Otherwise, is this ok to install?
> 
> Thanks,
> 
> 
> adjust iteration count for ppc costmodel 76b

Nit: Maybe add a prefix "testsuite: ".

> 
> From: Kewen Lin 

Thanks, you can just drop this.  :)

> 
> The original intention of this case is to expect not profitable for
> peeling.  Tweak the loop bound to make this loop not profitable to be
> vectorized for !vect_no_align with peeling (both pwr7 and pwr6) and
> keep consistent.

On hardware that doesn't support unaligned vector memory access,
test case costmodel-vect-76b.c expects the cost model to decide that
peeling is not profitable, judging by the commit history, the test
case comments and the way the result is checked.

The existing loop bound of 14 works well for Power7, but not for
targets on which the cost of vec_perm differs from Power7, such as
Power6, where it is 3 vs. 1.  That difference changes the minimum
iteration count for profitability (10 vs. 12) and causes the failure.
To keep the original test point, this patch tweaks the loop bound so
that vectorization with peeling is not profitable for !vect_no_align.
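
A back-of-the-envelope check of these numbers, as a sketch under two assumptions that are not shown in this thread: that the tested loop runs from OFF up to N, and that a loop counts as profitable once its iteration count reaches the cost model's minimum.  The function names are hypothetical.

```c
/* Assumption: the loop runs i from OFF to N - 1, so it executes
   N - OFF iterations.  */
int
loop_iterations (int n, int off)
{
  return n - off;
}

/* Assumption: vectorization is considered profitable when the
   iteration count reaches the minimum computed by the cost model.  */
int
profitable_p (int n, int off, int min_iters)
{
  return loop_iterations (n, off) >= min_iters;
}
```

With N 14 and OFF 4 the loop has 10 iterations, which reaches a minimum of 10 but not a minimum of 12; with N 13 it has 9 iterations and stays below both, which matches the intent of the tweak.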

OK for trunk (assuming the testings run well on p6/p7 too), thanks!

BR,
Kewen

> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c (N): Tweak.
> ---
>  .../gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c 
> b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> index cbbfbb24658f8..e48b0ab759e75 100644
> --- a/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> +++ b/gcc/testsuite/gcc.dg/vect/costmodel/ppc/costmodel-vect-76b.c
> @@ -6,7 +6,7 @@
>  
>  /* On Power7 without misalign vector support, this case is to check it's not
> profitable to perform vectorization by peeling to align the store.  */
> -#define N 14
> +#define N 13
>  #define OFF 4
>  
>  /* Check handling of accesses for which the "initial condition" -
> 
> 



Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-28 Thread Kewen.Lin
Hi,

on 2024/4/28 16:20, Alexandre Oliva wrote:
> On Apr 23, 2024, "Kewen.Lin"  wrote:
> 
>> This patch seemed to miss to CC gcc-patches list. :)
> 
> Oops, sorry, thanks for catching that.
> 
> Here it is.  FTR, you've already responded suggesting an apparent
> preference for addressing PR105359, but since I meant to contribute it,
> I'm reposting it to gcc-patches, now with a reference to the PR.

OK, from this perspective, IMHO it seems clearer to adopt xfail
with effective target long_double_64bit?

BR,
Kewen

> 
> 
> ppc: testsuite: pr79004 needs -mlong-double-128
> 
> Some of the asm opcodes expected by pr79004 depend on
> -mlong-double-128 to be output.  E.g., without this flag, the
> conditions of patterns @extenddf2 and extendsf2 do not
> hold, and so GCC resorts to libcalls instead of even trying
> rs6000_expand_float128_convert.
> 
> Perhaps the conditions are too strict, and they could enable the use
> of conversion insns involving __ieee128/_Float128 even with 64-bit
> long doubles.  Alas, for now, we need this flag for the test to pass
> on target variants that use 64-bit long doubles.
> 
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/powerpc/pr79004.c: Add -mlong-double-128.
> ---
>  gcc/testsuite/gcc.target/powerpc/pr79004.c |2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr79004.c 
> b/gcc/testsuite/gcc.target/powerpc/pr79004.c
> index e411702dc98a9..061a0e83fe2ad 100644
> --- a/gcc/testsuite/gcc.target/powerpc/pr79004.c
> +++ b/gcc/testsuite/gcc.target/powerpc/pr79004.c
> @@ -1,6 +1,6 @@
>  /* { dg-do compile { target { powerpc*-*-* && lp64 } } } */
>  /* { dg-require-effective-target powerpc_p9vector_ok } */
> -/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128" } */
> +/* { dg-options "-mdejagnu-cpu=power9 -O2 -mfloat128 -mlong-double-128" } */
>  /* { dg-prune-output ".-mfloat128. option may not be fully supported" } */
>  
>  #include 
> 
> 





[pushed] doc: Update David Binderman's entry in contrib.texi

2024-04-28 Thread Gerald Pfeifer
gcc/ChangeLog:

* doc/contrib.texi: Update David Binderman's entry.

Pushed.
Gerald

---
 gcc/doc/contrib.texi | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/doc/contrib.texi b/gcc/doc/contrib.texi
index 2a15fd05883..32e89d6df25 100644
--- a/gcc/doc/contrib.texi
+++ b/gcc/doc/contrib.texi
@@ -64,8 +64,8 @@ improved alias analysis, plus migrating GCC to Bugzilla.
 Geoff Berry for his Java object serialization work and various patches.
 
 @item
-David Binderman tests weekly snapshots of GCC trunk against Fedora Rawhide
-for several architectures.
+David Binderman for testing GCC trunk against Fedora Rawhide
+and csmith.
 
 @item
 Laurynas Biveinis for memory management work and DJGPP port fixes.
-- 
2.44.0


RE: [PATCH v2] Internal-fn: Introduce new internal function SAT_ADD

2024-04-28 Thread Li, Pan2
Kindly ping for SAT_ADD.

Pan

-Original Message-
From: Li, Pan2  
Sent: Sunday, April 7, 2024 3:03 PM
To: gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; Wang, Yanzhang 
; tamar.christ...@arm.com; richard.guent...@gmail.com; 
Liu, Hongtao ; Li, Pan2 
Subject: [PATCH v2] Internal-fn: Introduce new internal function SAT_ADD

From: Pan Li 

Update in v2:
* Fix one failure for x86 bootstrap.

Original log:

This patch adds the middle-end representation for saturating add,
i.e., it sets the result of the add to the maximum value on overflow.
It matches a pattern similar to the one below.

SAT_ADD (x, y) => (x + y) | (-(TYPE)((TYPE)(x + y) < x))

Taking uint8_t as an example, we have:

* SAT_ADD (1, 254)   => 255.
* SAT_ADD (1, 255)   => 255.
* SAT_ADD (2, 255)   => 255.
* SAT_ADD (255, 255) => 255.
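
As an illustration (not part of the patch), the four uint8_t cases above
can be reproduced with a small C sketch of the same branchless pattern.
The helper name sat_add_u8 is hypothetical; it only spells out
(x + y) | (-(TYPE)((TYPE)(x + y) < x)) for TYPE = uint8_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: branchless unsigned saturating add for uint8_t,
   written after the pattern (x + y) | (-(TYPE)((TYPE)(x + y) < x)).  */
static uint8_t
sat_add_u8 (uint8_t x, uint8_t y)
{
  /* The addition is done in int and then wraps modulo 256 on the cast.  */
  uint8_t sum = (uint8_t) (x + y);
  /* 1 if the 8-bit sum wrapped around, 0 otherwise.  */
  uint8_t overflow = (uint8_t) (sum < x);
  /* -overflow is 0 or -1; truncated to uint8_t it is 0x00 or 0xff,
     so OR-ing it in forces the result to 255 on overflow.  */
  return (uint8_t) (sum | (uint8_t) -overflow);
}
```

When no overflow happens the mask is zero and the plain sum is returned,
which is why the pattern needs no branch.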

The patch also implements SAT_ADD in the riscv backend as
a sample for both scalar and vector.  Given the example below:

uint64_t sat_add_u64 (uint64_t x, uint64_t y)
{
  return (x + y) | (- (uint64_t)((uint64_t)(x + y) < x));
}

Before this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  long unsigned int _1;
  _Bool _2;
  long unsigned int _3;
  long unsigned int _4;
  uint64_t _7;
  long unsigned int _10;
  __complex__ long unsigned int _11;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _11 = .ADD_OVERFLOW (x_5(D), y_6(D));
  _1 = REALPART_EXPR <_11>;
  _10 = IMAGPART_EXPR <_11>;
  _2 = _10 != 0;
  _3 = (long unsigned int) _2;
  _4 = -_3;
  _7 = _1 | _4;
  return _7;
;;succ:   EXIT

}

After this patch:
uint64_t sat_add_uint64_t (uint64_t x, uint64_t y)
{
  uint64_t _7;

;;   basic block 2, loop depth 0
;;pred:   ENTRY
  _7 = .SAT_ADD (x_5(D), y_6(D)); [tail call]
  return _7;
;;succ:   EXIT
}

For vectorization, we leverage the existing vect pattern recognition to
find the pattern, similarly to the scalar case, and let the vectorizer
perform the rest for standard name usadd3 in vector mode.
The riscv vector backend has the insn "Vector Single-Width Saturating
Add and Subtract", which can be leveraged when expanding usadd3
in vector mode.  For example:

void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  unsigned i;

  for (i = 0; i < n; i++)
out[i] = (x[i] + y[i]) | (- (uint64_t)((uint64_t)(x[i] + y[i]) < x[i]));
}

Before this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _80 = .SELECT_VL (ivtmp_78, POLY_INT_CST [2, 2]);
  ivtmp_58 = _80 * 8;
  vect__4.7_61 = .MASK_LEN_LOAD (vectp_x.5_59, 64B, { -1, ... }, _80, 0);
  vect__6.10_65 = .MASK_LEN_LOAD (vectp_y.8_63, 64B, { -1, ... }, _80, 0);
  vect__7.11_66 = vect__4.7_61 + vect__6.10_65;
  mask__8.12_67 = vect__4.7_61 > vect__7.11_66;
  vect__12.15_72 = .VCOND_MASK (mask__8.12_67, { 18446744073709551615, ... }, 
vect__7.11_66);
  .MASK_LEN_STORE (vectp_out.16_74, 64B, { -1, ... }, _80, 0, vect__12.15_72);
  vectp_x.5_60 = vectp_x.5_59 + ivtmp_58;
  vectp_y.8_64 = vectp_y.8_63 + ivtmp_58;
  vectp_out.16_75 = vectp_out.16_74 + ivtmp_58;
  ivtmp_79 = ivtmp_78 - _80;
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v0,0(a1)
  vle64.v v1,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vadd.vv v1,v0,v1
  vmsgtu.vv   v0,v0,v1
  vmerge.vim  v1,v1,-1,v0
  vse64.v v1,0(a0)
  ...

After this patch:
void vec_sat_add_u64 (uint64_t *out, uint64_t *x, uint64_t *y, unsigned n)
{
  ...
  _62 = .SELECT_VL (ivtmp_60, POLY_INT_CST [2, 2]);
  ivtmp_46 = _62 * 8;
  vect__4.7_49 = .MASK_LEN_LOAD (vectp_x.5_47, 64B, { -1, ... }, _62, 0);
  vect__6.10_53 = .MASK_LEN_LOAD (vectp_y.8_51, 64B, { -1, ... }, _62, 0);
  vect__12.11_54 = .SAT_ADD (vect__4.7_49, vect__6.10_53);
  .MASK_LEN_STORE (vectp_out.12_56, 64B, { -1, ... }, _62, 0, vect__12.11_54);
  ...
}

vec_sat_add_u64:
  ...
  vsetvli a5,a3,e64,m1,ta,ma
  vle64.v v1,0(a1)
  vle64.v v2,0(a2)
  slli a4,a5,3
  sub a3,a3,a5
  add a1,a1,a4
  add a2,a2,a4
  vsaddu.vv   v1,v1,v2
  vse64.v v1,0(a0)
  ...

To limit the patch size for review, only the unsigned version of
usadd3 is involved here.  The signed version will be covered
in follow-up patch(es).

The test suites below passed for this patch.
* The riscv full regression tests.
* The aarch64 full regression tests.
* The x86 bootstrap tests.
* The x86 full regression tests.

PR target/51492
PR target/112600

gcc/ChangeLog:

* config/riscv/autovec.md (usadd3): New pattern expand
for unsigned SAT_ADD vector.
* config/riscv/riscv-protos.h (riscv_expand_usadd): New func
decl to expand usadd3 pattern.
(expand_vec_usadd): Ditto but for vector.
* config/riscv/riscv-v.cc (emit_vec_saddu): New func impl to
emit the vsadd insn.
(expand_vec_usadd): New func impl to expand usadd3 for
vector.
* config/riscv/riscv.cc (riscv_expand_usadd): New func impl
to expand usadd

RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in get_stored_val

2024-04-28 Thread Li, Pan2
Kindly ping^^ for this ice fix.

Pan

-Original Message-
From: Li, Pan2  
Sent: Thursday, April 18, 2024 9:46 AM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Kindly ping^ for this ice fix.

Pan

-Original Message-
From: Li, Pan2 
Sent: Saturday, April 6, 2024 8:02 PM
To: Li, Pan2 ; Jeff Law ; Robin Dapp 
; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Kindly ping for this ice.

Pan

-Original Message-
From: Li, Pan2  
Sent: Saturday, March 23, 2024 1:45 PM
To: Jeff Law ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: RE: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val

Thanks Jeff for comments.

> As Richi noted using validate_subreg here isn't great.  Does it work to 
> factor out this code from extract_low_bits
>
>>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>>   || !int_mode_for_mode (mode).exists (&int_mode))
>> return NULL_RTX;
>> 
>>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
>> return NULL_RTX;
>>   if (!targetm.modes_tieable_p (int_mode, mode))
>> return NULL_RTX;

> And use that in the condition (and in extract_low_bits rather than 
> duplicating the code)?

It solves the ICE, but it forbids all vector modes from going through
gen_lowpart.
Actually, the ICE only triggers when the vector mode size is less than the
natural register size.
Thus, how about just adding one more condition before going to gen_lowpart,
as below?

Feel free to correct me if I have misunderstood anything. 😉

diff --git a/gcc/dse.cc b/gcc/dse.cc
index edc7a1dfecf..258d2ccc299 100644
--- a/gcc/dse.cc
+++ b/gcc/dse.cc
@@ -1946,7 +1946,9 @@ get_stored_val (store_info *store_info, machine_mode 
read_mode,
 copy_rtx (store_info->const_rhs));
   else if (VECTOR_MODE_P (read_mode) && VECTOR_MODE_P (store_mode)
 && known_le (GET_MODE_BITSIZE (read_mode), GET_MODE_BITSIZE (store_mode))
-&& targetm.modes_tieable_p (read_mode, store_mode))
+&& targetm.modes_tieable_p (read_mode, store_mode)
+/* It's invalid in validate_subreg if read_mode size is < reg natural.  */
+&& known_ge (GET_MODE_SIZE (read_mode), REGMODE_NATURAL_SIZE (read_mode)))
 read_reg = gen_lowpart (read_mode, copy_rtx (store_info->rhs));
   else
 read_reg = extract_low_bits (read_mode, store_mode,

Pan

-Original Message-
From: Jeff Law  
Sent: Saturday, March 23, 2024 2:54 AM
To: Li, Pan2 ; Robin Dapp ; 
gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; richard.guent...@gmail.com; 
Wang, Yanzhang ; Liu, Hongtao 
Subject: Re: [PATCH v2] DSE: Bugfix ICE after allow vector type in 
get_stored_val



On 3/4/24 11:22 PM, Li, Pan2 wrote:
> Thanks Jeff for comments.
> 
>> But in the case of a vector modes, we can usually reinterpret the
>> underlying bits in whatever mode we want and do any of the usual
>> operations on those bits.
> 
> Yes, I think that is why we can allow vector modes in get_stored_val, if my
> understanding is correct.
> The different modes are then returned via gen_lowpart.  Unfortunately,
> some modes (smaller than a vector, like V2SF or V2QI for vlen=128) are
> considered invalid by validate_subreg, and the resulting NULL_RTX leads
> to the final ICE.
That doesn't make a lot of sense to me.  Even for vlen=128 I would have 
expected that we can still use a subreg to access low bits.  After all 
we might have had a V16QI vector and done a reduction of some sort 
storing the result in the first element and we have to be able to 
extract that result and move it around.

I'm not real keen on a target workaround.  While extremely safe, I 
wouldn't be surprised if other ports could trigger the ICE and we'd end 
up patching up multiple targets for what is, IMHO, a more generic issue.

As Richi noted using validate_subreg here isn't great.  Does it work to 
factor out this code from extract_low_bits:


>   if (!int_mode_for_mode (src_mode).exists (&src_int_mode)
>   || !int_mode_for_mode (mode).exists (&int_mode))
> return NULL_RTX;
> 
>   if (!targetm.modes_tieable_p (src_int_mode, src_mode))
> return NULL_RTX;
>   if (!targetm.modes_tieable_p (int_mode, mode))
> return NULL_RTX;

And use that in the condition (and in extract_low_bits rather than 
duplicating the code)?

jeff

ps.  No need to apologize for the pings.  This completely fell off my radar.


Re: [PATCH 5/4] libbacktrace: improve getting debug information for loaded dlls

2024-04-28 Thread Ian Lance Taylor
On Thu, Apr 25, 2024 at 1:15 PM Björn Schäpers  wrote:
>
> > Attached is the combined version of the two patches, only implementing the
> > variant with the tlhelp32 API.
> >
> > Tested on x86 and x86_64 windows.
> >
> > Kind regards,
> > Björn.
>
> A friendly ping.

Thanks.  Committed as follows.

Which of your other patches are still relevant?  Thanks.

Ian
942a9cf2a958113d2ab46f5b015c36e569abedcf
diff --git a/libbacktrace/configure.ac b/libbacktrace/configure.ac
index 3e0075a2b79..59e9c415db8 100644
--- a/libbacktrace/configure.ac
+++ b/libbacktrace/configure.ac
@@ -380,6 +380,10 @@ if test "$have_loadquery" = "yes"; then
 fi
 
 AC_CHECK_HEADERS(windows.h)
+AC_CHECK_HEADERS(tlhelp32.h, [], [],
+[#ifdef HAVE_WINDOWS_H
+#  include 
+#endif])
 
 # Check for the fcntl function.
 if test -n "${with_target_subdir}"; then
diff --git a/libbacktrace/pecoff.c b/libbacktrace/pecoff.c
index 9e437d810c7..4f267841178 100644
--- a/libbacktrace/pecoff.c
+++ b/libbacktrace/pecoff.c
@@ -49,6 +49,18 @@ POSSIBILITY OF SUCH DAMAGE.  */
 #endif
 
 #include 
+
+#ifdef HAVE_TLHELP32_H
+#include 
+
+#ifdef UNICODE
+/* If UNICODE is defined, all the symbols are replaced by a macro to use the
+   wide variant. But we need the ansi variant, so undef the macros. */
+#undef MODULEENTRY32
+#undef Module32First
+#undef Module32Next
+#endif
+#endif
 #endif
 
 /* Coff file header.  */
@@ -592,7 +604,8 @@ coff_syminfo (struct backtrace_state *state, uintptr_t addr,
 static int
 coff_add (struct backtrace_state *state, int descriptor,
  backtrace_error_callback error_callback, void *data,
- fileline *fileline_fn, int *found_sym, int *found_dwarf)
+ fileline *fileline_fn, int *found_sym, int *found_dwarf,
+ uintptr_t module_handle ATTRIBUTE_UNUSED)
 {
   struct backtrace_view fhdr_view;
   off_t fhdr_off;
@@ -870,12 +883,7 @@ coff_add (struct backtrace_state *state, int descriptor,
 }
 
 #ifdef HAVE_WINDOWS_H
-  {
-uintptr_t module_handle;
-
-module_handle = (uintptr_t) GetModuleHandle (NULL);
-base_address = module_handle - image_base;
-  }
+  base_address = module_handle - image_base;
 #endif
 
   if (!backtrace_dwarf_add (state, base_address, &dwarf_sections,
@@ -917,12 +925,61 @@ backtrace_initialize (struct backtrace_state *state,
   int found_sym;
   int found_dwarf;
   fileline coff_fileline_fn;
+  uintptr_t module_handle = 0;
+#ifdef HAVE_TLHELP32_H
+  fileline module_fileline_fn;
+  int module_found_sym;
+  HANDLE snapshot;
+#endif
+
+#ifdef HAVE_WINDOWS_H
+  module_handle = (uintptr_t) GetModuleHandle (NULL);
+#endif
 
   ret = coff_add (state, descriptor, error_callback, data,
- &coff_fileline_fn, &found_sym, &found_dwarf);
+ &coff_fileline_fn, &found_sym, &found_dwarf, module_handle);
   if (!ret)
 return 0;
 
+#ifdef HAVE_TLHELP32_H
+  do
+{
+  snapshot = CreateToolhelp32Snapshot (TH32CS_SNAPMODULE, 0);
+}
+  while (snapshot == INVALID_HANDLE_VALUE
+&& GetLastError () == ERROR_BAD_LENGTH);
+
+  if (snapshot != INVALID_HANDLE_VALUE)
+{
+  MODULEENTRY32 entry;
+  BOOL ok;
+  entry.dwSize = sizeof (MODULEENTRY32);
+
+  for (ok = Module32First (snapshot, &entry); ok; ok = Module32Next 
(snapshot, &entry))
+   {
+ if (strcmp (filename, entry.szExePath) == 0)
+   continue;
+
+ module_handle = (uintptr_t) entry.hModule;
+ if (module_handle == 0)
+   continue;
+
+ descriptor = backtrace_open (entry.szExePath, error_callback, data,
+  NULL);
+ if (descriptor < 0)
+   continue;
+
+ coff_add (state, descriptor, error_callback, data,
+   &module_fileline_fn, &module_found_sym, &found_dwarf,
+   module_handle);
+ if (module_found_sym)
+   found_sym = 1;
+   }
+
+  CloseHandle (snapshot);
+}
+#endif
+
   if (!state->threaded)
 {
   if (found_sym)
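
For readers following the change above, here is a minimal sketch of the
address arithmetic behind `base_address = module_handle - image_base`.
It assumes that image_base is the preferred load address recorded in the
PE header and that the module handle is the address at which the module
was actually loaded.  The helper names rebase_offset and runtime_address
are hypothetical and do not exist in libbacktrace.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: offset between where a module was actually
   loaded (module_handle) and where it was linked to load (image_base).
   The subtraction may wrap when the module is loaded below its preferred
   base; unsigned arithmetic makes that well defined.  */
static uintptr_t
rebase_offset (uintptr_t module_handle, uintptr_t image_base)
{
  return module_handle - image_base;
}

/* Hypothetical helper: translate a link-time address from the debug
   information into the address it has in the running process.  */
static uintptr_t
runtime_address (uintptr_t link_time_address, uintptr_t base_address)
{
  return link_time_address + base_address;
}
```

Because each loaded DLL has its own load address, the offset has to be
computed per module, which is why the handle is passed to coff_add
instead of being taken from GetModuleHandle (NULL) inside it.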


[PATCH] expmed: TRUNCATE value1 if needed in store_bit_field_using_insv

2024-04-28 Thread YunQiang Su
PR target/113179.

In `store_bit_field_using_insv`, we just use SUBREG if value_mode is
at least as wide as op_mode, while on some ports, such as MIPS64, a
sign_extend is needed:
  If either GPR rs or GPR rt does not contain sign-extended 32-bit
  values (bits 63..31 equal), then the result of the operation is
  UNPREDICTABLE.

The problem happens for the code like:
  struct xx {
int a:4;
int b:24;
int c:3;
int d:1;
  };

  void xx (struct xx *a, long long b) {
a->d = b;
  }

In the above code, the hard register that contains `b` may not be
properly sign-extended.
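
To make the requirement concrete, here is a small sketch that models a
64-bit GPR as a uint64_t.  It is an illustration under stated
assumptions, not code from the patch: the helper names
reg_is_sign_extended_32 and reg_truncate_to_32 are hypothetical, and
the second one only mimics what a TRUNCATE to a 32-bit mode is expected
to leave in the register on MIPS64.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when bits 63..31 of REG are all equal,
   i.e. REG holds a sign-extended 32-bit value.  */
static int
reg_is_sign_extended_32 (uint64_t reg)
{
  uint64_t high = reg >> 31;    /* Bits 63..31, 33 bits in total.  */
  return high == 0 || high == UINT64_C (0x1FFFFFFFF);
}

/* Hypothetical helper: keep the low 32 bits of REG and replicate
   bit 31 into bits 63..32.  */
static uint64_t
reg_truncate_to_32 (uint64_t reg)
{
  uint64_t low = reg & UINT64_C (0xFFFFFFFF);
  if (low & UINT64_C (0x80000000))
    low |= UINT64_C (0xFFFFFFFF00000000);
  return low;
}
```

A `long long` argument such as `b` can hold any 64-bit pattern, so the
first predicate may fail for it; after the truncation step it always
holds.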

gcc/
PR target/113179
* expmed.cc (store_bit_field_using_insv): TRUNCATE value1 if
needed.

gcc/testsuite
PR target/113179
* gcc.target/mips/pr113179.c: New tests.
---
 gcc/expmed.cc| 12 +---
 gcc/testsuite/gcc.target/mips/pr113179.c | 18 ++
 2 files changed, 27 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/mips/pr113179.c

diff --git a/gcc/expmed.cc b/gcc/expmed.cc
index 4ec035e4843..6a582593da8 100644
--- a/gcc/expmed.cc
+++ b/gcc/expmed.cc
@@ -704,9 +704,15 @@ store_bit_field_using_insv (const extraction_insn *insv, 
rtx op0,
}
  else
{
- tmp = gen_lowpart_if_possible (op_mode, value1);
- if (! tmp)
-   tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
+ if (targetm.mode_rep_extended (op_mode, value_mode))
+   tmp = simplify_gen_unary (TRUNCATE, op_mode,
+ value1, value_mode);
+ else
+   {
+ tmp = gen_lowpart_if_possible (op_mode, value1);
+ if (! tmp)
+   tmp = gen_lowpart (op_mode, force_reg (value_mode, value1));
+   }
}
  value1 = tmp;
}
diff --git a/gcc/testsuite/gcc.target/mips/pr113179.c 
b/gcc/testsuite/gcc.target/mips/pr113179.c
new file mode 100644
index 000..f32c5a16765
--- /dev/null
+++ b/gcc/testsuite/gcc.target/mips/pr113179.c
@@ -0,0 +1,18 @@
+/* Check if the operand of INS is sign-extended on MIPS64.  */
+/* { dg-options "-mips64r2 -mabi=64" } */
+/* { dg-skip-if "code quality test" { *-*-* } { "-O0" } { "" } } */
+
+struct xx {
+int a:1;
+int b:24;
+int c:6;
+int d:1;
+};
+
+long long xx (struct xx *a, long long b) {
+a->d = b;
+return b+1;
+}
+
+/* { dg-final { scan-assembler "\tsll\t\\\$3,\\\$5,0" } } */
+/* { dg-final { scan-assembler "\tdaddiu\t\\\$2,\\\$5,1" } } */
-- 
2.39.2



[COMMITTED 03/16] Make some Value_Range's explicitly integer.

2024-04-28 Thread Aldy Hernandez
Fix some Value_Range's that we know ahead of time will be only
integers.  This avoids using the polymorphic Value_Range unnecessarily.

gcc/ChangeLog:

* gimple-ssa-warn-access.cc (check_nul_terminated_array): Make 
Value_Range an int_range.
(memmodel_to_uhwi): Same.
* tree-ssa-loop-niter.cc (refine_value_range_using_guard): Same.
(determine_value_range): Same.
(infer_loop_bounds_from_signedness): Same.
(scev_var_range_cant_overflow): Same.
---
 gcc/gimple-ssa-warn-access.cc |  4 ++--
 gcc/tree-ssa-loop-niter.cc| 12 ++--
 2 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/gimple-ssa-warn-access.cc b/gcc/gimple-ssa-warn-access.cc
index dedaae27b31..450c1caa765 100644
--- a/gcc/gimple-ssa-warn-access.cc
+++ b/gcc/gimple-ssa-warn-access.cc
@@ -330,7 +330,7 @@ check_nul_terminated_array (GimpleOrTree expr, tree src, 
tree bound)
   wide_int bndrng[2];
   if (bound)
 {
-  Value_Range r (TREE_TYPE (bound));
+  int_range<2> r (TREE_TYPE (bound));
 
   get_range_query (cfun)->range_of_expr (r, bound);
 
@@ -2816,7 +2816,7 @@ memmodel_to_uhwi (tree ord, gimple *stmt, unsigned 
HOST_WIDE_INT *cstval)
 {
   /* Use the range query to determine constant values in the absence
 of constant propagation (such as at -O0).  */
-  Value_Range rng (TREE_TYPE (ord));
+  int_range<2> rng (TREE_TYPE (ord));
   if (!get_range_query (cfun)->range_of_expr (rng, ord, stmt)
  || !rng.singleton_p (&ord))
return false;
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index c6d010f6d89..cbc9dbc5a1f 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -214,7 +214,7 @@ refine_value_range_using_guard (tree type, tree var,
   get_type_static_bounds (type, mint, maxt);
   mpz_init (minc1);
   mpz_init (maxc1);
-  Value_Range r (TREE_TYPE (varc1));
+  int_range<2> r (TREE_TYPE (varc1));
   /* Setup range information for varc1.  */
   if (integer_zerop (varc1))
 {
@@ -368,7 +368,7 @@ determine_value_range (class loop *loop, tree type, tree 
var, mpz_t off,
   gphi_iterator gsi;
 
   /* Either for VAR itself...  */
-  Value_Range var_range (TREE_TYPE (var));
+  int_range<2> var_range (TREE_TYPE (var));
   get_range_query (cfun)->range_of_expr (var_range, var);
   if (var_range.varying_p () || var_range.undefined_p ())
rtype = VR_VARYING;
@@ -382,7 +382,7 @@ determine_value_range (class loop *loop, tree type, tree 
var, mpz_t off,
 
   /* Or for PHI results in loop->header where VAR is used as
 PHI argument from the loop preheader edge.  */
-  Value_Range phi_range (TREE_TYPE (var));
+  int_range<2> phi_range (TREE_TYPE (var));
   for (gsi = gsi_start_phis (loop->header); !gsi_end_p (gsi); gsi_next 
(&gsi))
{
  gphi *phi = gsi.phi ();
@@ -408,7 +408,7 @@ determine_value_range (class loop *loop, tree type, tree 
var, mpz_t off,
 involved.  */
  if (wi::gt_p (minv, maxv, sgn))
{
- Value_Range vr (TREE_TYPE (var));
+ int_range<2> vr (TREE_TYPE (var));
  get_range_query (cfun)->range_of_expr (vr, var);
  if (vr.varying_p () || vr.undefined_p ())
rtype = VR_VARYING;
@@ -4367,7 +4367,7 @@ infer_loop_bounds_from_signedness (class loop *loop, 
gimple *stmt)
 
   low = lower_bound_in_type (type, type);
   high = upper_bound_in_type (type, type);
-  Value_Range r (TREE_TYPE (def));
+  int_range<2> r (TREE_TYPE (def));
   get_range_query (cfun)->range_of_expr (r, def);
   if (!r.varying_p () && !r.undefined_p ())
 {
@@ -5426,7 +5426,7 @@ scev_var_range_cant_overflow (tree var, tree step, class 
loop *loop)
   if (!def_bb || !dominated_by_p (CDI_DOMINATORS, loop->latch, def_bb))
 return false;
 
-  Value_Range r (TREE_TYPE (var));
+  int_range<2> r (TREE_TYPE (var));
   get_range_query (cfun)->range_of_expr (r, var);
   if (r.varying_p () || r.undefined_p ())
 return false;
-- 
2.44.0



[COMMITTED 06/16] Remove GTY support for vrange and derived classes.

2024-04-28 Thread Aldy Hernandez
Now that we have a vrange storage class to save ranges in long-term
memory, there is no need for GTY markers for any of the vrange
classes, since they should never live in GC.

gcc/ChangeLog:

* value-range-storage.h: Remove friends.
* value-range.cc (gt_ggc_mx): Remove.
(gt_pch_nx): Remove.
* value-range.h (class vrange): Remove GTY markers.
(class irange): Same.
(class int_range): Same.
(class frange): Same.
(gt_ggc_mx): Remove.
(gt_pch_nx): Remove.
---
 gcc/value-range-storage.h |  4 ---
 gcc/value-range.cc| 73 ---
 gcc/value-range.h | 46 +++-
 3 files changed, 4 insertions(+), 119 deletions(-)

diff --git a/gcc/value-range-storage.h b/gcc/value-range-storage.h
index d94c520aa73..5756de7e32d 100644
--- a/gcc/value-range-storage.h
+++ b/gcc/value-range-storage.h
@@ -75,10 +75,6 @@ private:
   static size_t size (const irange &r);
   const unsigned short *lengths_address () const;
   unsigned short *write_lengths_address ();
-  friend void gt_ggc_mx_irange_storage (void *);
-  friend void gt_pch_p_14irange_storage (void *, void *,
- gt_pointer_operator, void *);
-  friend void gt_pch_nx_irange_storage (void *);
 
   // The shared precision of each number.
   unsigned short m_precision;
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 926f7b707ea..b901c864a7b 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2165,79 +2165,6 @@ vrp_operand_equal_p (const_tree val1, const_tree val2)
   return true;
 }
 
-void
-gt_ggc_mx (irange *x)
-{
-  if (!x->undefined_p ())
-gt_ggc_mx (x->m_type);
-}
-
-void
-gt_pch_nx (irange *x)
-{
-  if (!x->undefined_p ())
-gt_pch_nx (x->m_type);
-}
-
-void
-gt_pch_nx (irange *x, gt_pointer_operator op, void *cookie)
-{
-  for (unsigned i = 0; i < x->m_num_ranges; ++i)
-{
-  op (&x->m_base[i * 2], NULL, cookie);
-  op (&x->m_base[i * 2 + 1], NULL, cookie);
-}
-}
-
-void
-gt_ggc_mx (frange *x)
-{
-  gt_ggc_mx (x->m_type);
-}
-
-void
-gt_pch_nx (frange *x)
-{
-  gt_pch_nx (x->m_type);
-}
-
-void
-gt_pch_nx (frange *x, gt_pointer_operator op, void *cookie)
-{
-  op (&x->m_type, NULL, cookie);
-}
-
-void
-gt_ggc_mx (vrange *x)
-{
-  if (is_a  (*x))
-return gt_ggc_mx ((irange *) x);
-  if (is_a  (*x))
-return gt_ggc_mx ((frange *) x);
-  gcc_unreachable ();
-}
-
-void
-gt_pch_nx (vrange *x)
-{
-  if (is_a  (*x))
-return gt_pch_nx ((irange *) x);
-  if (is_a  (*x))
-return gt_pch_nx ((frange *) x);
-  gcc_unreachable ();
-}
-
-void
-gt_pch_nx (vrange *x, gt_pointer_operator op, void *cookie)
-{
-  if (is_a  (*x))
-gt_pch_nx ((irange *) x, op, cookie);
-  else if (is_a  (*x))
-gt_pch_nx ((frange *) x, op, cookie);
-  else
-gcc_unreachable ();
-}
-
 #define DEFINE_INT_RANGE_INSTANCE(N)   \
   template int_range::int_range(tree_node *,
\
   const wide_int &,\
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 991ffeafcb8..2650ded6d10 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -72,7 +72,7 @@ enum value_range_discriminator
 // if (f.supports_type_p (type)) ...
 //}
 
-class GTY((user)) vrange
+class vrange
 {
   template  friend bool is_a (vrange &);
   friend class Value_Range;
@@ -279,7 +279,7 @@ irange_bitmask::intersect (const irange_bitmask &orig_src)
 
 // An integer range without any storage.
 
-class GTY((user)) irange : public vrange
+class irange : public vrange
 {
   friend value_range_kind get_legacy_range (const irange &, tree &, tree &);
   friend class irange_storage;
@@ -350,10 +350,6 @@ protected:
   // Hard limit on max ranges allowed.
   static const int HARD_MAX_RANGES = 255;
 private:
-  friend void gt_ggc_mx (irange *);
-  friend void gt_pch_nx (irange *);
-  friend void gt_pch_nx (irange *, gt_pointer_operator, void *);
-
   bool varying_compatible_p () const;
   bool intersect_bitmask (const irange &r);
   bool union_bitmask (const irange &r);
@@ -379,7 +375,7 @@ protected:
 // HARD_MAX_RANGES.  This new storage is freed upon destruction.
 
 template
-class GTY((user)) int_range : public irange
+class int_range : public irange
 {
 public:
   int_range ();
@@ -484,13 +480,10 @@ nan_state::neg_p () const
 // The representation is a type with a couple of endpoints, unioned
 // with the set of { -NAN, +Nan }.
 
-class GTY((user)) frange : public vrange
+class frange : public vrange
 {
   friend class frange_storage;
   friend class vrange_printer;
-  friend void gt_ggc_mx (frange *);
-  friend void gt_pch_nx (frange *);
-  friend void gt_pch_nx (frange *, gt_pointer_operator, void *);
 public:
   frange ();
   frange (const frange &);
@@ -991,37 +984,6 @@ range_includes_zero_p (const irange *vr)
   return vr->contains_p (zero);
 }
 
-extern void gt_ggc_mx (vrange *);
-extern v

[COMMITTED 05/16] Move bitmask routines to vrange base class.

2024-04-28 Thread Aldy Hernandez
Any range can theoretically have a bitmask of set bits.  This patch
moves the bitmask accessors to the base class.  This cleans up some
users in IPA*, and will provide a cleaner interface when prange is in
place.

gcc/ChangeLog:

* ipa-cp.cc (propagate_bits_across_jump_function): Access bitmask
through base class.
(ipcp_store_vr_results): Same.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Same.
(ipcp_get_parm_bits): Same.
(ipcp_update_vr): Same.
* range-op-mixed.h (update_known_bitmask): Change argument to vrange.
* range-op.cc (update_known_bitmask): Same.
* value-range.cc (vrange::update_bitmask):  New.
(irange::set_nonzero_bits): Move to vrange class.
(irange::get_nonzero_bits): Same.
* value-range.h (class vrange): Add update_bitmask, get_bitmask,
get_nonzero_bits, and set_nonzero_bits.
(class irange): Make bitmask methods virtual overrides.
(class Value_Range): Add get_bitmask and update_bitmask.
---
 gcc/ipa-cp.cc|  9 +++--
 gcc/ipa-prop.cc  | 10 --
 gcc/range-op-mixed.h |  2 +-
 gcc/range-op.cc  |  4 ++--
 gcc/value-range.cc   | 16 ++--
 gcc/value-range.h| 14 +-
 6 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index b7add455bd5..a688dced5c9 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -2485,8 +2485,7 @@ propagate_bits_across_jump_function (cgraph_edge *cs, int 
idx,
   jfunc->m_vr->get_vrange (vr);
   if (!vr.undefined_p () && !vr.varying_p ())
{
- irange &r = as_a  (vr);
- irange_bitmask bm = r.get_bitmask ();
+ irange_bitmask bm = vr.get_bitmask ();
  widest_int mask
= widest_int::from (bm.mask (), TYPE_SIGN (parm_type));
  widest_int value
@@ -6346,14 +6345,13 @@ ipcp_store_vr_results (void)
{
  Value_Range tmp = plats->m_value_range.m_vr;
  tree type = ipa_get_type (info, i);
- irange &r = as_a (tmp);
  irange_bitmask bm (wide_int::from (bits->get_value (),
 TYPE_PRECISION (type),
 TYPE_SIGN (type)),
 wide_int::from (bits->get_mask (),
 TYPE_PRECISION (type),
 TYPE_SIGN (type)));
- r.update_bitmask (bm);
+ tmp.update_bitmask (bm);
  ipa_vr vr (tmp);
  ts->m_vr->quick_push (vr);
}
@@ -6368,14 +6366,13 @@ ipcp_store_vr_results (void)
  tree type = ipa_get_type (info, i);
  Value_Range tmp;
  tmp.set_varying (type);
- irange &r = as_a (tmp);
  irange_bitmask bm (wide_int::from (bits->get_value (),
 TYPE_PRECISION (type),
 TYPE_SIGN (type)),
 wide_int::from (bits->get_mask (),
 TYPE_PRECISION (type),
 TYPE_SIGN (type)));
- r.update_bitmask (bm);
+ tmp.update_bitmask (bm);
  ipa_vr vr (tmp);
  ts->m_vr->quick_push (vr);
}
diff --git a/gcc/ipa-prop.cc b/gcc/ipa-prop.cc
index 374e998aa64..b57f9750431 100644
--- a/gcc/ipa-prop.cc
+++ b/gcc/ipa-prop.cc
@@ -2381,8 +2381,7 @@ ipa_compute_jump_functions_for_edge (struct 
ipa_func_body_info *fbi,
  irange_bitmask bm (value, mask);
  if (!addr_nonzero)
vr.set_varying (TREE_TYPE (arg));
- irange &r = as_a  (vr);
- r.update_bitmask (bm);
+ vr.update_bitmask (bm);
  ipa_set_jfunc_vr (jfunc, vr);
}
  else if (addr_nonzero)
@@ -5785,8 +5784,8 @@ ipcp_get_parm_bits (tree parm, tree *value, widest_int 
*mask)
   vr[i].get_vrange (tmp);
   if (tmp.undefined_p () || tmp.varying_p ())
 return false;
-  irange &r = as_a  (tmp);
-  irange_bitmask bm = r.get_bitmask ();
+  irange_bitmask bm;
+  bm = tmp.get_bitmask ();
   *mask = widest_int::from (bm.mask (), TYPE_SIGN (TREE_TYPE (parm)));
   *value = wide_int_to_tree (TREE_TYPE (parm), bm.value ());
   return true;
@@ -5857,8 +5856,7 @@ ipcp_update_vr (struct cgraph_node *node, 
ipcp_transformation *ts)
  if (POINTER_TYPE_P (TREE_TYPE (parm))
  && opt_for_fn (node->decl, flag_ipa_bit_cp))
{
- irange &r = as_a (tmp);
- irange_bitmask bm = r.get_bitmask ();
+ irange_bitmask bm = tmp.get_bitmask ();
  unsigned tem = bm.mask ().to_uhwi ();
  unsigned HOST_WIDE_I

[COMMITTED 04/16] Add tree versions of lower and upper bounds to vrange.

2024-04-28 Thread Aldy Hernandez
This patch adds vrange::lbound() and vrange::ubound() that return
trees.  These can be used in generic code that is type agnostic, and
avoids special casing for pointers and integers in places where we
handle both.  It also cleans up a wart in the Value_Range class.

gcc/ChangeLog:

* tree-ssa-loop-niter.cc (refine_value_range_using_guard): Convert
bound to wide_int.
* value-range.cc (Value_Range::lower_bound): Remove.
(Value_Range::upper_bound): Remove.
(unsupported_range::lbound): New.
(unsupported_range::ubound): New.
(frange::lbound): New.
(frange::ubound): New.
(irange::lbound): New.
(irange::ubound): New.
* value-range.h (class vrange): Add lbound() and ubound().
(class irange): Same.
(class frange): Same.
(class unsupported_range): Same.
(class Value_Range): Rename lower_bound and upper_bound to lbound
and ubound respectively.
---
 gcc/tree-ssa-loop-niter.cc |  4 +--
 gcc/value-range.cc | 56 --
 gcc/value-range.h  | 13 +++--
 3 files changed, 48 insertions(+), 25 deletions(-)

diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index cbc9dbc5a1f..adbc1936982 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -4067,7 +4067,7 @@ record_nonwrapping_iv (class loop *loop, tree base, tree step, gimple *stmt,
   Value_Range base_range (TREE_TYPE (orig_base));
   if (get_range_query (cfun)->range_of_expr (base_range, orig_base)
  && !base_range.undefined_p ())
-   max = base_range.upper_bound ();
+   max = wi::to_wide (base_range.ubound ());
   extreme = fold_convert (unsigned_type, low);
   if (TREE_CODE (orig_base) == SSA_NAME
  && TREE_CODE (high) == INTEGER_CST
@@ -4090,7 +4090,7 @@ record_nonwrapping_iv (class loop *loop, tree base, tree step, gimple *stmt,
   Value_Range base_range (TREE_TYPE (orig_base));
   if (get_range_query (cfun)->range_of_expr (base_range, orig_base)
  && !base_range.undefined_p ())
-   min = base_range.lower_bound ();
+   min = wi::to_wide (base_range.lbound ());
   extreme = fold_convert (unsigned_type, high);
   if (TREE_CODE (orig_base) == SSA_NAME
  && TREE_CODE (low) == INTEGER_CST
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 632d77305cc..ccac517d4c4 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -37,26 +37,6 @@ irange::accept (const vrange_visitor &v) const
   v.visit (*this);
 }
 
-// Convenience function only available for integers and pointers.
-
-wide_int
-Value_Range::lower_bound () const
-{
-  if (is_a <irange> (*m_vrange))
-return as_a <irange> (*m_vrange).lower_bound ();
-  gcc_unreachable ();
-}
-
-// Convenience function only available for integers and pointers.
-
-wide_int
-Value_Range::upper_bound () const
-{
-  if (is_a <irange> (*m_vrange))
-return as_a <irange> (*m_vrange).upper_bound ();
-  gcc_unreachable ();
-}
-
 void
 Value_Range::dump (FILE *out) const
 {
@@ -211,6 +191,18 @@ unsupported_range::operator= (const vrange &r)
   return *this;
 }
 
+tree
+unsupported_range::lbound () const
+{
+  return NULL;
+}
+
+tree
+unsupported_range::ubound () const
+{
+  return NULL;
+}
+
 // Assignment operator for generic ranges.  Copying incompatible types
 // is not allowed.
 
@@ -957,6 +949,18 @@ frange::set_nonnegative (tree type)
   set (type, dconst0, frange_val_max (type));
 }
 
+tree
+frange::lbound () const
+{
+  return build_real (type (), lower_bound ());
+}
+
+tree
+frange::ubound () const
+{
+  return build_real (type (), upper_bound ());
+}
+
 // Here we copy between any two irange's.
 
 irange &
@@ -2086,6 +2090,18 @@ irange::union_bitmask (const irange &r)
   return true;
 }
 
+tree
+irange::lbound () const
+{
+  return wide_int_to_tree (type (), lower_bound ());
+}
+
+tree
+irange::ubound () const
+{
+  return wide_int_to_tree (type (), upper_bound ());
+}
+
 void
 irange_bitmask::verify_mask () const
 {
diff --git a/gcc/value-range.h b/gcc/value-range.h
index b7c83982385..f216f1b82c1 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -96,6 +96,8 @@ public:
   virtual void set_nonnegative (tree type) = 0;
   virtual bool fits_p (const vrange &r) const = 0;
   virtual ~vrange () { }
+  virtual tree lbound () const = 0;
+  virtual tree ubound () const = 0;
 
   bool varying_p () const;
   bool undefined_p () const;
@@ -298,6 +300,8 @@ public:
   wide_int lower_bound (unsigned = 0) const;
   wide_int upper_bound (unsigned) const;
   wide_int upper_bound () const;
+  virtual tree lbound () const override;
+  virtual tree ubound () const override;
 
   // Predicates.
   virtual bool zero_p () const override;
@@ -419,6 +423,8 @@ public:
   void set_nonnegative (tree type) final override;
   bool fits_p (const vrange &) const final override;
   unsupported_range& operator= (const vrange &r);
+  tree lbound () const final override;
+  tree ubo

[COMMITTED 11/16] Move get_bitmask_from_range out of irange class.

2024-04-28 Thread Aldy Hernandez
prange will also have bitmasks, so it will need to use get_bitmask_from_range.

gcc/ChangeLog:

* value-range.cc (get_bitmask_from_range): Move out of irange class.
(irange::get_bitmask): Call function instead of internal method.
* value-range.h (class irange): Remove get_bitmask_from_range.
---
 gcc/value-range.cc | 52 +++---
 gcc/value-range.h  |  1 -
 2 files changed, 26 insertions(+), 27 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 44929b210aa..d9689bd469f 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -31,6 +31,30 @@ along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "gimple-range.h"
 
+// Return the bitmask inherent in a range.
+
+static irange_bitmask
+get_bitmask_from_range (tree type,
+   const wide_int &min, const wide_int &max)
+{
+  unsigned prec = TYPE_PRECISION (type);
+
+  // All the bits of a singleton are known.
+  if (min == max)
+{
+  wide_int mask = wi::zero (prec);
+  wide_int value = min;
+  return irange_bitmask (value, mask);
+}
+
+  wide_int xorv = min ^ max;
+
+  if (xorv != 0)
+xorv = wi::mask (prec - wi::clz (xorv), false, prec);
+
+  return irange_bitmask (wi::zero (prec), min | xorv);
+}
+
 void
 irange::accept (const vrange_visitor &v) const
 {
@@ -1881,31 +1905,6 @@ irange::invert ()
 verify_range ();
 }
 
-// Return the bitmask inherent in the range.
-
-irange_bitmask
-irange::get_bitmask_from_range () const
-{
-  unsigned prec = TYPE_PRECISION (type ());
-  wide_int min = lower_bound ();
-  wide_int max = upper_bound ();
-
-  // All the bits of a singleton are known.
-  if (min == max)
-{
-  wide_int mask = wi::zero (prec);
-  wide_int value = lower_bound ();
-  return irange_bitmask (value, mask);
-}
-
-  wide_int xorv = min ^ max;
-
-  if (xorv != 0)
-xorv = wi::mask (prec - wi::clz (xorv), false, prec);
-
-  return irange_bitmask (wi::zero (prec), min | xorv);
-}
-
 // Remove trailing ranges that this bitmask indicates can't exist.
 
 void
@@ -2027,7 +2026,8 @@ irange::get_bitmask () const
   // in the mask.
   //
   // See also the note in irange_bitmask::intersect.
-  irange_bitmask bm = get_bitmask_from_range ();
+  irange_bitmask bm
+= get_bitmask_from_range (type (), lower_bound (), upper_bound ());
   if (!m_bitmask.unknown_p ())
 bm.intersect (m_bitmask);
   return bm;
diff --git a/gcc/value-range.h b/gcc/value-range.h
index d2e8fd5a4d9..ede90a496d8 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -352,7 +352,6 @@ private:
   bool varying_compatible_p () const;
   bool intersect_bitmask (const irange &r);
   bool union_bitmask (const irange &r);
-  irange_bitmask get_bitmask_from_range () const;
   bool set_range_from_bitmask ();
 
   bool intersect (const wide_int& lb, const wide_int& ub);
-- 
2.44.0
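
For readers without the wide_int machinery at hand, the logic of
get_bitmask_from_range can be sketched in standalone C++.  This is only
an illustration under stated assumptions: bitmask64, clz64 and
bitmask_from_range are hypothetical names, precision is fixed at 64
unsigned bits, and a set bit in the mask is taken to mean "unknown",
which matches the singleton case in the patch where the mask is zero.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for irange_bitmask: a set bit in MASK means
// "this bit is unknown"; VALUE holds the known bits.
struct bitmask64
{
  uint64_t value;
  uint64_t mask;
};

// Count leading zeros of a non-zero 64-bit value.
static unsigned
clz64 (uint64_t x)
{
  assert (x != 0);
  unsigned n = 0;
  while (!(x & (UINT64_C (1) << 63)))
    {
      x <<= 1;
      ++n;
    }
  return n;
}

// Mirror of the logic in get_bitmask_from_range, fixed at 64-bit
// unsigned precision instead of an arbitrary tree type.
static bitmask64
bitmask_from_range (uint64_t min, uint64_t max)
{
  assert (min <= max);

  // All the bits of a singleton are known.
  if (min == max)
    return { min, 0 };

  // Every bit at or below the highest differing bit may vary.
  uint64_t xorv = min ^ max;
  unsigned width = 64 - clz64 (xorv);
  uint64_t low = width == 64 ? ~UINT64_C (0) : (UINT64_C (1) << width) - 1;

  // As in the patch, the value is left at zero and MIN is folded into
  // the mask, so the result is conservative.
  return { 0, min | low };
}
```

With this sketch, the range [8, 11] yields value 0 and mask 0b1011,
while the singleton [5, 5] yields value 5 and mask 0.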



[COMMITTED 02/16] Add a virtual vrange destructor.

2024-04-28 Thread Aldy Hernandez
Richi mentioned in PR113476 that it would be cleaner to move the
destructor from int_range to the base class.  Although this isn't
strictly necessary, as there are no users, it is good to future-proof
things, and the overall impact is minuscule.
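
As a rough illustration of what the virtual destructor buys us, here is
a standalone sketch.  The names range_base, int_range_like and
destroy_through_base are made up for the example and are not GCC code;
the point is only that destroying a derived object through a base
pointer runs the derived destructor when the base destructor is
virtual.

```cpp
#include <cassert>
#include <memory>

// Counts how often the derived destructor has run.
static int derived_dtor_runs = 0;

// Hypothetical base class with a virtual destructor, as vrange has
// after this patch.
struct range_base
{
  virtual ~range_base () { }
};

// Hypothetical derived class owning a resource, loosely modelled on
// int_range.
struct int_range_like final : range_base
{
  int *m_buf;

  int_range_like () : m_buf (new int[4]) { }
  int_range_like (const int_range_like &) = delete;
  int_range_like &operator= (const int_range_like &) = delete;

  ~int_range_like () final override
  {
    delete[] m_buf;
    ++derived_dtor_runs;
  }
};

// Destroy a derived object through a base pointer and report how many
// derived destructors ran while doing so.
static int
destroy_through_base ()
{
  int before = derived_dtor_runs;
  {
    std::unique_ptr<range_base> p (new int_range_like);
  }
  assert (derived_dtor_runs >= before);
  return derived_dtor_runs - before;
}
```

Without the virtual destructor in the base class, deleting through the
base pointer would be undefined behavior in standard C++.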

gcc/ChangeLog:

* value-range.h (vrange::~vrange): New.
(int_range::~int_range): Make final override.
---
 gcc/value-range.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index e7f61950a24..b7c83982385 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -95,6 +95,7 @@ public:
   virtual void set_zero (tree type) = 0;
   virtual void set_nonnegative (tree type) = 0;
   virtual bool fits_p (const vrange &r) const = 0;
+  virtual ~vrange () { }
 
   bool varying_p () const;
   bool undefined_p () const;
@@ -382,7 +383,7 @@ public:
   int_range (tree type);
   int_range (const int_range &);
   int_range (const irange &);
-  virtual ~int_range ();
+  ~int_range () final override;
   int_range& operator= (const int_range &);
 protected:
   int_range (tree, tree, value_range_kind = VR_RANGE);
-- 
2.44.0



[COMMITTED 09/16] Verify that reading back from vrange_storage doesn't drop bits.

2024-04-28 Thread Aldy Hernandez
We have a sanity check in the irange storage code to make sure that
reading back a cache entry we have just written to yields exactly the
same range.  There's no need to do this only for integers.  This patch
moves the code to a more generic place.

However, doing so tickles a latent bug in the frange code where a
range is being pessimized from [0.0, 1.0] to [-0.0, 1.0].  Exclude
checking frange's until this bug is fixed.
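
To see why [0.0, 1.0] and [-0.0, 1.0] count as different ranges, it
helps to remember that the two zeros compare equal numerically but
differ in sign.  The standalone sketch below uses hypothetical names
(frange_like, same_bound, identical_p) and simply assumes that the
read-back check is sign-aware; it does not model the actual storage
code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for a floating-point range.
struct frange_like
{
  double lo;
  double hi;
};

// Compare two bounds, treating -0.0 and +0.0 as different.
static bool
same_bound (double a, double b)
{
  return a == b && std::signbit (a) == std::signbit (b);
}

// Return true if the range read back is exactly the range written.
static bool
identical_p (const frange_like &written, const frange_like &read_back)
{
  assert (written.lo <= written.hi);
  return (same_bound (written.lo, read_back.lo)
	  && same_bound (written.hi, read_back.hi));
}
```

A plain == on the lower bounds would accept the pessimized range,
whereas identical_p rejects it.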

gcc/ChangeLog:

* value-range-storage.cc (irange_storage::set_irange): Move
verification code from here...
(vrange_storage::set_vrange): ...to here.
---
 gcc/value-range-storage.cc | 20 +---
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/gcc/value-range-storage.cc b/gcc/value-range-storage.cc
index f00474ad0e6..09a29776a0e 100644
--- a/gcc/value-range-storage.cc
+++ b/gcc/value-range-storage.cc
@@ -165,6 +165,19 @@ vrange_storage::set_vrange (const vrange &r)
 }
   else
 gcc_unreachable ();
+
+  // Verify that reading back from the cache didn't drop bits.
+  if (flag_checking
+  // FIXME: Avoid checking frange, as it currently pessimizes some ranges:
+  //
+  // gfortran.dg/pr49472.f90 pessimizes [0.0, 1.0] into [-0.0, 1.0].
+  && !is_a <frange> (r)
+  && !r.undefined_p ())
+{
+  Value_Range tmp (r);
+  get_vrange (tmp, r.type ());
+  gcc_checking_assert (tmp == r);
+}
 }
 
 // Restore R from storage.
@@ -306,13 +319,6 @@ irange_storage::set_irange (const irange &r)
   irange_bitmask bm = r.m_bitmask;
   write_wide_int (val, len, bm.value ());
   write_wide_int (val, len, bm.mask ());
-
-  if (flag_checking)
-{
-  int_range_max tmp;
-  get_irange (tmp, r.type ());
-  gcc_checking_assert (tmp == r);
-}
 }
 
 static inline void
-- 
2.44.0



[COMMITTED 14/16] Move print_irange_* out of vrange_printer class.

2024-04-28 Thread Aldy Hernandez
Move some code out of the irange pretty printers so it can be shared
with pointers.

gcc/ChangeLog:

* value-range-pretty-print.cc (print_int_bound): New.
(print_irange_bitmasks): New.
(vrange_printer::print_irange_bound): Remove.
(vrange_printer::print_irange_bitmasks): Remove.
* value-range-pretty-print.h: Remove print_irange_bitmasks and
print_irange_bound.
---
 gcc/value-range-pretty-print.cc | 83 -
 gcc/value-range-pretty-print.h  |  2 -
 2 files changed, 41 insertions(+), 44 deletions(-)

diff --git a/gcc/value-range-pretty-print.cc b/gcc/value-range-pretty-print.cc
index c75cbea3955..b6d23dce6d2 100644
--- a/gcc/value-range-pretty-print.cc
+++ b/gcc/value-range-pretty-print.cc
@@ -30,6 +30,44 @@ along with GCC; see the file COPYING3.  If not see
 #include "gimple-range.h"
 #include "value-range-pretty-print.h"
 
+static void
+print_int_bound (pretty_printer *pp, const wide_int &bound, tree type)
+{
+  wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+  wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
+
+  if (INTEGRAL_TYPE_P (type)
+  && !TYPE_UNSIGNED (type)
+  && bound == type_min
+  && TYPE_PRECISION (type) != 1)
+pp_string (pp, "-INF");
+  else if (bound == type_max && TYPE_PRECISION (type) != 1)
+pp_string (pp, "+INF");
+  else
+pp_wide_int (pp, bound, TYPE_SIGN (type));
+}
+
+static void
+print_irange_bitmasks (pretty_printer *pp, const irange_bitmask &bm)
+{
+  if (bm.unknown_p ())
+return;
+
+  pp_string (pp, " MASK ");
+  char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p;
+  unsigned len_mask, len_val;
+  if (print_hex_buf_size (bm.mask (), &len_mask)
+  | print_hex_buf_size (bm.value (), &len_val))
+p = XALLOCAVEC (char, MAX (len_mask, len_val));
+  else
+p = buf;
+  print_hex (bm.mask (), p);
+  pp_string (pp, p);
+  pp_string (pp, " VALUE ");
+  print_hex (bm.value (), p);
+  pp_string (pp, p);
+}
+
 void
 vrange_printer::visit (const unsupported_range &r) const
 {
@@ -66,51 +104,12 @@ vrange_printer::visit (const irange &r) const
   for (unsigned i = 0; i < r.num_pairs (); ++i)
 {
   pp_character (pp, '[');
-  print_irange_bound (r.lower_bound (i), r.type ());
+  print_int_bound (pp, r.lower_bound (i), r.type ());
   pp_string (pp, ", ");
-  print_irange_bound (r.upper_bound (i), r.type ());
+  print_int_bound (pp, r.upper_bound (i), r.type ());
   pp_character (pp, ']');
 }
- print_irange_bitmasks (r);
-}
-
-void
-vrange_printer::print_irange_bound (const wide_int &bound, tree type) const
-{
-  wide_int type_min = wi::min_value (TYPE_PRECISION (type), TYPE_SIGN (type));
-  wide_int type_max = wi::max_value (TYPE_PRECISION (type), TYPE_SIGN (type));
-
-  if (INTEGRAL_TYPE_P (type)
-  && !TYPE_UNSIGNED (type)
-  && bound == type_min
-  && TYPE_PRECISION (type) != 1)
-pp_string (pp, "-INF");
-  else if (bound == type_max && TYPE_PRECISION (type) != 1)
-pp_string (pp, "+INF");
-  else
-pp_wide_int (pp, bound, TYPE_SIGN (type));
-}
-
-void
-vrange_printer::print_irange_bitmasks (const irange &r) const
-{
-  irange_bitmask bm = r.m_bitmask;
-  if (bm.unknown_p ())
-return;
-
-  pp_string (pp, " MASK ");
-  char buf[WIDE_INT_PRINT_BUFFER_SIZE], *p;
-  unsigned len_mask, len_val;
-  if (print_hex_buf_size (bm.mask (), &len_mask)
-  | print_hex_buf_size (bm.value (), &len_val))
-p = XALLOCAVEC (char, MAX (len_mask, len_val));
-  else
-p = buf;
-  print_hex (bm.mask (), p);
-  pp_string (pp, p);
-  pp_string (pp, " VALUE ");
-  print_hex (bm.value (), p);
-  pp_string (pp, p);
+  print_irange_bitmasks (pp, r.m_bitmask);
 }
 
 void
diff --git a/gcc/value-range-pretty-print.h b/gcc/value-range-pretty-print.h
index ca85fd6157c..44cd6e81298 100644
--- a/gcc/value-range-pretty-print.h
+++ b/gcc/value-range-pretty-print.h
@@ -29,8 +29,6 @@ public:
   void visit (const irange &) const override;
   void visit (const frange &) const override;
 private:
-  void print_irange_bound (const wide_int &w, tree type) const;
-  void print_irange_bitmasks (const irange &) const;
   void print_frange_nan (const frange &) const;
   void print_real_value (tree type, const REAL_VALUE_TYPE &r) const;
 
-- 
2.44.0
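
The bound printing rule that print_int_bound keeps can be sketched
outside of GCC as follows.  This is a simplified stand-in, not the real
printer: bound_to_string and range_to_string are hypothetical names,
native integer types replace trees and wide_ints, and the special case
for 1-bit precision types is left out.

```cpp
#include <cassert>
#include <limits>
#include <string>
#include <type_traits>

// Print a bound, using -INF and +INF for the extremes of the type.
// As in the patch, only signed types ever print -INF.
template <typename T>
std::string
bound_to_string (T bound)
{
  static_assert (std::is_integral<T>::value, "integer bounds only");
  typedef std::numeric_limits<T> lim;

  if (std::is_signed<T>::value && bound == lim::min ())
    return "-INF";
  if (bound == lim::max ())
    return "+INF";
  return std::to_string (bound);
}

// Print a single [lo, hi] pair the way the irange visitor does.
template <typename T>
std::string
range_to_string (T lo, T hi)
{
  assert (lo <= hi);
  return "[" + bound_to_string (lo) + ", " + bound_to_string (hi) + "]";
}
```

Since the helper takes the bound and type directly rather than an
irange, the same routine can serve any range kind with integer-like
bounds.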



[COMMITTED 08/16] Change range_includes_zero_p argument to a reference.

2024-04-28 Thread Aldy Hernandez
Make range_includes_zero_p take a reference instead of a pointer for
consistency in the range-op code.

gcc/ChangeLog:

* gimple-range-op.cc (cfn_clz::fold_range): Change
range_includes_zero_p argument to a reference.
(cfn_ctz::fold_range): Same.
* range-op.cc (operator_plus::lhs_op1_relation): Same.
* value-range.h (range_includes_zero_p): Same.
---
 gcc/gimple-range-op.cc |  6 +++---
 gcc/range-op.cc|  2 +-
 gcc/value-range.h  | 10 +-
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/gcc/gimple-range-op.cc b/gcc/gimple-range-op.cc
index a98f7db62a7..9c50c00549e 100644
--- a/gcc/gimple-range-op.cc
+++ b/gcc/gimple-range-op.cc
@@ -853,7 +853,7 @@ public:
 // __builtin_ffs* and __builtin_popcount* return [0, prec].
 int prec = TYPE_PRECISION (lh.type ());
 // If arg is non-zero, then ffs or popcount are non-zero.
-int mini = range_includes_zero_p (&lh) ? 0 : 1;
+int mini = range_includes_zero_p (lh) ? 0 : 1;
 int maxi = prec;
 
 // If some high bits are known to be zero, decrease the maximum.
@@ -945,7 +945,7 @@ cfn_clz::fold_range (irange &r, tree type, const irange &lh,
   if (mini == -2)
mini = 0;
 }
-  else if (!range_includes_zero_p (&lh))
+  else if (!range_includes_zero_p (lh))
 {
   mini = 0;
   maxi = prec - 1;
@@ -1007,7 +1007,7 @@ cfn_ctz::fold_range (irange &r, tree type, const irange &lh,
mini = -2;
 }
   // If arg is non-zero, then use [0, prec - 1].
-  if (!range_includes_zero_p (&lh))
+  if (!range_includes_zero_p (lh))
 {
   mini = 0;
   maxi = prec - 1;
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index aeff55cfd78..6ea7d624a9b 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -1657,7 +1657,7 @@ operator_plus::lhs_op1_relation (const irange &lhs,
 }
 
   // If op2 does not contain 0, then LHS and OP1 can never be equal.
-  if (!range_includes_zero_p (&op2))
+  if (!range_includes_zero_p (op2))
 return VREL_NE;
 
   return VREL_VARYING;
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 2650ded6d10..62f123e2a4b 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -972,16 +972,16 @@ irange::contains_p (tree cst) const
 }
 
 inline bool
-range_includes_zero_p (const irange *vr)
+range_includes_zero_p (const irange &vr)
 {
-  if (vr->undefined_p ())
+  if (vr.undefined_p ())
 return false;
 
-  if (vr->varying_p ())
+  if (vr.varying_p ())
 return true;
 
-  wide_int zero = wi::zero (TYPE_PRECISION (vr->type ()));
-  return vr->contains_p (zero);
+  wide_int zero = wi::zero (TYPE_PRECISION (vr.type ()));
+  return vr.contains_p (zero);
 }
 
 // Constructors for irange
-- 
2.44.0



[COMMITTED 10/16] Accept a vrange in get_legacy_range.

2024-04-28 Thread Aldy Hernandez
In preparation for prange, make get_legacy_range take a generic
vrange, not just an irange.

gcc/ChangeLog:

* value-range.cc (get_legacy_range): Make static and add another
version of get_legacy_range that takes a vrange.
* value-range.h (class irange): Remove unnecessary friendship with
get_legacy_range.
(get_legacy_range): Accept a vrange.
---
 gcc/value-range.cc | 17 -
 gcc/value-range.h  |  3 +--
 2 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index b901c864a7b..44929b210aa 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -1004,7 +1004,7 @@ irange::operator= (const irange &src)
   return *this;
 }
 
-value_range_kind
+static value_range_kind
 get_legacy_range (const irange &r, tree &min, tree &max)
 {
   if (r.undefined_p ())
@@ -1041,6 +1041,21 @@ get_legacy_range (const irange &r, tree &min, tree &max)
   return VR_RANGE;
 }
 
+// Given a range in V, return an old-style legacy range consisting of
+// a value_range_kind with a MIN/MAX.  This is to maintain
+// compatibility with passes that still depend on VR_ANTI_RANGE, and
+// only works for integers and pointers.
+
+value_range_kind
+get_legacy_range (const vrange &v, tree &min, tree &max)
+{
+  if (is_a <irange> (v))
+return get_legacy_range (as_a <irange> (v), min, max);
+
+  gcc_unreachable ();
+  return VR_UNDEFINED;
+}
+
 /* Set value range to the canonical form of {VRTYPE, MIN, MAX, EQUIV}.
This means adjusting VRTYPE, MIN and MAX representing the case of a
wrapping range with MAX < MIN covering [MIN, type_max] U [type_min, MAX]
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 62f123e2a4b..d2e8fd5a4d9 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -281,7 +281,6 @@ irange_bitmask::intersect (const irange_bitmask &orig_src)
 
 class irange : public vrange
 {
-  friend value_range_kind get_legacy_range (const irange &, tree &, tree &);
   friend class irange_storage;
   friend class vrange_printer;
 public:
@@ -886,7 +885,7 @@ Value_Range::supports_type_p (const_tree type)
   return irange::supports_p (type) || frange::supports_p (type);
 }
 
-extern value_range_kind get_legacy_range (const irange &, tree &min, tree &max);
+extern value_range_kind get_legacy_range (const vrange &, tree &min, tree &max);
 extern void dump_value_range (FILE *, const vrange *);
 extern bool vrp_operand_equal_p (const_tree, const_tree);
 inline REAL_VALUE_TYPE frange_val_min (const_tree type);
-- 
2.44.0



[COMMITTED 01/16] Make vrange an abstract class.

2024-04-28 Thread Aldy Hernandez
Explicitly make vrange an abstract class.  This involves fleshing out
the unsupported_range overrides which we were inheriting by default
from vrange.

gcc/ChangeLog:

* value-range.cc (unsupported_range::accept): Move down.
(vrange::contains_p):  Rename to...
(unsupported_range::contains_p): ...this.
(vrange::singleton_p): Rename to...
(unsupported_range::singleton_p): ...this.
(vrange::set): Rename to...
(unsupported_range::set): ...this.
(vrange::type): Rename to...
(unsupported_range::type): ...this.
(vrange::supports_type_p): Rename to...
(unsupported_range::supports_type_p): ...this.
(vrange::set_undefined): Rename to...
(unsupported_range::set_undefined): ...this.
(vrange::set_varying): Rename to...
(unsupported_range::set_varying): ...this.
(vrange::union_): Rename to...
(unsupported_range::union_): ...this.
(vrange::intersect): Rename to...
(unsupported_range::intersect): ...this.
(vrange::zero_p): Rename to...
(unsupported_range::zero_p): ...this.
(vrange::nonzero_p): Rename to...
(unsupported_range::nonzero_p): ...this.
(vrange::set_nonzero): Rename to...
(unsupported_range::set_nonzero): ...this.
(vrange::set_zero): Rename to...
(unsupported_range::set_zero): ...this.
(vrange::set_nonnegative): Rename to...
(unsupported_range::set_nonnegative): ...this.
(vrange::fits_p): Rename to...
(unsupported_range::fits_p): ...this.
(unsupported_range::operator=): New.
(frange::fits_p): New.
* value-range.h (class vrange): Make an abstract class.
(class unsupported_range): Declare override methods.
---
 gcc/value-range.cc | 62 ++
 gcc/value-range.h  | 53 ---
 2 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index 70375f7abf9..632d77305cc 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -37,12 +37,6 @@ irange::accept (const vrange_visitor &v) const
   v.visit (*this);
 }
 
-void
-unsupported_range::accept (const vrange_visitor &v) const
-{
-  v.visit (*this);
-}
-
 // Convenience function only available for integers and pointers.
 
 wide_int
@@ -86,52 +80,58 @@ debug (const irange_bitmask &bm)
   fprintf (stderr, "\n");
 }
 
-// Default vrange definitions.
+// Definitions for unsupported_range.
+
+void
+unsupported_range::accept (const vrange_visitor &v) const
+{
+  v.visit (*this);
+}
 
 bool
-vrange::contains_p (tree) const
+unsupported_range::contains_p (tree) const
 {
   return varying_p ();
 }
 
 bool
-vrange::singleton_p (tree *) const
+unsupported_range::singleton_p (tree *) const
 {
   return false;
 }
 
 void
-vrange::set (tree min, tree, value_range_kind)
+unsupported_range::set (tree min, tree, value_range_kind)
 {
   set_varying (TREE_TYPE (min));
 }
 
 tree
-vrange::type () const
+unsupported_range::type () const
 {
   return void_type_node;
 }
 
 bool
-vrange::supports_type_p (const_tree) const
+unsupported_range::supports_type_p (const_tree) const
 {
   return false;
 }
 
 void
-vrange::set_undefined ()
+unsupported_range::set_undefined ()
 {
   m_kind = VR_UNDEFINED;
 }
 
 void
-vrange::set_varying (tree)
+unsupported_range::set_varying (tree)
 {
   m_kind = VR_VARYING;
 }
 
 bool
-vrange::union_ (const vrange &r)
+unsupported_range::union_ (const vrange &r)
 {
   if (r.undefined_p () || varying_p ())
 return false;
@@ -145,7 +145,7 @@ vrange::union_ (const vrange &r)
 }
 
 bool
-vrange::intersect (const vrange &r)
+unsupported_range::intersect (const vrange &r)
 {
   if (undefined_p () || r.varying_p ())
 return false;
@@ -164,41 +164,53 @@ vrange::intersect (const vrange &r)
 }
 
 bool
-vrange::zero_p () const
+unsupported_range::zero_p () const
 {
   return false;
 }
 
 bool
-vrange::nonzero_p () const
+unsupported_range::nonzero_p () const
 {
   return false;
 }
 
 void
-vrange::set_nonzero (tree type)
+unsupported_range::set_nonzero (tree type)
 {
   set_varying (type);
 }
 
 void
-vrange::set_zero (tree type)
+unsupported_range::set_zero (tree type)
 {
   set_varying (type);
 }
 
 void
-vrange::set_nonnegative (tree type)
+unsupported_range::set_nonnegative (tree type)
 {
   set_varying (type);
 }
 
 bool
-vrange::fits_p (const vrange &) const
+unsupported_range::fits_p (const vrange &) const
 {
   return true;
 }
 
+unsupported_range &
+unsupported_range::operator= (const vrange &r)
+{
+  if (r.undefined_p ())
+set_undefined ();
+  else if (r.varying_p ())
+set_varying (void_type_node);
+  else
+gcc_unreachable ();
+  return *this;
+}
+
 // Assignment operator for generic ranges.  Copying incompatible types
 // is not allowed.
 
@@ -359,6 +371,12 @@ frange::accept (const vrange_visitor &v) const
   v.visit (*this);
 }
 
+bool
+frange::fits_p (con

[COMMITTED 13/16] Accept any vrange in range_includes_zero_p.

2024-04-28 Thread Aldy Hernandez
Accept a vrange, as this will be used for either integers or pointers.

gcc/ChangeLog:

* value-range.h (range_includes_zero_p): Accept vrange.
---
 gcc/value-range.h | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/gcc/value-range.h b/gcc/value-range.h
index ede90a496d8..0ab717697f0 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -970,7 +970,7 @@ irange::contains_p (tree cst) const
 }
 
 inline bool
-range_includes_zero_p (const irange &vr)
+range_includes_zero_p (const vrange &vr)
 {
   if (vr.undefined_p ())
 return false;
@@ -978,8 +978,7 @@ range_includes_zero_p (const irange &vr)
   if (vr.varying_p ())
 return true;
 
-  wide_int zero = wi::zero (TYPE_PRECISION (vr.type ()));
-  return vr.contains_p (zero);
+  return vr.contains_p (build_zero_cst (vr.type ()));
 }
 
 // Constructors for irange
-- 
2.44.0



[COMMITTED 15/16] Remove range_zero and range_nonzero.

2024-04-28 Thread Aldy Hernandez
Remove legacy range_zero and range_nonzero as they return by value,
which makes them unworkable in a separate irange and prange world.  Also,
we already have set_zero and set_nonzero methods in vrange.
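
A standalone sketch of the difference follows.  A helper that returns a
range by value has to commit to one concrete range type, whereas a
virtual setter lets the caller's own object decide how "zero" is
represented.  All names here (range_base, int_like, ptr_like,
set_to_zero) are hypothetical and only loosely modelled on vrange,
irange and the upcoming prange.

```cpp
#include <cassert>

// Hypothetical abstract range, loosely modelled on vrange.
struct range_base
{
  virtual ~range_base () { }
  virtual void set_zero () = 0;
  virtual void set_nonzero () = 0;
  virtual bool zero_p () const = 0;
};

// Integer-like range: [lo, hi], or everything but [lo, hi] if ANTI.
struct int_like final : range_base
{
  long lo = 0;
  long hi = 0;
  bool anti = true;

  void set_zero () override { lo = hi = 0; anti = false; }
  void set_nonzero () override { lo = hi = 0; anti = true; }
  bool zero_p () const override { return !anti && lo == 0 && hi == 0; }
};

// Pointer-like range with a completely different representation.
struct ptr_like final : range_base
{
  enum state_t { ANY, NULL_ONLY, NONNULL };
  state_t state = ANY;

  void set_zero () override { state = NULL_ONLY; }
  void set_nonzero () override { state = NONNULL; }
  bool zero_p () const override { return state == NULL_ONLY; }
};

// Generic code only needs the setter; it never has to name the
// concrete range type it is filling in.
static void
set_to_zero (range_base &r)
{
  r.set_zero ();
  assert (r.zero_p ());
}
```

The same call site then works for both kinds of range, which is what
the r.set_zero (type) and r.set_nonzero (type) conversions in this
patch rely on.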

gcc/ChangeLog:

* range-op-ptr.cc (pointer_plus_operator::wi_fold): Use method
range setters instead of out of line functions.
(pointer_min_max_operator::wi_fold): Same.
(pointer_and_operator::wi_fold): Same.
(pointer_or_operator::wi_fold): Same.
* range-op.cc (operator_negate::fold_range): Same.
(operator_addr_expr::fold_range): Same.
(range_op_cast_tests): Same.
* range.cc (range_zero): Remove.
(range_nonzero): Remove.
* range.h (range_zero): Remove.
(range_nonzero): Remove.
* value-range.cc (range_tests_misc): Use method instead of out of
line function.
---
 gcc/range-op-ptr.cc | 14 +++---
 gcc/range-op.cc | 14 --
 gcc/range.cc| 14 --
 gcc/range.h |  2 --
 gcc/value-range.cc  |  7 ---
 5 files changed, 19 insertions(+), 32 deletions(-)

diff --git a/gcc/range-op-ptr.cc b/gcc/range-op-ptr.cc
index 2c85d75b5e8..7343ef635f3 100644
--- a/gcc/range-op-ptr.cc
+++ b/gcc/range-op-ptr.cc
@@ -101,10 +101,10 @@ pointer_plus_operator::wi_fold (irange &r, tree type,
   && !TYPE_OVERFLOW_WRAPS (type)
   && (flag_delete_null_pointer_checks
  || !wi::sign_mask (rh_ub)))
-r = range_nonzero (type);
+r.set_nonzero (type);
   else if (lh_lb == lh_ub && lh_lb == 0
   && rh_lb == rh_ub && rh_lb == 0)
-r = range_zero (type);
+r.set_zero (type);
   else
r.set_varying (type);
 }
@@ -150,9 +150,9 @@ pointer_min_max_operator::wi_fold (irange &r, tree type,
   // are varying.
   if (!wi_includes_zero_p (type, lh_lb, lh_ub)
   && !wi_includes_zero_p (type, rh_lb, rh_ub))
-r = range_nonzero (type);
+r.set_nonzero (type);
   else if (wi_zero_p (type, lh_lb, lh_ub) && wi_zero_p (type, rh_lb, rh_ub))
-r = range_zero (type);
+r.set_zero (type);
   else
 r.set_varying (type);
 }
@@ -175,7 +175,7 @@ pointer_and_operator::wi_fold (irange &r, tree type,
   // For pointer types, we are really only interested in asserting
   // whether the expression evaluates to non-NULL.
   if (wi_zero_p (type, lh_lb, lh_ub) || wi_zero_p (type, lh_lb, lh_ub))
-r = range_zero (type);
+r.set_zero (type);
   else
 r.set_varying (type);
 }
@@ -236,9 +236,9 @@ pointer_or_operator::wi_fold (irange &r, tree type,
   // whether the expression evaluates to non-NULL.
   if (!wi_includes_zero_p (type, lh_lb, lh_ub)
   && !wi_includes_zero_p (type, rh_lb, rh_ub))
-r = range_nonzero (type);
+r.set_nonzero (type);
   else if (wi_zero_p (type, lh_lb, lh_ub) && wi_zero_p (type, rh_lb, rh_ub))
-r = range_zero (type);
+r.set_zero (type);
   else
 r.set_varying (type);
 }
diff --git a/gcc/range-op.cc b/gcc/range-op.cc
index 6ea7d624a9b..ab3a4f0b200 100644
--- a/gcc/range-op.cc
+++ b/gcc/range-op.cc
@@ -4364,9 +4364,11 @@ operator_negate::fold_range (irange &r, tree type,
 {
   if (empty_range_varying (r, type, lh, rh))
 return true;
-  // -X is simply 0 - X.
-  return range_op_handler (MINUS_EXPR).fold_range (r, type,
-  range_zero (type), lh);
+
+// -X is simply 0 - X.
+  int_range<1> zero;
+  zero.set_zero (type);
+  return range_op_handler (MINUS_EXPR).fold_range (r, type, zero, lh);
 }
 
 bool
@@ -4391,7 +4393,7 @@ operator_addr_expr::fold_range (irange &r, tree type,
 
   // Return a non-null pointer of the LHS type (passed in op2).
   if (lh.zero_p ())
-r = range_zero (type);
+r.set_zero (type);
   else if (lh.undefined_p () || contains_zero_p (lh))
 r.set_varying (type);
   else
@@ -4675,7 +4677,7 @@ range_op_cast_tests ()
   if (TYPE_PRECISION (integer_type_node)
   > TYPE_PRECISION (short_integer_type_node))
 {
-  r0 = range_nonzero (integer_type_node);
+  r0.set_nonzero (integer_type_node);
   range_cast (r0, short_integer_type_node);
   r1 = int_range<1> (short_integer_type_node,
 min_limit (short_integer_type_node),
@@ -4687,7 +4689,7 @@ range_op_cast_tests ()
   //
   // NONZERO signed 16-bits is [-MIN_16,-1][1, +MAX_16].
   // Converting this to 32-bits signed is [-MIN_16,-1][1, +MAX_16].
-  r0 = range_nonzero (short_integer_type_node);
+  r0.set_nonzero (short_integer_type_node);
   range_cast (r0, integer_type_node);
   r1 = int_range<1> (integer_type_node, INT (-32768), INT (-1));
   r2 = int_range<1> (integer_type_node, INT (1), INT (32767));
diff --git a/gcc/range.cc b/gcc/range.cc
index c68f387f71c..b362e0f12e0 100644
--- a/gcc/range.cc
+++ b/gcc/range.cc
@@ -29,20 +29,6 @@ along with GCC; see the file COPYING3.  If not see
 #include "ssa.h"
 #include "range.h"
 
-value_range
-range_zero (tree type)
-{
-  wide_int zero = wi::zero (TYPE_PRECISION (type));
-  return value_range (type, ze

[COMMITTED 12/16] Make some integer specific ranges generic Value_Range's.

2024-04-28 Thread Aldy Hernandez
There are some irange uses that should be Value_Range, because they
can be either integers or pointers.  This will become a problem when
prange goes live.

gcc/ChangeLog:

* tree-ssa-loop-split.cc (split_at_bb_p): Make int_range a Value_Range.
* tree-ssa-strlen.cc (get_range): Same.
* value-query.cc (range_query::get_tree_range):  Handle both
integers and pointers.
* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Make
r0 and r1 Value_Range's.
---
 gcc/tree-ssa-loop-split.cc | 6 +++---
 gcc/tree-ssa-strlen.cc | 2 +-
 gcc/value-query.cc | 4 +---
 gcc/vr-values.cc   | 3 ++-
 4 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/gcc/tree-ssa-loop-split.cc b/gcc/tree-ssa-loop-split.cc
index a770ea371a2..a6be0cef7b0 100644
--- a/gcc/tree-ssa-loop-split.cc
+++ b/gcc/tree-ssa-loop-split.cc
@@ -144,18 +144,18 @@ split_at_bb_p (class loop *loop, basic_block bb, tree *border, affine_iv *iv,
   value range.  */
else
  {
-   int_range<2> r;
+   Value_Range r (TREE_TYPE (op0));
get_global_range_query ()->range_of_expr (r, op0, stmt);
if (!r.varying_p () && !r.undefined_p ()
&& TREE_CODE (op1) == INTEGER_CST)
  {
wide_int val = wi::to_wide (op1);
-   if (known_eq (val, r.lower_bound ()))
+   if (known_eq (val, wi::to_wide (r.lbound ())))
  {
code = (code == EQ_EXPR) ? LE_EXPR : GT_EXPR;
break;
  }
-   else if (known_eq (val, r.upper_bound ()))
+   else if (known_eq (val, wi::to_wide (r.ubound ())))
  {
code = (code == EQ_EXPR) ? GE_EXPR : LT_EXPR;
break;
diff --git a/gcc/tree-ssa-strlen.cc b/gcc/tree-ssa-strlen.cc
index e09c9cc081f..61c3da22322 100644
--- a/gcc/tree-ssa-strlen.cc
+++ b/gcc/tree-ssa-strlen.cc
@@ -215,7 +215,7 @@ get_range (tree val, gimple *stmt, wide_int minmax[2],
   rvals = get_range_query (cfun);
 }
 
-  value_range vr;
+  Value_Range vr (TREE_TYPE (val));
   if (!rvals->range_of_expr (vr, val, stmt))
 return NULL_TREE;
 
diff --git a/gcc/value-query.cc b/gcc/value-query.cc
index eda71dc89d3..052b7511565 100644
--- a/gcc/value-query.cc
+++ b/gcc/value-query.cc
@@ -156,11 +156,9 @@ range_query::get_tree_range (vrange &r, tree expr, gimple *stmt)
 {
 case INTEGER_CST:
   {
-   irange &i = as_a <irange> (r);
if (TREE_OVERFLOW_P (expr))
  expr = drop_tree_overflow (expr);
-   wide_int w = wi::to_wide (expr);
-   i.set (TREE_TYPE (expr), w, w);
+   r.set (expr, expr);
return true;
   }
 
diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index ff68d40c355..0572bf6c8c7 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -310,7 +310,8 @@ tree
 simplify_using_ranges::fold_cond_with_ops (enum tree_code code,
   tree op0, tree op1, gimple *s)
 {
-  int_range_max r0, r1;
+  Value_Range r0 (TREE_TYPE (op0));
+  Value_Range r1 (TREE_TYPE (op1));
   if (!query->range_of_expr (r0, op0, s)
   || !query->range_of_expr (r1, op1, s))
 return NULL_TREE;
-- 
2.44.0



[COMMITTED 07/16] Make fold_cond_with_ops use a boolean type for range_true/range_false.

2024-04-28 Thread Aldy Hernandez
Conditional operators are always boolean, regardless of their
operands.  Getting the type wrong is not currently a problem, but will
be when pranges can no longer store an integer.

gcc/ChangeLog:

* vr-values.cc (simplify_using_ranges::fold_cond_with_ops): Remove
type from range_true and range_false.
---
 gcc/vr-values.cc | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/vr-values.cc b/gcc/vr-values.cc
index a7e291a16e5..ff68d40c355 100644
--- a/gcc/vr-values.cc
+++ b/gcc/vr-values.cc
@@ -320,9 +320,9 @@ simplify_using_ranges::fold_cond_with_ops (enum tree_code 
code,
   range_op_handler handler (code);
   if (handler && handler.fold_range (res, type, r0, r1))
 {
-  if (res == range_true (type))
+  if (res == range_true ())
return boolean_true_node;
-  if (res == range_false (type))
+  if (res == range_false ())
return boolean_false_node;
 }
   return NULL;
-- 
2.44.0



[COMMITTED 16/16] Callers of irange_bitmask must normalize value/mask pairs.

2024-04-28 Thread Aldy Hernandez
As per the documentation, irange_bitmask must have the unknown bits in
the mask set to 0 in the value field.  Even though we say we must have
normalized value/mask bits, we don't enforce it, opting to normalize
on the fly in union and intersect.  Avoiding this lazy enforcement, as
well as the extra saving/restoring involved in returning the changed
status, gives us a performance increase of 1.25% for VRP and 1.51% for
ipa-CP.
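
As a rough illustration of the invariant described above, here is a minimal
sketch using plain 64-bit words instead of wide_int.  The Bitmask struct and
the helper names are hypothetical stand-ins, not the GCC classes; only the
union formula is taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for irange_bitmask.  A set bit in `mask` means
// "unknown"; known bits are read from `value`.
struct Bitmask
{
  std::uint64_t value;
  std::uint64_t mask;

  // Normalized means every unknown bit is 0 in the value field.
  bool normalized () const { return (value & mask) == 0; }
  bool operator== (const Bitmask &o) const
  { return value == o.value && mask == o.mask; }
};

// What a caller is expected to do before handing over a value/mask pair.
inline Bitmask make_normalized (std::uint64_t value, std::uint64_t mask)
{
  return Bitmask { value & ~mask, mask };
}

// Union using the same formula as the patched irange_bitmask::union_:
// no on-the-fly normalization and no "changed" return value.
inline void union_with (Bitmask &dst, const Bitmask &src)
{
  assert (dst.normalized () && src.normalized ());
  dst.mask = (dst.mask | src.mask) | (dst.value ^ src.value);
  dst.value = dst.value & src.value;
  assert (dst.normalized ());
}

// The caller now computes the changed status itself, by comparing
// against a saved copy.
inline bool union_changed (Bitmask &dst, const Bitmask &src)
{
  Bitmask save = dst;
  union_with (dst, src);
  return !(save == dst);
}
```

With normalized inputs the result of the union stays normalized, which is
why the check can move into verify_mask instead of being repaired on every
call.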

gcc/ChangeLog:

* tree-ssa-ccp.cc (ccp_finalize): Normalize before calling
set_bitmask.
* value-range.cc (irange::intersect_bitmask): Calculate changed
irange_bitmask bits on our own.
(irange::union_bitmask): Same.
(irange_bitmask::verify_mask): Verify that bits are normalized.
* value-range.h (irange_bitmask::union_): Do not normalize.
Remove return value.
(irange_bitmask::intersect): Same.
---
 gcc/tree-ssa-ccp.cc |  1 +
 gcc/value-range.cc  |  7 +--
 gcc/value-range.h   | 24 ++--
 3 files changed, 12 insertions(+), 20 deletions(-)

diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index f6a5cd0ee6e..3749126b5f7 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -1024,6 +1024,7 @@ ccp_finalize (bool nonzero_p)
  unsigned int precision = TYPE_PRECISION (TREE_TYPE (val->value));
  wide_int value = wi::to_wide (val->value);
  wide_int mask = wide_int::from (val->mask, precision, UNSIGNED);
+ value = value & ~mask;
  set_bitmask (name, value, mask);
}
 }
diff --git a/gcc/value-range.cc b/gcc/value-range.cc
index a27de5534e1..ca6d521c625 100644
--- a/gcc/value-range.cc
+++ b/gcc/value-range.cc
@@ -2067,7 +2067,8 @@ irange::intersect_bitmask (const irange &r)
 
   irange_bitmask bm = get_bitmask ();
   irange_bitmask save = bm;
-  if (!bm.intersect (r.get_bitmask ()))
+  bm.intersect (r.get_bitmask ());
+  if (save == bm)
 return false;
 
   m_bitmask = bm;
@@ -2099,7 +2100,8 @@ irange::union_bitmask (const irange &r)
 
   irange_bitmask bm = get_bitmask ();
   irange_bitmask save = bm;
-  if (!bm.union_ (r.get_bitmask ()))
+  bm.union_ (r.get_bitmask ());
+  if (save == bm)
 return false;
 
   m_bitmask = bm;
@@ -2133,6 +2135,7 @@ void
 irange_bitmask::verify_mask () const
 {
   gcc_assert (m_value.get_precision () == m_mask.get_precision ());
+  gcc_checking_assert (wi::bit_and (m_mask, m_value) == 0);
 }
 
 void
diff --git a/gcc/value-range.h b/gcc/value-range.h
index 0ab717697f0..11c73faca1b 100644
--- a/gcc/value-range.h
+++ b/gcc/value-range.h
@@ -139,8 +139,8 @@ public:
   void set_unknown (unsigned prec);
   bool unknown_p () const;
   unsigned get_precision () const;
-  bool union_ (const irange_bitmask &src);
-  bool intersect (const irange_bitmask &src);
+  void union_ (const irange_bitmask &src);
+  void intersect (const irange_bitmask &src);
   bool operator== (const irange_bitmask &src) const;
   bool operator!= (const irange_bitmask &src) const { return !(*this == src); }
   void verify_mask () const;
@@ -233,29 +233,18 @@ irange_bitmask::operator== (const irange_bitmask &src) 
const
   return m_value == src.m_value && m_mask == src.m_mask;
 }
 
-inline bool
-irange_bitmask::union_ (const irange_bitmask &orig_src)
+inline void
+irange_bitmask::union_ (const irange_bitmask &src)
 {
-  // Normalize mask.
-  irange_bitmask src (orig_src.m_value & ~orig_src.m_mask, orig_src.m_mask);
-  m_value &= ~m_mask;
-
-  irange_bitmask save (*this);
   m_mask = (m_mask | src.m_mask) | (m_value ^ src.m_value);
   m_value = m_value & src.m_value;
   if (flag_checking)
 verify_mask ();
-  return *this != save;
 }
 
-inline bool
-irange_bitmask::intersect (const irange_bitmask &orig_src)
+inline void
+irange_bitmask::intersect (const irange_bitmask &src)
 {
-  // Normalize mask.
-  irange_bitmask src (orig_src.m_value & ~orig_src.m_mask, orig_src.m_mask);
-  m_value &= ~m_mask;
-
-  irange_bitmask save (*this);
   // If we have two known bits that are incompatible, the resulting
   // bit is undefined.  It is unclear whether we should set the entire
   // range to UNDEFINED, or just a subset of it.  For now, set the
@@ -274,7 +263,6 @@ irange_bitmask::intersect (const irange_bitmask &orig_src)
 }
   if (flag_checking)
 verify_mask ();
-  return *this != save;
 }
 
 // An integer range without any storage.
-- 
2.44.0



[PATCH 00/16] prange supporting patchset

2024-04-28 Thread Aldy Hernandez
In this cycle, we will be contributing ranges for pointers (prange),
to disambiguate pointers from integers in a range.  Initially they
will behave exactly as they do now, with just two integer end points
and a bitmask, but eventually we will track points-to info in a less
hacky manner than what we do with the pointer equivalency class
(pointer_equiv_analyzer).

This first set of patches implements a bunch of little cleanups and
set-ups that will make it easier to drop in prange in a week or two.
The patches in this set are non-intrusive, and don't touch code that
changes much in the release, so they should be safe to push now.

There should be no change in behavior in any of these patches.

All patches have been tested on x86-64 Linux.

Aldy Hernandez (16):
  Make vrange an abstract class.
  Add a virtual vrange destructor.
  Make some Value_Range's explicitly integer.
  Add tree versions of lower and upper bounds to vrange.
  Move bitmask routines to vrange base class.
  Remove GTY support for vrange and derived classes.
  Make fold_cond_with_ops use a boolean type for range_true/range_false.
  Change range_includes_zero_p argument to a reference.
  Verify that reading back from vrange_storage doesn't drop bits.
  Accept a vrange in get_legacy_range.
  Move get_bitmask_from_range out of irange class.
  Make some integer specific ranges generic Value_Range's.
  Accept any vrange in range_includes_zero_p.
  Move print_irange_* out of vrange_printer class.
  Remove range_zero and range_nonzero.
  Callers of irange_bitmask must normalize value/mask pairs.

 gcc/gimple-range-op.cc  |   6 +-
 gcc/gimple-ssa-warn-access.cc   |   4 +-
 gcc/ipa-cp.cc   |   9 +-
 gcc/ipa-prop.cc |  10 +-
 gcc/range-op-mixed.h|   2 +-
 gcc/range-op-ptr.cc |  14 +-
 gcc/range-op.cc |  20 ++-
 gcc/range.cc|  14 --
 gcc/range.h |   2 -
 gcc/tree-ssa-ccp.cc |   1 +
 gcc/tree-ssa-loop-niter.cc  |  16 +-
 gcc/tree-ssa-loop-split.cc  |   6 +-
 gcc/tree-ssa-strlen.cc  |   2 +-
 gcc/value-query.cc  |   4 +-
 gcc/value-range-pretty-print.cc |  83 +-
 gcc/value-range-pretty-print.h  |   2 -
 gcc/value-range-storage.cc  |  20 ++-
 gcc/value-range-storage.h   |   4 -
 gcc/value-range.cc  | 284 
 gcc/value-range.h   | 166 ---
 gcc/vr-values.cc|   7 +-
 21 files changed, 310 insertions(+), 366 deletions(-)

-- 
2.44.0



[PATCH] Minor range type fixes for IPA in preparation for prange.

2024-04-28 Thread Aldy Hernandez
The polymorphic Value_Range object takes a tree type at construction
so it can determine what type of range to use (currently irange or
frange).  It seems a few of the types are slightly off.  This isn't a
problem now, because IPA only cares about integers and pointers, which
can both live in an irange.  However, with prange coming about, we
need to get the type right, because you can't store an integer in a
pointer range or vice versa.

Also, in preparation for prange, the irange::supports_p() idiom will become:

  irange::supports_p () || prange::supports_p()

To avoid changing all these places, I've added an inline function that
we can later update to change them all at once.

Finally, there's a Value_Range::supports_type_p() &&
irange::supports_p() in the code.  The latter is a subset of the
former, so there's no need to check both.
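
To make the wrapper idea concrete, here is a small self-contained sketch.
The type_kind enum and the *_supports_p helpers are hypothetical stand-ins
for tree types and the irange/prange predicates, modelling the planned
state in which integers and pointers live in separate range classes.

```cpp
#include <cassert>

// Hypothetical stand-in for the tree type being queried.
enum class type_kind { integer, pointer, floating };

// Assumed future behaviour: irange holds integers only.
inline bool irange_supports_p (type_kind k)
{ return k == type_kind::integer; }

// Assumed future behaviour: prange holds pointers only.
inline bool prange_supports_p (type_kind k)
{ return k == type_kind::pointer; }

// Single place to update once pointers move to their own range class;
// all IPA callers go through this predicate.
inline bool ipa_supports_p (type_kind k)
{
  return irange_supports_p (k) || prange_supports_p (k);
}

// Shows why the construction type matters: a range built for one kind
// cannot hold a value of the other kind.
inline bool can_store (type_kind range_type, type_kind value_type)
{
  assert (ipa_supports_p (range_type));
  return range_type == value_type;
}
```

In this model a range constructed from a pointer type rejects an integer
operand, which is the situation the type fixes in this patch are meant to
avoid.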

OK for trunk?

gcc/ChangeLog:

* ipa-cp.cc (ipa_vr_operation_and_type_effects): Use ipa_supports_p.
(ipa_value_range_from_jfunc): Change Value_Range type.
(propagate_vr_across_jump_function): Same.
* ipa-cp.h (ipa_supports_p): New.
* ipa-fnsummary.cc (evaluate_conditions_for_known_args): Change 
Value_Range type.
* ipa-prop.cc (ipa_compute_jump_functions_for_edge): Use ipa_supports_p.
(ipcp_get_parm_bits): Same.
---
 gcc/ipa-cp.cc| 14 +++---
 gcc/ipa-cp.h |  8 
 gcc/ipa-fnsummary.cc |  2 +-
 gcc/ipa-prop.cc  |  8 +++-
 4 files changed, 19 insertions(+), 13 deletions(-)

diff --git a/gcc/ipa-cp.cc b/gcc/ipa-cp.cc
index a688dced5c9..5781f50c854 100644
--- a/gcc/ipa-cp.cc
+++ b/gcc/ipa-cp.cc
@@ -1649,7 +1649,7 @@ ipa_vr_operation_and_type_effects (vrange &dst_vr,
   enum tree_code operation,
   tree dst_type, tree src_type)
 {
-  if (!irange::supports_p (dst_type) || !irange::supports_p (src_type))
+  if (!ipa_supports_p (dst_type) || !ipa_supports_p (src_type))
 return false;
 
   range_op_handler handler (operation);
@@ -1720,7 +1720,7 @@ ipa_value_range_from_jfunc (vrange &vr,
 
   if (TREE_CODE_CLASS (operation) == tcc_unary)
{
- Value_Range res (vr_type);
+ Value_Range res (parm_type);
 
  if (ipa_vr_operation_and_type_effects (res,
 srcvr,
@@ -1733,7 +1733,7 @@ ipa_value_range_from_jfunc (vrange &vr,
  Value_Range op_res (vr_type);
  Value_Range res (vr_type);
  tree op = ipa_get_jf_pass_through_operand (jfunc);
- Value_Range op_vr (vr_type);
+ Value_Range op_vr (TREE_TYPE (op));
  range_op_handler handler (operation);
 
  ipa_range_set_and_normalize (op_vr, op);
@@ -2527,7 +2527,7 @@ propagate_vr_across_jump_function (cgraph_edge *cs, 
ipa_jump_func *jfunc,
   if (src_lats->m_value_range.bottom_p ())
return dest_lat->set_to_bottom ();
 
-  Value_Range vr (operand_type);
+  Value_Range vr (param_type);
   if (TREE_CODE_CLASS (operation) == tcc_unary)
ipa_vr_operation_and_type_effects (vr,
   src_lats->m_value_range.m_vr,
@@ -2540,16 +2540,16 @@ propagate_vr_across_jump_function (cgraph_edge *cs, 
ipa_jump_func *jfunc,
{
  tree op = ipa_get_jf_pass_through_operand (jfunc);
  Value_Range op_vr (TREE_TYPE (op));
- Value_Range op_res (operand_type);
+ Value_Range op_res (param_type);
  range_op_handler handler (operation);
 
  ipa_range_set_and_normalize (op_vr, op);
 
  if (!handler
- || !op_res.supports_type_p (operand_type)
+ || !ipa_supports_p (operand_type)
  || !handler.fold_range (op_res, operand_type,
  src_lats->m_value_range.m_vr, op_vr))
-   op_res.set_varying (operand_type);
+   op_res.set_varying (param_type);
 
  ipa_vr_operation_and_type_effects (vr,
 op_res,
diff --git a/gcc/ipa-cp.h b/gcc/ipa-cp.h
index 7ff74fb5c98..abeaaa4053e 100644
--- a/gcc/ipa-cp.h
+++ b/gcc/ipa-cp.h
@@ -291,4 +291,12 @@ public:
 
 bool values_equal_for_ipcp_p (tree x, tree y);
 
+/* Return TRUE if IPA supports ranges of TYPE.  */
+
+static inline bool
+ipa_supports_p (tree type)
+{
+  return irange::supports_p (type);
+}
+
 #endif /* IPA_CP_H */
diff --git a/gcc/ipa-fnsummary.cc b/gcc/ipa-fnsummary.cc
index dff40cd8aa5..1dbf5278149 100644
--- a/gcc/ipa-fnsummary.cc
+++ b/gcc/ipa-fnsummary.cc
@@ -515,7 +515,7 @@ evaluate_conditions_for_known_args (struct cgraph_node 
*node,
}
  else if (!op->val[1])
{
- Value_Range op0 (op->type);
+ Value_Range op0 (TREE_TYPE (op->val[0]));
  range_op_handler handler (op->code);
 
  ipa_range_set_and_normalize (op0, op->val[0]);
diff --git a/gcc/ip

[Patch, fortran] PR114859 - [14/15 Regression] Seeing new segmentation fault in same_type_as since r14-9752

2024-04-28 Thread Paul Richard Thomas
Hi All,

Could this be looked at quickly? The timing of this regression is more than
a little embarrassing on the eve of the 14.1 release. The testcase and the
comment in gfc_trans_class_init_assign explain what this problem is all
about and how the patch fixes it.

OK for 15-branch and backporting to 14-branch (hopefully to the RC as well)?

Paul

Fortran: Fix regression caused by r14-9752 [PR114959]

2024-04-28  Paul Thomas  

gcc/fortran
PR fortran/114959
* trans-expr.cc (gfc_trans_class_init_assign): Return NULL_TREE
if the default initializer has all NULL fields. Guard this
by a requirement that the code be EXEC_INIT_ASSIGN and that the
object be an INTENT_OUT dummy.
* trans-stmt.cc (gfc_trans_allocate): Change the initializer
code for allocate with mold to EXEC_ALLOCATE to allow initializer
with all NULL fields.

gcc/testsuite/
PR fortran/114959
* gfortran.dg/pr114959.f90: New test.
diff --git a/gcc/fortran/trans-expr.cc b/gcc/fortran/trans-expr.cc
index 072adf3fe77..0280c441ced 100644
--- a/gcc/fortran/trans-expr.cc
+++ b/gcc/fortran/trans-expr.cc
@@ -1720,6 +1720,7 @@ gfc_trans_class_init_assign (gfc_code *code)
   gfc_se dst,src,memsz;
   gfc_expr *lhs, *rhs, *sz;
   gfc_component *cmp;
+  gfc_symbol *sym;
 
   gfc_start_block (&block);
 
@@ -1736,18 +1737,25 @@ gfc_trans_class_init_assign (gfc_code *code)
   /* The _def_init is always scalar.  */
   rhs->rank = 0;
 
-  /* Check def_init for initializers.  If this is a dummy with all default
- initializer components NULL, return NULL_TREE and use the passed value as
- required by F2018(8.5.10).  */
-  if (!lhs->ref && lhs->symtree->n.sym->attr.dummy)
+  /* Check def_init for initializers.  If this is an INTENT(OUT) dummy with all
+ default initializer components NULL, return NULL_TREE and use the passed
+ value as required by F2018(8.5.10).  */
+  sym = code->expr1->expr_type == EXPR_VARIABLE ? code->expr1->symtree->n.sym
+		: NULL;
+  if (code->op != EXEC_ALLOCATE
+  && sym && sym->attr.dummy
+  && sym->attr.intent == INTENT_OUT)
 {
-  cmp = rhs->ref->next->u.c.component->ts.u.derived->components;
-  for (; cmp; cmp = cmp->next)
+  if (!lhs->ref && lhs->symtree->n.sym->attr.dummy)
 	{
-	  if (cmp->initializer)
-	break;
-	  else if (!cmp->next)
-	return build_empty_stmt (input_location);
+	  cmp = rhs->ref->next->u.c.component->ts.u.derived->components;
+	  for (; cmp; cmp = cmp->next)
+	{
+	  if (cmp->initializer)
+		break;
+	  else if (!cmp->next)
+		return NULL_TREE;
+	}
 	}
 }
 
diff --git a/gcc/fortran/trans-stmt.cc b/gcc/fortran/trans-stmt.cc
index c34e0b4c0cd..d355009fa5e 100644
--- a/gcc/fortran/trans-stmt.cc
+++ b/gcc/fortran/trans-stmt.cc
@@ -7262,11 +7262,12 @@ gfc_trans_allocate (gfc_code * code, gfc_omp_namelist *omp_allocate)
 	{
 	  /* Use class_init_assign to initialize expr.  */
 	  gfc_code *ini;
-	  ini = gfc_get_code (EXEC_INIT_ASSIGN);
+	  ini = gfc_get_code (EXEC_ALLOCATE);
 	  ini->expr1 = gfc_find_and_cut_at_last_class_ref (expr, true);
 	  tmp = gfc_trans_class_init_assign (ini);
 	  gfc_free_statements (ini);
-	  gfc_add_expr_to_block (&block, tmp);
+	  if (tmp != NULL_TREE)
+	gfc_add_expr_to_block (&block, tmp);
 	}
   else if ((init_expr = allocate_get_initializer (code, expr)))
 	{
diff --git a/gcc/testsuite/gfortran.dg/pr114959.f90 b/gcc/testsuite/gfortran.dg/pr114959.f90
new file mode 100644
index 000..5cc3c052c1d
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr114959.f90
@@ -0,0 +1,33 @@
+! { dg-do compile }
+! { dg-options "-fdump-tree-original" }
+!
+! Fix the regression caused by r14-9752 (fix for PR112407)
+! Contributed by Orion Poplawski  
+! Problem isolated by Jakub Jelinek   and further
+! reduced here.
+!
+module m
+  type :: smoother_type
+integer :: i
+  end type
+  type :: onelev_type
+class(smoother_type), allocatable :: sm
+class(smoother_type), allocatable :: sm2a
+  end type
+contains
+  subroutine save_smoothers(level,save1, save2)
+Implicit None
+type(onelev_type), intent(inout) :: level
+class(smoother_type), allocatable , intent(inout) :: save1, save2
+integer(4) :: info
+
+info  = 0
+! r14-9752 causes the 'stat' declaration from the first ALLOCATE statement
+! to disappear, which triggers an ICE in gimplify_var_or_parm_decl. The
+! second ALLOCATE statement has to be present for the ICE to occur.
+allocate(save1, mold=level%sm,stat=info)
+allocate(save2, mold=level%sm2a,stat=info)
+  end subroutine save_smoothers
+end module m
+! Two 'stat's from the allocate statements and two from the final wrapper.
+! { dg-final { scan-tree-dump-times "integer\\(kind..\\) stat" 4 "original" } }


[PATCH] RISC-V: Fix parsing of Zic* extensions

2024-04-28 Thread Christoph Müllner
The extension parsing table entries for a range of Zic* extensions
do not match the mask definitions in riscv.opt.
This results in broken TARGET_ZIC* macros, because the values of
riscv_zi_subext and riscv_zicmo_subext are set incorrectly.

This patch fixes this by moving Zic64b into riscv_zicmo_subext
and all other affected Zic* extensions to riscv_zi_subext.
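
The failure mode can be sketched in a few lines.  Everything below is a
simplified model: the options struct, the mask values and the helper names
are hypothetical, chosen only to show what happens when a table entry
points at a different variable than the one its mask was defined for.

```cpp
#include <cassert>

// Hypothetical stand-in for the two option variables.
struct options
{
  int zi_subext = 0;
  int zicmo_subext = 0;
};

// Mask bits are allocated per variable, so equal values in different
// variables mean different extensions.  Values here are illustrative.
constexpr int MASK_ZICCIF = 1 << 1;  // defined for zi_subext
constexpr int MASK_ZICBOM = 1 << 1;  // defined for zicmo_subext

// One row of a parsing table: which variable to touch and which bit.
struct ext_flag
{
  const char *name;
  int options::*var;
  int mask;
};

inline void enable (options &o, const ext_flag &e)
{
  assert (e.mask != 0);
  o.*(e.var) |= e.mask;
}

// Model of the TARGET_* macros, which read the variable named in the
// mask definition.
inline bool target_ziccif (const options &o)
{ return (o.zi_subext & MASK_ZICCIF) != 0; }

inline bool target_zicbom (const options &o)
{ return (o.zicmo_subext & MASK_ZICBOM) != 0; }
```

If the row for "ziccif" names zicmo_subext, target_ziccif stays false and
the bit may alias an unrelated extension in the other variable; naming
zi_subext makes the table and the macro agree.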

gcc/ChangeLog:

* common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
zicclsm, and ziccrse into riscv_zi_subext.
* config/riscv/riscv.opt: Define MASK_ZIC64B for
riscv_zicmo_subext.

Signed-off-by: Christoph Müllner 
---
 gcc/common/config/riscv/riscv-common.cc | 8 
 gcc/config/riscv/riscv.opt  | 4 ++--
 2 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/gcc/common/config/riscv/riscv-common.cc 
b/gcc/common/config/riscv/riscv-common.cc
index 43b7549e3ec..8cc0e727737 100644
--- a/gcc/common/config/riscv/riscv-common.cc
+++ b/gcc/common/config/riscv/riscv-common.cc
@@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t 
riscv_ext_flag_table[] =
 
   {"zihintntl", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTNTL},
   {"zihintpause", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
+  {"ziccamoa", &gcc_options::x_riscv_zi_subext, MASK_ZICCAMOA},
+  {"ziccif", &gcc_options::x_riscv_zi_subext, MASK_ZICCIF},
+  {"zicclsm", &gcc_options::x_riscv_zi_subext, MASK_ZICCLSM},
+  {"ziccrse", &gcc_options::x_riscv_zi_subext, MASK_ZICCRSE},
 
   {"zicboz", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOZ},
   {"zicbom", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOM},
   {"zicbop", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOP},
   {"zic64b", &gcc_options::x_riscv_zicmo_subext, MASK_ZIC64B},
-  {"ziccamoa", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCAMOA},
-  {"ziccif", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCIF},
-  {"zicclsm", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCLSM},
-  {"ziccrse", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCRSE},
 
   {"zve32x",   &gcc_options::x_target_flags, MASK_VECTOR},
   {"zve32f",   &gcc_options::x_target_flags, MASK_VECTOR},
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index b14888e9816..ee824756381 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext)
 
 Mask(ZICOND)  Var(riscv_zi_subext)
 
-Mask(ZIC64B)  Var(riscv_zi_subext)
-
 Mask(ZICCAMOA)Var(riscv_zi_subext)
 
 Mask(ZICCIF)  Var(riscv_zi_subext)
@@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext)
 
 Mask(ZICBOP) Var(riscv_zicmo_subext)
 
+Mask(ZIC64B) Var(riscv_zicmo_subext)
+
 TargetVariable
 int riscv_zf_subext
 
-- 
2.44.0



[PATCH v3 2/2] lto-wrapper: Truncate files using -truncate driver option [PR110710]

2024-04-28 Thread Peter Damianov
This commit changes the Makefiles generated by lto-wrapper to no longer use
the "mv" and "touch" shell commands. These don't exist on Windows, so when the
Makefile attempts to call them, it results in errors like:
The system cannot find the file specified.

This problem only manifested when calling gcc from cmd.exe, and having no
sh.exe present on the PATH. The Windows port of GNU Make searches the PATH for
an sh.exe, and uses it if present.

I have tested this in environments with and without sh.exe on the PATH and
confirmed it works as expected.

Signed-off-by: Peter Damianov 
---
 gcc/lto-wrapper.cc | 6 ++
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/gcc/lto-wrapper.cc b/gcc/lto-wrapper.cc
index 02579951569..cfded757f26 100644
--- a/gcc/lto-wrapper.cc
+++ b/gcc/lto-wrapper.cc
@@ -2023,14 +2023,12 @@ cont:
  fprintf (mstream, "%s:\n\t@%s ", output_name, new_argv[0]);
  for (j = 1; new_argv[j] != NULL; ++j)
fprintf (mstream, " '%s'", new_argv[j]);
- fprintf (mstream, "\n");
  /* If we are not preserving the ltrans input files then
 truncate them as soon as we have processed it.  This
 reduces temporary disk-space usage.  */
  if (! save_temps)
-   fprintf (mstream, "\t@-touch -r \"%s\" \"%s.tem\" > /dev/null "
-"2>&1 && mv \"%s.tem\" \"%s\"\n",
-input_name, input_name, input_name, input_name); 
+   fprintf (mstream, " -truncate '%s'", input_name);
+ fprintf (mstream, "\n");
}
  else
{
-- 
2.39.2



[PATCH v3 1/2] Driver: Add new -truncate option

2024-04-28 Thread Peter Damianov
This commit adds a new option to the driver that truncates one file after
linking.

Tested like so:

$ gcc hello.c -c
$ du -h hello.o
4.0K  hello.o
$ gcc hello.o -truncate hello.o
$ ./a.out
Hello world
$ du -h hello.o
0   hello.o

$ gcc hello.o -truncate
gcc: error: missing filename after '-truncate'

The motivation for adding this is PR110710. It is used by lto-wrapper to
truncate files in a shell-independent manner.
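
For readers unfamiliar with the underlying call, the effect of the option
can be sketched outside the driver as follows.  This is only an
illustration assuming a POSIX truncate(); the helper names are made up for
the example and are not part of the patch.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <sys/stat.h>
#include <unistd.h>

// Return the size of PATH in bytes, or -1 if it cannot be queried.
long file_size (const char *path)
{
  assert (path != nullptr);
  struct stat st;
  if (stat (path, &st) != 0)
    return -1;
  return static_cast<long> (st.st_size);
}

// Create or overwrite PATH with TEXT; returns true on success.
bool write_text (const char *path, const char *text)
{
  std::FILE *f = std::fopen (path, "wb");
  if (!f)
    return false;
  std::size_t len = std::strlen (text);
  bool ok = std::fwrite (text, 1, len, f) == len;
  return std::fclose (f) == 0 && ok;
}

// Cut PATH down to zero bytes while leaving the file itself in place,
// which is what the driver does for the -truncate argument.
bool truncate_to_zero (const char *path)
{
  return truncate (path, 0) == 0;
}
```

The file keeps existing after the call, so anything that only checks for
its presence is unaffected while the disk space is released.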

Signed-off-by: Peter Damianov 
---
 gcc/common.opt |  6 ++
 gcc/gcc.cc | 14 ++
 2 files changed, 20 insertions(+)

diff --git a/gcc/common.opt b/gcc/common.opt
index ad348844775..40cab3cb36a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -422,6 +422,12 @@ Display target specific command line options (including 
assembler and linker opt
 -time
 Driver Alias(time)
 
+;; Truncate the file specified after linking.
+;; This option is used by lto-wrapper to reduce the peak disk-usage when
+;; linking with many .LTRANS units.
+truncate
+Driver Separate Undocumented MissingArgError(missing filename after %qs)
+
 -verbose
 Driver Alias(v)
 
diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index 728332b8153..830a4700a87 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -2138,6 +2138,10 @@ static int have_E = 0;
 /* Pointer to output file name passed in with -o. */
 static const char *output_file = 0;
 
+/* Pointer to input file name passed in with -truncate.
+   This file should be truncated after linking. */
+static const char *totruncate_file = 0;
+
 /* This is the list of suffixes and codes (%g/%u/%U/%j) and the associated
temp file.  If the HOST_BIT_BUCKET is used for %j, no entry is made for
it here.  */
@@ -4538,6 +4542,11 @@ driver_handle_option (struct gcc_options *opts,
   do_save = false;
   break;
 
+case OPT_truncate:
+  totruncate_file = arg;
+  do_save = false;
+  break;
+
 case OPT:
   /* "-###"
 This is similar to -v except that there is no execution
@@ -9286,6 +9295,11 @@ driver::final_actions () const
 delete_failure_queue ();
   delete_temp_files ();
 
+  if (totruncate_file != NULL && !seen_error ())
+/* Truncate file specified by -truncate.
+   Used by lto-wrapper to reduce temporary disk-space usage. */
+truncate(totruncate_file, 0);
+
   if (print_help_list)
 {
   printf (("\nFor bug reporting instructions, please see:\n"));
-- 
2.39.2



Re: [PATCH v3 1/2] Driver: Add new -truncate option

2024-04-28 Thread Peter0x44

I resubmitted the patch because the previous one had a mistake.

It didn't set "do_save" to false, so it resulted in problems like this:

./gcc/xgcc -truncate
xgcc: error: missing filename after ‘-truncate’
xgcc: fatal error: no input files

./gcc/xgcc -truncate ??
xgcc: error: unrecognized command-line option ‘-truncate’
xgcc: fatal error: no input files

This regressed some tests and meant the option did not work properly.
After fixing this, I ran all of the LTO tests again and observed no
failures.

I'm not sure how I ever observed it working before, but I'm reasonably
confident this is correct now.


[PATCH] Silence two instances of -Wcalloc-transposed-args

2024-04-28 Thread Peter Damianov
Signed-off-by: Peter Damianov 
---

Fixes these warnings:

../../gcc/gcc/../libgcc/libgcov-util.c: In function 'void tag_counters(unsigned 
int, int)':
../../gcc/gcc/../libgcc/libgcov-util.c:214:59: warning: 'void* calloc(size_t, 
size_t)' sizes specified with 'sizeof' in the earlier argument and not in the 
later argument [-Wcalloc-transposed-args]
  214 |   k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (sizeof 
(gcov_type),
  |   
^~
../../gcc/gcc/../libgcc/libgcov-util.c:214:59: note: earlier argument should 
specify number of elements, later size of each element

../../gcc/gcc/../libgcc/libgcov-util.c: In function 'void 
topn_to_memory_representation(gcov_ctr_info*)':
../../gcc/gcc/../libgcc/libgcov-util.c:529:43: warning: 'void* calloc(size_t, 
size_t)' sizes specified with 'sizeof' in the earlier argument and not in the 
later argument [-Wcalloc-transposed-args]
  529 | = (struct gcov_kvp *)xcalloc (sizeof (struct gcov_kvp), n);
  |   ^~~~
../../gcc/gcc/../libgcc/libgcov-util.c:529:43: note: earlier argument should 
specify number of elements, later size of each element

I think this can be applied as obvious.
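
For reference, the argument order the warning asks for looks like this in
isolation.  The kvp struct and alloc_chain are hypothetical names loosely
modelled on the second hunk, and plain std::calloc stands in for xcalloc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical key/value node, loosely modelled on struct gcov_kvp.
struct kvp
{
  long value;
  long count;
  kvp *next;
};

// Allocate N zeroed nodes and link them into a chain.  The element
// count goes first and the element size second.
kvp *alloc_chain (std::size_t n)
{
  assert (n > 0);
  kvp *tuples = static_cast<kvp *> (std::calloc (n, sizeof (kvp)));
  if (!tuples)
    return nullptr;
  for (std::size_t i = 0; i + 1 < n; i++)
    tuples[i].next = &tuples[i + 1];
  return tuples;
}
```

Both orders request the same number of bytes, so the change only silences
the diagnostic and matches the documented parameter meaning.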

 libgcc/libgcov-util.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/libgcc/libgcov-util.c b/libgcc/libgcov-util.c
index ba4b90a480d..f443408c4ab 100644
--- a/libgcc/libgcov-util.c
+++ b/libgcc/libgcov-util.c
@@ -211,8 +211,8 @@ tag_counters (unsigned tag, int length)
   gcc_assert (k_ctrs[tag_ix].num == 0);
   k_ctrs[tag_ix].num = n_counts;
 
-  k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (sizeof (gcov_type),
- n_counts);
+  k_ctrs[tag_ix].values = values = (gcov_type *) xcalloc (n_counts,
+ sizeof (gcov_type));
   gcc_assert (values);
 
   if (length > 0)
@@ -526,7 +526,7 @@ topn_to_memory_representation (struct gcov_ctr_info *info)
   if (n > 0)
{
  struct gcov_kvp *tuples
-   = (struct gcov_kvp *)xcalloc (sizeof (struct gcov_kvp), n);
+   = (struct gcov_kvp *)xcalloc (n, sizeof (struct gcov_kvp));
  for (unsigned i = 0; i < n - 1; i++)
tuples[i].next = &tuples[i + 1];
  for (unsigned i = 0; i < n; i++)
-- 
2.39.2



Re: [PATCH] config-ml.in: Fix multi-os-dir search

2024-04-28 Thread YunQiang Su
On Wed, Jan 3, 2024 at 01:00, Jeff Law wrote:
>
>
>
> On 1/1/24 09:48, YunQiang Su wrote:
> > When building multilib libraries, CC/CXX etc are set with an option
> > -B*/lib/, instead of -B/lib/.
> > This will make some trouble in some case, for example building
> > cross toolchain based on Debian's cross packages:
> >
> >If we have libc6-dev-i386-amd64-cross packages installed on
> >a non-x86 machine. This package will have the files in
> >/usr/x86_4-linux-gnu/lib32.  The fellow configure will fail
> >when build libgcc for i386, with complains the libc is not
> >i386 ones:
> >   ../configure --enable-multilib --enable-multilib \
> >  --target=x86_64-linux-gnu
> >
> > Let's insert a "-B*/lib/`CC ${flags} --print-multi-os-directory`"
> > before "-B*/lib/".
> >
> > This patch is based on the patch used by Debian now.
> >
> > ChangeLog
> >
> >   * config-ml.in: Insert an -B option with multi-os-dir into
> >   compiler commands used to build libraries.
> I would prefer this to wait for gcc-15.   I'll go ahead and ACK it for
> gcc-15 though.
>

I noticed that the gcc-14 branch has been created and that basever is
now 15.0.
Is it time for this patch now?

> What would also be valuable would be to extract out the rest of the
> multiarch patches from the Debian patches and get those into into GCC
> proper.
>
> Jeff


Re: [PATCH v2] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-04-28 Thread YunQiang Su
On Tue, Mar 26, 2024 at 18:10, Xi Ruoyao wrote:
>
> On Tue, 2024-03-26 at 11:15 +0800, YunQiang Su wrote:
>
> /* snip */
>
> > With -ffinite-math-only -fno-signed-zeros, it does work with
> > x >= y ? x : y
> > while without `-ffinite-math-only -fno-signed-zeros`, it cannot.
> > @Xi Ruoyao Is it expected by IEEE?
>
> When y is (quiet) NaN and x is not, fmax(x, y) should produce x but x >=
> y ? x : y should produce y.  Thus -ffinite-math-only is needed.
>
> When x is +0.0 and y is -0.0, x >= y ? x : y should produce +0.0 but
> fmax(x, y) may produce +0.0 or -0.0 (IEEE allows both and I don't see a
> more strict requirement in MIPS 6.06 manual either).  Thus -fno-signed-
> zeros is needed.
>

Yes, MIPS 6.06 requires that `max.f Y,+0,-0` produce +0.
There is a table after the description of the max.fmt instruction,
namely Table 4.1 Special Cases for FP MAX, MIN, MAXA, MINA.
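
The two differences discussed above can be checked with a short sketch.
It assumes IEEE 754 semantics and a build without -ffast-math or related
flags; ternary_max and demo_differences are names made up for the example.

```cpp
#include <cassert>
#include <cmath>

// The source-level pattern under discussion.
inline double ternary_max (double x, double y)
{
  return x >= y ? x : y;
}

// Shows where the ternary form and fmax are allowed to differ.
void demo_differences ()
{
  const double nan = std::nan ("");

  // A comparison with NaN is false, so the ternary form returns y (NaN),
  // while fmax returns the non-NaN operand.
  assert (std::isnan (ternary_max (1.0, nan)));
  assert (std::fmax (1.0, nan) == 1.0);

  // +0.0 and -0.0 compare equal, so the ternary form always returns x
  // and the sign of the result depends on operand order.
  assert (!std::signbit (ternary_max (+0.0, -0.0)));
  assert (std::signbit (ternary_max (-0.0, +0.0)));
}
```

No assertion is made about the sign of fmax (+0.0, -0.0), since the C
library is allowed to return either zero.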

> --
> Xi Ruoyao 
> School of Aerospace Science and Technology, Xidian University


Re: [PATCH v3] MIPS: Add MIN/MAX.fmt instructions support for MIPS R6

2024-04-28 Thread YunQiang Su
I will apply this patch.
However, we still have a problem with
```
float max(float a, float b) { return a>=b?a:b; }
```
If it is compiled with `-ffinite-math-only -fsigned-zeros -O2
-mips32r6 -mabi=32`,
`max.s` can be used.

The max.fmt/min.fmt of MIPSr6 can process +0/-0 correctly.


Re: [PATCH] RISC-V: Fix parsing of Zic* extensions

2024-04-28 Thread Kito Cheng
OK for trunk, and my understanding is that these flags aren't really used
in code gen yet, so it's not necessary to backport to the GCC 14 branch?

On Mon, Apr 29, 2024 at 7:05 AM Christoph Müllner
 wrote:
>
> The extension parsing table entries for a range of Zic* extensions
> does not match the mask definition in riscv.opt.
> This results in broken TARGET_ZIC* macros, because the values of
> riscv_zi_subext and riscv_zicmo_subext are set wrong.
>
> This patch fixes this by moving Zic64b into riscv_zicmo_subext
> and all other affected Zic* extensions to riscv_zi_subext.
>
> gcc/ChangeLog:
>
> * common/config/riscv/riscv-common.cc: Move ziccamoa, ziccif,
> zicclsm, and ziccrse into riscv_zi_subext.
> * config/riscv/riscv.opt: Define MASK_ZIC64B for
> riscv_ziccmo_subext.
>
> Signed-off-by: Christoph Müllner 
> ---
>  gcc/common/config/riscv/riscv-common.cc | 8 
>  gcc/config/riscv/riscv.opt  | 4 ++--
>  2 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/gcc/common/config/riscv/riscv-common.cc 
> b/gcc/common/config/riscv/riscv-common.cc
> index 43b7549e3ec..8cc0e727737 100644
> --- a/gcc/common/config/riscv/riscv-common.cc
> +++ b/gcc/common/config/riscv/riscv-common.cc
> @@ -1638,15 +1638,15 @@ static const riscv_ext_flag_table_t 
> riscv_ext_flag_table[] =
>
>{"zihintntl", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTNTL},
>{"zihintpause", &gcc_options::x_riscv_zi_subext, MASK_ZIHINTPAUSE},
> +  {"ziccamoa", &gcc_options::x_riscv_zi_subext, MASK_ZICCAMOA},
> +  {"ziccif", &gcc_options::x_riscv_zi_subext, MASK_ZICCIF},
> +  {"zicclsm", &gcc_options::x_riscv_zi_subext, MASK_ZICCLSM},
> +  {"ziccrse", &gcc_options::x_riscv_zi_subext, MASK_ZICCRSE},
>
>{"zicboz", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOZ},
>{"zicbom", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOM},
>{"zicbop", &gcc_options::x_riscv_zicmo_subext, MASK_ZICBOP},
>{"zic64b", &gcc_options::x_riscv_zicmo_subext, MASK_ZIC64B},
> -  {"ziccamoa", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCAMOA},
> -  {"ziccif", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCIF},
> -  {"zicclsm", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCLSM},
> -  {"ziccrse", &gcc_options::x_riscv_zicmo_subext, MASK_ZICCRSE},
>
>{"zve32x",   &gcc_options::x_target_flags, MASK_VECTOR},
>{"zve32f",   &gcc_options::x_target_flags, MASK_VECTOR},
> diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
> index b14888e9816..ee824756381 100644
> --- a/gcc/config/riscv/riscv.opt
> +++ b/gcc/config/riscv/riscv.opt
> @@ -237,8 +237,6 @@ Mask(ZIHINTPAUSE) Var(riscv_zi_subext)
>
>  Mask(ZICOND)  Var(riscv_zi_subext)
>
> -Mask(ZIC64B)  Var(riscv_zi_subext)
> -
>  Mask(ZICCAMOA)Var(riscv_zi_subext)
>
>  Mask(ZICCIF)  Var(riscv_zi_subext)
> @@ -390,6 +388,8 @@ Mask(ZICBOM) Var(riscv_zicmo_subext)
>
>  Mask(ZICBOP) Var(riscv_zicmo_subext)
>
> +Mask(ZIC64B) Var(riscv_zicmo_subext)
> +
>  TargetVariable
>  int riscv_zf_subext
>
> --
> 2.44.0
>
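
For readers who want to see why a table entry that names the wrong
variable breaks the TARGET_ZIC* tests, here is a minimal standalone C
sketch. Everything in it is a simplified stand-in: `struct opts`, the bit
values, the `TARGET_*` macros, `enable_ext` and both tables are
hypothetical, and the assumption that masks of different variables can
share a bit position is mine, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for gcc_options: one int per extension group.  */
struct opts { int zi_subext; int zicmo_subext; };

/* Hypothetical bit values.  Each mask is only meaningful for the
   variable it was declared against.  */
#define MASK_ZICCIF (1 << 0)	/* declared against zi_subext */
#define MASK_ZICBOZ (1 << 0)	/* declared against zicmo_subext */

#define TARGET_ZICCIF(o) (((o)->zi_subext & MASK_ZICCIF) != 0)
#define TARGET_ZICBOZ(o) (((o)->zicmo_subext & MASK_ZICBOZ) != 0)

struct ext_flag { const char *ext; size_t var_offset; int mask; };

/* Mismatch: the mask belongs to zi_subext but the entry names
   zicmo_subext.  */
const struct ext_flag broken_table[] = {
  {"ziccif", offsetof (struct opts, zicmo_subext), MASK_ZICCIF},
};

/* Entry and mask definition agree.  */
const struct ext_flag fixed_table[] = {
  {"ziccif", offsetof (struct opts, zi_subext), MASK_ZICCIF},
};

/* Set the mask of every entry named NAME in the variable it points at.  */
void
enable_ext (struct opts *o, const struct ext_flag *table, size_t n,
	    const char *name)
{
  for (size_t i = 0; i < n; i++)
    if (strcmp (table[i].ext, name) == 0)
      {
	int *var = (int *) ((char *) o + table[i].var_offset);
	*var |= table[i].mask;
      }
}

/* Hypothetical self-check: with the broken table the extension's own
   test stays false and an unrelated test turns true.  */
void
check_tables (void)
{
  struct opts a = {0, 0}, b = {0, 0};
  enable_ext (&a, broken_table, 1, "ziccif");
  assert (!TARGET_ZICCIF (&a) && TARGET_ZICBOZ (&a));
  enable_ext (&b, fixed_table, 1, "ziccif");
  assert (TARGET_ZICCIF (&b) && !TARGET_ZICBOZ (&b));
}
```

In this toy model, enabling "ziccif" through the broken table leaves its
own test false and, because of the assumed shared bit, switches on an
unrelated one; the fixed table behaves as expected.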


Re: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model

2024-04-28 Thread juzhe.zh...@rivai.ai
Hi, Han.

GCC 14 has branched. You can commit it to trunk (GCC 15).



juzhe.zh...@rivai.ai
 
From: demin.han
Date: 2024-04-02 16:30
To: gcc-patches
CC: juzhe.zhong; kito.cheng; pan2.li; jeffreyalaw; rdapp.gcc
Subject: [PATCH v2] RISC-V: Refine the condition for add additional vars in RVV cost model
adjacent_dr_p is a sufficient but not a necessary condition for contiguous
access, so unnecessary live ranges are added, which results in a smaller LMUL.
 
This patch uses MEMORY_ACCESS_TYPE as the condition and constrains segment
load/store.
 
Tested on RV64 with no regressions.
 
PR target/114506
 
gcc/ChangeLog:
 
* config/riscv/riscv-vector-costs.cc (non_contiguous_memory_access_p): Rename
(need_additional_vector_vars_p): Rename and refine condition
 
gcc/testsuite/ChangeLog:
 
* gcc.dg/vect/costmodel/riscv/rvv/pr114506.c: New test.
 
Signed-off-by: demin.han 
---
V2 changes:
  1. remove max_point issue
  2. minor change in commit message
 
gcc/config/riscv/riscv-vector-costs.cc| 23 ---
.../vect/costmodel/riscv/rvv/pr114506.c   | 23 +++
2 files changed, 38 insertions(+), 8 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
 
diff --git a/gcc/config/riscv/riscv-vector-costs.cc 
b/gcc/config/riscv/riscv-vector-costs.cc
index f462c272a6e..484196b15b4 100644
--- a/gcc/config/riscv/riscv-vector-costs.cc
+++ b/gcc/config/riscv/riscv-vector-costs.cc
@@ -563,14 +563,24 @@ get_store_value (gimple *stmt)
 return gimple_assign_rhs1 (stmt);
}
-/* Return true if it is non-contiguous load/store.  */
+/* Return true if addtional vector vars needed.  */
static bool
-non_contiguous_memory_access_p (stmt_vec_info stmt_info)
+need_additional_vector_vars_p (stmt_vec_info stmt_info)
{
   enum stmt_vec_info_type type
 = STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
-  return ((type == load_vec_info_type || type == store_vec_info_type)
-   && !adjacent_dr_p (STMT_VINFO_DATA_REF (stmt_info)));
+  if (type == load_vec_info_type || type == store_vec_info_type)
+{
+  if (STMT_VINFO_GATHER_SCATTER_P (stmt_info)
+   && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info) == VMAT_GATHER_SCATTER)
+ return true;
+
+  machine_mode mode = TYPE_MODE (STMT_VINFO_VECTYPE (stmt_info));
+  int lmul = riscv_get_v_regno_alignment (mode);
+  if (DR_GROUP_SIZE (stmt_info) * lmul > RVV_M8)
+ return true;
+}
+  return false;
}
/* Return the LMUL of the current analysis.  */
@@ -739,10 +749,7 @@ update_local_live_ranges (
  stmt_vec_info stmt_info = vinfo->lookup_stmt (gsi_stmt (si));
  enum stmt_vec_info_type type
= STMT_VINFO_TYPE (vect_stmt_to_vectorize (stmt_info));
-   if (non_contiguous_memory_access_p (stmt_info)
-   /* LOAD_LANES/STORE_LANES doesn't need a perm indice.  */
-   && STMT_VINFO_MEMORY_ACCESS_TYPE (stmt_info)
-!= VMAT_LOAD_STORE_LANES)
+   if (need_additional_vector_vars_p (stmt_info))
{
  /* For non-adjacent load/store STMT, we will potentially
convert it into:
diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c 
b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
new file mode 100644
index 000..a88d24b2d2d
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/pr114506.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gcv -mabi=lp64d -O3 -ftree-vectorize 
-mrvv-max-lmul=dynamic -fdump-tree-vect-details" } */
+
+float a[32000], b[32000], c[32000], d[32000];
+float aa[256][256], bb[256][256], cc[256][256];
+
+void
+s2275 ()
+{
+  for (int i = 0; i < 256; i++)
+{
+  for (int j = 0; j < 256; j++)
+ {
+   aa[j][i] = aa[j][i] + bb[j][i] * cc[j][i];
+ }
+  a[i] = b[i] + c[i] * d[i];
+}
+}
+
+/* { dg-final { scan-assembler-times {e32,m8} 1 } } */
+/* { dg-final { scan-assembler-not {e32,m4} } } */
+/* { dg-final { scan-assembler-not {csrr} } } */
+/* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it 
has unexpected spills" "vect" } } */
-- 
2.44.0
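
As a reading aid, the refined condition in the patch above can be modelled
outside of GCC roughly as follows. This is only a sketch under stated
assumptions: `struct mem_stmt`, `enum access_kind`, `MAX_LMUL` and
`needs_additional_vector_vars` are hypothetical stand-ins for the
stmt_vec_info accessors and RVV_M8 used in the real code, and the model
ignores everything else the cost model looks at.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the vectorizer's memory access types.  */
enum access_kind
{
  ACCESS_CONTIGUOUS,
  ACCESS_GATHER_SCATTER,
  ACCESS_LOAD_STORE_LANES
};

/* Hypothetical stand-in for RVV_M8, the largest register group.  */
#define MAX_LMUL 8

/* Hypothetical summary of the fields the patch inspects.  */
struct mem_stmt
{
  bool is_load_or_store;
  enum access_kind kind;
  int group_size;	/* models DR_GROUP_SIZE */
  int lmul;		/* models riscv_get_v_regno_alignment (mode) */
};

/* Model of the new predicate: extra vector variables are assumed for
   gather/scatter accesses, and for groups that would need more than
   MAX_LMUL registers in total.  */
bool
needs_additional_vector_vars (const struct mem_stmt *s)
{
  if (!s->is_load_or_store)
    return false;
  if (s->kind == ACCESS_GATHER_SCATTER)
    return true;
  return s->group_size * s->lmul > MAX_LMUL;
}

/* Hypothetical self-check of the boundary at MAX_LMUL.  */
void
check_predicate (void)
{
  struct mem_stmt fits = {true, ACCESS_LOAD_STORE_LANES, 1, 8};
  struct mem_stmt too_big = {true, ACCESS_LOAD_STORE_LANES, 2, 8};
  assert (!needs_additional_vector_vars (&fits));
  assert (needs_additional_vector_vars (&too_big));
}
```

In this model a single access at LMUL 8 adds no extra variables, while a
group of two at LMUL 8 does, because the product exceeds `MAX_LMUL`.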
 
 


Re: [PATCH] adjust vectorization expectations for ppc costmodel 76b

2024-04-28 Thread Alexandre Oliva
On Apr 28, 2024, "Kewen.Lin"  wrote:

> Nit: Maybe add a prefix "testsuite: ".

ACK

>> 
>> From: Kewen Lin 

> Thanks, you can just drop this.  :)

I've turned it into Co-Authored-By, since you insist.

But unfortunately with the patch it still fails when testing for
-mcpu=power7 on ppc64le-linux-gnu: it does vectorize the loop with 13
iterations.  We need 16 iterations, as in an earlier version of this
test, for it to pass for -mcpu=power7, but then it doesn't pass for
-mcpu=power6.

It looks like we're going to have to adjust the expectations.

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH] ppc: testsuite: pr79004 needs -mlong-double-128

2024-04-28 Thread Alexandre Oliva
On Apr 28, 2024, "Kewen.Lin"  wrote:

> OK, from this perspective IMHO it seems more clear to adopt xfail
> with effective target long_double_64bit?

*nod*, yeah, that makes sense.

I'm going to travel this week, to speak at FSF's LibrePlanet conference,
so I'll look into massaging the patch into that when I get back, if you
haven't rendered it obsolete by then ;-)

Thanks,

-- 
Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
   Free Software Activist                   GNU Toolchain Engineer
More tolerance and less prejudice are key for inclusion and diversity
Excluding neuro-others for not behaving "normal" is *not* inclusive


Re: [PATCH v1] RISC-V: Fix ICE for legitimize move on subreg const_poly_move

2024-04-28 Thread Kito Cheng
> diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
> index 0519e0679ed..bad23ea487f 100644
> --- a/gcc/config/riscv/riscv.cc
> +++ b/gcc/config/riscv/riscv.cc
> @@ -2786,6 +2786,44 @@ riscv_v_adjust_scalable_frame (rtx target, poly_int64 
> offset, bool epilogue)
>REG_NOTES (insn) = dwarf;
> }
> +/* Take care below subreg const_poly_int move:
> +
> +   1. (set (subreg:DI (reg:TI 237) 8)
> +(subreg:DI (const_poly_int:TI [4, 2]) 8))
> +  =>
> +  (set (subreg:DI (reg:TI 237) 8)
> +(const_int 0)) */
> +
> +static bool
> +riscv_legitimize_subreg_const_poly_move (machine_mode mode, rtx dest, rtx 
> src)
> +{
> +  gcc_assert (SUBREG_P (src) && CONST_POLY_INT_P (SUBREG_REG (src)));
> +  gcc_assert (SUBREG_BYTE (src).is_constant ());
> +
> +  int byte_offset = SUBREG_BYTE (src).to_constant ();
> +  rtx const_poly = SUBREG_REG (src);
> +  machine_mode subreg_mode = GET_MODE (const_poly);
> +
> +  if (subreg_mode != TImode) /* Only TImode is needed for now.  */
> +return false;
> +
> +  if (byte_offset == 8)
> +{ /* The const_poly_int cannot exceed int64, just set zero here.  */

{
 /* The const_poly_int cannot exceed int64, just set zero here.  */

Please put the comment on its own line after the opening brace.

> +  emit_move_insn (dest, CONST0_RTX (mode));
> +  return true;
> +}
> +
> +  /* The below transform will be covered in somewhere else.
> + Thus, ignore this here.
> +   1. (set (subreg:DI (reg:TI 237) 0)
> +(subreg:DI (const_poly_int:TI [4, 2]) 0))
> +  =>
> +  (set (subreg:DI (reg:TI 237) 0)
> +(const_poly_int:DI [4, 2])) */
> +
> +  return false;
> +}
> +
> /* If (set DEST SRC) is not a valid move instruction, emit an equivalent
> sequence that is valid.  */
> @@ -2839,6 +2877,11 @@ riscv_legitimize_move (machine_mode mode, rtx dest, 
> rtx src)
> }
>return true;
>  }
> +
> +  if (SUBREG_P (src) && CONST_POLY_INT_P (SUBREG_REG (src))
> +&& riscv_legitimize_subreg_const_poly_move (mode, dest, src))
> +return true;
> +
>/* Expand
> (set (reg:DI target) (subreg:DI (reg:V8QI reg) 0))
>   Expand this data movement instead of simply forbid it since
> --
> 2.34.1
>
>


[PATCH] PHIOPT: Value-replacement check undef

2024-04-28 Thread Andrew Pinski
While moving the value-replacement part of PHIOPT over
to match-and-simplify, I ran into a case where a
conditional use of an undefined value would become
unconditional. This patch prevents that. I can't remember at this
point what the testcase was, though.

Bootstrapped and tested on x86_64-linux-gnu with no regressions.

gcc/ChangeLog:

* tree-ssa-phiopt.cc (value_replacement): Reject undef variables
so they don't become unconditionally used.

Signed-off-by: Andrew Pinski 
---
 gcc/tree-ssa-phiopt.cc | 7 +++
 1 file changed, 7 insertions(+)

diff --git a/gcc/tree-ssa-phiopt.cc b/gcc/tree-ssa-phiopt.cc
index a2bdcb5eae8..f166c3132cb 100644
--- a/gcc/tree-ssa-phiopt.cc
+++ b/gcc/tree-ssa-phiopt.cc
@@ -1146,6 +1146,13 @@ value_replacement (basic_block cond_bb, basic_block 
middle_bb,
   if (code != NE_EXPR && code != EQ_EXPR)
 return 0;
 
+  /* Do not make conditional undefs unconditional.  */
+  if ((TREE_CODE (arg0) == SSA_NAME
+   && ssa_name_maybe_undef_p (arg0))
+  || (TREE_CODE (arg1) == SSA_NAME
+ && ssa_name_maybe_undef_p (arg1)))
+return false;
+
   /* If the type says honor signed zeros we cannot do this
  optimization.  */
   if (HONOR_SIGNED_ZEROS (arg1))
-- 
2.43.0