Re: [PATCH] LoongArch: Change the value of macro TRY_EMPTY_VM_SPACE from 0x8000000000 to 0x1000000000.

2023-02-20 Thread Xi Ruoyao via Gcc-patches
On Tue, 2023-02-21 at 15:20 +0800, Lulu Cheng wrote:
> Cores like the LA264 have only 40 effective bits of virtual address space.

I'm OK with the change.  But the VA length is configurable when building
the kernel.  Is there any specific reason LA264 has to use the 40-bit
configuration, or should we reword the commit message like "for
supporting the configuration with fewer page table levels or smaller page
size"?

> When TRY_EMPTY_VM_SPACE is set to 0x8000000000, it just exceeds
> the range of 40-bit virtual addresses, causing the mmap mapping
> to fail, and thus causing the PCH function to fail. To be compatible
> with this situation, set the macro to 0x1000000000.
> 
> gcc/ChangeLog:
> 
> * config/host-linux.cc (TRY_EMPTY_VM_SPACE): Modify the value of
> the macro to 0x1000000000.
> ---
>  gcc/config/host-linux.cc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/gcc/config/host-linux.cc b/gcc/config/host-linux.cc
> index a891651a7b6..d1aa7ab28ca 100644
> --- a/gcc/config/host-linux.cc
> +++ b/gcc/config/host-linux.cc
> @@ -99,7 +99,7 @@
>  #elif defined(__riscv) && defined (__LP64__)
>  # define TRY_EMPTY_VM_SPACE0x10
>  #elif defined(__loongarch__) && defined(__LP64__)
> -# define TRY_EMPTY_VM_SPACE 0x8000000000
> +# define TRY_EMPTY_VM_SPACE 0x1000000000
>  #else
>  # define TRY_EMPTY_VM_SPACE0
>  #endif

-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Gcc-patches
On Mon, Feb 20, 2023 at 5:23 PM Tobias Burnus  wrote:
>
> Hi Richard, hi all,
>
> On 20.02.23 13:46, Richard Biener wrote:
> > +  /* TODO: A more middle-end friendly alternative would be to use NULL_TREE
> > +as upper bound and store the value, e.g. as GFC_DECL_STRING_LEN.
> > +Caveat: this requires some cleanup throughout the code to consistently
> > +use some wrapper function.  */
> > +  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
> > +  tree tmp = TREE_TYPE (TYPE_SIZE (eltype));
> >
> > ...
> >
> > you are probably breaking type sharing here.  You could use
> > build_array_type_1 and pass false for 'shared' to get around that.  Note
> > that there's also canonical type building done in case 'eltype' is not
> > canonical itself.
>
> My feeling is that this is already somewhat broken. Currently, there
> is one type per decl as each has its own artificial length variable.
> I have no idea how this will be handled in the ME in terms of alias
> analysis. And whether shared=false makes sense here and what effect
> it has. (Probably yes.)
>
> In principle,
>integer(kind=8) .str., .str2;
>character(kind=1)[1:.str] * str;
>character(kind=1)[1:.str2] * str2;
> have the same type and iff .str == .str2 at runtime, they can alias.
> Example:
>str2 = str;
>.str2 = .str;
>
> I have no idea how the type analysis currently works (with or without SAVE_EXPR)
> nor what effect shared=false has in this case.

alias analysis for array types looks only at the element type

> > The solution to the actual problem is a hack - you are relying on
> > re-evaluation of TYPE_SIZE, and for that, only from within accesses
> > from inside the frontend?
>
> I think this mostly helps with access inside the FE of the type 'size =
> TYPE_SIZE_UNIT(type)', which is used surprisingly often and is often
> directly evaluated (i.e. assigned to a temporary).

that's what I thought

> > Since gimplification will produce the result into a single temporary again,
> > re-storing the "breakage".
> > So, does it _really_ fix things?
>
> It does seem to fix cases which do 'size = TYPE_SIZE_UNIT (type);' in
> the front end and then use this size expression. Thus, those are fixed.
> However, there are many cases where things go wrong - with and without
> the patch. I keep discovering more and more :-(

I guess test coverage isn't too great with this feature then ;)

> * * *
>
> I still think that having NULL_TREE as upper value is the proper way and
> would be better in several ways, except that there is (too) much code

Yep.

> which relies on TYPE_SIZE_UNIT to work. (There are 117 occurrences).
> Additionally, there is more code making assumptions in this area.
>
> Thus, the question is whether it makes sense as a hackish partial solution
> or whether it should remain in the current broken state until it is
> fixed properly.

I wonder if it makes more sense to individually fix the places using
TYPE_SIZE_UNIT in a wrong way?  You'd also get only "partial"
fixes, but at least those will be true and good?

Otherwise I defer to frontend maintainers if they agree to put in
a (partially working) hack like this.

Richard.

> Tobias,
>
> who would like to have more time for fixing such issues.
>


[PATCH] LoongArch: Change the value of macro TRY_EMPTY_VM_SPACE from 0x8000000000 to 0x1000000000.

2023-02-20 Thread Lulu Cheng
Cores like the LA264 have only 40 effective bits of virtual address space.
When TRY_EMPTY_VM_SPACE is set to 0x8000000000, it just exceeds
the range of 40-bit virtual addresses, causing the mmap mapping
to fail, and thus causing the PCH function to fail. To be compatible
with this situation, set the macro to 0x1000000000.

gcc/ChangeLog:

* config/host-linux.cc (TRY_EMPTY_VM_SPACE): Modify the value of
the macro to 0x1000000000.
---
 gcc/config/host-linux.cc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/config/host-linux.cc b/gcc/config/host-linux.cc
index a891651a7b6..d1aa7ab28ca 100644
--- a/gcc/config/host-linux.cc
+++ b/gcc/config/host-linux.cc
@@ -99,7 +99,7 @@
 #elif defined(__riscv) && defined (__LP64__)
 # define TRY_EMPTY_VM_SPACE0x10
 #elif defined(__loongarch__) && defined(__LP64__)
-# define TRY_EMPTY_VM_SPACE 0x8000000000
+# define TRY_EMPTY_VM_SPACE 0x1000000000
 #else
 # define TRY_EMPTY_VM_SPACE0
 #endif
-- 
2.31.1
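
As a rough illustration of the arithmetic behind this change (a sketch, not code from the patch): assuming user space occupies the lower half of the virtual address range, a 40-bit configuration ends at 0x8000000000, so the old hint lies just outside it while the new one fits. The helper names `user_va_limit` and `hint_fits` are hypothetical.

```cpp
#include <cstdint>

// Assumption for illustration: user space is the lower half of a
// va_bits-wide virtual address range, i.e. [0, 2^(va_bits - 1)).
constexpr std::uint64_t user_va_limit(unsigned va_bits) {
  return std::uint64_t{1} << (va_bits - 1);
}

// Hypothetical helper: can a mapping start at the hinted address at all?
constexpr bool hint_fits(std::uint64_t hint, unsigned va_bits) {
  return hint < user_va_limit(va_bits);
}

// Old and new TRY_EMPTY_VM_SPACE values from the patch subject.
static_assert(!hint_fits(0x8000000000ULL, 40), "old hint is out of range");
static_assert(hint_fits(0x1000000000ULL, 40), "new hint is in range");
```

With a wider configuration (say 48 bits) both values fit under this model, which is consistent with the failure only showing up on 40-bit setups.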



Re: [PATCH] Fortran: improve checking of character length specification [PR96025]

2023-02-20 Thread Thomas Koenig via Gcc-patches

Hi Harald,


> the attached patch fixes an ICE on invalid (non-integer)
> specification expressions for character length in function
> declarations.  It appears that the error handling was
> already in place (mostly) and we need to essentially
> prevent run-on errors.
>
> Regtested on x86_64-pc-linux-gnu.  OK for mainline?

As a very minor matter of style, you might want to write

  function_result_typed = check_function_result_typed ();

instead of

  if (check_function_result_typed ())
    function_result_typed = true;

OK either way.
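
For readers weighing the two spellings: they agree whenever the flag is false beforehand, as at the ST_IMPLICIT call site in the patch, which is guarded by `if (!function_result_typed)`. A small sketch (function names are hypothetical, not from parse.cc) makes the one difference explicit.

```cpp
// Conditional form: "if (check) flag = true;" can raise the flag
// but never clears it.
constexpr bool conditional_set(bool flag, bool check_result) {
  return check_result ? true : flag;
}

// Direct form: "flag = check;" overwrites whatever was there.
constexpr bool direct_assign(bool /*flag*/, bool check_result) {
  return check_result;
}

// Starting from false, both forms give the same answer.
static_assert(conditional_set(false, true) == direct_assign(false, true), "");
static_assert(conditional_set(false, false) == direct_assign(false, false), "");
// They differ only if the flag was already true and the check fails.
static_assert(conditional_set(true, false) != direct_assign(true, false), "");
```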


> The PR is marked as a 10/11/12/13 regression, so I would
> like to backport this as far as it seems reasonable.


Also OK.

Thanks for the patch!

Best regards

Thomas


RE: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

2023-02-20 Thread Li, Pan2 via Gcc-patches
Hi,

Kind reminder for this PR.

Pan

-Original Message-
From: Li, Pan2 
Sent: Friday, February 17, 2023 4:39 PM
To: richard.sandif...@arm.com; juzhe.zhong 
Cc: incarnation.p@outlook.com; gcc-patches@gcc.gnu.org; 
kito.ch...@sifive.com; Richard Biener 
Subject: RE: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

Cool, thank you!

Hi Richard S,

Could you please do me a favor and help with this change when you are free? Thank you!

Pan

-Original Message-
From: Richard Biener 
Sent: Friday, February 17, 2023 3:36 PM
To: juzhe.zhong 
Cc: incarnation.p@outlook.com; gcc-patches@gcc.gnu.org; 
kito.ch...@sifive.com; Li, Pan2 ; richard.sandif...@arm.com
Subject: Re: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment

On Thu, 16 Feb 2023, juzhe.zhong wrote:

> Thanks for the great work to fix this issue for rvv. Hi, Richard. This
> is the patch to differentiate mask modes of the same bytesize. It adjusts the
> precision correctly according to the rvv isa. Would you mind helping us
> with this patch?
> It's very important for rvv support in gcc.

If adjusting the precision works fine then I suppose the patch looks 
reasonable.  I'll defer to Richard S. though since he's the one knowing the 
mode stuff better.  I'd have integrated the precision adjustment with the 
ADJUST_NITER hook since that is also documented to adjust the precision btw.

Richard.

> Thanks.
> ---- Replied Message ----
> From: incarnation.p@outlook.com
> Date: 02/16/2023 23:12
> To: gcc-patches@gcc.gnu.org
> Cc: juzhe.zh...@rivai.ai, kito.ch...@sifive.com, rguent...@suse.de, pan2...@intel.com
> Subject: [PATCH] RISC-V: Bugfix for rvv bool mode precision adjustment
> From: Pan Li 
> 
>    Fix the bug of the rvv bool mode precision with the adjustment.
>    The bit size of vbool*_t will be adjusted to
>    [1, 2, 4, 8, 16, 32, 64] according to the rvv spec 1.0 isa. The
>    adjusted mode precision of vbool*_t will help the underlying passes
>    make the right decision for both correctness and optimization.
> 
>    Given below sample code:
>    void test_1(int8_t * restrict in, int8_t * restrict out)
>    {
>      vbool8_t v2 = *(vbool8_t*)in;
>      vbool16_t v5 = *(vbool16_t*)in;
>      *(vbool16_t*)(out + 200) = v5;
>      *(vbool8_t*)(out + 100) = v2;
>    }
> 
>    Before the precision adjustment:
>    addi    a4,a1,100
>    vsetvli a5,zero,e8,m1,ta,ma
>    addi    a1,a1,200
>    vlm.v   v24,0(a0)
>    vsm.v   v24,0(a4)
>    // Need one vsetvli and vlm.v for correctness here.
>    vsm.v   v24,0(a1)
> 
>    After the precision adjustment:
>    csrr    t0,vlenb
>    slli    t1,t0,1
>    csrr    a3,vlenb
>    sub sp,sp,t1
>    slli    a4,a3,1
>    add a4,a4,sp
>    sub a3,a4,a3
>    vsetvli a5,zero,e8,m1,ta,ma
>    addi    a2,a1,200
>    vlm.v   v24,0(a0)
>    vsm.v   v24,0(a3)
>    addi    a1,a1,100
>    vsetvli a4,zero,e8,mf2,ta,ma
>    csrr    t0,vlenb
>    vlm.v   v25,0(a3)
>    vsm.v   v25,0(a2)
>    slli    t1,t0,1
>    vsetvli a5,zero,e8,m1,ta,ma
>    vsm.v   v24,0(a1)
>    add sp,sp,t1
>    jr  ra
> 
>    However, there may be some optimization opportunities after
>    the mode precision adjustment. They can be taken care of in
>    the RISC-V backend in separate follow-up PR(s).
> 
>    PR 108185
>    PR 108654
> 
> gcc/ChangeLog:
> 
>    * config/riscv/riscv-modes.def (ADJUST_PRECISION):
>    * config/riscv/riscv.cc (riscv_v_adjust_precision):
>    * config/riscv/riscv.h (riscv_v_adjust_precision):
>    * genmodes.cc (ADJUST_PRECISION):
>    (emit_mode_adjustments):
> 
> gcc/testsuite/ChangeLog:
> 
>    * gcc.target/riscv/pr108185-1.c: New test.
>    * gcc.target/riscv/pr108185-2.c: New test.
>    * gcc.target/riscv/pr108185-3.c: New test.
>    * gcc.target/riscv/pr108185-4.c: New test.
>    * gcc.target/riscv/pr108185-5.c: New test.
>    * gcc.target/riscv/pr108185-6.c: New test.
>    * gcc.target/riscv/pr108185-7.c: New test.
>    * gcc.target/riscv/pr108185-8.c: New test.
> 
> Signed-off-by: Pan Li 
> ---
> gcc/config/riscv/riscv-modes.def            |  8 +++
> gcc/config/riscv/riscv.cc                   | 12
> gcc/config/riscv/riscv.h                    |  1 +
> gcc/genmodes.cc                             | 25 ++-
> gcc/testsuite/gcc.target/riscv/pr108185-1.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-2.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-3.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-4.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-5.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-6.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-7.c | 68 ++
> gcc/testsuite/gcc.target/riscv/pr108185-8.c | 77 +
> 12 files changed, 598 insertions(+), 1 deletion(-)
> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-1.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/pr108185-2.c
> create mode 
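
To make the "same byte size, different precision" point in the quoted commit message concrete, here is a small sketch based on the RVV convention that vboolN_t carries one mask bit per element for SEW/LMUL = N, i.e. VLEN/N bits. The VLEN of 128 and the helper names are assumptions for illustration and are not part of the patch.

```cpp
// Number of meaningful mask bits in vboolN_t for a given VLEN.
constexpr unsigned mask_bits(unsigned vlen_bits, unsigned n) {
  return vlen_bits / n;
}

// Storage needed for those bits, rounded up to whole bytes.
constexpr unsigned mask_bytes(unsigned vlen_bits, unsigned n) {
  return (mask_bits(vlen_bits, n) + 7) / 8;
}

// With VLEN = 128 (assumed), vbool16_t and vbool32_t both fit in one byte...
static_assert(mask_bytes(128, 16) == mask_bytes(128, 32), "same byte size");
// ...but they hold a different number of meaningful bits.
static_assert(mask_bits(128, 16) != mask_bits(128, 32), "different precision");
```

Under this model, a pass that compares only byte sizes cannot tell such mask types apart, which matches the motivation given for adjusting the precision.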

[PATCH] MIPS: Account for LWL/LWR in store_by_pieces_p.

2023-02-20 Thread Xin Liu
From: Matthew Fortune 

---
 gcc/config/mips/mips.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/gcc/config/mips/mips.cc b/gcc/config/mips/mips.cc
index 590c311e98c..bb9f4e19c22 100644
--- a/gcc/config/mips/mips.cc
+++ b/gcc/config/mips/mips.cc
@@ -8853,7 +8853,7 @@ mips_store_by_pieces_p (unsigned HOST_WIDE_INT size, unsigned int align)
  LW/SWL/SWR sequence.  This is often better than the 4 LIs and
  4 SBs that we would generate when storing by pieces.  */
   if (align <= BITS_PER_UNIT)
-    return size < 4;
+    return size < 4 || !ISA_HAS_LWL_LWR;
 
   /* If the data is 2-byte aligned, then:
 
@@ -,7 +,9 @@ mips_store_by_pieces_p (unsigned HOST_WIDE_INT size, unsigned int align)
  (c4) A block move of 8 bytes can use two LW/SW sequences or a single
  LD/SD sequence, and in these cases we've traditionally preferred
  the memory copy over the more bulky constant moves.  */
-  return size < 8;
+  return (size < 8
+          || (align < 4 * BITS_PER_UNIT
+              && !ISA_HAS_LWL_LWR));
 }
 
 /* Emit straight-line code to move LENGTH bytes from SRC to DEST.
-- 
2.30.2
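
The patch carries no description, so the following is only a reading of the two changed return statements, written as a standalone sketch. It assumes nothing between the two hunks returns early, takes BITS_PER_UNIT as 8, and uses a plain `has_lwl_lwr` parameter in place of the ISA_HAS_LWL_LWR macro.

```cpp
constexpr unsigned kBitsPerUnit = 8;  // stands in for BITS_PER_UNIT (assumed 8)

// Model of the patched decision: true means "store by pieces",
// false means "prefer a block move from constant data".
constexpr bool store_by_pieces_p(unsigned long long size, unsigned align_bits,
                                 bool has_lwl_lwr) {
  return align_bits <= kBitsPerUnit
             // Byte-aligned: the block move only pays off with LWL/LWR.
             ? (size < 4 || !has_lwl_lwr)
             // Higher alignment: below 4-byte alignment the block move
             // still depends on LWL/LWR being available.
             : (size < 8
                || (align_bits < 4 * kBitsPerUnit && !has_lwl_lwr));
}

// With LWL/LWR, a byte-aligned 4-byte store still uses the block move.
static_assert(!store_by_pieces_p(4, 8, true), "");
// Without LWL/LWR, the same store goes by pieces.
static_assert(store_by_pieces_p(4, 8, false), "");
```

Data that is at least 4-byte aligned is unaffected in this model, since plain LW/SW sequences do not need the unaligned instructions.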


[PATCH] Testsuite: Disable micromips for MSA tests

2023-02-20 Thread Xin Liu
From: Matthew Fortune 

---
 gcc/testsuite/gcc.target/mips/mips.exp | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/testsuite/gcc.target/mips/mips.exp b/gcc/testsuite/gcc.target/mips/mips.exp
index 81e19f39853..bf32fe0c93f 100644
--- a/gcc/testsuite/gcc.target/mips/mips.exp
+++ b/gcc/testsuite/gcc.target/mips/mips.exp
@@ -1463,6 +1463,7 @@ proc mips-dg-options { args } {
 mips_option_dependency options "-msoft-float" "-mno-paired-single"
 mips_option_dependency options "-mno-paired-single" "-mno-mips3d"
+mips_option_dependency options "-mmsa" "-mno-micromips"
 mips_option_dependency options "-mmsa" "-mno-mips16"
 
 # If the test requires an unsupported option, change run tests
 # to link tests.
-- 
2.30.2


Re: [PATCH] xtensa: Enforce return address saving when -Og is specified

2023-02-20 Thread Max Filippov via Gcc-patches
On Fri, Feb 17, 2023 at 8:54 PM Takayuki 'January June' Suwa wrote:
>
> Leaf functions often omit saving their return address to the stack slot,
> and this often makes debugging very confusing, especially for
> stack dump analysis.
>
> gcc/ChangeLog:
>
> * config/xtensa/xtensa.cc (xtensa_call_save_reg): Change to return
> true for register A0 (return address register) when -Og is specified.
> ---
>  gcc/config/xtensa/xtensa.cc | 7 +--
>  1 file changed, 5 insertions(+), 2 deletions(-)

Regtested for target=xtensa-linux-uclibc, no new regressions.
Committed to master.

-- 
Thanks.
-- Max


Re: [PATCH] rs6000: mark tieable between INT and FLOAT

2023-02-20 Thread Jiufu Guo via Gcc-patches


Hi,

Gentle ping:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609504.html

BR,
Jeff (Jiufu)


Jiufu Guo  writes:

> Hi,
>
> While discussing/reviewing patches on the mailing list, we found that more
> modes are tieable, e.g. DI<->DF.  After some discussion, I drafted this patch
> to mark more modes as tieable.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
> BR,
> Jeff (Jiufu)
>
> gcc/ChangeLog:
>
>   * config/rs6000/rs6000.cc (rs6000_modes_tieable_p): Mark more tieable
>   modes.
>
> gcc/testsuite/ChangeLog:
>
>   * g++.target/powerpc/pr102024.C: Updated.
>
> ---
>  gcc/config/rs6000/rs6000.cc | 9 +
>  gcc/testsuite/g++.target/powerpc/pr102024.C | 3 ++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ac3adcec6b..3cb0186089e 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -1968,6 +1968,15 @@ rs6000_modes_tieable_p (machine_mode mode1, machine_mode mode2)
>if (ALTIVEC_OR_VSX_VECTOR_MODE (mode2))
>  return false;
>  
> +  /* The SFmode format in a register (IEEE DP) would not be as required,
> + so SFmode is restricted here.  */
> +  if (GET_MODE_CLASS (mode1) == MODE_FLOAT
> +  && GET_MODE_CLASS (mode2) == MODE_INT)
> +return GET_MODE_SIZE (mode2) == UNITS_PER_FP_WORD && mode1 != SFmode;
> +  if (GET_MODE_CLASS (mode1) == MODE_INT
> +  && GET_MODE_CLASS (mode2) == MODE_FLOAT)
> +return GET_MODE_SIZE (mode1) == UNITS_PER_FP_WORD && mode2 != SFmode;
> +
>if (SCALAR_FLOAT_MODE_P (mode1))
>  return SCALAR_FLOAT_MODE_P (mode2);
>if (SCALAR_FLOAT_MODE_P (mode2))
> diff --git a/gcc/testsuite/g++.target/powerpc/pr102024.C b/gcc/testsuite/g++.target/powerpc/pr102024.C
> index 769585052b5..27d2dc5e80b 100644
> --- a/gcc/testsuite/g++.target/powerpc/pr102024.C
> +++ b/gcc/testsuite/g++.target/powerpc/pr102024.C
> @@ -5,7 +5,8 @@
>  // Test that a zero-width bit field in an otherwise homogeneous aggregate
>  // generates a psabi warning and passes arguments in GPRs.
>  
> -// { dg-final { scan-assembler-times {\mstd\M} 4 } }
> +// { dg-final { scan-assembler-times {\mmtvsrd\M} 4 { target has_arch_pwr8 } } }
> +// { dg-final { scan-assembler-times {\mstd\M} 4 { target { ! has_arch_pwr8 } } } }
>  
>  struct a_thing
>  {
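
The added integer/float check in the quoted rs6000 hunk can be read in isolation as follows. This is a simplified model, not the real rs6000_modes_tieable_p: the `Mode` struct, the `Tie` result type and the function name are hypothetical, and UNITS_PER_FP_WORD is assumed to be 8.

```cpp
enum class ModeClass { Int, Float, Other };
enum class Tie { Undecided, Yes, No };

// Hypothetical stand-in for a machine mode.
struct Mode {
  ModeClass cls;
  unsigned size_bytes;
  bool is_sfmode;
};

const unsigned kUnitsPerFpWord = 8;  // assumed value of UNITS_PER_FP_WORD

// Mirrors only the two added "if" blocks; Undecided means the real
// function would fall through to its later checks.
inline Tie int_float_tieable(const Mode &m1, const Mode &m2) {
  if (m1.cls == ModeClass::Float && m2.cls == ModeClass::Int)
    return (m2.size_bytes == kUnitsPerFpWord && !m1.is_sfmode) ? Tie::Yes
                                                                 : Tie::No;
  if (m1.cls == ModeClass::Int && m2.cls == ModeClass::Float)
    return (m1.size_bytes == kUnitsPerFpWord && !m2.is_sfmode) ? Tie::Yes
                                                                 : Tie::No;
  return Tie::Undecided;
}
```

In this model a DI/DF pair is tieable, while SI/DF and DI/SF pairs are not, matching the "DI<->DF" example and the SFmode restriction in the quoted comment.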


[PATCH] i386: Introduce general_x64constmem_operand predicate

2023-02-20 Thread Uros Bizjak via Gcc-patches
Instructions that use high-part QImode registers can not be encoded
with REX prefix.  To avoid REX prefix, operand constraints allow
only legacy QImode registers, immediates and constant memory operands.
The patch introduces matching predicate, so invalid operands are not
combined into instruction RTX only to be later fixed up by reload pass.

2023-02-20  Uroš Bizjak  

gcc/ChangeLog:

* config/i386/predicates.md
(general_x64constmem_operand): New predicate.
* config/i386/i386.md (*cmpqi_ext_1):
Use nonimm_x64constmem_operand.
(*cmpqi_ext_3): Use general_x64constmem_operand.
(*addqi_ext_1): Ditto.
(*testqi_ext_1): Ditto.
(*andqi_ext_1): Ditto.
(*andqi_ext_1_cc): Ditto.
(*<code>qi_ext_1): Ditto.
(*xorqi_ext_1_cc): Ditto.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 6382cfbce21..8ebb12be2c9 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -1456,7 +1456,7 @@ (define_insn "*cmp_minus_1"
 (define_insn "*cmpqi_ext_1"
   [(set (reg FLAGS_REG)
(compare
- (match_operand:QI 0 "nonimmediate_operand" "QBc,m")
+ (match_operand:QI 0 "nonimm_x64constmem_operand" "QBc,m")
  (subreg:QI
(zero_extract:SWI248
  (match_operand 1 "int248_register_operand" "Q,Q")
@@ -1501,7 +1501,7 @@ (define_insn "*cmpqi_ext_3"
  (match_operand 0 "int248_register_operand" "Q,Q")
  (const_int 8)
  (const_int 8)) 0)
- (match_operand:QI 1 "general_operand" "QnBc,m")))]
+ (match_operand:QI 1 "general_x64constmem_operand" "QnBc,m")))]
   "ix86_match_ccmode (insn, CCmode)"
   "cmp{b}\t{%1, %h0|%h0, %1}"
   [(set_attr "isa" "*,nox64")
@@ -6683,7 +6683,7 @@ (define_insn "*addqi_ext_1"
(match_operand 1 "int248_register_operand" "0,0")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 2 "general_operand" "QnBc,m")) 0))
+   (match_operand:QI 2 "general_x64constmem_operand" "QnBc,m")) 0))
(clobber (reg:CC FLAGS_REG))]
   "/* FIXME: without this LRA can't reload this pattern, see PR82524.  */
rtx_equal_p (operands[0], operands[1])"
@@ -9901,7 +9901,7 @@ (define_insn "*testqi_ext_1"
(match_operand 0 "int248_register_operand" "Q,Q")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 1 "general_operand" "QnBc,m"))
+   (match_operand:QI 1 "general_x64constmem_operand" "QnBc,m"))
  (const_int 0)))]
   "ix86_match_ccmode (insn, CCNOmode)"
   "test{b}\t{%1, %h0|%h0, %1}"
@@ -10602,7 +10602,7 @@ (define_insn "*andqi_ext_1"
(match_operand 1 "int248_register_operand" "0,0")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 2 "general_operand" "QnBc,m")) 0))
+   (match_operand:QI 2 "general_x64constmem_operand" "QnBc,m")) 0))
(clobber (reg:CC FLAGS_REG))]
   "/* FIXME: without this LRA can't reload this pattern, see PR82524.  */
rtx_equal_p (operands[0], operands[1])"
@@ -10622,7 +10622,7 @@ (define_insn "*andqi_ext_1_cc"
(match_operand 1 "int248_register_operand" "0,0")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 2 "general_operand" "QnBc,m"))
+   (match_operand:QI 2 "general_x64constmem_operand" "QnBc,m"))
  (const_int 0)))
(set (zero_extract:SWI248
  (match_operand 0 "int248_register_operand" "+Q,Q")
@@ -11345,7 +11345,7 @@ (define_insn "*qi_ext_1"
(match_operand 1 "int248_register_operand" "0,0")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 2 "general_operand" "QnBc,m")) 0))
+   (match_operand:QI 2 "general_x64constmem_operand" "QnBc,m")) 0))
(clobber (reg:CC FLAGS_REG))]
   "(!TARGET_PARTIAL_REG_STALL || optimize_function_for_size_p (cfun))
/* FIXME: without this LRA can't reload this pattern, see PR82524.  */
@@ -11473,7 +11473,7 @@ (define_insn "*xorqi_ext_1_cc"
(match_operand 1 "int248_register_operand" "0,0")
(const_int 8)
(const_int 8)) 0)
-   (match_operand:QI 2 "general_operand" "QnBc,m"))
+   (match_operand:QI 2 "general_x64constmem_operand" "QnBc,m"))
  (const_int 0)))
(set (zero_extract:SWI248
  (match_operand 0 "int248_register_operand" "+Q,Q")
diff --git a/gcc/config/i386/predicates.md b/gcc/config/i386/predicates.md
index 7b3db0cc851..b4d9ab40ab9 100644
--- a/gcc/config/i386/predicates.md
+++ b/gcc/config/i386/predicates.md
@@ -116,6 +116,13 @@ (define_predicate "nonimm_x64constmem_operand"
(ior (not (match_test "TARGET_64BIT"))
 (match_test "constant_address_p (XEXP (op, 0))")
 
+;; Match general operand, but exclude non-constant addresses for x86_64.
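
The body of the new predicate is cut off above, so the following is only a sketch of what its comment and the neighbouring nonimm_x64constmem_operand suggest: any general operand is accepted, except that on x86_64 a memory operand must have a constant address. The `OperandKind` enum and the function name are hypothetical modelling devices, not GCC code.

```cpp
// Coarse classification of an operand, for illustration only.
enum class OperandKind { Register, Immediate, MemConstAddr, MemRegAddr };

// Model of "general operand, but exclude non-constant addresses for x86_64".
// A register-based address may need r8-r15 and thus a REX prefix, which
// cannot be combined with a high-part register such as %ah.
inline bool general_x64constmem_ok(OperandKind kind, bool target_64bit) {
  switch (kind) {
    case OperandKind::Register:
    case OperandKind::Immediate:
    case OperandKind::MemConstAddr:
      return true;
    case OperandKind::MemRegAddr:
      return !target_64bit;
  }
  return false;
}
```

Which registers are actually allowed is still decided by the operand constraints ("Q" and friends); the predicate only narrows what combine may put into the pattern in the first place.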

[committed] powerpc: Another umaddditi4 fix [PR108862]

2023-02-20 Thread Jakub Jelinek via Gcc-patches
Hi!

The following testcase is miscompiled on powerpc64le-linux with
-O2 -mcpu=power9.  The problem is that gen_umaddditi4 is called with
the same TImode register for both op0 and op3, and maddlddi4
overwrites the low half of op0 before the low half of op3 is read,
so when they are the same register it reads the result of maddlddi4.

The following patch fixes that by swapping maddlddi4 and
umadddi4_highpart{,_le} during expansion, as the latter writes into
a temporary pseudo and so can't change anything maddlddi4 depends on.

Bootstrapped/regtested on powerpc64-linux (power7, tested -m32/-m64),
powerpc64le-linux (power8 and another on power9 with
--with-cpu-64=power9 --with-tune-64=power9), preapproved by Segher on IRC,
committed to trunk.

2023-02-20  Jakub Jelinek  

PR target/108862
* config/rs6000/rs6000.md (umaddditi4): Swap gen_maddlddi4 with
gen_umadddi4_highpart{,_le}.

* gcc.dg/pr108862.c: New test.
* gcc.target/powerpc/pr108862.c: New test.

--- gcc/config/rs6000/rs6000.md.jj  2023-02-15 10:51:12.745802021 +0100
+++ gcc/config/rs6000/rs6000.md 2023-02-20 16:01:02.929027764 +0100
@@ -3249,8 +3249,6 @@
   rtx op3_hi = gen_rtx_SUBREG (DImode, operands[3], BYTES_BIG_ENDIAN ? 0 : 8);
   rtx hi_temp = gen_reg_rtx (DImode);
 
-  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], op3_lo));
-
   if (BYTES_BIG_ENDIAN)
 emit_insn (gen_umadddi4_highpart (hi_temp, operands[1], operands[2],
  op3_lo));
@@ -3258,6 +3256,8 @@
 emit_insn (gen_umadddi4_highpart_le (hi_temp, operands[1], operands[2],
 op3_lo));
 
+  emit_insn (gen_maddlddi4 (op0_lo, operands[1], operands[2], op3_lo));
+
   emit_insn (gen_adddi3 (op0_hi, hi_temp, op3_hi));
 
   DONE;
--- gcc/testsuite/gcc.dg/pr108862.c.jj  2023-02-20 15:52:20.570619215 +0100
+++ gcc/testsuite/gcc.dg/pr108862.c 2023-02-20 15:51:52.363029125 +0100
@@ -0,0 +1,27 @@
+/* PR target/108862 */
+/* { dg-do run { target int128 } } */
+/* { dg-options "-O2" } */
+
+unsigned long long a[2] = { 0x04a13945d898c296ULL, 0x1fffULL };
+unsigned long long b[4] = { 0x04a13945d898c296ULL, 0, 0, 0x1fffULL };
+
+__attribute__((noipa)) unsigned __int128
+foo (int x, unsigned long long *y, unsigned long long *z)
+{
+  unsigned __int128 w = 0;
+  for (int i = 0; i < x; i++)
+w += (unsigned __int128)*y++ * (unsigned __int128)*z--;
+  return w;
+}
+
+int
+main ()
+{
+  unsigned __int128 x = foo (1, &a[0], &a[1]);
+  unsigned __int128 y = foo (2, &b[0], &b[3]);
+  if ((unsigned long long) (x >> 64) != 0x004a13945dd3ULL
+  || (unsigned long long) x != 0x9b1c8443b3909d6aULL
+  || x != y)
+__builtin_abort ();
+  return 0;
+}
--- gcc/testsuite/gcc.target/powerpc/pr108862.c.jj  2023-02-20 15:52:51.374171586 +0100
+++ gcc/testsuite/gcc.target/powerpc/pr108862.c 2023-02-20 15:53:04.497980869 +0100
@@ -0,0 +1,6 @@
+/* PR target/108862 */
+/* { dg-do run { target int128 } } */
+/* { dg-require-effective-target p9vector_hw } */
+/* { dg-options "-O2 -mdejagnu-cpu=power9" } */
+
+#include "../../gcc.dg/pr108862.c"

Jakub
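
The ordering problem described above can be reproduced outside the compiler with a small model. This is a sketch, not the expander: the 128-bit value is split into two 64-bit halves, `madd_lo` and `madd_hi` stand in for maddlddi4 and umadddi4_highpart, all names are hypothetical, and `unsigned __int128` is a GCC/Clang extension.

```cpp
#include <cstdint>

struct U128 { std::uint64_t lo, hi; };

// Low 64 bits of a * b + c.
inline std::uint64_t madd_lo(std::uint64_t a, std::uint64_t b, std::uint64_t c) {
  return a * b + c;
}

// High 64 bits of a * b + c.
inline std::uint64_t madd_hi(std::uint64_t a, std::uint64_t b, std::uint64_t c) {
  return static_cast<std::uint64_t>(((unsigned __int128)a * b + c) >> 64);
}

// Old order: the low half of op0 is written first, so if op0 and op3 are
// the same object, madd_hi reads the already-overwritten low half.
inline void umadd_old_order(U128 &op0, std::uint64_t a, std::uint64_t b,
                            const U128 &op3) {
  op0.lo = madd_lo(a, b, op3.lo);
  std::uint64_t hi_temp = madd_hi(a, b, op3.lo);
  op0.hi = hi_temp + op3.hi;
}

// New order: the high part goes into a temporary before op0 is touched.
inline void umadd_new_order(U128 &op0, std::uint64_t a, std::uint64_t b,
                            const U128 &op3) {
  std::uint64_t hi_temp = madd_hi(a, b, op3.lo);
  op0.lo = madd_lo(a, b, op3.lo);
  op0.hi = hi_temp + op3.hi;
}
```

Calling either function with the same object for `op0` and `op3`, an accumulator whose low half is all ones, and a = b = 1 shows the difference: only the new order propagates the carry into the high half.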



[PATCH] Fortran: improve checking of character length specification [PR96025]

2023-02-20 Thread Harald Anlauf via Gcc-patches
Dear all,

the attached patch fixes an ICE on invalid (non-integer)
specification expressions for character length in function
declarations.  It appears that the error handling was
already in place (mostly) and we need to essentially
prevent run-on errors.

Regtested on x86_64-pc-linux-gnu.  OK for mainline?

The PR is marked as a 10/11/12/13 regression, so I would
like to backport this as far as it seems reasonable.

Thanks,
Harald

From f581f63e206b54278c27a5c888c2566cb5077f11 Mon Sep 17 00:00:00 2001
From: Harald Anlauf 
Date: Mon, 20 Feb 2023 21:28:09 +0100
Subject: [PATCH] Fortran: improve checking of character length specification
 [PR96025]

gcc/fortran/ChangeLog:

	PR fortran/96025
	* parse.cc (check_function_result_typed): Improve type check of
	specification expression for character length and return status.
	(parse_spec): Use status from above.
	* resolve.cc (resolve_fntype): Prevent use of invalid specification
	expression for character length.

gcc/testsuite/ChangeLog:

	PR fortran/96025
	* gfortran.dg/pr96025.f90: New test.
---
 gcc/fortran/parse.cc  | 23 ---
 gcc/fortran/resolve.cc|  4 +++-
 gcc/testsuite/gfortran.dg/pr96025.f90 | 11 +++
 3 files changed, 30 insertions(+), 8 deletions(-)
 create mode 100644 gcc/testsuite/gfortran.dg/pr96025.f90

diff --git a/gcc/fortran/parse.cc b/gcc/fortran/parse.cc
index f5154d97ae8..47876a3833e 100644
--- a/gcc/fortran/parse.cc
+++ b/gcc/fortran/parse.cc
@@ -3974,21 +3974,30 @@ match_deferred_characteristics (gfc_typespec * ts)
For return types specified in a FUNCTION prefix, the IMPLICIT rules of the
scope are not yet parsed so this has to be delayed up to parse_spec.  */

-static void
+static bool
 check_function_result_typed (void)
 {
   gfc_typespec ts;

   gcc_assert (gfc_current_state () == COMP_FUNCTION);

-  if (!gfc_current_ns->proc_name->result) return;
+  if (!gfc_current_ns->proc_name->result)
+    return true;

   ts = gfc_current_ns->proc_name->result->ts;

   /* Check type-parameters, at the moment only CHARACTER lengths possible.  */
   /* TODO:  Extend when KIND type parameters are implemented.  */
   if (ts.type == BT_CHARACTER && ts.u.cl && ts.u.cl->length)
-    gfc_expr_check_typed (ts.u.cl->length, gfc_current_ns, true);
+    {
+      /* Reject invalid type of specification expression for length.  */
+      if (ts.u.cl->length->ts.type != BT_INTEGER)
+        return false;
+
+      gfc_expr_check_typed (ts.u.cl->length, gfc_current_ns, true);
+    }
+
+  return true;
 }


@@ -4097,8 +4106,8 @@ loop:

   if (verify_now)
 	{
-	  check_function_result_typed ();
-	  function_result_typed = true;
+	  if (check_function_result_typed ())
+	function_result_typed = true;
 	}
 }

@@ -4111,8 +4120,8 @@ loop:
 case ST_IMPLICIT:
   if (!function_result_typed)
 	{
-	  check_function_result_typed ();
-	  function_result_typed = true;
+	  if (check_function_result_typed ())
+	function_result_typed = true;
 	}
   goto declSt;

diff --git a/gcc/fortran/resolve.cc b/gcc/fortran/resolve.cc
index fb0745927ac..427f901a438 100644
--- a/gcc/fortran/resolve.cc
+++ b/gcc/fortran/resolve.cc
@@ -17419,7 +17419,9 @@ resolve_fntype (gfc_namespace *ns)
 	  }
   }

-  if (sym->ts.type == BT_CHARACTER)
+  if (sym->ts.type == BT_CHARACTER
+      && sym->ts.u.cl->length
+      && sym->ts.u.cl->length->ts.type == BT_INTEGER)
 gfc_traverse_expr (sym->ts.u.cl->length, sym, flag_fn_result_spec, 0);
 }

diff --git a/gcc/testsuite/gfortran.dg/pr96025.f90 b/gcc/testsuite/gfortran.dg/pr96025.f90
new file mode 100644
index 000..ce292bd9664
--- /dev/null
+++ b/gcc/testsuite/gfortran.dg/pr96025.f90
@@ -0,0 +1,11 @@
+! { dg-do compile }
+! PR fortran/96025 - ICE in expr_check_typed_help
+! Contributed by G.Steinmetz
+
+program p
+  print *, f()
+contains
+  character(char(1)) function f() ! { dg-error "must be of INTEGER type" }
+f = 'f'
+  end
+end
--
2.35.3



[PATCH] c++: constant non-copy-init is manifestly constant [PR108243]

2023-02-20 Thread Patrick Palka via Gcc-patches
According to [basic.start.static]/2 and [expr.const]/2, a variable
with static storage duration initialized with a constant initializer
has constant initialization, and such an initializer is manifestly
constant-evaluated.

We're already getting this right with copy initialization because in
that case check_initializer would consistently call store_init_value
(which for TREE_STATIC variables calls fold_non_dependent_init with
m_c_e=true).

But for direct (or default) initialization, we don't always call
store_init_value.  We instead however always call maybe_constant_init
from expand_default_init[1], albeit with m_c_e=false which means we
don't always get the "manifestly constant-evaluated" part right for
direct-init.

This patch fixes this by simply passing m_c_e=true to this call to
maybe_constant_init for static storage duration variables, mirroring
what store_init_value basically does.

[1]: this maybe_constant_init call isn't reached in the copy-init
case because there init is a CONSTRUCTOR rather than a TREE_LIST so
expand_default_init exits early returning an INIT_EXPR.  This INIT_EXPR
is ultimately what causes us to consistently hit the store_init_value
code path from check_initializer in the copy-init case.

Bootstrapped and regtested on x86_64-pc-linux-gnu, does this look OK for
trunk?  Would it be suitable to backport this to the 12 branch since
it should only affect C++20 code?

PR c++/108243

gcc/cp/ChangeLog:

* init.cc (expand_default_init): Pass m_c_e=true instead of
=false to maybe_constant_init when initializing a variable
with static storage duration.

gcc/testsuite/ChangeLog:

* g++.dg/cpp2a/is-constant-evaluated14.C: New test.
---
 gcc/cp/init.cc|   5 +-
 .../g++.dg/cpp2a/is-constant-evaluated14.C| 140 ++
 2 files changed, 144 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated14.C

diff --git a/gcc/cp/init.cc b/gcc/cp/init.cc
index 52e96fbe590..705a5b3bdb6 100644
--- a/gcc/cp/init.cc
+++ b/gcc/cp/init.cc
@@ -2203,7 +2203,10 @@ expand_default_init (tree binfo, tree true_exp, tree exp, tree init, int flags,
   tree fn = get_callee_fndecl (rval);
   if (fn && DECL_DECLARED_CONSTEXPR_P (fn))
{
- tree e = maybe_constant_init (rval, exp);
+ bool manifestly_const_eval = false;
+ if (VAR_P (exp) && TREE_STATIC (exp))
+   manifestly_const_eval = true;
+ tree e = maybe_constant_init (rval, exp, manifestly_const_eval);
  if (TREE_CONSTANT (e))
rval = cp_build_init_expr (exp, e);
}
diff --git a/gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated14.C b/gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated14.C
new file mode 100644
index 000..365bca3fd9a
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/is-constant-evaluated14.C
@@ -0,0 +1,140 @@
+// PR c++/108243
+// Verify a variable with static storage duration initialized with a
+// constant initializer has constant initialization, and the initializer
+// is manifestly constant-evaluated.
+// { dg-do run { target c++11 } }
+// { dg-additional-options "-fdump-tree-original" }
+
+#include <initializer_list>
+
+struct A {
+  constexpr A(int n) : n(n), m(__builtin_is_constant_evaluated()) { }
+  constexpr A() : A(42) { }
+  void verify_mce() const {
+if (m != 1) __builtin_abort();
+  }
+  int n;
+  int m;
+};
+
+A a1 = {42};
+A a2{42};
+A a3(42);
+A a4;
+A a5{};
+
+void f() {
+  static A a1 = {42};
+  static A a2{42};
+  static A a3(42);
+  static A a4;
+  static A a5{};
+  for (auto& a : {a1, a2, a3, a4, a5})
+a.verify_mce();
+}
+
+template
+void g() {
+  static A a1 = {42};
+  static A a2{42};
+  static A a3(42);
+  static A a4;
+  static A a5{};
+  static A a6 = {N...};
+  static A a7{N...};
+  static A a8(N...);
+  for (auto& a : {a1, a2, a3, a4, a5, a6, a7, a8})
+a.verify_mce();
+}
+
+struct B {
+  static A a1;
+  static A a2;
+  static A a3;
+  static A a4;
+  static A a5;
+  static void verify_mce() {
+for (auto& a : {a1, a2, a3, a4, a5})
+  a.verify_mce();
+  }
+};
+
+A B::a1 = {42};
+A B::a2{42};
+A B::a3(42);
+A B::a4;
+A B::a5{};
+
+template
+struct BT {
+  static A a1;
+  static A a2;
+  static A a3;
+  static A a4;
+  static A a5;
+  static A a6;
+  static A a7;
+  static A a8;
+  static void verify_mce() {
+for (auto& a : {a1, a2, a3, a4, a5})
+  a.verify_mce();
+  }
+};
+
+template A BT::a1 = {42};
+template A BT::a2{42};
+template A BT::a3(42);
+template A BT::a4;
+template A BT::a5{};
+template A BT::a6 = {N...};
+template A BT::a7{N...};
+template A BT::a8(N...);
+
+#if __cpp_inline_variables
+struct BI {
+  static inline A a1 = {42};
+  static inline A a2{42};
+  static inline A a3;
+  static inline A a4{};
+  static void verify_mce() {
+for (auto& a : {a1, a2, a3, a4})
+  a.verify_mce();
+  }
+};
+
+template
+struct BIT {
+  static inline A a1 = {42};
+  static inline A a2{42};
+  static inline A 
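
For illustration, here is a minimal sketch (not taken from the patch or its test) of the behaviour the test above checks. It contrasts an initializer that is certainly manifestly constant-evaluated, a constexpr variable, with an automatic variable built from a run-time argument. The names c, runtime_m and verify are made up for this sketch. The static-storage cases exercised by the test depend on the compiler containing the fix, so they are deliberately not asserted here.

```cpp
#include <cassert>

struct A {
  // m records whether the constructor ran during constant evaluation.
  constexpr A (int n) : n (n), m (__builtin_is_constant_evaluated ()) { }
  int n;
  int m;
};

// Initializer of a constexpr variable: manifestly constant-evaluated.
constexpr A c (42);

// Automatic variable built from a run-time argument: evaluated at run time.
int runtime_m (int n)
{
  A local (n);
  return local.m;
}

void verify ()
{
  static_assert (c.m == 1, "constexpr initializer is constant-evaluated");
  assert (runtime_m (42) == 0);
}
```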

Re: [PATCH] Fixing PR107411

2023-02-20 Thread Qing Zhao via Gcc-patches


> On Feb 20, 2023, at 10:17 AM, Jakub Jelinek  wrote:
> 
> On Mon, Feb 20, 2023 at 03:04:51PM +, Qing Zhao via Gcc-patches wrote:
>> 
>> 
>>> On Feb 17, 2023, at 5:35 PM, Jakub Jelinek  wrote:
>>> 
>>> On Fri, Feb 17, 2023 at 10:26:03PM +, Qing Zhao via Gcc-patches wrote:
 +else if (!DECL_NAME (lhs_var))
 +  {
 +char *lhs_var_name_str
 +  = xasprintf ("D.%u", DECL_UID (lhs_var));
>>> 
>>> Why xasprintf?
>> 
>> Just emulated the code in “gimple_add_init_for_auto_var” without thinking 
>> too much. -:)
>>> D.%u can be sprintfed into a fixed size automatic buffer,
>>> say 3 + (HOST_BITS_PER_INT + 2) / 3 would be a good upper bound for the size
>>> of the buffer.  Then you don't need to free it...
>> 
>> xasprintf is “like a sprintf but provided a pointer to malloc’d storage 
>> (without fail)”. If the pointer is freed properly, then it should be okay, right?
>> In addition to “no need to free”, what other benefit is there to using sprintf 
>> rather than xasprintf?
> 
> xasprintf+free being significantly slower, exactly because it needs to
> malloc and free later, where both are fairly expensive functions.
> The glibc asprintf for short strings like the above uses a ~ 200 byte
> static buffer, stores in there, later mallocs the needed amount of memory
> and copies it there (so again, another waste because the string needs to be
> copied around), while for longer it can do perhaps many allocations and
> realloc at the end to the right size.
> The libiberty function actually performs the printing twice, once without
> writing result anywhere to compute size, then malloc, then again into the
> malloced buffer.

Okay, thanks a lot for the info.
I will replace xasprintf with sprintf for this patch.

Qing
> 
>   Jakub
> 
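
A minimal sketch of the fixed-size-buffer variant discussed in this message, under stated assumptions: bits_per_int stands in for GCC's HOST_BITS_PER_INT, matches_anon_name is a hypothetical helper rather than code from the patch, and snprintf is used in place of plain sprintf as an extra guard.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>
#include <cstring>

// Stand-in for GCC's HOST_BITS_PER_INT (an assumption of this sketch).
constexpr unsigned bits_per_int = sizeof (unsigned) * CHAR_BIT;

// "D." plus the terminating NUL take 3 bytes; (bits + 2) / 3 is the
// number of octal digits, which bounds the number of decimal digits.
constexpr unsigned name_buf_size = 3 + (bits_per_int + 2) / 3;

// Hypothetical helper: does NAME spell "D.<uid>"?
bool matches_anon_name (const char *name, unsigned uid)
{
  assert (name != nullptr);
  char buf[name_buf_size];   // automatic storage, nothing to free
  std::snprintf (buf, sizeof buf, "D.%u", uid);
  return std::strcmp (buf, name) == 0;
}
```

The buffer lives on the stack, so the early-return path needs no cleanup.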



Re: [PATCH RFC 1/3] c++: add __is_deducible trait [PR105841]

2023-02-20 Thread Patrick Palka via Gcc-patches
On Sat, 18 Feb 2023, Jason Merrill via Gcc-patches wrote:

> Tested x86_64-pc-linux-gnu.  Since this is fixing experimental (C++20)
> functionality, I think it's reasonable to apply now; I'm interested in other
> opinions, and thoughts about the user-facing functionality.  I'm thinking to
> make it internal-only for GCC 13 at least by adding a space in the name, but
> does this look useful to the library?

IIUC this looks like a generalization of an __is_specialization_of trait
that returns whether a type is a specialization of a given class template,
which seems potentially useful for the library to me.  We already define
some ad-hoc predicates for testing this, e.g. __is_reverse_view,
__is_span etc in  as well as a more general __is_specialization_of
in  for templates that take only type arguments.  Using a built-in
trait should be more efficient.
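
To make the comparison concrete, a library-only predicate of this kind can be sketched with a partial specialization. This is an illustrative sketch, not libstdc++'s actual definition; the name is_specialization_of is an assumption, and like the helper mentioned above it only handles templates whose parameters are all types.

```cpp
#include <list>
#include <type_traits>
#include <vector>

// Primary template: T is not a specialization of Tmpl.
template<typename T, template<typename...> class Tmpl>
struct is_specialization_of : std::false_type { };

// Partial specialization: matches exactly when T is Tmpl<Args...>.
template<template<typename...> class Tmpl, typename... Args>
struct is_specialization_of<Tmpl<Args...>, Tmpl> : std::true_type { };

static_assert (is_specialization_of<std::vector<int>, std::vector>::value,
	       "vector<int> is a specialization of vector");
static_assert (!is_specialization_of<std::list<int>, std::vector>::value,
	       "list<int> is not a specialization of vector");
```

Each use instantiates a class template, which is the cost a built-in trait would avoid.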

> 
> -- 8< --
> 
> C++20 class template argument deduction for an alias template involves
> adding a constraint that the template arguments for the alias template can
> be deduced from the return type of the deduction guide for the underlying
> class template.  In the standard, this is modeled as defining a class
> template with a partial specialization, but it's much more efficient to
> implement with a trait that directly tries to perform the deduction.
> 
> The first argument to the trait is a template rather than a type, so various
> places needed to be adjusted to accommodate that.
> 
>   PR c++/105841
> 
> gcc/ChangeLog:
> 
>   * doc/extend.texi (Type Traits): Document __is_deducible.
> 
> gcc/cp/ChangeLog:
> 
>   * cp-trait.def (IS_DEDUCIBLE): New.
>   * cxx-pretty-print.cc (pp_cxx_trait): Handle non-type.
>   * parser.cc (cp_parser_trait): Likewise.
>   * pt.cc (tsubst_copy_and_build): Likewise.
>   (type_targs_deducible_from): New.
>   (alias_ctad_tweaks): Use it.
>   * semantics.cc (trait_expr_value): Handle CPTK_IS_DEDUCIBLE.
>   (finish_trait_expr): Likewise.
>   * constraint.cc (diagnose_trait_expr): Likewise.
>   * cp-tree.h (type_targs_deducible_from): Declare.
> 
> gcc/testsuite/ChangeLog:
> 
>   * g++.dg/ext/is_deducible1.C: New test.
> ---
>  gcc/doc/extend.texi  |  4 +++
>  gcc/cp/cp-tree.h |  1 +
>  gcc/cp/constraint.cc |  3 ++
>  gcc/cp/cxx-pretty-print.cc   |  5 +++-
>  gcc/cp/parser.cc | 20 +++---
>  gcc/cp/pt.cc | 35 +---
>  gcc/cp/semantics.cc  | 11 
>  gcc/testsuite/g++.dg/ext/is_deducible1.C | 27 ++
>  gcc/cp/cp-trait.def  |  1 +
>  9 files changed, 92 insertions(+), 15 deletions(-)
>  create mode 100644 gcc/testsuite/g++.dg/ext/is_deducible1.C
> 
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 1ae68b0f20a..898701424ad 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -25207,6 +25207,10 @@ type.  A diagnostic is produced if this requirement 
> is not met.
>  If @code{type} is a cv-qualified class type, and not a union type
>  ([basic.compound]) the trait is @code{true}, else it is @code{false}.
>  
> +@item __is_deducible (template, type)
> +If template arguments for @code{template} can be deduced from
> +@code{type} or obtained from default template arguments.
> +
>  @item __is_empty (type)
>  If @code{__is_class (type)} is @code{false} then the trait is @code{false}.
>  Otherwise @code{type} is considered empty if and only if: @code{type}
> diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
> index 5595335bbf7..e79150ca4d8 100644
> --- a/gcc/cp/cp-tree.h
> +++ b/gcc/cp/cp-tree.h
> @@ -7372,6 +7372,7 @@ extern tree fn_type_unification (tree, 
> tree, tree,
>bool, bool);
>  extern void mark_decl_instantiated   (tree, int);
>  extern int more_specialized_fn   (tree, tree, int);
> +extern bool type_targs_deducible_from(tree, tree);
>  extern void do_decl_instantiation(tree, tree);
>  extern void do_type_instantiation(tree, tree, tsubst_flags_t);
>  extern bool always_instantiate_p (tree);
> diff --git a/gcc/cp/constraint.cc b/gcc/cp/constraint.cc
> index 9374327008b..a28c85178fe 100644
> --- a/gcc/cp/constraint.cc
> +++ b/gcc/cp/constraint.cc
> @@ -3797,6 +3797,9 @@ diagnose_trait_expr (tree expr, tree args)
>inform (loc, "  %qT is not a reference that binds to a temporary "
> "object of type %qT (copy-initialization)", t1, t2);
>break;
> +case CPTK_IS_DEDUCIBLE:
> +  inform (loc, "  %qD is not deducible from %qT", t1, t2);
> +  break;
>  #define DEFTRAIT_TYPE(CODE, NAME, ARITY) \
>  case CPTK_##CODE:
>  #include "cp-trait.def"
> diff --git a/gcc/cp/cxx-pretty-print.cc b/gcc/cp/cxx-pretty-print.cc
> index bea52a608f1..4ebd957decd 100644
> --- 
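
For readers unfamiliar with the feature the patch description refers to, here is a small sketch of C++20 class template argument deduction through an alias template. Pair and IntPair are made-up names. The deducibility constraint described above is what requires the deduced Pair<T, int> to be expressible as IntPair<T>. The example is guarded by the feature-test macro so it is skipped on compilers or language modes without alias CTAD.

```cpp
#include <type_traits>

template<typename T, typename U>
struct Pair {
  T first;
  U second;
  Pair (T t, U u) : first (t), second (u) { }
};

// Alias template that fixes the second template argument.
template<typename T>
using IntPair = Pair<T, int>;

#if __cpp_deduction_guides >= 201907L
// T is deduced as double through the alias; the result is Pair<double, int>.
IntPair p (1.5, 2);
static_assert (std::is_same<decltype (p), Pair<double, int>>::value,
	       "deduced through the alias template");
#endif
```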

Re: [committed] libstdc++: Fix uses of non-reserved names in simd header

2023-02-20 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Feb 2023 at 16:32, Matthias Kretz via Libstdc++
 wrote:
>
> Tested x86_64-pc-linux. Pushed to trunk.

OK for all relevant branches, thanks.



[committed] libstdc++: Fix uses of non-reserved names in simd header

2023-02-20 Thread Matthias Kretz via Gcc-patches
Tested x86_64-pc-linux. Pushed to trunk.

-- >8 --

Signed-off-by: Matthias Kretz 

libstdc++-v3/ChangeLog:

* include/experimental/bits/simd.h (__extract_part, split):
Use reserved name for template parameter.
---
 libstdc++-v3/include/experimental/bits/simd.h | 22 +--
 1 file changed, 11 insertions(+), 11 deletions(-)


--
──
 Dr. Matthias Kretz   https://mattkretz.github.io
 GSI Helmholtz Centre for Heavy Ion Research   https://gsi.de
 stdₓ::simd
──
diff --git a/libstdc++-v3/include/experimental/bits/simd.h b/libstdc++-v3/include/experimental/bits/simd.h
index ffe72fa6ccf..2f615d13b73 100644
--- a/libstdc++-v3/include/experimental/bits/simd.h
+++ b/libstdc++-v3/include/experimental/bits/simd.h
@@ -3783,7 +3783,7 @@ template 
   _SimdWrapper<_Tp, _Np / _Total * _Combine>
   __extract_part(const _SimdWrapper<_Tp, _Np> __x);
 
-template 
   _GLIBCXX_SIMD_INTRINSIC auto
   __extract_part(const _SimdTuple<_Tp, _A0, _As...>& __x);
@@ -3896,19 +3896,19 @@ template 
 
 // split(simd) {{{
 template  / _V::size()>
-  enable_if_t == Parts * _V::size()
-	  && is_simd_v<_V>, array<_V, Parts>>
+	  size_t _Parts = simd_size_v / _V::size()>
+  enable_if_t == _Parts * _V::size()
+		&& is_simd_v<_V>, array<_V, _Parts>>
   split(const simd& __x)
   {
 using _Tp = typename _V::value_type;
-if constexpr (Parts == 1)
+if constexpr (_Parts == 1)
   {
 	return {simd_cast<_V>(__x)};
   }
 else if (__x._M_is_constprop())
   {
-	return __generate_from_n_evaluations>(
+	return __generate_from_n_evaluations<_Parts, array<_V, _Parts>>(
 		 [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 		   return _V([&](auto __j) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA
 			 { return __x[__i * _V::size() + __j]; });
@@ -3925,12 +3925,12 @@ template * const __element_ptr
 	= reinterpret_cast*>(&__data(__x));
-  return __generate_from_n_evaluations>(
+  return __generate_from_n_evaluations<_Parts, array<_V, _Parts>>(
 	   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA
 	   { return _V(__element_ptr + __i * _V::size(), vector_aligned); });
 #else
   const auto& __xx = __data(__x);
-  return __generate_from_n_evaluations>(
+  return __generate_from_n_evaluations<_Parts, array<_V, _Parts>>(
 	   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 		 [[maybe_unused]] constexpr size_t __offset
 		   = decltype(__i)::value * _V::size();
@@ -3944,12 +3944,12 @@ template )
 {
   // normally memcpy should work here as well
-  return __generate_from_n_evaluations>(
+  return __generate_from_n_evaluations<_Parts, array<_V, _Parts>>(
 	   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA { return __x[__i]; });
 }
   else
 {
-  return __generate_from_n_evaluations>(
+  return __generate_from_n_evaluations<_Parts, array<_V, _Parts>>(
 	   [&](auto __i) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
 		 if constexpr (__is_fixed_size_abi_v)
 		   return _V([&](auto __j) constexpr _GLIBCXX_SIMD_ALWAYS_INLINE_LAMBDA {
@@ -3957,7 +3957,7 @@ template (__data(__x)));
+			 __extract_part(__data(__x)));
 	   });
 }
   }


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Tobias Burnus

Hi Richard, hi all,

On 20.02.23 13:46, Richard Biener wrote:

+  /* TODO: A more middle-end friendly alternative would be to use NULL_TREE
+as upper bound and store the value, e.g. as GFC_DECL_STRING_LEN.
+Caveat: this requires some cleanup throughout the code to consistently
+use some wrapper function.  */
+  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
+  tree tmp = TREE_TYPE (TYPE_SIZE (eltype));

...

you are probably breaking type sharing here.  You could use
build_array_type_1 and pass false for 'shared' to get around that.  Note
that there's also canonical type building done in case 'eltype' is not
canonical itself.


My feeling is that this is already somewhat broken. Currently, there
is one type per decl as each has its own artificial length variable.
I have no idea how this will be handled in the ME in terms of alias
analysis. And whether shared=false makes sense here and what effect
it has. (Probably yes.)

In principle,
  integer(kind=8) .str., .str2;
  character(kind=1)[1:.str] * str;
  character(kind=1)[1:.str2] * str2;
have the same type and iff .str == .str2 at runtime, they can alias.
Example:
  str2 = str;
  .str2 = .str;

I have no idea how the type analysis currently works (with or without SAVE_EXPR)
nor what effect shared=false has in this case.


The solution to the actual problem is a hack - you are relying on
re-evaluation of TYPE_SIZE, and for that, only from within accesses
from inside the frontend?


I think this mostly helps with access inside the FE of the type 'size =
TYPE_SIZE_UNIT(type)', which is used surprisingly often and is often
directly evaluated (i.e. assigned to a temporary).


Since gimplification will produce the result into a single temporary again, restoring
the "breakage".
So, does it _really_ fix things?


It does seem to fix cases which do  'size = TYPE_SIZE_UNIT (type);' in
the front end and then use this size expression. Thus, these are fixed.
However, there are many cases where things go wrong - with and without
the patch. I keep discovering more and more :-(

* * *

I still think that the proper way is to have NULL_TREE as the upper bound,
which would be better in several ways, except that there is (too) much code
which relies on TYPE_SIZE_UNIT to work. (There are 117 occurrences.)
Additionally, there is more code making assumptions in this area.

Thus, the question is whether it makes sense as a hackish partial solution
or whether it should remain in the current broken state until it is
fixed properly.

Tobias,

who would like to have more time for fixing such issues.

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


RE: [PATCH] [arm] disable aes-1742098 mitigation for a72 combine tests

2023-02-20 Thread Kyrylo Tkachov via Gcc-patches
Hi Alexandre,

> -Original Message-
> From: Alexandre Oliva 
> Sent: Friday, February 17, 2023 7:06 AM
> To: gcc-patches@gcc.gnu.org
> Cc: ni...@redhat.com; Richard Earnshaw ;
> ramana@gmail.com; Kyrylo Tkachov 
> Subject: [PATCH] [arm] disable aes-1742098 mitigation for a72 combine tests
> 
> 
> The expected asm output for aes-fuse-[12].c does not correspond to
> that which is generated when -mfix-cortex-a57-aes-1742098 is enabled.
> It was introduced after the test, and enabled by default for the
> selected processor.  Disabling the option restores the circumstance
> that was tested for.
> 
> Regstrapped on x86_64-linux-gnu.
> Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?
> 
> for  gcc/testsuite/ChangeLog
> 
>   * gcc.target/arm/aes-fuse-1.c: Add
>   -mno-fix-cortex-a57-aes-1742098.
>   * gcc.target/arm/aes-fuse-2.c: Likewise.
> ---
>  gcc/testsuite/gcc.target/arm/aes-fuse-1.c |4 
>  gcc/testsuite/gcc.target/arm/aes-fuse-2.c |4 
>  2 files changed, 8 insertions(+)
> 
> diff --git a/gcc/testsuite/gcc.target/arm/aes-fuse-1.c
> b/gcc/testsuite/gcc.target/arm/aes-fuse-1.c
> index 27b08aeef7ba7..6ffb4991cca69 100644
> --- a/gcc/testsuite/gcc.target/arm/aes-fuse-1.c
> +++ b/gcc/testsuite/gcc.target/arm/aes-fuse-1.c
> @@ -2,6 +2,10 @@
>  /* { dg-require-effective-target arm_crypto_ok } */
>  /* { dg-add-options arm_crypto } */
>  /* { dg-additional-options "-mcpu=cortex-a72 -O3 -dp" } */
> +/* The mitigation applies to a72 by default, and protects the CRYPTO_AES
> +   inputs, such as the explicit xor ops, from being combined like test used 
> to
> +   expect.  */
> +/* { dg-additional-options "-mno-fix-cortex-a57-aes-1742098" } */

Actually the -mcpu=cortex-a72 here is significant only in that it's one of the 
CPUs that enables AES/AESMC fusion.
So rather than overriding this awkward part with 
-mno-fix-cortex-a57-aes-1742098 I'd rather just select a different
CPU that enables that fusion and isn't afflicted by this workaround, such as 
-mcpu=cortex-a53.
More broadly, I think we should be enabling tune_params::FUSE_AES_AESMC for the 
generic target in A profile, but that would be a non-testsuite change.

Ok with changing the -mcpu option instead.
Thanks,
Kyrill

> 
>  #include 
> 
> diff --git a/gcc/testsuite/gcc.target/arm/aes-fuse-2.c
> b/gcc/testsuite/gcc.target/arm/aes-fuse-2.c
> index 1266a28753169..b72479c0e5726 100644
> --- a/gcc/testsuite/gcc.target/arm/aes-fuse-2.c
> +++ b/gcc/testsuite/gcc.target/arm/aes-fuse-2.c
> @@ -2,6 +2,10 @@
>  /* { dg-require-effective-target arm_crypto_ok } */
>  /* { dg-add-options arm_crypto } */
>  /* { dg-additional-options "-mcpu=cortex-a72 -O3 -dp" } */
> +/* The mitigation applies to a72 by default, and protects the CRYPTO_AES
> +   inputs, such as the explicit xor ops, from being combined like test used 
> to
> +   expect.  */
> +/* { dg-additional-options "-mno-fix-cortex-a57-aes-1742098" } */
> 
>  #include 
> 
> 
> --
> Alexandre Oliva, happy hackerhttps://FSFLA.org/blogs/lxo/
>Free Software Activist   GNU Toolchain Engineer
> Disinformation flourishes because many people care deeply about injustice
> but very few check the facts.  Ask me about 


[committed] RISC-V: prefetch.* only take base register with zero-offset for the address

2023-02-20 Thread Kito Cheng via Gcc-patches
Caught by running gcc.c-torture/execute/builtin-prefetch-2.c with
-march=rv64gc_zicbop.

gcc/ChangeLog:

* config/riscv/riscv.md (prefetch): Use r instead of p for the
address operand.
(riscv_prefetchi_): Ditto.
---
 gcc/config/riscv/riscv.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 487059ebe97..a5507fadc2d 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -3066,7 +3066,7 @@ (define_insn "riscv_zero_"
 )
 
 (define_insn "prefetch"
-  [(prefetch (match_operand 0 "address_operand" "p")
+  [(prefetch (match_operand 0 "address_operand" "r")
  (match_operand 1 "imm5_operand" "i")
  (match_operand 2 "const_int_operand" "n"))]
   "TARGET_ZICBOP"
@@ -3080,7 +3080,7 @@ (define_insn "prefetch"
 })
 
 (define_insn "riscv_prefetchi_"
-  [(unspec_volatile:X [(match_operand:X 0 "address_operand" "p")
+  [(unspec_volatile:X [(match_operand:X 0 "address_operand" "r")
   (match_operand:X 1 "imm5_operand" "i")]
   UNSPECV_PREI)]
   "TARGET_ZICBOP"
-- 
2.37.2
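
As an illustration of the kind of source that reaches these patterns, here is a small sketch using the generic __builtin_prefetch builtin with a base-plus-offset address. The function name and the prefetch distance are assumptions of this sketch. With the constraint change above, such an address would presumably be computed into a register before the prefetch instruction is emitted.

```cpp
#include <cassert>

// Sums a[0..n-1], prefetching a few elements ahead for reading.
long sum_with_prefetch (const int *a, int n)
{
  assert (n >= 0);
  const int ahead = 8;   // arbitrary prefetch distance for this sketch
  long sum = 0;
  for (int i = 0; i < n; i++)
    {
      // Only form addresses that stay inside the array.
      if (i + ahead < n)
	__builtin_prefetch (a + i + ahead, /* rw */ 0, /* locality */ 3);
      sum += a[i];
    }
  return sum;
}
```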



Re: [PATCH] Fixing PR107411

2023-02-20 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 20, 2023 at 03:04:51PM +, Qing Zhao via Gcc-patches wrote:
> 
> 
> > On Feb 17, 2023, at 5:35 PM, Jakub Jelinek  wrote:
> > 
> > On Fri, Feb 17, 2023 at 10:26:03PM +, Qing Zhao via Gcc-patches wrote:
> >> +else if (!DECL_NAME (lhs_var))
> >> +  {
> >> +char *lhs_var_name_str
> >> +  = xasprintf ("D.%u", DECL_UID (lhs_var));
> > 
> > Why xasprintf?
> 
> Just emulated the code in “gimple_add_init_for_auto_var” without thinking too 
> much. -:)
> >  D.%u can be sprintfed into a fixed size automatic buffer,
> > say 3 + (HOST_BITS_PER_INT + 2) / 3 would be a good upper bound for the size
> > of the buffer.  Then you don't need to free it...
> 
> xasprintf is “like a sprintf but provided a pointer to malloc’d storage 
> (without fail)”. If the pointer is freed properly, then it should be okay, right?
> In addition to “no need to free”, what other benefit is there to using sprintf 
> rather than xasprintf?

xasprintf+free being significantly slower, exactly because it needs to
malloc and free later, where both are fairly expensive functions.
The glibc asprintf for short strings like the above uses a ~ 200 byte
static buffer, stores in there, later mallocs the needed amount of memory
and copies it there (so again, another waste because the string needs to be
copied around), while for longer it can do perhaps many allocations and
realloc at the end to the right size.
The libiberty function actually performs the printing twice, once without
writing result anywhere to compute size, then malloc, then again into the
malloced buffer.

Jakub
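
The measure-then-allocate pattern described above can be sketched as follows. This is an illustrative stand-in, not libiberty's or glibc's actual code: two_pass_asprintf is a hypothetical name, and the size is measured here with a vsnprintf call that writes nothing.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstdlib>

// Hypothetical two-pass formatter: measure, allocate, then format again.
char *two_pass_asprintf (const char *fmt, ...)
{
  assert (fmt != nullptr);
  va_list ap, ap2;
  va_start (ap, fmt);
  va_copy (ap2, ap);
  // Pass 1: compute the length without writing the result anywhere.
  int len = std::vsnprintf (nullptr, 0, fmt, ap);
  va_end (ap);
  char *buf = nullptr;
  if (len >= 0)
    {
      buf = static_cast<char *> (std::malloc (len + 1));
      if (buf)
	// Pass 2: format again, this time into the allocated buffer.
	std::vsnprintf (buf, len + 1, fmt, ap2);
    }
  va_end (ap2);
  return buf;   // the caller must free () the result
}
```

Both the second formatting pass and the malloc/free pair are overhead that a fixed-size automatic buffer avoids for short strings such as "D.%u".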



Re: [PATCH] Fixing PR107411

2023-02-20 Thread Qing Zhao via Gcc-patches


> On Feb 17, 2023, at 5:35 PM, Jakub Jelinek  wrote:
> 
> On Fri, Feb 17, 2023 at 10:26:03PM +, Qing Zhao via Gcc-patches wrote:
>> +  else if (!DECL_NAME (lhs_var))
>> +{
>> +  char *lhs_var_name_str
>> += xasprintf ("D.%u", DECL_UID (lhs_var));
> 
> Why xasprintf?

Just emulated the code in “gimple_add_init_for_auto_var” without thinking too 
much. -:)
>  D.%u can be sprintfed into a fixed size automatic buffer,
> say 3 + (HOST_BITS_PER_INT + 2) / 3 would be a good upper bound for the size
> of the buffer.  Then you don't need to free it...

xasprintf is “like a sprintf but provided a pointer to malloc’d storage 
(without fail)”. If the pointer is freed properly, then it should be okay, right?
In addition to “no need to free”, what other benefit is there to using sprintf 
rather than xasprintf?

Qing
> 
>> +  if (strcmp (lhs_var_name_str, var_name_str) == 0)
>> +{
>> +  free (lhs_var_name_str);
>> +  return;
>> +}
>> +  free (lhs_var_name_str);
>> +}
>> +}
>>gcc_assert (var_name_str && var_def_stmt);
>>  }
>> }
>> -- 
>> 2.31.1
> 
>   Jakub



Prototype 'GOMP_enable_pinned_mode' (was: [PATCH 08/17] openmp: -foffload-memory=pinned)

2023-02-20 Thread Thomas Schwinge
Hi!

On 2022-07-07T23:18:03+0100, Andrew Stubbs  wrote:
> On 07/07/2022 12:54, Tobias Burnus wrote:
>> On 07.07.22 12:34, Andrew Stubbs wrote:
>>> Implement the -foffload-memory=pinned option such that libgomp is
>>> instructed to enable fully-pinned memory at start-up.  The option is
>>> intended to provide a performance boost to certain offload programs
>>> without
>>> modifying the code.
>> ...
>>> gcc/ChangeLog:
>>>
>>> * omp-builtins.def (BUILT_IN_GOMP_ENABLE_PINNED_MODE): New.
>>> * omp-low.cc (omp_enable_pinned_mode): New function.
>>> (execute_lower_omp): Call omp_enable_pinned_mode.
>>>
>>> libgomp/ChangeLog:
>>>
>>> * config/linux/allocator.c (always_pinned_mode): New variable.
>>> (GOMP_enable_pinned_mode): New function.
>>> (linux_memspace_alloc): Disable pinning when always_pinned_mode set.
>>> (linux_memspace_calloc): Likewise.
>>> (linux_memspace_free): Likewise.
>>> (linux_memspace_realloc): Likewise.
>>> * libgomp.map: Add GOMP_enable_pinned_mode.
>>> * testsuite/libgomp.c/alloc-pinned-7.c: New test.
>>> ...
>> ...
>>> --- a/gcc/omp-low.cc
>>> +++ b/gcc/omp-low.cc
>>> @@ -14620,6 +14620,68 @@ lower_omp (gimple_seq *body, omp_context *ctx)
>>> input_location = saved_location;
>>>   }
>>> +/* Emit a constructor function to enable -foffload-memory=pinned
>>> +   at runtime.  Libgomp handles the OS mode setting, but we need to
>>> trigger
>>> +   it by calling GOMP_enable_pinned mode before the program proper
>>> runs.  */
>>> +
>>> +static void
>>> +omp_enable_pinned_mode ()
>>
>> Is there a reason not to use the mechanism of OpenMP's 'requires'
>> directive for this?

I agree.  (But I'm not working on that, for avoidance of doubt.)

>> (Okay, I have to admit that the final patch was only committed on
>> Monday. But still ...)
>
> Possibly, I had most of this done before then. I'll have a look next
> time I visit this patch.

Until then, let's at least document/verify 'GOMP_enable_pinned_mode';
I've pushed to devel/omp/gcc-12
commit 9657d906869e098340c23118c2eb8592d9e77ac5
"Prototype 'GOMP_enable_pinned_mode'", see attached.


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 9657d906869e098340c23118c2eb8592d9e77ac5 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 20 Feb 2023 15:29:44 +0100
Subject: [PATCH] Prototype 'GOMP_enable_pinned_mode'

Fix-up for og12 commit 842df187487f5b16ae29bbe7e9acd79661a9df48
"openmp: -foffload-memory=pinned".  No functional change.

	libgomp/
	* libgomp_g.h (GOMP_enable_pinned_mode): New.
---
 libgomp/ChangeLog.omp | 2 ++
 libgomp/libgomp_g.h   | 1 +
 2 files changed, 3 insertions(+)

diff --git a/libgomp/ChangeLog.omp b/libgomp/ChangeLog.omp
index c5a7860478e..e4475093055 100644
--- a/libgomp/ChangeLog.omp
+++ b/libgomp/ChangeLog.omp
@@ -1,5 +1,7 @@
 2023-02-20  Thomas Schwinge  
 
+	* libgomp_g.h (GOMP_enable_pinned_mode): New.
+
 	* config/linux/allocator.c (linux_memspace_alloc): Add 'init0'
 	formal parameter.  Adjust all users.
 	(linux_memspace_alloc, linux_memspace_free): Attempt to allocate
diff --git a/libgomp/libgomp_g.h b/libgomp/libgomp_g.h
index ece1f97a61f..fe66a53d94a 100644
--- a/libgomp/libgomp_g.h
+++ b/libgomp/libgomp_g.h
@@ -375,6 +375,7 @@ extern void GOMP_teams_reg (void (*) (void *), void *, unsigned, unsigned,
 
 extern void *GOMP_alloc (size_t, size_t, uintptr_t);
 extern void GOMP_free (void *, uintptr_t);
+extern void GOMP_enable_pinned_mode (void);
 
 /* error.c */
 
-- 
2.25.1
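
For context, the quoted omp-low.cc comment says the compiler emits a constructor function that calls GOMP_enable_pinned_mode before the program proper runs. A rough, self-contained sketch of that shape follows. The libgomp entry point is replaced by a hypothetical stub (stub_enable_pinned_mode) so the example links without the devel/omp/gcc-12 libgomp, and the flag and accessor are made up for illustration.

```cpp
#include <cassert>

static bool pinned_mode_enabled = false;

// Hypothetical stand-in for libgomp's GOMP_enable_pinned_mode; the real
// function is what the prototype added above declares.
extern "C" void stub_enable_pinned_mode (void)
{
  assert (!pinned_mode_enabled);   // expected to be called only once
  pinned_mode_enabled = true;
}

// Runs before main, in the manner of the compiler-emitted constructor.
__attribute__ ((constructor))
static void enable_pinned_mode_ctor (void)
{
  stub_enable_pinned_mode ();
}

bool pinned_mode_is_enabled () { return pinned_mode_enabled; }
```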



Re: [PATCH] libstdc++: Update baseline symbols for riscv64-linux

2023-02-20 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Feb 2023 at 12:10, Andreas Schwab via Libstdc++
 wrote:
>
> libstdc++-v3/
> * config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: Update.


Looks good, thanks.



[og12] Attempt to not just register but allocate OpenMP pinned memory using a device (was: [og12] Attempt to register OpenMP pinned memory using a device instead of 'mlock')

2023-02-20 Thread Thomas Schwinge
Hi!

On 2023-02-20T09:48:53+, Andrew Stubbs  wrote:
> On 17/02/2023 08:12, Thomas Schwinge wrote:
>> On 2023-02-16T23:06:44+0100, I wrote:
>>> On 2023-02-16T16:17:32+, "Stubbs, Andrew via Gcc-patches" 
>>>  wrote:
 The mmap implementation was not optimized for a lot of small allocations, 
 and I can't see that issue changing here
>>>
>>> That's correct, 'mmap' remains.  Under the hood, 'cuMemHostRegister' must
>>> surely also be doing some 'mlock'-like thing, so I figured it's best to
>>> feed page-boundary memory regions to it, which 'mmap' gets us.
>>>
 so I don't know if this can be used for mlockall replacement.

 I had assumed that using the Cuda allocator would fix that limitation.
>>>
>>>  From what I've read (but no first-hand experiments), there's non-trivial
>>> overhead with 'cuMemHostRegister' (just like with 'mlock'), so routing
>>> all small allocations individually through it probably isn't a good idea
>>> either.  Therefore, I suppose, we'll indeed want to use some local
>>> allocator if we wish this "optimized for a lot of small allocations".
>>
>> Eh, I suppose your point indirectly was that instead of 'mmap' plus
>> 'cuMemHostRegister' we ought to use 'cuMemAllocHost'/'cuMemHostAlloc', as
>> we assume those already do implement such a local allocator.  Let me
>> quickly change that indeed -- we don't currently have a need to use
>> 'cuMemHostRegister' instead of 'cuMemAllocHost'/'cuMemHostAlloc'.
>
> Yes, that's right. I suppose it makes sense to register memory we
> already have, but if we want new memory then trying to reinvent what
> happens inside cuMemAllocHost is pointless.

I've pushed to devel/omp/gcc-12 branch
commit 4bd844f3e0202b3d083f0784f4343570c88bb86c
"Attempt to not just register but allocate OpenMP pinned memory using a device",
see attached.
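
As a point of reference for the discussion above, the page-granular "map, then lock" approach can be sketched with plain POSIX calls. This is not the libgomp code: mlock stands in for the device registration step, and PinnedRegion, pinned_alloc and pinned_free are names invented for this sketch. Every allocation pays for a mapping of at least one page, which is why it is a poor fit for many small allocations.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/mman.h>

struct PinnedRegion {
  void *addr;
  std::size_t size;
  bool locked;
};

// Map a fresh page-aligned region, then try to lock it into memory.
PinnedRegion pinned_alloc (std::size_t size)
{
  assert (size > 0);
  void *p = mmap (nullptr, size, PROT_READ | PROT_WRITE,
		  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return { nullptr, 0, false };
  // Locking may fail, for example when RLIMIT_MEMLOCK is too low.
  bool locked = mlock (p, size) == 0;
  return { p, size, locked };
}

void pinned_free (PinnedRegion r)
{
  if (r.addr)
    munmap (r.addr, r.size);   // unmapping also drops the lock
}
```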


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 4bd844f3e0202b3d083f0784f4343570c88bb86c Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 20 Feb 2023 14:44:43 +0100
Subject: [PATCH] Attempt to not just register but allocate OpenMP pinned
 memory using a device

... instead of 'mmap' plus attempting to register using a device.

Implemented for nvptx offloading via 'cuMemHostAlloc'.

This re-works og12 commit a5a4800e92773da7126c00a9c79b172494d58ab5
"Attempt to register OpenMP pinned memory using a device instead of 'mlock'".

	include/
	* cuda/cuda.h (cuMemHostRegister, cuMemHostUnregister): Remove.
	libgomp/
	* config/linux/allocator.c (linux_memspace_alloc): Add 'init0'
	formal parameter.  Adjust all users.
	(linux_memspace_alloc, linux_memspace_free): Attempt to allocate
	OpenMP pinned memory using a device instead of 'mmap' plus
	attempting to register using a device.
	* libgomp-plugin.h (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): Remove.
	(GOMP_OFFLOAD_page_locked_host_alloc)
	(GOMP_OFFLOAD_page_locked_host_free): New.
	* libgomp.h (gomp_register_page_locked)
	(gomp_unregister_page_locked): Remove.
	(gomp_page_locked_host_alloc, gomp_page_locked_host_free): New.
	(struct gomp_device_descr): Remove 'register_page_locked_func',
	'unregister_page_locked_func'.  Add 'page_locked_host_alloc_func',
	'page_locked_host_free_func'.
	* plugin/cuda-lib.def (cuMemHostRegister_v2, cuMemHostRegister)
	(cuMemHostUnregister): Remove.
	* plugin/plugin-nvptx.c (GOMP_OFFLOAD_register_page_locked)
	(GOMP_OFFLOAD_unregister_page_locked): Remove.
	(GOMP_OFFLOAD_page_locked_host_alloc)
	(GOMP_OFFLOAD_page_locked_host_free): New.
	* target.c (gomp_register_page_locked)
	(gomp_unregister_page_locked): Remove.
	(gomp_page_locked_host_alloc, gomp_page_locked_host_free): Add.
	(gomp_load_plugin_for_device): Don't handle
	'register_page_locked', 'unregister_page_locked'.  Handle
	'page_locked_host_alloc', 'page_locked_host_free'.

Suggested-by: Andrew Stubbs 
---
 include/cuda/cuda.h  |  3 --
 libgomp/config/linux/allocator.c | 85 ++--
 libgomp/libgomp-plugin.h |  4 +-
 libgomp/libgomp.h|  8 +--
 libgomp/plugin/cuda-lib.def  |  3 --
 libgomp/plugin/plugin-nvptx.c| 33 +++--
 libgomp/target.c | 49 +-
 7 files changed, 98 insertions(+), 87 deletions(-)

diff --git a/include/cuda/cuda.h b/include/cuda/cuda.h
index b0c7636d318..062d394b95f 100644
--- a/include/cuda/cuda.h
+++ b/include/cuda/cuda.h
@@ -183,9 +183,6 @@ CUresult cuMemAlloc (CUdeviceptr *, size_t);
 CUresult cuMemAllocHost (void **, size_t);
 CUresult cuMemAllocManaged(CUdeviceptr *, size_t, unsigned int);
 CUresult cuMemHostAlloc (void **, size_t, unsigned int);
-#define cuMemHostRegister cuMemHostRegister_v2
-CUresult cuMemHostRegister(void *, size_t, unsigned int);
-CUresult 

Rust: Don't depend on unused 'target-libffi', 'target-libbacktrace' (was: [PATCH Rust front-end v2 32/37] gccrs: Add config-lang.in)

2023-02-20 Thread Thomas Schwinge
Hi!

On 2022-08-24T12:59:51+0100, herron.phi...@googlemail.com wrote:
> From: Philip Herron 
>
> This was a copy paste from gccgo front-end, we do not use any of the
> target_libs yet but we will need these when we support the libpanic crate.

> --- /dev/null
> +++ b/gcc/rust/config-lang.in

> +target_libs="target-libffi target-libbacktrace"

(By the way, this setting of 'target_libs' was not present in the v1

"[PATCH Rust front-end v1 1/4] Add skeleton Rust front-end folder".)

So there's the issue that not all GCC target configurations support
building those libraries.  Given that they're indeed unused, would it be OK
to push the attached
"Rust: Don't depend on unused 'target-libffi', 'target-libbacktrace'"?
(..., and once we get to the point where we'd like to use libffi and/or
libbacktrace, then think about how to handle those GCC target
configurations.)


Grüße
 Thomas


-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955
From 5d85939a3e3ebcfcf3f2ac9d3f2e01cbb1736578 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 20 Feb 2023 13:01:50 +0100
Subject: [PATCH] Rust: Don't depend on unused 'target-libffi',
 'target-libbacktrace'

For example:

configure: error: "libffi has not been ported to nvptx-unknown-none."

Follow-up to commit a75f038c069cc3a23b214854bedf04321fe88bc5
"gccrs: Add config-lang.in", which said:

> This was a copy/paste from gccgo front-end. We do not use any of the
> target_libs yet, [...]

	gcc/rust/
	* config-lang.in (target_libs): Remove.
---
 gcc/rust/config-lang.in | 2 --
 1 file changed, 2 deletions(-)

diff --git a/gcc/rust/config-lang.in b/gcc/rust/config-lang.in
index 89055be5cd4..aac66c9b962 100644
--- a/gcc/rust/config-lang.in
+++ b/gcc/rust/config-lang.in
@@ -29,6 +29,4 @@ compilers="rust1\$(exeext)"
 
 build_by_default="no"
 
-target_libs="target-libffi target-libbacktrace"
-
 gtfiles="\$(srcdir)/rust/rust-lang.cc"
-- 
2.25.1



Re: RISC-V: Add divmod instruction support

2023-02-20 Thread Alexander Monakov via Gcc-patches


On Mon, 20 Feb 2023, Richard Biener via Gcc-patches wrote:

> On Sun, Feb 19, 2023 at 2:15 AM Maciej W. Rozycki  wrote:
> >
> > > The problem is you don't see it as a divmod in expand_divmod unless you
> > > expose a divmod optab.  See tree-ssa-mathopts.cc's divmod handling.
> >
> >  That's the kind of stuff I'd expect to happen at the tree level though,
> > before expand.
> 
> The GIMPLE pass forming divmod could indeed choose to emit the
> div + mul/sub sequence instead if an actual divmod pattern isn't available.
> It could even generate some fake mul/sub/mod RTXen to cost the two
> variants against each other but I seriously doubt any uarch that implements
> division/modulo has a slower mul/sub.

Making a correct decision requires knowing to which degree the divider is
pipelined, and costs won't properly reflect that. If the divider accepts
a new div/mod instruction every couple of cycles, it's faster to just issue
a div followed by a mod with the same operands.

Therefore I think in this case it's fair for GIMPLE level to just check if
the divmod pattern is available, and let the target do the fine tuning via
the divmod expander.

It would make sense for tree-ssa-mathopts to emit div + mul/sub when neither
'divmod' nor 'mod' patterns are available, because RTL expansion will do the
same, just later, and we'll rely on RTL CSE to clean up the redundant div.
But RISC-V has both 'div' and 'mod', so as I tried to explain in the first
paragraph we should let the target decide.
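
For reference, the two sequences being weighed above can be written out in C. This is only a minimal sketch: the struct and function names are made up for illustration and are not GCC internals, and an optimizing compiler is free to turn either form into the other.

```c
/* Two ways to obtain both the quotient and the remainder of a / b.
   All names here are illustrative only.  */

struct qr
{
  long quot;
  long rem;
};

/* Variant 1: a division followed by a modulo with the same operands.
   On a target with separate 'div' and 'mod' patterns this can map to
   one instruction of each kind.  */
struct qr
divmod_div_then_mod (long a, long b)
{
  struct qr r = { a / b, a % b };
  return r;
}

/* Variant 2: a single division, then the remainder is recovered with
   a multiply and a subtract.  */
struct qr
divmod_div_mul_sub (long a, long b)
{
  struct qr r;
  r.quot = a / b;
  r.rem = a - r.quot * b;
  return r;
}
```

Both variants return the same values for any nonzero b where the division does not overflow; which one is faster depends on how well the divider is pipelined, which is the point made above.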

Alexander


Re: [PATCH] [arm] adjust tests for quotes around +cdecp

2023-02-20 Thread Christophe Lyon via Gcc-patches

Hi Alexandre,


On 2/17/23 08:17, Alexandre Oliva via Gcc-patches wrote:


Back when quotes were added around "+cdecp" in the "coproc must be
a constant immediate" error in arm-builtins.cc, tests for that message
lagged behind.  Fixed thusly.

Regstrapped on x86_64-linux-gnu.
Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?



It seems this changed with r12-6553-gc3782843badbf3, right?
I see this commit added quotes in several other places: are the two 
tests you fix the only ones impacted?


Thanks,

Christophe


for  gcc/testsuite/ChangeLog

* gcc.target/arm/acle/cde-errors.c: Adjust messages for quote
around +cdecp.
* gcc.target/arm/acle/cde-mve-error-2.c: Likewise.
---
  gcc/testsuite/gcc.target/arm/acle/cde-errors.c |   52 ++---
  .../gcc.target/arm/acle/cde-mve-error-2.c  |   82 ++--
  2 files changed, 67 insertions(+), 67 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/acle/cde-errors.c 
b/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
index 85a91666cd5ef..f38514848677e 100644
--- a/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
+++ b/gcc/testsuite/gcc.target/arm/acle/cde-errors.c
@@ -47,19 +47,19 @@ uint64_t test_cde (uint32_t n, uint32_t m)
accum += __arm_cx3da (7, accum, n, m,   0); /* { dg-error 
{coprocessor 7 is not enabled with \+cdecp7} } */
  
/* `coproc` out of range.  */

-  accum += __arm_cx1   (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx1a  (8, (uint32_t)accum,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2   (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2a  (8, (uint32_t)accum, n,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3   (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3a  (8, (uint32_t)accum, n, m, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-
-  accum += __arm_cx1d  (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx1da (8, accum, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2d  (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx2da (8, accum, n,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3d  (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
-  accum += __arm_cx3da (8, accum, n, m,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with \+cdecp} } */
+  accum += __arm_cx1   (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx1a  (8, (uint32_t)accum,   0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2   (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2a  (8, (uint32_t)accum, n,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3   (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3a  (8, (uint32_t)accum, n, m, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+
+  accum += __arm_cx1d  (8,0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx1da (8, accum, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2d  (8, n, 0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx2da (8, accum, n,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3d  (8, n, m,  0); /* { dg-error {coproc must be 
a constant immediate in range \[0-7\] enabled with .\+cdecp.} } */
+  accum += __arm_cx3da (8, accum, n, m,   0); /* { dg-error {coproc must 

[PATCH] tree-optimization/108793 - niter compute type mismatch

2023-02-20 Thread Richard Biener via Gcc-patches
When computing the number of iterations until wrap, types are mixed up,
eventually leading to checking ICEs with a pointer bitwise inversion.
The following uses niter_type for the calculation.
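
As a rough illustration of the arithmetic involved, here is a minimal C sketch modelled on the "(delta + step - 1) / step" comment in the patch below. It assumes an 8-bit unsigned counter and, as a simplification, performs every operation in one wider unsigned type; the function name is hypothetical and the code is not taken from tree-ssa-loop-niter.cc.

```c
#include <stdint.h>

/* Number of increments by 'step' (which must be nonzero) that an 8-bit
   unsigned counter starting at 'base' undergoes up to and including
   the increment that wraps past UINT8_MAX.  Every operand is converted
   to the same unsigned type before the subtraction, addition and
   division are carried out.  */
unsigned int
steps_until_wrap_u8 (uint8_t base, uint8_t step)
{
  unsigned int max = UINT8_MAX;
  unsigned int num = max - (unsigned int) base; /* max - base */
  unsigned int s = step;

  /* With delta = max - base + 1, this is (delta + step - 1) / step.  */
  return (num + s) / s;
}
```

For example, a counter starting at 250 with step 3 takes the values 250 and 253 and wraps on the second increment, so the function returns 2.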

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108793
* tree-ssa-loop-niter.cc (number_of_iterations_until_wrap):
Convert operands to niter_type when computing num.

* gcc.dg/torture/pr108793.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108793.c | 10 ++
 gcc/tree-ssa-loop-niter.cc  | 11 ++-
 2 files changed, 16 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108793.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108793.c 
b/gcc/testsuite/gcc.dg/torture/pr108793.c
new file mode 100644
index 000..83973eb05d9
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108793.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+
+typedef int *p;
+extern p a[], b[];
+int f () {
+  int n = 0;
+  for (p* i = &a[0]; i > &b[0]; i++)
+n++;
+  return n;
+}
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 1ce5e736ce3..dc4c7a418f6 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -1494,8 +1494,9 @@ number_of_iterations_until_wrap (class loop *loop, tree 
type, affine_iv *iv0,
   if (integer_zerop (assumptions))
return false;
 
-  num = fold_build2 (MINUS_EXPR, niter_type, wide_int_to_tree (type, max),
-iv1->base);
+  num = fold_build2 (MINUS_EXPR, niter_type,
+wide_int_to_tree (niter_type, max),
+fold_convert (niter_type, iv1->base));
 
   /* When base has the form iv + 1, if we know iv >= n, then iv + 1 < n
 only when iv + 1 overflows, i.e. when iv == TYPE_VALUE_MAX.  */
@@ -1531,8 +1532,9 @@ number_of_iterations_until_wrap (class loop *loop, tree 
type, affine_iv *iv0,
   if (integer_zerop (assumptions))
return false;
 
-  num = fold_build2 (MINUS_EXPR, niter_type, iv0->base,
-wide_int_to_tree (type, min));
+  num = fold_build2 (MINUS_EXPR, niter_type,
+fold_convert (niter_type, iv0->base),
+wide_int_to_tree (niter_type, min));
   low = min;
   if (TREE_CODE (iv0->base) == INTEGER_CST)
high = wi::to_wide (iv0->base) + 1;
@@ -1546,7 +1548,6 @@ number_of_iterations_until_wrap (class loop *loop, tree 
type, affine_iv *iv0,
 
   /* (delta + step - 1) / step */
   step = fold_convert (niter_type, step);
-  num = fold_convert (niter_type, num);
   num = fold_build2 (PLUS_EXPR, niter_type, num, step);
   niter->niter = fold_build2 (FLOOR_DIV_EXPR, niter_type, num, step);
 
-- 
2.35.3


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Gcc-patches
On Mon, Feb 20, 2023 at 12:57 PM Jakub Jelinek  wrote:
>
> On Mon, Feb 20, 2023 at 12:48:38PM +0100, Tobias Burnus wrote:
> > On 20.02.23 12:15, Jakub Jelinek wrote:
> > > On Mon, Feb 20, 2023 at 12:07:43PM +0100, Tobias Burnus wrote:
> > > > As mentioned in the TODO for 'deferred', I think we really want
> > > > to have NULL as upper value for the domain for the type, but that
> > > > requires literally hundred of changes to the compiler, which
> > > > I do not want to due during Stage 4, but that are eventually
> > > > required.* — In any case, this patch fixes some of the issues
> > > > in the meanwhile.
> > > Yeah, the actual len can be in some type's lang_specific member.
> >
> > Actually, I think it should be bound to the DECL and not to the TYPE,
> > i.e. lang_decl not type_lang.
> >
> > I just see that, the latter already has a 'tree stringlen' (for I/O)
> > which probably could be reused for this purpose.
>
> I'd drop the
>  && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR
> and assert == SAVE_EXPR part, with SAVE_EXPRs one never knows if they
> are added around the whole expression or say some subexpression has
> it and then some trivial arithmetics happens on the SAVE_EXPR tree.
>
> > > Anyway, for the patch for now, I'd probably instead of stripping
> > > SAVE_EXPR overwrite the 2 sizes with newly built expressions.
> >
> > What I now did. (Unchanged otherwise, except that I now also mention
> > GFC_DECL_STRING_LEN in the TODO.)
> >
> > OK for mainline?
>
> If Richard doesn't object.

 tree
-gfc_get_character_type_len_for_eltype (tree eltype, tree len)
+gfc_get_character_type_len_for_eltype (tree eltype, tree len, bool deferred)
 {
   tree bounds, type;

   bounds = build_range_type (gfc_charlen_type_node, gfc_index_one_node, len);
   type = build_array_type (eltype, bounds);
   TYPE_STRING_FLAG (type) = 1;
-
+  if (len && deferred && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR)
+{
+  /* TODO: A more middle-end friendly alternative would be to use NULL_TREE
+as upper bound and store the value, e.g. as GFC_DECL_STRING_LEN.
+Caveat: this requires some cleanup throughout the code to consistently
+use some wrapper function.  */
+  gcc_assert (TREE_CODE (TYPE_SIZE_UNIT (type)) == SAVE_EXPR);
+  tree tmp = TREE_TYPE (TYPE_SIZE (eltype));

...

you are probably breaking type sharing here.  You could use
build_array_type_1 and pass false for 'shared' to get around that.  Note
that there's also canonical type building done in case 'eltype' is not
canonical itself.

The solution to the actual problem is a hack - you are relying on
re-evaluation of TYPE_SIZE, and for that, only from within accesses
from inside the frontend?  Since gimplification will produce the result
into a single temporary again, re-storing the "breakage".

So, does it _really_ fix things?

Richard.


>
> Jakub
>


[PATCH] libstdc++: Update baseline symbols for riscv64-linux

2023-02-20 Thread Andreas Schwab via Gcc-patches
libstdc++-v3/
* config/abi/post/riscv64-linux-gnu/baseline_symbols.txt: Update.
---
 .../riscv64-linux-gnu/baseline_symbols.txt| 98 ++-
 1 file changed, 97 insertions(+), 1 deletion(-)

diff --git 
a/libstdc++-v3/config/abi/post/riscv64-linux-gnu/baseline_symbols.txt 
b/libstdc++-v3/config/abi/post/riscv64-linux-gnu/baseline_symbols.txt
index 6e5da521255..876565bfa54 100644
--- a/libstdc++-v3/config/abi/post/riscv64-linux-gnu/baseline_symbols.txt
+++ b/libstdc++-v3/config/abi/post/riscv64-linux-gnu/baseline_symbols.txt
@@ -475,6 +475,7 @@ FUNC:_ZNKSt10moneypunctIwLb1EE8groupingEv@@GLIBCXX_3.4
 FUNC:_ZNKSt10ostrstream5rdbufEv@@GLIBCXX_3.4
 FUNC:_ZNKSt10ostrstream6pcountEv@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIcE15_M_am_pm_formatEPKc@@GLIBCXX_3.4
+FUNC:_ZNKSt11__timepunctIcE15_M_am_pm_formatEPPKc@@GLIBCXX_3.4.30
 FUNC:_ZNKSt11__timepunctIcE15_M_date_formatsEPPKc@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIcE15_M_time_formatsEPPKc@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIcE19_M_days_abbreviatedEPPKc@@GLIBCXX_3.4
@@ -485,6 +486,7 @@ FUNC:_ZNKSt11__timepunctIcE7_M_daysEPPKc@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIcE8_M_am_pmEPPKc@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIcE9_M_monthsEPPKc@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIwE15_M_am_pm_formatEPKw@@GLIBCXX_3.4
+FUNC:_ZNKSt11__timepunctIwE15_M_am_pm_formatEPPKw@@GLIBCXX_3.4.30
 FUNC:_ZNKSt11__timepunctIwE15_M_date_formatsEPPKw@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIwE15_M_time_formatsEPPKw@@GLIBCXX_3.4
 FUNC:_ZNKSt11__timepunctIwE19_M_days_abbreviatedEPPKw@@GLIBCXX_3.4
@@ -666,6 +668,13 @@ FUNC:_ZNKSt5ctypeIwE8do_widenEPKcS2_Pw@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE8do_widenEc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEPKwS2_cPc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEwc@@GLIBCXX_3.4
+FUNC:_ZNKSt6chrono4tzdb11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono4tzdb12current_zoneEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone15_M_get_sys_infoENS_10time_pointINS_3_V212system_clockENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone17_M_get_local_infoENS_10time_pointINS_7local_tENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list14const_iteratordeEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5beginEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5frontEv@@GLIBCXX_3.4.31
 FUNC:_ZNKSt6locale2id5_M_idEv@@GLIBCXX_3.4
 FUNC:_ZNKSt6locale4nameB5cxx11Ev@@GLIBCXX_3.4.21
 FUNC:_ZNKSt6locale4nameEv@@GLIBCXX_3.4
@@ -954,6 +963,7 @@ 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE14do_
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE15_M_extract_nameES4_S4_RiPPKcmRSt8ios_baseRSt12_Ios_Iostate@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE16do_get_monthnameES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tm@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE21_M_extract_via_formatES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKc@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE21_M_extract_via_formatES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKcRSt16__time_get_state@@GLIBCXX_3.4.30
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE24_M_extract_wday_or_monthES4_S4_RiPPKcmRSt8ios_baseRSt12_Ios_Iostate@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE3getES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKcSD_@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEEE3getES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmcc@@GLIBCXX_3.4.21
@@ -973,6 +983,7 @@ 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE14do_
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE15_M_extract_nameES4_S4_RiPPKwmRSt8ios_baseRSt12_Ios_Iostate@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE16do_get_monthnameES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tm@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE21_M_extract_via_formatES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKw@@GLIBCXX_3.4.21
+FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE21_M_extract_via_formatES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKwRSt16__time_get_state@@GLIBCXX_3.4.30
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE24_M_extract_wday_or_monthES4_S4_RiPPKwmRSt8ios_baseRSt12_Ios_Iostate@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE3getES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmPKwSD_@@GLIBCXX_3.4.21
 
FUNC:_ZNKSt7__cxx118time_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEEE3getES4_S4_RSt8ios_baseRSt12_Ios_IostateP2tmcc@@GLIBCXX_3.4.21
@@ -1225,6 +1236,7 @@ 

Re: [PATCH] libstdc++: Add missing functions to <cmath> [PR79700]

2023-02-20 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Feb 2023 at 11:57, Nathaniel Shead  wrote:
>
> On Mon, Feb 20, 2023 at 10:30 PM Jonathan Wakely  wrote:
> >
> > On Mon, 20 Feb 2023 at 11:23, Nathaniel Shead via Libstdc++
> >  wrote:
> > >
> > > The comments on PR79700 mentioned that it was somewhat ambiguous whether
> > > these functions were supposed to exist for C++11 or not. I chose to add
> > > them there, since other resources (such as cppreference) seem to think
> > > > that C++11 should be the standard in which these functions were introduced, and I
> > > don't know of any reason to do otherwise.
> > >
> > > Tested on x86_64-linux.
> >
> > Thanks for the patch, but this needs tests for the new declarations
> > (which are tedious to write, which is the main reason I haven't
> > already pushed my own very similar patch).
> >
>
> Ah OK, fair enough. Where should the tests go? The only tests I could
> find for the existing (non -f/l) functions was just tests for their
> existence in testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> which I just added the new functions to - I guess I'll add a new file
> here and test that all the functions can be called and give the same
> results as the relevant overloaded variants?

Yeah, that sounds great, thanks!



Re: [PATCH] libstdc++: Add missing functions to <cmath> [PR79700]

2023-02-20 Thread Nathaniel Shead via Gcc-patches
On Mon, Feb 20, 2023 at 10:30 PM Jonathan Wakely  wrote:
>
> On Mon, 20 Feb 2023 at 11:23, Nathaniel Shead via Libstdc++
>  wrote:
> >
> > The comments on PR79700 mentioned that it was somewhat ambiguous whether
> > these functions were supposed to exist for C++11 or not. I chose to add
> > them there, since other resources (such as cppreference) seem to think
> > that C++11 should be the standard in which these functions were introduced, and I
> > don't know of any reason to do otherwise.
> >
> > Tested on x86_64-linux.
>
> Thanks for the patch, but this needs tests for the new declarations
> (which are tedious to write, which is the main reason I haven't
> already pushed my own very similar patch).
>

Ah OK, fair enough. Where should the tests go? The only tests I could
find for the existing (non -f/l) functions was just tests for their
existence in testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
which I just added the new functions to - I guess I'll add a new file
here and test that all the functions can be called and give the same
results as the relevant overloaded variants?

>
> >
> > -- 8< --
> >
> > This patch adds the -f and -l variants of the C89 <math.h> functions to
> > <cmath> under namespace std (so std::sqrtf, std::fabsl, etc.) for C++11
> > and up.
> >
> > libstdc++-v3/ChangeLog:
> >
> > PR libstdc++/79700
> > * include/c_global/cmath (acosf, acosl, asinf, asinl, atanf,
> > atanl, atan2f, atan2l, ceilf, ceill, cosf, cosl, coshf, coshl,
> > expf, expl, fabsf, fabsl, floorf, floorl, fmodf, fmodl, frexpf,
> > frexpl, ldexpf, ldexpl, logf, logl, log10f, log10l, modff,
> > modfl, powf, powl, sinf, sinl, sinhf, sinhl, sqrtf, sqrtl, tanf,
> > tanl, tanhf, tanhl): Add aliases in namespace std.
> > * testsuite/26_numerics/headers/cmath/functions_std_c++17.cc:
> > Add checks for existence of above names.
> >
> > Signed-off-by: Nathaniel Shead 
> > ---
> >  libstdc++-v3/include/c_global/cmath   | 111 ++
> >  .../headers/cmath/functions_std_c++17.cc  |  45 +++
> >  2 files changed, 156 insertions(+)
> >
> > diff --git a/libstdc++-v3/include/c_global/cmath 
> > b/libstdc++-v3/include/c_global/cmath
> > index 568eb354c2d..eaebde33dee 100644
> > --- a/libstdc++-v3/include/c_global/cmath
> > +++ b/libstdc++-v3/include/c_global/cmath
> > @@ -1767,6 +1767,117 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
> >
> >  #if __cplusplus >= 201103L
> >
> > +#undef acosf
> > +#undef acosl
> > +#undef asinf
> > +#undef asinl
> > +#undef atanf
> > +#undef atanl
> > +#undef atan2f
> > +#undef atan2l
> > +#undef ceilf
> > +#undef ceill
> > +#undef cosf
> > +#undef cosl
> > +#undef coshf
> > +#undef coshl
> > +#undef expf
> > +#undef expl
> > +#undef fabsf
> > +#undef fabsl
> > +#undef floorf
> > +#undef floorl
> > +#undef fmodf
> > +#undef fmodl
> > +#undef frexpf
> > +#undef frexpl
> > +#undef ldexpf
> > +#undef ldexpl
> > +#undef logf
> > +#undef logl
> > +#undef log10f
> > +#undef log10l
> > +#undef modff
> > +#undef modfl
> > +#undef powf
> > +#undef powl
> > +#undef sinf
> > +#undef sinl
> > +#undef sinhf
> > +#undef sinhl
> > +#undef sqrtf
> > +#undef sqrtl
> > +#undef tanf
> > +#undef tanl
> > +#undef tanhf
> > +#undef tanhl
> > +
> > +  using ::acosf;
> > +  using ::acosl;
> > +
> > +  using ::asinf;
> > +  using ::asinl;
> > +
> > +  using ::atanf;
> > +  using ::atanl;
> > +
> > +  using ::atan2f;
> > +  using ::atan2l;
> > +
> > +  using ::ceilf;
> > +  using ::ceill;
> > +
> > +  using ::cosf;
> > +  using ::cosl;
> > +
> > +  using ::coshf;
> > +  using ::coshl;
> > +
> > +  using ::expf;
> > +  using ::expl;
> > +
> > +  using ::fabsf;
> > +  using ::fabsl;
> > +
> > +  using ::floorf;
> > +  using ::floorl;
> > +
> > +  using ::fmodf;
> > +  using ::fmodl;
> > +
> > +  using ::frexpf;
> > +  using ::frexpl;
> > +
> > +  using ::ldexpf;
> > +  using ::ldexpl;
> > +
> > +  using ::logf;
> > +  using ::logl;
> > +
> > +  using ::log10f;
> > +  using ::log10l;
> > +
> > +  using ::modff;
> > +  using ::modfl;
> > +
> > +  using ::powf;
> > +  using ::powl;
> > +
> > +  using ::sinf;
> > +  using ::sinl;
> > +
> > +  using ::sinhf;
> > +  using ::sinhl;
> > +
> > +  using ::sqrtf;
> > +  using ::sqrtl;
> > +
> > +  using ::tanf;
> > +  using ::tanl;
> > +
> > +  using ::tanhf;
> > +  using ::tanhl;
> > +
> >  #ifdef _GLIBCXX_USE_C99_MATH_TR1
> >
> >  #undef acosh
> > diff --git 
> > a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc 
> > b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> > index 3b4ada1a756..c6ec636c183 100644
> > --- 
> > a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> > +++ 
> > b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> > @@ -44,6 +44,51 @@ namespace gnu
> >using std::tan;
> >using std::tanh;
> >
> > +  using std::acosf;
> > +  using std::acosl;
> > +  using std::asinf;
> > +  using std::asinl;
> > 

Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 20, 2023 at 12:48:38PM +0100, Tobias Burnus wrote:
> On 20.02.23 12:15, Jakub Jelinek wrote:
> > On Mon, Feb 20, 2023 at 12:07:43PM +0100, Tobias Burnus wrote:
> > > As mentioned in the TODO for 'deferred', I think we really want
> > > to have NULL as upper value for the domain for the type, but that
> > > requires literally hundreds of changes to the compiler, which
> > > I do not want to do during Stage 4, but that are eventually
> > > required.* — In any case, this patch fixes some of the issues
> > > in the meanwhile.
> > Yeah, the actual len can be in some type's lang_specific member.
> 
> Actually, I think it should be bound to the DECL and not to the TYPE,
> i.e. lang_decl not type_lang.
> 
> I just see that, the latter already has a 'tree stringlen' (for I/O)
> which probably could be reused for this purpose.

I'd drop the
 && TREE_CODE (TYPE_SIZE (type)) == SAVE_EXPR
and assert == SAVE_EXPR part, with SAVE_EXPRs one never knows if they
are added around the whole expression or say some subexpression has
it and then some trivial arithmetics happens on the SAVE_EXPR tree.

> > Anyway, for the patch for now, I'd probably instead of stripping
> > SAVE_EXPR overwrite the 2 sizes with newly built expressions.
> 
> What I now did. (Unchanged otherwise, except that I now also mention
> GFC_DECL_STRING_LEN in the TODO.)
> 
> OK for mainline?

If Richard doesn't object.

Jakub



[PATCH] libstdc++: Some baseline_symbols.txt updates

2023-02-20 Thread Jakub Jelinek via Gcc-patches
Hi!

This updates baseline_symbols.txt for the Fedora 39 arches.
Most of the added symbols are added to all 6 files, exceptions are
DF16_ rtti stuff (only added on x86 and aarch64 which supports those),
DF16b rtti stuff (only x86 right now), _M_replace_cold (m vs. j
differences), DF128_ charconv (only x86), GLIBCXX_IEEE128_3.4.31
symver symbols (only ppc64), GLIBCXX_LDBL_3.4.31 symver (ppc64 and s390x),
_M_get_sys_info/_M_get_local_info (l vs. x) and
  1 +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_getIcSt19istreambuf_iteratorIcSt11char_traitsIcEPKT_RKSt6locale@@GLIBCXX_3.4.31
  1 +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_getIwSt19istreambuf_iteratorIwSt11char_traitsIwEPKT_RKSt6locale@@GLIBCXX_3.4.31
  1 +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_putIcSt19ostreambuf_iteratorIcSt11char_traitsIcEPKT_RKSt6locale@@GLIBCXX_3.4.31
  1 +FUNC:_ZSt15__try_use_facetINSt19__gnu_cxx11_ieee1289money_putIwSt19ostreambuf_iteratorIwSt11char_traitsIwEPKT_RKSt6locale@@GLIBCXX_3.4.31
for those, I wonder why they aren't in GLIBCXX_IEEE128_3.4.31 symver...
I was using
grep ^+ | sed 's/OBJECT:[0-9]*:/OBJECT:/' | sort | uniq -c | sort -n | less
on the patch to analyze.

Ok for trunk?

2023-02-20  Jakub Jelinek  

* config/abi/post/x86_64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/x86_64-linux-gnu/32/baseline_symbols.txt: Update.
* config/abi/post/i486-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/aarch64-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/s390x-linux-gnu/baseline_symbols.txt: Update.
* config/abi/post/powerpc64-linux-gnu/baseline_symbols.txt: Update.

--- libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt.jj   
2022-04-27 17:29:55.890705647 +0200
+++ libstdc++-v3/config/abi/post/x86_64-linux-gnu/baseline_symbols.txt  
2023-02-20 12:35:45.711922009 +0100
@@ -498,6 +498,10 @@ FUNC:_ZNKSt11__timepunctIwE8_M_am_pmEPPK
 FUNC:_ZNKSt11__timepunctIwE9_M_monthsEPPKw@@GLIBCXX_3.4
 FUNC:_ZNKSt11logic_error4whatEv@@GLIBCXX_3.4
 FUNC:_ZNKSt12__basic_fileIcE7is_openEv@@GLIBCXX_3.4
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem28recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem4_DirELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem7__cxx1128recursive_directory_iterator10_Dir_stackELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt12__shared_ptrINSt10filesystem7__cxx114_DirELN9__gnu_cxx12_Lock_policyE2EEcvbEv@@GLIBCXX_3.4.31
 FUNC:_ZNKSt12bad_weak_ptr4whatEv@@GLIBCXX_3.4.15
 FUNC:_ZNKSt12future_error4whatEv@@GLIBCXX_3.4.14
 FUNC:_ZNKSt12strstreambuf6pcountEv@@GLIBCXX_3.4
@@ -668,6 +672,13 @@ FUNC:_ZNKSt5ctypeIwE8do_widenEPKcS2_Pw@@
 FUNC:_ZNKSt5ctypeIwE8do_widenEc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEPKwS2_cPc@@GLIBCXX_3.4
 FUNC:_ZNKSt5ctypeIwE9do_narrowEwc@@GLIBCXX_3.4
+FUNC:_ZNKSt6chrono4tzdb11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono4tzdb12current_zoneEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone15_M_get_sys_infoENS_10time_pointINS_3_V212system_clockENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9time_zone17_M_get_local_infoENS_10time_pointINS_7local_tENS_8durationIlSt5ratioILl1ELl1EE@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list14const_iteratordeEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5beginEv@@GLIBCXX_3.4.31
+FUNC:_ZNKSt6chrono9tzdb_list5frontEv@@GLIBCXX_3.4.31
 FUNC:_ZNKSt6locale2id5_M_idEv@@GLIBCXX_3.4
 FUNC:_ZNKSt6locale4nameB5cxx11Ev@@GLIBCXX_3.4.21
 FUNC:_ZNKSt6locale4nameEv@@GLIBCXX_3.4
@@ -3095,9 +3106,18 @@ FUNC:_ZNSt6__norm15_List_node_base7_M_ho
 FUNC:_ZNSt6__norm15_List_node_base7reverseEv@@GLIBCXX_3.4.9
 FUNC:_ZNSt6__norm15_List_node_base8transferEPS0_S1_@@GLIBCXX_3.4.9
 FUNC:_ZNSt6__norm15_List_node_base9_M_unhookEv@@GLIBCXX_3.4.14
+FUNC:_ZNSt6chrono11locate_zoneESt17basic_string_viewIcSt11char_traitsIcEE@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono11reload_tzdbEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono12current_zoneEv@@GLIBCXX_3.4.31
 FUNC:_ZNSt6chrono12system_clock3nowEv@@GLIBCXX_3.4.11
+FUNC:_ZNSt6chrono13get_tzdb_listEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono14remote_versionB5cxx11Ev@@GLIBCXX_3.4.31
 FUNC:_ZNSt6chrono3_V212steady_clock3nowEv@@GLIBCXX_3.4.19
 FUNC:_ZNSt6chrono3_V212system_clock3nowEv@@GLIBCXX_3.4.19
+FUNC:_ZNSt6chrono8get_tzdbEv@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list11erase_afterENS0_14const_iteratorE@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list14const_iteratorppEi@@GLIBCXX_3.4.31
+FUNC:_ZNSt6chrono9tzdb_list14const_iteratorppEv@@GLIBCXX_3.4.31
 FUNC:_ZNSt6gslice8_IndexerC1EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6gslice8_IndexerC2EmRKSt8valarrayImES4_@@GLIBCXX_3.4
 FUNC:_ZNSt6locale11_M_coalesceERKS_S1_i@@GLIBCXX_3.4
@@ -3213,6 +3233,7 @@ FUNC:_ZNSt7__cxx1112basic_stringIcSt11ch
 

Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Tobias Burnus

On 20.02.23 12:15, Jakub Jelinek wrote:

On Mon, Feb 20, 2023 at 12:07:43PM +0100, Tobias Burnus wrote:

As mentioned in the TODO for 'deferred', I think we really want
to have NULL as upper value for the domain for the type, but that
requires literally hundreds of changes to the compiler, which
I do not want to do during Stage 4, but that are eventually
required.* — In any case, this patch fixes some of the issues
in the meanwhile.

Yeah, the actual len can be in some type's lang_specific member.


Actually, I think it should be bound to the DECL and not to the TYPE,
i.e. lang_decl not type_lang.

I just see that, the latter already has a 'tree stringlen' (for I/O)
which probably could be reused for this purpose.


Anyway, for the patch for now, I'd probably instead of stripping
SAVE_EXPR overwrite the 2 sizes with newly built expressions.


What I now did. (Unchanged otherwise, except that I now also mention
GFC_DECL_STRING_LEN in the TODO.)

OK for mainline?

Tobias
-
Fortran: Avoid SAVE_EXPR for deferred-len char types

Using TYPE_SIZE/TYPE_SIZE_UNIT with deferred-length character variables,
i.e. 'character(len=:), allocatable/pointer' used a SAVE_EXPR, i.e. the
value on entry to the scope instead of the latest value.

Solution: Remove the SAVE_EXPR again in this case.

gcc/fortran/ChangeLog:

	* trans-types.h (gfc_get_character_type, gfc_get_character_type_len,
	gfc_get_character_type_len_for_eltype): Add argument 'bool deferred'.
	* trans-types.cc (gfc_get_character_type_len_for_eltype): Likewise;
	remove the SAVE_EXPR for the type size for deferred string lengths.
	(gfc_get_character_type_len, gfc_get_character_type): Add arg
	and pass on.
	(gfc_typenode_for_spec): Update call.
	* trans-array.cc (gfc_trans_create_temp_array,
	trans_array_constructor, gfc_conv_loop_setup, gfc_array_init_size,
	gfc_alloc_allocatable_for_assignment): Likewise.
	* trans-expr.cc (gfc_conv_substring, gfc_conv_concat_op,
	gfc_add_interface_mapping, gfc_conv_procedure_call,
	gfc_conv_statement_function, gfc_conv_string_parameter): Likewise.
	* trans-intrinsic.cc (gfc_conv_intrinsic_transfer,
	gfc_conv_intrinsic_repeat): Likewise.
	* trans-stmt.cc (forall_make_variable_temp,
	gfc_trans_assign_need_temp): Likewise.

 gcc/fortran/trans-array.cc | 11 ++-
 gcc/fortran/trans-expr.cc  | 15 ---
 gcc/fortran/trans-intrinsic.cc |  5 +++--
 gcc/fortran/trans-stmt.cc  |  7 ---
 gcc/fortran/trans-types.cc | 39 ++-
 gcc/fortran/trans-types.h  |  6 +++---
 6 files changed, 54 insertions(+), 29 deletions(-)

diff --git a/gcc/fortran/trans-array.cc b/gcc/fortran/trans-array.cc
index 63bd1ac573a..b0abdadc3f5 100644
--- a/gcc/fortran/trans-array.cc
+++ b/gcc/fortran/trans-array.cc
@@ -1480,7 +1480,7 @@ gfc_trans_create_temp_array (stmtblock_t * pre, stmtblock_t * post, gfc_ss * ss,
   elemsize = gfc_resize_class_size_with_len (pre, class_expr, elemsize);
   /* Casting the data as a character of the dynamic length ensures that
 	 assignment of elements works when needed.  */
-  eltype = gfc_get_character_type_len (1, elemsize);
+  eltype = gfc_get_character_type_len (1, elemsize, true);
 }
 
   memset (from, 0, sizeof (from));
@@ -2823,7 +2823,8 @@ trans_array_constructor (gfc_ss * ss, locus * where)
 
   store_backend_decl (&expr->ts.u.cl, ss_info->string_length, force_new_cl);
 
-  type = gfc_get_character_type_len (expr->ts.kind, ss_info->string_length);
+  type = gfc_get_character_type_len (expr->ts.kind, ss_info->string_length,
+	 expr->ts.deferred);
   if (const_string)
 	type = build_pointer_type (type);
 }
@@ -5492,7 +5493,7 @@ gfc_conv_loop_setup (gfc_loopinfo * loop, locus * where)
 	tmp_ss_info->data.temp.type
 		= gfc_get_character_type_len_for_eltype
 			(TREE_TYPE (tmp_ss_info->data.temp.type),
-			 tmp_ss_info->string_length);
+			 tmp_ss_info->string_length, false);
 
   tmp = tmp_ss_info->data.temp.type;
   memset (&tmp_ss_info->data.array, 0, sizeof (gfc_array_info));
@@ -5737,7 +5738,7 @@ gfc_array_init_size (tree descriptor, int rank, int corank, tree * poffset,
   tmp = fold_build3_loc (input_location, COMPONENT_REF, TREE_TYPE (tmp),
 			 TREE_OPERAND (descriptor, 0), tmp, NULL_TREE);
   tmp = fold_convert (gfc_charlen_type_node, tmp);
-  type = gfc_get_character_type_len (expr->ts.kind, tmp);
+  type = gfc_get_character_type_len (expr->ts.kind, tmp, expr->ts.deferred);
   tmp = gfc_conv_descriptor_dtype (descriptor);
   gfc_add_modify (pblock, tmp, gfc_get_dtype_rank_type (rank, type));
 }
@@ -10908,7 +10909,7 @@ gfc_alloc_allocatable_for_assignment (gfc_loopinfo *loop,
   if (expr2->ts.type != BT_CLASS)
 	

Re: [PATCH] libstdc++: Add missing functions to <cmath> [PR79700]

2023-02-20 Thread Jonathan Wakely via Gcc-patches
On Mon, 20 Feb 2023 at 11:23, Nathaniel Shead via Libstdc++
 wrote:
>
> The comments on PR79700 mentioned that it was somewhat ambiguous whether
> these functions were supposed to exist for C++11 or not. I chose to add
> them there, since other resources (such as cppreference) seem to think
> that C++11 should be the standard in which these functions were introduced, and I
> don't know of any reason to do otherwise.
>
> Tested on x86_64-linux.

Thanks for the patch, but this needs tests for the new declarations
(which are tedious to write, which is the main reason I haven't
already pushed my own very similar patch).


>
> -- 8< --
>
> This patch adds the -f and -l variants of the C89 <math.h> functions to
> <cmath> under namespace std (so std::sqrtf, std::fabsl, etc.) for C++11
> and up.
>
> libstdc++-v3/ChangeLog:
>
> PR libstdc++/79700
> * include/c_global/cmath (acosf, acosl, asinf, asinl, atanf,
> atanl, atan2f, atan2l, ceilf, ceill, cosf, cosl, coshf, coshl,
> expf, expl, fabsf, fabsl, floorf, floorl, fmodf, fmodl, frexpf,
> frexpl, ldexpf, ldexpl, logf, logl, log10f, log10l, modff,
> modfl, powf, powl, sinf, sinl, sinhf, sinhl, sqrtf, sqrtl, tanf,
> tanl, tanhf, tanhl): Add aliases in namespace std.
> * testsuite/26_numerics/headers/cmath/functions_std_c++17.cc:
> Add checks for existence of above names.
>
> Signed-off-by: Nathaniel Shead 
> ---
>  libstdc++-v3/include/c_global/cmath   | 111 ++
>  .../headers/cmath/functions_std_c++17.cc  |  45 +++
>  2 files changed, 156 insertions(+)
>
> diff --git a/libstdc++-v3/include/c_global/cmath 
> b/libstdc++-v3/include/c_global/cmath
> index 568eb354c2d..eaebde33dee 100644
> --- a/libstdc++-v3/include/c_global/cmath
> +++ b/libstdc++-v3/include/c_global/cmath
> @@ -1767,6 +1767,117 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
>
>  #if __cplusplus >= 201103L
>
> +#undef acosf
> +#undef acosl
> +#undef asinf
> +#undef asinl
> +#undef atanf
> +#undef atanl
> +#undef atan2f
> +#undef atan2l
> +#undef ceilf
> +#undef ceill
> +#undef cosf
> +#undef cosl
> +#undef coshf
> +#undef coshl
> +#undef expf
> +#undef expl
> +#undef fabsf
> +#undef fabsl
> +#undef floorf
> +#undef floorl
> +#undef fmodf
> +#undef fmodl
> +#undef frexpf
> +#undef frexpl
> +#undef ldexpf
> +#undef ldexpl
> +#undef logf
> +#undef logl
> +#undef log10f
> +#undef log10l
> +#undef modff
> +#undef modfl
> +#undef powf
> +#undef powl
> +#undef sinf
> +#undef sinl
> +#undef sinhf
> +#undef sinhl
> +#undef sqrtf
> +#undef sqrtl
> +#undef tanf
> +#undef tanl
> +#undef tanhf
> +#undef tanhl
> +
> +  using ::acosf;
> +  using ::acosl;
> +
> +  using ::asinf;
> +  using ::asinl;
> +
> +  using ::atanf;
> +  using ::atanl;
> +
> +  using ::atan2f;
> +  using ::atan2l;
> +
> +  using ::ceilf;
> +  using ::ceill;
> +
> +  using ::cosf;
> +  using ::cosl;
> +
> +  using ::coshf;
> +  using ::coshl;
> +
> +  using ::expf;
> +  using ::expl;
> +
> +  using ::fabsf;
> +  using ::fabsl;
> +
> +  using ::floorf;
> +  using ::floorl;
> +
> +  using ::fmodf;
> +  using ::fmodl;
> +
> +  using ::frexpf;
> +  using ::frexpl;
> +
> +  using ::ldexpf;
> +  using ::ldexpl;
> +
> +  using ::logf;
> +  using ::logl;
> +
> +  using ::log10f;
> +  using ::log10l;
> +
> +  using ::modff;
> +  using ::modfl;
> +
> +  using ::powf;
> +  using ::powl;
> +
> +  using ::sinf;
> +  using ::sinl;
> +
> +  using ::sinhf;
> +  using ::sinhl;
> +
> +  using ::sqrtf;
> +  using ::sqrtl;
> +
> +  using ::tanf;
> +  using ::tanl;
> +
> +  using ::tanhf;
> +  using ::tanhl;
> +
>  #ifdef _GLIBCXX_USE_C99_MATH_TR1
>
>  #undef acosh
> diff --git 
> a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc 
> b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> index 3b4ada1a756..c6ec636c183 100644
> --- a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> +++ b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
> @@ -44,6 +44,51 @@ namespace gnu
>using std::tan;
>using std::tanh;
>
> +  using std::acosf;
> +  using std::acosl;
> +  using std::asinf;
> +  using std::asinl;
> +  using std::atanf;
> +  using std::atanl;
> +  using std::atan2f;
> +  using std::atan2l;
> +  using std::ceilf;
> +  using std::ceill;
> +  using std::cosf;
> +  using std::cosl;
> +  using std::coshf;
> +  using std::coshl;
> +  using std::expf;
> +  using std::expl;
> +  using std::fabsf;
> +  using std::fabsl;
> +  using std::floorf;
> +  using std::floorl;
> +  using std::fmodf;
> +  using std::fmodl;
> +  using std::frexpf;
> +  using std::frexpl;
> +  using std::ldexpf;
> +  using std::ldexpl;
> +  using std::logf;
> +  using std::logl;
> +  using std::log10f;
> +  using std::log10l;
> +  using std::modff;
> +  using std::modfl;
> +  using std::powf;
> +  using std::powl;
> +  using std::sinf;
> +  using std::sinl;
> +  using std::sinhf;
> +  using std::sinhl;
> +  using std::sqrtf;
> +  using 

[PATCH] libstdc++: Add missing functions to <cmath> [PR79700]

2023-02-20 Thread Nathaniel Shead via Gcc-patches
The comments on PR79700 mentioned that it was somewhat ambiguous whether
these functions were supposed to exist for C++11 or not. I chose to add
them there, since other resources (such as cppreference) seem to think
that C++11 should be the standard in which these functions were introduced, and I
don't know of any reason to do otherwise.

Tested on x86_64-linux.

-- 8< --

This patch adds the -f and -l variants of the C89 <math.h> functions to
<cmath> under namespace std (so std::sqrtf, std::fabsl, etc.) for C++11
and up.

libstdc++-v3/ChangeLog:

PR libstdc++/79700
* include/c_global/cmath (acosf, acosl, asinf, asinl, atanf,
atanl, atan2f, atan2l, ceilf, ceill, cosf, cosl, coshf, coshl,
expf, expl, fabsf, fabsl, floorf, floorl, fmodf, fmodl, frexpf,
frexpl, ldexpf, ldexpl, logf, logl, log10f, log10l, modff,
modfl, powf, powl, sinf, sinl, sinhf, sinhl, sqrtf, sqrtl, tanf,
tanl, tanhf, tanhl): Add aliases in namespace std.
* testsuite/26_numerics/headers/cmath/functions_std_c++17.cc:
Add checks for existence of above names.

Signed-off-by: Nathaniel Shead 
---
 libstdc++-v3/include/c_global/cmath   | 111 ++
 .../headers/cmath/functions_std_c++17.cc  |  45 +++
 2 files changed, 156 insertions(+)

diff --git a/libstdc++-v3/include/c_global/cmath 
b/libstdc++-v3/include/c_global/cmath
index 568eb354c2d..eaebde33dee 100644
--- a/libstdc++-v3/include/c_global/cmath
+++ b/libstdc++-v3/include/c_global/cmath
@@ -1767,6 +1767,117 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
 #if __cplusplus >= 201103L
 
+#undef acosf
+#undef acosl
+#undef asinf
+#undef asinl
+#undef atanf
+#undef atanl
+#undef atan2f
+#undef atan2l
+#undef ceilf
+#undef ceill
+#undef cosf
+#undef cosl
+#undef coshf
+#undef coshl
+#undef expf
+#undef expl
+#undef fabsf
+#undef fabsl
+#undef floorf
+#undef floorl
+#undef fmodf
+#undef fmodl
+#undef frexpf
+#undef frexpl
+#undef ldexpf
+#undef ldexpl
+#undef logf
+#undef logl
+#undef log10f
+#undef log10l
+#undef modff
+#undef modfl
+#undef powf
+#undef powl
+#undef sinf
+#undef sinl
+#undef sinhf
+#undef sinhl
+#undef sqrtf
+#undef sqrtl
+#undef tanf
+#undef tanl
+#undef tanhf
+#undef tanhl
+
+  using ::acosf;
+  using ::acosl;
+
+  using ::asinf;
+  using ::asinl;
+
+  using ::atanf;
+  using ::atanl;
+
+  using ::atan2f;
+  using ::atan2l;
+
+  using ::ceilf;
+  using ::ceill;
+
+  using ::cosf;
+  using ::cosl;
+
+  using ::coshf;
+  using ::coshl;
+
+  using ::expf;
+  using ::expl;
+
+  using ::fabsf;
+  using ::fabsl;
+
+  using ::floorf;
+  using ::floorl;
+
+  using ::fmodf;
+  using ::fmodl;
+
+  using ::frexpf;
+  using ::frexpl;
+
+  using ::ldexpf;
+  using ::ldexpl;
+
+  using ::logf;
+  using ::logl;
+
+  using ::log10f;
+  using ::log10l;
+
+  using ::modff;
+  using ::modfl;
+
+  using ::powf;
+  using ::powl;
+
+  using ::sinf;
+  using ::sinl;
+
+  using ::sinhf;
+  using ::sinhl;
+
+  using ::sqrtf;
+  using ::sqrtl;
+
+  using ::tanf;
+  using ::tanl;
+
+  using ::tanhf;
+  using ::tanhl;
+
 #ifdef _GLIBCXX_USE_C99_MATH_TR1
 
 #undef acosh
diff --git 
a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc 
b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
index 3b4ada1a756..c6ec636c183 100644
--- a/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
+++ b/libstdc++-v3/testsuite/26_numerics/headers/cmath/functions_std_c++17.cc
@@ -44,6 +44,51 @@ namespace gnu
   using std::tan;
   using std::tanh;
 
+  using std::acosf;
+  using std::acosl;
+  using std::asinf;
+  using std::asinl;
+  using std::atanf;
+  using std::atanl;
+  using std::atan2f;
+  using std::atan2l;
+  using std::ceilf;
+  using std::ceill;
+  using std::cosf;
+  using std::cosl;
+  using std::coshf;
+  using std::coshl;
+  using std::expf;
+  using std::expl;
+  using std::fabsf;
+  using std::fabsl;
+  using std::floorf;
+  using std::floorl;
+  using std::fmodf;
+  using std::fmodl;
+  using std::frexpf;
+  using std::frexpl;
+  using std::ldexpf;
+  using std::ldexpl;
+  using std::logf;
+  using std::logl;
+  using std::log10f;
+  using std::log10l;
+  using std::modff;
+  using std::modfl;
+  using std::powf;
+  using std::powl;
+  using std::sinf;
+  using std::sinl;
+  using std::sinhf;
+  using std::sinhl;
+  using std::sqrtf;
+  using std::sqrtl;
+  using std::tanf;
+  using std::tanl;
+  using std::tanhf;
+  using std::tanhl;
+
   using std::assoc_laguerre;
   using std::assoc_laguerref;
   using std::assoc_laguerrel;
-- 
2.34.1
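
The mechanism the patch relies on (an #undef of any macro form followed by a
using-declaration) can be tried outside libstdc++. The following is only a
minimal sketch under stated assumptions: the namespace `demo` and the helper
functions are hypothetical stand-ins, and it assumes the C library's <math.h>
declares `sqrtf` and `fabsl` in the global namespace.

```cpp
#include <math.h>   // C declarations of sqrtf, fabsl, ...
#include <cassert>

// Drop any macro versions the C library may provide, as the patch does.
#undef sqrtf
#undef fabsl

// Hypothetical namespace standing in for std; not part of libstdc++.
namespace demo
{
  using ::sqrtf;
  using ::fabsl;
}

// Small helpers so the imported names are actually used.
float root_of_four () { return demo::sqrtf (4.0f); }
long double magnitude () { return demo::fabsl (-2.5L); }

// Usage: both calls resolve to the C functions via the using-declarations.
void check_demo ()
{
  assert (root_of_four () == 2.0f);
  assert (magnitude () == 2.5L);
}
```

Unlike the overloaded std::sqrt, these names are not overloaded, so the
argument is converted to the fixed parameter type of each function.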



Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Jakub Jelinek via Gcc-patches
On Mon, Feb 20, 2023 at 12:07:43PM +0100, Tobias Burnus wrote:
> As mentioned in the TODO for 'deferred', I think we really want
> to have NULL as upper value for the domain for the type, but that
> requires literally hundreds of changes to the compiler, which
> I do not want to do during Stage 4, but which are eventually
> required.* In any case, this patch fixes some of the issues
> in the meantime.

Yeah, the actual len can be in some type's lang_specific member.

Anyway, for the patch for now, instead of stripping the SAVE_EXPR
I'd probably overwrite the 2 sizes with newly built expressions.

Jakub



Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Tobias Burnus

On 20.02.23 11:41, Richard Biener wrote:

Generally SAVE_EXPR is used to make sure an expression is only evaluated
once.  It's DECL_EXPR that ensures something is evaluated early
and available.  So generally "unwrapping" a SAVE_EXPR looks dangerous
to me unless the SAVE_EXPR is really never necessary.


For VLA-kind of variables, SAVE_EXPR makes sense
(code wise: if '!deferred') - and that use in gfortran
should remain unchanged.


However, Fortran also has deferred-length variables where one has:

character(len=:), pointer :: str
! ...
allocate(character(len=42) :: str)
!...
end


which has the dump:

  integer(kind=8) .str;
  character(kind=1)[1:.str] * str;

  str = (character(kind=1)[1:.str] *) __builtin_malloc (42);
  .str = 42;

The length variable is, a bit oddly, linked to the
data variable via its TREE_TYPE, i.e. via the upper bound for
the domain / TYPE_SIZE / TYPE_SIZE_UNIT.

Currently, it happens that the SAVE_EXPR is used, e.g.
  size = D.1234;  // where D.1234 is the SAVE_EXPR
instead of the current value
  size = .str;
which leads to wrong results. As '.str' is an artificial
variable, the issue of a user modifying the value does not exist.
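
The difference between the latched value and the current value can be shown
with a toy model. This is only a sketch and not GCC code: `SavedLength`,
`LiveLength` and the two helper functions are hypothetical names, and the
"evaluate once, then reuse" behaviour is a simplification of what a SAVE_EXPR
does.

```cpp
#include <cassert>

// Toy model of a SAVE_EXPR: the first evaluation is latched and reused.
struct SavedLength
{
  explicit SavedLength (const long *src)
    : source (src), evaluated (false), cached (0) {}

  long value ()
  {
    if (!evaluated)
      {
	cached = *source;
	evaluated = true;
      }
    return cached;
  }

  const long *source;
  bool evaluated;
  long cached;
};

// Toy model of reading the artificial '.str' variable directly.
struct LiveLength
{
  explicit LiveLength (const long *src) : source (src) {}
  long value () const { return *source; }
  const long *source;
};

// Mimics: allocate with len=4, read the size, reallocate with len=99.
long stale_after_update ()
{
  long str_len = 4;
  SavedLength size (&str_len);
  (void) size.value ();   // first use latches 4
  str_len = 99;           // second allocate updates the length
  return size.value ();   // still the latched value
}

long live_after_update ()
{
  long str_len = 4;
  LiveLength size (&str_len);
  (void) size.value ();
  str_len = 99;
  return size.value ();   // follows the current length
}
```

In this model the first helper keeps returning the length from the first
allocation, which is the wrong-result pattern described above for
deferred-length strings.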

* * *

As mentioned in the TODO for 'deferred', I think we really want
to have NULL as upper value for the domain for the type, but that
requires literally hundreds of changes to the compiler, which
I do not want to do during Stage 4, but which are eventually
required.* In any case, this patch fixes some of the issues
in the meantime.

Tobias

* The number of deferred-length bugs is really huge; especially when
used with derived types.



Re: [PATCH] Update baseline symbols for m68k-linux

2023-02-20 Thread Jonathan Wakely via Gcc-patches
On Sat, 18 Feb 2023 at 19:23, Andreas Schwab  wrote:
>
> libstdc++-v3/:
> * config/abi/post/m68k-linux-gnu/baseline_symbols.txt: Update.

All the additions (and the one change) look correct, thanks.



[PATCH] tree-optimization/108816 - vect versioning check split confusion

2023-02-20 Thread Richard Biener via Gcc-patches
The split of the versioning condition assumes the definition is
in the condition block which is ensured by the versioning code.
But that only works when we actually have to insert any statements
for the versioning condition.  The following adjusts the guard
accordingly and asserts this condition.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108816
* tree-vect-loop-manip.cc (vect_loop_versioning): Adjust
versioning condition split prerequisite, assert required
invariant.

* gcc.dg/torture/pr108816.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108816.c | 18 ++
 gcc/tree-vect-loop-manip.cc |  3 ++-
 2 files changed, 20 insertions(+), 1 deletion(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108816.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108816.c 
b/gcc/testsuite/gcc.dg/torture/pr108816.c
new file mode 100644
index 000..4c24d5584c1
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108816.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-additional-options "-fvect-cost-model=dynamic" } */
+
+int m;
+
+void
+foo (int p[][16], unsigned int x)
+{
+  while (x < 4)
+{
+  int j;
+
+  for (j = x * 4; j < (x + 1) * 4 - 2; j++)
+p[0][j] = p[m][j];
+
+  ++x;
+}
+}
diff --git a/gcc/tree-vect-loop-manip.cc b/gcc/tree-vect-loop-manip.cc
index c04fcf40c44..6aa3d2ed0bf 100644
--- a/gcc/tree-vect-loop-manip.cc
+++ b/gcc/tree-vect-loop-manip.cc
@@ -3477,7 +3477,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   tree cost_name = NULL_TREE;
   profile_probability prob2 = profile_probability::uninitialized ();
   if (cond_expr
-  && !integer_truep (cond_expr)
+  && EXPR_P (cond_expr)
   && (version_niter
  || version_align
  || version_alias
@@ -3711,6 +3711,7 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   if (cost_name && TREE_CODE (cost_name) == SSA_NAME)
 {
   gimple *def = SSA_NAME_DEF_STMT (cost_name);
+  gcc_assert (gimple_bb (def) == condition_bb);
   /* All uses of the cost check are 'true' after the check we
 are going to insert.  */
   replace_uses_by (cost_name, boolean_true_node);
-- 
2.35.3


Re: [PATCH 1/2] Support get_range_query with a nullptr argument

2023-02-20 Thread Richard Biener via Gcc-patches
On Fri, Feb 17, 2023 at 10:46 PM Andrew Pinski via Gcc-patches
 wrote:
>
> get_range_query didn't support a nullptr argument
> before and would crash.
> See also the thread at
> https://inbox.sourceware.org/gcc/4f6718af-e17a-41ef-a886-f45e4ac3d...@redhat.com/T/
>
> OK? Bootstrapped and tested on x86_64-linux-gnu with no regressions.

OK.

> gcc/ChangeLog:
>
> * value-query.h (get_range_query): Return the global ranges
> for a nullptr func.
> ---
>  gcc/value-query.h | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/gcc/value-query.h b/gcc/value-query.h
> index 63878968118..2d7bf8fcf33 100644
> --- a/gcc/value-query.h
> +++ b/gcc/value-query.h
> @@ -140,7 +140,7 @@ get_global_range_query ()
>  ATTRIBUTE_RETURNS_NONNULL inline range_query *
>  get_range_query (const struct function *fun)
>  {
> -  return fun->x_range_query ? fun->x_range_query : &global_ranges;
> +  return (fun && fun->x_range_query) ? fun->x_range_query : &global_ranges;
>  }
>
>  // Query the global range of NAME in function F.  Default to cfun.
> --
> 2.17.1
>
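
The null-guarded fallback in the patch can be reproduced in isolation. This is
only a sketch with hypothetical stand-in types (`range_query_model`,
`function_model`) and a stand-in global object; it is not the declaration
from value-query.h.

```cpp
#include <cassert>

// Hypothetical stand-ins for range_query and struct function.
struct range_query_model { int id; };
struct function_model { range_query_model *x_range_query; };

// Stand-in for the global range query object.
static range_query_model global_ranges_model = { 0 };

// Never returns null: a null function, or a function without its own
// query object, falls back to the global one.
inline range_query_model *
get_range_query_model (const function_model *fun)
{
  return (fun && fun->x_range_query) ? fun->x_range_query
				     : &global_ranges_model;
}

// Usage covering the three cases the guard distinguishes.
void check_range_query_model ()
{
  range_query_model local = { 1 };
  function_model with_query = { &local };
  function_model without_query = { nullptr };

  assert (get_range_query_model (&with_query) == &local);
  assert (get_range_query_model (&without_query) == &global_ranges_model);
  assert (get_range_query_model (nullptr) == &global_ranges_model);
}
```

The `fun &&` test must come first so that the member access is only
evaluated for a non-null pointer.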


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Richard Biener via Gcc-patches
On Mon, Feb 20, 2023 at 11:05 AM Tobias Burnus  wrote:
>
> On 17.02.23 17:27, Steve Kargl wrote:
> > On Fri, Feb 17, 2023 at 12:13:52PM +0100, Tobias Burnus wrote:
> >> OK for mainline?
> > Short version: no.
>
> Would you mind giving your reasoning in more than a single word?
>
> >> subroutine foo(n)
> >>integer :: n
> >>integer :: array(n*5)
> >>integer :: my_len
> >>...
> >>my_len = 5
> >>block
> >>  character(len=my_len, kind=4) :: str
> >>
> >>  my_len = 99
> >>  print *, len(str)  ! still shows 5 - not 99
> >>end block
> >> end
> > Are you sure about the above comment?
>
> Yes - for three reasons:
> * On the what-feels-right side: It does not make any sense to print
>any other value than 5 given that 'str' has been declared with len = 5.
> * On the GCC side, the SAVE_EXPR ensures that the length is evaluated
>early and then "saved" to ensure its original value is available

Generally SAVE_EXPR is used to make sure an expression is only evaluated
once.  It's DECL_EXPR that ensures something is evaluated early
and available.  So generally "unwrapping" a SAVE_EXPR looks dangerous
to me unless the SAVE_EXPR is really never necessary.

Richard.

> * The quoted text from the standard implies that this is what
>should happen.
>
> Why do you think that printing "5" is wrong? GCC has done so for
> years; it still does so with my patch.
>
> Hence, can you elaborate? And also state which value you did expect instead?
>
> * * *
>
> The patch itself is about *deferred* length parameters, i.e.
> 'len=:', and thus for code like:
>
> character(len=:), pointer :: str
> ...
> allocate(character(len=4) :: str)
> print *, len(str)  ! should print 4
> ...
> allocate(character(len=99) :: str)
> print *, len(str)  ! should now print 99
> ...
>
> Currently, the SAVE_EXPR means that the original value might
> get used, which is often 0 (zero-initialized by chance) or some
> random value like 57385973, depending on what was on the
> stack before. There are more issues with deferred strings,
> but at least one is solved by not having a SAVE_EXPR for
> deferred-length character strings.
>
> Tobias
>


Re: [PATCH] rust: Fix rust-tree.cc compilation on SPARC

2023-02-20 Thread Arthur Cohen

Thanks Rainer!

Ok for trunk :)

Kindly,

--
Arthur

On 2/20/23 11:36, Rainer Orth wrote:

This patch

commit 27a89f84c458ae938bc3eb92ad0d594c06fc3b42
Author: Thomas Schwinge 
Date:   Fri Feb 17 23:36:20 2023 +0100

 '#include "tm_p.h"' in 'gcc/rust/backend/rust-tree.cc'

broke rust bootstrap on SPARC:

In file included from ./tm_p.h:4,
  from 
/vol/gcc/src/hg/master/local/gcc/rust/backend/rust-tree.cc:38:
/vol/gcc/src/hg/master/local/gcc/config/sparc/sparc-protos.h:46:47: error: use 
of enum 'memmodel' without previous declaration
46 | extern void sparc_emit_membar_for_model (enum memmodel, int, int);
   |   ^~~~

Fixed by including memmodel.h.  Tested on sparc-sun-solaris2.11 and
i386-pc-solaris2.11.

Ok for trunk?

I'd usually commit the patch as obvious, but have no idea how rust
patches are handled.

Rainer





[PATCH] rust: Fix rust-tree.cc compilation on SPARC

2023-02-20 Thread Rainer Orth
This patch

commit 27a89f84c458ae938bc3eb92ad0d594c06fc3b42
Author: Thomas Schwinge 
Date:   Fri Feb 17 23:36:20 2023 +0100

'#include "tm_p.h"' in 'gcc/rust/backend/rust-tree.cc'

broke rust bootstrap on SPARC:

In file included from ./tm_p.h:4,
 from 
/vol/gcc/src/hg/master/local/gcc/rust/backend/rust-tree.cc:38:
/vol/gcc/src/hg/master/local/gcc/config/sparc/sparc-protos.h:46:47: error: use 
of enum 'memmodel' without previous declaration
   46 | extern void sparc_emit_membar_for_model (enum memmodel, int, int);
  |   ^~~~

Fixed by including memmodel.h.  Tested on sparc-sun-solaris2.11 and
i386-pc-solaris2.11.

Ok for trunk?

I'd usually commit the patch as obvious, but have no idea how rust
patches are handled.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2023-02-20  Rainer Orth  

gcc/rust:
* backend/rust-tree.cc: Include memmodel.h.

diff --git a/gcc/rust/backend/rust-tree.cc b/gcc/rust/backend/rust-tree.cc
--- a/gcc/rust/backend/rust-tree.cc
+++ b/gcc/rust/backend/rust-tree.cc
@@ -35,6 +35,7 @@
 #include "file-prefix-map.h"
 #include "cgraph.h"
 #include "output.h"
+#include "memmodel.h"
 #include "tm_p.h"
 
 // forked from gcc/c-family/c-common.cc c_global_trees


[PATCH] Allow front ends to register spec functions gcc/{gcc.cc,gcc.h} [PR108261]

2023-02-20 Thread Gaius Mulley via Gcc-patches


Hello,

bootstrapped on gcc master x86_64 and no extra failures generated on all
front ends.

Would this be ok for trunk?

regards,
Gaius


Allow front ends to register spec functions gcc/{gcc.cc,gcc.h} [PR108261]

This patch allows front ends to register spec functions.  It is motivated
by PR108261 which needs to retain the order of search path related
options in the modula-2 front end.

gcc/ChangeLog:

* gcc.cc (add_spec_function):
* gcc.h (add_spec_function):

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index becc56051a8..93e4e38389d 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -46,6 +46,7 @@ compilation is specified by a string called a "spec".  */
 #include "spellcheck.h"
 #include "opts-jobserver.h"
 #include "common/common-target.h"
+#include <vector>
 
 
 
@@ -1774,6 +1775,8 @@ static const struct spec_function static_spec_functions[] 
=
   { 0, 0 }
 };
 
+static std::vector<struct spec_function *>lang_spec_functions;
+
 static int processing_spec_function;
 
 /* Add appropriate libgcc specs to OBSTACK, taking into account
@@ -6825,9 +6828,25 @@ lookup_spec_function (const char *name)
 if (strcmp (sf->name, name) == 0)
   return sf;
 
+  for (auto *sf : lang_spec_functions)
+if (strcmp (sf->name, name) == 0)
+  return sf;
+
   return NULL;
 }
 
+/* Add a new spec function.  */
+
+void
+add_spec_function (const char *name,
+  const char *(*func) (int, const char **))
+{
+  struct spec_function *sf = XNEW (struct spec_function);
+  sf->name = name;
+  sf->func = func;
+  lang_spec_functions.push_back (sf);
+}
+
 /* Evaluate a spec function.  */
 
 static const char *
diff --git a/gcc/gcc.h b/gcc/gcc.h
index 19a61b373ee..f40de0f5520 100644
--- a/gcc/gcc.h
+++ b/gcc/gcc.h
@@ -73,6 +73,8 @@ struct spec_function
 extern int do_spec (const char *);
 extern void record_temp_file (const char *, int, int);
 extern void set_input (const char *);
+extern void add_spec_function (const char *name,
+  const char *(*func) (int, const char **));
 
 /* Spec files linked with gcc.cc must provide definitions for these.  */
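
The lookup order introduced by the patch (built-in table first, then the
entries registered at run time) can be sketched on its own. Everything below
is an illustrative assumption: `spec_entry`, `register_spec_entry`,
`lookup_spec_entry` and the two callbacks are hypothetical names, not the
driver's actual declarations.

```cpp
#include <cassert>
#include <cstring>
#include <vector>

typedef const char *(*spec_callback) (int, const char **);

// Hypothetical stand-in for struct spec_function.
struct spec_entry
{
  const char *name;
  spec_callback func;
};

const char *builtin_first_arg (int argc, const char **argv)
{
  return argc > 0 ? argv[0] : "";
}

// Fixed table, terminated by a null entry like the driver's static table.
static const spec_entry static_entries[] =
{
  { "first-arg", builtin_first_arg },
  { 0, 0 }
};

// Entries added at run time; heap-allocated so that pointers handed out
// by the lookup stay valid when the vector grows.
static std::vector<spec_entry *> registered_entries;

void register_spec_entry (const char *name, spec_callback func)
{
  spec_entry *e = new spec_entry;
  e->name = name;
  e->func = func;
  registered_entries.push_back (e);
}

const spec_entry *lookup_spec_entry (const char *name)
{
  for (const spec_entry *e = static_entries; e->name != 0; e++)
    if (std::strcmp (e->name, name) == 0)
      return e;

  for (const spec_entry *e : registered_entries)
    if (std::strcmp (e->name, name) == 0)
      return e;

  return 0;
}

// Example callback a front end might register.
const char *front_end_hook (int, const char **) { return "hook"; }
```

Because the fixed table is searched first, a registered entry in this sketch
cannot shadow a built-in one with the same name.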
 


[pushed] wwwdocs: *: Add a comma after "In addition" when used as transition

2023-02-20 Thread Gerald Pfeifer
As promised yesterday, this not only improves the one case that
triggered NightStrike's note, but all cases I found in wwwdocs.

Pushed.

Gerald

--

wwwdocs: *: Add a comma after "In addition" when used as transition

On the way reduce one use and simplify a sentence.
---
 htdocs/gcc-12/changes.html | 4 ++--
 htdocs/gcc-3.0/libgcc.html | 2 +-
 htdocs/gcc-5/changes.html  | 2 +-
 htdocs/gcc-6/changes.html  | 2 +-
 htdocs/git.html| 4 ++--
 htdocs/spam.html   | 3 +--
 6 files changed, 8 insertions(+), 9 deletions(-)

diff --git a/htdocs/gcc-12/changes.html b/htdocs/gcc-12/changes.html
index fd4062e6..c47d3285 100644
--- a/htdocs/gcc-12/changes.html
+++ b/htdocs/gcc-12/changes.html
@@ -153,7 +153,7 @@ You may also want to check out our
   
 OpenMP 5.0 support has been extended: The close map
   modifier and the affinity clause are now supported.
-  In addition Fortran gained the following features which were
+  In addition, Fortran gained the following features which were
   available in C and C++ before: declare variant is now
   available, depobj, mutexinoutset and
   iterator can now also be used with the depend
@@ -170,7 +170,7 @@ You may also want to check out our
   align and allocator modifiers to the
   allocate clause and the atomic extensions are
   now available. The OMP_PLACE environment variable supports
-  the OpenMP 5.1 features. In addition the OMP_NUM_TEAMS and
+  the OpenMP 5.1 features. In addition, the OMP_NUM_TEAMS and
   OMP_TEAMS_THREAD_LIMIT environment variables and their
   associated API routines are now supported as well as the 
memory-allocation
   routines added for Fortran and extended for C/C++ in OpenMP 5.1. In
diff --git a/htdocs/gcc-3.0/libgcc.html b/htdocs/gcc-3.0/libgcc.html
index d98f9b71..6143db0c 100644
--- a/htdocs/gcc-3.0/libgcc.html
+++ b/htdocs/gcc-3.0/libgcc.html
@@ -12,7 +12,7 @@
 
 This page provides a summary of discussions about the pros and cons
 of distributing libgcc as a shared library, as well as a
-static library.  In addition this page details the plans regarding
+static library.  In addition, it details the plans regarding
 libgcc for the GCC 3.0 release.
 
 Issues
diff --git a/htdocs/gcc-5/changes.html b/htdocs/gcc-5/changes.html
index 201a039f..6952f866 100644
--- a/htdocs/gcc-5/changes.html
+++ b/htdocs/gcc-5/changes.html
@@ -1001,7 +1001,7 @@ are not listed here).
and built-in support. It is enabled through option 
-mmwaitx.
The instructions monitorx and mwaitx
implement the same functionality as the old monitor
-   and mwait instructions. In addition mwaitx
+   and mwait instructions. In addition, mwaitx
adds a configurable timer. The timer value is received as third
argument and stored in register %ebx.
   
diff --git a/htdocs/gcc-6/changes.html b/htdocs/gcc-6/changes.html
index b400dd9c..0c7d2582 100644
--- a/htdocs/gcc-6/changes.html
+++ b/htdocs/gcc-6/changes.html
@@ -566,7 +566,7 @@ within strings:
and built-in support. It is enabled through option 
-mmwaitx.
The instructions monitorx and mwaitx
implement the same functionality as the old monitor
-   and mwait instructions. In addition mwaitx
+   and mwait instructions. In addition, mwaitx
adds a configurable timer. The timer value is received as third
argument and stored in register %ebx.
  
diff --git a/htdocs/git.html b/htdocs/git.html
index f71b451f..2543c237 100644
--- a/htdocs/git.html
+++ b/htdocs/git.html
@@ -20,8 +20,8 @@
 large.  That way you can pick up any version (including releases) of
 GCC that is in our repository.
 
-In addition you
-can https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;>browse our
+In addition, you can
+https://gcc.gnu.org/git/gitweb.cgi?p=gcc.git;>browse our
 Git history online.
 
 (Our web pages are managed via Git in a
diff --git a/htdocs/spam.html b/htdocs/spam.html
index 349e70c7..43040100 100644
--- a/htdocs/spam.html
+++ b/htdocs/spam.html
@@ -45,8 +45,7 @@ avoid creating unnecessary traffic:
  harvested.
 
 
-In addition may want to check out
-https://www.abuse.net;>www.abuse.net.
+Also check out https://www.abuse.net;>www.abuse.net.
 
 
 
-- 
2.39.1


[PATCH V3 4/5] RISC-V: Implement ZKNH extension

2023-02-20 Thread Liao Shihua
This patch adds support for the Zknh extension.
It includes the instructions' machine descriptions and built-in functions.
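
For orientation, the four SHA-256 instructions correspond, as far as I
understand the RISC-V scalar cryptography specification, to the sigma and sum
functions of FIPS 180-4 applied to a 32-bit value. The following is a plain
C++ reference sketch of those functions, not the built-ins added by this
patch; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Rotate right by n bits; n must be in the range 1..31.
static inline uint32_t ror32 (uint32_t x, unsigned n)
{
  return (x >> n) | (x << (32 - n));
}

// Message-schedule functions (sha256sig0 / sha256sig1).
uint32_t sha256_sig0 (uint32_t x)
{
  return ror32 (x, 7) ^ ror32 (x, 18) ^ (x >> 3);
}

uint32_t sha256_sig1 (uint32_t x)
{
  return ror32 (x, 17) ^ ror32 (x, 19) ^ (x >> 10);
}

// Compression functions (sha256sum0 / sha256sum1).
uint32_t sha256_sum0 (uint32_t x)
{
  return ror32 (x, 2) ^ ror32 (x, 13) ^ ror32 (x, 22);
}

uint32_t sha256_sum1 (uint32_t x)
{
  return ror32 (x, 6) ^ ror32 (x, 11) ^ ror32 (x, 25);
}

// Usage: spot-check each function with the single-bit input 1.
void check_sha256_reference ()
{
  assert (sha256_sig0 (1u) == 0x02004000u);
  assert (sha256_sig1 (1u) == 0x0000A000u);
  assert (sha256_sum0 (1u) == 0x40080400u);
  assert (sha256_sum1 (1u) == 0x04200080u);
}
```

A reference like this can serve as the expected-value side of a run-time
test, alongside the compile-only tests listed in the ChangeLog below.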

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sha256sig0_<mode>): Add ZKNH's 
instructions.
(riscv_sha256sig1_<mode>):
(riscv_sha256sum0_<mode>):
(riscv_sha256sum1_<mode>):
(riscv_sha512sig0h):
(riscv_sha512sig0l):
(riscv_sha512sig1h):
(riscv_sha512sig1l):
(riscv_sha512sum0r):
(riscv_sha512sum1r):
(riscv_sha512sig0):
(riscv_sha512sig1):
(riscv_sha512sum0):
(riscv_sha512sum1):
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKNH's AVAIL.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): Add ZKNH's 
built-in functions.
(DIRECT_BUILTIN):

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknh-sha256.c: New test.
* gcc.target/riscv/zknh-sha512-32.c: New test.
* gcc.target/riscv/zknh-sha512-64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/crypto.md| 138 ++
 gcc/config/riscv/riscv-builtins.cc|   2 +
 gcc/config/riscv/riscv-scalar-crypto.def  |  22 +++
 gcc/testsuite/gcc.target/riscv/zknh-sha256.c  |  28 
 .../gcc.target/riscv/zknh-sha512-32.c |  42 ++
 .../gcc.target/riscv/zknh-sha512-64.c |  31 
 6 files changed, 263 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha256.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-64.c

diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index 7568466ec97..17e7440c0b5 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -48,6 +48,22 @@
 UNSPEC_AES_ESM
 UNSPEC_AES_ESI
 UNSPEC_AES_ESMI
+
+;; Zknh unspecs
+UNSPEC_SHA_256_SIG0
+UNSPEC_SHA_256_SIG1
+UNSPEC_SHA_256_SUM0
+UNSPEC_SHA_256_SUM1
+UNSPEC_SHA_512_SIG0
+UNSPEC_SHA_512_SIG0H
+UNSPEC_SHA_512_SIG0L
+UNSPEC_SHA_512_SIG1
+UNSPEC_SHA_512_SIG1H
+UNSPEC_SHA_512_SIG1L
+UNSPEC_SHA_512_SUM0
+UNSPEC_SHA_512_SUM0R
+UNSPEC_SHA_512_SUM1
+UNSPEC_SHA_512_SUM1R
 ])
 
 ;; ZBKB extension
@@ -247,3 +263,125 @@
   "TARGET_ZKNE && TARGET_64BIT"
   "aes64esm\t%0,%1,%2"
   [(set_attr "type" "crypto")])
+
+;; ZKNH - SHA256
+
+(define_insn "riscv_sha256sig0_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SIG0))]
+  "TARGET_ZKNH"
+  "sha256sig0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sig1_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SIG1))]
+  "TARGET_ZKNH"
+  "sha256sig1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sum0_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SUM0))]
+  "TARGET_ZKNH"
+  "sha256sum0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha256sum1_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SHA_256_SUM1))]
+  "TARGET_ZKNH"
+  "sha256sum1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+;; ZKNH - SHA512
+
+(define_insn "riscv_sha512sig0h"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG0H))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig0h\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha512sig0l"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG0L))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig0l\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha512sig1h"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG1H))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig1h\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha512sig1l"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")]
+   UNSPEC_SHA_512_SIG1L))]
+  "TARGET_ZKNH && !TARGET_64BIT"
+  "sha512sig1l\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sha512sum0r"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+

[PATCH V3 2/5] RISC-V: Implement ZBKB, ZBKC and ZBKX extensions

2023-02-20 Thread Liao Shihua
This patch adds support for the Zbkb, Zbkc and Zbkx extensions.
It includes the instructions' machine descriptions and built-in functions.
Note that this patch only adds built-in functions for instructions that are
in Zbkb but not in Zbb.
Instructions that are in both Zbb and Zbkb are generated by the code
generator instead of through built-in functions.
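
As a rough illustration of "generated by the code generator", the shared Zbb/Zbkb patterns touched in bitmanip.md match ordinary C idioms such as the ones below. This is a sketch, not a test from the patch: the function names are made up, and whether a given compiler really emits andn, xnor or ror for them depends on the target flags and optimization level.

```c
#include <assert.h>
#include <stdint.h>

/* a & ~b: the shape matched by the (and (not ...)) pattern.  */
static inline uint32_t and_not(uint32_t a, uint32_t b)
{
    return a & ~b;
}

/* ~(a ^ b): the shape matched by the (not (xor ...)) pattern.  */
static inline uint32_t xor_not(uint32_t a, uint32_t b)
{
    return ~(a ^ b);
}

/* Rotate right written so that it stays well defined for any n.  */
static inline uint32_t rotate_right(uint32_t x, unsigned n)
{
    n &= 31u;
    return (x >> n) | (x << ((32u - n) & 31u));
}

/* Small self-check of the plain C semantics.  */
static inline void bitmanip_idioms_selftest(void)
{
    assert(and_not(0xFFu, 0x0Fu) == 0xF0u);
    assert(rotate_right(0x80000001u, 1) == 0xC0000000u);
}
```

No built-in is needed for these; the source code stays portable and the instruction selection is left to the compiler.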

gcc/ChangeLog:

* config/riscv/bitmanip.md: Add ZBKB's instructions.
* config/riscv/riscv-builtins.cc (AVAIL): 
* config/riscv/riscv.md:
* config/riscv/crypto.md: Add Scalar Cryptography extension's machine 
description file.
* config/riscv/riscv-scalar-crypto.def: Add Scalar Cryptography 
extension's built-in function file.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zbkb32.c: New test.
* gcc.target/riscv/zbkb64.c: New test.
* gcc.target/riscv/zbkc32.c: New test.
* gcc.target/riscv/zbkc64.c: New test.
* gcc.target/riscv/zbkx32.c: New test.
* gcc.target/riscv/zbkx64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/bitmanip.md |  20 ++--
 gcc/config/riscv/crypto.md   | 128 +++
 gcc/config/riscv/riscv-builtins.cc   |   7 ++
 gcc/config/riscv/riscv-scalar-crypto.def |  45 
 gcc/config/riscv/riscv.md|   4 +-
 gcc/testsuite/gcc.target/riscv/zbkb32.c  |  36 +++
 gcc/testsuite/gcc.target/riscv/zbkb64.c  |  28 +
 gcc/testsuite/gcc.target/riscv/zbkc32.c  |  17 +++
 gcc/testsuite/gcc.target/riscv/zbkc64.c  |  17 +++
 gcc/testsuite/gcc.target/riscv/zbkx32.c  |  18 
 gcc/testsuite/gcc.target/riscv/zbkx64.c  |  18 
 11 files changed, 327 insertions(+), 11 deletions(-)
 create mode 100644 gcc/config/riscv/crypto.md
 create mode 100644 gcc/config/riscv/riscv-scalar-crypto.def
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c

diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md
index 14d18edbe62..f076ba35832 100644
--- a/gcc/config/riscv/bitmanip.md
+++ b/gcc/config/riscv/bitmanip.md
@@ -189,7 +189,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (bitmanip_bitwise:X (not:X (match_operand:X 1 "register_operand" "r"))
 (match_operand:X 2 "register_operand" "r")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "n\t%0,%2,%1"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -203,7 +203,7 @@
   (const_int 0)))
(match_operand:DI 2 "register_operand")))
(clobber (match_operand:DI 3 "register_operand"))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   [(set (match_dup 3) (ashiftrt:DI (match_dup 1) (const_int 63)))
(set (match_dup 0) (and:DI (not:DI (match_dup 3)) (match_dup 2)))])
 
@@ -211,7 +211,7 @@
   [(set (match_operand:X 0 "register_operand" "=r")
 (not:X (xor:X (match_operand:X 1 "register_operand" "r")
   (match_operand:X 2 "register_operand" "r"))))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "xnor\t%0,%1,%2"
   [(set_attr "type" "bitmanip")
(set_attr "mode" "")])
@@ -277,7 +277,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(rotatert:SI (match_operand:SI 1 "register_operand" "r")
 (match_operand:QI 2 "arith_operand" "rI")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "ror%i2%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -285,7 +285,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(rotatert:DI (match_operand:DI 1 "register_operand" "r")
 (match_operand:QI 2 "arith_operand" "rI")))]
-  "TARGET_64BIT && TARGET_ZBB"
+  "TARGET_64BIT && (TARGET_ZBB || TARGET_ZBKB)"
   "ror%i2\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -293,7 +293,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(sign_extend:DI (rotatert:SI (match_operand:SI 1 "register_operand" "r")
 (match_operand:QI 2 "register_operand" 
"r"))))]
-  "TARGET_64BIT && TARGET_ZBB"
+  "TARGET_64BIT && (TARGET_ZBB || TARGET_ZBKB)"
   "rorw\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -301,7 +301,7 @@
   [(set (match_operand:SI 0 "register_operand" "=r")
(rotate:SI (match_operand:SI 1 "register_operand" "r")
   (match_operand:QI 2 "register_operand" "r")))]
-  "TARGET_ZBB"
+  "TARGET_ZBB || TARGET_ZBKB"
   "rol%~\t%0,%1,%2"
   [(set_attr "type" "bitmanip")])
 
@@ -309,7 +309,7 @@
   [(set (match_operand:DI 0 "register_operand" "=r")
(rotate:DI (match_operand:DI 1 "register_operand" "r")
   (match_operand:QI 2 "register_operand" "r")))]
-  "TARGET_64BIT && TARGET_ZBB"
+  "TARGET_64BIT 

[PATCH V3 3/5] RISC-V: Implement ZKND and ZKNE extensions

2023-02-20 Thread Liao Shihua
This patch adds support for the Zkne and Zknd extensions.
It includes the instructions' machine descriptions and built-in functions.
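
The two constraints this patch adds, D03 and DsA, are plain range checks on an immediate (0..3 and 0..10; the ChangeLog describes them as the constants for bs and rnum). A minimal C sketch of the same checks, with hypothetical helper names and without GCC's own IN_RANGE macro, might look like this:

```c
#include <assert.h>
#include <stdbool.h>

/* Inclusive range check; a stand-in for the IN_RANGE test in the patch.  */
static inline bool in_range(long value, long lower, long upper)
{
    return value >= lower && value <= upper;
}

/* Mirrors the "D03" constraint: 0, 1, 2 or 3.  */
static inline bool satisfies_D03(long ival)
{
    return in_range(ival, 0, 3);
}

/* Mirrors the "DsA" constraint: 0 through 10.  */
static inline bool satisfies_DsA(long ival)
{
    return in_range(ival, 0, 10);
}

/* Small self-check at the range boundaries.  */
static inline void constraints_selftest(void)
{
    assert(satisfies_D03(3) && !satisfies_D03(4));
    assert(satisfies_DsA(10) && !satisfies_DsA(11));
}
```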

gcc/ChangeLog:

* config/riscv/constraints.md (D03): Add constants of bs and rnum.
(DsA):
* config/riscv/crypto.md (riscv_aes32dsi): Add ZKND's and ZKNE's 
instructions.
(riscv_aes32dsmi):
(riscv_aes64ds):
(riscv_aes64dsm):
(riscv_aes64im):
(riscv_aes64ks1i):
(riscv_aes64ks2):
(riscv_aes32esi):
(riscv_aes32esmi):
(riscv_aes64es):
(riscv_aes64esm):
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKND's and ZKNE's AVAIL.
* config/riscv/riscv-scalar-crypto.def (DIRECT_BUILTIN): Add ZKND's and 
ZKNE's built-in functions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zknd32.c: New test.
* gcc.target/riscv/zknd64.c: New test.
* gcc.target/riscv/zkne32.c: New test.
* gcc.target/riscv/zkne64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/constraints.md  |   8 ++
 gcc/config/riscv/crypto.md   | 121 +++
 gcc/config/riscv/riscv-builtins.cc   |   5 +
 gcc/config/riscv/riscv-scalar-crypto.def |  15 +++
 gcc/testsuite/gcc.target/riscv/zknd32.c  |  18 
 gcc/testsuite/gcc.target/riscv/zknd64.c  |  36 +++
 gcc/testsuite/gcc.target/riscv/zkne32.c  |  18 
 gcc/testsuite/gcc.target/riscv/zkne64.c  |  30 ++
 8 files changed, 251 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c

diff --git a/gcc/config/riscv/constraints.md b/gcc/config/riscv/constraints.md
index 3637380ee47..3f46f14b10f 100644
--- a/gcc/config/riscv/constraints.md
+++ b/gcc/config/riscv/constraints.md
@@ -83,6 +83,14 @@
   (and (match_code "const_int")
(match_test "SINGLE_BIT_MASK_OPERAND (~ival)")))
 
+(define_constraint "D03"
+  "0, 1, 2 or 3 immediate"
+  (match_test "IN_RANGE (ival, 0, 3)"))
+
+(define_constraint "DsA"
+  "0 - 10 immediate"
+  (match_test "IN_RANGE (ival, 0, 10)"))
+
 ;; Floating-point constant +0.0, used for FCVT-based moves when FMV is
 ;; not available in RV32.
 (define_constraint "G"
diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index a270036e39b..7568466ec97 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -33,6 +33,21 @@
 ;; Zbkx unspecs
 UNSPEC_XPERM8
 UNSPEC_XPERM4
+
+;; Zknd unspecs
+UNSPEC_AES_DSI
+UNSPEC_AES_DSMI
+UNSPEC_AES_DS
+UNSPEC_AES_DSM
+UNSPEC_AES_IM
+UNSPEC_AES_KS1I
+UNSPEC_AES_KS2
+
+;; Zkne unspecs
+UNSPEC_AES_ES
+UNSPEC_AES_ESM
+UNSPEC_AES_ESI
+UNSPEC_AES_ESMI
 ])
 
 ;; ZBKB extension
@@ -126,3 +141,109 @@
   "TARGET_ZBKX"
   "xperm8\t%0,%1,%2"
   [(set_attr "type" "crypto")])
+
+;; ZKND extension
+
+(define_insn "riscv_aes32dsi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes32dsmi"
+  [(set (match_operand:SI 0 "register_operand" "=r")
+(unspec:SI [(match_operand:SI 1 "register_operand" "r")
+   (match_operand:SI 2 "register_operand" "r")
+   (match_operand:SI 3 "register_operand" "D03")]
+   UNSPEC_AES_DSMI))]
+  "TARGET_ZKND && !TARGET_64BIT"
+  "aes32dsmi\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64ds"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2 "register_operand" "r")]
+   UNSPEC_AES_DS))]
+  "TARGET_ZKND && TARGET_64BIT"
+  "aes64ds\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64dsm"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:DI 2 "register_operand" "r")]
+   UNSPEC_AES_DSM))]
+  "TARGET_ZKND && TARGET_64BIT"
+  "aes64dsm\t%0,%1,%2"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64im"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")]
+   UNSPEC_AES_IM))]
+  "TARGET_ZKND && TARGET_64BIT"
+  "aes64im\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_aes64ks1i"
+  [(set (match_operand:DI 0 "register_operand" "=r")
+(unspec:DI [(match_operand:DI 1 "register_operand" "r")
+   (match_operand:SI 2 

[PATCH V3 1/5] RISC-V: Add prototypes for RISC-V Crypto built-in functions

2023-02-20 Thread Liao Shihua
This patch adds prototypes for RISC-V Crypto built-in functions.
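
The new RISCV_FTYPE_NAME2 and RISCV_FTYPE_NAME3 macros build enumerator names by token pasting. The stand-alone sketch below shows the mechanism with a hypothetical DEMO_ prefix; it does not reproduce how riscv-builtins.cc actually populates its enum from riscv-ftypes.def.

```c
#include <assert.h>

/* Paste the return type and argument types into one identifier.  */
#define DEMO_FTYPE_NAME2(A, B, C) DEMO_##A##_FTYPE_##B##_##C
#define DEMO_FTYPE_NAME3(A, B, C, D) DEMO_##A##_FTYPE_##B##_##C##_##D

enum demo_function_type {
    DEMO_FTYPE_NAME2(SI, SI, SI),     /* expands to DEMO_SI_FTYPE_SI_SI */
    DEMO_FTYPE_NAME2(DI, DI, SI),     /* expands to DEMO_DI_FTYPE_DI_SI */
    DEMO_FTYPE_NAME3(DI, DI, DI, SI), /* expands to DEMO_DI_FTYPE_DI_DI_SI */
    DEMO_FTYPE_COUNT
};

/* The pasted names are ordinary enumerators and can be used directly.  */
static inline void ftype_names_selftest(void)
{
    assert(DEMO_SI_FTYPE_SI_SI == 0);
    assert(DEMO_FTYPE_COUNT == 3);
}
```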

gcc/ChangeLog:

* config/riscv/riscv-builtins.cc (RISCV_FTYPE_NAME2):
(RISCV_FTYPE_NAME3):
(RISCV_ATYPE_QI):
(RISCV_ATYPE_HI):
(RISCV_FTYPE_ATYPES2):
(RISCV_FTYPE_ATYPES3):
* config/riscv/riscv-ftypes.def (2):
(3):

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/riscv-builtins.cc |  8 
 gcc/config/riscv/riscv-ftypes.def  | 10 ++
 2 files changed, 18 insertions(+)

diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index 25ca407f9a9..ded91e17554 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -42,6 +42,8 @@ along with GCC; see the file COPYING3.  If not see
 /* Macros to create an enumeration identifier for a function prototype.  */
 #define RISCV_FTYPE_NAME0(A) RISCV_##A##_FTYPE
 #define RISCV_FTYPE_NAME1(A, B) RISCV_##A##_FTYPE_##B
+#define RISCV_FTYPE_NAME2(A, B, C) RISCV_##A##_FTYPE_##B##_##C
+#define RISCV_FTYPE_NAME3(A, B, C, D) RISCV_##A##_FTYPE_##B##_##C##_##D
 
 /* Classifies the prototype of a built-in function.  */
 enum riscv_function_type {
@@ -132,6 +134,8 @@ AVAIL (always, (!0))
 /* Argument types.  */
 #define RISCV_ATYPE_VOID void_type_node
 #define RISCV_ATYPE_USI unsigned_intSI_type_node
+#define RISCV_ATYPE_QI intQI_type_node
+#define RISCV_ATYPE_HI intHI_type_node
 #define RISCV_ATYPE_SI intSI_type_node
 #define RISCV_ATYPE_DI intDI_type_node
 #define RISCV_ATYPE_VOID_PTR ptr_type_node
@@ -142,6 +146,10 @@ AVAIL (always, (!0))
   RISCV_ATYPE_##A
 #define RISCV_FTYPE_ATYPES1(A, B) \
   RISCV_ATYPE_##A, RISCV_ATYPE_##B
+#define RISCV_FTYPE_ATYPES2(A, B, C) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C
+#define RISCV_FTYPE_ATYPES3(A, B, C, D) \
+  RISCV_ATYPE_##A, RISCV_ATYPE_##B, RISCV_ATYPE_##C, RISCV_ATYPE_##D
 
 static const struct riscv_builtin_description riscv_builtins[] = {
   #include "riscv-cmo.def"
diff --git a/gcc/config/riscv/riscv-ftypes.def 
b/gcc/config/riscv/riscv-ftypes.def
index 3a40c33e7c2..3b518195a29 100644
--- a/gcc/config/riscv/riscv-ftypes.def
+++ b/gcc/config/riscv/riscv-ftypes.def
@@ -32,3 +32,13 @@ DEF_RISCV_FTYPE (1, (VOID, USI))
 DEF_RISCV_FTYPE (1, (VOID, VOID_PTR))
 DEF_RISCV_FTYPE (1, (SI, SI))
 DEF_RISCV_FTYPE (1, (DI, DI))
+DEF_RISCV_FTYPE (2, (SI, QI, QI))
+DEF_RISCV_FTYPE (2, (SI, HI, HI))
+DEF_RISCV_FTYPE (2, (SI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, QI, QI))
+DEF_RISCV_FTYPE (2, (DI, HI, HI))
+DEF_RISCV_FTYPE (2, (DI, SI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, SI))
+DEF_RISCV_FTYPE (2, (DI, DI, DI))
+DEF_RISCV_FTYPE (3, (SI, SI, SI, SI))
+DEF_RISCV_FTYPE (3, (DI, DI, DI, SI))
-- 
2.38.1.windows.1



[PATCH V3 0/5] RISC-V: Implement Scalar Cryptography Extension

2023-02-20 Thread Liao Shihua
This series adds basic support for the Scalar Cryptography extensions:
* Zbkb
* Zbkc
* Zbkx
* Zknd
* Zkne
* Zknh
* Zksed
* Zksh

The implementation follows version 1.0.0 of the Scalar Cryptography
specification, which can be found here:
https://github.com/riscv/riscv-crypto/releases/tag/v1.0.0-scalar

This is work by Wu Siyu and Liao Shihua.
Liao Shihua (5):
  Add prototypes for RISC-V Crypto built-in functions
  Implement ZBKB, ZBKC and ZBKX extensions
  Implement ZKND and ZKNE extensions
  Implement ZKNH extension
  Implement ZKSH and ZKSED extensions

 gcc/config/riscv/bitmanip.md  |  20 +-
 gcc/config/riscv/constraints.md   |   8 +
 gcc/config/riscv/crypto.md| 435 ++
 gcc/config/riscv/riscv-builtins.cc|  26 ++
 gcc/config/riscv/riscv-ftypes.def |  10 +
 gcc/config/riscv/riscv-scalar-crypto.def  |  94 
 gcc/config/riscv/riscv.md |   4 +-
 gcc/testsuite/gcc.target/riscv/zbkb32.c   |  36 ++
 gcc/testsuite/gcc.target/riscv/zbkb64.c   |  28 ++
 gcc/testsuite/gcc.target/riscv/zbkc32.c   |  17 +
 gcc/testsuite/gcc.target/riscv/zbkc64.c   |  17 +
 gcc/testsuite/gcc.target/riscv/zbkx32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zbkx64.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zknd32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zknd64.c   |  36 ++
 gcc/testsuite/gcc.target/riscv/zkne32.c   |  18 +
 gcc/testsuite/gcc.target/riscv/zkne64.c   |  30 ++
 gcc/testsuite/gcc.target/riscv/zknh-sha256.c  |  28 ++
 .../gcc.target/riscv/zknh-sha512-32.c |  42 ++
 .../gcc.target/riscv/zknh-sha512-64.c |  31 ++
 gcc/testsuite/gcc.target/riscv/zksed32.c  |  19 +
 gcc/testsuite/gcc.target/riscv/zksed64.c  |  19 +
 gcc/testsuite/gcc.target/riscv/zksh32.c   |  19 +
 gcc/testsuite/gcc.target/riscv/zksh64.c   |  19 +
 24 files changed, 999 insertions(+), 11 deletions(-)
 create mode 100644 gcc/config/riscv/crypto.md
 create mode 100644 gcc/config/riscv/riscv-scalar-crypto.def
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkb64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkc64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zbkx64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknd64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zkne64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha256.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zknh-sha512-64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh64.c

-- 
2.38.1.windows.1



[PATCH V3 5/5] RISC-V: Implement ZKSH and ZKSED extensions

2023-02-20 Thread Liao Shihua
This patch adds support for the Zksh and Zksed extensions.
It includes the instructions' machine descriptions and built-in functions.
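
For reference, the two Zksh instructions correspond to the P0 and P1 permutation functions of the SM3 hash. A portable C sketch of those functions follows; the ref_* names are hypothetical and only the 32-bit computation is modelled. The SM4 instructions need the S-box table and are not sketched here.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit value left by n bits; callers pass 0 < n < 32.  */
static inline uint32_t rol32(uint32_t x, unsigned n)
{
    return (x << n) | (x >> (32u - n));
}

/* SM3 P0: x ^ (x <<< 9) ^ (x <<< 17).  */
static inline uint32_t ref_sm3p0(uint32_t x)
{
    return x ^ rol32(x, 9) ^ rol32(x, 17);
}

/* SM3 P1: x ^ (x <<< 15) ^ (x <<< 23).  */
static inline uint32_t ref_sm3p1(uint32_t x)
{
    return x ^ rol32(x, 15) ^ rol32(x, 23);
}

/* Small self-check using an input with a single set bit.  */
static inline void ref_sm3_selftest(void)
{
    assert(ref_sm3p0(1u) == 0x00020201u);
    assert(ref_sm3p1(1u) == 0x00808001u);
}
```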

gcc/ChangeLog:

* config/riscv/crypto.md (riscv_sm3p0_): Add ZKSED's and ZKSH's 
instructions.
(riscv_sm3p1_):
(riscv_sm4ed_):
(riscv_sm4ks_):
* config/riscv/riscv-builtins.cc (AVAIL): Add ZKSED's and ZKSH's AVAIL.
* config/riscv/riscv-scalar-crypto.def (RISCV_BUILTIN): Add ZKSED's and 
ZKSH's built-in functions.

gcc/testsuite/ChangeLog:

* gcc.target/riscv/zksed32.c: New test.
* gcc.target/riscv/zksed64.c: New test.
* gcc.target/riscv/zksh32.c: New test.
* gcc.target/riscv/zksh64.c: New test.

Co-Authored-By: SiYu Wu
---
 gcc/config/riscv/crypto.md   | 48 
 gcc/config/riscv/riscv-builtins.cc   |  4 ++
 gcc/config/riscv/riscv-scalar-crypto.def | 12 ++
 gcc/testsuite/gcc.target/riscv/zksed32.c | 19 ++
 gcc/testsuite/gcc.target/riscv/zksed64.c | 19 ++
 gcc/testsuite/gcc.target/riscv/zksh32.c  | 19 ++
 gcc/testsuite/gcc.target/riscv/zksh64.c  | 19 ++
 7 files changed, 140 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksed64.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh32.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/zksh64.c

diff --git a/gcc/config/riscv/crypto.md b/gcc/config/riscv/crypto.md
index 17e7440c0b5..777aa529005 100644
--- a/gcc/config/riscv/crypto.md
+++ b/gcc/config/riscv/crypto.md
@@ -64,6 +64,14 @@
 UNSPEC_SHA_512_SUM0R
 UNSPEC_SHA_512_SUM1
 UNSPEC_SHA_512_SUM1R
+
+;; Zksh unspecs
+UNSPEC_SM3_P0
+UNSPEC_SM3_P1
+
+;; Zksed unspecs
+UNSPEC_SM4_ED
+UNSPEC_SM4_KS
 ])
 
 ;; ZBKB extension
@@ -385,3 +393,43 @@
   "TARGET_ZKNH && TARGET_64BIT"
   "sha512sum1\t%0,%1"
   [(set_attr "type" "crypto")])
+
+ ;; ZKSH
+
+(define_insn "riscv_sm3p0_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SM3_P0))]
+  "TARGET_ZKSH"
+  "sm3p0\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sm3p1_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")]
+  UNSPEC_SM3_P1))]
+  "TARGET_ZKSH"
+  "sm3p1\t%0,%1"
+  [(set_attr "type" "crypto")])
+
+;; ZKSED
+
+(define_insn "riscv_sm4ed_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "register_operand" "r")
+  (match_operand:SI 3 "register_operand" "D03")]
+  UNSPEC_SM4_ED))]
+  "TARGET_ZKSED"
+  "sm4ed\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
+
+(define_insn "riscv_sm4ks_"
+  [(set (match_operand:X 0 "register_operand" "=r")
+(unspec:X [(match_operand:X 1 "register_operand" "r")
+  (match_operand:X 2 "register_operand" "r")
+  (match_operand:SI 3 "register_operand" "D03")]
+  UNSPEC_SM4_KS))]
+  "TARGET_ZKSED"
+  "sm4ks\t%0,%1,%2,%3"
+  [(set_attr "type" "crypto")])
diff --git a/gcc/config/riscv/riscv-builtins.cc 
b/gcc/config/riscv/riscv-builtins.cc
index ab5bd52ee7f..390f8a38309 100644
--- a/gcc/config/riscv/riscv-builtins.cc
+++ b/gcc/config/riscv/riscv-builtins.cc
@@ -113,6 +113,10 @@ AVAIL (crypto_zkne64, TARGET_ZKNE && TARGET_64BIT)
 AVAIL (crypto_zkne_or_zknd, (TARGET_ZKNE || TARGET_ZKND) && TARGET_64BIT)
 AVAIL (crypto_zknh32, TARGET_ZKNH && !TARGET_64BIT)
 AVAIL (crypto_zknh64, TARGET_ZKNH && TARGET_64BIT)
+AVAIL (crypto_zksh32, TARGET_ZKSH && !TARGET_64BIT)
+AVAIL (crypto_zksh64, TARGET_ZKSH && TARGET_64BIT)
+AVAIL (crypto_zksed32, TARGET_ZKSED && !TARGET_64BIT)
+AVAIL (crypto_zksed64, TARGET_ZKSED && TARGET_64BIT)
 AVAIL (always, (!0))
 
 /* Construct a riscv_builtin_description from the given arguments.
diff --git a/gcc/config/riscv/riscv-scalar-crypto.def 
b/gcc/config/riscv/riscv-scalar-crypto.def
index d38aad122e5..139793c6360 100644
--- a/gcc/config/riscv/riscv-scalar-crypto.def
+++ b/gcc/config/riscv/riscv-scalar-crypto.def
@@ -80,3 +80,15 @@ DIRECT_BUILTIN (sha512sig0, RISCV_DI_FTYPE_DI, 
crypto_zknh64),
 DIRECT_BUILTIN (sha512sig1, RISCV_DI_FTYPE_DI, crypto_zknh64),
 DIRECT_BUILTIN (sha512sum0, RISCV_DI_FTYPE_DI, crypto_zknh64),
 DIRECT_BUILTIN (sha512sum1, RISCV_DI_FTYPE_DI, crypto_zknh64),
+
+// ZKSH
+RISCV_BUILTIN (sm3p0_si, "sm3p0", RISCV_BUILTIN_DIRECT, RISCV_SI_FTYPE_SI, 
crypto_zksh32),
+RISCV_BUILTIN (sm3p0_di, "sm3p0", RISCV_BUILTIN_DIRECT, RISCV_DI_FTYPE_DI, 
crypto_zksh64),
+RISCV_BUILTIN (sm3p1_si, "sm3p1", RISCV_BUILTIN_DIRECT, RISCV_SI_FTYPE_SI, 
crypto_zksh32),
+RISCV_BUILTIN (sm3p1_di, "sm3p1", RISCV_BUILTIN_DIRECT, RISCV_DI_FTYPE_DI, 
crypto_zksh64),
+
+// ZKSED
+RISCV_BUILTIN (sm4ed_si, "sm4ed", RISCV_BUILTIN_DIRECT, 
RISCV_SI_FTYPE_SI_SI_SI, 

[PATCH] tree-optimization/108819 - niter analysis ICE with unexpected constant

2023-02-20 Thread Richard Biener via Gcc-patches
The following makes sure we do not ICE on unfolded stmts like
_1 = 1 & 1.
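
The shape of the fix can be shown on a toy IR. Everything below is hypothetical (the toy_* types are not GCC's tree or gimple API); it only illustrates rejecting a statement whose first operand is a constant rather than an SSA name before the analysis goes on to look at that operand's definition.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy operand kinds, standing in for tree codes.  */
enum toy_code { TOY_INTEGER_CST, TOY_SSA_NAME };

struct toy_operand {
    enum toy_code code;
    long value; /* only meaningful for TOY_INTEGER_CST */
};

/* Toy statement of the form lhs = rhs1 & rhs2.  */
struct toy_stmt {
    struct toy_operand rhs1;
    struct toy_operand rhs2;
};

static inline bool is_pow2(long v)
{
    return v > 0 && (v & (v - 1)) == 0;
}

/* Accept only name & power-of-two; bail out on anything else.  */
static inline bool toy_and_test_ok(const struct toy_stmt *and_stmt)
{
    if (and_stmt->rhs2.code != TOY_INTEGER_CST
        || !is_pow2(and_stmt->rhs2.value))
        return false;
    /* The added guard: an unfolded 1 & 1 has a constant here.  */
    if (and_stmt->rhs1.code != TOY_SSA_NAME)
        return false;
    return true;
}

/* Small self-check: the unfolded form must be rejected.  */
static inline void toy_guard_selftest(void)
{
    struct toy_stmt unfolded = { { TOY_INTEGER_CST, 1 }, { TOY_INTEGER_CST, 1 } };
    assert(!toy_and_test_ok(&unfolded));
}
```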

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108819
* tree-ssa-loop-niter.cc (number_of_iterations_cltz): Check
we have an SSA name as iv_2 as expected.

* gcc.dg/pr108819.c: New testcase.
---
 gcc/testsuite/gcc.dg/pr108819.c | 14 ++
 gcc/tree-ssa-loop-niter.cc  |  6 --
 2 files changed, 18 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr108819.c

diff --git a/gcc/testsuite/gcc.dg/pr108819.c b/gcc/testsuite/gcc.dg/pr108819.c
new file mode 100644
index 000..a651f819908
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/pr108819.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-ccp -fno-tree-forwprop" } */
+
+int a, b;
+int main() {
+  int d = 1;
+  for (; b; b++)
+if (a < 1)
+  while (d <= a && a <= 0UL) {
+int *e = 
+*e = 0;
+  }
+  return 0;
+}
diff --git a/gcc/tree-ssa-loop-niter.cc b/gcc/tree-ssa-loop-niter.cc
index 581bf5d067b..1ce5e736ce3 100644
--- a/gcc/tree-ssa-loop-niter.cc
+++ b/gcc/tree-ssa-loop-niter.cc
@@ -2354,7 +2354,8 @@ number_of_iterations_cltz (loop_p loop, edge exit,
   gimple *and_stmt = SSA_NAME_DEF_STMT (gimple_cond_lhs (cond_stmt));
   if (!is_gimple_assign (and_stmt)
  || gimple_assign_rhs_code (and_stmt) != BIT_AND_EXPR
- || !integer_pow2p (gimple_assign_rhs2 (and_stmt)))
+ || !integer_pow2p (gimple_assign_rhs2 (and_stmt))
+ || TREE_CODE (gimple_assign_rhs1 (and_stmt)) != SSA_NAME)
return false;
 
   checked_bit = tree_log2 (gimple_assign_rhs2 (and_stmt));
@@ -2382,7 +2383,8 @@ number_of_iterations_cltz (loop_p loop, edge exit,
 precision.  */
  iv_2 = gimple_assign_rhs1 (test_value_stmt);
  tree rhs_type = TREE_TYPE (iv_2);
- if (TREE_CODE (rhs_type) != INTEGER_TYPE
+ if (TREE_CODE (iv_2) != SSA_NAME
+ || TREE_CODE (rhs_type) != INTEGER_TYPE
  || (TYPE_PRECISION (rhs_type)
  != TYPE_PRECISION (test_value_type)))
return false;
-- 
2.35.3


Re: [PATCH] [PR104882] [arm] require mve hw for mve run test

2023-02-20 Thread Andrea Corallo via Gcc-patches
Alexandre Oliva via Gcc-patches  writes:

> The pr104882.c test is an execution test, but arm_v8_1m_mve_ok only
> tests for compile-time support.  Add a requirement for mve hardware.
>
> Regstrapped on x86_64-linux-gnu.
> Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?
>
> for  gcc/testsuite/ChangeLog
>
>   PR target/104882
>   * gcc.target/arm/simd/pr104882.c: Require mve hardware.
> ---
>  gcc/testsuite/gcc.target/arm/simd/pr104882.c |1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr104882.c 
> b/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> index ae9709af42f22..1ea7a14836f54 100644
> --- a/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> @@ -1,4 +1,5 @@
>  /* { dg-do run } */
> +/* { dg-require-effective-target arm_mve_hw } */
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */

Hi Alexandre,

no approver here but LGTM, thanks.

  Andrea


Re: RISC-V: Add divmod instruction support

2023-02-20 Thread Richard Biener via Gcc-patches
On Sun, Feb 19, 2023 at 2:15 AM Maciej W. Rozycki  wrote:
>
> On Sat, 18 Feb 2023, Jeff Law wrote:
>
> > >   Barring the fusion case, which indeed asks for a dedicated `divmod'
> > > pattern (and then I suppose a post-reload splitter or a peephole so that
> > > where one of the two results produced has eventually turned out unused, we
> > > have means to discard the unneeded machine instruction), isn't the generic
> > > transformation something for the middle end to do based on RTX costs?
> > I originally thought the same way you did Maciej.
> >
> > The problem is you don't see it as a divmod in expand_divmod unless you 
> > expose
> > a divmod optab.  See tree-ssa-mathopts.cc's divmod handling.
>
>  That's the kind of stuff I'd expect to happen at the tree level though,
> before expand.

The GIMPLE pass forming divmod could indeed choose to emit the
div + mul/sub sequence instead if an actual divmod pattern isn't available.
It could even generate some fake mul/sub/mod RTXen to cost the two
variants against each other but I seriously doubt any uarch that implements
division/modulo has a slower mul/sub.
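
In C terms, the div + mul/sub sequence computes the remainder from the quotient instead of issuing a second division. A small sketch (the struct and function names are made up for illustration):

```c
#include <assert.h>

struct div_result {
    long quot;
    long rem;
};

/* One division, then a multiply and a subtract to get the remainder.
   For d != 0 (and no overflow) this matches C's / and % operators.  */
static inline struct div_result div_then_mulsub(long n, long d)
{
    struct div_result r;
    r.quot = n / d;
    r.rem = n - r.quot * d;
    return r;
}

/* Small self-check against the built-in operators.  */
static inline void divmod_selftest(void)
{
    struct div_result r = div_then_mulsub(17, 5);
    assert(r.quot == 17 / 5 && r.rem == 17 % 5);
}
```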

Richard.

>
>   Maciej


[pushed] wwwdocs: index: Remove link to Nick's blog

2023-02-20 Thread Gerald Pfeifer
Nick has not been able to update his blog for a while and confirmed
we should remove this link.

Pushed.
 
Gerald (who is missing those nice updates)

---
 htdocs/index.html | 1 -
 1 file changed, 1 deletion(-)

diff --git a/htdocs/index.html b/htdocs/index.html
index 80730c06..3d0f8700 100644
--- a/htdocs/index.html
+++ b/htdocs/index.html
@@ -89,7 +89,6 @@ mission statement.
 
 
 https://gcc.gnu.org/news.html;>Older news |
-https://developers.redhat.com/author/nick-clifton;>Nick's Blog |
 
 More news? Let ger...@pfeifer.com know!
 
-- 
2.39.1


[PATCH] Allow front ends to register spec functions gcc/{gcc.cc,gcc.h} [PR108261]

2023-02-20 Thread Gaius Mulley via Gcc-patches


Hello,

bootstrapped on gcc master x86_64 and no extra failures generated on all
front ends.

Would this be ok for trunk?

regards,
Gaius


Allow front ends to register spec functions gcc/{gcc.cc,gcc.h} [PR108261]

This patch allows front ends to register spec functions.  It is motivated
by PR108261 which needs to retain the order of search path related
options in the modula-2 front end.
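
The idea, reduced to a stand-alone C sketch: look a name up in the fixed table first, then in a list that front ends can append to at run time. All demo_* names are hypothetical, and the growable array here is a simplification; the patch itself keeps separately allocated entries in a std::vector.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef const char *(*demo_spec_fn)(int argc, const char **argv);

struct demo_spec_function {
    const char *name;
    demo_spec_fn func;
};

static const char *demo_builtin(int argc, const char **argv)
{
    (void)argc;
    (void)argv;
    return "builtin";
}

/* Example of a function a front end might register.  */
static inline const char *demo_frontend(int argc, const char **argv)
{
    (void)argc;
    (void)argv;
    return "frontend";
}

/* Fixed table, terminated by a null entry.  */
static const struct demo_spec_function demo_static_functions[] = {
    { "builtin", demo_builtin },
    { 0, 0 }
};

/* Functions registered at run time.  */
static struct demo_spec_function *demo_lang_functions;
static size_t demo_lang_count;

/* Register a new function; returns 0 on success, -1 on allocation failure.  */
static inline int demo_add_spec_function(const char *name, demo_spec_fn func)
{
    struct demo_spec_function *grown =
        realloc(demo_lang_functions, (demo_lang_count + 1) * sizeof *grown);
    if (!grown)
        return -1;
    grown[demo_lang_count].name = name;
    grown[demo_lang_count].func = func;
    demo_lang_functions = grown;
    demo_lang_count++;
    return 0;
}

/* Static table first, then registered functions.  The returned pointer
   may be invalidated by a later registration in this simplified version.  */
static inline const struct demo_spec_function *
demo_lookup_spec_function(const char *name)
{
    for (const struct demo_spec_function *sf = demo_static_functions;
         sf->name; sf++)
        if (strcmp(sf->name, name) == 0)
            return sf;
    for (size_t i = 0; i < demo_lang_count; i++)
        if (strcmp(demo_lang_functions[i].name, name) == 0)
            return &demo_lang_functions[i];
    return NULL;
}
```

A front end would call the registration function once during start-up, before any spec that uses the new function is evaluated.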

gcc/ChangeLog:

* gcc.cc (add_spec_function):
* gcc.h (add_spec_function):

diff --git a/gcc/gcc.cc b/gcc/gcc.cc
index becc56051a8..93e4e38389d 100644
--- a/gcc/gcc.cc
+++ b/gcc/gcc.cc
@@ -46,6 +46,7 @@ compilation is specified by a string called a "spec".  */
 #include "spellcheck.h"
 #include "opts-jobserver.h"
 #include "common/common-target.h"
+#include <vector>
 
 
 
@@ -1774,6 +1775,8 @@ static const struct spec_function static_spec_functions[] 
=
   { 0, 0 }
 };
 
+static std::vectorlang_spec_functions;
+
 static int processing_spec_function;
 
 /* Add appropriate libgcc specs to OBSTACK, taking into account
@@ -6825,9 +6828,25 @@ lookup_spec_function (const char *name)
 if (strcmp (sf->name, name) == 0)
   return sf;
 
+  for (auto *sf : lang_spec_functions)
+if (strcmp (sf->name, name) == 0)
+  return sf;
+
   return NULL;
 }
 
+/* Add a new spec function.  */
+
+void
+add_spec_function (const char *name,
+  const char *(*func) (int, const char **))
+{
+  struct spec_function *sf = XNEW (struct spec_function);
+  sf->name = name;
+  sf->func = func;
+  lang_spec_functions.push_back (sf);
+}
+
 /* Evaluate a spec function.  */
 
 static const char *
diff --git a/gcc/gcc.h b/gcc/gcc.h
index 19a61b373ee..f40de0f5520 100644
--- a/gcc/gcc.h
+++ b/gcc/gcc.h
@@ -73,6 +73,8 @@ struct spec_function
 extern int do_spec (const char *);
 extern void record_temp_file (const char *, int, int);
 extern void set_input (const char *);
+extern void add_spec_function (const char *name,
+  const char *(*func) (int, const char **));
 
 /* Spec files linked with gcc.cc must provide definitions for these.  */
 


[PATCH] tree-optimization/108825 - checking ICE with unroll-and-jam

2023-02-20 Thread Richard Biener via Gcc-patches
The issue is that unroll-and-jam applies RPO VN on the transformed body but
that leaves the IL in an "indeterminate" state (it returns a TODO to make it
valid again).  But unroll-and-jam then goes on to transform another loop and,
in using the tree_unroll_loop helper, runs into tree_transform_and_unroll_loop
performing IL checking on the whole function.

While the real fix is to elide all such checking I'm only making the
loop-local LC SSA verifier not perform function-wide SSA verification
at this point.

Bootstrapped and tested on x86_64-unknown-linux-gnu, pushed.

PR tree-optimization/108825
* tree-ssa-loop-manip.cc (verify_loop_closed_ssa): For
loop-local verification only verify there's no pending SSA
update.

* gcc.dg/torture/pr108825.c: New testcase.
---
 gcc/testsuite/gcc.dg/torture/pr108825.c | 20 
 gcc/tree-ssa-loop-manip.cc  | 11 ---
 2 files changed, 28 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/pr108825.c

diff --git a/gcc/testsuite/gcc.dg/torture/pr108825.c 
b/gcc/testsuite/gcc.dg/torture/pr108825.c
new file mode 100644
index 000..ada2da86054
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr108825.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+
+int safe_mul_func_uint8_t_u_u_ui2, g_231, g_277_1, g_568, 
func_35___trans_tmp_10;
+int g_81[7];
+extern int g_96[];
+char func_35___trans_tmp_11;
+static inline int safe_add_func_int32_t_s_s(int si1, int si2)
+{ return si1 > 647 - si2 ?: si1; }
+void func_35() {
+  for (; g_277_1; g_277_1 += 1) {
+g_231 = 0;
+for (; g_231 <= 6; g_231 += 1) {
+  func_35___trans_tmp_10 =
+  safe_add_func_int32_t_s_s(g_81[g_231], g_568 || g_96[1]);
+  func_35___trans_tmp_11 =
+  func_35___trans_tmp_10 * safe_mul_func_uint8_t_u_u_ui2;
+  g_81[g_231] = func_35___trans_tmp_11;
+}
+  }
+}
diff --git a/gcc/tree-ssa-loop-manip.cc b/gcc/tree-ssa-loop-manip.cc
index 14fe65f134d..09acc1c94cc 100644
--- a/gcc/tree-ssa-loop-manip.cc
+++ b/gcc/tree-ssa-loop-manip.cc
@@ -681,15 +681,15 @@ verify_loop_closed_ssa (bool verify_ssa_p, class loop *loop)
   if (number_of_loops (cfun) <= 1)
 return;
 
-  if (verify_ssa_p)
-verify_ssa (false, true);
-
   timevar_push (TV_VERIFY_LOOP_CLOSED);
 
   if (loop == NULL)
 {
   basic_block bb;
 
+  if (verify_ssa_p)
+   verify_ssa (false, true);
+
   FOR_EACH_BB_FN (bb, cfun)
if (bb->loop_father && bb->loop_father->num > 0)
  check_loop_closed_ssa_bb (bb);
@@ -698,6 +698,11 @@ verify_loop_closed_ssa (bool verify_ssa_p, class loop *loop)
 {
   basic_block *bbs = get_loop_body (loop);
 
+  /* We do not have loop-local SSA verification so just
+check there's no update queued.  */
+  if (verify_ssa_p)
+   gcc_assert (!need_ssa_update_p (cfun));
+
   for (unsigned i = 0; i < loop->num_nodes; ++i)
check_loop_closed_ssa_bb (bbs[i]);
 
-- 
2.35.3


Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Tobias Burnus

On 17.02.23 17:27, Steve Kargl wrote:
> On Fri, Feb 17, 2023 at 12:13:52PM +0100, Tobias Burnus wrote:
> > OK for mainline?
> Short version: no.

Would you mind giving your reasoning, beyond a single word?

> > subroutine foo(n)
> >    integer :: n
> >    integer :: array(n*5)
> >    integer :: my_len
> >    ...
> >    my_len = 5
> >    block
> >      character(len=my_len, kind=4) :: str
> >
> >      my_len = 99
> >      print *, len(str)  ! still shows 5 - not 99
> >    end block
> > end
> Are you sure about the above comment?


Yes - for three reasons:
* On the what-feels-right side: It does not make any sense to print
  any other value than 5 given that 'str' has been declared with len = 5.
* On the GCC side, the SAVE_EXPR ensures that the length is evaluated
  early and then "saved" to ensure its original value is available
* The quoted text from the standard implies that this is what
  should happen.

Why do you think that printing "5" is wrong? GCC has done so for
years; it still does so with my patch.

Hence, can you elaborate? And also state which value you did expect instead?

* * *

The patch itself is about *deferred* length parameters, i.e.
'len=:', and thus for code like:

character(len=:), pointer :: str
...
allocate(character(len=4) :: str)
print *, len(str)  ! should print 4
...
allocate(character(len=99) :: str)
print *, len(str)  ! should now print 99
...

Currently, the SAVE_EXPR means that the original value might
get used, which is often 0 (zero-initialized by chance) or some
random value like 57385973, depending on what was on the
stack before.  There are more issues with deferred strings,
but at least one is solved by not having a SAVE_EXPR for
deferred-length character strings.
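
To make the difference concrete, here is a loose C++ analogy (not gfortran's
actual IL).  All type and function names are hypothetical.  One wrapper reads
the length once, the way a SAVE_EXPR would, and the other re-reads it on
every query.  The model starts the length at 0 so the result is
deterministic, whereas the real stale value may be arbitrary.

```cpp
#include <cstddef>

// Hypothetical stand-in for a deferred-length string: its length changes
// whenever it is (re)allocated.
struct DeferredString
{
  std::size_t len = 0;
  void allocate (std::size_t n) { len = n; }
};

// Evaluate-once behaviour: the length is read when the wrapper is created
// and never again.
struct SavedLength
{
  std::size_t saved;
  explicit SavedLength (const DeferredString &s) : saved (s.len) {}
  std::size_t get () const { return saved; }
};

// Re-read behaviour: every query looks at the current length.
struct LiveLength
{
  const DeferredString *str;
  std::size_t get () const { return str->len; }
};

struct Lengths { std::size_t saved, live; };

inline Lengths lengths_after_allocate ()
{
  DeferredString str;        // declared, not yet allocated
  SavedLength saved (str);   // captures the length as of the declaration
  LiveLength live {&str};
  str.allocate (4);
  str.allocate (99);
  return {saved.get (), live.get ()};
}
```

In this model the saved length stays at its pre-allocation value while the
live length follows the last allocate, which is the behaviour wanted for
len=: strings.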

Tobias

-
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 
München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas 
Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht 
München, HRB 106955


[PATCH] RISC-V: Add RVV reduction C/C++ intrinsics support

2023-02-20 Thread juzhe . zhong
From: Ju-Zhe Zhong 

gcc/ChangeLog:

* config/riscv/riscv-vector-builtins-bases.cc (class reducop): New 
class.
(class widen_reducop): Ditto.
(class freducop): Ditto.
(class widen_freducop): Ditto.
(BASE): Ditto.
* config/riscv/riscv-vector-builtins-bases.h: Ditto.
* config/riscv/riscv-vector-builtins-functions.def (vredsum): Add 
reduction support.
(vredmaxu): Ditto.
(vredmax): Ditto.
(vredminu): Ditto.
(vredmin): Ditto.
(vredand): Ditto.
(vredor): Ditto.
(vredxor): Ditto.
(vwredsum): Ditto.
(vwredsumu): Ditto.
(vfredusum): Ditto.
(vfredosum): Ditto.
(vfredmax): Ditto.
(vfredmin): Ditto.
(vfwredosum): Ditto.
(vfwredusum): Ditto.
* config/riscv/riscv-vector-builtins-shapes.cc (struct reduc_alu_def): 
Ditto.
(SHAPE): Ditto.
* config/riscv/riscv-vector-builtins-shapes.h: Ditto.
* config/riscv/riscv-vector-builtins-types.def (DEF_RVV_WI_OPS): New 
macro.
(DEF_RVV_WU_OPS): Ditto.
(DEF_RVV_WF_OPS): Ditto.
(vint8mf8_t): Ditto.
(vint8mf4_t): Ditto.
(vint8mf2_t): Ditto.
(vint8m1_t): Ditto.
(vint8m2_t): Ditto.
(vint8m4_t): Ditto.
(vint8m8_t): Ditto.
(vint16mf4_t): Ditto.
(vint16mf2_t): Ditto.
(vint16m1_t): Ditto.
(vint16m2_t): Ditto.
(vint16m4_t): Ditto.
(vint16m8_t): Ditto.
(vint32mf2_t): Ditto.
(vint32m1_t): Ditto.
(vint32m2_t): Ditto.
(vint32m4_t): Ditto.
(vint32m8_t): Ditto.
(vuint8mf8_t): Ditto.
(vuint8mf4_t): Ditto.
(vuint8mf2_t): Ditto.
(vuint8m1_t): Ditto.
(vuint8m2_t): Ditto.
(vuint8m4_t): Ditto.
(vuint8m8_t): Ditto.
(vuint16mf4_t): Ditto.
(vuint16mf2_t): Ditto.
(vuint16m1_t): Ditto.
(vuint16m2_t): Ditto.
(vuint16m4_t): Ditto.
(vuint16m8_t): Ditto.
(vuint32mf2_t): Ditto.
(vuint32m1_t): Ditto.
(vuint32m2_t): Ditto.
(vuint32m4_t): Ditto.
(vuint32m8_t): Ditto.
(vfloat32mf2_t): Ditto.
(vfloat32m1_t): Ditto.
(vfloat32m2_t): Ditto.
(vfloat32m4_t): Ditto.
(vfloat32m8_t): Ditto.
* config/riscv/riscv-vector-builtins.cc (DEF_RVV_WI_OPS): Ditto.
(DEF_RVV_WU_OPS): Ditto.
(DEF_RVV_WF_OPS): Ditto.
(required_extensions_p): Add reduction support.
(rvv_arg_type_info::get_base_vector_type): Ditto.
(rvv_arg_type_info::get_tree_type): Ditto.
* config/riscv/riscv-vector-builtins.h (enum rvv_base_type): Ditto.
* config/riscv/riscv.md: Ditto.
* config/riscv/vector-iterators.md (minu): Ditto.
* config/riscv/vector.md (@pred_reduc_): New 
pattern.
(@pred_reduc_): Ditto.
(@pred_widen_reduc_plus): Ditto.
(@pred_widen_reduc_plus): Ditto.
(@pred_reduc_plus): Ditto.
(@pred_reduc_plus): Ditto.
(@pred_widen_reduc_plus): Ditto.

---
 .../riscv/riscv-vector-builtins-bases.cc  |  90 +++
 .../riscv/riscv-vector-builtins-bases.h   |  16 ++
 .../riscv/riscv-vector-builtins-functions.def |  26 +-
 .../riscv/riscv-vector-builtins-shapes.cc |  29 +++
 .../riscv/riscv-vector-builtins-shapes.h  |   1 +
 .../riscv/riscv-vector-builtins-types.def |  65 +
 gcc/config/riscv/riscv-vector-builtins.cc |  92 +++-
 gcc/config/riscv/riscv-vector-builtins.h  |   4 +-
 gcc/config/riscv/riscv.md |   6 +-
 gcc/config/riscv/vector-iterators.md  | 130 +-
 gcc/config/riscv/vector.md| 223 +-
 11 files changed, 668 insertions(+), 14 deletions(-)

diff --git a/gcc/config/riscv/riscv-vector-builtins-bases.cc b/gcc/config/riscv/riscv-vector-builtins-bases.cc
index bfcfab55bb9..f6ed2e53453 100644
--- a/gcc/config/riscv/riscv-vector-builtins-bases.cc
+++ b/gcc/config/riscv/riscv-vector-builtins-bases.cc
@@ -1283,6 +1283,64 @@ public:
   }
 };
 
+/* Implements reduction instructions.  */
+template<rtx_code CODE>
+class reducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (
+  code_for_pred_reduc (CODE, e.vector_mode (), e.vector_mode ()));
+  }
+};
+
+/* Implements widen reduction instructions.  */
+template<int UNSPEC>
+class widen_reducop : public function_base
+{
+public:
+  bool apply_mask_policy_p () const override { return false; }
+
+  rtx expand (function_expander &e) const override
+  {
+return e.use_exact_insn (code_for_pred_widen_reduc_plus (UNSPEC,
+e.vector_mode (),
+e.vector_mode ()));
+  }
+};
+
+/* Implements floating-point reduction 

Re: [PATCH V3] rs6000: Load high and low part of 64bit constant independently

2023-02-20 Thread Jiufu Guo via Gcc-patches
Hi,

I would like to ping this patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-January/609373.html

BR,
Jeff (Jiufu)

Jiufu Guo  writes:

> Hi,
>
> Compared with the previous version, this patch updates the comments only.
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608293.html
>
> For a complicated 64-bit constant, below is one instruction sequence to
> build it:
>   lis 9,0x800a
>   ori 9,9,0xabcd
>   sldi 9,9,32
>   oris 9,9,0xc167
>   ori 9,9,0xfa16
>
> We can also use the sequence below to build it:
>   lis 9,0xc167
>   lis 10,0x800a
>   ori 9,9,0xfa16
>   ori 10,10,0xabcd
>   rldimi 9,10,32,0
> This sequence uses 2 registers to build the high and low parts first,
> and then merges them.
>
> Because the two halves can be built in parallel, this sequence would be
> faster.  (Of course, it uses 1 more register, which may add register
> pressure.)
>
> The two-register parallel sequence can be generated only if
> can_create_pseudo_p.  Otherwise, the one-register sequence is generated.
>
> Bootstrap and regtest pass on ppc64{,le}.
> Is this ok for trunk?
>
>
> BR,
> Jeff(Jiufu)
>
>
> gcc/ChangeLog:
>
>   * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Generate
>   more parallel code if can_create_pseudo_p.
>
> gcc/testsuite/ChangeLog:
>
>   * gcc.target/powerpc/parall_5insn_const.c: New test.
>
> ---
>  gcc/config/rs6000/rs6000.cc   | 39 +--
>  .../gcc.target/powerpc/parall_5insn_const.c   | 27 +
>  2 files changed, 54 insertions(+), 12 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
>
> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
> index 6ac3adcec6b..b4f03499252 100644
> --- a/gcc/config/rs6000/rs6000.cc
> +++ b/gcc/config/rs6000/rs6000.cc
> @@ -10366,19 +10366,34 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT c)
>  }
>else
>  {
> -  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
> -
> -  emit_move_insn (temp, GEN_INT (sext_hwi (ud4 << 16, 32)));
> -  if (ud3 != 0)
> - emit_move_insn (temp, gen_rtx_IOR (DImode, temp, GEN_INT (ud3)));
> +  if (can_create_pseudo_p ())
> + {
> +   /* lis HIGH,UD4 ; ori HIGH,UD3 ;
> +  lis LOW,UD2 ; ori LOW,UD1 ; rldimi LOW,HIGH,32,0.  */
> +   rtx high = gen_reg_rtx (DImode);
> +   rtx low = gen_reg_rtx (DImode);
> +   HOST_WIDE_INT num = (ud2 << 16) | ud1;
> +   rs6000_emit_set_long_const (low, sext_hwi (num, 32));
> +   num = (ud4 << 16) | ud3;
> +   rs6000_emit_set_long_const (high, sext_hwi (num, 32));
> +   emit_insn (gen_rotldi3_insert_3 (dest, high, GEN_INT (32), low,
> +GEN_INT (0x)));
> + }
> +  else
> + {
> +   /* lis DEST,UD4 ; ori DEST,UD3 ; rotl DEST,32 ;
> +  oris DEST,UD2 ; ori DEST,UD1.  */
> +   emit_move_insn (dest, GEN_INT (sext_hwi (ud4 << 16, 32)));
> +   if (ud3 != 0)
> + emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud3)));
>  
> -  emit_move_insn (ud2 != 0 || ud1 != 0 ? temp : dest,
> -   gen_rtx_ASHIFT (DImode, temp, GEN_INT (32)));
> -  if (ud2 != 0)
> - emit_move_insn (ud1 != 0 ? temp : dest,
> - gen_rtx_IOR (DImode, temp, GEN_INT (ud2 << 16)));
> -  if (ud1 != 0)
> - emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
> +   emit_move_insn (dest, gen_rtx_ASHIFT (DImode, dest, GEN_INT (32)));
> +   if (ud2 != 0)
> + emit_move_insn (dest,
> + gen_rtx_IOR (DImode, dest, GEN_INT (ud2 << 16)));
> +   if (ud1 != 0)
> + emit_move_insn (dest, gen_rtx_IOR (DImode, dest, GEN_INT (ud1)));
> + }
>  }
>  }
>  
> diff --git a/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> new file mode 100644
> index 000..e3a9a7264cf
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/parall_5insn_const.c
> @@ -0,0 +1,27 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -mno-prefixed -save-temps" } */
> +/* { dg-require-effective-target has_arch_ppc64 } */
> +
> +/* { dg-final { scan-assembler-times {\mlis\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mori\M} 4 } } */
> +/* { dg-final { scan-assembler-times {\mrldimi\M} 2 } } */
> +
> +void __attribute__ ((noinline)) foo (unsigned long long *a)
> +{
> +  /* 2 lis + 2 ori + 1 rldimi for each constant.  */
> +  *a++ = 0x800aabcdc167fa16ULL;
> +  *a++ = 0x7543a876867f616ULL;
> +}
> +
> +long long A[] = {0x800aabcdc167fa16ULL, 0x7543a876867f616ULL};
> +int
> +main ()
> +{
> +  long long res[2];
> +
> +  foo (res);
> +  if (__builtin_memcmp (res, A, sizeof (res)) != 0)
> +__builtin_abort ();
> +
> +  return 0;
> +}
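
For readers who want to check that the two sequences quoted above produce the
same value, here is a small host-side model in C++.  The helper functions are
simplified, hypothetical stand-ins for the PowerPC instructions (only the
forms used above are modelled); this is a sketch, not code from the patch.

```cpp
#include <cstdint>

// lis: load (imm << 16), sign-extended from 32 to 64 bits.
inline std::uint64_t lis (std::uint16_t imm)
{
  std::uint64_t v = static_cast<std::uint64_t> (imm) << 16;
  if (v & 0x80000000ULL)
    v |= 0xffffffff00000000ULL;
  return v;
}

// ori / oris: OR in a 16-bit immediate, unshifted or shifted left by 16.
inline std::uint64_t ori (std::uint64_t r, std::uint16_t imm)
{
  return r | imm;
}

inline std::uint64_t oris (std::uint64_t r, std::uint16_t imm)
{
  return r | (static_cast<std::uint64_t> (imm) << 16);
}

// sldi: shift left doubleword immediate.
inline std::uint64_t sldi (std::uint64_t r, unsigned n)
{
  return r << n;
}

// rldimi ra,rs,32,0: the low 32 bits of rs replace the high 32 bits of ra;
// the low 32 bits of ra are kept.
inline std::uint64_t rldimi_32_0 (std::uint64_t ra, std::uint64_t rs)
{
  return (rs << 32) | (ra & 0xffffffffULL);
}

// One-register sequence: each step depends on the previous one.
inline std::uint64_t build_serial ()
{
  std::uint64_t r9 = lis (0x800a);
  r9 = ori (r9, 0xabcd);
  r9 = sldi (r9, 32);
  r9 = oris (r9, 0xc167);
  r9 = ori (r9, 0xfa16);
  return r9;
}

// Two-register sequence: r9 and r10 are independent until the final merge.
inline std::uint64_t build_parallel ()
{
  std::uint64_t r9 = lis (0xc167);
  std::uint64_t r10 = lis (0x800a);
  r9 = ori (r9, 0xfa16);
  r10 = ori (r10, 0xabcd);
  return rldimi_32_0 (r9, r10);
}
```

Both functions yield 0x800aabcdc167fa16, the first constant in the new test.
The sign-extended upper bits that lis leaves in r9 are overwritten by the
final insert, which is why the low half can be built without masking.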


Re: [PATCH] Skip module_cmi_p and related unsupported module test

2023-02-20 Thread Jason Merrill via Gcc-patches

On 2/17/23 22:55, Alexandre Oliva wrote:


When a multi-source module is found to be unsupported, we fail
module_cmi_p and subsequent sources.  Override proc unsupported to
mark the result in module_do, and test it to skip module_cmi_p and
subsequent related tests.


Hmm, I guess the problem is that the modules tests are trying to use
dg-test as a subroutine, and can't get at the result of the test to skip
later processing?  Seems like LTO deals with the same issue by not using
dg-test at all.


This seems like an ugly kludge around that problem, but I don't have any 
clever ideas of a better approach short of rewriting everything.  So, OK 
with a comment explaining the rationale above your overridden "unsupported".


Also, your commit subject line needs a subsystem tag, I guess 
"testsuite:" in this case.



Regstrapped on x86_64-linux-gnu.
Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?

for  gcc/testsuite/ChangeLog

* g++.dg/modules/modules.exp: Override unsupported to update
module_do, and test it after dg-test.
---
  gcc/testsuite/g++.dg/modules/modules.exp |   14 ++
  1 file changed, 14 insertions(+)

diff --git a/gcc/testsuite/g++.dg/modules/modules.exp b/gcc/testsuite/g++.dg/modules/modules.exp
index 61994b059457b..ba1287427bf05 100644
--- a/gcc/testsuite/g++.dg/modules/modules.exp
+++ b/gcc/testsuite/g++.dg/modules/modules.exp
@@ -315,6 +315,14 @@ proc module-check-requirements { tests } {
  # cleanup any detritus from previous run
  cleanup_module_files [find $DEFAULT_REPO *.gcm]
  
+set module_do {"compile" "P"}

+rename unsupported saved-unsupported
+proc unsupported { args } {
+global module_do
+lset module_do 1 "N"
+return [saved-unsupported $args]
+}
+
  # not grouped tests, sadly tcl doesn't have negated glob
  foreach test [prune [lsort [find $srcdir/$subdir {*.[CH]}]] \
  "$srcdir/$subdir/*_?.\[CH\]"] {
@@ -327,6 +335,9 @@ foreach test [prune [lsort [find $srcdir/$subdir {*.[CH]}]] 
\
set module_cmis {}
verbose "Testing $nshort $std" 1
dg-test $test "$std" $DEFAULT_MODFLAGS
+   if { [lindex $module_do 1] == "N" } {
+   continue
+   }
set testcase [string range $test [string length "$srcdir/"] end]
cleanup_module_files [module_cmi_p $testcase $module_cmis]
}
@@ -372,6 +383,9 @@ foreach src [lsort [find $srcdir/$subdir {*_a.[CHX}]] {
}
}
dg-test -keep-output $test "$std" $DEFAULT_MODFLAGS
+   if { [lindex $module_do 1] == "N" } {
+   break
+   }
set testcase [string range $test [string length "$srcdir/"] 
end]
lappend mod_files [module_cmi_p $testcase $module_cmis]
}





Re: RISC-V: Add divmod instruction support

2023-02-20 Thread Andrew Waterman via Gcc-patches
On Sat, Feb 18, 2023 at 1:30 PM Palmer Dabbelt  wrote:
>
> On Sat, 18 Feb 2023 13:06:02 PST (-0800), jeffreya...@gmail.com wrote:
> >
> >
> > On 2/18/23 11:26, Palmer Dabbelt wrote:
> >> On Fri, 17 Feb 2023 06:02:40 PST (-0800), gcc-patches@gcc.gnu.org wrote:
> >>> Hi all,
> >>> If we have division and remainder calculations with the same operands:
> >>>
> >>>   a = b / c;
> >>>   d = b % c;
> >>>
> >>> We can replace the calculation of remainder with multiplication +
> >>> subtraction, using the result from the previous division:
> >>>
> >>>   a = b / c;
> >>>   d = a * c;
> >>>   d = b - d;
> >>>
> >>> Which will be faster.
> >>
> >> Do you have any benchmarks that show that performance increase?  The ISA
> >> manual specifically says the suggested sequence is div+mod, and while
> >> those suggestions don't always pan out for real hardware it's likely
> >> that at least some implementations will end up with the ISA-suggested
> >> fusions.
> > It'll almost certainly be visible in mcf.  Been there, done that.  In
> > fact, that's why I asked the team Matevos works on to poke at this case
> > as I went through this issue on another processor.
> >
> > It can also be run through LLVM's MCA to estimate counts if you've got a
> > pipeline description.  The div+rem will come out at around ~40c while a
> > div+mul+sub should weigh in around 25c for Veyron v1.
>
> Do you have a link to the patches somewhere?  I couldn't find them
> online, just the custom instruction support.  Or even just some docs
> describing what the pipeline does, as just basing one performance model
> on another is kind of a double-edged sword.
>
> That said, I think just knowing the processor doesn't do the div+mod
> fusion is sufficient to turn something like this on for the mtune for
> that processor.  That's different than turning it on globally, though --
> unless it turns out nobody is actually doing the fusion suggested in the
> ISA manual, which wouldn't be super surprising.
>
> Maybe some of the SiFive and T-Head folks can chime in on whether or not
> their processors perform the fusion in question -- and if so, do the
> instructions need to stay back-to-back?

AFAIK, the sequence with the multiplication will normally be faster on
SiFive cores when both the quotient and the remainder are needed.
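
For reference, the transformation under discussion can be written out in a
few lines of C++.  The struct and function names are made up for
illustration, and the sketch assumes c != 0 and that b / c does not overflow.

```cpp
// Result pair for a combined division and remainder.
struct DivRem { long quot, rem; };

// Compute the remainder from the quotient with a multiply and a subtract
// instead of issuing a second division-class instruction.
inline DivRem div_then_mulsub (long b, long c)
{
  long a = b / c;   // a = b / c
  long d = a * c;   // d = a * c
  return {a, b - d};
}
```

Because C and C++ division truncates toward zero, b - (b / c) * c equals
b % c for negative operands as well.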

>  It doesn't look like we're
> really targeting the code sequences the ISA suggests as it stands, so
> maybe it's OK to just switch the default over too?
>
> It also brings up the question of mulh+mul fusions, which I don't think
> we've really looked at (though maybe they're a lot less important for
> rv64).


Re: [PATCH 1/2] c++: factor out TYPENAME_TYPE substitution

2023-02-20 Thread Jason Merrill via Gcc-patches

On 2/15/23 12:11, Patrick Palka wrote:

On Wed, 15 Feb 2023, Jason Merrill wrote:


On 2/15/23 09:21, Patrick Palka wrote:

On Tue, 14 Feb 2023, Jason Merrill wrote:


On 2/13/23 09:23, Patrick Palka wrote:

[N.B. this is a corrected version of
https://gcc.gnu.org/pipermail/gcc-patches/2022-November/607443.html ]

This patch factors out the TYPENAME_TYPE case of tsubst into a separate
function tsubst_typename_type.  It also factors out the two tsubst flags
controlling TYPENAME_TYPE substitution, tf_keep_type_decl and tf_tst_ok,
into distinct boolean parameters of this new function (and of
make_typename_type).  Consequently, callers which used to pass tf_tst_ok
to tsubst now instead must directly call tsubst_typename_type when
appropriate.


Hmm, I don't love how that turns 4 lines into 8 more complex lines in each
caller.  And the previous approach of saying "a CTAD placeholder is OK"
seems like a better abstraction than repeating the specific TYPENAME_TYPE
handling in each place.


Ah yeah, I see what you mean.  I was thinking since tf_tst_ok is
specific to TYPENAME_TYPE handling and isn't propagated (i.e. it only
affects top-level TYPENAME_TYPEs), it seemed cleaner to encode the flag
as a bool parameter "template_ok" of tsubst_typename_type instead of as
a global tsubst_flag that gets propagated freely.




In a subsequent patch we'll add another flag to
tsubst_typename_type controlling whether we want to ignore non-types
during the qualified lookup.


As mentioned above, the second patch in this series would just add
another flag "type_only" alongside "template_ok", since this flag will
also only affect top-level TYPENAME_TYPEs and doesn't need to propagate
like tsubst_flags.

Except, it turns out, this new flag _does_ need to propagate, namely when
expanding a variadic using:

using typename Ts::type::m...; // from typename25a.C below

Here we have a USING_DECL whose USING_DECL_SCOPE is a
TYPE_PACK_EXPANSION over TYPENAME_TYPE.  In order to correctly
substitute this TYPENAME_TYPE, the USING_DECL case of tsubst_decl needs
to pass an appropriate tsubst_flag to tsubst_pack_expansion to be
propagated to tsubst (to be propagated to make_typename_type).

So in light of this case it seems adding a new tsubst_flag is the
way to go, which means we can avoid this refactoring patch entirely.

Like so?  Bootstrapped and regtested on x86_64-pc-linux-gnu.


OK, though I still wonder about adding a tsubst_scope function that would add
the tf_qualifying_scope.


Hmm, but we need to add tf_qualifying_scope to two tsubst_copy calls,
one tsubst call and one tsubst_aggr_type call (with entering_scope=true).
Would tsubst_scope call tsubst, tsubst_copy or tsubst_aggr_type?


In general it would call tsubst.

It's odd that anything is calling tsubst_copy with a type, that seems 
like a copy/paste error.  But it just hands off to tsubst anyway, so the 
effect is the same.


tsubst_aggr_type is needed when pushing into the scope of a declarator; 
I don't know offhand why we would need that when substituting the scope 
of a TYPENAME_TYPE.


I'd call tsubst, and leave the tsubst_aggr_type call alone for GCC 13.


-- >8 --

Subject: [PATCH] c++: TYPENAME_TYPE lookup ignoring non-types [PR107773]

Currently when resolving a TYPENAME_TYPE for 'typename T::m' via
make_typename_type, we consider only type bindings of 'm' and ignore
non-type ones.  But [temp.res.general]/3 says, in a note, "the usual
qualified name lookup ([basic.lookup.qual]) applies even in the presence
of 'typename'", and qualified name lookup doesn't discriminate between
type and non-type bindings.  So when resolving such a TYPENAME_TYPE
we want the lookup to consider all bindings.

An exception is when we have a TYPENAME_TYPE corresponding to the
qualifying scope of the :: scope resolution operator, such as
'T::type' in 'typename T::type::m'.  In that case, [basic.lookup.qual]/1
applies, and lookup for such a TYPENAME_TYPE must ignore non-type bindings.
So in order to correctly handle all cases, make_typename_type needs an
additional flag controlling whether lookup should ignore non-types or not.

To that end this patch adds a new tsubst flag tf_qualifying_scope to
communicate to make_typename_type whether we want to ignore non-type
bindings during the lookup (by default we don't want to ignore them).
In contexts where we do want to ignore non-types (when substituting
into the scope of TYPENAME_TYPE, SCOPE_REF or USING_DECL) we simply
pass tf_qualifying_scope to the relevant tsubst / tsubst_copy call.
This flag is intended to apply only to top-level TYPENAME_TYPEs so
we must be careful to clear the flag to avoid propagating it during
substitution of sub-trees.
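
As a minimal sketch of the two positions a name can occupy, consider the
following, which is not one of the testcases from this patch and uses
hypothetical names.  'T::type' is the qualifying scope of the trailing '::m',
so per [basic.lookup.qual]/1 its lookup considers only types, while 'm' is
the name the whole typename-specifier resolves to.  Every name here is
unambiguously a type, so the example compiles either way; it only shows where
each rule applies.

```cpp
#include <type_traits>

struct Traits
{
  struct type { using m = int; };
};

template<typename T>
struct Uses
{
  // 'T::type' is a qualifying scope: lookup for it ignores non-types.
  // 'm' is the final name: the usual qualified lookup applies to it.
  using result = typename T::type::m;
};

static_assert (std::is_same<Uses<Traits>::result, int>::value,
	       "typename T::type::m should resolve to int");
```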

PR c++/107773

gcc/cp/ChangeLog:

* cp-tree.h (enum tsubst_flags): New flag tf_qualifying_scope.
* decl.cc (make_typename_type): Use lookup_member instead of
lookup_field.  If tf_qualifying_scope is set, pass want_type=true
instead of =false to lookup_member.  

Re: [og12] Un-break nvptx libgomp build (was: [OG12][committed] amdgcn: OpenMP low-latency allocator)

2023-02-20 Thread Andrew Stubbs

On 16/02/2023 21:11, Thomas Schwinge wrote:

--- /dev/null
+++ b/libgomp/basic-allocator.c



+#ifndef BASIC_ALLOC_YIELD
+#deine BASIC_ALLOC_YIELD
+#endif


 In file included from [...]/libgomp/config/nvptx/allocator.c:49:
 [...]/libgomp/config/nvptx/../../basic-allocator.c:52:2: error: invalid 
preprocessing directive #deine; did you mean #define?
52 | #deine BASIC_ALLOC_YIELD
   |  ^
   |  define

Yes, indeed.

I've pushed to devel/omp/gcc-12 branch
commit 6cc0e7bebf1b3ad6aacf75419e7f06942409f90c
"Un-break nvptx libgomp build", see attached.


Oops, thanks Thomas.

Andrew


Re: [PATCH] c++: ICE with -fno-elide-constructors and trivial fn [PR101073]

2023-02-20 Thread Jason Merrill via Gcc-patches

On 2/15/23 13:37, Marek Polacek wrote:

On Wed, Feb 15, 2023 at 02:39:16PM -0500, Jason Merrill wrote:

On 2/9/23 09:39, Marek Polacek wrote:

In constexpr-nsdmi3.C, with -fno-elide-constructors, we don't elide
the Y::Y(const Y&) call used to initialize o.c.  So store_init_value
-> cxx_constant_init must constexpr-evaluate the call to Y::Y(const Y&)
in cxx_eval_call_expression.  It's a trivial function, so we do the
"Shortcut trivial constructor/op=" code and rather than evaluating
the function, we just create an assignment

o.c = *(const struct Y &) (const struct Y *) &(&)->b

which is a MODIFY_EXPR, so the preeval code in cxx_eval_store_expression
clears .ctor and .object, therefore we can't replace the PLACEHOLDER_EXPR
whereupon we crash at

/* A placeholder without a referent.  We can get here when
   checking whether NSDMIs are noexcept, or in massage_init_elt;
   just say it's non-constant for now.  */
gcc_assert (ctx->quiet);

The PLACEHOLDER_EXPR can also be on the LHS as in constexpr-nsdmi10.C.
I don't think we can do much here, but I noticed that the whole
trivial_fn_p (fun) block is only entered when -fno-elide-constructors.
This is true since GCC 9; it wasn't easy to bisect what changes made it
so, but r240845 is probably one of them.  -fno-elide-constructors is an
option for experiments only so it's not clear to me why we'd still want
to shortcut trivial constructor/op=.  I propose to remove the code and
add a checking assert to make sure we're not getting a trivial_fn_p
unless -fno-elide-constructors.


Hmm, trivial op= doesn't ever hit this code?


With -fno-elide-constructors we hit the trivial_fn_p block twice in
constexpr-nsdmi9.C, once for "constexpr Y::Y(const Y&)" and then for
"constexpr Y& Y::operator=(Y&&)".  So it does hit the code, but only
with -fno-elide-constructors.


Odd, I'm not sure why that would make a difference for assignment.


Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?  I don't
think I want to backport this.

PR c++/101073

gcc/cp/ChangeLog:

* constexpr.cc (cxx_eval_call_expression): Replace shortcutting trivial
constructor/op= with a checking assert.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/constexpr-nsdmi3.C: New test.
* g++.dg/cpp1y/constexpr-nsdmi10.C: New test.
---
   gcc/cp/constexpr.cc   | 25 +++
   gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C | 17 +
   .../g++.dg/cpp1y/constexpr-nsdmi10.C  | 18 +
   3 files changed, 38 insertions(+), 22 deletions(-)
   create mode 100644 gcc/testsuite/g++.dg/cpp0x/constexpr-nsdmi3.C
   create mode 100644 gcc/testsuite/g++.dg/cpp1y/constexpr-nsdmi10.C

diff --git a/gcc/cp/constexpr.cc b/gcc/cp/constexpr.cc
index 564766c8a00..1d53dcf0f20 100644
--- a/gcc/cp/constexpr.cc
+++ b/gcc/cp/constexpr.cc
@@ -2865,28 +2865,9 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 ctx = &new_ctx;
   }
-  /* Shortcut trivial constructor/op=.  */
-  if (trivial_fn_p (fun))
-{
-  tree init = NULL_TREE;
-  if (call_expr_nargs (t) == 2)
-   init = convert_from_reference (get_nth_callarg (t, 1));
-  else if (TREE_CODE (t) == AGGR_INIT_EXPR
-  && AGGR_INIT_ZERO_FIRST (t))
-   init = build_zero_init (DECL_CONTEXT (fun), NULL_TREE, false);
-  if (init)
-   {
- tree op = get_nth_callarg (t, 0);
- if (is_dummy_object (op))
-   op = ctx->object;
- else
-   op = build1 (INDIRECT_REF, TREE_TYPE (TREE_TYPE (op)), op);
- tree set = build2 (MODIFY_EXPR, TREE_TYPE (op), op, init);


I think the problem is using MODIFY_EXPR instead of INIT_EXPR to represent a
constructor; that's why cxx_eval_store_expression thinks it's OK to
preevaluate.  This should properly use those two tree codes for op= and
ctor, respectively.


Maybe it was so that the RHS in SET could refer to the op in the LHS?


I think it was just an oversight.  You need INIT_EXPR for the rhs to 
refer to the lhs.



- new_ctx.call = &new_call;
- return cxx_eval_constant_expression (&new_ctx, set, lval,
-  non_constant_p, overflow_p);
-   }
-}
+  /* We used to shortcut trivial constructor/op= here, but nowadays
+ we can only get a trivial function here with -fno-elide-constructors.  */
+  gcc_checking_assert (!trivial_fn_p (fun) || !flag_elide_constructors);


...but if this optimization is so rarely triggered, this simplification is
OK too.


I'd say that's better so that we don't have to update the code (like
r234345 did).


Indeed, the patch is OK.

Jason
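
For context, code of the following general shape is what the thread is
about: a default member initializer that copies a sibling member through a
trivial copy constructor during constant evaluation.  This is a hypothetical
reduction written for illustration, assuming C++14 or later.  It is not the
testcase from the patch, and whether this exact snippet reproduces the ICE
under -fno-elide-constructors has not been verified.

```cpp
struct Y
{
  int v;
};

struct X
{
  Y b {42};
  Y c = b;   // copies the sibling member via the trivial Y::Y(const Y&)
};

// Constant-evaluating the initializer of 'o' has to evaluate the
// initializer of 'c', which refers to the object under construction.
constexpr X o {};

static_assert (o.c.v == 42, "c should be a copy of b");
```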



Re: [Patch] Fortran: Avoid SAVE_EXPR for deferred-len char types

2023-02-20 Thread Steve Kargl via Gcc-patches
On Mon, Feb 20, 2023 at 07:56:14AM +0100, Tobias Burnus wrote:
> On 17.02.23 17:27, Steve Kargl wrote:
> > On Fri, Feb 17, 2023 at 12:13:52PM +0100, Tobias Burnus wrote:
> > > OK for mainline?
> > Short version: no.
> 
> Would you mind to write a reasoning beyond only a single word?
> 
> > > subroutine foo(n)
> > >integer :: n
> > >integer :: array(n*5)
> > >integer :: my_len
> > >...
> > >my_len = 5
> > >block
> > >  character(len=my_len, kind=4) :: str
> > > 
> > >  my_len = 99
> > >  print *, len(str)  ! still shows 5 - not 99
> > >end block
> > > end
> > Are you sure about the above comment?
> 
> Yes - for three reasons:
> * On the what-feels-right side: It does not make any sense to print
>   any other value than 5 given that 'str' has been declared with len = 5.
> * On the GCC side, the SAVE_EXPR ensures that the length is evaluated
>   early and then "saved" to ensure its original value is available
> * The quoted text from the standard implies that this is what
>   should happen.

Your comment in the above code suggests to me that you
expected 99.  Of course, the print statement should
produce 5, and that is what gfortran does.  If your patch
only affects deferred character types, why are you including
a useless code example?

-- 
steve


Ping [PATCH, rs6000] Split TImode for logical operations in expand pass [PR100694]

2023-02-20 Thread HAO CHEN GUI via Gcc-patches
Hi,
  Gently ping this:
  https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611550.html

Gui Haochen
Thanks

On 2023/2/8 13:08, HAO CHEN GUI wrote:
> Hi,
>   The logical operations for TImode are currently split after the reload pass.
> Some potential optimizations are missed because the split is too late.  This
> patch removes TImode from the "AND", "IOR", "XOR" and "NOT" expanders so that
> these logical operations can be split at the expand pass.  The new test case
> illustrates the optimization.
> 
>   Two test cases of pr92398 are merged into one as all sub-targets generate
> the same sequence of instructions with the patch.
> 
>   Bootstrapped and tested on powerpc64-linux BE and LE with no regressions.
> 
> Thanks
> Gui Haochen
> 
> 
> ChangeLog
> 2023-02-08  Haochen Gui 
> 
> gcc/
>   PR target/100694
>   * config/rs6000/rs6000.md (BOOL_128_V): New mode iterator for 128-bit
>   vector types.
>   (and3): Replace BOOL_128 with BOOL_128_V.
>   (ior3): Likewise.
>   (xor3): Likewise.
>   (one_cmpl2 expander): New expander with BOOL_128_V.
>   (one_cmpl2 insn_and_split): Rename to ...
>   (*one_cmpl2): ... this.
> 
> gcc/testsuite/
>   PR target/100694
>   * gcc.target/powerpc/pr100694.c: New.
>   * gcc.target/powerpc/pr92398.c: New.
>   * gcc.target/powerpc/pr92398.h: Remove.
>   * gcc.target/powerpc/pr92398.p9-.c: Remove.
>   * gcc.target/powerpc/pr92398.p9+.c: Remove.
> 
> 
> patch.diff
> diff --git a/gcc/config/rs6000/rs6000.md b/gcc/config/rs6000/rs6000.md
> index 4bd1dfd3da9..455b7329643 100644
> --- a/gcc/config/rs6000/rs6000.md
> +++ b/gcc/config/rs6000/rs6000.md
> @@ -743,6 +743,15 @@ (define_mode_iterator BOOL_128   [TI
>(V2DF  "TARGET_ALTIVEC")
>(V1TI  "TARGET_ALTIVEC")])
> 
> +;; Mode iterator for logical operations on 128-bit vector types
> +(define_mode_iterator BOOL_128_V [(V16QI "TARGET_ALTIVEC")
> +  (V8HI  "TARGET_ALTIVEC")
> +  (V4SI  "TARGET_ALTIVEC")
> +  (V4SF  "TARGET_ALTIVEC")
> +  (V2DI  "TARGET_ALTIVEC")
> +  (V2DF  "TARGET_ALTIVEC")
> +  (V1TI  "TARGET_ALTIVEC")])
> +
>  ;; For the GPRs we use 3 constraints for register outputs, two that are the
>  ;; same as the output register, and a third where the output register is an
>  ;; early clobber, so we don't have to deal with register overlaps.  For the
> @@ -7135,23 +7144,23 @@ (define_expand "subti3"
>  ;; 128-bit logical operations expanders
> 
>  (define_expand "and3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> - (and:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -   (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> + (and:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> + (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>""
>"")
> 
>  (define_expand "ior3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -(ior:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -   (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> + (ior:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> + (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>""
>"")
> 
>  (define_expand "xor3"
> -  [(set (match_operand:BOOL_128 0 "vlogical_operand")
> -(xor:BOOL_128 (match_operand:BOOL_128 1 "vlogical_operand")
> -   (match_operand:BOOL_128 2 "vlogical_operand")))]
> +  [(set (match_operand:BOOL_128_V 0 "vlogical_operand")
> + (xor:BOOL_128_V (match_operand:BOOL_128_V 1 "vlogical_operand")
> + (match_operand:BOOL_128_V 2 "vlogical_operand")))]
>""
>"")
> 
> @@ -7449,7 +7458,14 @@ (define_insn_and_split "*eqv3_internal2"
>(const_string "16")))])
> 
>  ;; 128-bit one's complement
> -(define_insn_and_split "one_cmpl2"
> +(define_expand "one_cmpl2"
> +[(set (match_operand:BOOL_128_V 0 "vlogical_operand" "=")
> + (not:BOOL_128_V
> +   (match_operand:BOOL_128_V 1 "vlogical_operand" "")))]
> +  ""
> +  "")
> +
> +(define_insn_and_split "*one_cmpl2"
>[(set (match_operand:BOOL_128 0 "vlogical_operand" "=")
>   (not:BOOL_128
> (match_operand:BOOL_128 1 "vlogical_operand" "")))]
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr100694.c 
> b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> new file mode 100644
> index 000..96a895d6c44
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr100694.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target int128 } */
> +/* { dg-options "-O2" } */
> +/* { dg-final { 

Ping^^ [PATCH V4 2/2] rs6000: use li;x?oris to build constant

2023-02-20 Thread Jiufu Guo via Gcc-patches
Hi,

Gentle ping:
https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608292.html

BR,
Jeff (Jiufu)


Jiufu Guo via Gcc-patches  writes:

> Hi,
>
> I would like to have a ping on this patch:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/608292.html
>
>
> BR,
> Jeff (Jiufu)
>
>
> Jiufu Guo  writes:
>
>> Hi,
>>
>> For constant C:
>> If '(c & 0xFFFFFFFF8000FFFFULL) == 0xFFFFFFFF00000000ULL' or say:
>> 32(1) || 1(0) || 15(x) || 16(0), we could use "lis; xoris" to build.
>>
>> Here N(M) means N consecutive bits of value M; x for M means either
>> 1 or 0 is OK; '||' means concatenation.
>>
>> This patch updates rs6000_emit_set_long_const to support those constants.
>>
>> Compared with the previous version:
>> https://gcc.gnu.org/pipermail/gcc-patches/2022-December/607618.htm
>> this patch fixes conflicts with trunk.
>>
>> Bootstrap and regtest pass on ppc64{,le}.
>>
>> Is this ok for trunk?
>>
>> BR,
>> Jeff (Jiufu)
>>
>>
>>  PR target/106708
>>
>> gcc/ChangeLog:
>>
>>  * config/rs6000/rs6000.cc (rs6000_emit_set_long_const): Add to build
>>  constants through "lis; xoris".
>>
>> gcc/testsuite/ChangeLog:
>>
>>  * gcc.target/powerpc/pr106708.c: Add test function.
>>
>> ---
>>  gcc/config/rs6000/rs6000.cc |  7 +++
>>  gcc/testsuite/gcc.target/powerpc/pr106708.c | 10 +-
>>  2 files changed, 16 insertions(+), 1 deletion(-)
>>
>> diff --git a/gcc/config/rs6000/rs6000.cc b/gcc/config/rs6000/rs6000.cc
>> index 8c1192a10c8..1138d5e8cd4 100644
>> --- a/gcc/config/rs6000/rs6000.cc
>> +++ b/gcc/config/rs6000/rs6000.cc
>> @@ -10251,6 +10251,13 @@ rs6000_emit_set_long_const (rtx dest, HOST_WIDE_INT 
>> c)
>>if (ud1 != 0)
>>  emit_move_insn (dest, gen_rtx_IOR (DImode, temp, GEN_INT (ud1)));
>>  }
>> +  else if (ud4 == 0x && ud3 == 0x && !(ud2 & 0x8000) && ud1 == 0)
>> +{
>> +  /* lis; xoris */
>> +  temp = !can_create_pseudo_p () ? dest : gen_reg_rtx (DImode);
>> +  emit_move_insn (temp, GEN_INT (sext_hwi ((ud2 | 0x8000) << 16, 32)));
>> +  emit_move_insn (dest, gen_rtx_XOR (DImode, temp, GEN_INT 
>> (0x8000)));
>> +}
>>else if (ud4 == 0x && ud3 == 0x && (ud1 & 0x8000))
>>  {
>>/* li; xoris */
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106708.c 
>> b/gcc/testsuite/gcc.target/powerpc/pr106708.c
>> index dc9ceda8367..a015c71e630 100644
>> --- a/gcc/testsuite/gcc.target/powerpc/pr106708.c
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106708.c
>> @@ -4,7 +4,7 @@
>>  /* { dg-require-effective-target has_arch_ppc64 } */
>>  
>>  long long arr[]
>> -  = {0x7cdeab55LL, 0x98765432LL, 0xabcdLL};
>> += {0x7cdeab55LL, 0x98765432LL, 0xabcdLL, 0x6543LL};
>>  
>>  void __attribute__ ((__noipa__)) lixoris (long long *arg)
>>  {
>> @@ -27,6 +27,13 @@ void __attribute__ ((__noipa__)) lisrldicl (long long 
>> *arg)
>>  /* { dg-final { scan-assembler-times {\mlis .*,0xabcd\M} 1 } } */
>>  /* { dg-final { scan-assembler-times {\mrldicl .*,0,32\M} 1 } } */
>>  
>> +void __attribute__ ((__noipa__)) lisxoris (long long *arg)
>> +{
>> +  *arg = 0x6543LL;
>> +}
>> +/* { dg-final { scan-assembler-times {\mlis .*,0xe543\M} 1 } } */
>> +/* { dg-final { scan-assembler-times {\mxoris .*0x8000\M} 1 } } */
>> +
>>  int
>>  main ()
>>  {
>> @@ -35,6 +42,7 @@ main ()
>>lixoris (a);
>>lioris (a + 1);
>>lisrldicl (a + 2);
>> +  lisxoris (a + 3);
>>if (__builtin_memcmp (a, arr, sizeof (arr)) != 0)
>>  __builtin_abort ();
>>return 0;
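
The "lis; xoris" case from the quoted description can be sketched as a small C model. This is only an illustration, not the code in rs6000.cc: the function names below are made up for the sketch, and the two 64-bit masks are an assumption derived from the bit pattern "32(1) || 1(0) || 15(x) || 16(0)", since the hex digits are incomplete in the archived text.

```c
#include <assert.h>
#include <stdint.h>

/* Model of "lis rD,imm": imm << 16, sign-extended from 32 to 64 bits.  */
uint64_t model_lis (uint16_t imm)
{
  uint64_t v = (uint64_t) imm << 16;
  if (v & 0x80000000ULL)
    v |= 0xFFFFFFFF00000000ULL;
  return v;
}

/* Model of "xoris rD,rS,imm": rS ^ (imm << 16), imm zero-extended.  */
uint64_t model_xoris (uint64_t rs, uint16_t imm)
{
  return rs ^ ((uint64_t) imm << 16);
}

/* Nonzero if C has the shape 32(1) || 1(0) || 15(x) || 16(0).
   The masks are reconstructed from that description (assumption).  */
int fits_lis_xoris (uint64_t c)
{
  return (c & 0xFFFFFFFF8000FFFFULL) == 0xFFFFFFFF00000000ULL;
}

/* Rebuild C with two modeled insns.  lis is given ud2 with bit 15 set,
   so the sign extension fills the high word with ones; xoris then
   clears bit 31 of the result again.  */
uint64_t build_lis_xoris (uint64_t c)
{
  assert (fits_lis_xoris (c));
  uint16_t ud2 = (uint16_t) ((c >> 16) & 0xFFFF);
  uint64_t temp = model_lis ((uint16_t) (ud2 | 0x8000));
  return model_xoris (temp, 0x8000);
}
```

For a constant of that shape with ud2 = 0x6543, the sketch loads 0xe543 with the modeled lis and then applies the modeled xoris with 0x8000, which lines up with the two scan-assembler patterns added to the test.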


[PATCH, rs6000] Merge two vector shift when their sources are the same

2023-02-20 Thread HAO CHEN GUI via Gcc-patches
Hi,
  This patch merges two "vsldoi" insns when their sources are the
same. In particular, the pair is simplified to a single move if the total
shift is a multiple of 16 bytes.

  Bootstrapped and tested on powerpc64-linux BE and LE with no
regressions.

Thanks
Gui Haochen


ChangeLog
2023-02-20  Haochen Gui 

gcc/
* config/rs6000/altivec.md (*altivec_vsldoi_dup_): New
insn_and_split to merge two vsldoi.

gcc/testsuite/
* gcc.target/powerpc/vsldoi_merge.c: New.


patch.diff
diff --git a/gcc/config/rs6000/altivec.md b/gcc/config/rs6000/altivec.md
index 84660073f32..22e9c4c1fc5 100644
--- a/gcc/config/rs6000/altivec.md
+++ b/gcc/config/rs6000/altivec.md
@@ -2529,6 +2529,35 @@ (define_insn "altivec_vsldoi_"
   "vsldoi %0,%1,%2,%3"
   [(set_attr "type" "vecperm")])

+(define_insn_and_split "*altivec_vsldoi_dup_"
+  [(set (match_operand:VM 0 "register_operand" "=v")
+   (unspec:VM [(unspec:VM [(match_operand:VM 1 "register_operand" "v")
+   (match_operand:VM 2 "register_operand" "v")
+   (match_operand:QI 3 "immediate_operand" "i")]
+  UNSPEC_VSLDOI)
+   (unspec:VM [(match_dup 1)
+   (match_dup 2)
+   (match_dup 3)]
+  UNSPEC_VSLDOI)
+   (match_operand:QI 4 "immediate_operand" "i")]
+  UNSPEC_VSLDOI))]
+  "TARGET_ALTIVEC"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+{
+  unsigned int shift1 = UINTVAL (operands[3]);
+  unsigned int shift2 = UINTVAL (operands[4]);
+
+  unsigned int shift = (shift1 + shift2) % 16;
+  if (shift)
+emit_insn (gen_altivec_vsldoi_ (operands[0], operands[1],
+ operands[1], GEN_INT (shift)));
+  else
+emit_move_insn (operands[0], operands[1]);
+  DONE;
+})
+
 (define_insn "altivec_vupkhs"
   [(set (match_operand:VP 0 "register_operand" "=v")
(unspec:VP [(match_operand: 1 "register_operand" "v")]
diff --git a/gcc/testsuite/gcc.target/powerpc/vsldoi_merge.c 
b/gcc/testsuite/gcc.target/powerpc/vsldoi_merge.c
new file mode 100644
index 000..4ea72561282
--- /dev/null
+++ b/gcc/testsuite/gcc.target/powerpc/vsldoi_merge.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_vsx_ok } */
+/* { dg-options "-O2 -mvsx" } */
+
+#include "altivec.h"
+
+vector signed int test1 (vector signed int a, vector signed int b)
+{
+  a = vec_sld (a, b, 2);
+  a = vec_sld (a, a, 4);
+  return a;
+}
+
+vector signed int test2 (vector signed int a, vector signed int b)
+{
+  a = vec_sld (a, b, 14);
+  a = vec_sld (a, a, 2);
+  return a;
+}
+
+/* { dg-final { scan-assembler-times {\mvsldoi\M} 1 } } */
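
The arithmetic behind the merge can be checked with a byte-level C model. This is a sketch under stated assumptions, not the insn pattern itself: vec128, vsldoi_model, iota_vec and merged_matches_pair are hypothetical names, bytes are numbered big-endian style (byte 0 is the leftmost), and only the simplest case is modeled, where both shifts take one and the same vector as both inputs. It does not try to cover a first shift with two distinct inputs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct { uint8_t b[16]; } vec128;

/* Model of vsldoi: concatenate A || B, shift left by SH bytes and keep
   the leftmost 16 bytes.  */
vec128 vsldoi_model (vec128 a, vec128 b, unsigned sh)
{
  uint8_t cat[32];
  vec128 r;
  assert (sh < 16);
  memcpy (cat, a.b, 16);
  memcpy (cat + 16, b.b, 16);
  memcpy (r.b, cat + sh, 16);
  return r;
}

/* Test vector with bytes 0, 1, ..., 15.  */
vec128 iota_vec (void)
{
  vec128 v;
  for (int i = 0; i < 16; i++)
    v.b[i] = (uint8_t) i;
  return v;
}

/* Nonzero if two rotations of X by S1 and then S2 bytes equal one
   rotation by (S1 + S2) % 16, or a plain copy when that sum is 0.  */
int merged_matches_pair (vec128 x, unsigned s1, unsigned s2)
{
  vec128 t = vsldoi_model (x, x, s1);
  vec128 two_steps = vsldoi_model (t, t, s2);
  unsigned shift = (s1 + s2) % 16;
  vec128 one_step = shift ? vsldoi_model (x, x, shift) : x;
  return memcmp (two_steps.b, one_step.b, 16) == 0;
}
```

With both inputs equal, each vsldoi is a byte rotation, so two rotations compose into one, and a total of 16 bytes brings the vector back to where it started.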


Re: [og12] Attempt to register OpenMP pinned memory using a device instead of 'mlock' (was: [PATCH] libgomp, openmp: pinned memory)

2023-02-20 Thread Andrew Stubbs

On 17/02/2023 08:12, Thomas Schwinge wrote:

Hi Andrew!

On 2023-02-16T23:06:44+0100, I wrote:

On 2023-02-16T16:17:32+, "Stubbs, Andrew via Gcc-patches" 
 wrote:

The mmap implementation was not optimized for a lot of small allocations, and I 
can't see that issue changing here


That's correct, 'mmap' remains.  Under the hood, 'cuMemHostRegister' must
surely also be doing some 'mlock'-like thing, so I figured it's best to
feed page-boundary memory regions to it, which 'mmap' gets us.


so I don't know if this can be used for mlockall replacement.

I had assumed that using the Cuda allocator would fix that limitation.


From what I've read (but no first-hand experiments), there's non-trivial
overhead with 'cuMemHostRegister' (just like with 'mlock'), so routing
all small allocations individually through it probably isn't a good idea
either.  Therefore, I suppose, we'll indeed want to use some local
allocator if we wish this "optimized for a lot of small allocations".


Eh, I suppose your point indirectly was that instead of 'mmap' plus
'cuMemHostRegister' we ought to use 'cuMemAllocHost'/'cuMemHostAlloc', as
we assume those already do implement such a local allocator.  Let me
quickly change that indeed -- we don't currently have a need to use
'cuMemHostRegister' instead of 'cuMemAllocHost'/'cuMemHostAlloc'.



Yes, that's right. I suppose it makes sense to register memory we 
already have, but if we want new memory then trying to reinvent what 
happens inside cuMemAllocHost is pointless.


Andrew
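
For reference, the 'mmap' plus 'mlock' style of pinning that this thread compares against can be sketched in plain C. This is not libgomp's code: round_up_to_page and pinned_alloc_sketch are hypothetical names, and the CUDA calls discussed above are deliberately left out so that the sketch builds without the CUDA headers.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round SIZE up to a multiple of PAGE, which must be a power of two.  */
size_t round_up_to_page (size_t size, size_t page)
{
  assert (page != 0 && (page & (page - 1)) == 0);
  return (size + page - 1) & ~(page - 1);
}

/* Map a page-aligned anonymous region of at least SIZE bytes and lock
   it into RAM.  Returns NULL on failure; on success *MAPPED_SIZE holds
   the length to pass to munmap later.  */
void *pinned_alloc_sketch (size_t size, size_t *mapped_size)
{
  long page = sysconf (_SC_PAGESIZE);
  if (page <= 0 || size == 0)
    return NULL;

  size_t len = round_up_to_page (size, (size_t) page);
  void *p = mmap (NULL, len, PROT_READ | PROT_WRITE,
                  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (p == MAP_FAILED)
    return NULL;

  if (mlock (p, len) != 0)
    {
      /* Locking can fail, for example when RLIMIT_MEMLOCK is low.  */
      munmap (p, len);
      return NULL;
    }

  *mapped_size = len;
  return p;
}
```

Every request costs at least one whole page plus the system calls, which is the "not optimized for a lot of small allocations" limitation mentioned earlier in the thread.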


Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-20 Thread Kewen.Lin via Gcc-patches
Hi Segher,

Thanks for the comments!

on 2023/2/19 20:12, Segher Boessenkool wrote:
> Hi!
> 
> On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote:
>> on 2023/2/16 23:10, Segher Boessenkool wrote:
>>> No, you are right that the semantics are pretty much the same.  Please
>>> just keep UNSPEC_PARITY everywhere.
>>
>> OK, since it has UNSPEC, I would hope the reader can realize it's
>> different from RTL opcode parity and mainly operating on byte.  :)
> 
> Yeah.  Often, even usually, unspecs differ in some crucial ways from
> similarly named RTL expressions: you would not want an unspec at all
> otherwise!
> 
>>> Ah, because it cannot use the expander here, it has to be a define_insn?
>>
>> No, the above statement seems to cause some misunderstanding, let me clarify:
>> first, the built-in functions __builtin_altivec_vprtyb[wdq] require to be
>> mapped to hardware insns vprtyb[wdq] directly as the functions name show.
> 
> No, that is not true at all.  Builtins do **not** guarantee to expand to
> any specific machine instruction.  This is one reason why such names are
> not so good, are quite misleading.

OK, I agree that we don't claim there is a 1-1 map, but for those bifs
*_(vsx|altivec)_, it looks like we map them to the corresponding hw
insn (mnemonic in the name) all the time.  IMHO, it makes sense, since otherwise
it would be quite misleading (it should use one general name instead).

For this particular built-in __builtin_altivec_vprtyb[wdq], I think we all
agree that we don't want to expand it into vpopcntb + vprtyb[wdq].  :)

> 
> If you want specific machine insns, you need to use inline asm, that is
> what it is there for.  Builtins generate code with some specified
> semantics, nothing more, nothing less; just like everything else the
> compiler does, the "as-if" rule in full swing.
> 
>>>> The name is updated from previous *p9v_parity2 (becoming
>>>> to a named define_insn), I noticed there are some names with
>>>> p8v_, p9v_, meant to keep it consistent with the context.
>>>> You want this to be simplified as parity*b*v2di2?
>>>
>>> Without the "b".  But that would be better then, yes.  This is a great
>>> example why p9v_ in the name is not good: most users do not care at all
>>> what ISA version this insn first appeared in.
>>
>> The name without "b" is standard pattern name, whose semantic doesn't align
>> with what these insns provide
> 
> Heh, it is never easy is it?  :-)

Yeah. :)

> 
>> and we already have the matched expander with
>> it ("parity2"), so we can't use the name here :(.  As you felt a name
>> with "b" is better than "p9v_*", I'll go with "parityb" then.  :)
> 
> Something longer and less confusing please.  Or maybe just with the insn
> name, that isn't a problem in the machine description (as it is for
> builtin names or other user-facing stuff).  "rs6000_vprtyb" maybe?

Thanks for the suggestion!  Will go with "rs6000_vprtyb" if the others in
v2 [1] look good to you.

[1] https://gcc.gnu.org/pipermail/gcc-patches/2023-February/612212.html

BR,
Kewen
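
To make the "operating on byte" point concrete, here is a scalar C model of the difference between full parity and the byte-wise parity insn. It is a sketch only: the function names are made up, one 64-bit doubleword stands in for a vector element, and the prtyb model follows my understanding of the ISA (XOR of the least significant bit of each byte).

```c
#include <assert.h>
#include <stdint.h>

/* Parity of all 64 bits, which is what the RTL parity code means.  */
unsigned full_parity64 (uint64_t x)
{
  unsigned p = 0;
  while (x)
    {
      p ^= 1;
      x &= x - 1;	/* Clear the lowest set bit.  */
    }
  return p;
}

/* Model of vpopcntb on one doubleword: each byte is replaced by the
   number of bits set in it.  */
uint64_t popcntb_model (uint64_t x)
{
  uint64_t r = 0;
  for (int i = 0; i < 8; i++)
    {
      uint64_t byte = (x >> (8 * i)) & 0xFF;
      uint64_t count = 0;
      while (byte)
	{
	  count++;
	  byte &= byte - 1;
	}
      r |= count << (8 * i);
    }
  return r;
}

/* Model of vprtybd on one doubleword: XOR of the least significant
   bit of each of the 8 bytes.  */
unsigned prtyb_model (uint64_t x)
{
  unsigned p = 0;
  for (int i = 0; i < 8; i++)
    p ^= (unsigned) ((x >> (8 * i)) & 1);
  return p;
}

/* Full parity is the byte-wise parity of the per-byte popcounts.  */
void check_parity_identity (uint64_t x)
{
  assert (full_parity64 (x) == prtyb_model (popcntb_model (x)));
}
```

The value 0x03 shows the mismatch: the byte-wise model returns 1 while the full parity is 0. That is why the standard parity pattern needs the popcount step first, and why the builtin wants only the second step.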


Re: [PATCH] [arm] complete vmsr/vmrs blank and case adjustments

2023-02-20 Thread Andrea Corallo via Gcc-patches
Alexandre Oliva  writes:

> Back in September last year, some of the vmsr and vmrs patterns had an
> extraneous blank removed, and the case of register names lowered, but
> another instance remained, and so did a few testcases.

[...]

Hi Alexandre,

I'm not an approver, but LGTM; thanks for fixing this.

  Andrea


Re: [PATCH 0/4] rs6000: build constant via li/lis;rldicX

2023-02-20 Thread Jiufu Guo via Gcc-patches
Hi,

Gentle ping on these patches:
https://gcc.gnu.org/pipermail/gcc-patches/2023-February/611286.html

BR,
Jeff (Jiufu)


Jiufu Guo  writes:

> Hi,
>
> For a given constant, it is profitable if we can build it with 2 insns.
> This patch enables building more constants with 2 insns: one is "li" or
> "lis", the other is "rldicl", "rldicr" or "rldic".
> By checking and analyzing the characteristics of the insns "li/lis;rldicX",
> all the possible constant values are considered by this patch.
>
> A patch was posted previously, but it was too large:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-September/601276.html
> As suggested, I split it into this series.
>
> Considering the functionality and size, it is split into 4 patches as below:
> 1. Support the constants which can be built by "li;rotldi"
>    Both positive and negative values from insn "li" are analyzed.
> 2. Support the constants which can be built by "lis;rotldi"
>    We only need to analyze the negative value from "lis".
>    And this patch uses more code to check leading 1s and trailing 0s from
>    "lis".
> 3. Support the constants which can be built by "li/lis;rldicl/rldicr":
>    Leveraging the APIs defined/analyzed in patches 1 and 2,
>    this patch checks the characteristics of the mask of "rldicl/rldicr"
>    to support more constants.
> 4. Support the constants which can be built by "li/lis;rldic":
>    The mask of "rldic" is relatively complicated; it is analyzed in this
>    patch to support more constants.
>
> BR,
> Jeff (Jiufu)
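
As an illustration of item 1, a brute-force C check for "li;rotldi" constants could look like the sketch below. It is not the method used in the patches, which by the description analyze the bit layout rather than try every rotation; rotl64, is_li_value, find_li_rotldi and can_li_rotldi are hypothetical names introduced only for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate X left by N bits (N is taken modulo 64).  */
uint64_t rotl64 (uint64_t x, unsigned n)
{
  n &= 63;
  return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Nonzero if V is a sign-extended 16-bit value, i.e. bits 63..15 are
   all zeros or all ones, so one "li" can load it.  */
int is_li_value (uint64_t v)
{
  uint64_t high = v >> 15;	/* 49 bits remain.  */
  return high == 0 || high == 0x1FFFFFFFFFFFFULL;
}

/* Try every rotation.  On success store the raw 16-bit immediate for
   "li" in *IMM and the left-rotate count for "rotldi" in *ROT.  */
int find_li_rotldi (uint64_t c, uint16_t *imm, unsigned *rot)
{
  for (unsigned n = 0; n < 64; n++)
    {
      /* Rotating right by N undoes a left rotation by N.  */
      uint64_t v = rotl64 (c, (64 - n) & 63);
      if (is_li_value (v))
	{
	  assert (rotl64 (v, n) == c);
	  *imm = (uint16_t) (v & 0xFFFF);
	  *rot = n;
	  return 1;
	}
    }
  return 0;
}

/* Convenience wrapper when only the yes/no answer is needed.  */
int can_li_rotldi (uint64_t c)
{
  uint16_t imm;
  unsigned rot;
  return find_li_rotldi (c, &imm, &rot);
}
```

The sketch accepts a constant exactly when some rotation of it has 49 equal high bits, which covers both the positive and the negative "li" values mentioned in item 1.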


Re: [PATCH v3] c++: ICE with redundant capture [PR108829]

2023-02-20 Thread Jason Merrill via Gcc-patches

On 2/17/23 14:42, Marek Polacek wrote:

On Fri, Feb 17, 2023 at 04:32:50PM -0500, Patrick Palka wrote:

On Fri, 17 Feb 2023, Patrick Palka wrote:


On Fri, 17 Feb 2023, Marek Polacek wrote:


On Fri, Feb 17, 2023 at 03:00:39PM -0500, Patrick Palka wrote:

On Fri, 17 Feb 2023, Marek Polacek via Gcc-patches wrote:


Here we crash in is_capture_proxy:

   /* Location wrappers should be stripped or otherwise handled by the
  caller before using this predicate.  */
   gcc_checking_assert (!location_wrapper_p (decl));

so fixed as the comment suggests.  We only crash with the redundant
capture:

   int abyPage = [=, abyPage] { ... }

because prune_lambda_captures is only called when there was a default
capture, and with [=] only abyPage won't be in LAMBDA_EXPR_CAPTURE_LIST.


It's weird that we even get this far in var_to_maybe_prune.  Shouldn't
LAMBDA_CAPTURE_EXPLICIT_P be true for abyPage?


Ug, I was seduced by the ostensible obviousness and failed to notice
that check.  In that light, the correct fix ought to be this.  Thanks!

Bootstrap/regtest running on x86_64-pc-linux-gnu, ok for trunk if it
passes?

-- >8 --
Here we crash in is_capture_proxy:

   /* Location wrappers should be stripped or otherwise handled by the
  caller before using this predicate.  */
   gcc_checking_assert (!location_wrapper_p (decl));

We only crash with the redundant capture:

   int abyPage = [=, abyPage] { ... }

because prune_lambda_captures is only called when there was a default
capture, and with [=] only abyPage won't be in LAMBDA_EXPR_CAPTURE_LIST.

The problem is that LAMBDA_CAPTURE_EXPLICIT_P wasn't propagated
correctly and so var_to_maybe_prune proceeded where it shouldn't.

PR c++/108829

gcc/cp/ChangeLog:

* pt.cc (tsubst_lambda_expr): Propagate LAMBDA_CAPTURE_EXPLICIT_P.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-108829.C: New test.
---
  gcc/cp/pt.cc  |  4 
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829.C | 11 +++
  2 files changed, 15 insertions(+)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index b1ac7d4beb4..f747ce877b5 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -19992,6 +19992,10 @@ tsubst_lambda_expr (tree t, tree args, tsubst_flags_t 
complain, tree in_decl)
  if (id_equal (DECL_NAME (field), "__this"))
LAMBDA_EXPR_THIS_CAPTURE (r) = field;
}
+
+  if (LAMBDA_EXPR_CAPTURE_LIST (r))
+   LAMBDA_CAPTURE_EXPLICIT_P (LAMBDA_EXPR_CAPTURE_LIST (r))
+ = LAMBDA_CAPTURE_EXPLICIT_P (LAMBDA_EXPR_CAPTURE_LIST (t));


I'm not sure how the flag works for pack captures but it looks like
this would only propagate the flag to the last expanded capture if
the capture was originally a pack.


Testcase:

   template
   void f(Ts... ts) {
 constexpr int IDX_PAGE_SIZE = 4096;
 int abyPage = [=, ts...] { return IDX_PAGE_SIZE; }();
   }
   void h() {
 f<1>(0, 1);
   }


Thanks a lot for the testcase.  Revised patch below.  Look OK?

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?


OK.


-- >8 --
Here we crash in is_capture_proxy:

   /* Location wrappers should be stripped or otherwise handled by the
  caller before using this predicate.  */
   gcc_checking_assert (!location_wrapper_p (decl));

We only crash with the redundant capture:

   int abyPage = [=, abyPage] { ... }

because prune_lambda_captures is only called when there was a default
capture, and with [=] only abyPage won't be in LAMBDA_EXPR_CAPTURE_LIST.

The problem is that LAMBDA_CAPTURE_EXPLICIT_P wasn't propagated
correctly and so var_to_maybe_prune proceeded where it shouldn't.

PR c++/108829

gcc/cp/ChangeLog:

* pt.cc (prepend_one_capture): Set LAMBDA_CAPTURE_EXPLICIT_P.
(tsubst_lambda_expr): Pass LAMBDA_CAPTURE_EXPLICIT_P to
prepend_one_capture.

gcc/testsuite/ChangeLog:

* g++.dg/cpp0x/lambda/lambda-108829-2.C: New test.
* g++.dg/cpp0x/lambda/lambda-108829.C: New test.

Co-Authored by: Patrick Palka 
---
  gcc/cp/pt.cc|  9 ++---
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829-2.C | 11 +++
  gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829.C   | 11 +++
  3 files changed, 28 insertions(+), 3 deletions(-)
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829-2.C
  create mode 100644 gcc/testsuite/g++.dg/cpp0x/lambda/lambda-108829.C

diff --git a/gcc/cp/pt.cc b/gcc/cp/pt.cc
index b1ac7d4beb4..1a071e95004 100644
--- a/gcc/cp/pt.cc
+++ b/gcc/cp/pt.cc
@@ -19870,10 +19870,11 @@ tsubst_non_call_postfix_expression (tree t, tree args,
  
  /* Subroutine of tsubst_lambda_expr: add the FIELD/INIT capture pair to the

 LAMBDA_EXPR_CAPTURE_LIST passed in LIST.  Do deduction for a previously
-   dependent init-capture.  */
+   dependent init-capture.  EXPLICIT_P is true if the original list had
+   explicit 

Re: [PATCH] rs6000: Fix vector parity support [PR108699]

2023-02-20 Thread Segher Boessenkool
Hi!

On Fri, Feb 17, 2023 at 11:33:16AM +0800, Kewen.Lin wrote:
> on 2023/2/16 23:10, Segher Boessenkool wrote:
> > No, you are right that the semantics are pretty much the same.  Please
> > just keep UNSPEC_PARITY everywhere.
> 
> OK, since it has UNSPEC, I would hope the reader can realize it's
> different from RTL opcode parity and mainly operating on byte.  :)

Yeah.  Often, even usually, unspecs differ in some crucial ways from
similarly named RTL expressions: you would not want an unspec at all
otherwise!

> > Ah, because it cannot use the expander here, it has to be a define_insn?
> 
> No, the above statement seems to cause some misunderstanding, let me clarify:
> first, the built-in functions __builtin_altivec_vprtyb[wdq] require to be
> mapped to hardware insns vprtyb[wdq] directly as the functions name show.

No, that is not true at all.  Builtins do **not** guarantee to expand to
any specific machine instruction.  This is one reason why such names are
not so good, are quite misleading.

If you want specific machine insns, you need to use inline asm, that is
what it is there for.  Builtins generate code with some specified
semantics, nothing more, nothing less; just like everything else the
compiler does, the "as-if" rule in full swing.

> >> The name is updated from previous *p9v_parity2 (becoming
> >> to a named define_insn), I noticed there are some names with
> >> p8v_, p9v_, meant to keep it consistent with the context.
> >> You want this to be simplified as parity*b*v2di2?
> > 
> > Without the "b".  But that would be better then, yes.  This is a great
> > example why p9v_ in the name is not good: most users do not care at all
> > what ISA version this insn first appeared in.
> 
> The name without "b" is standard pattern name, whose semantic doesn't align
> with what these insns provide

Heh, it is never easy is it?  :-)

> and we already have the matched expander with
> it ("parity2"), so we can't use the name here :(.  As you felt a name
> with "b" is better than "p9v_*", I'll go with "parityb" then.  :)

Something longer and less confusing please.  Or maybe just with the insn
name, that isn't a problem in the machine description (as it is for
builtin names or other user-facing stuff).  "rs6000_vprtyb" maybe?


Segher