Re: [PATCH] arm/aarch64: Add bti for all functions [PR106671]

2024-01-09 Thread Andrea Corallo
Andrea Corallo  writes:

> Feng Xue OS via Gcc-patches  writes:
>
>> This patch extends option -mbranch-protection=bti with an optional argument
>> as bti[+all] to force compiler to unconditionally insert bti for all
>> functions. Because a direct function call at the stage of compiling might be
>> rewritten to an indirect call with some kind of linker-generated thunk stub
>> as invocation relay for some reasons. One instance is if a direct callee is
>> placed far from its caller, direct BL {imm} instruction could not represent
>> the distance, so indirect BLR {reg} should be used. For this case, a bti is
>> required at the beginning of the callee.
>>
>>    caller() {
>>        bl callee
>>    }
>> 
>> =>
>> 
>>    caller() {
>>        adrp   reg, 
>>        add    reg, reg, #constant
>>        blr    reg
>>    }
>> 
>> Although the issue could be fixed with a pretty new version of ld, here we
>> provide another means for user who has to rely on the old ld or other non-ld
>> linker. I also checked LLVM, by default, it implements bti just as the
>> proposed -mbranch-protection=bti+all.
>>
>> Feng
>>
>> ---
>>  gcc/config/aarch64/aarch64.cc| 12 +++-
>>  gcc/config/aarch64/aarch64.opt   |  2 +-
>>  gcc/config/arm/aarch-bti-insert.cc   |  3 ++-
>>  gcc/config/arm/aarch-common.cc   | 22 ++
>>  gcc/config/arm/aarch-common.h| 18 ++
>>  gcc/config/arm/arm.cc|  4 ++--
>>  gcc/config/arm/arm.opt   |  2 +-
>>  gcc/doc/invoke.texi  | 16 ++--
>>  gcc/testsuite/gcc.target/aarch64/bti-5.c | 17 +
>>  9 files changed, 76 insertions(+), 20 deletions(-)
>>  create mode 100644 gcc/testsuite/gcc.target/aarch64/bti-5.c
>
> [...]
>
> Hi Feng,
>
> I think this patch is missing its ChangeLog entry.  Also you should
> specify the state of the testing and regression for this patch, please
> see [1].
>
>> diff --git a/gcc/testsuite/gcc.target/aarch64/bti-5.c b/gcc/testsuite/gcc.target/aarch64/bti-5.c
>> new file mode 100644
>> index 000..654cd0cce7e
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/aarch64/bti-5.c
>> @@ -0,0 +1,17 @@
>> +/* { dg-do run } */
>> +/* { dg-options "-O1 -save-temps" } */
>> +/* { dg-require-effective-target lp64 } */
>> +/* { dg-additional-options "-mbranch-protection=bti+all" { target { ! default_branch_protection } } } */

Also an afterthought: given that the patch enables this feature on arm as
well, wouldn't it be better to have a test case for arm too?

Thanks

  Andrea


Re: [PATCH] arm/aarch64: Add bti for all functions [PR106671]

2024-01-09 Thread Andrea Corallo
Feng Xue OS via Gcc-patches  writes:

> This patch extends option -mbranch-protection=bti with an optional argument
> as bti[+all] to force compiler to unconditionally insert bti for all
> functions. Because a direct function call at the stage of compiling might be
> rewritten to an indirect call with some kind of linker-generated thunk stub
> as invocation relay for some reasons. One instance is if a direct callee is
> placed far from its caller, direct BL {imm} instruction could not represent
> the distance, so indirect BLR {reg} should be used. For this case, a bti is
> required at the beginning of the callee.
>
>    caller() {
>        bl callee
>    }
> 
> =>
> 
>    caller() {
>        adrp   reg, 
>        add    reg, reg, #constant
>        blr    reg
>    }
> 
> Although the issue could be fixed with a pretty new version of ld, here we
> provide another means for user who has to rely on the old ld or other non-ld
> linker. I also checked LLVM, by default, it implements bti just as the
> proposed -mbranch-protection=bti+all.
>
> Feng
>
> ---
>  gcc/config/aarch64/aarch64.cc| 12 +++-
>  gcc/config/aarch64/aarch64.opt   |  2 +-
>  gcc/config/arm/aarch-bti-insert.cc   |  3 ++-
>  gcc/config/arm/aarch-common.cc   | 22 ++
>  gcc/config/arm/aarch-common.h| 18 ++
>  gcc/config/arm/arm.cc|  4 ++--
>  gcc/config/arm/arm.opt   |  2 +-
>  gcc/doc/invoke.texi  | 16 ++--
>  gcc/testsuite/gcc.target/aarch64/bti-5.c | 17 +
>  9 files changed, 76 insertions(+), 20 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/aarch64/bti-5.c

[...]

Hi Feng,

I think this patch is missing its ChangeLog entry.  Also you should
specify the state of the testing and regression for this patch, please
see [1].

> diff --git a/gcc/testsuite/gcc.target/aarch64/bti-5.c b/gcc/testsuite/gcc.target/aarch64/bti-5.c
> new file mode 100644
> index 000..654cd0cce7e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/aarch64/bti-5.c
> @@ -0,0 +1,17 @@
> +/* { dg-do run } */
> +/* { dg-options "-O1 -save-temps" } */
> +/* { dg-require-effective-target lp64 } */
> +/* { dg-additional-options "-mbranch-protection=bti+all" { target { ! default_branch_protection } } } */

I see that the other bti execution tests we have require "aarch64_bti_hw"
as an effective target.  Do you think it is not necessary here?  If so, why?

Thanks

  Andrea

[1] 
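
Note on the range limitation mentioned in the quoted patch description:
the A64 BL instruction encodes a signed 26-bit offset counted in 4-byte
words, so a direct call reaches roughly +/-128 MiB. Beyond that the
linker has to route the call through a stub that branches via a register,
which is where the BTI landing pad in the callee starts to matter. A
minimal sketch of that range check follows; the helper name bl_reaches
and the constants are illustrative only and are not taken from GCC or ld.

```cpp
#include <cassert>
#include <cstdint>

// BL holds a signed 26-bit word offset, so the byte displacement must be
// a multiple of 4 in [-2^27, 2^27 - 4].
constexpr std::int64_t kBlMin = -(std::int64_t{1} << 27);
constexpr std::int64_t kBlMax = (std::int64_t{1} << 27) - 4;

// Hypothetical helper: true if a direct BL located at `caller` can reach
// `callee`.  When it returns false, the call has to go through an
// indirect branch, as described in the patch description.
bool bl_reaches(std::uint64_t caller, std::uint64_t callee)
{
  // A64 instructions are 4-byte aligned.
  assert(caller % 4 == 0 && callee % 4 == 0);
  const std::int64_t disp = static_cast<std::int64_t>(callee - caller);
  return disp >= kBlMin && disp <= kBlMax;
}
```

With this model a callee placed 128 MiB or more after its caller is out
of reach of a plain BL.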


Re: [committed] contrib: add mdcompact

2023-10-06 Thread Andrea Corallo
Richard Biener  writes:

> On Thu, Oct 5, 2023 at 5:49 PM Andrea Corallo  wrote:
>>
>> Hello all,
>>
>> this patch checks in mdcompact, the tool written in elisp that I used
>> to mass convert all the multi choice patterns in the aarch64 back-end to
>> the new compact syntax.
>>
>> I tested it on Emacs 29 (it might run on older versions as well, I'm
>> not sure), and I also verified that it runs cleanly on a few other
>> back-ends (arm, loongarch).
>>
>> The tool can be used to convert a single pattern, an open buffer or
>> all md files in a directory.
>>
>> The tool might need further adjustment to run on some specific
>> back-end; if so, I'm very happy to help.
>>
>> This patch was pre-approved here [1].
>
> Does the result generate identical insn-*.cc files?

No, there can be indentation/aesthetic differences.

BR

  Andrea


[committed] contrib: add mdcompact

2023-10-05 Thread Andrea Corallo
Hello all,

this patch checks in mdcompact, the tool written in elisp that I used
to mass convert all the multi choice patterns in the aarch64 back-end to
the new compact syntax.

I tested it on Emacs 29 (it might run on older versions as well, I'm
not sure), and I also verified that it runs cleanly on a few other
back-ends (arm, loongarch).

The tool can be used to convert a single pattern, an open buffer or
all md files in a directory.

The tool might need further adjustment to run on some specific
back-end; if so, I'm very happy to help.

This patch was pre-approved here [1].

Best Regards

  Andrea Corallo

[1] <https://gcc.gnu.org/pipermail/gcc-patches/2023-October/631830.html>

contrib/ChangeLog

* mdcompact/mdcompact-testsuite.el: New file.
* mdcompact/mdcompact.el: Likewise.
* mdcompact/tests/1.md: Likewise.
* mdcompact/tests/1.md.out: Likewise.
* mdcompact/tests/2.md: Likewise.
* mdcompact/tests/2.md.out: Likewise.
* mdcompact/tests/3.md: Likewise.
* mdcompact/tests/3.md.out: Likewise.
* mdcompact/tests/4.md: Likewise.
* mdcompact/tests/4.md.out: Likewise.
* mdcompact/tests/5.md: Likewise.
* mdcompact/tests/5.md.out: Likewise.
* mdcompact/tests/6.md: Likewise.
* mdcompact/tests/6.md.out: Likewise.
* mdcompact/tests/7.md: Likewise.
* mdcompact/tests/7.md.out: Likewise.
---
 contrib/mdcompact/mdcompact-testsuite.el |  56 +
 contrib/mdcompact/mdcompact.el   | 296 +++
 contrib/mdcompact/tests/1.md |  36 +++
 contrib/mdcompact/tests/1.md.out |  32 +++
 contrib/mdcompact/tests/2.md |  25 ++
 contrib/mdcompact/tests/2.md.out |  21 ++
 contrib/mdcompact/tests/3.md |  16 ++
 contrib/mdcompact/tests/3.md.out |  17 ++
 contrib/mdcompact/tests/4.md |  17 ++
 contrib/mdcompact/tests/4.md.out |  17 ++
 contrib/mdcompact/tests/5.md |  12 +
 contrib/mdcompact/tests/5.md.out |  11 +
 contrib/mdcompact/tests/6.md |  11 +
 contrib/mdcompact/tests/6.md.out |  11 +
 contrib/mdcompact/tests/7.md |  11 +
 contrib/mdcompact/tests/7.md.out |  11 +
 16 files changed, 600 insertions(+)
 create mode 100644 contrib/mdcompact/mdcompact-testsuite.el
 create mode 100644 contrib/mdcompact/mdcompact.el
 create mode 100644 contrib/mdcompact/tests/1.md
 create mode 100644 contrib/mdcompact/tests/1.md.out
 create mode 100644 contrib/mdcompact/tests/2.md
 create mode 100644 contrib/mdcompact/tests/2.md.out
 create mode 100644 contrib/mdcompact/tests/3.md
 create mode 100644 contrib/mdcompact/tests/3.md.out
 create mode 100644 contrib/mdcompact/tests/4.md
 create mode 100644 contrib/mdcompact/tests/4.md.out
 create mode 100644 contrib/mdcompact/tests/5.md
 create mode 100644 contrib/mdcompact/tests/5.md.out
 create mode 100644 contrib/mdcompact/tests/6.md
 create mode 100644 contrib/mdcompact/tests/6.md.out
 create mode 100644 contrib/mdcompact/tests/7.md
 create mode 100644 contrib/mdcompact/tests/7.md.out

diff --git a/contrib/mdcompact/mdcompact-testsuite.el b/contrib/mdcompact/mdcompact-testsuite.el
new file mode 100644
index 000..494c0b5cd68
--- /dev/null
+++ b/contrib/mdcompact/mdcompact-testsuite.el
@@ -0,0 +1,56 @@
+;;; -*- lexical-binding: t; -*-
+
+;; This file is part of GCC.
+
+;; GCC is free software: you can redistribute it and/or modify it
+;; under the terms of the GNU General Public License as published by
+;; the Free Software Foundation, either version 3 of the License, or
+;; (at your option) any later version.
+
+;; GCC is distributed in the hope that it will be useful, but WITHOUT
+;; ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+;; or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+;; License for more details.
+
+;; You should have received a copy of the GNU General Public License
+;; along with GCC.  If not, see <https://www.gnu.org/licenses/>.
+
+;;; Commentary:
+
+;;; Usage:
+;; $ emacs -batch -l mdcompact.el -l mdcompact-testsuite.el -f ert-run-tests-batch-and-exit
+
+;;; Code:
+
+(require 'mdcompact)
+(require 'ert)
+
+(defconst mdcompat-test-directory (concat (file-name-directory
+  (or load-file-name
+   buffer-file-name))
+ "tests/"))
+
+(defun mdcompat-test-run (f)
+  (with-temp-buffer
+    (insert-file-contents f)
+    (mdcomp-run-at-point)
+    (let ((a (buffer-string))
+          (b (with-temp-buffer
+               (insert-file-contents (concat f ".out"))
+               (buffer-string))))
+      (should (string= a b)))))
+
+(defmacro mdcompat-gen-tests ()
+  `(progn
+ ,@(cl-loop
+  for f in (directory-files mdcompat-test-directory t "md$")
+  collect
+  `(ert-deftest ,(intern (concat "m

Re: [PATCH 3/3] aarch64: Convert aarch64 multi choice patterns to new syntax

2023-10-03 Thread Andrea Corallo
Richard Sandiford  writes:

> Andrea Corallo  writes:
>> Hi all,
>> this patch converts a number of multi choice patterns within the
>> aarch64 backend to the new syntax.
>>
>> The list of the converted patterns is in the Changelog.
>>
>> For completeness here follows the list of multi choice patterns that
>> were rejected for conversion by my parser, they typically have some C
>> as asm output and require some manual intervention:
>> aarch64_simd_vec_set, aarch64_get_lane,
>> aarch64_cmdi, aarch64_cmdi, aarch64_cmtstdi,
>> *aarch64_movv8di, *aarch64_be_mov, *aarch64_be_movci,
>> *aarch64_be_mov, *aarch64_be_movxi, *aarch64_sve_mov_le,
>> *aarch64_sve_mov_be, @aarch64_pred_mov,
>> @aarch64_sve_gather_prefetch,
>> @aarch64_sve_gather_prefetch,
>> *aarch64_sve_gather_prefetch_sxtw,
>> *aarch64_sve_gather_prefetch_uxtw,
>> @aarch64_vec_duplicate_vq_le, *vec_extract_0,
>> *vec_extract_v128, *cmp_and,
>> *fcm_and_combine, @aarch64_sve_ext,
>> @aarch64_sve2_aba, *sibcall_insn, *sibcall_value_insn,
>> *xor_one_cmpl3, *insv_reg_,
>> *aarch64_bfi_,
>> *aarch64_bfidi_subreg_, *aarch64_bfxil,
>> *aarch64_bfxilsi_uxtw,
>> *aarch64_cvtf2_mult,
>> atomic_store.
>>
>> Bootstrapped and regression tested on aarch64-unknown-linux-gnu; I also
>> analysed tmp-mddump.md (from 'make mddump') and could not find
>> effective differences. Okay for trunk?
>
> I'd left this for a few days in case there were any comments on
> the formatting.  Since there weren't:
>
>>
>> Bests
>>
>>   Andrea
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64.md (@ccmp)
>>  (@ccmp_rev, *call_insn, *call_value_insn)
>>  (*mov_aarch64, load_pair_sw_)
>>  (load_pair_dw_)
>>  (store_pair_sw_)
>>  (store_pair_dw_, *extendsidi2_aarch64)
>>  (*zero_extendsidi2_aarch64, *load_pair_zero_extendsidi2_aarch64)
>>  (*extend2_aarch64)
>>  (*zero_extend2_aarch64)
>>  (*extendqihi2_aarch64, *zero_extendqihi2_aarch64)
>>  (*add3_aarch64, *addsi3_aarch64_uxtw, *add3_poly_1)
>>  (add3_compare0, *addsi3_compare0_uxtw)
>>  (*add3_compareC_cconly, add3_compareC)
>>  (*add3_compareV_cconly_imm, add3_compareV_imm)
>>  (*add3nr_compare0, subdi3, subv_imm)
>>  (*cmpv_insn, sub3_compare1_imm, neg2)
>>  (cmp, fcmp, fcmpe, *cmov_insn)
>>  (*cmovsi_insn_uxtw, 3, *si3_uxtw)
>>  (*and3_compare0, *andsi3_compare0_uxtw, one_cmpl2)
>>  (*_one_cmpl3, *and3nr_compare0)
>>  (*aarch64_ashl_sisd_or_int_3)
>>  (*aarch64_lshr_sisd_or_int_3)
>>  (*aarch64_ashr_sisd_or_int_3, *ror3_insn)
>>  (*si3_insn_uxtw, _trunc2)
>>  (2)
>>  (3)
>>  (3)
>>  (*aarch64_3_cssc, copysign3_insn): Update
>>  to new syntax.
>>
>>  * config/aarch64/aarch64-sve2.md (@aarch64_scatter_stnt)
>>  (@aarch64_scatter_stnt_)
>>  (*aarch64_mul_unpredicated_)
>>  (@aarch64_pred_, *cond__2)
>>  (*cond__3, *cond__any)
>>  (*cond__z, @aarch64_pred_)
>>  (*cond__2, *cond__3)
>>  (*cond__any, @aarch64_sve_)
>>  (@aarch64_sve__lane_)
>>  (@aarch64_sve_add_mul_lane_)
>>  (@aarch64_sve_sub_mul_lane_, @aarch64_sve2_xar)
>>  (*aarch64_sve2_bcax, @aarch64_sve2_eor3)
>>  (*aarch64_sve2_nor, *aarch64_sve2_nand)
>>  (*aarch64_sve2_bsl, *aarch64_sve2_nbsl)
>>  (*aarch64_sve2_bsl1n, *aarch64_sve2_bsl2n)
>>  (*aarch64_sve2_sra, @aarch64_sve_add_)
>>  (*aarch64_sve2_aba, @aarch64_sve_add_)
>>  (@aarch64_sve_add__lane_)
>>  (@aarch64_sve_qadd_)
>>  (@aarch64_sve_qadd__lane_)
>>  (@aarch64_sve_sub_)
>>  (@aarch64_sve_sub__lane_)
>>  (@aarch64_sve_qsub_)
>>  (@aarch64_sve_qsub__lane_)
>>  (@aarch64_sve_, @aarch64__lane_)
>>  (@aarch64_pred_)
>>  (@aarch64_pred_, *cond__2)
>>  (*cond__z, @aarch64_sve_)
>>  (@aarch64__lane_, @aarch64_sve_)
>>  (@aarch64__lane_, @aarch64_pred_)
>>  (*cond__any_relaxed)
>>  (*cond__any_strict)
>>  (@aarch64_pred_, *cond_)
>>  (@aarch64_pred_, *cond_)
>>  (*cond__strict): Update to new syntax.
>>
>>  * config/aarch64/aarch64-sve.md (*aarch64_sve_mov_ldr_str)
>>  (*aarch64_sve_mov_no_ldr_str, @aarch64_pred_mov)
>>  (*aarch64_sve_mov, aarch64_wrffr)
>>  (mask_scatter_store)
>>  (*mask_scatter_store_xtw_unpacked)
>>  (*mask_scatter_store_sxtw)
>>   

[PATCH 3/3] aarch64: Convert aarch64 multi choice patterns to new syntax

2023-09-22 Thread Andrea Corallo
[Resending this with the patch compressed as it's more than 400 KB...]

Hi all,
this patch converts a number of multi choice patterns within the
aarch64 backend to the new syntax.

The list of the converted patterns is in the Changelog.

For completeness here follows the list of multi choice patterns that
were rejected for conversion by my parser, they typically have some C
as asm output and require some manual intervention:
aarch64_simd_vec_set, aarch64_get_lane,
aarch64_cmdi, aarch64_cmdi, aarch64_cmtstdi,
*aarch64_movv8di, *aarch64_be_mov, *aarch64_be_movci,
*aarch64_be_mov, *aarch64_be_movxi, *aarch64_sve_mov_le,
*aarch64_sve_mov_be, @aarch64_pred_mov,
@aarch64_sve_gather_prefetch,
@aarch64_sve_gather_prefetch,
*aarch64_sve_gather_prefetch_sxtw,
*aarch64_sve_gather_prefetch_uxtw,
@aarch64_vec_duplicate_vq_le, *vec_extract_0,
*vec_extract_v128, *cmp_and,
*fcm_and_combine, @aarch64_sve_ext,
@aarch64_sve2_aba, *sibcall_insn, *sibcall_value_insn,
*xor_one_cmpl3, *insv_reg_,
*aarch64_bfi_,
*aarch64_bfidi_subreg_, *aarch64_bfxil,
*aarch64_bfxilsi_uxtw,
*aarch64_cvtf2_mult,
atomic_store.

Bootstrapped and regression tested on aarch64-unknown-linux-gnu; I also
analysed tmp-mddump.md (from 'make mddump') and could not find
effective differences. Okay for trunk?

Bests

  Andrea

gcc/ChangeLog:

* config/aarch64/aarch64.md (@ccmp)
(@ccmp_rev, *call_insn, *call_value_insn)
(*mov_aarch64, load_pair_sw_)
(load_pair_dw_)
(store_pair_sw_)
(store_pair_dw_, *extendsidi2_aarch64)
(*zero_extendsidi2_aarch64, *load_pair_zero_extendsidi2_aarch64)
(*extend2_aarch64)
(*zero_extend2_aarch64)
(*extendqihi2_aarch64, *zero_extendqihi2_aarch64)
(*add3_aarch64, *addsi3_aarch64_uxtw, *add3_poly_1)
(add3_compare0, *addsi3_compare0_uxtw)
(*add3_compareC_cconly, add3_compareC)
(*add3_compareV_cconly_imm, add3_compareV_imm)
(*add3nr_compare0, subdi3, subv_imm)
(*cmpv_insn, sub3_compare1_imm, neg2)
(cmp, fcmp, fcmpe, *cmov_insn)
(*cmovsi_insn_uxtw, 3, *si3_uxtw)
(*and3_compare0, *andsi3_compare0_uxtw, one_cmpl2)
(*_one_cmpl3, *and3nr_compare0)
(*aarch64_ashl_sisd_or_int_3)
(*aarch64_lshr_sisd_or_int_3)
(*aarch64_ashr_sisd_or_int_3, *ror3_insn)
(*si3_insn_uxtw, _trunc2)
(2)
(3)
(3)
(*aarch64_3_cssc, copysign3_insn): Update
to new syntax.

* config/aarch64/aarch64-sve2.md (@aarch64_scatter_stnt)
(@aarch64_scatter_stnt_)
(*aarch64_mul_unpredicated_)
(@aarch64_pred_, *cond__2)
(*cond__3, *cond__any)
(*cond__z, @aarch64_pred_)
(*cond__2, *cond__3)
(*cond__any, @aarch64_sve_)
(@aarch64_sve__lane_)
(@aarch64_sve_add_mul_lane_)
(@aarch64_sve_sub_mul_lane_, @aarch64_sve2_xar)
(*aarch64_sve2_bcax, @aarch64_sve2_eor3)
(*aarch64_sve2_nor, *aarch64_sve2_nand)
(*aarch64_sve2_bsl, *aarch64_sve2_nbsl)
(*aarch64_sve2_bsl1n, *aarch64_sve2_bsl2n)
(*aarch64_sve2_sra, @aarch64_sve_add_)
(*aarch64_sve2_aba, @aarch64_sve_add_)
(@aarch64_sve_add__lane_)
(@aarch64_sve_qadd_)
(@aarch64_sve_qadd__lane_)
(@aarch64_sve_sub_)
(@aarch64_sve_sub__lane_)
(@aarch64_sve_qsub_)
(@aarch64_sve_qsub__lane_)
(@aarch64_sve_, @aarch64__lane_)
(@aarch64_pred_)
(@aarch64_pred_, *cond__2)
(*cond__z, @aarch64_sve_)
(@aarch64__lane_, @aarch64_sve_)
(@aarch64__lane_, @aarch64_pred_)
(*cond__any_relaxed)
(*cond__any_strict)
(@aarch64_pred_, *cond_)
(@aarch64_pred_, *cond_)
(*cond__strict): Update to new syntax.

* config/aarch64/aarch64-sve.md (*aarch64_sve_mov_ldr_str)
(*aarch64_sve_mov_no_ldr_str, @aarch64_pred_mov)
(*aarch64_sve_mov, aarch64_wrffr)
(mask_scatter_store)
(*mask_scatter_store_xtw_unpacked)
(*mask_scatter_store_sxtw)
(*mask_scatter_store_uxtw)
(@aarch64_scatter_store_trunc)
(@aarch64_scatter_store_trunc)
(*aarch64_scatter_store_trunc_sxtw)
(*aarch64_scatter_store_trunc_uxtw)
(*vec_duplicate_reg, vec_shl_insert_)
(vec_series, @extract__)
(@aarch64_pred_, *cond__2)
(*cond__any, @aarch64_pred_)
(@aarch64_sve_revbhw_)
(@cond_)
(*2)
(@aarch64_pred_sxt)
(@aarch64_cond_sxt)
(*cond_uxt_2, *cond_uxt_any, *cnot)
(*cond_cnot_2, *cond_cnot_any)
(@aarch64_pred_, *cond__2_relaxed)
(*cond__2_strict, *cond__any_relaxed)
(*cond__any_strict, @aarch64_pred_)
(*cond__2, *cond__3)
(*cond__any, add3, sub3)
(@aarch64_pred_abd, *aarch64_cond_abd_2)
(*aarch64_cond_abd_3, *aarch64_cond_abd_any)
(@aarch64_sve_, @aarch64_pred_)
(*cond__2, *cond__z)

[PATCH 1/3] recog: Improve parser for pattern new compact syntax

2023-09-22 Thread Andrea Corallo
From: Richard Sandiford 

Hi all,

this is to add support to the new compact pattern syntax for the case
where the constraints do appear unsorted like:

(define_insn "*si3_insn_uxtw"
  [(set (match_operand:DI 0 "register_operand")
        (zero_extend:DI (SHIFT_no_rotate:SI
         (match_operand:SI 1 "register_operand")
         (match_operand:QI 2 "aarch64_reg_or_shift_imm_si"))))]
  ""
  {@ [cons: =0, 2,   1]
 [  r,  Uss, r] \\t%w0, %w1, %2
 [  r,  r,   r] \\t%w0, %w1, %w2
  }
  [(set_attr "type" "bfx,shift_reg")]
)

Best Regards

  Andrea

gcc/Changelog

2023-09-20  Richard Sandiford  

* gensupport.cc (convert_syntax): Updated to support unordered
constraints in compact syntax.
---
 gcc/gensupport.cc | 32 
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/gcc/gensupport.cc b/gcc/gensupport.cc
index f7164b3214d..7e125e3d8db 100644
--- a/gcc/gensupport.cc
+++ b/gcc/gensupport.cc
@@ -896,19 +896,6 @@ convert_syntax (rtx x, file_location loc)
 
   parse_section_layout (loc, &templ, "cons:", tconvec, true);
 
-  /* Check for any duplicate cons entries and sort based on i.  */
-  for (auto e : tconvec)
-{
-  unsigned idx = e.idx;
-  if (idx >= convec.size ())
-   convec.resize (idx + 1);
-
-  if (convec[idx].idx >= 0)
-   fatal_at (loc, "duplicate cons number found: %d", idx);
-  convec[idx] = e;
-}
-  tconvec.clear ();
-
   if (*templ != ']')
 {
   if (*templ == ';')
@@ -951,13 +938,13 @@ convert_syntax (rtx x, file_location loc)
  new_templ += '\n';
  new_templ.append (buffer);
  /* Parse the constraint list, then the attribute list.  */
- if (convec.size () > 0)
-   parse_section (&templ, convec.size (), alt_no, convec, loc,
+ if (tconvec.size () > 0)
+   parse_section (&templ, tconvec.size (), alt_no, tconvec, loc,
   "constraint");
 
  if (attrvec.size () > 0)
{
- if (convec.size () > 0 && !expect_char (&templ, ';'))
+ if (tconvec.size () > 0 && !expect_char (&templ, ';'))
fatal_at (loc, "expected `;' to separate constraints "
   "and attributes in alternative %d", alt_no);
 
@@ -1027,6 +1014,19 @@ convert_syntax (rtx x, file_location loc)
   ++alt_no;
 }
 
+  /* Check for any duplicate cons entries and sort based on i.  */
+  for (auto e : tconvec)
+{
+  unsigned idx = e.idx;
+  if (idx >= convec.size ())
+   convec.resize (idx + 1);
+
+  if (convec[idx].idx >= 0)
+   fatal_at (loc, "duplicate cons number found: %d", idx);
+  convec[idx] = e;
+}
+  tconvec.clear ();
+
   /* Write the constraints and attributes into their proper places.  */
   if (convec.size () > 0)
 add_constraints (x, loc, convec);
-- 
2.25.1
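
For readers following the hunks above: the step that gets moved takes the
"cons:" entries in the order they were written (tconvec) and stores each
one at the slot given by its operand number, reporting a duplicate if the
slot is already taken. Running it after the alternatives have been parsed
lets every alternative row be read in the written column order. Below is
a minimal, self-contained sketch of that placement step. The names
cons_entry and sort_by_operand are hypothetical stand-ins, and the sketch
throws an exception where gensupport.cc calls fatal_at.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-operand record kept by the parser.
struct cons_entry {
  int idx;          // operand number; -1 marks an unused slot
  std::string con;  // constraint text for this operand

  cons_entry() : idx(-1) {}
  cons_entry(int i, std::string c) : idx(i), con(std::move(c)) {}
};

// Place entries given in textual order into a vector indexed by operand
// number, rejecting a second entry for the same operand.
std::vector<cons_entry> sort_by_operand(const std::vector<cons_entry> &textual)
{
  std::vector<cons_entry> sorted;
  for (const cons_entry &e : textual) {
    assert(e.idx >= 0);
    const std::size_t idx = static_cast<std::size_t>(e.idx);
    if (idx >= sorted.size())
      sorted.resize(idx + 1);  // new slots start out unused (idx == -1)
    if (sorted[idx].idx >= 0)
      throw std::runtime_error("duplicate cons number found: "
                               + std::to_string(e.idx));
    sorted[idx] = e;
  }
  return sorted;
}
```

For the header "[cons: =0, 2, 1]" in the example pattern, the textual
order 0, 2, 1 ends up stored as operands 0, 1, 2.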



[PATCH 2/3] recog: Support space in "[ cons"

2023-09-22 Thread Andrea Corallo
Hi all,

this is to allow for spaces before "cons:" in the definitions of
patterns using the new compact syntax, ex:

(define_insn "aarch64_simd_dup"
  [(set (match_operand:VDQ_I 0 "register_operand")
        (vec_duplicate:VDQ_I
          (match_operand: 1 "register_operand")))]
  "TARGET_SIMD"
  {@ [ cons: =0 , 1  ; attrs: type  ]
     [ w        , w  ; neon_dup     ] dup\t%0., %1.[0]
     [ w        , ?r ; neon_from_gp ] dup\t%0., %1
  }
)

gcc/Changelog

2023-09-20  Andrea Corallo  

* gensupport.cc (convert_syntax): Skip spaces before "cons:"
in new compact pattern syntax.
---
 gcc/gensupport.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/gcc/gensupport.cc b/gcc/gensupport.cc
index 7e125e3d8db..dd920d673b4 100644
--- a/gcc/gensupport.cc
+++ b/gcc/gensupport.cc
@@ -894,6 +894,8 @@ convert_syntax (rtx x, file_location loc)
   if (!expect_char (&templ, '['))
 fatal_at (loc, "expecing `[' to begin section list");
 
+  skip_spaces (&templ);
+
   parse_section_layout (loc, &templ, "cons:", tconvec, true);
 
   if (*templ != ']')
-- 
2.25.1
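
The effect of the two added lines can be pictured with a small
stand-alone sketch: after the opening '[' any blanks are consumed, and
only then is the "cons:" keyword matched. The helpers skip_blanks and
starts_cons_section below are hypothetical and written for illustration
only; they are not the gensupport.cc functions.

```cpp
#include <cassert>
#include <cctype>
#include <cstring>

// Advance *templ past any whitespace characters.
void skip_blanks(const char **templ)
{
  while (std::isspace(static_cast<unsigned char>(**templ)))
    ++*templ;
}

// Hypothetical helper: true if the text opens a section list whose first
// section is "cons:", with optional blanks between '[' and the keyword.
bool starts_cons_section(const char *templ)
{
  assert(templ != nullptr);
  if (*templ != '[')
    return false;
  ++templ;
  skip_blanks(&templ);
  return std::strncmp(templ, "cons:", 5) == 0;
}
```

Both "[cons: =0, 1]" and "[ cons: =0 , 1  ; attrs: type  ]" are accepted
by this sketch, which mirrors the layout the patch is meant to allow.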



Re: [PATCH 02/10] arm: Fix vstrwq* backend + testsuite

2023-05-02 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

> Hi Andrea,
>
> Minor comments below:
>
> On 4/28/23 13:29, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> this patch fixes the vstrwq* MVE instrinsics failing to emit the
>> correct sequence of instruction due to a missing predicates. Also the
> nit: you have a typo, should be "predicate"

Ack thanks.

>> immediate range is fixed to be multiples of 2 up between [-252, 252].
>
> Out of curiosity, which tests were affected by this error in the
> immediate range?

None I'd say, so far we have no extensive tests checking for immediate
range in the testsuite.

BR

  Andrea


Re: [PATCH 02/10] arm: Fix vstrwq* backend + testsuite

2023-05-02 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

> Hi Andrea,
>
> Minor comments below:
>
> On 4/28/23 13:29, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> this patch fixes the vstrwq* MVE instrinsics failing to emit the
>> correct sequence of instruction due to a missing predicates. Also the
> nit: you have a typo, should be "predicate"
>
>> immediate range is fixed to be multiples of 2 up between [-252, 252].
>
> Out of curiosity, which tests were affected by this error in the
> immediate range?

Hi Christophe,

no special reason, just because the test is autogenerated and we use
this same pattern for all the intrinsics (the vast majority of which do
return a non void type).  The test indeed compiles fine so no problem
there.

BR

  Andrea


[PATCH 10/10] arm testsuite: Shifts and get_FPSCR ACLE optimisation fixes

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

These newly updated tests were rewritten by Andrea. Some of them
needed further manual fixing as follows:

* The #shift immediate in the check-function-bodies patterns did not match
  the value actually emitted
* Some shifts getting optimised to mov immediates, e.g.
  `uqshll (1, 1);` -> movs r0, #2; movs r1, #0
* The ACLE was specifying sub-optimal code: lsr+and instead of ubfx. In
  this case the test rewritten from the ACLE had the lsr+and pattern,
  but the compiler was able to optimise to ubfx. Hence I've changed the
  test to now match on ubfx.
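
As a side note on the last point, a shift right followed by a mask and an
unsigned bit-field extract compute the same value, which is why the
compiler is free to pick ubfx. A minimal sketch in C++ follows; the two
function names are made up for this note, and both assume
1 <= width <= 32 and lsb + width <= 32.

```cpp
#include <cassert>
#include <cstdint>

// lsr + and: shift the field down to bit 0, then mask off the rest.
std::uint32_t extract_lsr_and(std::uint32_t x, unsigned lsb, unsigned width)
{
  assert(width >= 1 && width <= 32 && lsb + width <= 32);
  const std::uint32_t mask =
      width == 32 ? 0xFFFFFFFFu : (std::uint32_t{1} << width) - 1u;
  return (x >> lsb) & mask;
}

// ubfx-style extract: push the field to the top of the word, then shift
// it back down so that zeros fill the upper bits.
std::uint32_t extract_ubfx(std::uint32_t x, unsigned lsb, unsigned width)
{
  assert(width >= 1 && width <= 32 && lsb + width <= 32);
  return (x << (32u - lsb - width)) >> (32u - width);
}
```

For a one-bit field at bit 29, both functions return 1 for 0x20000000
and 0 for 0xDFFFFFFF.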

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/srshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/srshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/uqshl.c: Update shift value and mov imm.
* gcc.target/arm/mve/intrinsics/uqshll.c: Update shift value and mov 
imm.
* gcc.target/arm/mve/intrinsics/urshr.c: Update shift value.
* gcc.target/arm/mve/intrinsics/urshrl.c: Update shift value.
* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: Update to ubfx.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: Update to ubfx.
---
 gcc/testsuite/gcc.target/arm/mve/intrinsics/srshr.c   | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/srshrl.c  | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/uqshl.c   | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/uqshll.c  | 5 +++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/urshr.c   | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/urshrl.c  | 4 ++--
 .../gcc.target/arm/mve/intrinsics/vadciq_m_s32.c  | 8 ++--
 .../gcc.target/arm/mve/intrinsics/vadciq_m_u32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_s32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadciq_u32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_s32.c | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_m_u32.c | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_s32.c   | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vadcq_u32.c   | 8 ++--
 .../gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c  | 8 ++--
 .../gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_s32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbciq_u32.c  | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_s32.c   | 8 ++--
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vsbcq_u32.c   | 8 ++--
 22 files changed, 43 insertions(+), 106 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshr.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshr.c
index 94e3f42fd33..734375d58c0 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshr.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshr.c
@@ -12,7 +12,7 @@ extern "C" {
 /*
 **foo:
 ** ...
-** srshr   (?:ip|fp|r[0-9]+), #shift(?:@.*|)
+** srshr   (?:ip|fp|r[0-9]+), #1(?:@.*|)
 ** ...
 */
 int32_t
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshrl.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshrl.c
index 65f28ccbfde..a91943c38a0 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshrl.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/srshrl.c
@@ -12,7 +12,7 @@ extern "C" {
 /*
 **foo:
 ** ...
-** srshrl  (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #shift(?: @.*|)
+** srshrl  (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #1(?: @.*|)
 ** ...
 */
 int64_t
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/uqshl.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/uqshl.c
index b23c9d97ba6..58aa7a61e42 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/uqshl.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsi

[PATCH 02/10] arm: Fix vstrwq* backend + testsuite

2023-04-28 Thread Andrea Corallo via Gcc-patches
Hi all,

this patch fixes the vstrwq* MVE intrinsics failing to emit the
correct sequence of instructions due to a missing predicate. Also the
immediate range is fixed to be multiples of 2 within [-252, 252].

Best Regards

  Andrea

gcc/ChangeLog:

* config/arm/constraints.md (mve_vldrd_immediate): Move it to
predicates.md.
(Ri): Move constraint definition from predicates.md.
(Rl): Define new constraint.
* config/arm/mve.md (mve_vstrwq_scatter_base_wb_p_v4si): Add
missing constraint.
(mve_vstrwq_scatter_base_wb_p_fv4sf): Add missing Up constraint
for op 1, use mve_vstrw_immediate predicate and Rl constraint for
op 2. Fix asm output spacing.
(mve_vstrdq_scatter_base_wb_p_v2di): Add missing constraint.
* config/arm/predicates.md (Ri): Move constraint to constraints.md.
(mve_vldrd_immediate): Move it from
constraints.md.
(mve_vstrw_immediate): New predicate.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vstrwq_f32.c: Use
check-function-bodies instead of scan-assembler checks.  Use
extern "C" for C++ testing.
* gcc.target/arm/mve/intrinsics/vstrwq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_f32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_s32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_p_u32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_base_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_offset_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_f32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_f32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_s32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_p_u32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_s32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_scatter_shifted_offset_u32.c: 
Likewise.
* gcc.target/arm/mve/intrinsics/vstrwq_u32.c: Likewise.
---
 gcc/config/arm/constraints.md | 20 --
 gcc/config/arm/mve.md | 10 ++---
 gcc/config/arm/predicates.md  | 14 +++
 .../arm/mve/intrinsics/vstrwq_f32.c   | 32 ---
 .../arm/mve/intrinsics/vstrwq_p_f32.c | 40 ---
 .../arm/mve/intrinsics/vstrwq_p_s32.c | 40 ---
 .../arm/mve/intrinsics/vstrwq_p_u32.c | 40 ---
 .../arm/mve/intrinsics/vstrwq_s32.c   | 32 ---
 .../mve/intrinsics/vstrwq_scatter_base_f32.c  | 28 +++--
 .../intrinsics/vstrwq_scatter_base_p_f32.c| 36 +++--
 .../intrinsics/vstrwq_scatter_base_p_s32.c| 36 +++--
 .../intrinsics/vstrwq_scatter_base_p_u32.c| 36 +++--
 .../mve/intrinsics/vstrwq_scatter_base_s32.c  | 28 +++--
 .../mve/intrinsics/vstrwq_scatter_base_u32.c  | 28 +++--
 .../intrinsics/vstrwq_scatter_base_wb_f32.c   | 32 ---
 .../intrinsics/vstrwq_scatter_base_wb_p_f32.c | 40 ---
 .../intrinsics/vstrwq_scatter_base_wb_p_s32.c | 40 ---
 .../intrinsics/vstrwq_scatter_base_wb_p_u32.c | 40 ---
 .../intrinsics/vstrwq_scatter_base_wb_s32.c   | 32 ---
 .../intrinsics/vstrwq_scatter_base_wb_u32.c   | 32 ---
 .../intrinsics/vstrwq_scatter_offset_f32.c| 32 ---
 .../intrinsics/vstrwq_scatter_offset_p_f32.c  | 40 ---
 .../intrinsics/vstrwq_

[PATCH 04/10] arm: Stop vadcq, vsbcq intrinsics from overwriting the FPSCR NZ flags

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

Hi all,

We noticed that calls to the vadcq and vsbcq intrinsics, both of
which use __builtin_arm_set_fpscr_nzcvqc to set the Carry flag in
the FPSCR, would produce the following code:

```
< r2 is the *carry input >
vmrs r3, FPSCR_nzcvqc
bic r3, r3, #536870912
orr r3, r3, r2, lsl #29
vmsr FPSCR_nzcvqc, r3
```

when the MVE ACLE instead gives a different instruction sequence of:
```
< Rt is the *carry input >
VMRS Rs,FPSCR_nzcvqc
BFI Rs,Rt,#29,#1
VMSR FPSCR_nzcvqc,Rs
```

the bic + orr pair is slower and it's also wrong, because, if the
*carry input is greater than 1, then we risk overwriting the top two
bits of the FPSCR register (the N and Z flags).

This turned out to be a problem in the header file and the solution was
to simply add a `& 0x1u` to the `*carry` input: then the compiler knows
that we only care about the lowest bit and can optimise to a BFI.
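
To make the failure mode concrete, here is a minimal host-side sketch of the two
expressions, assuming the Carry flag sits at bit 29 with Z at bit 30 and N at
bit 31. The helper names (`set_carry_unmasked`, `set_carry_masked`,
`check_carry_helpers`) are made up for illustration and plain `uint32_t` values
stand in for the FPSCR builtins; this is not the code in arm_mve.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: bit 29 = Carry, bit 30 = Z, bit 31 = N. */
#define CARRY_BIT  29
#define CARRY_MASK (UINT32_C(1) << CARRY_BIT)   /* 0x20000000 == 536870912 */

/* Old behaviour (hypothetical helper): the carry input is shifted in
   unmasked, so any bits above bit 0 land on the Z and N positions.  */
static uint32_t set_carry_unmasked(uint32_t fpscr, uint32_t carry)
{
  return (fpscr & ~CARRY_MASK) | (carry << CARRY_BIT);
}

/* Fixed behaviour (hypothetical helper): only the lowest bit of the
   carry input can reach the register value.  */
static uint32_t set_carry_masked(uint32_t fpscr, uint32_t carry)
{
  return (fpscr & ~CARRY_MASK) | ((carry & UINT32_C(1)) << CARRY_BIT);
}

/* Self-check: a carry input of 3 leaks into bit 30 without the mask.  */
static void check_carry_helpers(void)
{
  assert(set_carry_unmasked(0, 3) == UINT32_C(0x60000000));
  assert(set_carry_masked(0, 3) == UINT32_C(0x20000000));
  /* For well-formed inputs (0 or 1) both variants agree.  */
  assert(set_carry_unmasked(0, 1) == set_carry_masked(0, 1));
}
```

With a carry input of 3 the unmasked variant also sets bit 30, which is exactly
the clobbering described above; the masked variant only ever touches bit 29.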

Ok for trunk?

Thanks,
Stam Markianos-Wright

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vadcq_s32): Fix arithmetic.
(__arm_vadcq_u32): Likewise.
(__arm_vadcq_m_s32): Likewise.
(__arm_vadcq_m_u32): Likewise.
(__arm_vsbcq_s32): Likewise.
(__arm_vsbcq_u32): Likewise.
(__arm_vsbcq_m_s32): Likewise.
(__arm_vsbcq_m_u32): Likewise.
---
 gcc/config/arm/arm_mve.h | 16 
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 1262d668121..8778216304b 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -16055,7 +16055,7 @@ __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   int32x4_t __res = __builtin_mve_vadcq_sv4si (__a, __b);
   *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
   return __res;
@@ -16065,7 +16065,7 @@ __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   uint32x4_t __res = __builtin_mve_vadcq_uv4si (__a, __b);
   *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
   return __res;
@@ -16075,7 +16075,7 @@ __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadcq_m_s32 (int32x4_t __inactive, int32x4_t __a, int32x4_t __b, 
unsigned * __carry, mve_pred16_t __p)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   int32x4_t __res = __builtin_mve_vadcq_m_sv4si (__inactive, __a, __b, __p);
   *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
   return __res;
@@ -16085,7 +16085,7 @@ __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vadcq_m_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32x4_t __b, 
unsigned * __carry, mve_pred16_t __p)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   uint32x4_t __res =  __builtin_mve_vadcq_m_uv4si (__inactive, __a, __b, __p);
   *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
   return __res;
@@ -16131,7 +16131,7 @@ __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vsbcq_s32 (int32x4_t __a, int32x4_t __b, unsigned * __carry)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   int32x4_t __res = __builtin_mve_vsbcq_sv4si (__a, __b);
   *__carry = (__builtin_arm_get_fpscr_nzcvqc () >> 29) & 0x1u;
   return __res;
@@ -16141,7 +16141,7 @@ __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
 __arm_vsbcq_u32 (uint32x4_t __a, uint32x4_t __b, unsigned * __carry)
 {
-  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | (*__carry << 29));
+  __builtin_arm_set_fpscr_nzcvqc((__builtin_arm_get_fpscr_nzcvqc () & 
~0x20000000u) | ((*__carry & 0x1u) << 29));
   uint32x4_t __res = 

[PATCH 08/10] arm testsuite: Remove redundant tests

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

Following Andrea's overhaul of the MVE testsuite, these tests are now
redundant, as equivalent checks have been added to each intrinsic's
.c test.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/mve_fp_vaddq_n.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vaddq_m.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vaddq_n.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_m_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_m_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_m_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_x_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_x_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vddupq_x_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vdwdupq_x_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vdwdupq_x_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vdwdupq_x_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_m_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_m_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_m_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_x_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_x_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vidupq_x_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_viwdupq_x_n_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_viwdupq_x_n_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_viwdupq_x_n_u8.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_offset_s64.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_offset_u64.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_offset_z_s64.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_offset_z_u64.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_shifted_offset_s64.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_shifted_offset_u64.c: 
Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_shifted_offset_z_s64.c: Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrdq_gather_shifted_offset_z_u64.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_f16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_s16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_s32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_u16.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_z_f16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_z_s16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_z_s32.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_z_u16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_offset_z_u32.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_f16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_s16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_s32.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_u16.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_u32.c: 
Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_z_f16.c: Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_z_s16.c: Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_z_s32.c: Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_z_u16.c: Removed.
* 
gcc.target/arm/mve/intrinsics/mve_vldrhq_gather_shifted_offset_z_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_f32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_s32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_u32.c: Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_z_f32.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_z_s32.c: 
Removed.
* gcc.target/arm/mve/intrinsics/mve_vldrwq_gather_offset_z

[PATCH 09/10] arm testsuite: XFAIL or relax registers in some tests

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

Hi all,

This is a simple testsuite tidy-up patch, addressing two types of errors:

* The vcmp vector-scalar tests failing due to the compiler's preference
for vector-vector comparisons over vector-scalar comparisons. This is
due to the lack of a cost model for MVE and the compiler not knowing that
the RTL vec_duplicate is free in those instructions. For now, we simply
XFAIL these checks.
* The tests for pr108177 had strict usage of q0 and r0 registers,
meaning that they would FAIL with -mfloat-abi=softfp. The register checks
have now been relaxed.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/srshr.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/srshrl.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/uqshl.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/uqshll.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/urshr.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/urshrl.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadciq_m_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadciq_m_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadciq_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadciq_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadcq_m_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadcq_m_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadcq_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vadcq_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbciq_m_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbciq_m_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbciq_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbciq_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbcq_m_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbcq_m_u32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbcq_s32.c: XFAIL check.
* gcc.target/arm/mve/intrinsics/vsbcq_u32.c: XFAIL check.
* gcc.target/arm/mve/pr108177-1.c: Relax registers.
* gcc.target/arm/mve/pr108177-10.c: Relax registers.
* gcc.target/arm/mve/pr108177-11.c: Relax registers.
* gcc.target/arm/mve/pr108177-12.c: Relax registers.
* gcc.target/arm/mve/pr108177-13.c: Relax registers.
* gcc.target/arm/mve/pr108177-14.c: Relax registers.
* gcc.target/arm/mve/pr108177-2.c: Relax registers.
* gcc.target/arm/mve/pr108177-3.c: Relax registers.
* gcc.target/arm/mve/pr108177-4.c: Relax registers.
* gcc.target/arm/mve/pr108177-5.c: Relax registers.
* gcc.target/arm/mve/pr108177-6.c: Relax registers.
* gcc.target/arm/mve/pr108177-7.c: Relax registers.
* gcc.target/arm/mve/pr108177-8.c: Relax registers.
* gcc.target/arm/mve/pr108177-9.c: Relax registers.
---
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpcsq_n_u16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpcsq_n_u32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpcsq_n_u8.c  | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_u16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_u32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpeqq_n_u8.c  | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgeq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpgtq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmphiq_n_u16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmphiq_n_u32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmphiq_n_u8.c  | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpleq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpltq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_f32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_u16.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_u32.c | 2 +-
 gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmpneq_n_u8.c  | 2 +-
 gcc/testsuite/gcc.target/arm/mve/pr108177-1.c   | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/pr108177-10.c  | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/pr108177-11.c  | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/pr108177-12.c  | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/pr108177-13.c  | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/pr108177-14.c  | 4 ++--
 gcc/testsuite/gcc.target/arm/mve/p

[PATCH 05/10] arm: Add vorrq_n overloading into vorrq _Generic

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

We found this as part of the wider testsuite updates.

The applicable tests are authored by Andrea earlier in this patch series.
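
For readers less familiar with how the overloaded names resolve, the idea can be
pictured with a stripped-down C11 `_Generic` sketch. Everything in it is a
hypothetical stand-in (`vec4_u32`, `my_orr`, `my_orr_u32`, `my_orr_n_u32`); it
dispatches on the second argument only and does not use the
`__ARM_mve_typeid` / `__ARM_mve_coerce` machinery from arm_mve.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 4-lane vector stand-in; not the real uint32x4_t.  */
typedef struct { uint32_t lane[4]; } vec4_u32;

/* Vector-vector variant: lane-wise OR of two vectors.  */
static vec4_u32 my_orr_u32(vec4_u32 a, vec4_u32 b)
{
  for (int i = 0; i < 4; i++)
    a.lane[i] |= b.lane[i];
  return a;
}

/* Vector-scalar "_n" variant: OR every lane with the same scalar.  */
static vec4_u32 my_orr_n_u32(vec4_u32 a, int imm)
{
  for (int i = 0; i < 4; i++)
    a.lane[i] |= (uint32_t) imm;
  return a;
}

/* Overloaded entry point: pick the variant from the type of the second
   argument.  Without the `int` association, a call such as
   my_orr (v, 0x100) would not compile, which is the kind of gap the
   patch closes for the real macro.  */
#define my_orr(a, b)                     \
  _Generic((b),                          \
           vec4_u32: my_orr_u32,         \
           int:      my_orr_n_u32)((a), (b))
```

In this sketch `my_orr (v, w)` resolves to the vector-vector helper and
`my_orr (v, 0x100)` to the `_n` helper; any other second-argument type is
rejected at compile time because there is no `default` association.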

Ok for trunk?

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vorrq): Add _n variant.
---
 gcc/config/arm/arm_mve.h | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 8778216304b..3d386f320c3 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -35852,6 +35852,10 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: 
__arm_vorrq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, 
uint8x16_t)), \
   int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: 
__arm_vorrq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, 
uint16x8_t)), \
   int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vorrq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, 
uint32x4_t)), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vorrq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vorrq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vorrq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vorrq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3(p1, int)), \
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: 
__arm_vorrq_f16 (__ARM_mve_coerce(__p0, float16x8_t), __ARM_mve_coerce(__p1, 
float16x8_t)), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: 
__arm_vorrq_f32 (__ARM_mve_coerce(__p0, float32x4_t), __ARM_mve_coerce(__p1, 
float32x4_t)));})
 
@@ -38637,7 +38641,11 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vorrq_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \
   int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: 
__arm_vorrq_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, 
uint8x16_t)), \
   int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: 
__arm_vorrq_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, 
uint16x8_t)), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vorrq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, 
uint32x4_t)));})
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vorrq_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, 
uint32x4_t)), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vorrq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vorrq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vorrq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3(p1, int)), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vorrq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3(p1, int)));})
 
 #define __arm_vornq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
-- 
2.25.1



[PATCH 06/10] arm: Fix overloading of MVE scalar constant parameters on vbicq, vmvnq_m

2023-04-28 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

We found this as part of the wider testsuite updates.

The applicable tests are authored by Andrea earlier in this patch series.

Ok for trunk?

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vbicq): Change coerce on
scalar constant.
(__arm_vmvnq_m): Likewise.
---
 gcc/config/arm/arm_mve.h | 24 
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 3d386f320c3..3a1cffb4063 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -35906,10 +35906,10 @@ extern void *__ARM_undef;
 #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce1 (__p1, int)), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3 (p1, int)), \
   int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 
(__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \
   int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \
   int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \
@@ -38825,10 +38825,10 @@ extern void *__ARM_undef;
 #define __arm_vbicq(p0,p1) ({ __typeof(p0) __p0 = (p0); \
   __typeof(p1) __p1 = (p1); \
   _Generic( (int (*)[__ARM_mve_typeid(__p0)][__ARM_mve_typeid(__p1)])0, \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce1 (__p1, int)), \
-  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce1 (__p1, int)), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u16 
(__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce3 (p1, int)), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: __arm_vbicq_n_u32 
(__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce3 (p1, int)), \
   int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: __arm_vbicq_s8 
(__ARM_mve_coerce(__p0, int8x16_t), __ARM_mve_coerce(__p1, int8x16_t)), \
   int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: __arm_vbicq_s16 
(__ARM_mve_coerce(__p0, int16x8_t), __ARM_mve_coerce(__p1, int16x8_t)), \
   int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: __arm_vbicq_s32 
(__ARM_mve_coerce(__p0, int32x4_t), __ARM_mve_coerce(__p1, int32x4_t)), \
@@ -40962,10 +40962,10 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: 
__arm_vmvnq_m_u8 (__ARM_mve_coerce(__p0, uint8x16_t), __ARM_mve_coerce(__p1, 
uint8x16_t), p2), \
   int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: 
__arm_vmvnq_m_u16 (__ARM_mve_coerce(__p0, uint16x8_t), __ARM_mve_coerce(__p1, 
uint16x8_t), p2), \
   int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vmvnq_m_u32 (__ARM_mve_coerce(__p0, uint32x4_t), __ARM_mve_coerce(__p1, 
uint32x4_t), p2), \
-  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vmvnq_m_n_s16 
(__ARM_mve_coerce(__p0, int16x8_t

Re: [PATCH] [PR104882] [arm] require mve hw for mve run test

2023-02-20 Thread Andrea Corallo via Gcc-patches
Alexandre Oliva via Gcc-patches  writes:

> The pr104882.c test is an execution test, but arm_v8_1m_mve_ok only
> tests for compile-time support.  Add a requirement for mve hardware.
>
> Regstrapped on x86_64-linux-gnu.
> Tested on arm-vxworks7 (gcc-12) and arm-eabi (trunk).  Ok to install?
>
> for  gcc/testsuite/ChangeLog
>
>   PR target/104882
>   * gcc.target/arm/simd/pr104882.c: Require mve hardware.
> ---
>  gcc/testsuite/gcc.target/arm/simd/pr104882.c |1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/gcc/testsuite/gcc.target/arm/simd/pr104882.c 
> b/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> index ae9709af42f22..1ea7a14836f54 100644
> --- a/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> +++ b/gcc/testsuite/gcc.target/arm/simd/pr104882.c
> @@ -1,4 +1,5 @@
>  /* { dg-do run } */
> +/* { dg-require-effective-target arm_mve_hw } */
>  /* { dg-require-effective-target arm_v8_1m_mve_ok } */
>  /* { dg-add-options arm_v8_1m_mve } */
>  /* { dg-additional-options "-O2" } */

Hi Alexandre,

no approver here but LGTM, thanks.

  Andrea


Re: [PATCH] [arm] complete vmsr/vmrs blank and case adjustments

2023-02-20 Thread Andrea Corallo via Gcc-patches
Alexandre Oliva  writes:

> Back in September last year, some of the vmsr and vmrs patterns had an
> extraneous blank removed, and the case of register names lowered, but
> another instance remained, and so did a few testcases.

[...]

Hi Alexandre,

I'm not an approver but LGTM, thanks for fixing this.

  Andrea


Re: [PATCH] arm: Implement arm Function target attribute 'branch-protection'

2023-02-08 Thread Andrea Corallo via Gcc-patches
Andrea Corallo  writes:

> gcc/
>
>   * config/arm/arm.cc (arm_valid_target_attribute_rec): Add ARM function
>   attribute 'branch-protection' and parse its options.
>   * doc/extend.texi: Document ARM Function attribute 'branch-protection'.
>
> gcc/testsuite/
>
>   * gcc.target/arm/acle/pacbti-m-predef-13.c: New test.
>
> Co-Authored-By: Tejas Belagod  
> ---
>  gcc/config/arm/arm.cc | 16 
>  gcc/doc/extend.texi   |  7 
>  .../gcc.target/arm/acle/pacbti-m-predef-13.c  | 41 +++
>  3 files changed, 64 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c
>
> diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
> index efc48349dd3..add33090f18 100644
> --- a/gcc/config/arm/arm.cc
> +++ b/gcc/config/arm/arm.cc
> @@ -33568,6 +33568,22 @@ arm_valid_target_attribute_rec (tree args, struct 
> gcc_options *opts)
>  
> opts->x_arm_arch_string = xstrndup (arch, strlen (arch));
>   }
> +  else if (startswith (q, "branch-protection="))
> + {
> +   char *bp_str = q + strlen ("branch-protection=");
> +
> +   opts->x_arm_branch_protection_string
> + = xstrndup (bp_str, strlen (bp_str));
> +
> +   /* Capture values from target attribute.  */
> +   aarch_validate_mbranch_protection
> + (opts->x_arm_branch_protection_string);
> +
> +   /* Init function target attr values.  */
> +   opts->x_aarch_ra_sign_scope = aarch_ra_sign_scope;
> +   opts->x_aarch_enable_bti = aarch_enable_bti;
> +
> + }
>else if (q[0] == '+')
>   {
> opts->x_arm_arch_string
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 4a89a3eae7c..23ee43919dd 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -4492,6 +4492,13 @@ Enable or disable calls to out-of-line helpers to 
> implement atomic operations.
>  This corresponds to the behavior of the command line options
>  @option{-moutline-atomics} and @option{-mno-outline-atomics}.
>  
> +@item branch-protection=
> +@cindex @code{branch-protection=} function attribute, arm
> +Select the function scope on which branch protection will be applied.
> +The behavior and permissible arguments are the same as for the
> +command-line option @option{-mbranch-protection=}.  The default value
> +is @code{none}.
> +
>  @end table
>  
>  The above target attributes can be specified as follows:
> diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c 
> b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c
> new file mode 100644
> index 000..b6d2df53072
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c
> @@ -0,0 +1,41 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target mbranch_protection_ok } */
> +/* { dg-options "-march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf 
> -mfloat-abi=hard --save-temps" } */
> +/* { dg-final { check-function-bodies "**" "" } } */
> +
> +#if defined (__ARM_FEATURE_BTI_DEFAULT)
> +#error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be undefined."
> +#endif
> +
> +#if !defined (__ARM_FEATURE_PAC_DEFAULT)
> +#error "Feature test macro __ARM_FEATURE_PAC_DEFAULT should be defined."
> +#endif
> +
> +/*
> +**foo:
> +**   bti
> +**   ...
> +*/
> +__attribute__((target("branch-protection=pac-ret+bti"), noinline))
> +int foo ()
> +{
> +  return 3;
> +}
> +
> +/*
> +**main:
> +**   pac ip, lr, sp
> +**   ...
> +**   aut ip, lr, sp
> +**   bx  lr
> +*/
> +int
> +main()
> +{
> +  return 1 + foo ();
> +}
> +
> +/* { dg-final { scan-assembler "\.eabi_attribute 50, 1" } } */
> +/* { dg-final { scan-assembler "\.eabi_attribute 52, 1" } } */
> +/* { dg-final { scan-assembler-not "\.eabi_attribute 74" } } */
> +/* { dg-final { scan-assembler "\.eabi_attribute 76, 1" } } */

Ping

  Andrea


Re: [PATCH] aarch64: Fix return_address_sign_ab_exception.C regression

2023-02-08 Thread Andrea Corallo via Gcc-patches
Richard Sandiford  writes:

> Andrea Corallo via Gcc-patches  writes:
>> Hi all,
>>
>> this is to fix the regression of
>> g++.target/aarch64/return_address_sign_ab_exception.C that I
>> introduced with d8dadbc9a5199bf7bac1ab7376b0f84f45e94350.
>>
>> 'aarch_ra_sign_key' for aarch64 ended up not being defined in the opt
>> file and the function attribute "branch-protection=pac-ret+leaf+b-key"
>> stopped working as expected.
>>
>> This patch moves the definition of 'aarch_ra_sign_key' to the opt
>> files for both Arm back-ends.
>>
>> Regards
>>
>>   Andrea Corallo
>>
>> gcc/ChangeLog:
>>
>>  * config/aarch64/aarch64-protos.h (aarch_ra_sign_key): Remove
>>  declaration.
>>  * config/aarch64/aarch64.cc (aarch_ra_sign_key): Remove
>>  definition.
>>  * config/aarch64/aarch64.opt (aarch64_ra_sign_key): Rename
>>  to 'aarch_ra_sign_key'.
>>  * config/arm/aarch-common.cc (aarch_ra_sign_key): Remove
>>  declaration.
>>  * config/arm/arm-protos.h (aarch_ra_sign_key): Likewise.
>>  * config/arm/arm.cc (enum aarch_key_type): Remove definition.
>>  * config/arm/arm.opt: Define.
>
> OK, thanks.
>
> Richard

Thanks for reviewing, in as b1d26458839.

Best Regards

  Andrea


[PATCH] aarch64: Fix return_address_sign_ab_exception.C regression

2023-02-06 Thread Andrea Corallo via Gcc-patches
Hi all,

this is to fix the regression of
g++.target/aarch64/return_address_sign_ab_exception.C that I
introduced with d8dadbc9a5199bf7bac1ab7376b0f84f45e94350.

'aarch_ra_sign_key' for aarch64 ended up not being defined in the opt
file and the function attribute "branch-protection=pac-ret+leaf+b-key"
stopped working as expected.

This patch moves the definition of 'aarch_ra_sign_key' to the opt
files for both Arm back-ends.

Regards

  Andrea Corallo

gcc/ChangeLog:

* config/aarch64/aarch64-protos.h (aarch_ra_sign_key): Remove
declaration.
* config/aarch64/aarch64.cc (aarch_ra_sign_key): Remove
definition.
* config/aarch64/aarch64.opt (aarch64_ra_sign_key): Rename
to 'aarch_ra_sign_key'.
* config/arm/aarch-common.cc (aarch_ra_sign_key): Remove
declaration.
* config/arm/arm-protos.h (aarch_ra_sign_key): Likewise.
* config/arm/arm.cc (enum aarch_key_type): Remove definition.
* config/arm/arm.opt: Define.
---
 gcc/config/aarch64/aarch64-protos.h | 2 --
 gcc/config/aarch64/aarch64.cc   | 2 --
 gcc/config/aarch64/aarch64.opt  | 2 +-
 gcc/config/arm/aarch-common.cc  | 1 -
 gcc/config/arm/arm-protos.h | 1 -
 gcc/config/arm/arm.cc   | 3 ---
 gcc/config/arm/arm.opt  | 3 +++
 7 files changed, 4 insertions(+), 10 deletions(-)

diff --git a/gcc/config/aarch64/aarch64-protos.h b/gcc/config/aarch64/aarch64-protos.h
index 6ab6d49af37..f75eb892f3d 100644
--- a/gcc/config/aarch64/aarch64-protos.h
+++ b/gcc/config/aarch64/aarch64-protos.h
@@ -662,8 +662,6 @@ enum simd_immediate_check {
   AARCH64_CHECK_MOV  = AARCH64_CHECK_ORR | AARCH64_CHECK_BIC
 };
 
-extern enum aarch_key_type aarch_ra_sign_key;
-
 extern struct tune_params aarch64_tune_params;
 
 /* The available SVE predicate patterns, known in the ACLE as "svpattern".  */
diff --git a/gcc/config/aarch64/aarch64.cc b/gcc/config/aarch64/aarch64.cc
index acc0cfe5f94..1b498979af1 100644
--- a/gcc/config/aarch64/aarch64.cc
+++ b/gcc/config/aarch64/aarch64.cc
@@ -2759,8 +2759,6 @@ static const struct processor all_cores[] =
   {NULL, aarch64_none, aarch64_none, aarch64_no_arch, 0, NULL}
 };
 
-enum aarch_key_type aarch_ra_sign_key = AARCH_KEY_A;
-
 /* The current tuning set.  */
 struct tune_params aarch64_tune_params = generic_tunings;
 
diff --git a/gcc/config/aarch64/aarch64.opt b/gcc/config/aarch64/aarch64.opt
index 137e506fe19..1d7967db9c0 100644
--- a/gcc/config/aarch64/aarch64.opt
+++ b/gcc/config/aarch64/aarch64.opt
@@ -40,7 +40,7 @@ TargetVariable
 unsigned aarch_enable_bti = 2
 
 TargetVariable
-enum aarch64_key_type aarch64_ra_sign_key = AARCH64_KEY_A
+enum aarch_key_type aarch_ra_sign_key = AARCH_KEY_A
 
 ; The TLS dialect names to use with -mtls-dialect.
 
diff --git a/gcc/config/arm/aarch-common.cc b/gcc/config/arm/aarch-common.cc
index 27e6c8f39b4..5b96ff4c2e8 100644
--- a/gcc/config/arm/aarch-common.cc
+++ b/gcc/config/arm/aarch-common.cc
@@ -661,7 +661,6 @@ arm_md_asm_adjust (vec<rtx> &outputs, vec<rtx> & /*inputs*/,
 
 #define BRANCH_PROTECT_STR_MAX 255
 extern char *accepted_branch_protection_string;
-extern enum aarch_key_type aarch_ra_sign_key;
 
 static enum aarch_parse_opt_result
 aarch_handle_no_branch_protection (char* str, char* rest)
diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index aea472bfbb9..c8ae5e1e9c1 100644
--- a/gcc/config/arm/arm-protos.h
+++ b/gcc/config/arm/arm-protos.h
@@ -585,7 +585,6 @@ struct cpu_option
 extern const arch_option all_architectures[];
 extern const cpu_option all_cores[];
 
-extern enum aarch_key_type aarch_ra_sign_key;
 
 const cpu_option *arm_parse_cpu_option_name (const cpu_option *, const char *,
 const char *, bool = true);
diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index efc48349dd3..3d778b2982e 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -2420,9 +2420,6 @@ const struct tune_params arm_fa726te_tune =
   tune_params::SCHED_AUTOPREF_OFF
 };
 
-/* Key type for Pointer Authentication extension.  */
-enum aarch_key_type aarch_ra_sign_key = AARCH_KEY_A;
-
 char *accepted_branch_protection_string = NULL;
 
 /* Auto-generated CPU, FPU and architecture tables.  */
diff --git a/gcc/config/arm/arm.opt b/gcc/config/arm/arm.opt
index 260700f16bc..3a49b51ece0 100644
--- a/gcc/config/arm/arm.opt
+++ b/gcc/config/arm/arm.opt
@@ -30,6 +30,9 @@ enum aarch_function_type aarch_ra_sign_scope = AARCH_FUNCTION_NONE
 TargetVariable
 unsigned aarch_enable_bti = 0
 
+TargetVariable
+enum aarch_key_type aarch_ra_sign_key = AARCH_KEY_A
+
 Enum
 Name(tls_type) Type(enum arm_tls_type)
 TLS dialect to use:
-- 
2.25.1



[PATCH] arm: Implement arm Function target attribute 'branch-protection'

2023-01-27 Thread Andrea Corallo via Gcc-patches
gcc/

* config/arm/arm.cc (arm_valid_target_attribute_rec): Add ARM function
attribute 'branch-protection' and parse its options.
* doc/extend.texi: Document ARM Function attribute 'branch-protection'.

gcc/testsuite/

* gcc.target/arm/acle/pacbti-m-predef-13.c: New test.

Co-Authored-By: Tejas Belagod  
---
 gcc/config/arm/arm.cc | 16 
 gcc/doc/extend.texi   |  7 
 .../gcc.target/arm/acle/pacbti-m-predef-13.c  | 41 +++
 3 files changed, 64 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c

diff --git a/gcc/config/arm/arm.cc b/gcc/config/arm/arm.cc
index efc48349dd3..add33090f18 100644
--- a/gcc/config/arm/arm.cc
+++ b/gcc/config/arm/arm.cc
@@ -33568,6 +33568,22 @@ arm_valid_target_attribute_rec (tree args, struct gcc_options *opts)
 
  opts->x_arm_arch_string = xstrndup (arch, strlen (arch));
}
+  else if (startswith (q, "branch-protection="))
+   {
+ char *bp_str = q + strlen ("branch-protection=");
+
+ opts->x_arm_branch_protection_string
+   = xstrndup (bp_str, strlen (bp_str));
+
+ /* Capture values from target attribute.  */
+ aarch_validate_mbranch_protection
+   (opts->x_arm_branch_protection_string);
+
+ /* Init function target attr values.  */
+ opts->x_aarch_ra_sign_scope = aarch_ra_sign_scope;
+ opts->x_aarch_enable_bti = aarch_enable_bti;
+
+   }
   else if (q[0] == '+')
{
  opts->x_arm_arch_string
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 4a89a3eae7c..23ee43919dd 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -4492,6 +4492,13 @@ Enable or disable calls to out-of-line helpers to implement atomic operations.
 This corresponds to the behavior of the command line options
 @option{-moutline-atomics} and @option{-mno-outline-atomics}.
 
+@item branch-protection=
+@cindex @code{branch-protection=} function attribute, arm
+Select the function scope on which branch protection will be applied.
+The behavior and permissible arguments are the same as for the
+command-line option @option{-mbranch-protection=}.  The default value
+is @code{none}.
+
 @end table
 
 The above target attributes can be specified as follows:
diff --git a/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c
new file mode 100644
index 000..b6d2df53072
--- /dev/null
+++ b/gcc/testsuite/gcc.target/arm/acle/pacbti-m-predef-13.c
@@ -0,0 +1,41 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target mbranch_protection_ok } */
+/* { dg-options "-march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mfloat-abi=hard --save-temps" } */
+/* { dg-final { check-function-bodies "**" "" } } */
+
+#if defined (__ARM_FEATURE_BTI_DEFAULT)
+#error "Feature test macro __ARM_FEATURE_BTI_DEFAULT should be undefined."
+#endif
+
+#if !defined (__ARM_FEATURE_PAC_DEFAULT)
+#error "Feature test macro __ARM_FEATURE_PAC_DEFAULT should be defined."
+#endif
+
+/*
+**foo:
+** bti
+** ...
+*/
+__attribute__((target("branch-protection=pac-ret+bti"), noinline))
+int foo ()
+{
+  return 3;
+}
+
+/*
+**main:
+** pac ip, lr, sp
+** ...
+** aut ip, lr, sp
+** bx  lr
+*/
+int
+main()
+{
+  return 1 + foo ();
+}
+
+/* { dg-final { scan-assembler "\.eabi_attribute 50, 1" } } */
+/* { dg-final { scan-assembler "\.eabi_attribute 52, 1" } } */
+/* { dg-final { scan-assembler-not "\.eabi_attribute 74" } } */
+/* { dg-final { scan-assembler "\.eabi_attribute 76, 1" } } */
-- 
2.25.1
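
As an illustration of the prefix handling in the arm.cc hunk above, here
is a simplified, standalone sketch.  It is not GCC code: struct bp_opts
and parse_branch_protection are hypothetical names, and only the
keywords "none", "bti", "pac-ret" and "leaf" are recognised, which is an
assumption made to keep the example short.  In the patch itself the
validation is delegated to aarch_validate_mbranch_protection.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-function option state.  */
struct bp_opts
{
  bool pac_ret;
  bool leaf;
  bool bti;
};

/* Parse one target-attribute token such as
   "branch-protection=pac-ret+bti".  Returns false when the token does
   not start with the expected prefix or contains an unknown keyword;
   *OPTS is only written on success.  */
static bool
parse_branch_protection (const char *q, struct bp_opts *opts)
{
  static const char prefix[] = "branch-protection=";
  size_t plen = sizeof prefix - 1;

  if (strncmp (q, prefix, plen) != 0)
    return false;

  /* strtok modifies its input, so work on a private copy.  */
  const char *value = q + plen;
  char *copy = malloc (strlen (value) + 1);
  if (copy == NULL)
    return false;
  strcpy (copy, value);

  struct bp_opts parsed = { false, false, false };
  bool ok = true;

  for (char *tok = strtok (copy, "+"); tok != NULL && ok;
       tok = strtok (NULL, "+"))
    {
      if (strcmp (tok, "bti") == 0)
        parsed.bti = true;
      else if (strcmp (tok, "pac-ret") == 0)
        parsed.pac_ret = true;
      else if (strcmp (tok, "leaf") == 0 && parsed.pac_ret)
        parsed.leaf = true;   /* Only accepted after "pac-ret".  */
      else if (strcmp (tok, "none") != 0)
        ok = false;           /* Unknown or misplaced keyword.  */
    }

  free (copy);
  if (ok)
    *opts = parsed;
  return ok;
}
```

With this sketch, "branch-protection=pac-ret+bti" sets both pac_ret and
bti, while a token such as "arch=armv8.1-m.main" is rejected because it
lacks the prefix.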



Re: [PATCH 23/23] arm: fix missing extern "C" in MVE tests

2023-01-25 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov writes:

[...]

>
> Ok.
> Thanks,
> Kyrill

Hi Kyrill,

thanks for reviewing.  These and all the previous ones are in with the
requested ChangeLog changes.

Regards

  Andrea


[PATCH 0/15] arm: Enables return address verification and branch target identification on Cortex-M

2023-01-23 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing and approving this series, this is now in.

BR

  Andrea


[PATCH 23/23] arm: fix missing extern "C" in MVE tests

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vhaddq_n_s16.c: Add missing extern
"C".
* gcc.target/arm/mve/intrinsics/vhaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s8.c: 

[PATCH 00/23] arm: rework MVE testsuite and rework backend where necessary (3rd chunck)

2023-01-20 Thread Andrea Corallo via Gcc-patches
Hi all,

this 3rd series, like the previous ones, reworks the arm MVE
testsuite for better coverage.  Along the way, some trivial fixes to the
backend are made.

23/23 also adds some extern "C" I forgot to add with the previous
series in order to fix those tests for C++.

Best Regards

  Andrea

Andrea Corallo (23):
  arm: improve tests and fix vclsq*
  arm: improve tests and fix vclzq*
  arm: improve tests and fix vnegq*
  arm: improve tests for vmulhq*
  arm: improve tests for vmullbq*
  arm: improve tests for vmulltq*
  arm: improve tests for vcaddq*
  arm: improve tests for vcmlaq*
  arm: improve tests for vcmulq*
  arm: improve tests and fix vqabsq*
  arm: improve tests for vqdmladhq*
  arm: improve tests for vqdmladhxq*
  arm: improve tests for vqrdmladhq*
  arm: improve tests for vqrdmladhxq*
  arm: improve tests for vqrdmlashq*
  arm: improve tests for vqdmlsdhq*
  arm: improve tests for vqdmlsdhxq*
  arm: improve tests for vqrdmlsdhq*
  arm: improve tests for vqrdmlsdhxq*
  arm: improve tests for vqrdmulhq*
  arm: improve tests and fix vqnegq*
  arm: improve tests for vld2q*
  arm: fix missing extern "C" in MVE tests

 gcc/config/arm/mve.md | 12 +++
 .../arm/mve/intrinsics/vcaddq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_s16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_s8.c| 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u16.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u32.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_m_u8.c| 34 ---
 .../arm/mve/intrinsics/vcaddq_rot90_s16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_s32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_s8.c  | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u16.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u32.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_u8.c  | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot90_x_f16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_f32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_s8.c| 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u16.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u32.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot90_x_u8.c| 33 --
 .../arm/mve/intrinsics/vclsq_m_s16.c  | 33 --
 .../arm/mve/intrinsics/vclsq_m_s32.c  | 33 --
 .../arm/mve/intrinsics/vclsq_m_s8.c   | 33 --
 .../gcc.target/arm/mve/intrinsics/vclsq_s16.c | 28 ---
 .../gcc.target/arm/mve/intrinsics/vclsq_s32.c | 28 ---
 .../gcc.target/arm/mve/intrinsics/vclsq_s8.c  | 24 +++--
 ..

[PATCH 09/23] arm: improve tests for vcmulq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcmulq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcmulq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot180_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot270_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_rot90_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmulq_x_f32.c: Likewise.
---
 .../arm/mve/intrinsics/vcmulq_f16.c   | 24 +++--
 .../arm/mve/intrinsics/vcmulq_f32.c   | 24 +++--
 .../arm/mve/intrinsics/vcmulq_m_f16.c | 34 ---
 .../arm/mve/intrinsics/vcmulq_m_f32.c | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot180_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot180_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot180_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot180_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcmulq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcmulq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_m_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_x_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_rot90_x_f32.c   | 34 ---
 .../arm/mve/intrinsics/vcmulq_x_f16.c | 33 --
 .../arm/mve/intrinsics/vcmulq_x_f32.c | 33 --
 24 files changed, 656 insertions(+), 74 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
index 142c315ecf5..456370e1de1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmul.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a, float16x8_t b)
 {
   return vcmulq_f16 (a, b);
 }
 
-/* { dg-final { scan-assembler "vcmul.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmul.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?:@.*|)
+** ...
+*/
 float16x8_t
 foo1 (float16x8_t a, float16x8_t b)
 {
   return vcmulq (a, b);
 }
 
-/* { dg-final { scan-assembler "vcmul.f16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
index 158d750793d..64db652a1a1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmulq_f32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m

[PATCH 14/23] arm: improve tests for vqrdmladhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhxq_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhxq_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhxq_s8.c   | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
index 677efdcd1e4..1f68671b3f9 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
index 8ee8bbb420b..eaea6e1f482 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
index 7cfa88fee28..0f582a91f3a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladhxt.s8  q[0-9]+, q[0-9]+, q[0-9]+(?:@.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t 

[PATCH 20/23] arm: improve tests for vqrdmulhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmulhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_n_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqrdmulhq_n_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_n_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_n_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqrdmulhq_s8.c | 24 +++--
 12 files changed, 312 insertions(+), 36 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
index c4b6b7e22f8..fc3a33073aa 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m_n_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
index 6de3eb1cb9a..897ad5bd28c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmulhq_m_n_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m_n_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?:@.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmulht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32_t b, mve_pred16_t p)
 {
   return vqrdmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmulht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/ar

[PATCH 07/23] arm: improve tests for vcaddq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcaddq_rot270_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot270_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcaddq_rot90_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vcaddq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u16.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u32.c  | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_m_u8.c   | 34 ---
 .../arm/mve/intrinsics/vcaddq_rot270_s16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_s8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u16.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u32.c| 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_u8.c | 24 +++--
 .../arm/mve/intrinsics/vcaddq_rot270_x_f16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_f32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s16.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s32.c  | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_s8.c   | 33 --
 .../arm/mve/intrinsics/vcaddq_rot270_x_u16.c  | 33 --

[PATCH 10/23] arm: improve tests and fix vqabsq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vqabsq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqabsq_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vqabsq_m_s16.c | 33 +--
 .../arm/mve/intrinsics/vqabsq_m_s32.c | 33 +--
 .../arm/mve/intrinsics/vqabsq_m_s8.c  | 33 +--
 .../arm/mve/intrinsics/vqabsq_s16.c   | 28 +---
 .../arm/mve/intrinsics/vqabsq_s32.c   | 28 +---
 .../gcc.target/arm/mve/intrinsics/vqabsq_s8.c | 24 --
 7 files changed, 161 insertions(+), 20 deletions(-)
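
For readers unfamiliar with the instruction under test, here is a minimal
scalar sketch of what a saturating absolute value does per 16-bit lane.
The names model_vqabs_s16_lane and model_vqabsq_s16 are hypothetical and
are not part of this patch or of arm_mve.h; this is only a host-side model,
not how the intrinsic is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of one 16-bit lane of a saturating absolute
   value: |INT16_MIN| does not fit in int16_t, so it saturates to
   INT16_MAX instead of wrapping.  */
int16_t
model_vqabs_s16_lane (int16_t x)
{
  if (x == INT16_MIN)
    return INT16_MAX;
  return (int16_t) (x < 0 ? -x : x);
}

/* Apply the lane model to the eight 16-bit lanes of a 128-bit vector.  */
void
model_vqabsq_s16 (int16_t dst[8], const int16_t src[8])
{
  assert (dst != NULL && src != NULL);
  for (size_t i = 0; i < 8; i++)
    dst[i] = model_vqabs_s16_lane (src[i]);
}
```

The only lane value where this differs from a plain abs is INT16_MIN,
which maps to INT16_MAX.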

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 0a243486bdb..600adf7d69b 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -388,7 +388,7 @@ (define_insn "mve_vqabsq_s"
 VQABSQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vqabs.s%#<V_sz_elem> %q0, %q1"
+  "vqabs.s%#<V_sz_elem>\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
index e74e04ac92f..7172ac5cddd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqabsq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqabst.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqabsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
index f6ca8a6c3d6..297cb196f1a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqabsq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqabst.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqabsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
index d89a5aa3fa5..83c69931239 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqabsq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqabst.s8

[PATCH 22/23] arm: improve tests for vld2q*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vld2q_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vld2q_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vld2q_u8.c: Likewise.
---
 .../gcc.target/arm/mve/intrinsics/vld2q_f16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_f32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_s8.c  | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u16.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u32.c | 33 ---
 .../gcc.target/arm/mve/intrinsics/vld2q_u8.c  | 33 ---
 8 files changed, 224 insertions(+), 40 deletions(-)
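
As background for the vld20/vld21 pairs checked below, here is a minimal
scalar sketch of a two-way de-interleaving load of 16-bit elements.  The
name model_vld2q_s16 is hypothetical; it models the combined effect of the
two instructions on the host and is not taken from this patch or from
arm_mve.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of a two-way de-interleaving load: sixteen
   consecutive 16-bit elements are split so that val0 receives the
   even-indexed elements and val1 the odd-indexed ones.  */
void
model_vld2q_s16 (int16_t val0[8], int16_t val1[8], const int16_t addr[16])
{
  assert (val0 != NULL && val1 != NULL && addr != NULL);
  for (size_t i = 0; i < 8; i++)
    {
      val0[i] = addr[2 * i];
      val1[i] = addr[2 * i + 1];
    }
}
```

In this model val0 and val1 stand in for the two vectors of the
int16x8x2_t result.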

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
index 24e7a2ea4d0..81690b1022e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f16.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float16x8x2_t
-foo (float16_t const * addr)
+foo (float16_t const *addr)
 {
   return vld2q_f16 (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.16"  }  } */
-/* { dg-final { scan-assembler "vld21.16"  }  } */
 
+/*
+**foo1:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float16x8x2_t
-foo1 (float16_t const * addr)
+foo1 (float16_t const *addr)
 {
   return vld2q (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
index 727484caaf6..d2ae31fa9e5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_f32.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float32x4x2_t
-foo (float32_t const * addr)
+foo (float32_t const *addr)
 {
   return vld2q_f32 (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.32"  }  } */
-/* { dg-final { scan-assembler "vld21.32"  }  } */
 
+/*
+**foo1:
+** ...
+** vld20.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.32 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 float32x4x2_t
-foo1 (float32_t const * addr)
+foo1 (float32_t const *addr)
 {
   return vld2q (addr);
 }
 
-/* { dg-final { scan-assembler "vld20.32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
index f2864a00478..fb4dc1b4fcf 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vld2q_s16.c
@@ -1,22 +1,45 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vld20.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+** vld21.16 {q[0-9]+, q[0-9]+}, \[(?:ip|fp|r[0-9]+)\](?: @.*|)
+** ...
+*/
 int16x8x2_t
-foo (int16_t const * addr)
+foo (int16_t const *addr)
 {
   return vld2q_s16

[PATCH 19/23] arm: improve tests for vqrdmlsdhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhxq_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhxq_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhxq_s8.c   | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
index 2fbd351f3b4..3598f50ccba 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
index 324a6e63398..1ab22edf9ca 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s32 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
index 287868b1190..01103e99b61 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdhxt.s8  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t 

[PATCH 11/23] arm: improve tests for vqdmladhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmladhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmladhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqdmladhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqdmladhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqdmladhq_s8.c | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
index 51cdadc9ece..aa9c78c883b 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
index 7e43fed1503..4694a6f9ec5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
index adf591041e3..c8dc67fdd12 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladht.s8 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmladhq_m_s8 (inactive, a, b, p);

[PATCH 08/23] arm: improve tests for vcmlaq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcmlaq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcmlaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot180_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot270_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcmlaq_rot90_m_f32.c: Likewise.
---
 .../arm/mve/intrinsics/vcmlaq_f16.c   | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_f32.c   | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_m_f16.c | 34 ---
 .../arm/mve/intrinsics/vcmlaq_m_f32.c | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot180_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot180_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot180_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot180_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot270_f16.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot270_f32.c| 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot270_m_f16.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot270_m_f32.c  | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot90_f16.c | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot90_f32.c | 24 +++--
 .../arm/mve/intrinsics/vcmlaq_rot90_m_f16.c   | 34 ---
 .../arm/mve/intrinsics/vcmlaq_rot90_m_f32.c   | 34 ---
 16 files changed, 416 insertions(+), 48 deletions(-)
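
As a reading aid for the "#0" operand matched below, here is a minimal
scalar sketch of a complex multiply-accumulate with rotation 0 on four
float lanes (two complex numbers, real part in the even lane, imaginary
part in the odd lane).  The name model_vcmlaq_rot0_f32 is hypothetical and
the lane layout is an assumption of this sketch; it is not code from this
patch or from arm_mve.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scalar model of a rotation-0 complex multiply-accumulate.
   For each (re, im) pair only the real part of b is used:
     acc.re += b.re * c.re
     acc.im += b.re * c.im
   The other rotations supply the remaining partial products.  */
void
model_vcmlaq_rot0_f32 (float acc[4], const float b[4], const float c[4])
{
  assert (acc != NULL && b != NULL && c != NULL);
  for (size_t i = 0; i < 4; i += 2)
    {
      acc[i] += b[i] * c[i];
      acc[i + 1] += b[i] * c[i + 1];
    }
}
```

Note that the imaginary lanes of b never contribute in this rotation.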

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
index fa7d0c05e8c..bb8a99790a0 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a, float16x8_t b, float16x8_t c)
 {
   return vcmlaq_f16 (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmla.f16   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float16x8_t
 foo1 (float16x8_t a, float16x8_t b, float16x8_t c)
 {
   return vcmlaq (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
index 166bf421f14..71ec4b8479c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_f32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32x4_t a, float32x4_t b, float32x4_t c)
 {
   return vcmlaq_f32 (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f32"  }  } */
 
+/*
+**foo1:
+** ...
+** vcmla.f32   q[0-9]+, q[0-9]+, q[0-9]+, #0(?: @.*|)
+** ...
+*/
 float32x4_t
 foo1 (float32x4_t a, float32x4_t b, float32x4_t c)
 {
   return vcmlaq (a, b, c);
 }
 
-/* { dg-final { scan-assembler "vcmla.f32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq_m_f16.c
index 0929f5a0a89..3db345d0791 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcmlaq

[PATCH 02/23] arm: improve tests and fix vclzq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (@mve_vclzq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vclzq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vclzq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclzq_x_u8.c: Likewise.
* gcc.target/arm/simd/mve-vclz.c: Update test.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vclzq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_s8.c   | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_m_u8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vclzq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_s8.c  | 24 --
 .../gcc.target/arm/mve/intrinsics/vclzq_u16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_u32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclzq_u8.c  | 28 +---
 .../arm/mve/intrinsics/vclzq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_s8.c   | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u16.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u32.c  | 33 +--
 .../arm/mve/intrinsics/vclzq_x_u8.c   | 33 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vclz.c  |  6 ++--
 20 files changed, 506 insertions(+), 62 deletions(-)
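
To illustrate why the spacing fix in mve.md matters for the new
function-body checks, here is a minimal host-side sketch.  It assumes the
expected-body patterns separate the mnemonic from its operands with a tab,
as the "\t" in the new template suggests, and it uses POSIX extended
regular expressions as a stand-in for the testsuite's own matcher.  The
helper name line_matches is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE matches the POSIX extended
   regular expression PATTERN, 0 if it does not, and -1 if PATTERN
   fails to compile.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && line != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With the pattern "^vclz\\.i16\tq[0-9]+, q[0-9]+$" (written as a C string
literal), the line "vclz.i16\tq0, q1" matches, while the same line with
two spaces in place of the tab does not.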

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index e35ea5d9f9c..854371f7e11 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -448,7 +448,7 @@ (define_insn "@mve_vclzq_s"
(clz:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vclz.i%#<V_sz_elem>  %q0, %q1"
+  "vclz.i%#<V_sz_elem>\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 (define_expand "mve_vclzq_u"
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
index 9670f8f56f3..620314e4ff2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclzq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclzt.i16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclzt.i16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclzq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
index 18427354570..dfda1e67287 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclzq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /*

[PATCH 13/23] arm: improve tests for vqrdmladhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmladhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmladhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmladhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmladhq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
index fce4f5a35ef..5b0e134a0ff 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
index e550b6a7995..6fdf3879cc2 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmladhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmladht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
index b07b28e5bcd..ef75f737161 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmladhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmladht.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqrdmladhq

[PATCH 12/23] arm: improve tests for vqdmladhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmladhxq_s8.c: Improve test.
---
 .../arm/mve/intrinsics/vqdmladhxq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqdmladhxq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmladhxq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmladhxq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
index c2446e69181..19c5ce5a64f 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
index 12b45517535..e00162addae 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmladhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmladhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
index 146aa51306b..19767d2cd41 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmladhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmladhxt.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {

[PATCH 21/23] arm: improve tests and fix vqnegq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vqnegq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqnegq_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vqnegq_m_s16.c | 33 +--
 .../arm/mve/intrinsics/vqnegq_m_s32.c | 33 +--
 .../arm/mve/intrinsics/vqnegq_m_s8.c  | 33 +--
 .../arm/mve/intrinsics/vqnegq_s16.c   | 28 +---
 .../arm/mve/intrinsics/vqnegq_s32.c   | 24 --
 .../gcc.target/arm/mve/intrinsics/vqnegq_s8.c | 24 --
 7 files changed, 159 insertions(+), 18 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 600adf7d69b..4f94cf14a0b 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -374,7 +374,7 @@ (define_insn "mve_vqnegq_s"
 VQNEGQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vqneg.s%# %q0, %q1"
+  "vqneg.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
index 4f0145d2ebd..f3799a35b12 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqnegq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqnegt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s16  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vqnegq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
index da4f90bad53..bbe64ff4d52 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqnegq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqnegt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s32  q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vqnegq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
index ac1250b2fac..71fcdd7cba7 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqnegq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqnegt.s8  

[PATCH 03/23] arm: improve tests and fix vnegq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vnegq_f, mve_vnegq_s):
Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vnegq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vnegq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vnegq_x_s8.c: Likewise.
* gcc.target/arm/simd/mve-vneg.c: Update test.
* gcc.target/arm/simd/mve-vshr.c: Likewise.
---
 gcc/config/arm/mve.md |  4 +--
 .../gcc.target/arm/mve/intrinsics/vnegq_f16.c | 30 -
 .../gcc.target/arm/mve/intrinsics/vnegq_f32.c | 30 -
 .../arm/mve/intrinsics/vnegq_m_f16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_f32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_m_s8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vnegq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vnegq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vnegq_s8.c  | 24 --
 .../arm/mve/intrinsics/vnegq_x_f16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_f32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vnegq_x_s8.c   | 33 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vneg.c  |  4 +--
 gcc/testsuite/gcc.target/arm/simd/mve-vshr.c  |  2 +-
 18 files changed, 433 insertions(+), 47 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 854371f7e11..0a243486bdb 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -252,7 +252,7 @@ (define_insn "mve_vnegq_f"
(neg:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vneg.f%#  %q0, %q1"
+  "vneg.f%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -401,7 +401,7 @@ (define_insn "mve_vnegq_s"
(neg:MVE_2 (match_operand:MVE_2 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE"
-  "vneg.s%#  %q0, %q1"
+  "vneg.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
index 9572c140d7e..9853cf6e6dd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f16.c
@@ -1,13 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vneg.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a)
 {
   return vnegq_f16 (a);
 }
 
-/* { dg-final { scan-assembler "vneg.f16"  }  } */
+
+/*
+**foo1:
+** ...
+** vneg.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
+float16x8_t
+foo1 (float16x8_t a)
+{
+  return vnegq (a);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
index be73cc0c5f5..489cfc760ba 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vnegq_f32.c
@@ -1,13 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vneg.f32 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32x4_t a)
 {
   return vnegq_f32 (a);
 }
 
-/* { dg-final { scan-assembler "vneg.f32"  }  } */
+
+/*
+**foo1:
+** ...

[PATCH 05/23] arm: improve tests for vmullbq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_int_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_m_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmullbq_poly_x_p8.c: Likewise.
---
 .../arm/mve/intrinsics/vmullbq_int_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u16.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u32.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_int_m_u8.c | 34 ---
 .../arm/mve/intrinsics/vmullbq_int_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u16.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u32.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_u8.c   | 24 +++--
 .../arm/mve/intrinsics/vmullbq_int_x_s16.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_s32.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_s8.c | 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u16.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u32.c| 33 --
 .../arm/mve/intrinsics/vmullbq_int_x_u8.c | 33 --
 .../arm/mve/intrinsics/vmullbq_poly_m_p16.c   | 34 ---
 .../arm/mve/intrinsics/vmullbq_poly_m_p8.c| 34 ---
 .../arm/mve/intrinsics/vmullbq_poly_p16.c | 24 +++--
 .../arm/mve/intrinsics/vmullbq_poly_p8.c  | 24 +++--
 .../arm/mve/intrinsics/vmullbq_poly_x_p16.c   | 33 --
 .../arm/mve/intrinsics/vmullbq_poly_x_p8.c| 33 --
 24 files changed, 656 insertions(+), 72 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
index be933274d77..a4cc5e52773 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmullbq_int_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmullbt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmullbq_int_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmullbt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmullbt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmullbq_int_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmullbt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_

[PATCH 18/23] arm: improve tests for vqrdmlsdhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlsdhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqrdmlsdhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqrdmlsdhq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
index d0054b8ea97..6a5776215ca 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
index 7d3fe45eb4d..9539e249d6a 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlsdht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
index c33f8ea903b..69e54f53a76 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlsdhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlsdht.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqrdmlsdhq

[PATCH 04/23] arm: improve tests for vmulhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulhq_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vmulhq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u16.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u32.c | 34 ---
 .../arm/mve/intrinsics/vmulhq_m_u8.c  | 34 ---
 .../arm/mve/intrinsics/vmulhq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vmulhq_s32.c   | 24 +++--
 .../gcc.target/arm/mve/intrinsics/vmulhq_s8.c | 24 +++--
 .../arm/mve/intrinsics/vmulhq_u16.c   | 24 +++--
 .../arm/mve/intrinsics/vmulhq_u32.c   | 24 +++--
 .../gcc.target/arm/mve/intrinsics/vmulhq_u8.c | 24 +++--
 .../arm/mve/intrinsics/vmulhq_x_s16.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_s32.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_s8.c  | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u16.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u32.c | 33 --
 .../arm/mve/intrinsics/vmulhq_x_u8.c  | 33 --
 18 files changed, 492 insertions(+), 54 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
index 4971869a27b..a7d8460c265 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
index 3006de7fd24..997fdbe8d23 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulht.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vmulhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-fi

[PATCH 06/23] arm: improve tests for vmulltq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_int_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_m_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_p8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulltq_poly_x_p8.c: Likewise.
---
 .../arm/mve/intrinsics/vmulltq_int_m_s16.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_s32.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_s8.c | 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u16.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u32.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_int_m_u8.c | 34 ---
 .../arm/mve/intrinsics/vmulltq_int_s16.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_s32.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_s8.c   | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u16.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u32.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_u8.c   | 24 +++--
 .../arm/mve/intrinsics/vmulltq_int_x_s16.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_s32.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_s8.c | 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u16.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u32.c| 33 --
 .../arm/mve/intrinsics/vmulltq_int_x_u8.c | 33 --
 .../arm/mve/intrinsics/vmulltq_poly_m_p16.c   | 34 ---
 .../arm/mve/intrinsics/vmulltq_poly_m_p8.c| 34 ---
 .../arm/mve/intrinsics/vmulltq_poly_p16.c | 24 +++--
 .../arm/mve/intrinsics/vmulltq_poly_p8.c  | 24 +++--
 .../arm/mve/intrinsics/vmulltq_poly_x_p16.c   | 33 --
 .../arm/mve/intrinsics/vmulltq_poly_x_p8.c| 33 --
 24 files changed, 656 insertions(+), 72 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
index 25ecf7a2c51..7f573e9109e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmulltq_int_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulltt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulltq_int_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulltt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmulltt.s16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vmulltq_int_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmulltt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_

[PATCH 16/23] arm: improve tests for vqdmlsdhq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s16.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s32.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_m_s8.c   | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhq_s16.c| 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhq_s32.c| 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhq_s8.c | 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
index d1e66864d10..f87287ab8cd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s16   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
index cc80f211ec8..8155aaf843c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s32   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdht.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
index 5c9d81a6526..d39badc7707 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdht.s8 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmlsdhq_m_s8 (inactive, a, b, p);

[PATCH 01/23] arm: improve tests and fix vclsq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vclsq_s): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vclsq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vclsq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vclsq_x_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vclsq_m_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_m_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_m_s8.c   | 33 +--
 .../gcc.target/arm/mve/intrinsics/vclsq_s16.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclsq_s32.c | 28 +---
 .../gcc.target/arm/mve/intrinsics/vclsq_s8.c  | 24 --
 .../arm/mve/intrinsics/vclsq_x_s16.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_x_s32.c  | 33 +--
 .../arm/mve/intrinsics/vclsq_x_s8.c   | 33 +--
 10 files changed, 251 insertions(+), 29 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index f123edc449b..e35ea5d9f9c 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -469,7 +469,7 @@ (define_insn "mve_vclsq_s"
 VCLSQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vcls.s%#  %q0, %q1"
+  "vcls.s%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
index d0eb7008537..1996ac8b03e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s16.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclsq_m_s16 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclst.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s16   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, mve_pred16_t p)
 {
   return vclsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
index b6d7088a8e7..f51841d024e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s32.c
@@ -1,22 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s32   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vclsq_m_s32 (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vclst.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vclst.s32   q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, mve_pred16_t p)
 {
   return vclsq_m (inactive, a, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
index 28d4d966802..2975c4cda56 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vclsq_m_s8.c
@@ -1,22 +1,49 @@
 /* { dg-require-

[PATCH 17/23] arm: improve tests for vqdmlsdhxq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlsdhxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s16.c | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s32.c | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_m_s8.c  | 34 ---
 .../arm/mve/intrinsics/vqdmlsdhxq_s16.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhxq_s32.c   | 24 +++--
 .../arm/mve/intrinsics/vqdmlsdhxq_s8.c| 24 +++--
 6 files changed, 156 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
index 6ab9743054c..1742d47291c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s16.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s16  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16x8_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s16"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
index a34618e97fd..1c1b73a2251 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s32.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m_s32 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s32  q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int32x4_t
 foo1 (int32x4_t inactive, int32x4_t a, int32x4_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlsdhxt.s32"  }  } */
+#ifdef __cplusplus
+}
+#endif
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
index fdbe89ab6b8..0a980a081a1 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlsdhxq_m_s8.c
@@ -1,23 +1,49 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlsdhxt.s8   q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 int8x16_t
 foo (int8x16_t inactive, int8x16_t a, int8x16_t b, mve_pred16_t p)
 {
   return vqdmlsdhxq

[PATCH 15/23] arm: improve tests for vqrdmlashq*

2023-01-20 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlashq_n_s16.c | 32 +++
 .../arm/mve/intrinsics/vqrdmlashq_n_s32.c | 32 +++
 .../arm/mve/intrinsics/vqrdmlashq_n_s8.c  | 32 +++
 3 files changed, 78 insertions(+), 18 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
index 8ff8c34d529..2710f2f0442 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c)
+foo (int16x8_t m1, int16x8_t m2, int16_t add)
 {
-  return vqrdmlashq_n_s16 (a, b, c);
+  return vqrdmlashq_n_s16 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c)
+foo1 (int16x8_t m1, int16x8_t m2, int16_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
index 02583f0627b..5fefc3938c5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c)
+foo (int32x4_t m1, int32x4_t m2, int32_t add)
 {
-  return vqrdmlashq_n_s32 (a, b, c);
+  return vqrdmlashq_n_s32 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c)
+foo1 (int32x4_t m1, int32x4_t m2, int32_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s32"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
index 0bd5bcac71f..df96fe85213 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_n_s8.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+/*
+**foo:
+** ...
+** vqrdmlash.s8 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int8x16_t
-foo (int8x16_t a, int8x16_t b, int8_t c)
+foo (int8x16_t m1, int8x16_t m2, int8_t add)
 {
-  return vqrdmlashq_n_s8 (a, b, c);
+  return vqrdmlashq_n_s8 (m1, m2, add);
 }
 
-/* { dg-final { scan-assembler "vqrdmlash.s8"  }  } */
 
+/*
+**foo1:
+** ...
+** vqrdmlash.s8 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int8x16_t
-foo1 (int8x16_t a, int8x16_t b, int8_t c)
+foo1 (int8x16_t m1, int8x16_t m2, int8_t add)
 {
-  return vqrdmlashq (a, b, c);
+  return vqrdmlashq (m1, m2, add);
+}
+
+#ifdef __cplusplus
 }
+#endif
 
-/* { dg-final { scan-assembler "vqrdmlash.s8"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
-- 
2.25.1



[PATCH 10/15 V7] arm: Implement cortex-M return signing address codegen

2023-01-11 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

[...]

>
> Otherwise ok with that change.
>
> R.

Minor respin of this patch addressing the suggestion to have
'use_return_insn' return zero when PAC is enabled.

BR

  Andrea

From 0a894f73fc09be865b7a7cb205e871bf82f8abba Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 20 Jan 2022 15:36:23 +0100
Subject: [PATCH] [PATCH 10/15] arm: Implement cortex-M return signing address
 codegen

Hi all,

this patch enables return address signing and verification based on
Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This signs LR using SP as the modifier and
stores the result in R12.  R12 is then pushed onto the stack.

During the function epilogue R12 is popped and AUT R12, LR, SP is
used to verify that the content of LR is still valid before returning.

Here is an example of a PAC-instrumented function prologue and epilogue:

void foo (void);

int main()
{
  foo ();
  return 0;
}

Compiled with '-march=armv8.1-m.main -mbranch-protection=pac-ret
-mthumb' translates into:

main:
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr
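
The sign-on-entry / verify-on-exit flow described above can be modelled
on any host with a toy sketch.  Everything here is an assumption made
for illustration: toy_mac, toy_pac and toy_aut are hypothetical names,
the mixing function is NOT the algorithm used by the hardware, and the
real AUT instruction raises a fault on mismatch rather than returning a
boolean.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for the keyed hash computed by the hardware.  The real
   Armv8.1-M algorithm is different; this mixer only exists so that
   the flow can be exercised.  */
static uint32_t
toy_mac (uint32_t value, uint32_t modifier, uint32_t key)
{
  uint32_t h = value ^ (modifier * 0x9e3779b9u) ^ key;
  h ^= h >> 16;
  h *= 0x85ebca6bu;
  h ^= h >> 13;
  return h;
}

/* Models "pac ip, lr, sp": derive a code from LR with SP as modifier.
   The result is what the prologue would keep in IP (R12).  */
static uint32_t
toy_pac (uint32_t lr, uint32_t sp, uint32_t key)
{
  return toy_mac (lr, sp, key);
}

/* Models "aut ip, lr, sp": true when LR still matches the saved code.  */
static bool
toy_aut (uint32_t ip, uint32_t lr, uint32_t sp, uint32_t key)
{
  return toy_mac (lr, sp, key) == ip;
}
```

In this model a return address overwritten between toy_pac and toy_aut
makes the check fail, which is the property the prologue/epilogue pair
above relies on.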

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
as well.

For instance, the previous example compiled with '-march=armv8.1-m.main
-mbranch-protection=pac-ret+bti -mthumb' translates into:

main:
        pacbti  ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr
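
As a rough sketch of the choice between PAC, BTI and the combined
PACBTI at function entry, the selection can be written as below.  This
is not the code from the patch: entry_insn is a hypothetical helper,
and the model ignores details such as whether a given function
actually needs its return address signed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the patch: returns the instruction
   expected at function entry for the given protection settings, or
   NULL when neither feature is enabled.  */
static const char *
entry_insn (bool pac_enabled, bool bti_enabled)
{
  /* Both enabled: a single PACBTI replaces the BTI+PAC sequence.  */
  if (pac_enabled && bti_enabled)
    return "pacbti\tip, lr, sp";
  if (pac_enabled)
    return "pac\tip, lr, sp";
  if (bti_enabled)
    return "bti";
  return NULL;
}
```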

Following earlier upstream review suggestions, a test for varargs has
been added and '-mtpcs-frame' is now treated as incompatible with the
return address signing feature introduced here.

[1] 
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>

gcc/Changelog

2021-11-03  Andrea Corallo  

* config/arm/arm.h (arm_arch8m_main): Declare it.
* config/arm/arm.cc (arm_arch8m_main): Define it.
(arm_option_reconfigure_globals): Set arm_arch8m_main.
(arm_compute_frame_layout, arm_expand_prologue)
(thumb2_expand_return, arm_expand_epilogue)
(arm_conditional_register_usage): Update for pac codegen.
(arm_current_function_pac_enabled_p): New function.
(aarch_bti_enabled): New function.
(use_return_insn): Return zero when pac is enabled.
* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
Add new patterns.
* config/arm/unspecs.md (UNSPEC_PAC_NOP)
(VUNSPEC_PACBTI_NOP, VUNSPEC_AUT_NOP): Add unspecs.

gcc/testsuite/Changelog

2021-11-03  Andrea Corallo  

* gcc.target/arm/pac.h : New file.
* gcc.target/arm/pac-1.c : New test case.
* gcc.target/arm/pac-2.c : Likewise.
* gcc.target/arm/pac-3.c : Likewise.
* gcc.target/arm/pac-4.c : Likewise.
* gcc.target/arm/pac-5.c : Likewise.
* gcc.target/arm/pac-6.c : Likewise.
* gcc.target/arm/pac-7.c : Likewise.
* gcc.target/arm/pac-8.c : Likewise.
* gcc.target/arm/pac-9.c : Likewise.
* gcc.target/arm/pac-10.c : Likewise.
* gcc.target/arm/pac-11.c : Likewise.
---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/config/arm/arm.cc | 79 ---
 gcc/config/arm/arm.h  |  4 ++
 gcc/config/arm/arm.md | 23 
 gcc/config/arm/unspecs.md |  3 +
 gcc/testsuite/gcc.target/arm/pac-1.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-10.c | 10 
 gcc/testsuite/gcc.target/arm/pac-11.c | 10 
 gcc/testsuite/gcc.target/arm/pac-2.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-3.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-4.c  | 10 
 gcc/testsuite/gcc.target/arm/pac-5.c  | 28 ++
 gcc/testsuite/gcc.target/arm/pac-6.c  | 18 ++
 gcc/testsuite/gcc.target/arm/pac-7.c  | 32 +++
 gcc/testsuite/gcc.target/arm/pac-8.c  | 34 
 gcc/testsuite/gcc.target/arm/pac-9.c  | 11 
 gcc/testsuite/gcc.target/arm/pac.h| 17 ++
 17 files changed, 304 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-1.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-10.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-11.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-2.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-3.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-4.c
 create mode 100644 gcc/testsuite/gcc.target/arm/pac-5.c
 create mode 100644 gcc/testsuite/gcc.ta

Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-11 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

> On 09/01/2023 16:48, Richard Earnshaw via Gcc-patches wrote:
>> On 09/01/2023 14:58, Andrea Corallo via Gcc-patches wrote:
>>> Andrea Corallo via Gcc-patches  writes:
>>>
>>>> Richard Earnshaw  writes:
>>>>
>>>>> On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:
>>>>>>
>>>>>>> -Original Message-
>>>>>>> From: Andrea Corallo 
>>>>>>> Sent: Tuesday, September 27, 2022 11:06 AM
>>>>>>> To: Kyrylo Tkachov 
>>>>>>> Cc: Andrea Corallo via Gcc-patches ; Richard
>>>>>>> Earnshaw ; nd 
>>>>>>> Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA
>>>>>>> reg when
>>>>>>> popping if necessary
>>>>>>>
>>>>>>> Kyrylo Tkachov  writes:
>>>>>>>
>>>>>>>> Hi Andrea,
>>>>>>>>
>>>>>>>>> -Original Message-
>>>>>>>>> From: Gcc-patches >>>>>>>> bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
>>>>>>>>> Corallo via Gcc-patches
>>>>>>>>> Sent: Friday, August 12, 2022 4:34 PM
>>>>>>>>> To: Andrea Corallo via Gcc-patches 
>>>>>>>>> Cc: Richard Earnshaw ; nd 
>>>>>>>>> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
>>>>>>> popping
>>>>>>>>> if necessary
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
>>>>>>>>> pointer as CFA reg when popping if this is necessary.
>>>>>>>>>
>>>>>>>>
>>>>>>>>   From what I can tell from similar functions this is correct,
>>>>>>>> but could you
>>>>>>> elaborate on why this change is needed for my understanding please?
>>>>>>>> Thanks,
>>>>>>>> Kyrill
>>>>>>>
>>>>>>> Hi Kyrill,
>>>>>>>
>>>>>>> sure, if the frame pointer was set, then it is the current CFA
>>>>>>> register.
>>>>>>> If we request to adjust the current CFA register offset indicating it
>>>>>>> being SP (while it's actually FP) that is indeed not correct and the
>>>>>>> incoherence will be detected by an assertion in the dwarf emission
>>>>>>> machinery.
>>>>>> Thanks,  the patch is ok
>>>>>> Kyrill
>>>>>>
>>>>>>>
>>>>>>> Best Regards
>>>>>>>
>>>>>>>     Andrea
>>>>>
>>>>> Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?
>>>>
>>>> Hi Richard,
>>>>
>>>> not sure I understand, isn't any pop updating SP by definition?
>>>
>>>
>>> Back on this,
>>>
>>> compiling:
>>>
>>> ===
>>> int i;
>>>
>>> void foo (int);
>>>
>>> int bar()
>>> {
>>>    foo (i);
>>>    return 0;
>>> }
>>> ===
>>>
>>> With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf
>>> -mthumb -O0 -g
>>>
>>> Produces the following asm for bar.
>>>
>>> bar:
>>> @ args = 0, pretend = 0, frame = 0
>>> @ frame_needed = 1, uses_anonymous_args = 0
>>> pac    ip, lr, sp
>>> push    {r3, r7, ip, lr}
>>> add    r7, sp, #0
>>> ldr    r3, .L3
>>> ldr    r3, [r3]
>>> mov    r0, r3
>>> bl    foo
>>> movs    r3, #0
>>> mov    r0, r3
>>> pop    {r3, r7, ip, lr}
>>> aut    ip, lr, sp
>>> bx    lr
>>>
>>> The offending instruction causing the ICE (without this patch) when
>>> emitting dwarf is "pop {r3, r7, ip, lr}".
>>>
>>> The current CFA reg when emitting the multipop is R7 (the frame
>>> pointer).  If it is not the multipop that has the duty to restore SP
>>> as the current CFA here, which other instruction should do it?
>>>
>> Digging a bit deeper, I'm now even more confused. 
>> arm_expand_epilogue contains (paraphrasing the code):
>>   if frame_pointer_needed
>>     {
>>       if arm
>>         {}
>>       else
>>         {
>>           if adjust
>>             r7 += adjust
>>           mov sp, r7    // Reset CFA to SP
>>         }
>>     }
>> so there should always be a move of r7 into SP, even if this is
>> strictly redundant.  I don't understand why this doesn't happen for
>> your testcase.  Can you dig a bit deeper?  I wonder if we've
>> (probably incorrectly) assumed that this function doesn't need an
>> epilogue but can use a simple return?  I don't think we should do
>> that when authentication is needed: a simple return should really be
>> one instruction.
>> 
>
> So I strongly suspect the real problem here is that use_return_insn ()
> in arm.cc needs to be updated to return false when using pointer
> authentication.  The specification for this function says that a
> return can be done in one instruction; and clearly when we need
> authentication more than one is needed.
>
> R.

So yes I agree with your analysis.  I'm respinning 10/15 to include your
suggestion and I believe we can just drop this patch.
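
To make the agreed change concrete, here is a much-reduced model of
the predicate.  It is only a sketch under stated assumptions:
struct toy_function and toy_use_return_insn are hypothetical names, and
the real use_return_insn in arm.cc checks many more conditions than the
two modelled here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, much-reduced function state.  */
struct toy_function
{
  bool needs_stack_adjust;   /* Epilogue must move SP.  */
  bool pac_enabled;          /* Return address signing is active.  */
};

/* Nonzero only when the whole return fits in one instruction.  */
static int
toy_use_return_insn (const struct toy_function *fn)
{
  /* Authentication needs an AUT before the branch, so a signed
     function can never return with a single instruction.  */
  if (fn->pac_enabled)
    return 0;
  if (fn->needs_stack_adjust)
    return 0;
  return 1;
}
```

With the PAC check in place the full epilogue is expanded, which is
where the move of r7 into SP that Richard describes would be emitted.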

Thanks

  Andrea


Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary

2023-01-09 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Richard Earnshaw  writes:
>
>> On 27/09/2022 16:24, Kyrylo Tkachov via Gcc-patches wrote:
>>> 
>>>> -Original Message-
>>>> From: Andrea Corallo 
>>>> Sent: Tuesday, September 27, 2022 11:06 AM
>>>> To: Kyrylo Tkachov 
>>>> Cc: Andrea Corallo via Gcc-patches ; Richard
>>>> Earnshaw ; nd 
>>>> Subject: Re: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
>>>> popping if necessary
>>>>
>>>> Kyrylo Tkachov  writes:
>>>>
>>>>> Hi Andrea,
>>>>>
>>>>>> -Original Message-
>>>>>> From: Gcc-patches >>>>> bounces+kyrylo.tkachov=arm@gcc.gnu.org> On Behalf Of Andrea
>>>>>> Corallo via Gcc-patches
>>>>>> Sent: Friday, August 12, 2022 4:34 PM
>>>>>> To: Andrea Corallo via Gcc-patches 
>>>>>> Cc: Richard Earnshaw ; nd 
>>>>>> Subject: [PATCH 9/15] arm: Set again stack pointer as CFA reg when
>>>> popping
>>>>>> if necessary
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> this patch enables 'arm_emit_multi_reg_pop' to set again the stack
>>>>>> pointer as CFA reg when popping if this is necessary.
>>>>>>
>>>>>
>>>>> From what I can tell from similar functions this is correct, but could
>>>>> you elaborate on why this change is needed for my understanding please?
>>>>> Thanks,
>>>>> Kyrill
>>>>
>>>> Hi Kyrill,
>>>>
>>>> sure, if the frame pointer was set, then it is the current CFA register.
>>>> If we request to adjust the current CFA register offset indicating it
>>>> being SP (while it's actually FP) that is indeed not correct and the
>>>> incoherence will be detected by an assertion in the dwarf emission
>>>> machinery.
>>> Thanks,  the patch is ok
>>> Kyrill
>>> 
>>>>
>>>> Best Regards
>>>>
>>>>Andrea
>>
>> Hmm, wait.  Why would a multi-reg pop be updating the stack pointer?
>
> Hi Richard,
>
> not sure I understand, isn't any pop updating SP by definition?


Back on this,

compiling:

===
int i;

void foo (int);

int bar()
{
  foo (i);
  return 0;
}
===

With -march=armv8.1-m.main+fp -mbranch-protection=pac-ret+leaf -mthumb -O0 -g

Produces the following asm for bar.

bar:
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 1, uses_anonymous_args = 0
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        ldr     r3, .L3
        ldr     r3, [r3]
        mov     r0, r3
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The offending instruction causing the ICE (without this patch) when
emitting dwarf is "pop {r3, r7, ip, lr}".

The current CFA reg when emitting the multipop is R7 (the frame
pointer).  If it is not the multipop that has the duty to restore SP as
the current CFA here, which other instruction should do it?
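
To make the mismatch concrete, here is a minimal toy model in C.  It is
only a sketch under assumptions: the struct and function names are
hypothetical and are not the interfaces of the dwarf emission code; the
model only mirrors the rule that an offset adjustment must name the
register the CFA is currently based on.

```c
#include <assert.h>

/* Toy model only: register numbers follow the Thumb listing above
   (r7 is the frame pointer, r13 is SP); all names are hypothetical.  */
enum toy_reg { TOY_R7 = 7, TOY_SP = 13 };

struct toy_cfa
{
  enum toy_reg reg;   /* register the CFA is currently based on */
  int offset;         /* CFA = reg + offset */
};

/* Model of "adjust the CFA offset, assuming it is based on REG".
   Returns -1 when REG is not the current CFA register, which stands
   for the inconsistency the dwarf machinery asserts on.  */
int
toy_adjust_cfa_offset (struct toy_cfa *cfa, enum toy_reg reg, int delta)
{
  if (cfa->reg != reg)
    return -1;
  cfa->offset += delta;
  return 0;
}

/* Model of re-basing the CFA on another register before adjusting it.  */
void
toy_def_cfa (struct toy_cfa *cfa, enum toy_reg reg, int offset)
{
  assert (offset >= 0);
  cfa->reg = reg;
  cfa->offset = offset;
}
```

In this model the state before the pop is {TOY_R7, 16}: adjusting it as
if it were SP-based fails, while re-basing on SP first and then adjusting
by -16 succeeds and leaves the CFA at SP + 0.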

Best Regards

  Andrea


[PATCH 12/15 V5] arm: implement bti injection

2022-12-22 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

> On 14/12/2022 17:00, Richard Earnshaw via Gcc-patches wrote:
>> On 14/12/2022 16:40, Andrea Corallo via Gcc-patches wrote:
>>> Hi Richard,
>>>
>>> thanks for reviewing.
>>>
>>> Richard Earnshaw  writes:
>>>
>>>> On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:
>>>>> Hi all,
>>>>> please find attached the third iteration of this patch addressing
>>>>> review
>>>>> comments.
>>>>> Thanks
>>>>>     Andrea
>>>>>
>>>>
>>>> @@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
>>>>     return "";
>>>>   }
>>>>
>>>> -static bool
>>>> -aarch_bti_enabled ()
>>>> -{
>>>> -  return false;
>>>> -}
>>>> -
>>>>   /* Generate the prologue instructions for entry into an ARM or Thumb-2
>>>>  function.  */
>>>>   void
>>>> @@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
>>>>     && !crtl->is_leaf));
>>>>   }
>>>>
>>>> +/* Return TRUE if Branch Target Identification Mechanism is
>>>> enabled.  */
>>>> +bool
>>>> +aarch_bti_enabled (void)
>>>> +{
>>>> +  return aarch_enable_bti == 1;
>>>> +}
>>>>
>>>> See comment in earlier patch about the location of this function
>>>> moving.   Can aarch_enable_bti take values other than 0 and 1?
>>>
>>> Yes default is 2.
>> It shouldn't be by this point, because, hopefully you've gone
>> through the equivalent of this hunk (from aarch64) somewhere in
>> arm_override_options:
>>     if (aarch_enable_bti == 2)
>>   {
>>   #ifdef TARGET_ENABLE_BTI
>>     aarch_enable_bti = 1;
>>   #else
>>     aarch_enable_bti = 0;
>>   #endif
>>   }
>> And after this point the '2' should never be seen again.  We use
>> this trick to permit the user to force a default that differs from
>> the configuration.
>> However, I don't see a hunk to do this in patch 3, so perhaps that
>> needs updating to fix this.
>
> I've just remembered that the above is to support a configure-time
> option of the compiler to enable branch protection.  But perhaps we
> don't want to have that in AArch32, in which case it would be better
> not to have the default be 2 anyway, just default to off (0).
>
> R.

Done in 1/15 (needs approval again now).
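
For readers following the 0/1/2 discussion, a minimal sketch of the
tri-state resolution described above.  The names are hypothetical and
the configured default is passed as a parameter here, standing in for
TARGET_ENABLE_BTI being defined at configure time; this is an
illustration, not the code in the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the option values: 0 = off, 1 = on,
   2 = "not specified on the command line".  */
enum { TOY_BTI_OFF = 0, TOY_BTI_ON = 1, TOY_BTI_UNSET = 2 };

/* Resolve the tri-state value once, during option override.
   CONFIGURED_DEFAULT plays the role of TARGET_ENABLE_BTI.  */
int
toy_resolve_bti (int requested, int configured_default)
{
  assert (requested == TOY_BTI_OFF
          || requested == TOY_BTI_ON
          || requested == TOY_BTI_UNSET);
  if (requested == TOY_BTI_UNSET)
    return configured_default ? TOY_BTI_ON : TOY_BTI_OFF;
  return requested;
}

/* After resolution the value 2 is never seen, so a plain test works.  */
int
toy_bti_enabled (int resolved)
{
  return resolved == TOY_BTI_ON;
}
```

With the default being 0 on arm there is no unset state left to resolve,
so the "== 1" comparison in aarch_bti_enabled is sufficient.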

>>
>>> [...]
>>>
>>>> +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
>>>> UNSPEC_BTI_NOP;
>>>>
>>>> I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
>>>> separate enums in the backend, so UNSPEC_BTI_NOP should really be
>>>> VUNSPEC_BTI_NOP and defined in the enum "unspecv".
>>>
>>> Done
>>>
>>>> +aarch_pac_insn_p (rtx x)
>>>> +{
>>>> +  if (!x || !INSN_P (x))
>>>> +    return false;
>>>> +
>>>> +  rtx pat = PATTERN (x);
>>>> +
>>>> +  if (GET_CODE (pat) == SET)
>>>> +    {
>>>> +  rtx tmp = XEXP (pat, 1);
>>>> +  if (tmp
>>>> +  && GET_CODE (tmp) == UNSPEC
>>>> +  && (XINT (tmp, 1) == UNSPEC_PAC_NOP
>>>> +  || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
>>>> +    return true;
>>>> +    }
>>>> +
>>>>
>>>> This will also need updating (see review on earlier patch) because
>>>> PACBTI needs to be unspec_volatile, while PAC doesn't.
>>>
>>> Done
>>>
>>>> +/* The following two functions are for code compatibility with aarch64
>>>> +   code, this even if in arm we have only one bti instruction.  */
>>>> +
>>>>
>>>> I'd just write
>>>>   /* Target specific mapping for aarch_gen_bti_c and
>>>>   aarch_gen_bti_j. For Arm, both of these map to a simple BTI
>>>> instruction.  */
>>>
>>> Done
>>>
>>>>
>>>> @@ -162,6 +162,7 @@ (define_c_enum "unspec" [
>>>>     UNSPEC_PAC_NOP    ; Represents PAC signing LR
>>>>     UNSPEC_PACBTI_NOP    ; Represents PAC signing LR + valid landing pad
>>>>     UNSPEC_AUT_NOP    ; Represents P

[PATCH 1/15 V2] arm: Make mbranch-protection opts parsing common to AArch32/64

2022-12-22 Thread Andrea Corallo via Gcc-patches
Hi all,

respinning this as a rebase was necessary; it also now sets
'aarch_enable_bti' to zero by default for arm, as suggested during the
review of 12/15.

Best Regards

  Andrea


From 6c765818542cc7b40701e8adae2cbe077d5982cc Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Mon, 6 Dec 2021 11:34:35 +0100
Subject: [PATCH] [PATCH 1/15] arm: Make mbranch-protection opts parsing common
 to AArch32/64

Hi all,

This change refactors all the mbranch-protection option parsing code and
types to make it common to both AArch32 and AArch64 backends.

This change also pulls in some supporting types from AArch64 to make
it common (aarch_parse_opt_result).

The significant changes in this patch are the movement of all branch
protection parsing routines from aarch64.cc to aarch-common.cc and
supporting data types and static data structures.

This patch also pre-declares variables and types required in the
aarch32 back-end for moved variables for function sign scope and key
to prepare for the impending series of patches that support parsing
the feature mbranch-protection in the aarch32 back-end.

gcc/ChangeLog:

* common/config/aarch64/aarch64-common.cc: Include aarch-common.h.
(all_architectures): Fix comment.
(aarch64_parse_extension): Rename return type, enum value names.
* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins): Rename
factored out aarch_ra_sign_scope and aarch_ra_sign_key variables.
Also rename corresponding enum values.
* config/aarch64/aarch64-opts.h (aarch64_function_type): Factor
out aarch64_function_type and move it to common code as
aarch_function_type in aarch-common.h.
* config/aarch64/aarch64-protos.h: Include common types header,
move out types aarch64_parse_opt_result and aarch64_key_type to
aarch-common.h
* config/aarch64/aarch64.cc: Move mbranch-protection parsing types
and functions out into aarch-common.h and aarch-common.cc.  Fix up
all the name changes resulting from the move.
* config/aarch64/aarch64.md: Fix up aarch64_ra_sign_key type name change
and enum value.
* config/aarch64/aarch64.opt: Include aarch-common.h to import
type move.  Fix up name changes from factoring out common code and
data.
* config/arm/aarch-common-protos.h: Export factored out routines to both
backends.
* config/arm/aarch-common.cc: Include newly factored out types.
Move all mbranch-protection code and data structures from
aarch64.cc.
* config/arm/aarch-common.h: New header that declares types shared
between aarch32 and aarch64 backends.
* config/arm/arm-protos.h: Declare types and variables that are
made common to aarch64 and aarch32 backends - aarch_ra_sign_key,
aarch_ra_sign_scope and aarch_enable_bti.

Co-Authored-By: Tejas Belagod  
---
 gcc/common/config/aarch64/aarch64-common.cc |  13 +-
 gcc/config/aarch64/aarch64-c.cc |   8 +-
 gcc/config/aarch64/aarch64-opts.h   |  10 -
 gcc/config/aarch64/aarch64-protos.h |  21 +-
 gcc/config/aarch64/aarch64.cc   | 360 +---
 gcc/config/aarch64/aarch64.md   |   2 +-
 gcc/config/aarch64/aarch64.opt  |  15 +-
 gcc/config/arm/aarch-common-protos.h|   6 +
 gcc/config/arm/aarch-common.cc  | 185 ++
 gcc/config/arm/aarch-common.h   |  73 
 gcc/config/arm/arm-protos.h |   2 +
 gcc/config/arm/arm.cc   |   7 +
 gcc/config/arm/arm.opt  |   9 +
 13 files changed, 390 insertions(+), 321 deletions(-)
 create mode 100644 gcc/config/arm/aarch-common.h

diff --git a/gcc/common/config/aarch64/aarch64-common.cc b/gcc/common/config/aarch64/aarch64-common.cc
index 61007839d35..18b0b72c012 100644
--- a/gcc/common/config/aarch64/aarch64-common.cc
+++ b/gcc/common/config/aarch64/aarch64-common.cc
@@ -31,6 +31,7 @@
 #include "flags.h"
 #include "diagnostic.h"
 #include "config/aarch64/aarch64-feature-deps.h"
+#include "config/arm/aarch-common.h"
 
 #ifdef  TARGET_BIG_ENDIAN_DEFAULT
 #undef  TARGET_DEFAULT_TARGET_FLAGS
@@ -191,13 +192,13 @@ static constexpr arch_to_arch_name all_architectures[] =
 
 /* Parse the architecture extension string STR and update ISA_FLAGS
with the architecture features turned on or off.  Return a
-   aarch64_parse_opt_result describing the result.
+   aarch_parse_opt_result describing the result.
When the STR string contains an invalid extension,
a copy of the string is created and stored to INVALID_EXTENSION.  */
 
-enum aarch64_parse_opt_result
+enum aarch_parse_opt_result
 aarch64_parse_extension (const char *str, aarch64_feature_flags *isa_flags,
-std::string *invalid_extension)
+ std::string *invalid_extension)

[PATCH 12/15 V4] arm: implement bti injection

2022-12-14 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:

> On 28/10/2022 17:40, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> please find attached the third iteration of this patch addressing
>> review
>> comments.
>> Thanks
>>Andrea
>> 
>
> @@ -23374,12 +23374,6 @@ output_probe_stack_range (rtx reg1, rtx reg2)
>return "";
>  }
>
> -static bool
> -aarch_bti_enabled ()
> -{
> -  return false;
> -}
> -
>  /* Generate the prologue instructions for entry into an ARM or Thumb-2
> function.  */
>  void
> @@ -32992,6 +32986,61 @@ arm_current_function_pac_enabled_p (void)
>&& !crtl->is_leaf));
>  }
>
> +/* Return TRUE if Branch Target Identification Mechanism is enabled.  */
> +bool
> +aarch_bti_enabled (void)
> +{
> +  return aarch_enable_bti == 1;
> +}
>
> See comment in earlier patch about the location of this function
> moving.   Can aarch_enable_bti take values other than 0 and 1?

Yes default is 2.

[...]

> +  return GET_CODE (pat) == UNSPEC_VOLATILE && XINT (pat, 1) ==
> UNSPEC_BTI_NOP;
>
> I'm not sure where this crept in, but UNSPEC and UNSPEC_VOLATILE have
> separate enums in the backend, so UNSPEC_BTI_NOP should really be
> VUNSPEC_BTI_NOP and defined in the enum "unspecv".

Done

> +aarch_pac_insn_p (rtx x)
> +{
> +  if (!x || !INSN_P (x))
> +return false;
> +
> +  rtx pat = PATTERN (x);
> +
> +  if (GET_CODE (pat) == SET)
> +{
> +  rtx tmp = XEXP (pat, 1);
> +  if (tmp
> +   && GET_CODE (tmp) == UNSPEC
> +   && (XINT (tmp, 1) == UNSPEC_PAC_NOP
> +   || XINT (tmp, 1) == UNSPEC_PACBTI_NOP))
> + return true;
> +}
> +
>
> This will also need updating (see review on earlier patch) because
> PACBTI needs to be unspec_volatile, while PAC doesn't.

Done

> +/* The following two functions are for code compatibility with aarch64
> +   code, this even if in arm we have only one bti instruction.  */
> +
>
> I'd just write
>  /* Target specific mapping for aarch_gen_bti_c and
>  aarch_gen_bti_j. For Arm, both of these map to a simple BTI
> instruction.  */

Done

>
> @@ -162,6 +162,7 @@ (define_c_enum "unspec" [
>UNSPEC_PAC_NOP ; Represents PAC signing LR
>UNSPEC_PACBTI_NOP  ; Represents PAC signing LR + valid landing pad
>UNSPEC_AUT_NOP ; Represents PAC verifying LR
> +  UNSPEC_BTI_NOP ; Represent BTI
>  ])
>
> BTI is an unspec volatile, so this should be in the "vunspec" enum and
> renamed accordingly (see above).

Done.

Please find attached the updated version of this patch.

BR

  Andrea

From 582b5e4e4fe089f6865cc3e0360afd1ff168 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 7 Apr 2022 11:51:56 +0200
Subject: [PATCH] [PATCH 12/15] arm: implement bti injection

Hi all,

this patch enables the Armv8.1-M Branch Target Identification
mechanism [1].

This is achieved by using the bti pass made common with AArch64.

The pass iterates through the instructions and adds the necessary BTI
instructions at the beginning of every function and at every landing
pad targeted by indirect jumps.

Best Regards

  Andrea

[1]
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>

gcc/ChangeLog

2022-04-07  Andrea Corallo  

* config.gcc (arm*-*-*): Add 'aarch-bti-insert.o' object.
* config/arm/arm-protos.h: Update.
* config/arm/arm.cc (aarch_bti_enabled): Update.
(aarch_bti_j_insn_p, aarch_pac_insn_p, aarch_gen_bti_c)
(aarch_gen_bti_j): New functions.
* config/arm/arm.md (bti_nop): New insn.
* config/arm/t-arm (PASSES_EXTRA): Add 'arm-passes.def'.
(aarch-bti-insert.o): New target.
* config/arm/unspecs.md (VUNSPEC_BTI_NOP): New unspec.
* config/arm/aarch-bti-insert.cc (rest_of_insert_bti): Verify arch
compatibility.
* config/arm/arm-passes.def: New file.

gcc/testsuite/ChangeLog

2022-04-07  Andrea Corallo  

* gcc.target/arm/bti-1.c: New testcase.
* gcc.target/arm/bti-2.c: Likewise.
---
 gcc/config.gcc   |  2 +-
 gcc/config/arm/arm-passes.def| 21 ++
 gcc/config/arm/arm-protos.h  |  2 +
 gcc/config/arm/arm.cc| 53 -
 gcc/config/arm/arm.md|  7 
 gcc/config/arm/t-arm | 10 +
 gcc/config/arm/unspecs.md|  1 +
 gcc/testsuite/gcc.target/arm/bti-1.c | 12 ++
 gcc/testsuite/gcc.target/arm/bti-2.c | 58 
 9 files changed, 163 insertions(+), 3 deletions(-)
 create m

[PATCH 10/15 V6] arm: Implement cortex-M return signing address codegen

2022-12-14 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

[...]

>
> +  if (TARGET_TPCS_FRAME)
> +error ("Return address signing and %<-mtpcs-frame%> are
> incompatible.");
>
> So really this is 'not implemented' rather than not compatible - I
> don't see why we couldn't implement this if we really wanted to.  It's
> not worth implementing it because tpcs-frames are very much legacy
> these days.
>
> So the message should use sorry() and say 'is not supported' rather
> than 'are incompatible'.
>
> +(define_insn "pacbti_nop"
> +  [(set (reg:SI IP_REGNUM)
> + (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +VUNSPEC_PACBTI_NOP))]
>
> No, this needs to be unspec_volatile, not unspec.
>
> +(define_insn "aut_nop"
> +  [(unspec:SI [(reg:SI IP_REGNUM) (reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +   VUNSPEC_AUT_NOP)]
>
> Similarly.
>
> R.


Hi Richard & all,

please find attached the updated patch implementing suggestions.

BR

  Andrea

From adabef75c4af91865b0639243d6d9aa03bf8ad68 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 20 Jan 2022 15:36:23 +0100
Subject: [PATCH] [PATCH 10/15] arm: Implement cortex-M return signing address
 codegen

Hi all,

this patch enables return address signing and verification based on
Armv8.1-M Pointer Authentication [1].

To sign the return address, we use the PAC R12, LR, SP instruction
upon function entry.  This is signing LR using SP and storing the
result in R12.  R12 will be pushed into the stack.

During function epilogue R12 will be popped and AUT R12, LR, SP will
be used to verify that the content of LR is still valid before return.
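
As a rough illustration of the sign/verify pairing, here is a toy model
in C.  It is a sketch under loud assumptions: the mixing function is an
arbitrary integer hash chosen for the example and is NOT the algorithm
the hardware uses, and the key parameter and all names are hypothetical.
It only shows that the code computed from LR and SP at entry must match
the one recomputed from LR and SP before return.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for PAC R12, LR, SP: derive a code from the return
   address, the stack pointer and a key.  Arbitrary mixing, for
   illustration only.  */
uint32_t
toy_pac (uint32_t lr, uint32_t sp, uint32_t key)
{
  uint32_t x = lr ^ (sp * 2654435761u) ^ key;
  x ^= x >> 16;
  x *= 0x45d9f3bu;
  x ^= x >> 16;
  return x;
}

/* Toy stand-in for AUT R12, LR, SP: recompute the code and compare it
   with the value that was saved on the stack.  Returns nonzero when
   the return address is still valid.  */
int
toy_aut (uint32_t code, uint32_t lr, uint32_t sp, uint32_t key)
{
  return toy_pac (lr, sp, key) == code;
}

/* A code produced at entry verifies against the same LR and SP.  */
void
toy_roundtrip (uint32_t lr, uint32_t sp, uint32_t key)
{
  uint32_t code = toy_pac (lr, sp, key);
  assert (toy_aut (code, lr, sp, key));
}
```

In the toy model, a return address overwritten between entry and return
no longer matches the saved code, which is the property the real
instructions provide for LR.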

Here is an example of a PAC-instrumented function prologue and epilogue:

void foo (void);

int main()
{
  foo ();
  return 0;
}

Compiled with '-march=armv8.1-m.main -mbranch-protection=pac-ret
-mthumb' translates into:

main:
        pac     ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

The patch also takes care of generating a PACBTI instruction in place
of the sequence BTI+PAC when Branch Target Identification is enabled
at the same time.

Ex. the previous example compiled with '-march=armv8.1-m.main
-mbranch-protection=pac-ret+bti -mthumb' translates into:

main:
        pacbti  ip, lr, sp
        push    {r3, r7, ip, lr}
        add     r7, sp, #0
        bl      foo
        movs    r3, #0
        mov     r0, r3
        pop     {r3, r7, ip, lr}
        aut     ip, lr, sp
        bx      lr

As part of previous upstream suggestions a test for varargs has been
added and '-mtpcs-frame' is deemed incompatible with the return
address signing feature being introduced.

[1] 
<https://community.arm.com/developer/ip-products/processors/b/processors-ip-blog/posts/armv8-1-m-pointer-authentication-and-branch-target-identification-extension>

gcc/Changelog

2021-11-03  Andrea Corallo  

* config/arm/arm.h (arm_arch8m_main): Declare it.
* config/arm/arm.cc (arm_arch8m_main): Define it.
(arm_option_reconfigure_globals): Set arm_arch8m_main.
(arm_compute_frame_layout, arm_expand_prologue)
(thumb2_expand_return, arm_expand_epilogue)
(arm_conditional_register_usage): Update for pac codegen.
(arm_current_function_pac_enabled_p): New function.
(aarch_bti_enabled): New function.
* config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
Add new patterns.
* config/arm/unspecs.md (UNSPEC_PAC_NOP)
(VUNSPEC_PACBTI_NOP, VUNSPEC_AUT_NOP): Add unspecs.

gcc/testsuite/Changelog

2021-11-03  Andrea Corallo  

* gcc.target/arm/pac.h : New file.
* gcc.target/arm/pac-1.c : New test case.
* gcc.target/arm/pac-2.c : Likewise.
* gcc.target/arm/pac-3.c : Likewise.
* gcc.target/arm/pac-4.c : Likewise.
* gcc.target/arm/pac-5.c : Likewise.
* gcc.target/arm/pac-6.c : Likewise.
* gcc.target/arm/pac-7.c : Likewise.
* gcc.target/arm/pac-8.c : Likewise.
* gcc.target/arm/pac-9.c : Likewise.
* gcc.target/arm/pac-10.c : Likewise.
* gcc.target/arm/pac-11.c : Likewise.
---
 gcc/config/arm/arm-protos.h   |  1 +
 gcc/config/arm/arm.cc | 74 +++
 gcc/config/arm/arm.h  |  4 ++
 gcc/config/arm/arm.md | 23 +
 gcc/config/arm/unspecs.md |  3 ++
 gcc/testsuite/gcc.target/arm/pac-1.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-10.c | 10 
 gcc/testsuite/gcc.target/arm/pac-11.c | 10 
 gcc/testsuite/gcc.target/arm/pac-2.c  | 11 
 gcc/testsuite/gcc.target/arm/pac-3.c  | 11 
 gcc/testsuite/gcc.tar

Re: [PATCH 9/12 V2] arm: Make libgcc bti compatible

2022-12-12 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

> On 22/07/2022 16:09, Andrea Corallo via Gcc-patches wrote:
>> Richard Earnshaw  writes:
>> 
>>> On 21/07/2022 10:17, Andrea Corallo via Gcc-patches wrote:
>>>> Richard Earnshaw  writes:
>>>>
>>>>> On 28/04/2022 10:48, Andrea Corallo via Gcc-patches wrote:
>>>>>> This change adds bti instructions at the beginning of arm specific
>>>>>> libgcc hand written assembly routines.
>>>>>> 2022-03-31  Andrea Corallo  
>>>>>>  * libgcc/config/arm/crti.S (FUNC_START): Add bti instruction
>>>>>> if
>>>>>>  necessary.
>>>>>>  * libgcc/config/arm/lib1funcs.S (THUMB_FUNC_START, FUNC_START):
>>>>>>  Likewise.
>>>>>>
>>>>>
>>>>> +#if defined(__ARM_FEATURE_BTI)
>>>>>
>>>>> Wouldn't it be better to use __ARM_FEATURE_BTI_DEFAULT?  That way we
>>>>> only get BTI instructions in multilib variants that have asked for
>>>>> BTI.
>>>>>
>>>>> R.
>>>> Hi Richard,
>>>> good point, yes I think so.
>>>> Please find attached the updated patch.
>>>> BR
>>>> Andrea
>>>>
>>>
>>> I've been pondering this patch.  The way it is implemented would put a
>>> BTI instruction at the start of every assembler routine in libgcc.
>>> But the vast majority of functions in libgcc cannot have their address
>>> taken, so a BTI isn't needed (BTI is only needed when an indirect jump
>>> could be used).  So I wonder if we really need to do this so
>>> aggressively?
>>>
>>> Perhaps a better approach would be to define a macro (eg MAYBEBTI)
>>> which expands a BTI if the compilation requires it and nothing
>>> otherwise), and then manually insert that in any functions that really
>>> need this (if any).
>> I guess the main downside of this approach would be the maintenance
>> burden, we'll have to remember forever that every time an asm function
>> is called by function pointer we have to add the bti landing pad
>> manually, otherwise this will be broken when pacbti is enabled. WDYT?
>> If we want to go this way I'll start reworking the patch in this
>> direction (tho this might not be trivial).
>> 
>
> Yes, it's a trade-off.  The lazy way, however, costs all users even if
> a function is never addressed (which I think is the case for
> practically all functions in libgcc).
>
> So I think in this case it's worth taking that extra development pain.
>
> R.

As a late follow-up to this.

I believe there are no hand written asm functions in libgcc that are
addressed, so this patch was dropped from the series in the following
iteration.  It is true that we could pac instrument them but ATM we
don't.
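
For reference, a sketch in C of the MAYBEBTI idea discussed above, had it
been needed.  __ARM_FEATURE_BTI_DEFAULT is the macro named in the review;
MAYBE_BTI, TOY_HAS_BTI and toy_entry_sequence are hypothetical names made
up for this example, which builds the entry text as a string instead of
being real assembler macro code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Expand to a BTI only when the compilation asked for it.  */
#if defined(__ARM_FEATURE_BTI_DEFAULT)
# define TOY_HAS_BTI 1
# define MAYBE_BTI "\tbti\n"
#else
# define TOY_HAS_BTI 0
# define MAYBE_BTI ""
#endif

/* Write the entry sequence of an address-taken routine NAME into BUF:
   the label, followed by the landing pad if one is required.  */
void
toy_entry_sequence (char *buf, size_t len, const char *name)
{
  int n = snprintf (buf, len, "%s:\n%s", name, MAYBE_BTI);
  assert (n >= 0 && (size_t) n < len);
}
```

Only routines whose address can be taken would use the macro; every other
routine keeps its entry unchanged and pays nothing.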

  Andrea


[PATCH 10/15 V5] arm: Implement cortex-M return signing address codegen

2022-12-09 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing.

Richard Earnshaw  writes:

> On 07/11/2022 08:57, Andrea Corallo via Gcc-patches wrote:
>> Hi all,
>> please find attached the latest version of this patch incorporating
>> some
>> more improvements.  Feel free to ignore V3.
>> Best Regards
>>Andrea
>> 
>
>> As part of previous upstream suggestions a test for varargs has been
>> added and '-mtpcs-frame' is deemed being incompatible with this return
>> signing address feature being introduced.
>
> I don't see any check for the tpcs-frame incompatibility?  What
> happens if a user does combine the options?

Check added.

> gcc/Changelog
>
> 2021-11-03  Andrea Corallo  
>
>   * config/arm/arm.h (arm_arch8m_main): Declare it.
>   * config/arm/arm.cc (arm_arch8m_main): Define it.
>   (arm_option_reconfigure_globals): Set arm_arch8m_main.
>   (arm_compute_frame_layout, arm_expand_prologue)
>   (thumb2_expand_return, arm_expand_epilogue)
>   (arm_conditional_register_usage): Update for pac codegen.
>   (arm_current_function_pac_enabled_p): New function.
>   * config/arm/arm.md (pac_ip_lr_sp, pacbti_ip_lr_sp, aut_ip_lr_sp):
>   Add new patterns.
>   * config/arm/unspecs.md (UNSPEC_PAC_IP_LR_SP)
>   (UNSPEC_PACBTI_IP_LR_SP, UNSPEC_AUT_IP_LR_SP): Add unspecs.
>
> You're missing an entry for aarch_bti_enabled () - yes I realize
> that's just a placeholder at present and will be fully defined in
> patch 12.

Fixed

> +static bool
> +aarch_bti_enabled ()
> +{
> +  return false;
> +}
> +
>
> No comment on this function (and in patch 12 it moves to a different
> location).  It would be best to have it in the right place at this
> point in time.
>
> +  clobber_ip = (IS_NESTED (func_type)
> +&& (((TARGET_APCS_FRAME && frame_pointer_needed &&
> TARGET_ARM)
> + || ((flag_stack_check == STATIC_BUILTIN_STACK_CHECK
> +  || flag_stack_clash_protection)
> + && !df_regs_ever_live_p (LR_REGNUM)
> + && arm_r3_live_at_start_p ()))
> +|| (arm_current_function_pac_enabled_p ())));
>
> Redundant parenthesis around arm_current_function_pac_enabled_p () call.

Fixed

> +   gcc_assert(arm_compute_static_chain_stack_bytes() == 4
> + || arm_current_function_pac_enabled_p ());
>
> I wonder if this assert is now really serving a useful purpose.  I'd
> consider removing it.

Removed

> @@ -27309,7 +27340,7 @@ thumb2_expand_return (bool simple_return)
>to assert it for now to ensure that future code changes do not silently
>change this behavior.  */
>gcc_assert (!IS_CMSE_ENTRY (arm_current_func_type ()));
> -  if (num_regs == 1)
> +  if (num_regs == 1 && !arm_current_function_pac_enabled_p ())
>  {
>rtx par = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (2));
>rtx reg = gen_rtx_REG (SImode, PC_REGNUM);
> @@ -27324,10 +27355,20 @@ thumb2_expand_return (bool simple_return)
>  }
>else
>  {
> -  saved_regs_mask &= ~ (1 << LR_REGNUM);
> -  saved_regs_mask |=   (1 << PC_REGNUM);
> -  arm_emit_multi_reg_pop (saved_regs_mask);
> -}
> +   if (arm_current_function_pac_enabled_p ())
> + {
> +   gcc_assert (!(saved_regs_mask & (1 << PC_REGNUM)));
> +   arm_emit_multi_reg_pop (saved_regs_mask);
> +   emit_insn (gen_aut_nop ());
> +   emit_jump_insn (simple_return_rtx);
> + }
> +   else
> + {
> +   saved_regs_mask &= ~ (1 << LR_REGNUM);
> +   saved_regs_mask |=   (1 << PC_REGNUM);
> +   arm_emit_multi_reg_pop (saved_regs_mask);
> + }
> + }
>  }
>else
>
> The logic for these blocks would, I think, be better expressed as
>
>if (pac_enabled)
>...
>else if (num_regs == 1)
>  ...  // existing code
>else
>  ...  // existing code

Done
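
A compact sketch of the suggested three-way structure, in C.  The enum
and function names are hypothetical; the function only classifies which
return sequence applies and does not emit anything, so it is an
illustration of the control flow and not the code in arm.cc.

```c
#include <assert.h>

/* Hypothetical labels for the three return sequences.  */
enum toy_return_kind
{
  TOY_RET_PAC,        /* pop saved regs, aut, then a simple return */
  TOY_RET_SINGLE_POP, /* existing code: one register popped into PC */
  TOY_RET_MULTI_POP   /* existing code: multi-reg pop with PC for LR */
};

/* PAC is tested first, so the two existing branches stay untouched.  */
enum toy_return_kind
toy_choose_return (int pac_enabled, int num_regs)
{
  assert (num_regs >= 0);
  if (pac_enabled)
    return TOY_RET_PAC;
  else if (num_regs == 1)
    return TOY_RET_SINGLE_POP;
  else
    return TOY_RET_MULTI_POP;
}
```

Testing PAC first means the num_regs == 1 shortcut can never be taken
when authentication is needed, without repeating the PAC check in each
branch.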

> Also, I think (out of an abundance of caution) we really need a
> scheduling barrier placed before calls to gen_aut_nop() pattern is
> emitted, to ensure that the scheduler never tries to move this
> instruction away from the position we place it.  Use gen_blockage()
> for that (see TARGET_SCHED_PROLOG).  Alternatively, we could make the
> UNSPEC_PAC_NOP an unspec_volatile, which has the same effect (IIRC)
> without needing an additional insn - if you use this approach, then
> please make sure this is explained in a comment.
>

Re: [PATCH] arm: fix mve intrinsics scan body tests for C++

2022-12-08 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

> Hi Andrea,
>
>> -----Original Message-----
>> From: Andrea Corallo 
>> Sent: Wednesday, December 7, 2022 3:03 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; Andrea Corallo 
>> Subject: [PATCH] arm: fix mve intrinsics scan body tests for C++
>> 
>> Hi all,
>> 
>> this patch is to export the functions defined in these MVE tests as C
>> so the body scan assembler works as expected also for our C++ tests.
>> 
>> Best Regards and sorry for the regression!
>
> Ok.
> Thanks,
> Kyrill

Thanks,

into trunk as 8d4f007398b.

Regards

  Andrea


Re: [PATCH] arm: fix mve intrinsics scan body tests for C++

2022-12-08 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

> Hi Andrea,
>
>> -----Original Message-----
>> From: Andrea Corallo 
>> Sent: Wednesday, December 7, 2022 3:03 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; Andrea Corallo 
>> Subject: [PATCH] arm: fix mve intrinsics scan body tests for C++
>> 
>> Hi all,
>> 
>> this patch is to export the functions defined in these MVE tests as C
>> so the body scan assembler works as expected also for our C++ tests.
>> 
>> Best Regards and sorry for the regression!
>
> Ok.
> Thanks,
> Kyrill

Thanks, attaching the original patch compressed; the original is
still stuck in moderator review (more than 400KB).

  Andrea



0001-arm-fix-mve-intrinsics-scan-body-tests-for-C.patch.gz
Description: application/gzip


Re: [PATCH 10/15 V4] arm: Implement cortex-M return signing address codegen

2022-12-06 Thread Andrea Corallo via Gcc-patches
Richard Earnshaw  writes:

> On 06/12/2022 15:46, Andrea Corallo wrote:
>> Hi Richard,
>> thanks for reviewing.
>> Just one clarification before I complete the respin of this patch.
>> Richard Earnshaw  writes:
>> [...]
>> 
>>> Also, I think (out of an abundance of caution) we really need a
>>> scheduling barrier placed before calls to gen_aut_nop() pattern is
>>> emitted, to ensure that the scheduler never tries to move this
>>> instruction away from the position we place it.  Use gen_blockage()
>>> for that (see TARGET_SCHED_PROLOG).  Alternatively, we could make the
>>> UNSPEC_PAC_NOP an unspec_volatile, which has the same effect (IIRC)
>>> without needing an additional insn - if you use this approach, then
>>> please make sure this is explained in a comment.
>>>
>>> +(define_insn "pacbti_nop"
>>> +  [(set (reg:SI IP_REGNUM)
>>> +   (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
>>> +  UNSPEC_PACBTI_NOP))]
>>> +  "arm_arch8m_main"
>>> +  "pacbti\t%|ip, %|lr, %|sp"
>>> +  [(set_attr "conds" "unconditional")])
>>>
>>> The additional side-effect of this being a BTI landing pad means that
>>> we mustn't move any other instruction before it.  So I think this
>>> needs to be an unspec_volatile as well.
>> IIUC from this we want to make all the three (UNSPEC_PAC_NOP,
>> UNSPEC_PACBTI_NOP, UNSPEC_AUT_NOP) unspec volatile, correct?
>
> UNSPEC_PAC_NOP doesn't need to be volatile. The register constraints
> will be enough to ensure it is run before any instruction that
> consumes the result it produces.
>
> UNSPEC_PACBTI_NOP needs to be volatile, as it's essential that when
> we have an instruction (for example ldr r3, [r3]) early in the program
> that doesn't interact with the prologue then it cannot be migrated
> before the BTI as the BTI is a landing pad and must be the first
> instruction in the function.  This is why UNSPEC_BTI_NOP is volatile.
>
> UNSPEC_AUT_NOP must be volatile because we want to ensure that no
> instruction is moved after this one and before the return as that
> might expose a ROP gadget to hackers.
>
> R.

Understood now, thanks.

  Andrea


Re: [PATCH 10/15 V4] arm: Implement cortex-M return signing address codegen

2022-12-06 Thread Andrea Corallo via Gcc-patches
Hi Richard,

thanks for reviewing.

Just one clarification before I complete the respin of this patch.

Richard Earnshaw  writes:

[...]

> Also, I think (out of an abundance of caution) we really need a
> scheduling barrier placed before calls to gen_aut_nop() pattern is
> emitted, to ensure that the scheduler never tries to move this
> instruction away from the position we place it.  Use gen_blockage()
> for that (see TARGET_SCHED_PROLOG).  Alternatively, we could make the
> UNSPEC_PAC_NOP an unspec_volatile, which has the same effect (IIRC)
> without needing an additional insn - if you use this approach, then
> please make sure this is explained in a comment.
>
> +(define_insn "pacbti_nop"
> +  [(set (reg:SI IP_REGNUM)
> + (unspec:SI [(reg:SI SP_REGNUM) (reg:SI LR_REGNUM)]
> +UNSPEC_PACBTI_NOP))]
> +  "arm_arch8m_main"
> +  "pacbti\t%|ip, %|lr, %|sp"
> +  [(set_attr "conds" "unconditional")])
>
> The additional side-effect of this being a BTI landing pad means that
> we mustn't move any other instruction before it.  So I think this
> needs to be an unspec_volatile as well.

IIUC from this we want to make all the three (UNSPEC_PAC_NOP,
UNSPEC_PACBTI_NOP, UNSPEC_AUT_NOP) unspec volatile, correct?

IIUC the scheduler should not reorder them, as we have expressed which
registers they consume and produce, but this is for extra caution,
correct?

> On the tests, they are OK as they stand, but we lack anything that
> will be tested when suitable hardware is unavailable (all tests are
> "dg-do run").  Can we please have some compile-only tests as well?

Ack.

BR

  Andrea


Re: [PING][PATCH 0/15] arm: Enables return address verification and branch target identification on Cortex-M

2022-12-05 Thread Andrea Corallo via Gcc-patches
Andrea Corallo via Gcc-patches  writes:

> Hi all,
>
> ping^2 for patches 9/15 7/15 11/15 12/15 and 10/15 V2 of this series.
>
>   Andrea

Hello all,

PING^3 for:
[PATCH 6/12 V2] arm: Add pointer authentication for stack-unwinding runtime
[PATCH 9/15] arm: Set again stack pointer as CFA reg when popping if necessary
[PATCH 10/15 V4] arm: Implement cortex-M return signing address codegen
[PATCH 12/15 V3] arm: implement bti injection

which I believe are still pending for review.

Other option would be to declare the arm backend as unmaintained.

Thanks

  Andrea


Re: [PATCH 00/35] arm: rework MVE testsuite and rework backend where necessary (1st chunk)

2022-11-28 Thread Andrea Corallo via Gcc-patches
Andrea Corallo  writes:

> Hi all,
>
> this is the first patch series about improving the current MVE
> implementation and testsuite for:
>
> - Complete intrinsic implementation and coverage (the list of intrinsics is
>   specified by [1])
> - Verifying all instructions supposedly emitted by each intrinsic
> - Verifying register usage
> - Fixing the current scan assemblers to really match the wanted mnemonics
> - Verifying no external calls are emitted
>
> This series fixes the backend where necessary.
>
> Best Regards
>
>   Andrea

Hi Kyrill,

thanks for reviewing the series!

With the requested changes, this is now in trunk as of f2b54e5b796.

Best Regards

  Andrea



[PATCH 35/35 V2] arm: improve tests for vsetq_lane*

2022-11-24 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

[...]

>> diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
>> index e03e9620528..b5c9f4d5eb8 100644
>> --- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
>> +++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
>> @@ -1,15 +1,45 @@
>> -/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
>>  /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
>>  /* { dg-add-options arm_v8_1m_mve_fp } */
>>  /* { dg-additional-options "-O2" } */
>> +/* { dg-final { check-function-bodies "**" "" } } */
>> 
>>  #include "arm_mve.h"
>> 
>> +/*
>> +**foo:
>> +**  ...
>> +**  vmov.16 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
>> +**  ...
>> +*/
>>  float16x8_t
>>  foo (float16_t a, float16x8_t b)
>>  {
>> -return vsetq_lane_f16 (a, b, 0);
>> +  return vsetq_lane_f16 (a, b, 1);
>>  }
>> 
>
> Hmm, for these tests we should be able to scan for more specific codegen as 
> we're setting individual lanes, so we should be able to scan for lane 1 in 
> the vmov instruction, though it may need to be flipped for big-endian.
> Thanks,
> Kyrill

Hi Kyrill,

please find attached the updated version of this patch.

Big-endian should not be a problem since, to my understanding, it is
simply not supported with MVE intrinsics.
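
To make the difference between the two versions of the pattern concrete,
here is a minimal host-side sketch (not part of the patch).  It uses POSIX
extended regexps as a stand-in for the Tcl regexps that
check-function-bodies really uses, so the non-capturing groups and the
trailing comment part of the pattern are dropped.  The names line_matches,
any_lane and lane_one are hypothetical and exist only for this illustration.

```c
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches PATTERN (POSIX extended regexp), 0 if it does
   not, and -1 if PATTERN fails to compile.  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  int rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* First version of the test: any lane index is accepted.  */
const char *const any_lane
  = "^vmov\\.16[ \t]+q[0-9]+\\[[0-9]+\\], (ip|fp|r[0-9]+)$";

/* Updated version: only lane 1 is accepted.  */
const char *const lane_one
  = "^vmov\\.16[ \t]+q[0-9]+\\[1\\], (ip|fp|r[0-9]+)$";
```

With these definitions, an instruction writing lane 0 is accepted by
any_lane but rejected by lane_one, which is the extra precision asked for
in the review.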

Thanks!

  Andrea

From 79f2c990553a1f793e08b9a0c4abb7dae8de7120 Mon Sep 17 00:00:00 2001
From: Andrea Corallo 
Date: Thu, 17 Nov 2022 11:06:29 +0100
Subject: [PATCH] arm: improve tests for vsetq_lane*

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vsetq_lane_f16.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_f32.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_s16.c   | 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_s32.c   | 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_s64.c   | 27 ++---
 .../arm/mve/intrinsics/vsetq_lane_s8.c| 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_u16.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_u32.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_u64.c   | 39 ---
 .../arm/mve/intrinsics/vsetq_lane_u8.c| 36 +++--
 10 files changed, 284 insertions(+), 34 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
index e03e9620528..6b148a4b03d 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
@@ -1,15 +1,45 @@
-/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov.16 q[0-9]+\[1\], (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 float16x8_t
 foo (float16_t a, float16x8_t b)
 {
-return vsetq_lane_f16 (a, b, 0);
+  return vsetq_lane_f16 (a, b, 1);
 }
 
-/* { dg-final { scan-assembler "vmov.16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmov.16 q[0-9]+\[1\], (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
+float16x8_t
+foo1 (float16_t a, float16x8_t b)
+{
+  return vsetq_lane (a, b, 1);
+}
+
+/*
+**foo2:
+** ...
+** vmov.16 q[0-9]+\[1\], (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
+float16x8_t
+foo2 (float16x8_t b)
+{
+  return vsetq_lane (1.1, b, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
index 2b9f1a7e627..e4e7f892e97 100644
--- a/gcc/testsuite/gcc.target/arm/mve/

Re: [PATCH 16/35] arm: Add integer vector overloading of vsubq_x instrinsic

2022-11-22 Thread Andrea Corallo via Gcc-patches
Christophe Lyon  writes:

> On 11/17/22 17:37, Andrea Corallo via Gcc-patches wrote:
>> From: Stam Markianos-Wright 
>> In the past we had only defined the vsubq_x generic overload of the
>> vsubq_x_* intrinsics for float vector types.  This would cause them
>> to fall back to the `__ARM_undef` failure state if they were called
>> through the generic version.
>> This patch simply adds these overloads.
>> gcc/ChangeLog:
>>  * config/arm/arm_mve.h (__arm_vsubq_x FP): New overloads.
>>   (__arm_vsubq_x Integer): New.
>
> Hi Stam,
>
> To hopefully help Kyrill in the review, I think this fix is tested by
> patch #19, where we now have
> +/* { dg-final { scan-assembler-not "__ARM_undef" } } */
> (this line explains why this bug was not noticed so far)
>
> Thanks,
>
> Christophe

Exactly

PS: the fact that the tests now use 'check-function-bodies' should also
catch that.

Thanks

  Andrea


Re: [PATCH 10/35] arm: improve tests for vabavq*

2022-11-21 Thread Andrea Corallo via Gcc-patches
Kyrylo Tkachov  writes:

>> -Original Message-
>> From: Andrea Corallo 
>> Sent: Thursday, November 17, 2022 4:38 PM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Kyrylo Tkachov ; Richard Earnshaw
>> ; Andrea Corallo 
>> Subject: [PATCH 10/35] arm: improve tests for vabavq*
>> 
>> gcc/testsuite/ChangeLog:
>> 
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_s16.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_s32.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_s8.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_u16.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_u32.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_p_u8.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_s16.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_s32.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_s8.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_u16.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_u32.c:
>>  * gcc.target/arm/mve/intrinsics/vabavq_u8.c:
>
> Missing ChangeLog text?
> Ok with ChangeLog fixed.

Oops, sorry!

Thanks

  Andrea
  


[PATCH 13/35] arm: further fix overloading of MVE vaddq[_m]_n intrinsic

2022-11-17 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

It was observed that in tests `vaddq_m_n_[s/u][8/16/32].c`, the _Generic
resolution would fall back to the `__ARM_undef` failure state.

This is a regression since `dc39db873670bea8d8e655444387ceaa53a01a79` and
`6bd4ce64eb48a72eca300cb52773e6101d646004`, but it previously wasn't
identified, because the tests were not checking for this kind of failure.

The above commits changed the definitions of the intrinsics from using
`[u]int[8/16/32]_t` types for the scalar argument to using `int`. This
allowed `int` to be supported in user code through the overloaded
`#defines`, but seems to have broken the `[u]int[8/16/32]_t` types.

The solution implemented by this patch is to explicitly use a new
_Generic mapping from all the `[u]int[8/16/32]_t` types for int. With this
change, both `int` and `[u]int[8/16/32]_t` parameters are supported from
user code and are handled by the overloading mechanism correctly.
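
To illustrate the failure mode and the fix on a host compiler, here is a
minimal sketch (not part of the patch).  It takes nothing from arm_mve.h:
scalar_is_int_only and scalar_is_any_int are hypothetical stand-ins for the
old and new coercion macros, and they yield 1 or 0 instead of selecting an
intrinsic or `__ARM_undef`.  Plain C types are used rather than
`[u]int[8/16/32]_t` because those typedefs may alias `int` on the host,
which _Generic rejects as duplicate associations.

```c
/* Old behaviour (sketch): only an argument of type exactly 'int' is
   accepted; every other type selects the fallback.  */
#define scalar_is_int_only(x) _Generic ((x), \
    int: 1, \
    default: 0)

/* New behaviour (sketch): every listed integer type is accepted, so both
   'int' and the narrower or unsigned scalar types resolve.  */
#define scalar_is_any_int(x) _Generic ((x), \
    signed char: 1, unsigned char: 1, \
    short: 1, unsigned short: 1, \
    int: 1, unsigned int: 1, \
    long: 1, unsigned long: 1, \
    default: 0)

/* A 16-bit scalar, as user code would pass to a vaddq_m_n_s16 style call.  */
int
short_with_int_only (void)
{
  short b = 3;
  return scalar_is_int_only (b);   /* falls back */
}

int
short_with_any_int (void)
{
  short b = 3;
  return scalar_is_any_int (b);    /* resolves */
}
```

_Generic does not apply the integer promotions to its controlling
expression, so a `short` argument only matches an association that names
`short`.  That is why the narrower types have to be listed explicitly
alongside `int`.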

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vaddq_m_n_s8): Change types.
(__arm_vaddq_m_n_s32): Likewise.
(__arm_vaddq_m_n_s16): Likewise.
(__arm_vaddq_m_n_u8): Likewise.
(__arm_vaddq_m_n_u32): Likewise.
(__arm_vaddq_m_n_u16): Likewise.
(__arm_vaddq_m): Fix Overloading.
(__ARM_mve_coerce3): New.
---
 gcc/config/arm/arm_mve.h | 78 
 1 file changed, 40 insertions(+), 38 deletions(-)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index 684f997520f..951dc25374b 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -9675,42 +9675,42 @@ __arm_vabdq_m_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pr
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_s8 (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_sv16qi (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_s32 (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_sv4si (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_s16 (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_sv8hi (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline uint8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_u8 (uint8x16_t __inactive, uint8x16_t __a, uint8_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_uv16qi (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline uint32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_u32 (uint32x4_t __inactive, uint32x4_t __a, uint32_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_uv4si (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline uint16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m_n_u16 (uint16x8_t __inactive, uint16x8_t __a, uint16_t __b, mve_pred16_t __p)
 {
   return __builtin_mve_vaddq_m_n_uv8hi (__inactive, __a, __b, __p);
 }
@@ -26417,42 +26417,42 @@ __arm_vabdq_m (uint16x8_t __inactive, uint16x8_t __a, uint16x8_t __b, mve_pred16
 
 __extension__ extern __inline int8x16_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m (int8x16_t __inactive, int8x16_t __a, int8_t __b, mve_pred16_t __p)
 {
  return __arm_vaddq_m_n_s8 (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int32x4_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m (int32x4_t __inactive, int32x4_t __a, int32_t __b, mve_pred16_t __p)
 {
  return __arm_vaddq_m_n_s32 (__inactive, __a, __b, __p);
 }
 
 __extension__ extern __inline int16x8_t
 __attribute__ ((__always_inline__, __gnu_inline__, __artificial__))
-__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int __b, mve_pred16_t __p)
+__arm_vaddq_m (int16x8_t __inactive, int16x8_t __a, int16_t __b, mve_pred16_t __p)
 {
  return __arm_vaddq_m_

[PATCH 15/35] arm: Explicitly specify other float types for _Generic overloading [PR107515]

2022-11-17 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

This patch adds explicit references to other float types
to __ARM_mve_typeid in arm_mve.h.  Resolves PR 107515:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107515
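
As a rough host-side illustration of why the extra associations matter (a
sketch, not the header code): _Generic matches on the exact type, so a
`float` argument is not covered by the `double` association.  The macros
typeid_before and typeid_after and the TYPE_* enumerators below are
hypothetical simplifications of __ARM_mve_typeid; `_Float16` and `__fp16`
are left out because their availability depends on the target.

```c
enum { TYPE_UNKNOWN = 0, TYPE_INT_N, TYPE_FP_N };

/* Before (sketch): among the float types only 'double' is listed.  */
#define typeid_before(x) _Generic ((x), \
    short: TYPE_INT_N, \
    int: TYPE_INT_N, \
    long: TYPE_INT_N, \
    double: TYPE_FP_N, \
    default: TYPE_UNKNOWN)

/* After (sketch): 'float' gets its own association.  */
#define typeid_after(x) _Generic ((x), \
    short: TYPE_INT_N, \
    int: TYPE_INT_N, \
    long: TYPE_INT_N, \
    float: TYPE_FP_N, \
    double: TYPE_FP_N, \
    default: TYPE_UNKNOWN)

/* Classify a float scalar with each variant of the macro.  */
int
float_id_before (void)
{
  float f = 1.5f;
  return typeid_before (f);   /* not recognised as a float scalar */
}

int
float_id_after (void)
{
  float f = 1.5f;
  return typeid_after (f);    /* recognised as a float scalar */
}
```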

gcc/ChangeLog:
PR 107515
* config/arm/arm_mve.h (__ARM_mve_typeid): Add float types.
---
 gcc/config/arm/arm_mve.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index fd1876b57a0..f6b42dc3fab 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -35582,6 +35582,9 @@ enum {
short: __ARM_mve_type_int_n, \
int: __ARM_mve_type_int_n, \
long: __ARM_mve_type_int_n, \
+   _Float16: __ARM_mve_type_fp_n, \
+   __fp16: __ARM_mve_type_fp_n, \
+   float: __ARM_mve_type_fp_n, \
double: __ARM_mve_type_fp_n, \
long long: __ARM_mve_type_int_n, \
unsigned char: __ARM_mve_type_int_n, \
-- 
2.25.1



[PATCH 01/35] arm: improve vcreateq* tests

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vcreateq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vcreateq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vcreateq_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vcreateq_f16.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_f32.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_s16.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_s32.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_s64.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_s8.c  | 23 ++-
 .../arm/mve/intrinsics/vcreateq_u16.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_u32.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_u64.c | 23 ++-
 .../arm/mve/intrinsics/vcreateq_u8.c  | 23 ++-
 10 files changed, 220 insertions(+), 10 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f16.c
index fb3601edb94..c39303daa03 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f16.c
@@ -1,13 +1,34 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
 float16x8_t
 foo (uint64_t a, uint64_t b)
 {
   return vcreateq_f16 (a, b);
 }
 
-/* { dg-final { scan-assembler "vmov"  }  } */
+/*
+**foo1:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
+float16x8_t
+foo1 ()
+{
+  return vcreateq_f16 (1, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f32.c
index 4f4da62eed7..ad66f4407cd 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_f32.c
@@ -1,13 +1,34 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
 float32x4_t
 foo (uint64_t a, uint64_t b)
 {
   return vcreateq_f32 (a, b);
 }
 
-/* { dg-final { scan-assembler "vmov"  }  } */
+/*
+**foo1:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
+float32x4_t
+foo1 ()
+{
+  return vcreateq_f32 (1, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s16.c
index 103be6310bd..7e70a486513 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s16.c
@@ -1,13 +1,34 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
 int16x8_t
 foo (uint64_t a, uint64_t b)
 {
   return vcreateq_s16 (a, b);
 }
 
-/* { dg-final { scan-assembler "vmov"  }  } */
+/*
+**foo1:
+** ...
+** vmov q[0-9+]\[2\], q[0-9+]\[0\], r[0-9+], r[0-9+]
+** vmov q[0-9+]\[3\], q[0-9+]\[1\], r[0-9+], r[0-9+]
+** ...
+*/
+int16x8_t
+foo1 ()
+{
+  return vcreateq_s16 (1, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vcreateq_s32.c
index 96f7a972d93.

[PATCH 17/35] arm: improve tests and fix vadd*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vaddlvq_p_v4si)
(mve_vaddq_n_, mve_vaddvaq_)
(mve_vaddlvaq_v4si, mve_vaddq_n_f)
(mve_vaddlvaq_p_v4si, mve_vaddq, mve_vaddq_f):
Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vaddlvaq_p_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vaddlvaq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvaq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddlvq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvaq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vaddvq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/

[PATCH 09/35] arm: improve tests for vmax*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmaxaq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmaxaq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxaq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxaq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmaq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmavq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxnmvq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmaxvq_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vmaxaq_m_s16.c | 25 +--
 .../arm/mve/intrinsics/vmaxaq_m_s32.c | 25 +--
 .../arm/mve/intrinsics/vmaxaq_m_s8.c  | 25 +--
 .../arm/mve/intrinsics/vmaxaq_s16.c   | 16 +++-
 .../arm/mve/intrinsics/vmaxaq_s32.c   | 16 +++-
 .../gcc.target/arm/mve/intrinsics/vmaxaq_s8.c | 16 +++-
 .../arm/mve/intrinsics/vmaxavq_p_s16.c| 41 ---
 .../arm/mve/intrinsics/vmaxavq_p_s32.c| 41 ---
 .../arm/mve/intrinsics/vmaxavq_p_s8.c | 41 ---
 .../arm/mve/intrinsics/vmaxavq_s16.c  | 29 ++---
 .../arm/mve/intrinsics/vmaxavq_s32.c  | 29 ++---
 .../arm/mve/intrinsics/vmaxavq_s8.c   | 29 ++---
 .../arm/mve/intrinsics/vmaxnmaq_f16.c | 16 +++-
 .../arm/mve/intrinsics/vmaxnmaq_f32.c | 16 +++-
 .../arm/mve/intrinsics/vmaxnmaq_m_f16.c   | 25 +--
 .../arm/mve/intrinsics/vmaxnmaq_m_f32.c   | 25 +--
 .../arm/m

[PATCH 29/35] arm: improve tests for vqdmul*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulhq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmullbq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmulltq_s32.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmulhq_m_n_s16.c | 26 ---
 .../arm/mve/intrinsics/vqdmulhq_m_n_s32.c | 26 ---
 .../arm/mve/intrinsics/vqdmulhq_m_n_s8.c  | 26 ---
 .../arm/mve/intrinsics/vqdmulhq_m_s16.c   | 26 ---
 .../arm/mve/intrinsics/vqdmulhq_m_s32.c   | 26 ---
 .../arm/mve/intrinsics/vqdmulhq_m_s8.c| 26 ---
 .../arm/mve/intrinsics/vqdmulhq_n_s16.c   | 16 ++--
 .../arm/mve/intrinsics/vqdmulhq_n_s32.c   | 16 ++--
 .../arm/mve/intrinsics/vqdmulhq_n_s8.c| 16 ++--
 .../arm/mve/intrinsics/vqdmulhq_s16.c | 16 ++--
 .../arm/mve/intrinsics/vqdmulhq_s32.c | 16 ++--
 .../arm/mve/intrinsics/vqdmulhq_s8.c  | 16 ++--
 .../arm/mve/intrinsics/vqdmullbq_m_n_s16.c| 26 ---
 .../arm/mve/intrinsics/vqdmullbq_m_n_s32.c| 26 ---
 .../arm/mve/intrinsics/vqdmullbq_m_s16.c  | 26 ---
 .../arm/mve/intrinsics/vqdmullbq_m_s32.c  | 26 ---
 .../arm/mve/intrinsics/vqdmullbq_n_s16.c  | 16 ++--
 .../arm/mve/intrinsics/vqdmullbq_n_s32.c  | 16 ++--
 .../arm/mve/intrinsics/vqdmullbq_s16.c| 16 ++--
 .../arm/mve/intrinsics/vqdmullbq_s32.c| 16 ++--
 .../arm/mve/intrinsics/vqdmulltq_m_n_s16.c| 26 ---
 .../arm/mve/intrinsics/vqdmulltq_m_n_s32.c| 26 ---
 .../arm/mve/intrinsics/vqdmulltq_m_s16.c  | 26 ---
 .../arm/mve/intrinsics/vqdmulltq_m_s32.c  | 26 ---
 .../arm/mve/intrinsics/vqdmulltq_n_s16.c  | 16 ++--
 .../arm/mve/intrinsics/vqdmulltq_n_s32.c  | 16 ++--
 .../arm/mve/intrinsics/vqdmulltq_s16.c| 16 ++--
 .../arm/mve/intrinsics/vqdmulltq_s32.c| 16 ++--
 28 files changed, 504 insertions(+), 84 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s16.c
index 57ab85eaf52..a5c1a106205 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmulhq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmulht.s16 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqdmulhq_m_n_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmulht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|

[PATCH 32/35] arm: improve tests for vqsubq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_s8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_n_u8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_s16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_s32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_s8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_u16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_u32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_m_u8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_s16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_s32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_s8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_u16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_u32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_n_u8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_s16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_s32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_s8.c:
* gcc.target/arm/mve/intrinsics/vqsubq_u16.c:
* gcc.target/arm/mve/intrinsics/vqsubq_u32.c:
* gcc.target/arm/mve/intrinsics/vqsubq_u8.c:
---
 .../arm/mve/intrinsics/vqsubq_m_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vqsubq_m_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vqsubq_m_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vqsubq_m_s16.c | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_s32.c | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_s8.c  | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_u16.c | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_u32.c | 26 ++--
 .../arm/mve/intrinsics/vqsubq_m_u8.c  | 26 ++--
 .../arm/mve/intrinsics/vqsubq_n_s16.c | 16 ++-
 .../arm/mve/intrinsics/vqsubq_n_s32.c | 16 ++-
 .../arm/mve/intrinsics/vqsubq_n_s8.c  | 16 ++-
 .../arm/mve/intrinsics/vqsubq_n_u16.c | 28 -
 .../arm/mve/intrinsics/vqsubq_n_u32.c | 28 -
 .../arm/mve/intrinsics/vqsubq_n_u8.c  | 28 -
 .../arm/mve/intrinsics/vqsubq_s16.c   | 16 ++-
 .../arm/mve/intrinsics/vqsubq_s32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vqsubq_s8.c | 16 ++-
 .../arm/mve/intrinsics/vqsubq_u16.c   | 16 ++-
 .../arm/mve/intrinsics/vqsubq_u32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vqsubq_u8.c | 16 ++-
 24 files changed, 516 insertions(+), 72 deletions(-)
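
For readers less familiar with the intrinsic under test, here is a minimal scalar sketch of what an unpredicated vqsubq_n_s16 computes per lane: a 16-bit subtraction that clamps instead of wrapping. It is not part of the patch. The names sat_sub_s16 and model_vqsubq_n_s16 are made up for illustration, and the inactive/p handling of the _m variants is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical helper (not a GCC or ACLE function): saturating 16-bit
   subtraction, the per-lane operation assumed for vqsubq_n_s16.  */
int16_t
sat_sub_s16 (int16_t a, int16_t b)
{
  /* Widen first so the difference cannot overflow.  */
  int32_t wide = (int32_t) a - (int32_t) b;

  if (wide > INT16_MAX)
    return INT16_MAX;
  if (wide < INT16_MIN)
    return INT16_MIN;
  return (int16_t) wide;
}

/* Scalar model of an unpredicated vqsubq_n_s16 over eight lanes:
   every lane of A has the scalar B subtracted with saturation.  */
void
model_vqsubq_n_s16 (int16_t out[8], const int16_t a[8], int16_t b)
{
  for (int i = 0; i < 8; i++)
    out[i] = sat_sub_s16 (a[i], b);
}
```

The tests in this patch do not check these values; they only check that the expected vqsub/vqsubt instruction appears in each function body.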

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c
index abcff4f0e3c..39b8089919d 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqsubt.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqsubq_m_n_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqsubt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqsubt.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqsubq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqsubt.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c
index 23e59ff12a2..ed6b92ddcf5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqsubq_m_n_s32.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "a

[PATCH 21/35] arm: improve tests for vhaddq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vhaddq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vhaddq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhaddq_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vhaddq_m_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vhaddq_m_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vhaddq_m_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vhaddq_m_s16.c | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_s32.c | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_s8.c  | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_u16.c | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_u32.c | 26 ++--
 .../arm/mve/intrinsics/vhaddq_m_u8.c  | 26 ++--
 .../arm/mve/intrinsics/vhaddq_n_s16.c | 16 ++-
 .../arm/mve/intrinsics/vhaddq_n_s32.c | 16 ++-
 .../arm/mve/intrinsics/vhaddq_n_s8.c  | 16 ++-
 .../arm/mve/intrinsics/vhaddq_n_u16.c | 28 -
 .../arm/mve/intrinsics/vhaddq_n_u32.c | 28 -
 .../arm/mve/intrinsics/vhaddq_n_u8.c  | 28 -
 .../arm/mve/intrinsics/vhaddq_s16.c   | 16 ++-
 .../arm/mve/intrinsics/vhaddq_s32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vhaddq_s8.c | 16 ++-
 .../arm/mve/intrinsics/vhaddq_u16.c   | 16 ++-
 .../arm/mve/intrinsics/vhaddq_u32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vhaddq_u8.c | 16 ++-
 .../arm/mve/intrinsics/vhaddq_x_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vhaddq_x_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vhaddq_x_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vhaddq_x_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vhaddq_x_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vhaddq_x_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vhaddq_x_s16.c | 25 +--
 .../arm/mve/intrinsics/vhaddq_x_s32.c | 25 +--
 .../arm/mve/intrinsics/vhaddq_x_s8.c  | 25 +--
 .../arm/mve/intrinsics/vhaddq_x_u16.c | 25 +--
 .../arm/mve/intrinsics/vhaddq_x_u32.c | 25 +--
 .../arm/mve/intrinsics/vhaddq_x_u8.c  | 25 +--
 36 files changed, 828 insertions(+), 114 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vhaddq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vhaddq_m_n_s16.c
index e90af963697..0bd03832ff5 100644
--- a/gcc/testsuite/gcc

[PATCH 26/35] arm: improve tests for vmlasq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmlasq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmlasq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlasq_n_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vmlasq_m_n_s16.c   | 34 ++---
 .../arm/mve/intrinsics/vmlasq_m_n_s32.c   | 34 ++---
 .../arm/mve/intrinsics/vmlasq_m_n_s8.c| 34 ++---
 .../arm/mve/intrinsics/vmlasq_m_n_u16.c   | 50 ---
 .../arm/mve/intrinsics/vmlasq_m_n_u32.c   | 50 ---
 .../arm/mve/intrinsics/vmlasq_m_n_u8.c| 50 ---
 .../arm/mve/intrinsics/vmlasq_n_s16.c | 24 ++---
 .../arm/mve/intrinsics/vmlasq_n_s32.c | 24 ++---
 .../arm/mve/intrinsics/vmlasq_n_s8.c  | 24 ++---
 .../arm/mve/intrinsics/vmlasq_n_u16.c | 36 ++---
 .../arm/mve/intrinsics/vmlasq_n_u32.c | 36 ++---
 .../arm/mve/intrinsics/vmlasq_n_u8.c  | 36 ++---
 12 files changed, 348 insertions(+), 84 deletions(-)
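
The parameter renaming in this patch (a, b, c to m1, m2, add) follows the roles the operands play in the instruction. As a rough illustration only, and not part of the patch, the sketch below models an unpredicated vmlasq_n_s16 in scalar C, assuming each result lane is m1[i] * m2[i] + add truncated to 16 bits. The function name model_vmlasq_n_s16 is hypothetical and predication is ignored.

```c
#include <stdint.h>

/* Hypothetical scalar model of an unpredicated vmlasq_n_s16.
   Assumption: each result lane is m1[i] * m2[i] + add, with the
   result truncated (wrapped) to 16 bits.  */
void
model_vmlasq_n_s16 (int16_t out[8], const int16_t m1[8],
		    const int16_t m2[8], int16_t add)
{
  for (int i = 0; i < 8; i++)
    {
      /* Compute in unsigned 32-bit arithmetic so that wrap-around is
	 well defined; the low 16 bits are the same as for signed math.  */
      uint32_t wide = (uint32_t) m1[i] * (uint32_t) m2[i] + (uint32_t) add;

      /* Narrowing an out-of-range value to int16_t is implementation
	 defined in C; GCC wraps modulo 2^16.  */
      out[i] = (int16_t) (uint16_t) wide;
    }
}
```

Seen this way, the two vector arguments are the multiplicands and the scalar is the addend, which is what the new names in foo and foo1 spell out.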

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s16.c
index bf66e616ec7..af6e588adad 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlast.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo (int16x8_t m1, int16x8_t m2, int16_t add, mve_pred16_t p)
 {
-  return vmlasq_m_n_s16 (a, b, c, p);
+  return vmlasq_m_n_s16 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmlast.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlast.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo1 (int16x8_t m1, int16x8_t m2, int16_t add, mve_pred16_t p)
 {
-  return vmlasq_m (a, b, c, p);
+  return vmlasq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmlast.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s32.c
index 53c21e2e5b6..9d0cc3076d9 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlasq_m_n_s32.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlast.s32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo (int32x4_t m1, int32x4_t m2, int32_t add, mve_pred16_t p)
 {
-  return vmlasq_m_n_s32 (a, b, c, p);
+  return vmlasq_m_n_s32 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmlast.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlast.s32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo1 (int32x4_t m1, int32x4_t m2, int32_t add, mve_pred16_t p)
 {
-  return vmlasq_m (a, b, c, p);
+  return vmlasq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vmlast.s32"  }  } */
+/* { dg-final { scan-a

[PATCH 34/35] arm: improve tests for vrshlq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vrshlq_m_n_s16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/vrshlq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vrshlq_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vrshlq_m_n_s16.c   | 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_n_s32.c   | 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_n_s8.c| 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_n_u16.c   | 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_n_u32.c   | 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_n_u8.c| 25 +++---
 .../arm/mve/intrinsics/vrshlq_m_s16.c | 26 ---
 .../arm/mve/intrinsics/vrshlq_m_s32.c | 26 ---
 .../arm/mve/intrinsics/vrshlq_m_s8.c  | 26 ---
 .../arm/mve/intrinsics/vrshlq_m_u16.c | 26 ---
 .../arm/mve/intrinsics/vrshlq_m_u32.c | 26 ---
 .../arm/mve/intrinsics/vrshlq_m_u8.c  | 26 ---
 .../arm/mve/intrinsics/vrshlq_n_s16.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_n_s32.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_n_s8.c  | 16 ++--
 .../arm/mve/intrinsics/vrshlq_n_u16.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_n_u32.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_n_u8.c  | 16 ++--
 .../arm/mve/intrinsics/vrshlq_s16.c   | 16 ++--
 .../arm/mve/intrinsics/vrshlq_s32.c   | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vrshlq_s8.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_u16.c   | 16 ++--
 .../arm/mve/intrinsics/vrshlq_u32.c   | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vrshlq_u8.c | 16 ++--
 .../arm/mve/intrinsics/vrshlq_x_s16.c | 25 +++---
 .../arm/mve/intrinsics/vrshlq_x_s32.c | 25 +++---
 .../arm/mve/intrinsics/vrshlq_x_s8.c  | 25 +++---
 .../arm/mve/intrinsics/vrshlq_x_u16.c | 25 +++---
 .../arm/mve/intrinsics/vrshlq_x_u32.c | 25 +++---
 .../arm/mve/intrinsics/vrshlq_x_u8.c  | 25 +++---
 30 files changed, 564 insertions(+), 84 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrshlq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrshlq_m_n_s16.c
index cf51de6aa9c..c7d1f3a5b1c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrshlq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrshlq_m_n_s16.c
@@ -1,22 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrshlt.s16  q[0-9]+, (?:ip|fp|r[0-9]+)(?:   @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t a, int32_t b, mve_pred16_t p)
 {
   return vrshlq_m_n_s16 (a, b, p);
 }
 
-/* { dg-final { scan-assembl

[PATCH 18/35] arm: improve tests for vmulq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmulq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vmulq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmulq_x_u8.c: Likewise.
---
 .../gcc.target/arm/mve/intrinsics/vmulq_f16.c | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vmulq_f32.c | 16 ++-
 .../arm/mve/intrinsics/vmulq_m_f16.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_f32.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_n_f16.c| 42 +--
 .../arm/mve/intrinsics/vmulq_m_n_f32.c| 42 +--
 .../arm/mve/intrinsics/vmulq_m_n_s16.c| 26 ++--
 .../arm/mve/intrinsics/vmulq_m_n_s32.c| 26 ++--
 .../arm/mve/intrinsics/vmulq_m_n_s8.c | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_n_u16.c| 42 +--
 .../arm/mve/intrinsics/vmulq_m_n_u32.c| 42 +--
 .../arm/mve/intrinsics/vmulq_m_n_u8.c | 42 +--
 .../arm/mve/intrinsics/vmulq_m_s16.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_s32.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_s8.c   | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_u16.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_u32.c  | 26 ++--
 .../arm/mve/intrinsics/vmulq_m_u8.c   | 26 ++--
 .../arm/mve/intrinsics/vmulq_n_f16.c  | 28 -
 .../arm/mve/intrinsics/vmulq_n_f32.c  | 28 -
 .../arm/mve/intrinsics/vmulq_n_s16.c  | 16 ++-
 .../arm/mve/intrinsics/vmulq_n_s32.c  | 16 ++-
 .../arm/mve/intrinsics/vmulq_n_s8.c   | 16 ++-
 .../arm/mve/intrinsics/vmulq_n_u16.c  | 28 -
 .../arm/mve/intrinsics/vmulq_n_u32.c  | 28 -
 .../arm/mve/intrinsics/vmulq_n_u8.c   | 28 -
 .../gcc.target/arm/mve/intrinsics/vmulq_s16.c | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vmulq_s32.c | 16 ++-
 .../gcc.target/

[PATCH 25/35] arm: improve tests and fix vmlaldavaxq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vmlaldavaq_)
(mve_vmlaldavaxq_s, mve_vmlaldavaxq_p_): Fix
spacing vs tabs.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmlaldavaxq_s32.c: Likewise.
---
 gcc/config/arm/mve.md |  6 ++--
 .../arm/mve/intrinsics/vmlaldavaxq_p_s16.c| 32 +++
 .../arm/mve/intrinsics/vmlaldavaxq_p_s32.c| 32 +++
 .../arm/mve/intrinsics/vmlaldavaxq_s16.c  | 24 ++
 .../arm/mve/intrinsics/vmlaldavaxq_s32.c  | 24 ++
 5 files changed, 91 insertions(+), 27 deletions(-)
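
A note on why the separator matters, not part of the patch: the function-body checks compare the emitted assembly against regular expressions, so a pattern that expects a tab between mnemonic and operands will not match output that uses a space. The sketch below illustrates this under stated assumptions. It uses POSIX extended regexes from <regex.h> as a stand-in for the Tcl regexes the testsuite uses, replaces the (?:...) groups with plain groups because POSIX has no non-capturing form, and assumes the in-tree patterns use tabs (this archive renders them as spaces). The names line_matches and tab_pattern are hypothetical.

```c
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: return 1 if LINE matches the POSIX extended
   regex PATTERN, 0 otherwise (also 0 if PATTERN fails to compile).  */
int
line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int ok;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  ok = regexec (&re, line, 0, NULL, 0) == 0;
  regfree (&re);
  return ok;
}

/* Illustrative pattern only: a leading tab, the mnemonic, a tab, then
   two core registers and two q registers.  */
const char *const tab_pattern
  = "^\tvmlaldavax\\.s16\t(ip|fp|r[0-9]+), (ip|fp|r[0-9]+), q[0-9]+, q[0-9]+$";
```

With these definitions, line_matches (tab_pattern, "\tvmlaldavax.s16\tr0, r1, q0, q1") succeeds, while the same line with a space after the mnemonic does not. That is the mismatch the mve.md hunks below avoid by emitting \t.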

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 714dc6fc7ce..d2ffae6a425 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -4163,7 +4163,7 @@ (define_insn "mve_vmlaldavaq_"
 VMLALDAVAQ))
   ]
   "TARGET_HAVE_MVE"
-  "vmlaldava.%# %Q0, %R0, %q2, %q3"
+  "vmlaldava.%#\t%Q0, %R0, %q2, %q3"
   [(set_attr "type" "mve_move")
 ])
 
@@ -4179,7 +4179,7 @@ (define_insn "mve_vmlaldavaxq_s"
 VMLALDAVAXQ_S))
   ]
   "TARGET_HAVE_MVE"
-  "vmlaldavax.s%# %Q0, %R0, %q2, %q3"
+  "vmlaldavax.s%#\t%Q0, %R0, %q2, %q3"
   [(set_attr "type" "mve_move")
 ])
 
@@ -6126,7 +6126,7 @@ (define_insn "mve_vmlaldavaxq_p_"
 VMLALDAVAXQ_P))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vmlaldavaxt.%# %Q0, %R0, %q2, %q3"
+  "vpst\;vmlaldavaxt.%#\t%Q0, %R0, %q2, %q3"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c
index f33d3880236..87f0354a636 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s16.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlaldavaxt.s16 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
-foo (int64_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
+foo (int64_t add, int16x8_t m1, int16x8_t m2, mve_pred16_t p)
 {
-  return vmlaldavaxq_p_s16 (a, b, c, p);
+  return vmlaldavaxq_p_s16 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmlaldavaxt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlaldavaxt.s16 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
-foo1 (int64_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
+foo1 (int64_t add, int16x8_t m1, int16x8_t m2, mve_pred16_t p)
 {
-  return vmlaldavaxq_p (a, b, c, p);
+  return vmlaldavaxq_p (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmlaldavaxt.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c
index ab072a9850e..d26bf5b90af 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmlaldavaxq_p_s32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlaldavaxt.s32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
-foo (int64_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
+foo (int64_t add, int32x4_t m1, int32x4_t m2, mve_pred16_t p)
 {
-  return vmlaldavaxq_p_s32 (a, b, c, p);
+  return vmlaldavaxq_p_s32 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmlaldavaxt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmlaldavaxt.s32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
-foo1 (int64_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
+foo1 (int64_t add, int32x4_t m1, int32x4_t m2, mve_pred16_t p)
 {
-  return vmlaldavaxq_p (a, b, c, p);
+  return vmlaldavaxq_p (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmlaldavaxt.s32"  }  } */
+/* { dg-final { scan-assemb

[PATCH 28/35] arm: improve tests for vqdmlahq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlahq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqdmlashq_n_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqdmlahq_m_n_s16.c | 34 ++-
 .../arm/mve/intrinsics/vqdmlahq_m_n_s32.c | 34 ++-
 .../arm/mve/intrinsics/vqdmlahq_m_n_s8.c  | 34 ++-
 .../arm/mve/intrinsics/vqdmlahq_n_s16.c   | 24 +
 .../arm/mve/intrinsics/vqdmlahq_n_s32.c   | 24 +
 .../arm/mve/intrinsics/vqdmlahq_n_s8.c| 24 +
 .../arm/mve/intrinsics/vqdmlashq_m_n_s16.c| 34 ++-
 .../arm/mve/intrinsics/vqdmlashq_m_n_s32.c| 34 ++-
 .../arm/mve/intrinsics/vqdmlashq_m_n_s8.c | 34 ++-
 .../arm/mve/intrinsics/vqdmlashq_n_s16.c  | 24 +
 .../arm/mve/intrinsics/vqdmlashq_n_s32.c  | 24 +
 .../arm/mve/intrinsics/vqdmlashq_n_s8.c   | 24 +
 12 files changed, 264 insertions(+), 84 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s16.c
index d8c4f4bab8e..94d93874542 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlaht.s16 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo (int16x8_t add, int16x8_t m1, int16_t m2, mve_pred16_t p)
 {
-  return vqdmlahq_m_n_s16 (a, b, c, p);
+  return vqdmlahq_m_n_s16 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlaht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlaht.s16 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo1 (int16x8_t add, int16x8_t m1, int16_t m2, mve_pred16_t p)
 {
-  return vqdmlahq_m (a, b, c, p);
+  return vqdmlahq_m (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlaht.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s32.c
index 361f5d00bdf..a3dab7fa02e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqdmlahq_m_n_s32.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlaht.s32 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo (int32x4_t add, int32x4_t m1, int32_t m2, mve_pred16_t p)
 {
-  return vqdmlahq_m_n_s32 (a, b, c, p);
+  return vqdmlahq_m_n_s32 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqdmlaht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqdmlaht.s32 q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo1 (int32x4_t add, int32x4_t m1, int32_t m2, mve_pred16_t p)
 {
-  return vqdmlahq_m (a, b, c, p);
+  return vqdmlahq_m (add, m1, m2, p);
 }
 
-/* { dg-final { scan-ass

[PATCH 22/35] arm: improve tests for vhsubq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vhsubq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vhsubq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vhsubq_x_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vhsubq_m_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vhsubq_m_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vhsubq_m_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vhsubq_m_s16.c | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_s32.c | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_s8.c  | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_u16.c | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_u32.c | 26 ++--
 .../arm/mve/intrinsics/vhsubq_m_u8.c  | 26 ++--
 .../arm/mve/intrinsics/vhsubq_n_s16.c | 16 ++-
 .../arm/mve/intrinsics/vhsubq_n_s32.c | 16 ++-
 .../arm/mve/intrinsics/vhsubq_n_s8.c  | 16 ++-
 .../arm/mve/intrinsics/vhsubq_n_u16.c | 28 -
 .../arm/mve/intrinsics/vhsubq_n_u32.c | 28 -
 .../arm/mve/intrinsics/vhsubq_n_u8.c  | 28 -
 .../arm/mve/intrinsics/vhsubq_s16.c   | 16 ++-
 .../arm/mve/intrinsics/vhsubq_s32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vhsubq_s8.c | 16 ++-
 .../arm/mve/intrinsics/vhsubq_u16.c   | 16 ++-
 .../arm/mve/intrinsics/vhsubq_u32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vhsubq_u8.c | 16 ++-
 .../arm/mve/intrinsics/vhsubq_x_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vhsubq_x_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vhsubq_x_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vhsubq_x_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vhsubq_x_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vhsubq_x_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vhsubq_x_s16.c | 25 +--
 .../arm/mve/intrinsics/vhsubq_x_s32.c | 25 +--
 .../arm/mve/intrinsics/vhsubq_x_s8.c  | 25 +--
 .../arm/mve/intrinsics/vhsubq_x_u16.c | 25 +--
 .../arm/mve/intrinsics/vhsubq_x_u32.c | 25 +--
 .../arm/mve/intrinsics/vhsubq_x_u8.c  | 25 +--
 36 files changed, 828 insertions(+), 114 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vhsubq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vhsubq_m_n_s16.c
index 27dcb7be957..6390589808f 100644
--- a/gcc/testsuite/gcc

[PATCH 08/35] arm: improve tests for vmin*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vminaq_m_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vminaq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminaq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminaq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminaq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminaq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminavq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmaq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmaq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmaq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmaq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmavq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminnmvq_p_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminq_x_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vminvq_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vminaq_m_s16.c | 25 +--
 .../arm/mve/intrinsics/vminaq_m_s32.c | 25 +--
 .../arm/mve/intrinsics/vminaq_m_s8.c  | 25 +--
 .../arm/mve/intrinsics/vminaq_s16.c   | 16 +++-
 .../arm/mve/intrinsics/vminaq_s32.c   | 16 +++-
 .../gcc.target/arm/mve/intrinsics/vminaq_s8.c | 16 +++-
 .../arm/mve/intrinsics/vminavq_p_s16.c| 41 ---
 .../arm/mve/intrinsics/vminavq_p_s32.c| 41 ---
 .../arm/mve/intrinsics/vminavq_p_s8.c | 41 ---
 .../arm/mve/intrinsics/vminavq_s16.c  | 29 ++---
 .../arm/mve/intrinsics/vminavq_s32.c  | 29 ++---
 .../arm/mve/intrinsics/vminavq_s8.c   | 29 ++---
 .../arm/mve/intrinsics/vminnmaq_f16.c | 16 +++-
 .../arm/mve/intrinsics/vminnmaq_f32.c | 16 +++-
 .../arm/mve/intrinsics/vminnmaq_m_f16.c   | 25 +--
 .../arm/mve/intrinsics/vminnmaq_m_f32.c   | 25 +--
 .../arm/m

[PATCH 30/35] arm: improve tests for vqrdmlahq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlahq_n_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlahq_m_n_s16.c    | 34 ++-
 .../arm/mve/intrinsics/vqrdmlahq_m_n_s32.c    | 34 ++-
 .../arm/mve/intrinsics/vqrdmlahq_m_n_s8.c     | 34 ++-
 .../arm/mve/intrinsics/vqrdmlahq_n_s16.c      | 24 +
 .../arm/mve/intrinsics/vqrdmlahq_n_s32.c      | 24 +
 .../arm/mve/intrinsics/vqrdmlahq_n_s8.c       | 24 +
 6 files changed, 132 insertions(+), 42 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c
index 70c3fa0e9b1..07d689279ac 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlaht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo (int16x8_t add, int16x8_t m1, int16_t m2, mve_pred16_t p)
 {
-  return vqrdmlahq_m_n_s16 (a, b, c, p);
+  return vqrdmlahq_m_n_s16 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlaht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlaht.s16   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo1 (int16x8_t add, int16x8_t m1, int16_t m2, mve_pred16_t p)
 {
-  return vqrdmlahq_m (a, b, c, p);
+  return vqrdmlahq_m (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlaht.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c
index 75ed9911276..3b02ca16038 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s32.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlaht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo (int32x4_t add, int32x4_t m1, int32_t m2, mve_pred16_t p)
 {
-  return vqrdmlahq_m_n_s32 (a, b, c, p);
+  return vqrdmlahq_m_n_s32 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlaht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlaht.s32   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo1 (int32x4_t add, int32x4_t m1, int32_t m2, mve_pred16_t p)
 {
-  return vqrdmlahq_m (a, b, c, p);
+  return vqrdmlahq_m (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlaht.s32"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c
index ddaea545f40..b661bdcb4cf 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlahq_m_n_s8.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** 

[PATCH 19/35] arm: improve tests and fix vsubq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vsubq_n_f): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vsubq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vsubq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsubq_x_u8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../gcc.target/arm/mve/intrinsics/vsubq_f16.c | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vsubq_f32.c | 16 ++-
 .../arm/mve/intrinsics/vsubq_m_f16.c  | 26 --
 .../arm/mve/intrinsics/vsubq_m_f32.c  | 26 --
 .../arm/mve/intrinsics/vsubq_m_n_f16.c| 42 ++--
 .../arm/mve/intrinsics/vsubq_m_n_f32.c| 42 ++--
 .../arm/mve/intrinsics/vsubq_m_n_s16.c| 26 --
 .../arm/mve/intrinsics/vsubq_m_n_s32.c| 26 --
 .../arm/mve/intrinsics/vsubq_m_n_s8.c | 26 --
 .../arm/mve/intrinsics/vsubq_m_n_u16.c| 42 ++--
 .../arm/mve/intrinsics/vsubq_m_n_u32.c| 42 ++--
 .../arm/mve/intrinsics/vsubq_m_n_u8.c | 42 ++--
 .../arm/mve/intrinsics/vsubq_m_s16.c  | 25 --
 .../arm/mve/intrinsics/vsubq_m_s32.c  | 25 --
 .../arm/mve/intrinsics/vsubq_m_s8.c   | 25 --
 .../arm/mve/intrinsics/vsubq_m_u16.c  | 25 --
 .../arm/mve/intrinsics/vsubq_m_u32.c  | 25 --
 .../arm/mve/intrinsics/vsubq_m_u8.c   | 25 --
 .../arm/mve/intrinsics/vsubq_n_f16.c  | 28 ++-
 .../arm/mve/intrinsics/vsubq_n_f32.c  | 28 ++-
 .../arm/mve/intrinsics/vsubq_n_s16.c  | 17 +--
 .../arm/mve/intrinsics/vsubq_n_s32.c  | 17 +--
 .../arm/mve/intrinsics/vsubq_n_s8.c   | 17 +--
 .../arm/mve/intrinsics/vsubq_n_u16.c  | 29 +--
 .../arm/mve/intrinsics/vsubq_n_u32.c  | 29 +--
 .../arm/mve/intrinsics/vsubq_n_u8.c   | 29 +--
 .../gcc.target/arm/mve/intrinsics/vsubq_s16.c | 16 +

[PATCH 23/35] arm: improve tests for viwdupq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/viwdupq_x_wb_u8.c: Likewise.
---
 .../arm/mve/intrinsics/viwdupq_m_n_u16.c  | 46 ++---
 .../arm/mve/intrinsics/viwdupq_m_n_u32.c  | 46 ++---
 .../arm/mve/intrinsics/viwdupq_m_n_u8.c   | 46 ++---
 .../arm/mve/intrinsics/viwdupq_m_wb_u16.c | 46 ++---
 .../arm/mve/intrinsics/viwdupq_m_wb_u32.c | 46 ++---
 .../arm/mve/intrinsics/viwdupq_m_wb_u8.c  | 46 ++---
 .../arm/mve/intrinsics/viwdupq_n_u16.c| 32 ++--
 .../arm/mve/intrinsics/viwdupq_n_u32.c| 32 ++--
 .../arm/mve/intrinsics/viwdupq_n_u8.c | 28 ++-
 .../arm/mve/intrinsics/viwdupq_wb_u16.c   | 36 ++---
 .../arm/mve/intrinsics/viwdupq_wb_u32.c   | 36 ++---
 .../arm/mve/intrinsics/viwdupq_wb_u8.c| 36 ++---
 .../arm/mve/intrinsics/viwdupq_x_n_u16.c  | 46 ++---
 .../arm/mve/intrinsics/viwdupq_x_n_u32.c  | 46 ++---
 .../arm/mve/intrinsics/viwdupq_x_n_u8.c   | 46 ++---
 .../arm/mve/intrinsics/viwdupq_x_wb_u16.c | 50 ---
 .../arm/mve/intrinsics/viwdupq_x_wb_u32.c | 50 ---
 .../arm/mve/intrinsics/viwdupq_x_wb_u8.c  | 50 ---
 18 files changed, 658 insertions(+), 106 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c
index 0f999cc672b..67a2465f435 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u16.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** viwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
 uint16x8_t
 foo (uint16x8_t inactive, uint32_t a, uint32_t b, mve_pred16_t p)
 {
-  return viwdupq_m_n_u16 (inactive, a, b, 2, p);
+  return viwdupq_m_n_u16 (inactive, a, b, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "viwdupt.u16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** viwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
 uint16x8_t
 foo1 (uint16x8_t inactive, uint32_t a, uint32_t b, mve_pred16_t p)
 {
-  return viwdupq_m (inactive, a, b, 2, p);
+  return viwdupq_m (inactive, a, b, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "viwdupt.u16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** viwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
+uint16x8_t
+foo2 (uint16x8_t inactive, mve_pred16_t p)
+{
+  return viwdupq_m (inactive, 1, 1, 1, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c
index f79c91eaf4c..9fc2518acc5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/viwdupq_m_n_u32.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve 

[PATCH 24/35] arm: improve tests for vmladavaq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vmladavaq_p_s16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/vmladavaq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaq_p_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaq_p_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaq_p_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_p_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vmladavaxq_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vmladavaq_p_s16.c  | 33 ++---
 .../arm/mve/intrinsics/vmladavaq_p_s32.c  | 33 ++---
 .../arm/mve/intrinsics/vmladavaq_p_s8.c   | 33 ++---
 .../arm/mve/intrinsics/vmladavaq_p_u16.c  | 49 ---
 .../arm/mve/intrinsics/vmladavaq_p_u32.c  | 49 ---
 .../arm/mve/intrinsics/vmladavaq_p_u8.c   | 49 ---
 .../arm/mve/intrinsics/vmladavaxq_p_s16.c | 33 ++---
 .../arm/mve/intrinsics/vmladavaxq_p_s32.c | 33 ++---
 .../arm/mve/intrinsics/vmladavaxq_p_s8.c  | 33 ++---
 .../arm/mve/intrinsics/vmladavaxq_s16.c   | 24 ++---
 .../arm/mve/intrinsics/vmladavaxq_s32.c   | 24 ++---
 .../arm/mve/intrinsics/vmladavaxq_s8.c| 24 ++---
 12 files changed, 336 insertions(+), 81 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s16.c
index e458204c41b..f3e5eba3b08 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s16.c
@@ -1,22 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmladavat.s16   (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 int32_t
-foo (int32_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
+foo (int32_t add, int16x8_t m1, int16x8_t m2, mve_pred16_t p)
 {
-  return vmladavaq_p_s16 (a, b, c, p);
+  return vmladavaq_p_s16 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmladavat.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmladavat.s16   (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 int32_t
-foo1 (int32_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
+foo1 (int32_t add, int16x8_t m1, int16x8_t m2, mve_pred16_t p)
 {
-  return vmladavaq_p (a, b, c, p);
+  return vmladavaq_p (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmladavat.s16"  }  } */
-/* { dg-final { scan-assembler "vmladavat.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s32.c
index e3544787adb..71f6957bfc5 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vmladavaq_p_s32.c
@@ -1,22 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmladavat.s32   (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 int32_t
-foo (int32_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
+foo (int32_t add, int32x4_t m1, int32x4_t m2, mve_pred16_t p)
 {
-  return vmladavaq_p_s32 (a, b, c, p);
+  return vmladavaq_p_s32 (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmladavat.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vmladavat.s32   (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 int32_t
-foo1 (int32_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
+foo1 (int32_t add, int32x4_t m1, int32x4_t m2, mve_pred16_t p)
 {
-  return vmladavaq_p (a, b, c, p);
+  return vmladavaq_p (add, m1, m2, p);
 }
 
-/* { dg-final { scan-assembler "vmladavat.s32"  }  } */
-/* { dg-final { scan-assembler "vmladavat.s32"  }  } */
+/* { dg-final { scan-assembler-not "__A

[PATCH 33/35] arm: improve tests and fix vrmlaldavhaq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vrmlaldavhq_v4si,
mve_vrmlaldavhaq_v4si): Fix spacing vs tabs.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c: Improve test.
* gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c: Likewise.
---
 gcc/config/arm/mve.md |  4 +-
 .../arm/mve/intrinsics/vrmlaldavhaq_p_s32.c   | 24 ++-
 .../arm/mve/intrinsics/vrmlaldavhaq_p_u32.c   | 40 ++-
 3 files changed, 62 insertions(+), 6 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index d2ffae6a425..b5e6da4b133 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -2543,7 +2543,7 @@ (define_insn "mve_vrmlaldavhq_v4si"
 VRMLALDAVHQ))
   ]
   "TARGET_HAVE_MVE"
-  "vrmlaldavh.32 %Q0, %R0, %q1, %q2"
+  "vrmlaldavh.32\t%Q0, %R0, %q1, %q2"
   [(set_attr "type" "mve_move")
 ])
 
@@ -2649,7 +2649,7 @@ (define_insn "mve_vrmlaldavhaq_v4si"
 VRMLALDAVHAQ))
   ]
   "TARGET_HAVE_MVE"
-  "vrmlaldavha.32 %Q0, %R0, %q2, %q3"
+  "vrmlaldavha.32\t%Q0, %R0, %q2, %q3"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c
index 263d3509771..dec4a969dfe 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_s32.c
@@ -1,21 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrmlaldavhat.s32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
 foo (int64_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
 {
   return vrmlaldavhaq_p_s32 (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vrmlaldavhat.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrmlaldavhat.s32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 int64_t
 foo1 (int64_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
 {
   return vrmlaldavhaq_p (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vrmlaldavhat.s32"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c
index 83ab68c001b..f3c8bfd121c 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vrmlaldavhaq_p_u32.c
@@ -1,21 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrmlaldavhat.u32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 uint64_t
 foo (uint64_t a, uint32x4_t b, uint32x4_t c, mve_pred16_t p)
 {
   return vrmlaldavhaq_p_u32 (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vrmlaldavhat.u32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrmlaldavhat.u32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
 uint64_t
 foo1 (uint64_t a, uint32x4_t b, uint32x4_t c, mve_pred16_t p)
 {
   return vrmlaldavhaq_p (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vrmlaldavhat.u32"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vrmlaldavhat.u32 (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:   @.*|)
+** ...
+*/
+uint64_t
+foo2 (uint32x4_t b, uint32x4_t c, mve_pred16_t p)
+{
+  return vrmlaldavhaq_p (1, b, c, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
-- 
2.25.1



[PATCH 27/35] arm: improve tests for vqaddq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqaddq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqaddq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqaddq_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vqaddq_m_n_s16.c   | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_n_s32.c   | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_n_s8.c| 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vqaddq_m_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vqaddq_m_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vqaddq_m_s16.c | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_s32.c | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_s8.c  | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_u16.c | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_u32.c | 26 ++--
 .../arm/mve/intrinsics/vqaddq_m_u8.c  | 26 ++--
 .../arm/mve/intrinsics/vqaddq_n_s16.c | 16 ++-
 .../arm/mve/intrinsics/vqaddq_n_s32.c | 16 ++-
 .../arm/mve/intrinsics/vqaddq_n_s8.c  | 16 ++-
 .../arm/mve/intrinsics/vqaddq_n_u16.c | 28 -
 .../arm/mve/intrinsics/vqaddq_n_u32.c | 28 -
 .../arm/mve/intrinsics/vqaddq_n_u8.c  | 28 -
 .../arm/mve/intrinsics/vqaddq_s16.c   | 16 ++-
 .../arm/mve/intrinsics/vqaddq_s32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vqaddq_s8.c | 16 ++-
 .../arm/mve/intrinsics/vqaddq_u16.c   | 16 ++-
 .../arm/mve/intrinsics/vqaddq_u32.c   | 16 ++-
 .../gcc.target/arm/mve/intrinsics/vqaddq_u8.c | 16 ++-
 24 files changed, 516 insertions(+), 72 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s16.c
index 65d3f770fe2..a659373d441 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqaddt.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqaddq_m_n_s16 (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqaddt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqaddt.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
 foo1 (int16x8_t inactive, int16x8_t a, int16_t b, mve_pred16_t p)
 {
   return vqaddq_m (inactive, a, b, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqaddt.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s32.c
index 4499a0eaa41..8ffc6a67762 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqaddq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intri

[PATCH 16/35] arm: Add integer vector overloading of vsubq_x intrinsic

2022-11-17 Thread Andrea Corallo via Gcc-patches
From: Stam Markianos-Wright 

Previously, the vsubq_x generic overload of the vsubq_x_* intrinsics
was defined only for float vector types.  Calls made through the
generic version with integer vector arguments therefore fell back to
the `__ARM_undef` failure state.
This patch adds the missing integer overloads.
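
The fallback described above can be illustrated with a small host-side
sketch.  It is not the arm_mve.h implementation: the demo_* types, the
DEMO_* macros and the returned strings are hypothetical stand-ins, and
only the dispatch shape (a _Generic keyed on a pointer-to-array type
built from two type ids, with a default arm playing the role of
`__ARM_undef`) mirrors the macro touched by this patch.

```c
#include <string.h>

/* Hypothetical stand-ins for two MVE vector types (not the real ones).  */
typedef struct { int lane[4]; } demo_int32x4_t;
typedef struct { float lane[4]; } demo_float32x4_t;

/* Small positive type ids, so they can be used as array bounds.  */
enum { DEMO_ID_OTHER = 1, DEMO_ID_INT32X4, DEMO_ID_FLOAT32X4 };

#define DEMO_TYPEID(x) _Generic ((x), \
    demo_int32x4_t: DEMO_ID_INT32X4, \
    demo_float32x4_t: DEMO_ID_FLOAT32X4, \
    default: DEMO_ID_OTHER)

/* "Before": only the float/float pair is listed, so any other pair of
   argument types selects the default arm.  */
#define DEMO_VSUBQ_X_FP_ONLY(a, b) _Generic ( \
    (int (*)[DEMO_TYPEID (a)][DEMO_TYPEID (b)]) 0, \
    int (*)[DEMO_ID_FLOAT32X4][DEMO_ID_FLOAT32X4]: "__arm_vsubq_x_f32", \
    default: "undef")

/* "After": the integer/integer pair is listed as well.  */
#define DEMO_VSUBQ_X_FULL(a, b) _Generic ( \
    (int (*)[DEMO_TYPEID (a)][DEMO_TYPEID (b)]) 0, \
    int (*)[DEMO_ID_INT32X4][DEMO_ID_INT32X4]: "__arm_vsubq_x_s32", \
    int (*)[DEMO_ID_FLOAT32X4][DEMO_ID_FLOAT32X4]: "__arm_vsubq_x_f32", \
    default: "undef")

/* Name of the arm selected for an integer call, before the change.  */
const char *demo_int_call_before (void)
{
  demo_int32x4_t a = { { 0 } }, b = { { 0 } };
  return DEMO_VSUBQ_X_FP_ONLY (a, b);
}

/* Name of the arm selected for an integer call, after the change.  */
const char *demo_int_call_after (void)
{
  demo_int32x4_t a = { { 0 } }, b = { { 0 } };
  return DEMO_VSUBQ_X_FULL (a, b);
}

/* Name of the arm selected for a float call, after the change.  */
const char *demo_float_call_after (void)
{
  demo_float32x4_t a = { { 0 } }, b = { { 0 } };
  return DEMO_VSUBQ_X_FULL (a, b);
}

/* Return 1 when all three selections are the expected ones.  */
int demo_self_check (void)
{
  return strcmp (demo_int_call_before (), "undef") == 0
	 && strcmp (demo_int_call_after (), "__arm_vsubq_x_s32") == 0
	 && strcmp (demo_float_call_after (), "__arm_vsubq_x_f32") == 0;
}
```

In the sketch the selected arm is reported as a string so that it can
be checked on any host.  The real macro calls the selected intrinsic
instead.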

gcc/ChangeLog:

* config/arm/arm_mve.h (__arm_vsubq_x FP): New overloads.
 (__arm_vsubq_x Integer): New.
---
 gcc/config/arm/arm_mve.h | 28 
 1 file changed, 28 insertions(+)

diff --git a/gcc/config/arm/arm_mve.h b/gcc/config/arm/arm_mve.h
index f6b42dc3fab..09167ec118e 100644
--- a/gcc/config/arm/arm_mve.h
+++ b/gcc/config/arm/arm_mve.h
@@ -38259,6 +38259,18 @@ extern void *__ARM_undef;
 #define __arm_vsubq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
   __typeof(p2) __p2 = (p2); \
   _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: 
__arm_vsubq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, 
int8x16_t), p3), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: 
__arm_vsubq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, 
int16x8_t), p3), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: 
__arm_vsubq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, 
int32x4_t), p3), \
+  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s8 
(__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s16 
(__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s32 
(__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: 
__arm_vsubq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, 
uint8x16_t), p3), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: 
__arm_vsubq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, 
uint16x8_t), p3), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_uint32x4_t]: 
__arm_vsubq_x_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce(__p2, 
uint32x4_t), p3), \
+  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_u8 
(__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_int_n]: 
__arm_vsubq_x_n_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce3(p2, 
int), p3), \
+  int (*)[__ARM_mve_type_uint32x4_t][__ARM_mve_type_int_n]: 
__arm_vsubq_x_n_u32 (__ARM_mve_coerce(__p1, uint32x4_t), __ARM_mve_coerce3(p2, 
int), p3), \
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_float16x8_t]: 
__arm_vsubq_x_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce(__p2, 
float16x8_t), p3), \
   int (*)[__ARM_mve_type_float32x4_t][__ARM_mve_type_float32x4_t]: 
__arm_vsubq_x_f32 (__ARM_mve_coerce(__p1, float32x4_t), __ARM_mve_coerce(__p2, 
float32x4_t), p3), \
   int (*)[__ARM_mve_type_float16x8_t][__ARM_mve_type_fp_n]: 
__arm_vsubq_x_n_f16 (__ARM_mve_coerce(__p1, float16x8_t), __ARM_mve_coerce2(p2, 
double), p3), \
@@ -40223,6 +40235,22 @@ extern void *__ARM_undef;
   int (*)[__ARM_mve_type_uint16_t_ptr]: __arm_vld4q_u16 (__ARM_mve_coerce1(p0, 
uint16_t *)), \
   int (*)[__ARM_mve_type_uint32_t_ptr]: __arm_vld4q_u32 (__ARM_mve_coerce1(p0, 
uint32_t *
 
+#define __arm_vsubq_x(p1,p2,p3) ({ __typeof(p1) __p1 = (p1); \
+  __typeof(p2) __p2 = (p2); \
+  _Generic( (int (*)[__ARM_mve_typeid(__p1)][__ARM_mve_typeid(__p2)])0, \
+  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int8x16_t]: 
__arm_vsubq_x_s8 (__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce(__p2, 
int8x16_t), p3), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int16x8_t]: 
__arm_vsubq_x_s16 (__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce(__p2, 
int16x8_t), p3), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int32x4_t]: 
__arm_vsubq_x_s32 (__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce(__p2, 
int32x4_t), p3), \
+  int (*)[__ARM_mve_type_int8x16_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s8 
(__ARM_mve_coerce(__p1, int8x16_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_int16x8_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s16 
(__ARM_mve_coerce(__p1, int16x8_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_int32x4_t][__ARM_mve_type_int_n]: __arm_vsubq_x_n_s32 
(__ARM_mve_coerce(__p1, int32x4_t), __ARM_mve_coerce3(p2, int), p3), \
+  int (*)[__ARM_mve_type_uint8x16_t][__ARM_mve_type_uint8x16_t]: 
__arm_vsubq_x_u8 (__ARM_mve_coerce(__p1, uint8x16_t), __ARM_mve_coerce(__p2, 
uint8x16_t), p3), \
+  int (*)[__ARM_mve_type_uint16x8_t][__ARM_mve_type_uint16x8_t]: 
__arm_vsubq_x_u16 (__ARM_mve_coerce(__p1, uint16x8_t), __ARM_mve_coerce(__p2, 
uint16x8_t), p3), \
+  i

[PATCH 11/35] arm: improve tests for vabdq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vabdq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vabdq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_m_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabdq_x_u8.c: Likewise.
---
 .../gcc.target/arm/mve/intrinsics/vabdq_f16.c | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_f32.c | 16 ++--
 .../arm/mve/intrinsics/vabdq_m_f16.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_f32.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_s16.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_s32.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_s8.c   | 26 ---
 .../arm/mve/intrinsics/vabdq_m_u16.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_u32.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_m_u8.c   | 26 ---
 .../gcc.target/arm/mve/intrinsics/vabdq_s16.c | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_s32.c | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_s8.c  | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_u16.c | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_u32.c | 16 ++--
 .../gcc.target/arm/mve/intrinsics/vabdq_u8.c  | 16 ++--
 .../arm/mve/intrinsics/vabdq_x_f16.c  | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_f32.c  | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_s16.c  | 26 ---
 .../arm/mve/intrinsics/vabdq_x_s32.c  | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_s8.c   | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_u16.c  | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_u32.c  | 25 +++---
 .../arm/mve/intrinsics/vabdq_x_u8.c   | 25 +++---
 24 files changed, 464 insertions(+), 73 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f16.c
index b55e826e4b6..f379b25c49e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f16.c
@@ -1,21 +1,33 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vabd.f16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a, float16x8_t b)
 {
   return vabdq_f16 (a, b);
 }
 
-/* { dg-final { scan-assembler "vabd.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vabd.f16 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float16x8_t
 foo1 (float16x8_t a, float16x8_t b)
 {
   return vabdq (a, b);
 }
 
-/* { dg-final { scan-assembler "vabd.f16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f32.c
index f1a95b14e03..3ba808e0b4d 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabdq_f32.c
@@ -1,21 +1,33 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vabd.f32 q[0-9]+, q[0-9]+, q[0-9]+(?: @.*|)

[PATCH 12/35] arm: improve tests and fix vabsq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vabsq_f): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vabsq_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vabsq_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_m_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_m_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_m_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_m_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_m_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_x_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_x_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_x_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_x_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vabsq_x_s8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_f16.c | 22 +++-
 .../gcc.target/arm/mve/intrinsics/vabsq_f32.c | 22 +++-
 .../arm/mve/intrinsics/vabsq_m_f16.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_m_f32.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_m_s16.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_m_s32.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_m_s8.c   | 25 ---
 .../gcc.target/arm/mve/intrinsics/vabsq_s16.c | 20 ---
 .../gcc.target/arm/mve/intrinsics/vabsq_s32.c | 20 ---
 .../gcc.target/arm/mve/intrinsics/vabsq_s8.c  | 16 ++--
 .../arm/mve/intrinsics/vabsq_x_f16.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_x_f32.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_x_s16.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_x_s32.c  | 25 ---
 .../arm/mve/intrinsics/vabsq_x_s8.c   | 25 ---
 16 files changed, 309 insertions(+), 43 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 3330a220aea..bc4e2f2ac21 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -279,7 +279,7 @@ (define_insn "mve_vabsq_f"
(abs:MVE_0 (match_operand:MVE_0 1 "s_register_operand" "w")))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vabs.f%#  %q0, %q1"
+  "vabs.f%#\t%q0, %q1"
   [(set_attr "type" "mve_move")
 ])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f16.c
index 08e141baedc..f29ada8c058 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f16.c
@@ -1,13 +1,33 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vabs.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16x8_t a)
 {
   return vabsq_f16 (a);
 }
 
-/* { dg-final { scan-assembler "vabs.f16"  }  } */
+
+/*
+**foo1:
+** ...
+** vabs.f16 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
+float16x8_t
+foo1 (float16x8_t a)
+{
+  return vabsq (a);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f32.c
index 3614a44fbdc..cc24744fb26 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_f32.c
@@ -1,13 +1,33 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vabs.f32 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32x4_t a)
 {
   return vabsq_f32 (a);
 }
 
-/* { dg-final { scan-assembler "vabs.f32"  }  } */
+
+/*
+**foo1:
+** ...
+** vabs.f32 q[0-9]+, q[0-9]+(?: @.*|)
+** ...
+*/
+float32x4_t
+foo1 (float32x4_t a)
+{
+  return vabsq (a);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_m_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_m_f16.c
index 30c14a151af..21cf284d045 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_m_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabsq_m_f16.c
@@ -1,22 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /

[PATCH 20/35] arm: improve tests for vfmasq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c: Likewise.
---
 .../arm/mve/intrinsics/vfmasq_m_n_f16.c   | 50 ---
 .../arm/mve/intrinsics/vfmasq_m_n_f32.c   | 50 ---
 2 files changed, 84 insertions(+), 16 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c
index 06d2d114e46..03b376c9bbe 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f16.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 float16x8_t
-foo (float16x8_t a, float16x8_t b, float16_t c, mve_pred16_t p)
+foo (float16x8_t m1, float16x8_t m2, float16_t add, mve_pred16_t p)
 {
-  return vfmasq_m_n_f16 (a, b, c, p);
+  return vfmasq_m_n_f16 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vfmast.f16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 float16x8_t
-foo1 (float16x8_t a, float16x8_t b, float16_t c, mve_pred16_t p)
+foo1 (float16x8_t m1, float16x8_t m2, float16_t add, mve_pred16_t p)
 {
-  return vfmasq_m (a, b, c, p);
+  return vfmasq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vfmast.f16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
+float16x8_t
+foo2 (float16x8_t m1, float16x8_t m2, mve_pred16_t p)
+{
+  return vfmasq_m (m1, m2, 1.1, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c
index bf1773d0eeb..ecf30ba9826 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vfmasq_m_n_f32.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 float32x4_t
-foo (float32x4_t a, float32x4_t b, float32_t c, mve_pred16_t p)
+foo (float32x4_t m1, float32x4_t m2, float32_t add, mve_pred16_t p)
 {
-  return vfmasq_m_n_f32 (a, b, c, p);
+  return vfmasq_m_n_f32 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vfmast.f32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 float32x4_t
-foo1 (float32x4_t a, float32x4_t b, float32_t c, mve_pred16_t p)
+foo1 (float32x4_t m1, float32x4_t m2, float32_t add, mve_pred16_t p)
 {
-  return vfmasq_m (a, b, c, p);
+  return vfmasq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vfmast.f32"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vfmast.f32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
+float32x4_t
+foo2 (float32x4_t m1, float32x4_t m2, mve_pred16_t p)
+{
+  return vfmasq_m (m1, m2, 1.1, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
-- 
2.25.1



[PATCH 31/35] arm: improve tests for vqrdmlashq_m*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c: Likewise.
---
 .../arm/mve/intrinsics/vqrdmlashq_m_n_s16.c   | 34 ++-
 .../arm/mve/intrinsics/vqrdmlashq_m_n_s32.c   | 34 ++-
 .../arm/mve/intrinsics/vqrdmlashq_m_n_s8.c| 34 ++-
 3 files changed, 78 insertions(+), 24 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c
index 35b9618ca47..da4d724bb46 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s16.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlasht.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo (int16x8_t m1, int16x8_t m2, int16_t add, mve_pred16_t p)
 {
-  return vqrdmlashq_m_n_s16 (a, b, c, p);
+  return vqrdmlashq_m_n_s16 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlasht.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlasht.s16  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int16x8_t
-foo1 (int16x8_t a, int16x8_t b, int16_t c, mve_pred16_t p)
+foo1 (int16x8_t m1, int16x8_t m2, int16_t add, mve_pred16_t p)
 {
-  return vqrdmlashq_m (a, b, c, p);
+  return vqrdmlashq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlasht.s16"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c
index 8517835eb61..2430f1cb102 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s32.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlasht.s32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo (int32x4_t m1, int32x4_t m2, int32_t add, mve_pred16_t p)
 {
-  return vqrdmlashq_m_n_s32 (a, b, c, p);
+  return vqrdmlashq_m_n_s32 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlasht.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlasht.s32  q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int32x4_t
-foo1 (int32x4_t a, int32x4_t b, int32_t c, mve_pred16_t p)
+foo1 (int32x4_t m1, int32x4_t m2, int32_t add, mve_pred16_t p)
 {
-  return vqrdmlashq_m (a, b, c, p);
+  return vqrdmlashq_m (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vqrdmlasht.s32"  }  } */
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c
index e42cc63fa74..30915b24e5e 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vqrdmlashq_m_n_s8.c
@@ -1,23 +1,41 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vqrdmlasht.s8   q[0-9]+, q[0-9]+, (?:ip|fp|r[0-9]+)(?:  @.*|)
+** ...
+*/
 int8x16_t
-foo (int8x16_t a, int8x16_t b, int8_t c, mve_pred16_t p)
+foo (int8x16_t m1, int8x16_t m2, int8_t add, mve_pred16_t p)
 {
-  return vqrdmlashq_m_n_s8 (a, b, c, p);
+  return vqrdmlashq_m_n_s8 (m1, m2, add, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-a

[PATCH 00/35] arm: rework MVE testsuite and rework backend where necessary (1st chunk)

2022-11-17 Thread Andrea Corallo via Gcc-patches
Hi all,

this is the first patch series improving the current MVE
implementation and testsuite.  Its goals are:

- Complete intrinsic implementation and coverage (the list of intrinsics is
  specified by [1])
- Verifying all instructions supposedly emitted by each intrinsic
- Verifying register usage
- Fixing the current scan-assembler directives so that they actually match
  the intended mnemonics
- Verifying no external calls are emitted
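
As a rough illustration of why a per-line body pattern checks more
than a mnemonic-only scan, the sketch below compares the two on sample
assembly lines.  It runs on the host with POSIX regcomp/regexec, not
with the DejaGnu machinery.  The helper name, the sample lines and the
translation of the pattern to POSIX extended syntax (capturing groups,
explicit anchors and tabs) are assumptions made for the example.

```c
#include <regex.h>
#include <stddef.h>

/* Mnemonic-only scan, in the style of the removed scan-assembler lines.  */
#define DEMO_OLD_SCAN "vqaddt.s16"

/* Whole-line pattern in the style of the new function-body checks,
   rewritten for POSIX extended regular expressions.  */
#define DEMO_BODY_LINE \
  "^\tvqaddt.s16\tq[0-9]+, q[0-9]+, (ip|fp|r[0-9]+)(\t@.*)?$"

/* Hypothetical assembly lines: scalar last operand vs. vector last operand.  */
#define DEMO_ASM_SCALAR "\tvqaddt.s16\tq0, q1, r0"
#define DEMO_ASM_VECTOR "\tvqaddt.s16\tq0, q1, q2"

/* Return 1 if LINE matches PATTERN, 0 if it does not, and -1 if
   PATTERN fails to compile.  */
int demo_line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}
```

With these inputs the mnemonic-only scan accepts both sample lines,
while the whole-line pattern accepts only the one whose last operand
is a core register.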

This series fixes the backend where necessary.

Best Regards

  Andrea

Andrea Corallo (31):
  arm: improve vcreateq* tests
  arm: fix 'vmsr' spacing and register capitalization
  arm: improve tests and fix vddupq*
  arm: improve tests and fix vdwdupq*
  arm: improve vidupq* tests
  arm: improve tests and fix vdupq*
  arm: improve tests and fix vcmp*
  arm: improve tests for vmin*
  arm: improve tests for vmax*
  arm: improve tests for vabavq*
  arm: improve tests for vabdq*
  arm: improve tests and fix vabsq*
  arm: improve tests and fix vadd*
  arm: improve tests for vmulq*
  arm: improve tests and fix vsubq*
  arm: improve tests for vfmasq_m*
  arm: improve tests for vhaddq_m*
  arm: improve tests for vhsubq_m*
  arm: improve tests for viwdupq*
  arm: improve tests for vmladavaq*
  arm: improve tests and fix vmlaldavaxq*
  arm: improve tests for vmlasq*
  arm: improve tests for vqaddq_m*
  arm: improve tests for vqdmlahq_m*
  arm: improve tests for vqdmul*
  arm: improve tests for vqrdmlahq*
  arm: improve tests for vqrdmlashq_m*
  arm: improve tests for vqsubq*
  arm: improve tests and fix vrmlaldavhaq*
  arm: improve tests for vrshlq*
  arm: improve tests for vsetq_lane*

Stam Markianos-Wright (4):
  arm: further fix overloading of MVE vaddq[_m]_n intrinsic
  arm: propagate fixed overloading of MVE intrinsic scalar parameters
  arm: Explicitly specify other float types for _Generic overloading
[PR107515]
  arm: Add integer vector overloading of vsubq_x intrinsic

 gcc/config/arm/arm_mve.h  | 1232 +
 gcc/config/arm/mve.md |   48 +-
 gcc/config/arm/vfp.md |8 +-
 .../arm/mve/intrinsics/vabavq_p_s16.c |   40 +-
 .../arm/mve/intrinsics/vabavq_p_s32.c |   40 +-
 .../arm/mve/intrinsics/vabavq_p_s8.c  |   40 +-
 .../arm/mve/intrinsics/vabavq_p_u16.c |   40 +-
 .../arm/mve/intrinsics/vabavq_p_u32.c |   40 +-
 .../arm/mve/intrinsics/vabavq_p_u8.c  |   40 +-
 .../arm/mve/intrinsics/vabavq_s16.c   |   28 +-
 .../arm/mve/intrinsics/vabavq_s32.c   |   28 +-
 .../gcc.target/arm/mve/intrinsics/vabavq_s8.c |   28 +-
 .../arm/mve/intrinsics/vabavq_u16.c   |   28 +-
 .../arm/mve/intrinsics/vabavq_u32.c   |   28 +-
 .../gcc.target/arm/mve/intrinsics/vabavq_u8.c |   28 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_f16.c |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_f32.c |   16 +-
 .../arm/mve/intrinsics/vabdq_m_f16.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_f32.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_s16.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_s32.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_s8.c   |   26 +-
 .../arm/mve/intrinsics/vabdq_m_u16.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_u32.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_m_u8.c   |   26 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_s16.c |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_s32.c |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_s8.c  |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_u16.c |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_u32.c |   16 +-
 .../gcc.target/arm/mve/intrinsics/vabdq_u8.c  |   16 +-
 .../arm/mve/intrinsics/vabdq_x_f16.c  |   25 +-
 .../arm/mve/intrinsics/vabdq_x_f32.c  |   25 +-
 .../arm/mve/intrinsics/vabdq_x_s16.c  |   26 +-
 .../arm/mve/intrinsics/vabdq_x_s32.c  |   25 +-
 .../arm/mve/intrinsics/vabdq_x_s8.c   |   25 +-
 .../arm/mve/intrinsics/vabdq_x_u16.c  |   25 +-
 .../arm/mve/intrinsics/vabdq_x_u32.c  |   25 +-
 .../arm/mve/intrinsics/vabdq_x_u8.c   |   25 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_f16.c |   22 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_f32.c |   22 +-
 .../arm/mve/intrinsics/vabsq_m_f16.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_m_f32.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_m_s16.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_m_s32.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_m_s8.c   |   25 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_s16.c |   20 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_s32.c |   20 +-
 .../gcc.target/arm/mve/intrinsics/vabsq_s8.c  |   16 +-
 .../arm/mve/intrinsics/vabsq_x_f16.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_x_f32.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_x_s16.c  |   25 +-
 .../arm/mve/intrinsics/vabsq_x_s32.c  |   25 +-
 .../arm/mve/intrinsics/va

[PATCH 35/35] arm: improve tests for vsetq_lane*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u64.c: Likewise.
* gcc.target/arm/mve/intrinsics/vsetq_lane_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vsetq_lane_f16.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_f32.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_s16.c   | 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_s32.c   | 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_s64.c   | 27 ++---
 .../arm/mve/intrinsics/vsetq_lane_s8.c| 24 ++--
 .../arm/mve/intrinsics/vsetq_lane_u16.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_u32.c   | 36 +++--
 .../arm/mve/intrinsics/vsetq_lane_u64.c   | 39 ---
 .../arm/mve/intrinsics/vsetq_lane_u8.c| 36 +++--
 10 files changed, 284 insertions(+), 34 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c 
b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
index e03e9620528..b5c9f4d5eb8 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f16.c
@@ -1,15 +1,45 @@
-/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov.16 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
 float16x8_t
 foo (float16_t a, float16x8_t b)
 {
-return vsetq_lane_f16 (a, b, 0);
+  return vsetq_lane_f16 (a, b, 1);
 }
 
-/* { dg-final { scan-assembler "vmov.16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmov.16 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+float16x8_t
+foo1 (float16_t a, float16x8_t b)
+{
+  return vsetq_lane (a, b, 1);
+}
+
+/*
+**foo2:
+** ...
+** vmov.16 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+float16x8_t
+foo2 (float16x8_t b)
+{
+  return vsetq_lane (1.1, b, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
index 2b9f1a7e627..211083ce5d4 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_f32.c
@@ -1,15 +1,45 @@
-/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov.32 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
 float32x4_t
 foo (float32_t a, float32x4_t b)
 {
-return vsetq_lane_f32 (a, b, 0);
+  return vsetq_lane_f32 (a, b, 1);
 }
 
-/* { dg-final { scan-assembler "vmov.32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmov.32 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+float32x4_t
+foo1 (float32_t a, float32x4_t b)
+{
+  return vsetq_lane (a, b, 1);
+}
+
+/*
+**foo2:
+** ...
+** vmov.32 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
+float32x4_t
+foo2 (float32x4_t b)
+{
+  return vsetq_lane (1.1, b, 1);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
index 92ad0dd16a8..9cdaeae1e74 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vsetq_lane_s16.c
@@ -1,15 +1,33 @@
-/* { dg-skip-if "Incompatible float ABI" { *-*-* } { "-mfloat-abi=soft" } {""} } */
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmov.16 q[0-9]+\[[0-9]+\], (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+*/
 int16x8_t
 foo (int16_t a, int16x8_t b)
 {
-return vsetq_lane_s

[PATCH 06/35] arm: improve tests and fix vdupq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vdupq_n_f)
(mve_vdupq_n_, mve_vdupq_m_n_)
(mve_vdupq_m_n_f): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vdupq_m_n_f16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_f16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_f32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_s16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_s32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_s8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdupq_x_n_u8.c: Likewise.
---
 gcc/config/arm/mve.md |  8 ++--
 .../arm/mve/intrinsics/vdupq_m_n_f16.c| 41 +--
 .../arm/mve/intrinsics/vdupq_m_n_f32.c| 41 +--
 .../arm/mve/intrinsics/vdupq_m_n_s16.c| 25 +--
 .../arm/mve/intrinsics/vdupq_m_n_s32.c| 25 +--
 .../arm/mve/intrinsics/vdupq_m_n_s8.c | 25 +--
 .../arm/mve/intrinsics/vdupq_m_n_u16.c| 41 +--
 .../arm/mve/intrinsics/vdupq_m_n_u32.c| 41 +--
 .../arm/mve/intrinsics/vdupq_m_n_u8.c | 41 +--
 .../arm/mve/intrinsics/vdupq_n_f16.c  | 21 +-
 .../arm/mve/intrinsics/vdupq_n_f32.c  | 21 +-
 .../arm/mve/intrinsics/vdupq_n_s16.c  | 13 --
 .../arm/mve/intrinsics/vdupq_n_s32.c  | 13 --
 .../arm/mve/intrinsics/vdupq_n_s8.c   |  9 +++-
 .../arm/mve/intrinsics/vdupq_n_u16.c  | 23 ++-
 .../arm/mve/intrinsics/vdupq_n_u32.c  | 23 ++-
 .../arm/mve/intrinsics/vdupq_n_u8.c   | 23 ++-
 .../arm/mve/intrinsics/vdupq_x_n_f16.c| 30 +-
 .../arm/mve/intrinsics/vdupq_x_n_f32.c| 30 +-
 .../arm/mve/intrinsics/vdupq_x_n_s16.c| 14 ++-
 .../arm/mve/intrinsics/vdupq_x_n_s32.c| 14 ++-
 .../arm/mve/intrinsics/vdupq_x_n_s8.c | 14 ++-
 .../arm/mve/intrinsics/vdupq_x_n_u16.c| 30 +-
 .../arm/mve/intrinsics/vdupq_x_n_u32.c| 30 +-
 .../arm/mve/intrinsics/vdupq_x_n_u8.c | 30 +-
 25 files changed, 567 insertions(+), 59 deletions(-)
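
Why the spacing in the output templates matters for the new tests, as a
hedged sketch: the check-function-bodies patterns added in this series appear
to expect the mnemonic and its operands to be separated by a single tab, so a
template that emits spaces would not match.  The example below reproduces
that effect with POSIX regcomp/regexec rather than the Tcl regexps DejaGnu
uses.  The pattern is simplified (POSIX EREs have no (?:...) groups, so a
capturing group is used), and the names line_matches and vdup_pattern are
made up for illustration.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Return 1 if LINE matches the POSIX extended regular expression
   PATTERN, 0 if it does not or if PATTERN fails to compile.  */
static int line_matches (const char *pattern, const char *line)
{
  regex_t re;
  int rc;

  assert (pattern != NULL && line != NULL);
  if (regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
    return 0;
  rc = regexec (&re, line, 0, NULL, 0);
  regfree (&re);
  return rc == 0;
}

/* Hypothetical expected line: mnemonic, exactly one tab, then the
   destination q register and a core register.  */
static const char vdup_pattern[] =
  "^vdup\\.32\tq[0-9]+, (ip|fp|r[0-9]+)$";
```

Under these assumptions, line_matches (vdup_pattern, "vdup.32\tq0, r3")
succeeds, while the same instruction written with three spaces after the
mnemonic is rejected.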

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 58ffe03c499..6d5270281ec 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -266,7 +266,7 @@ (define_insn "mve_vdupq_n_f"
 VDUPQ_N_F))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vdup.%#   %q0, %1"
+  "vdup.%#\t%q0, %1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -435,7 +435,7 @@ (define_insn "mve_vdupq_n_"
 VDUPQ_N))
   ]
   "TARGET_HAVE_MVE"
-  "vdup.%#   %q0, %1"
+  "vdup.%#\t%q0, %1"
   [(set_attr "type" "mve_move")
 ])
 
@@ -3046,7 +3046,7 @@ (define_insn "mve_vdupq_m_n_"
 VDUPQ_M_N))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;vdupt.%#   %q0, %2"
+  "vpst\;vdupt.%#\t%q0, %2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
@@ -3991,7 +3991,7 @@ (define_insn "mve_vdupq_m_n_f"
 VDUPQ_M_N_F))
   ]
   "TARGET_HAVE_MVE && TARGET_HAVE_MVE_FLOAT"
-  "vpst\;vdupt.%#   %q0, %2"
+  "vpst\;vdupt.%#\t%q0, %2"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_f16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_f16.c
index 0b749be3527..bfa471bcb31 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_f16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdupq_m_n_f16.c
@@ -1,22 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_fp_ok } */
 /* { dg-add-options arm_v8_1m_mve_fp } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+**

[PATCH 10/35] arm: improve tests for vabavq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vabavq_p_s16.c:
* gcc.target/arm/mve/intrinsics/vabavq_p_s32.c:
* gcc.target/arm/mve/intrinsics/vabavq_p_s8.c:
* gcc.target/arm/mve/intrinsics/vabavq_p_u16.c:
* gcc.target/arm/mve/intrinsics/vabavq_p_u32.c:
* gcc.target/arm/mve/intrinsics/vabavq_p_u8.c:
* gcc.target/arm/mve/intrinsics/vabavq_s16.c:
* gcc.target/arm/mve/intrinsics/vabavq_s32.c:
* gcc.target/arm/mve/intrinsics/vabavq_s8.c:
* gcc.target/arm/mve/intrinsics/vabavq_u16.c:
* gcc.target/arm/mve/intrinsics/vabavq_u32.c:
* gcc.target/arm/mve/intrinsics/vabavq_u8.c:
---
 .../arm/mve/intrinsics/vabavq_p_s16.c | 40 ++-
 .../arm/mve/intrinsics/vabavq_p_s32.c | 40 ++-
 .../arm/mve/intrinsics/vabavq_p_s8.c  | 40 ++-
 .../arm/mve/intrinsics/vabavq_p_u16.c | 40 ++-
 .../arm/mve/intrinsics/vabavq_p_u32.c | 40 ++-
 .../arm/mve/intrinsics/vabavq_p_u8.c  | 40 ++-
 .../arm/mve/intrinsics/vabavq_s16.c   | 28 -
 .../arm/mve/intrinsics/vabavq_s32.c   | 28 -
 .../gcc.target/arm/mve/intrinsics/vabavq_s8.c | 28 -
 .../arm/mve/intrinsics/vabavq_u16.c   | 28 -
 .../arm/mve/intrinsics/vabavq_u32.c   | 28 -
 .../gcc.target/arm/mve/intrinsics/vabavq_u8.c | 28 -
 12 files changed, 384 insertions(+), 24 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s16.c
index 78ac801fa3c..843d022c418 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s16.c
@@ -1,21 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s16  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 uint32_t
 foo (uint32_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
 {
   return vabavq_p_s16 (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vabavt.s16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s16  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 uint32_t
 foo1 (uint32_t a, int16x8_t b, int16x8_t c, mve_pred16_t p)
 {
   return vabavq_p (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vabavt.s16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s16  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
+uint32_t
+foo2 (int16x8_t b, int16x8_t c, mve_pred16_t p)
+{
+  return vabavq_p (1, b, c, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s32.c
index af4e30b6127..6ed9b9ac1c4 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vabavq_p_s32.c
@@ -1,21 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s32  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 uint32_t
 foo (uint32_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
 {
   return vabavq_p_s32 (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vabavt.s32"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s32  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
 uint32_t
 foo1 (uint32_t a, int32x4_t b, int32x4_t c, mve_pred16_t p)
 {
   return vabavq_p (a, b, c, p);
 }
 
-/* { dg-final { scan-assembler "vabavt.s32"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vabavt.s32  (?:ip|fp|r[0-9]+), q[0-9]+, q[0-9]+(?:  @.*|)
+** ...
+*/
+uint32_t
+foo2 (int32x4_t b, int32x4_t c, mve_pred16_t p)
+{
+  return vabavq_p (1, b, c, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrin

[PATCH 05/35] arm: improve vidupq* tests

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c: Improve tests.
* gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vidupq_x_wb_u8.c: Likewise.
---
 .../arm/mve/intrinsics/vidupq_m_n_u16.c   | 46 +---
 .../arm/mve/intrinsics/vidupq_m_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vidupq_m_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vidupq_m_wb_u16.c  | 46 +---
 .../arm/mve/intrinsics/vidupq_m_wb_u32.c  | 42 +--
 .../arm/mve/intrinsics/vidupq_m_wb_u8.c   | 42 +--
 .../arm/mve/intrinsics/vidupq_n_u16.c | 32 ++--
 .../arm/mve/intrinsics/vidupq_n_u32.c | 28 +-
 .../arm/mve/intrinsics/vidupq_n_u8.c  | 28 +-
 .../arm/mve/intrinsics/vidupq_wb_u16.c| 32 ++--
 .../arm/mve/intrinsics/vidupq_wb_u32.c| 28 +-
 .../arm/mve/intrinsics/vidupq_wb_u8.c | 28 +-
 .../arm/mve/intrinsics/vidupq_x_n_u16.c   | 46 +---
 .../arm/mve/intrinsics/vidupq_x_n_u32.c   | 42 +--
 .../arm/mve/intrinsics/vidupq_x_n_u8.c| 42 +--
 .../arm/mve/intrinsics/vidupq_x_wb_u16.c  | 52 +++
 .../arm/mve/intrinsics/vidupq_x_wb_u32.c  | 52 +++
 .../arm/mve/intrinsics/vidupq_x_wb_u8.c   | 52 +++
 18 files changed, 634 insertions(+), 88 deletions(-)

diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c
index 822d41197e6..b4ee7af36e3 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u16.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vidupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?:  @.*|)
+** ...
+*/
 uint16x8_t
 foo (uint16x8_t inactive, uint32_t a, mve_pred16_t p)
 {
-  return vidupq_m_n_u16 (inactive, a, 4, p);
+  return vidupq_m_n_u16 (inactive, a, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vidupt.u16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vidupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?:  @.*|)
+** ...
+*/
 uint16x8_t
 foo1 (uint16x8_t inactive, uint32_t a, mve_pred16_t p)
 {
-  return vidupq_m (inactive, a, 4, p);
+  return vidupq_m (inactive, a, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vidupt.u16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vidupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?:  @.*|)
+** ...
+*/
+uint16x8_t
+foo2 (uint16x8_t inactive, mve_pred16_t p)
+{
+  return vidupq_m (inactive, 1, 1, p);
+}
+
+/* { dg-final { scan-assembler-not "__ARM_undef" } } */
\ No newline at end of file
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c
index c01826e15dc..b13a7a80dcb 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vidupq_m_n_u32.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsrp0, (?:ip

[PATCH 04/35] arm: improve tests and fix vdwdupq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vdwdupq_m_wb_u_insn): Fix spacing.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vdwdupq_x_wb_u8.c: Likewise.
---
 gcc/config/arm/mve.md |  2 +-
 .../arm/mve/intrinsics/vdwdupq_m_n_u16.c  | 44 ++--
 .../arm/mve/intrinsics/vdwdupq_m_n_u32.c  | 46 ++---
 .../arm/mve/intrinsics/vdwdupq_m_n_u8.c   | 46 ++---
 .../arm/mve/intrinsics/vdwdupq_m_wb_u16.c | 50 ---
 .../arm/mve/intrinsics/vdwdupq_m_wb_u32.c | 48 +++---
 .../arm/mve/intrinsics/vdwdupq_m_wb_u8.c  | 50 ---
 .../arm/mve/intrinsics/vdwdupq_n_u16.c| 32 ++--
 .../arm/mve/intrinsics/vdwdupq_n_u32.c| 32 ++--
 .../arm/mve/intrinsics/vdwdupq_n_u8.c | 32 ++--
 .../arm/mve/intrinsics/vdwdupq_wb_u16.c   | 32 ++--
 .../arm/mve/intrinsics/vdwdupq_wb_u32.c   | 32 ++--
 .../arm/mve/intrinsics/vdwdupq_wb_u8.c| 32 ++--
 .../arm/mve/intrinsics/vdwdupq_x_n_u16.c  | 42 ++--
 .../arm/mve/intrinsics/vdwdupq_x_n_u32.c  | 46 ++---
 .../arm/mve/intrinsics/vdwdupq_x_n_u8.c   | 46 ++---
 .../arm/mve/intrinsics/vdwdupq_x_wb_u16.c | 50 ---
 .../arm/mve/intrinsics/vdwdupq_x_wb_u32.c | 46 ++---
 .../arm/mve/intrinsics/vdwdupq_x_wb_u8.c  | 50 ---
 19 files changed, 655 insertions(+), 103 deletions(-)

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 1215f845388..58ffe03c499 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -9195,7 +9195,7 @@ (define_insn "mve_vdwdupq_m_wb_u_insn"
 VDWDUPQ_M))
   ]
   "TARGET_HAVE_MVE"
-  "vpst\;\tvdwdupt.u%#\t%q2, %3, %R4, %5"
+  "vpst\;vdwdupt.u%#\t%q2, %3, %R4, %5"
   [(set_attr "type" "mve_move")
(set_attr "length""8")])
 
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c
index 5303fd7d361..8f53f5ef0cb 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vdwdupq_m_n_u16.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
 uint16x8_t
 foo (uint16x8_t inactive, uint32_t a, uint32_t b, mve_pred16_t p)
 {
-  return vdwdupq_m (inactive, a, b, 1, p);
+  return vdwdupq_m_n_u16 (inactive, a, b, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vdwdupt.u16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
 uint16x8_t
 foo1 (uint16x8_t inactive, uint32_t a, uint32_t b, mve_pred16_t p)
 {
   return vdwdupq_m (inactive, a, b, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vdwdupt.u16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vdwdupt.u16 q[0-9]+, (?:ip|fp|r[0-9]+), (?:ip|fp|r[0-9]+), #[0-9]+(?:   @.*|)
+** ...
+*/
+uint16x8_t
+foo2 (uint16x8_t inactive, mve_pred16_t p)
+{
+  return vdwdupq_m (inactive, 1, 1, 1, p);

[PATCH 03/35] arm: improve tests and fix vddupq*

2022-11-17 Thread Andrea Corallo via Gcc-patches
gcc/ChangeLog:

* config/arm/mve.md (mve_vddupq_u_insn): Fix 'vddup.u'
spacing.
(mve_vddupq_m_wb_u_insn): Likewise.

gcc/testsuite/ChangeLog:

* gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c: Improve test.
* gcc.target/arm/mve/intrinsics/vddupq_m_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_m_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_wb_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_n_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_n_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_n_u8.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u16.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u32.c: Likewise.
* gcc.target/arm/mve/intrinsics/vddupq_x_wb_u8.c: Likewise.
---
 gcc/config/arm/mve.md |  4 +-
 .../arm/mve/intrinsics/vddupq_m_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vddupq_m_n_u32.c   | 46 +---
 .../arm/mve/intrinsics/vddupq_m_n_u8.c| 46 +---
 .../arm/mve/intrinsics/vddupq_m_wb_u16.c  | 42 +--
 .../arm/mve/intrinsics/vddupq_m_wb_u32.c  | 46 +---
 .../arm/mve/intrinsics/vddupq_m_wb_u8.c   | 46 +---
 .../arm/mve/intrinsics/vddupq_n_u16.c | 32 ++--
 .../arm/mve/intrinsics/vddupq_n_u32.c | 28 +-
 .../arm/mve/intrinsics/vddupq_n_u8.c  | 28 +-
 .../arm/mve/intrinsics/vddupq_wb_u16.c| 32 ++--
 .../arm/mve/intrinsics/vddupq_wb_u32.c| 28 +-
 .../arm/mve/intrinsics/vddupq_wb_u8.c | 28 +-
 .../arm/mve/intrinsics/vddupq_x_n_u16.c   | 42 +--
 .../arm/mve/intrinsics/vddupq_x_n_u32.c   | 46 +---
 .../arm/mve/intrinsics/vddupq_x_n_u8.c| 46 +---
 .../arm/mve/intrinsics/vddupq_x_wb_u16.c  | 52 +++
 .../arm/mve/intrinsics/vddupq_x_wb_u32.c  | 52 +++
 .../arm/mve/intrinsics/vddupq_x_wb_u8.c   | 52 +++
 19 files changed, 642 insertions(+), 96 deletions(-)
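
On the "vpst\;\t..." change: when the machine description is read, the \;
escape in an output template is turned into a newline followed by a tab, so
an extra \t after it yields two tabs in front of the predicated instruction.
The sketch below mimics that expansion on a plain C string to make the
difference visible.  expand_template is a hypothetical helper written for
this note, not a GCC function, and it handles only the \; and \t escapes.

```c
#include <assert.h>
#include <stddef.h>

/* Expand the two-character escapes "\;" (to newline + tab) and "\t"
   (to a tab) from TMPL into OUT, copying everything else unchanged.
   Stops early if OUT is too small; always NUL-terminates.  Returns the
   number of characters written, not counting the terminator.  */
static size_t expand_template (const char *tmpl, char *out, size_t out_size)
{
  size_t n = 0;

  assert (tmpl != NULL && out != NULL && out_size > 0);
  while (*tmpl != '\0' && n + 2 < out_size)
    {
      if (tmpl[0] == '\\' && tmpl[1] == ';')
        {
          out[n++] = '\n';
          out[n++] = '\t';
          tmpl += 2;
        }
      else if (tmpl[0] == '\\' && tmpl[1] == 't')
        {
          out[n++] = '\t';
          tmpl += 2;
        }
      else
        out[n++] = *tmpl++;
    }
  out[n] = '\0';
  return n;
}
```

Feeding it the old template text, written as the C literal
"vpst\\;\\tvddupt.u%#\\t%q0, %2, %4", gives a second line that starts with
two tabs; the new text "vpst\\;vddupt.u%#\\t%q0, %2, %4" gives one.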

diff --git a/gcc/config/arm/mve.md b/gcc/config/arm/mve.md
index 62186f124da..1215f845388 100644
--- a/gcc/config/arm/mve.md
+++ b/gcc/config/arm/mve.md
@@ -9043,7 +9043,7 @@ (define_insn "mve_vddupq_u_insn"
(minus:SI (match_dup 2)
 (match_operand:SI 4 "immediate_operand" "i")))]
  "TARGET_HAVE_MVE"
- "vddup.u%#  %q0, %1, %3")
+ "vddup.u%#\t%q0, %1, %3")
 
 ;;
 ;; [vddupq_m_n_u])
@@ -9079,7 +9079,7 @@ (define_insn "mve_vddupq_m_wb_u_insn"
(minus:SI (match_dup 3)
 (match_operand:SI 6 "immediate_operand" "i")))]
  "TARGET_HAVE_MVE"
- "vpst\;\tvddupt.u%#\t%q0, %2, %4"
+ "vpst\;vddupt.u%#\t%q0, %2, %4"
  [(set_attr "length""8")])
 
 ;;
diff --git a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c
index 7332711f6a7..7c8b0152763 100644
--- a/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c
+++ b/gcc/testsuite/gcc.target/arm/mve/intrinsics/vddupq_m_n_u16.c
@@ -1,23 +1,57 @@
 /* { dg-require-effective-target arm_v8_1m_mve_ok } */
 /* { dg-add-options arm_v8_1m_mve } */
 /* { dg-additional-options "-O2" } */
+/* { dg-final { check-function-bodies "**" "" } } */
 
 #include "arm_mve.h"
 
+/*
+**foo:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vddupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?:  @.*|)
+** ...
+*/
 uint16x8_t
 foo (uint16x8_t inactive, uint32_t a, mve_pred16_t p)
 {
   return vddupq_m_n_u16 (inactive, a, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vddupt.u16"  }  } */
 
+/*
+**foo1:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vddupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-9]+(?:  @.*|)
+** ...
+*/
 uint16x8_t
 foo1 (uint16x8_t inactive, uint32_t a, mve_pred16_t p)
 {
   return vddupq_m (inactive, a, 1, p);
 }
 
-/* { dg-final { scan-assembler "vpst" } } */
-/* { dg-final { scan-assembler "vddupt.u16"  }  } */
+/*
+**foo2:
+** ...
+** vmsr p0, (?:ip|fp|r[0-9]+)(?: @.*|)
+** ...
+** vpst(?: @.*|)
+** ...
+** vddupt.u16  q[0-9]+, (?:ip|fp|r[0-9]+), #[0-
