Re: [PATCH 12/57] target/arm: Convert FMULX to decodetree

2024-05-23 Thread Richard Henderson

On 5/23/24 06:00, Peter Maydell wrote:
> On Mon, 6 May 2024 at 02:05, Richard Henderson wrote:
>>
>> Convert all forms (scalar, vector, scalar indexed, vector indexed),
>> which allows us to remove switch table entries elsewhere.
>>
>> Signed-off-by: Richard Henderson
>
>> @@ -671,3 +694,25 @@ INS_general     0 1   00 1110 000 imm:5 0 0011 1 rn:5 rd:5
>>  SMOV            0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
>>  UMOV            0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
>>  INS_element     0 1   10 1110 000 di:5  0 si:4 1 rn:5 rd:5
>> +
>> +### Advanced SIMD scalar three same
>> +
>> +FMULX_s         0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
>> +FMULX_s         0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
>> +
>> +### Advanced SIMD three same
>> +
>> +FMULX_v         0.00 0111 010 ..... 00011 1 ..... ..... @qrrr_h
>
> Looking more closely, shouldn't this be 1110 in the second nibble, not 0111 ?

Yep.

r~




Re: [PATCH 12/57] target/arm: Convert FMULX to decodetree

2024-05-23 Thread Peter Maydell
On Mon, 6 May 2024 at 02:05, Richard Henderson wrote:
>
> Convert all forms (scalar, vector, scalar indexed, vector indexed),
> which allows us to remove switch table entries elsewhere.
>
> Signed-off-by: Richard Henderson 


> @@ -671,3 +694,25 @@ INS_general     0 1   00 1110 000 imm:5 0 0011 1 rn:5 rd:5
>  SMOV            0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
>  UMOV            0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
>  INS_element     0 1   10 1110 000 di:5  0 si:4 1 rn:5 rd:5
> +
> +### Advanced SIMD scalar three same
> +
> +FMULX_s         0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
> +FMULX_s         0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
> +
> +### Advanced SIMD three same
> +
> +FMULX_v         0.00 0111 010 ..... 00011 1 ..... ..... @qrrr_h

Looking more closely, shouldn't this be 1110 in the second nibble, not 0111 ?

> +FMULX_v         0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd

-- PMM



Re: [PATCH 12/57] target/arm: Convert FMULX to decodetree

2024-05-23 Thread Peter Maydell
On Mon, 6 May 2024 at 02:05, Richard Henderson wrote:
>
> Convert all forms (scalar, vector, scalar indexed, vector indexed),
> which allows us to remove switch table entries elsewhere.
>
> Signed-off-by: Richard Henderson 

Reviewed-by: Peter Maydell 

thanks
-- PMM



[PATCH 12/57] target/arm: Convert FMULX to decodetree

2024-05-05 Thread Richard Henderson
Convert all forms (scalar, vector, scalar indexed, vector indexed),
which allows us to remove switch table entries elsewhere.

Signed-off-by: Richard Henderson 
---
 target/arm/tcg/helper-a64.h    |   8 ++
 target/arm/tcg/a64.decode      |  45 +++
 target/arm/tcg/translate-a64.c | 221 +++--
 target/arm/tcg/vec_helper.c    |  39 +++---
 4 files changed, 259 insertions(+), 54 deletions(-)

diff --git a/target/arm/tcg/helper-a64.h b/target/arm/tcg/helper-a64.h
index 0518165399..b79751a717 100644
--- a/target/arm/tcg/helper-a64.h
+++ b/target/arm/tcg/helper-a64.h
@@ -132,3 +132,11 @@ DEF_HELPER_4(cpye, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfp, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfm, void, env, i32, i32, i32)
 DEF_HELPER_4(cpyfe, void, env, i32, i32, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_h, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_s, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
+DEF_HELPER_FLAGS_5(gvec_fmulx_idx_d, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, ptr, i32)
diff --git a/target/arm/tcg/a64.decode b/target/arm/tcg/a64.decode
index d5bfeae7a8..e28f58bd9a 100644
--- a/target/arm/tcg/a64.decode
+++ b/target/arm/tcg/a64.decode
@@ -20,21 +20,44 @@
 #
 
 %rd             0:5
+%esz_sd         22:1 !function=plus_2
+%hl             11:1 21:1
+%hlm            11:1 20:2
 
 &r              rn
 &ri             rd imm
 &rri_sf         rd rn imm sf
 &i              imm
+&rrr_e          rd rn rm esz
+&rrx_e          rd rn rm idx esz
 &qrr_e          q rd rn esz
 &qrrr_e         q rd rn rm esz
+&qrrx_e         q rd rn rm idx esz
 &qrrrr_e        q rd rn rm ra esz
 
+@rrr_h          ........ ... rm:5 ...... rn:5 rd:5      &rrr_e esz=1
+@rrr_sd         ........ ... rm:5 ...... rn:5 rd:5      &rrr_e esz=%esz_sd
+
+@rrx_h          ........ .. .. rm:4 .... . . rn:5 rd:5  &rrx_e esz=1 idx=%hlm
+@rrx_s          ........ .. . rm:5 .... . . rn:5 rd:5   &rrx_e esz=2 idx=%hl
+@rrx_d          ........ .. . rm:5 .... idx:1 . rn:5 rd:5  &rrx_e esz=3
+
 @rr_q1e0        ........ ........ ...... rn:5 rd:5      &qrr_e q=1 esz=0
 @r2r_q1e0       ........ ........ ...... rm:5 rd:5      &qrr_e rn=%rd q=1 esz=0
 @rrr_q1e0       ........ ... rm:5 ...... rn:5 rd:5      &qrrr_e q=1 esz=0
 @rrr_q1e3       ........ ... rm:5 ...... rn:5 rd:5      &qrrr_e q=1 esz=3
 @rrrr_q1e3      ........ ... rm:5 . ra:5 rn:5 rd:5      &qrrrr_e q=1 esz=3
 
+@qrrr_h         . q:1 ...... ... rm:5 ...... rn:5 rd:5  &qrrr_e esz=1
+@qrrr_sd        . q:1 ...... ... rm:5 ...... rn:5 rd:5  &qrrr_e esz=%esz_sd
+
+@qrrx_h         . q:1 .. .... .. .. rm:4 .... . . rn:5 rd:5 \
+                &qrrx_e esz=1 idx=%hlm
+@qrrx_s         . q:1 .. .... .. . rm:5 .... . . rn:5 rd:5 \
+                &qrrx_e esz=2 idx=%hl
+@qrrx_d         . q:1 .. .... .. . rm:5 .... idx:1 . rn:5 rd:5 \
+                &qrrx_e esz=3
+
 ### Data Processing - Immediate
 
 # PC-rel addressing
@@ -671,3 +694,25 @@ INS_general     0 1   00 1110 000 imm:5 0 0011 1 rn:5 rd:5
 SMOV            0 q:1 00 1110 000 imm:5 0 0101 1 rn:5 rd:5
 UMOV            0 q:1 00 1110 000 imm:5 0 0111 1 rn:5 rd:5
 INS_element     0 1   10 1110 000 di:5  0 si:4 1 rn:5 rd:5
+
+### Advanced SIMD scalar three same
+
+FMULX_s         0101 1110 010 ..... 00011 1 ..... ..... @rrr_h
+FMULX_s         0101 1110 0.1 ..... 11011 1 ..... ..... @rrr_sd
+
+### Advanced SIMD three same
+
+FMULX_v         0.00 0111 010 ..... 00011 1 ..... ..... @qrrr_h
+FMULX_v         0.00 1110 0.1 ..... 11011 1 ..... ..... @qrrr_sd
+
+### Advanced SIMD scalar x indexed element
+
+FMULX_si        0111 1111 00 .. .... 1001 . 0 ..... .....   @rrx_h
+FMULX_si        0111 1111 10 . ..... 1001 . 0 ..... .....   @rrx_s
+FMULX_si        0111 1111 11 0 ..... 1001 . 0 ..... .....   @rrx_d
+
+### Advanced SIMD vector x indexed element
+
+FMULX_vi        0.10 1111 00 .. .... 1001 . 0 ..... .....   @qrrx_h
+FMULX_vi        0.10 1111 10 . ..... 1001 . 0 ..... .....   @qrrx_s
+FMULX_vi        0.10 1111 11 0 ..... 1001 . 0 ..... .....   @qrrx_d
diff --git a/target/arm/tcg/translate-a64.c b/target/arm/tcg/translate-a64.c
index 4860b59d18..33da0c5f0f 100644
--- a/target/arm/tcg/translate-a64.c
+++ b/target/arm/tcg/translate-a64.c
@@ -4842,6 +4842,178 @@ static bool trans_INS_element(DisasContext *s, arg_INS_element *a)
     return true;
 }
 
+/*
+ * Advanced SIMD three same
+ */
+
+typedef struct FPScalar {
+    void (*gen_h)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_s)(TCGv_i32, TCGv_i32, TCGv_i32, TCGv_ptr);
+    void (*gen_d)(TCGv_i64, TCGv_i64, TCGv_i64, TCGv_ptr);
+} FPScalar;
+
+static bool do_fp3_scalar(DisasContext *s, arg_rrr_e *a, const FPScalar *f)
+{
+    switch (a->esz) {
+    case MO_64:
+        if (fp_access_check(s)) {
+            TCGv_i64 t0 = read_fp_dreg(s, a->rn);
+            TCGv_i64 t1 =