On Tue, 8 Mar 2022 at 03:18, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> For both ldnt1 and stnt1, the meaning of the Rn and Rm are different
> from ld1 and st1: the vector and integer registers are reversed, and
> the integer register 31 refers to XZR instead of SP.
>
> Secondly, the 64-bit version of ldnt1 was being interpreted as
> 32-bit unpacked unscaled offset instead of 64-bit unscaled offset,
> which discarded the upper 32 bits of the address coming from
> the vector argument.
>
> Thirdly, validate that the memory element size is in range for the
> vector element size for ldnt1.  For ld1, we do this via independent
> decode patterns, but for ldnt1 we need to do it manually.
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/826
> Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
> ---
>  target/arm/sve.decode             |  5 ++-
>  target/arm/translate-sve.c        | 51 +++++++++++++++++++++++++++++--
>  tests/tcg/aarch64/test-826.c      | 50 ++++++++++++++++++++++++++++++
>  tests/tcg/aarch64/Makefile.target |  4 +++
>  tests/tcg/configure.sh            |  4 +++
>  5 files changed, 109 insertions(+), 5 deletions(-)
>  create mode 100644 tests/tcg/aarch64/test-826.c
>
> diff --git a/target/arm/sve.decode b/target/arm/sve.decode
> index c60b9f0fec..0388cce3bd 100644
> --- a/target/arm/sve.decode
> +++ b/target/arm/sve.decode
> @@ -1575,10 +1575,9 @@ USDOT_zzzz      01000100 .. 0 ..... 011 110 ..... 
> .....  @rda_rn_rm
>
>  ### SVE2 Memory Gather Load Group
>
> -# SVE2 64-bit gather non-temporal load
> -#   (scalar plus unpacked 32-bit unscaled offsets)
> +# SVE2 64-bit gather non-temporal load (scalar plus 64-bit unscaled offsets)
>  LDNT1_zprz      1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \
> -                &rprr_gather_load xs=0 esz=3 scale=0 ff=0
> +                &rprr_gather_load xs=2 esz=3 scale=0 ff=0

We correct the xs= value here...

>
>  # SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets)
>  LDNT1_zprz      1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \
> diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c
> index 33ca1bcfac..2c23459e76 100644
> --- a/target/arm/translate-sve.c
> +++ b/target/arm/translate-sve.c
> @@ -6487,10 +6487,33 @@ static bool trans_LD1_zpiz(DisasContext *s, 
> arg_LD1_zpiz *a)
>
>  static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a)
>  {
> +    gen_helper_gvec_mem_scatter *fn = NULL;
> +    bool be = s->be_data == MO_BE;
> +    bool mte = s->mte_active[0];
> +
> +    if (a->esz < a->msz + !a->u) {
> +        return false;
> +    }
>      if (!dc_isar_feature(aa64_sve2, s)) {
>          return false;
>      }
> -    return trans_LD1_zprz(s, a);
> +    if (!sve_access_check(s)) {
> +        return true;
> +    }
> +
> +    switch (a->esz) {
> +    case MO_32:
> +        fn = gather_load_fn32[mte][be][0][0][a->u][a->msz];
> +        break;
> +    case MO_64:
> +        fn = gather_load_fn64[mte][be][0][2][a->u][a->msz];
> +        break;
> +    }

...but then instead of using it we hard code use of 0 or 2 here,
which makes the change above a bit moot.

> +    assert(fn != NULL);
> +
> +    do_mem_zpz(s, a->rd, a->pg, a->rn, 0,
> +               cpu_reg(s, a->rm), a->msz, false, fn);
> +    return true;
>  }

Anyway
Reviewed-by: Peter Maydell <peter.mayd...@linaro.org>

thanks
-- PMM

Reply via email to