On Tue, 8 Mar 2022 at 03:18, Richard Henderson <richard.hender...@linaro.org> wrote: > > For both ldnt1 and stnt1, the meaning of the Rn and Rm are different > from ld1 and st1: the vector and integer registers are reversed, and > the integer register 31 refers to XZR instead of SP. > > Secondly, the 64-bit version of ldnt1 was being interpreted as > 32-bit unpacked unscaled offset instead of 64-bit unscaled offset, > which discarded the upper 32 bits of the address coming from > the vector argument. > > Thirdly, validate that the memory element size is in range for the > vector element size for ldnt1. For ld1, we do this via independent > decode patterns, but for ldnt1 we need to do it manually. > > Resolves: https://gitlab.com/qemu-project/qemu/-/issues/826 > Signed-off-by: Richard Henderson <richard.hender...@linaro.org> > --- > target/arm/sve.decode | 5 ++- > target/arm/translate-sve.c | 51 +++++++++++++++++++++++++++++-- > tests/tcg/aarch64/test-826.c | 50 ++++++++++++++++++++++++++++++ > tests/tcg/aarch64/Makefile.target | 4 +++ > tests/tcg/configure.sh | 4 +++ > 5 files changed, 109 insertions(+), 5 deletions(-) > create mode 100644 tests/tcg/aarch64/test-826.c > > diff --git a/target/arm/sve.decode b/target/arm/sve.decode > index c60b9f0fec..0388cce3bd 100644 > --- a/target/arm/sve.decode > +++ b/target/arm/sve.decode > @@ -1575,10 +1575,9 @@ USDOT_zzzz 01000100 .. 0 ..... 011 110 ..... > ..... @rda_rn_rm > > ### SVE2 Memory Gather Load Group > > -# SVE2 64-bit gather non-temporal load > -# (scalar plus unpacked 32-bit unscaled offsets) > +# SVE2 64-bit gather non-temporal load (scalar plus 64-bit unscaled offsets) > LDNT1_zprz 1100010 msz:2 00 rm:5 1 u:1 0 pg:3 rn:5 rd:5 \ > - &rprr_gather_load xs=0 esz=3 scale=0 ff=0 > + &rprr_gather_load xs=2 esz=3 scale=0 ff=0
We correct the xs= value here... > > # SVE2 32-bit gather non-temporal load (scalar plus 32-bit unscaled offsets) > LDNT1_zprz 1000010 msz:2 00 rm:5 10 u:1 pg:3 rn:5 rd:5 \ > diff --git a/target/arm/translate-sve.c b/target/arm/translate-sve.c > index 33ca1bcfac..2c23459e76 100644 > --- a/target/arm/translate-sve.c > +++ b/target/arm/translate-sve.c > @@ -6487,10 +6487,33 @@ static bool trans_LD1_zpiz(DisasContext *s, > arg_LD1_zpiz *a) > > static bool trans_LDNT1_zprz(DisasContext *s, arg_LD1_zprz *a) > { > + gen_helper_gvec_mem_scatter *fn = NULL; > + bool be = s->be_data == MO_BE; > + bool mte = s->mte_active[0]; > + > + if (a->esz < a->msz + !a->u) { > + return false; > + } > if (!dc_isar_feature(aa64_sve2, s)) { > return false; > } > - return trans_LD1_zprz(s, a); > + if (!sve_access_check(s)) { > + return true; > + } > + > + switch (a->esz) { > + case MO_32: > + fn = gather_load_fn32[mte][be][0][0][a->u][a->msz]; > + break; > + case MO_64: > + fn = gather_load_fn64[mte][be][0][2][a->u][a->msz]; > + break; > + } ...but then instead of using it we hard code use of 0 or 2 here, which makes the change above a bit moot. > + assert(fn != NULL); > + > + do_mem_zpz(s, a->rd, a->pg, a->rn, 0, > + cpu_reg(s, a->rm), a->msz, false, fn); > + return true; > } Anyway Reviewed-by: Peter Maydell <peter.mayd...@linaro.org> thanks -- PMM