In our implementation of the SVE2p1 contiguous load to 128-bit element insns such as LD1D (scalar plus scalar, single register), we got the order of the arguments to the DO_LD1_2() macro wrong. Here the first argument is the element size and the second is the memory size, and the element size is always the same size or larger than the memory size.
For the 128-bit versions, we want to load either 32-bit or 64-bit values from memory and extend them to the 128-bit vector element, but were trying to load 128 bit values and then stuff them into 32-bit or 64-bit vector elements. Correct the macro ordering. Fixes: fc5f060bcb7b ("target/arm: Implement {LD1, ST1}{W, D} (128-bit element) for SVE2p1") Signed-off-by: Peter Maydell <peter.mayd...@linaro.org> --- target/arm/tcg/sve_helper.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c index c4aaf0cc45f..c442fcb540d 100644 --- a/target/arm/tcg/sve_helper.c +++ b/target/arm/tcg/sve_helper.c @@ -6439,8 +6439,8 @@ DO_LD1_2(ld1sds, MO_64, MO_32) DO_LD1_2(ld1dd, MO_64, MO_64) -DO_LD1_2(ld1squ, MO_32, MO_128) -DO_LD1_2(ld1dqu, MO_64, MO_128) +DO_LD1_2(ld1squ, MO_128, MO_32) +DO_LD1_2(ld1dqu, MO_128, MO_64) #undef DO_LD1_1 #undef DO_LD1_2 -- 2.43.0