In our implementation of the SVE2p1 contiguous load to 128-bit
element insns such as LD1D (scalar plus scalar, single register), we
got the order of the arguments to the DO_LD1_2() macro wrong.  Here
the first argument is the element size and the second is the memory
size, and the element size is always the same size or larger than
the memory size.

For the 128-bit versions, we want to load either 32-bit or 64-bit
values from memory and extend them to the 128-bit vector element, but
were trying to load 128 bit values and then stuff them into 32-bit or
64-bit vector elements.  Correct the macro ordering.

Fixes: fc5f060bcb7b ("target/arm: Implement {LD1, ST1}{W, D} (128-bit element) 
for SVE2p1")
Signed-off-by: Peter Maydell <peter.mayd...@linaro.org>
---
 target/arm/tcg/sve_helper.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/arm/tcg/sve_helper.c b/target/arm/tcg/sve_helper.c
index c4aaf0cc45f..c442fcb540d 100644
--- a/target/arm/tcg/sve_helper.c
+++ b/target/arm/tcg/sve_helper.c
@@ -6439,8 +6439,8 @@ DO_LD1_2(ld1sds, MO_64, MO_32)
 
 DO_LD1_2(ld1dd,  MO_64, MO_64)
 
-DO_LD1_2(ld1squ, MO_32, MO_128)
-DO_LD1_2(ld1dqu, MO_64, MO_128)
+DO_LD1_2(ld1squ, MO_128, MO_32)
+DO_LD1_2(ld1dqu, MO_128, MO_64)
 
 #undef DO_LD1_1
 #undef DO_LD1_2
-- 
2.43.0


Reply via email to