On 5/3/24 17:39, Richard Henderson wrote:
While the 8-bit input elements are sequential in the input vector,
the 32-bit output elements are not sequential in the output matrix.
Do not attempt to compute two 32-bit outputs at the same time.
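
To illustrate the indexing described above, here is a minimal reference sketch of an unpredicated signed 4-way outer product into 32-bit elements. It is not the code in sme_helper.c: the function name smopa_s_ref, the fixed DIM, and the omission of predication are assumptions made for illustration only. Each output element consumes four sequential bytes from each input vector, but eight sequential bytes of zn feed za[row][col] and za[row + 1][col], which sit a full row apart in the matrix.

```c
#include <stdint.h>

/* Hypothetical size: number of 32-bit elements per vector (SVL / 32). */
#define DIM 4

/*
 * Reference model (assumption: no predication, fixed DIM):
 *   za[row][col] += sum over k in 0..3 of zn[4*row + k] * zm[4*col + k]
 * Bytes are indexed individually, so the result does not depend on
 * host endianness.
 */
static void smopa_s_ref(int32_t za[DIM][DIM],
                        const int8_t zn[4 * DIM],
                        const int8_t zm[4 * DIM])
{
    for (int row = 0; row < DIM; row++) {
        for (int col = 0; col < DIM; col++) {
            int32_t sum = 0;

            /* Four 8-bit products contribute to one 32-bit output. */
            for (int k = 0; k < 4; k++) {
                sum += (int32_t)zn[4 * row + k] * (int32_t)zm[4 * col + k];
            }

            /* Accumulate with wrap-around, via unsigned arithmetic. */
            za[row][col] = (int32_t)((uint32_t)za[row][col] + (uint32_t)sum);
        }
    }
}
```

Computing one output per step keeps the mapping from input bytes to output element explicit, at the cost of not batching two outputs per 64-bit operation.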

Cc: qemu-sta...@nongnu.org
Fixes: 23a5e3859f5 ("target/arm: Implement SME integer outer product")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2083
Signed-off-by: Richard Henderson <richard.hender...@linaro.org>
---

v2: Fixed endian issue; double-checked on s390x.

---
  target/arm/tcg/sme_helper.c       | 77 ++++++++++++++++++-------------
  tests/tcg/aarch64/sme-smopa-1.c   | 47 +++++++++++++++++++
  tests/tcg/aarch64/sme-smopa-2.c   | 54 ++++++++++++++++++++++
  tests/tcg/aarch64/Makefile.target |  2 +-
  4 files changed, 147 insertions(+), 33 deletions(-)
  create mode 100644 tests/tcg/aarch64/sme-smopa-1.c
  create mode 100644 tests/tcg/aarch64/sme-smopa-2.c

Reviewed-by: Philippe Mathieu-Daudé <phi...@linaro.org>

