On Wed, 28 Oct 2020 at 03:27, Richard Henderson
<richard.hender...@linaro.org> wrote:
>
> Much of the existing usage of neon_reg_offset is broken for
> big-endian hosts, as it computes the offset of the first
> 32-bit unit, not the offset of the entire vector register.
>
> Fix this by separating out the different usages.  Make the
> whole thing look a bit more like the aarch64 code.

I haven't reviewed this yet but it fixes a lot of the problems
I saw in my risu run on an s390x box, and I don't see any
regressions on x86-64.

However these still fail on s390x compared to an x86-64 host:

insn_VPADD_float_f16.risu.bin FAIL
insn_VPMAX_float_f16.risu.bin FAIL
insn_VPMIN_float_f16.risu.bin FAIL
insn_VSDOT_s.risu.bin FAIL
insn_VUDOT_s.risu.bin FAIL

thanks
-- PMM